Note that a file whose octets are all below #x80 is not thereby guaranteed to be ASCII, only guaranteed to contain no UTF-8-specific codes.
Indeed, but that's a good guess, and it works in practice.
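As a minimal sketch of that heuristic (names hypothetical): a file whose octets are all below #x80 decodes identically whether read as ASCII, UTF-8, or Latin-1, so a single byte scan suffices to decide whether the encoding question even matters for that file.

```lisp
;;; Hypothetical helper sketching the heuristic above: scan the
;;; file's octets and report whether any high-bit byte appears.
(defun ascii-compatible-p (pathname)
  "Return true if every octet of the file at PATHNAME is below #x80,
i.e. the file contains no UTF-8-specific bytes and decodes the same
under ASCII, UTF-8, or Latin-1."
  (with-open-file (stream pathname :element-type '(unsigned-byte 8))
    (loop for byte = (read-byte stream nil nil)
          while byte
          always (< byte #x80))))
```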
The default should still be :default because it may be an encoding the CL implementation is aware of.
I vehemently disagree: my policy is to always favor a deterministic behaviour of either working or breaking everywhere, over working in some places but breaking in other places. This makes debugging things much easier, and is the whole point of an abstraction layer. It's a principle I've strictly adhered to while developing ASDF 2, and will enforce as long as I'm ASDF maintainer. See the ASDF 2 paper: http://common-lisp.net/project/asdf/ilc2010draft.pdf
No one actually wants :default. Everyone knows (or should know) which encoding they are using, and if it's not standard, they should specify it explicitly. Everyone wants to actually use said encoding, implicit or explicit, or, if it is not available, a predictable fallback that depends only on the implementation and not on any configurable setting. The only case where :default should ever be used is when it is the implementation's only option.
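To illustrate the policy being argued for, here is a hedged sketch (the system name is hypothetical): recent ASDF versions, together with the asdf-encodings extension, let a system declare its source encoding explicitly in the defsystem form instead of relying on :default.

```lisp
;; Hypothetical system definition. The point is the explicit
;; :encoding option, which makes loading behave the same everywhere
;; rather than depending on any configurable implementation default.
(asdf:defsystem "my-library"
  :encoding :utf-8
  :components ((:file "packages")
               (:file "my-library")))
```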
From the lambda-reader source code:
;;; Note that this file uses UTF-8. […]
Making code dependent on the file encoding is not recommended,
Lambda-reader is not actually dependent on the file encoding. Indeed, I went to great pains to make it work independently of the encoding. All it depends on is that client libraries either use the same encoding, or else use another encoding that ends up interning symbols with the same lambda character(s) as the library uses. I.e., transcoding the library and/or its clients so that one uses a lambda from UTF-8 and the other a lambda from ISO-8859-7, with each file read using the appropriate external-format, should work just fine. The comment is still valid assuming you use the pristine UTF-8 source code.
I updated the comment to make it more painfully clear what the actual constraints are.
and writing a library that requires code that uses it to be in the same encoding is hardly defensible.
All anyone ever requires is that the characters read by client and server match. I'm making no other demand.
And yet, I see no problem with a file including comments that state the encoding in which it was authored and distributed. If someone recodes the file, he certainly cannot blame the original author for the file no longer working, or for its comments being outdated; whoever modifies the file, including by transcoding it, takes responsibility for it.
If another author decides to do this then the Quicklisp releases become fractured. Please do not let this into a Quicklisp release.
Why would you transcode files in Quicklisp? Then again, if you do, you're taking responsibility for the results. Don't blame authors for what you do.
A tool to automatically add the coding file option has been written. There is no need to contact library authors any further to request that they recode their files, as I am confident we can work with their code as it is. The tool can also recode files to UTF-8, or attempt to recode them to ISO-8859-1.
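For the record, the "coding file option" here is the Emacs-style file-variables line at the top of a source file, e.g.:

```lisp
;;; -*- Mode: Lisp; coding: utf-8 -*-
```

Emacs (and any tool that honors the -*- line) will then open the file with the declared coding system regardless of the user's locale settings.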
Recoding everything to use UTF-8 is an interesting approach for sure. (Though will you avoid doing that on MCL-specific files?) What does Zach think of it?
For the sake of people getting their source upstream, I still think it's useful to encourage authors to use UTF-8 everywhere (without BOM).
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org Procrastination is great. It gives me a lot more time to do things that I'm never going to do.