A draft version adding support for reading the encoding from the file options header is available at: http://www.scieneer.com/files/asdf-encoding-file-option.lisp
It is biased towards UTF-8, which is used when no other encoding is detected or declared in the file options and the file is valid UTF-8 containing UTF-8-specific sequences. I don't expect too many false positives from the UTF-8 detector. I would not suggest trying to detect any further encodings.
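To illustrate the kind of check described above, here is a minimal sketch in Python. It is not the code from the draft file. The function name `looks_like_utf8` is hypothetical, and treating "UTF-8 specific sequences" as "at least one byte outside the ASCII range" is an assumption about what the detector requires.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8 and contains a non-ASCII sequence.

    Hypothetical helper for illustration only; not part of ASDF.
    """
    try:
        # Python's strict decoder rejects malformed and overlong sequences.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Pure ASCII is valid UTF-8 but gives no positive evidence for it,
    # so it is left to the default external-format.
    return any(byte >= 0x80 for byte in data)
```

A Latin-1 file containing accented characters almost always fails the strict decode, which is why false positives from a check like this should be rare.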
For UTF-8 files, no action needs to be taken when upgrading ASDF: ASDF will reliably detect them and load and compile them as UTF-8 rather than using the :default CL external-format. Files with other encodings that are not detected will load and compile using the :default external-format, as is currently the case; library authors can add file options headers so that such files load and compile reliably across systems with a range of default external-formats. There would not appear to be any migration loss or inconvenience for anyone, except where there are incorrect encoding file options that need to be fixed.
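The order of decisions described above (declared file option first, then UTF-8 detection, then :default) could be sketched roughly as follows, again in Python and only as an illustration. The names `declared_encoding` and `choose_external_format` are hypothetical. The sketch assumes an Emacs-style `-*- ... -*-` header on the first line only and recognises the keys mentioned in this thread ('coding', 'encoding', 'external-format'). The real draft may parse headers differently.

```python
import re

# Keys mentioned in the thread: Emacs uses "coding"; LispWorks seems to
# use "encoding" or "external-format".
ENCODING_KEYS = ("coding", "encoding", "external-format")


def declared_encoding(first_line: bytes):
    """Return the encoding named in a -*- ... -*- header, or None."""
    match = re.search(rb"-\*-(.*?)-\*-", first_line)
    if match is None:
        return None
    options = match.group(1).decode("ascii", errors="replace")
    for option in options.split(";"):
        key, sep, value = option.partition(":")
        if sep and key.strip().lower() in ENCODING_KEYS:
            return value.strip().lower() or None
    return None


def choose_external_format(data: bytes) -> str:
    """Pick an encoding name: declared, else detected UTF-8, else 'default'."""
    declared = declared_encoding(data.split(b"\n", 1)[0])
    if declared:
        return declared
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "default"
    # Only claim UTF-8 when there is a non-ASCII sequence to justify it.
    return "utf-8" if any(byte >= 0x80 for byte in data) else "default"
```

With this ordering, a file starting with `;;; -*- Mode: Lisp; coding: latin-1 -*-` is read as latin-1 regardless of its contents, while a file with no header falls back to detection and then to the default.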
For 8-bit CL implementations, the encoding detection and file options reading could probably just be disabled, or perhaps it could remain and issue warnings for clearly incompatible encodings.
This may offer a solution to the problem of defining the system definition file encoding, is convenient for UTF-8 users, and provides a reliable mechanism for writing portable libraries in other encodings.
An encoding file option could also be handy for other tools, such as editors, web servers, tools for recoding lisp source files, etc. I think it warrants some consideration.
You do a lot for ASDF and deserve thanks.
Regards Douglas Crosher
On 04/15/2012 11:00 AM, Faré wrote:
On Fri, Apr 13, 2012 at 02:44, Douglas Crosher dtc-asdf@scieneer.com wrote:
The only practical solution seems to be to detect the encoding from the file. I could write portable code for ASDF to read an ASCII header line and look for encoding declarations, and handle a few common headers (emacs has 'coding', LispWorks seems to use 'encoding' or 'external-format'). Auto-detection could handle some of the common codings, but could be a big chunk of code. The quicklisp project may be prepared to patch in headers to system definition files using non-ASCII encodings, and this could be largely automated.
Yes, this is a valid approach, though it is somewhat heavy in coding and will grow ASDF by a few hundred more lines of code. Don't forget to support the way Emacs detects encoding, etc. It is certainly more than I am willing to code, and it makes the semantics of loading more complex than I am comfortable with. Before you code it yourself, I'd like to hear from other users here what they think.
An additional small thing I don't like about the approach is that you have to open a file twice, once to detect encoding, the other time to load or compile-file it, which is not atomic and can be slightly nasty (if e.g. the file is actually a URL or mounted on a weird filesystem or whatnot). But that's secondary.
Also, I'm not sure how big the market for such support is. There again, I'd like to hear from potential users.
If infrastructure is added for the system definition files then it would be only a small step to also use this for the lisp source files.
Indeed.
Alternatively, this could be an :automatic mode added to asdf-encodings, rather than a part of ASDF itself, at which point it would be available to source files, but not system files.
LispWorks appears to be able to automatically detect file coding, and it would be interesting to know whether the ASDF encoding problems are a non-issue for LispWorks users. If so, then this would appear to add more support to making the default :default. http://www.lispworks.com/documentation/lw61/LW/html/lw-659.htm#39723
If you want your code to be portable, you can't rely on users using LispWorks. Deterministic well-defined semantics require that the meaning of your code should not depend on magic that may or may not happen.
PS: This long discussion on a relatively minor topic reminds me of Parkinson's Law of Triviality. What color should the bikeshed be painted?
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org Classical Liberalism: the only truly subversive ideology.