Faré fahree@gmail.com writes:
On Fri, Apr 13, 2012 at 02:44, Douglas Crosher dtc-asdf@scieneer.com wrote:
The only practical solution seems to be to detect the encoding from the file. I could write portable code for ASDF to read an ASCII header line and look for encoding declarations, and handle a few common headers (emacs has 'coding', LispWorks seems to use 'encoding' or 'external-format'). Auto-detection could handle some of the common codings, but could be a big chunk of code. The quicklisp project may be prepared to patch headers into system definition files that use non-ASCII encodings, and this could be largely automated.
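For illustration only, the header scan described above could look roughly like the following sketch. It is written in Python rather than Common Lisp, and it is not the ASDF code or the implementation linked later in this thread. The names `detect_declared_encoding` and `_DECLARATION` are hypothetical, and the assumption that the 'encoding' and 'external-format' keywords follow the same `keyword: value` shape as the Emacs 'coding' cookie is mine.

```python
import re

# Hypothetical pattern: a keyword, then ':' or '=', then an encoding name.
# Only the Emacs "coding:" form is well established; the other two keywords
# are assumed to use the same shape.
_DECLARATION = re.compile(
    rb"(?:external-format|encoding|coding)\s*[:=]\s*([A-Za-z0-9_.\-]+)",
    re.IGNORECASE,
)


def detect_declared_encoding(path, default="utf-8", max_lines=2):
    """Return the encoding named in the first lines of the file, else `default`.

    The file is opened in binary mode, so no decoding takes place before
    the declaration has been found.
    """
    with open(path, "rb") as stream:
        for _ in range(max_lines):
            line = stream.readline()
            if not line:
                break  # end of file reached before max_lines
            match = _DECLARATION.search(line)
            if match:
                return match.group(1).decode("ascii").lower()
    return default
```

A file starting with `;;; -*- mode: lisp; coding: iso-8859-1 -*-` would yield `"iso-8859-1"`; a file without any declaration falls back to `default`.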
Yes, this is a valid approach, though it is somewhat heavy in coding and will grow ASDF by a few hundred more lines of code. Don't forget to support the way Emacs detects encoding, etc. It is certainly more than I am willing to code, and it makes the semantics of loading more complex than I am comfortable with. Before you code it yourself,
There's no need to code it yourself, I've already done it: https://gitorious.org/com-informatimago/com-informatimago/blobs/master/tools...
I'd like to hear from other users here about what they think.
An additional small thing I don't like about the approach is that you have to open a file twice, once to detect encoding, the other time to load or compile-file it, which is not atomic and can be slightly nasty (if e.g. the file is actually a URL or mounted on a weird filesystem or whatnot). But that's secondary.
You have to do what you have to do. FILE-LENGTH has to read the whole file to compute the number of characters in a UTF-8 file.
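The underlying reason is that UTF-8 is a variable-width encoding, so the number of characters cannot be derived from the size in bytes. A minimal sketch of that point, in Python and with the hypothetical helper name `count_characters` (it says nothing about how any particular Lisp implementation computes FILE-LENGTH):

```python
import os


def count_characters(path, encoding="utf-8", chunk_size=65536):
    """Count the characters in a text file by decoding all of it.

    newline="" disables newline translation, so each line terminator is
    counted exactly as it appears in the file.
    """
    total = 0
    with open(path, "r", encoding=encoding, newline="") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def byte_size(path):
    """Size in bytes, available from the filesystem without reading the file."""
    return os.path.getsize(path)
```

For a file containing only "Faré" encoded in UTF-8, `byte_size` gives 5 while `count_characters` gives 4, because "é" takes two bytes.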
Also, I'm not sure how big the market for such support is. There again, I'd like to hear from potential users.
Having written the above, I'd tend to be in favor of this approach. On the other hand, nowadays I put -*- coding:utf-8 -*- in all my files…
PS: This long discussion on a relatively minor topic reminds me of Parkinson's Law of Triviality. What color should the bikeshed be painted?
It's not entirely trivial: you can spend quite some time on encoding problems.