29.01.2011, 19:36, "Faré" fahree@gmail.com:
Dear Anton,
Sorry, I see no trace of *compile-file-external-format* ... it seems to rely on some local patch to ASDF that was never merged upstream.
You are right! Now I remember: when I worked on that project several years ago, I just opened asdf.lisp, found the compile-file call, introduced *compile-file-external-format* there, and then passed the encoding via this variable.
I am not going to write the patch now, because the project I am working on will only run on my development machine and my server, so I can use an easy workaround; e.g. most Lisps accept a default encoding as a command-line argument.
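A minimal sketch of what such a local patch amounts to. This is a reconstruction for illustration, not the original code: the wrapper name compile-file-with-encoding is hypothetical, and in the real patch the variable was referenced directly at the compile-file call inside asdf.lisp.

```lisp
;; Reconstruction for illustration only; COMPILE-FILE-WITH-ENCODING is a
;; hypothetical name, not something ASDF defines.
(defvar *compile-file-external-format* :default
  "External format handed to COMPILE-FILE. :DEFAULT keeps the implementation default.")

(defun compile-file-with-encoding (source &rest keys)
  "Call COMPILE-FILE on SOURCE, supplying :EXTERNAL-FORMAT from the special variable.
Any other keyword arguments in KEYS are passed through unchanged."
  (apply #'compile-file source
         :external-format *compile-file-external-format*
         keys))
```

The caller then binds the variable around the build, e.g. (let ((*compile-file-external-format* :utf-8)) (compile-file-with-encoding "package.lisp")). The accepted encoding names are implementation-specific, so :utf-8 here is an assumption about the underlying Lisp.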
My first letter was to ensure I am not overlooking a standard way for specifying the encoding.
Anyway, thank you for the info, it's interesting to know.
Also, here are several notes that may be useful later, when someone eventually implements the patch.
In 99.9% of cases it is enough to specify the encoding for the whole system, not for separate files. Only in some extraordinary case would the system author choose to store source files in different encodings.
Also it might or might not be a good idea to store the external-format in a slot of cl-source-file, and to have a proper :initform in it with a valid default value to be used when upgrading ASDF.
How are the slots populated from the defsystem expression?
E.g. if I have
(:file "package" :enc :utf-8)
will the :enc :utf-8 be passed as initargs to (make-instance 'cl-source-file)?
Or for
(defsystem :mysystem :version "0.1.0" :serial t :enc :utf-8 ....
Are these attributes passed to the component instantiation as initargs?
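To make the question concrete, here is a sketch in plain CLOS of the behaviour I have in mind. It does not use or describe ASDF's actual classes or parsing code: my-source-file, instantiate-component and the :enc option are all hypothetical names.

```lisp
;; Plain CLOS illustration; none of these names come from ASDF.
(defclass my-source-file ()
  ((name :initarg :name :reader file-name)
   ;; The :initform supplies a valid default when :enc is not given.
   (enc  :initarg :enc :initform :default :reader file-enc)))

(defun instantiate-component (form)
  "FORM looks like (:file \"package\" :enc :utf-8).
The options after the name are forwarded unchanged as initargs."
  (destructuring-bind (type name &rest options) form
    (declare (ignore type))
    (apply #'make-instance 'my-source-file :name name options)))

(file-enc (instantiate-component '(:file "package" :enc :utf-8))) ; => :UTF-8
(file-enc (instantiate-component '(:file "package")))             ; => :DEFAULT
```

If the options are forwarded like this, adding a slot with a matching :initarg would be all that is needed on the class side.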
The problem for you will be to reasonably support the 11 or so existing implementations.
Actually, not a big problem. We will just create a mapping from the encoding specifications allowed in .asd files to the encoding specification of the underlying compiler.
Like
    (defun enc (enc)
      (case enc
        ((:utf8 :utf-8) #+:clisp 'charset:utf-8 #+:sbcl :utf8 #+ccl :utf-8 ....)
        ((:cp1251 :cp-1251) #+:clisp 'charset:cp1251 #+:sbcl :cp1251 #+ccl :cp-1251)
        ...)
      ... )
Would you accept a patch that supports only the 7-10 most important encodings (all the Unicode encodings plus several of the most frequent single-byte encodings)?
29.01.2011, 20:15, "Cyrus Harmon" ch-lisp@bobobeach.com:
asdf:*load-external-format* perhaps?
Does the problem with national characters in .asd files really exist? Do you use non-ASCII characters in .asd files?
asdf:*load-external-format* would be more flexible than a hard-coded encoding, but it still doesn't solve the problem you mentioned: handling several .asd files with different encodings.
If we start making improvements, IMHO enforcing UTF-8 is a good start and should be enough (option 4 listed by Faré).
If more is needed, a complete solution allowing a per-.asd encoding specification is better. We need to choose a good notation that allows a reasonably simple implementation.
It might be either an Emacs-style comment in the first line: ;;; -*- coding: utf-8; -*-
Or a special Lisp form: (asdf:asd-file-encoding :utf-8)
But interpreting that form would require switching the encoding of the Lisp reader's stream, which I believe will be problematic on some Lisps. It would therefore require feeding the reader from our own custom input stream implementation, such as flexi-streams. And even that would not be good enough, because only ASDF would create that special stream for .asd files; when you evaluate the file from the REPL/SLIME, the meaning of that expression is unclear.
Another alternative is a naming convention for .asd files: mysystem.utf-8.asd. It is simple to implement and, after some thought, seems better than the two suggestions above.
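For the first notation, the Emacs-style comment, recognizing the encoding is plain string processing on the first line of the file. A rough sketch, assuming the line has already been read in some ASCII-compatible encoding; the function name coding-from-first-line is hypothetical, and mapping the resulting keyword to an implementation-specific external format is left out.

```lisp
(defun coding-from-first-line (line)
  "Return the coding named in an Emacs-style -*- ... -*- LINE as a keyword, or NIL."
  (let* ((start (search "-*-" line))
         ;; Closing marker, searched for after the opening one.
         (end (and start (search "-*-" line :start2 (+ start 3))))
         (pos (and end (search "coding:" line :start2 start :end2 end))))
    (when pos
      (let* ((value-start (+ pos (length "coding:")))
             ;; The value ends at the next semicolon or at the closing marker.
             (value-end (or (position #\; line :start value-start :end end) end))
             (name (string-trim '(#\Space #\Tab)
                                (subseq line value-start value-end))))
        (when (plusp (length name))
          (intern (string-upcase name) :keyword))))))

(coding-from-first-line ";;; -*- coding: utf-8; -*-") ; => :UTF-8
(coding-from-first-line ";;; -*- mode: lisp -*-")     ; => NIL
```

This only finds the name; it does not solve the harder part of reopening the file with that encoding.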
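A sketch of how the naming convention could be decoded, working on the file name as a plain string. The function name encoding-from-asd-name is hypothetical, and nothing here is existing ASDF behaviour.

```lisp
(defun encoding-from-asd-name (file-name)
  "FILE-NAME is a string such as \"mysystem.utf-8.asd\".
Return the part between the last two dots as a keyword, or NIL if there is none."
  (let* ((type-dot (position #\. file-name :from-end t))
         ;; The dot that precedes the encoding part, if any.
         (enc-dot (and type-dot
                       (position #\. file-name :from-end t :end type-dot))))
    (when (and enc-dot
               (< (1+ enc-dot) type-dot)
               (string-equal (subseq file-name (1+ type-dot)) "asd"))
      (intern (string-upcase (subseq file-name (1+ enc-dot) type-dot))
              :keyword))))

(encoding-from-asd-name "mysystem.utf-8.asd") ; => :UTF-8
(encoding-from-asd-name "mysystem.asd")       ; => NIL
```

One caveat: a system whose own name contains a dot would be misread, so the result would have to be checked against the list of supported encodings before use.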
But again, we should decide if the problem really exists and avoid solving problems that we don't have. I personally never use national characters in .asd files.
Best regards, - Anton