On Tue, Mar 20, 2012 at 20:04, Orivej Desh orivej@gmx.fr wrote:
Now, the thread on supporting a way to specify source encoding is a year old. Should I have started a new one rather than replying to it?
I think you did well to reply.
You need to send me a patch to ASDF that modifies (defmethod perform ((operation compile-op) (c cl-source-file)) ...) and (defmethod perform ((operation load-source-op) (c cl-source-file)) ...) to do something about external-format.
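For illustration only, here is a minimal sketch of what "doing something about external-format" could boil down to at the level of plain CL:COMPILE-FILE and CL:LOAD, both of which accept an :external-format keyword per the CLHS. The function names are hypothetical and this is not ASDF's actual perform code; a real patch would put the equivalent calls inside the two methods named above.

```lisp
;; Hypothetical helpers, not part of ASDF.
;; They only show where the :external-format argument would be threaded through.

(defun sketch-compile-and-load (source &key (external-format :default))
  "Compile SOURCE reading it with EXTERNAL-FORMAT, then load the resulting fasl."
  (let ((fasl (compile-file source :external-format external-format)))
    ;; COMPILE-FILE returns NIL as its primary value when no output file was produced.
    (when fasl
      (load fasl))))

(defun sketch-load-source (source &key (external-format :default))
  "Load SOURCE directly as source code, reading it with EXTERNAL-FORMAT."
  (load source :external-format external-format))
```

The value accepted for :external-format is implementation-dependent, which is exactly why the choice of a default matters.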
I propose the attached file.
Thanks.
I know that Stelian Ionescu was also working on it, so I'm giving him an opportunity to chime in before I merge that.
Also, I agree with Stelian that it's better to standardize on one default encoding for all files to be loaded by ASDF. If we do, then there's a chance that things will work without user configuration. If we don't, we're pushing configuration onto the user, and guaranteeing misery for newbies, and hard-to-debug situations even for seasoned users. These days, UTF-8 looks like the obvious encoding to standardize on. On implementations that don't support UTF-8, we'd fall back to some 8-bit-clean encoding that will at least accept UTF-8 encoded comments and has a chance of doing the right thing with strings and symbols.
Therefore, we'd use something like this:
(defparameter *utf-8-external-format*
  #+sbcl :utf-8
  ...
  #-(or sbcl ...) :default
  "external-format argument to pass for CL:OPEN to accept UTF-8 encoded source code")
Also it might or might not be a good idea to store the external-format in a slot of cl-source-file, and to have a proper :initform in it with a valid default value to be used when upgrading ASDF.
It stores encoding in a property of the component, the component being a system or a source file. This allows for both per system and per source file component encoding, the latter taking precedence, without additional effort. In my implementation default :initform would not have helped because #'component-encoding switches between per component system encoding and per component encoding based on the former being specified or not. Hence the default (:default) is embedded in #'component-encoding.
I think you're doing the right thing, except that (1) we should probably use "external-format" instead of "encoding", since that's what the CLHS calls it, and that (2) the default should be *utf-8-external-format*. Then there's the whole horror of CR/LF that I'm trying to not think about.
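To make the precedence concrete, here is a rough sketch of the lookup under discussion: a file's own external-format wins, then its system's, then the global default. The class and function names are hypothetical stand-ins rather than ASDF's real component classes, and only the #+sbcl branch of the default comes from this thread; the other implementations are deliberately left at :default.

```lisp
;; Hypothetical stand-ins; ASDF's real classes and accessors differ.

(defparameter *utf-8-external-format*
  #+sbcl :utf-8
  #-sbcl :default   ; assumption: per-implementation values still to be researched
  "Default external-format used when neither the file nor its system specifies one.")

(defclass sketch-component ()
  ((parent :initarg :parent :initform nil :reader sketch-parent)
   (own-format :initarg :external-format :initform nil :reader sketch-own-format)))

(defun sketch-external-format (component)
  "Return the nearest explicitly specified external-format, walking up
the parent chain, or *UTF-8-EXTERNAL-FORMAT* if none is specified."
  (cond ((null component) *utf-8-external-format*)
        ((sketch-own-format component))   ; a non-NIL value is returned as-is
        (t (sketch-external-format (sketch-parent component)))))

;; Usage: a system-wide setting with one per-file override.
(let* ((system (make-instance 'sketch-component :external-format :latin-1))
       (file-a (make-instance 'sketch-component :parent system))
       (file-b (make-instance 'sketch-component :parent system
                                                :external-format :utf-16)))
  (list (sketch-external-format file-a)
        (sketch-external-format file-b)))
;; => (:LATIN-1 :UTF-16)
```

Keeping the default in one place, rather than in a slot :initform, is what lets the per-file and per-system settings fall through to it.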
The problem for you will be to reasonably support 11 existing implementations or so.
Since making a single specification portable requires comparing all external formats of all supported implementations, I think it is reasonable to leave it to the author of a system definition to research by which name his preferred encoding is accessible in the different implementations he wants to support, and to specify appropriate read-time conditionals.
I think it's OK to require authors who want non-default settings to do their research on how to do it on each and every platform they want to support (or depend on a library that does it for them). But I think it's a mistake to fail to provide a sensible default, which in effect forces EVERYONE to do the research or face crazy error situations for some of their users.
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org The Constitution may not be perfect, but it's a lot better than what we've got!