On Sun, Mar 25, 2012 at 22:47, Orivej Desh <c@orivej.org> wrote:
Is there some reason why we must put the external-format into the property list instead of just giving it a slot in the component class definition?
It was my ignorance.
I thought it was to allow a backwards-compatible syntax of (:file "foo" :properties (:encoding :latin1)). If instead we want to encourage people to use (:file "foo" :encoding :latin1) and force users to upgrade ASDF, then it makes no sense to use properties instead of a slot.
Are we set on requiring ASDF 2.21 for this new encoding specification to work? If so, we should ask people not to start using it in libraries until a few months from now, when ASDF 2.21 is more widely available (i.e. has made it to Quicklisp, SBCL, etc.).
Note that in the end, I prefer :encoding if we're going to add an implicit translation layer between that and the actual :external-format option of CL:LOAD, so the user understands there's a difference. If we're NOT going to add a translation layer, and instead require users to write #.(foo:encoding-to-external-format :latin1), then :external-format is the more fitting name.
Also, what sort of an entity are the external format values? Is it always a keyword symbol? Can we say that it should always be a keyword and that we will massage it to something else, if necessary, for the benefit of the implementation when reading a file?
Maybe yes. Consider that e.g. some implementations accept more options (mostly to control line terminators) — CLISP as instances of ext:encoding, LispWorks as lists like '(:latin-1 :eol-style :lf); but then CLISP explicitly says that line terminators don't matter during input.
Oh yeah, I had tried to blank out on line terminators. Hopefully, they won't matter much indeed, since ASDF only cares about input encoding, and line terminators are an output option. That's one more reason to call our thing :encoding instead of :external-format.
In that case we could have an accessor that will do the implementation-specific massaging for us (e.g., we could store :utf-8, but on clisp we would present charset:utf-8 when reading...). That seems somehow tidier to me, rather than changing the value behind the programmer's back as we do here. OTOH, we do quietly change symbols to strings, so maybe I'm just talking through my hat.
I'd appreciate it if you explained in more detail what happens, when, and how. Is it like in the attached patch, but with the logic moved from the setf'er to the accessor?
I suppose we'll have something like this:
(defun trivial-encoding-to-external-format-hook (encoding)
  (declare (ignore encoding))
  *utf-8-external-format*)

(defvar *encoding-to-external-format-hook*
  #'trivial-encoding-to-external-format-hook)
... (load ... :external-format (funcall *encoding-to-external-format-hook* encoding) ...) ...
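To make the hook idea above concrete, here is a minimal self-contained sketch. It is not what was committed: the stand-in value of *utf-8-external-format*, the helper encoding-external-format, and the table-based hook are all assumptions for illustration, and whether a given implementation accepts :latin-1 as an external format is also assumed.

```lisp
;; Stand-in for ASDF's own variable (assumption: a plain :utf-8 keyword).
(defvar *utf-8-external-format* :utf-8)

;; Default hook: ignore the requested encoding and always use UTF-8.
(defun trivial-encoding-to-external-format-hook (encoding)
  (declare (ignore encoding))
  *utf-8-external-format*)

(defvar *encoding-to-external-format-hook*
  #'trivial-encoding-to-external-format-hook)

;; Hypothetical helper: the single place where an :encoding keyword is
;; translated into whatever CL:LOAD wants as :external-format.
(defun encoding-external-format (encoding)
  (funcall *encoding-to-external-format-hook* encoding))

;; Hypothetical richer hook, of the kind an extension system might install.
;; Signals an error for encodings it does not know about.
(defun table-encoding-to-external-format-hook (encoding)
  (ecase encoding
    ((:utf-8 :default) *utf-8-external-format*)
    ((:latin1 :latin-1) :latin-1)))
```

With the default hook, (encoding-external-format :latin1) yields :utf-8; after binding *encoding-to-external-format-hook* to #'table-encoding-to-external-format-hook it yields :latin-1. Keeping the translation behind one special variable means plain ASDF stays UTF-8-only while an extension can widen it without patching ASDF.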
Then you'll have to :defsystem-depends-on (:asdf-encodings) or some such to be able to use different encodings.
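For illustration, a system definition along those lines might look like the sketch below. The system name is hypothetical, and the per-file :encoding option is the syntax proposed in this thread, so it assumes an ASDF recent enough to understand it.

```lisp
;; Hypothetical .asd file; "my-library" is a made-up name.
(asdf:defsystem "my-library"
  ;; Loads the encodings extension before this definition is processed.
  :defsystem-depends-on (:asdf-encodings)
  ;; foo.lisp is declared to be saved in Latin-1 rather than UTF-8.
  :components ((:file "foo" :encoding :latin1)))
```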
Anyone who goes beyond ASCII, saves files in something other than UTF-8 (as is expected on MS Windows, though even on Linux the LispWorks Personal IDE tried to save a file in Latin-1), manages local projects with ASDF, and upgrades ASDF will be affected.
Ouch. I'd say that in this case, LW personal has obsolete default settings. I still think that ASDF should assume utf-8 by default.
I've committed something along those lines as 2.20.3. Only minimal testing for no obvious breakage (make test, using sbcl).
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org Time and money spent in helping men to do more for themselves is far better than mere giving. — Henry Ford