On Sat, Mar 31, 2012 at 10:38, Orivej Desh <c@orivej.org> wrote:
Douglas' main point may be rephrased as follows, which is a legitimate question: if the task is to extend the supported character set to UTF-8, isn't it solved by accepting an :encoding option and defining a default #'encoding-external-format that understands (nothing but) :utf-8?
Yes, that's what we have now with 2.20.x.
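For illustration, here is a minimal sketch of a system definition that declares its encoding with the :encoding option discussed above. The system name and file names are hypothetical.

```lisp
;; Hypothetical system: "my-system" and its component names are made up
;; for illustration; only the :encoding option comes from the discussion.
(asdf:defsystem :my-system
  :encoding :utf-8                      ; read all source files as UTF-8
  :components ((:file "package")
               (:file "main" :depends-on ("package"))))
```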
Given that, should the default be UTF-8 rather than :default? Answering `yes' might cause more or less trouble for some people; answering `no' will provide for a gradual transition. I think we should ask Zach Beane about issues with unspecified and discerned external formats.
Source code that uses more than the ASCII character set wasn't portably supported previously, but in practice, utf-8 worked everywhere and was backhandedly enforced by a lot of people using SBCL and utf-8 and sending reports to authors so they make their packages compatible. This change therefore only formalizes a de facto standard, and allows for extension and customization where no such thing was previously possible.
In the future, maybe we should distinguish between :default, which is :utf-8 where supported and falls back where not supported, and :utf-8, which means "I really really want utf-8", e.g. for lambda-reader? I think it'll be better solved by using :utf-8 in all cases and #-asdf-unicode (error ...) in the source code when it's not available.
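A sketch of what such a guard could look like at the top of a source file that genuinely needs UTF-8. The error message wording is only an example; the :asdf-unicode feature is the one named above.

```lisp
;; If the running ASDF does not advertise Unicode support through the
;; :asdf-unicode feature, refuse to proceed rather than misread the file.
#-asdf-unicode
(error "This system requires an ASDF with UTF-8 support (:asdf-unicode).")
```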
Another issue that somewhat bothers me: is this kind of hook right? It seems to be inherently unmanaged (just like *macroexpand-hook*), i.e. setting it in one system affects all systems loaded afterwards, unless it is set lexically in around-compile. But then, it might as well be another ASDF option (say, either a package designator which exports #'encoding-external-format, or a list of a package and a keyworded symbol designating the desired function).
Good suggestion: I've refactored the external-format extraction to happen inside the around-compile hook. But yes, the hook is intended as a global hook to be used once, by a global asdf extension called asdf-encodings, to be written. The reason to make it an extension rather than put it all in asdf is that I expect external-format support to be a long and painful thing to write to support all encodings on all implementations; I'd rather that be done outside of ASDF, because it's a lot of code I'd rather not put in ASDF, the development cycles are different, and it shouldn't matter for the vast majority of us who'll use the default settings (i.e. UTF-8).
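To illustrate the scoping concern, here is a self-contained sketch in plain Common Lisp. The variable and function names are hypothetical stand-ins, not ASDF's actual API; the point is only the difference between setting a hook globally and binding it for the extent of one call.

```lisp
;; Hypothetical stand-in for a global hook variable (NOT ASDF's real one).
;; By default, every encoding is mapped to :utf-8.
(defvar *external-format-hook*
  (lambda (encoding) (declare (ignore encoding)) :utf-8))

(defun external-format-for (encoding)
  "Ask the current hook which external format to use for ENCODING."
  (funcall *external-format-hook* encoding))

;; A global (setf *external-format-hook* ...) would affect everything
;; loaded afterwards. Binding it with LET limits the change to the
;; dynamic extent of THUNK, and the old value is restored on exit.
(defun call-with-local-hook (hook thunk)
  (let ((*external-format-hook* hook))
    (funcall thunk)))
```

An extension that is meant to be installed once, globally, would use the setf form; a system that only wants to affect its own compilation would use something like call-with-local-hook.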
(By the way, I wouldn't call a hooked function a hook, so that #'default-encoding-external-format-hook would be #'default-encoding-external-format.)
Good suggestion. Renamed in 2.20.7.
The last issue relates to the strictness of the default-encoding-external-format. Probably it's all right, but then wouldn't it be good to define a permissive alternative which behaves like in 2.20.2?
I'm not sure who's to gain what with that. If you're writing a .asd, you know what charset your code is using. If it's UTF-8 indeed, why would you want to reduce the number of cases in which your charset is correctly recognized? And if it's not UTF-8, you're probably having trouble with "bug" reports from all those SBCL + utf-8 users around today. Or maybe you don't have end-users, and want to force your local encoding; if such cases exist around the world, we might need a solution quicker than expected; and so you've convinced me to add support for an explicit :default as a valid encoding for backwards-compatibility purposes only.
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org Apparently a government can prevent itself and its successors indefinite from doing bad things, just by writing a note to itself that says "don't do bad things." — Mencius Moldbug on constitutions