Agreed, and until the dust settles, "strict mode" cannot be the default. I'd argue it should become the default eventually (i.e. next year).
I doubt that's even feasible, since the strict mode will have to cross systems. It will be easier to have systems say "I know I behave according to the strict policy, so I will LOCALLY ask to be built and loaded strictly."
The problem is that strictness is a GLOBAL thing. Unless every single library is strict, any readtable pollution will put the whole build at risk, and we're back to the tight readtable restrictions (i.e. no changing a standard character, no two libraries changing the same character). If I want my system to go beyond these restrictions, I have to ensure that all other libraries are protected by default.
It's like changing the default encoding from :default to :utf-8: if I change the implementation's current *external-format* to :latin1, and all those utf-8 libraries are not marked :encoding :utf-8 and the default is still :default, then there's going to be mojibake. Worse: if I need to change the current encoding to EBCDIC because that's what my program does, then all those libraries are going to not even compile correctly. The only sane thing is if files get compiled in the encoding they are intended to be compiled with, rather than "whatever the user happens to be using right now".
So yes, we could *demand* that every single system should specify its readtable, but that'd be *more* disruptive than providing a sane default, and *more* error-prone if it goes unenforced.
I think one reason you think that this change is feasible and I don't is that you have a stronger sense that there's a kind of "one library/system == one ASDF file."
No. I have a sense that it's trivial for those who depend on the current behavior to fix their systems, just like it was trivial for people using non-utf-8 encodings to specify :encoding :latin1 or :encoding :shift-jis, etc. I understand that it's breaking compatibility, which is a big migration matter even for a small change; but that's for the sake of making the build much more robust and of enabling things that were not possible before.
In our bigger multi-component systems that are based on ASDF, I find that we typically end up with multiple ASDF systems, because we have subsystems that use different libraries, and want to specify their own dependencies.
AFAIK you can't do this without introducing new top level systems, since modules cannot depend on libraries.
A few simple solutions:
1- tight knit systems can go in the same .asd file using secondary/system syntax, and they'll all inherit the same *readtable* from the .asd file.
2- otherwise, use named-readtables or cl-syntax and have each system change the syntax in its first file. That's the sane thing to do, and it's a two line change per user system, and a cleanup in whichever system defines the new readtable.
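To make solution 2 concrete, here is a minimal sketch in plain Common Lisp of a system keeping its syntax in a private readtable rather than modifying the current one. The names `read-bracket-list`, `*my-system-readtable*` and `read-with-my-syntax` are made up for illustration. With named-readtables, the per-file part would instead be an `in-readtable` form right after `in-package`.

```lisp
;; Hypothetical reader macro: [a b c] reads as (list a b c).
(defun read-bracket-list (stream char)
  (declare (ignore char))
  (cons 'list (read-delimited-list #\] stream t)))

;; The system's private readtable: a modified COPY of the standard one,
;; so neither the standard readtable nor the user's REPL readtable changes.
(defparameter *my-system-readtable*
  (let ((rt (copy-readtable nil)))
    (set-macro-character #\[ #'read-bracket-list nil rt)
    ;; Make ] a terminating delimiter by giving it the behavior of ).
    (set-macro-character #\] (get-macro-character #\) rt) nil rt)
    rt))

;; Reading with the private syntax only inside a dynamic binding.
(defun read-with-my-syntax (string)
  (let ((*readtable* *my-system-readtable*))
    (read-from-string string)))

;; In a source file, the equivalent per-file switch would look like:
;;   (eval-when (:compile-toplevel :load-toplevel :execute)
;;     (setf *readtable* *my-system-readtable*))
;; COMPILE-FILE and LOAD bind *READTABLE* around each file, so such an
;; assignment does not outlive the file in which it appears.
```

For example, `(read-with-my-syntax "[1 2 3]")` returns the form `(list 1 2 3)`, while the caller's own `*readtable*` stays untouched.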
I guess we could even automatically detect the offenders:
1- check that the *readtable* is either the initial readtable or the standard one, at the end of the build. If not, then the system has leaked readtable state.
2- (harder) check that the initial readtable hasn't changed. That's easier if we always use the standard read-only readtable initially, but that is already a form of strictness that might break a lot of clients — we shouldn't enable that by default until after we get a clean cl-test-grid and give users a lot of notice.
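A rough sketch of both checks, under stated assumptions: `macro-character-snapshot` and `check-readtable-hygiene` are hypothetical helpers, not ASDF functions, and the thunk stands in for "the build". Check 1 here only compares against the initial readtable object, since there is no portable way to recognize "the standard one" by identity. Check 2 only looks at macro characters with codes below 128 and ignores dispatch sub-characters, so it can miss some modifications.

```lisp
(defun macro-character-snapshot (readtable &optional (limit 128))
  "List (char function non-terminating-p) for each macro character of
READTABLE whose code is below LIMIT. Dispatch sub-characters are not covered."
  (loop for code below limit
        for char = (code-char code)
        for entry = (and char
                         (multiple-value-list
                          (get-macro-character char readtable)))
        when (first entry)
          collect (cons char entry)))

(defun check-readtable-hygiene (thunk)
  "Call THUNK, which stands in for a build. Return two booleans:
whether *READTABLE* now holds a different object (check 1), and
whether the initial readtable object itself was mutated (check 2)."
  (let* ((initial *readtable*)
         (before (macro-character-snapshot initial)))
    (funcall thunk)
    (values (not (eq *readtable* initial))
            (not (equal before (macro-character-snapshot initial))))))
```

A build that does `(setf *readtable* ...)` trips the first value; a build that calls `set-macro-character` on the current readtable trips the second.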
That's why, for example, we have systems that bleed readtable across their boundaries: the systems really aren't stand alone entities, but limitations on ASDF expressiveness, and the desire to get more modularity than the (necessarily in-line) :MODULE will permit, causes a proliferation of systems that are NOT libraries, and have meaning only in context.
I'm not going to move ASDF towards breaking such systems.
These systems can be trivially fixed with a two line change, and this will enable a world where every system, library or not, can freely pick their favorite readtable and not be afraid of breaking other things.
Unhappily, strict mode is a global flag: the question is "which readtable is this system going to be read with?". The only reasonable answer is: the readtable it was meant to be read with, which the author knows, and which should be the standard readtable by default, unless explicitly overridden by the author. The backward-compatible answer (if it's not backward, it's not compatible) is "whichever readtable was active at the time", with sometimes comical consequences, especially when the user was using a non-standard one at the REPL.
I don't see that.
If you know you aren't going to want to bleed readtable entries out of your library, and you don't want stuff creeping in, it seems to me eminently possible to mark your system as strict-mode wrt the readtable.
Why is that impossible?
Nobody EVER wants "stuff creeping in", just like a program written in utf-8 NEVER, EVER wants to be compiled in EBCDIC mode, or even latin1 for that matter.
Whatever readtable you designed your code to be read with, by definition, you don't want your code to be read with another one. At no point does anyone ever want the current modified REPL readtable to leak into the code he is compiling from the REPL. If I use EBCDIC, I still want my libraries to compile using whatever encoding they were designed to be read with. And I do want to be able to use EBCDIC or a new readtable.
There are 700 libraries in Quicklisp, and not a single one of them "wants stuff creeping in"; yet if safety is an "opt in" feature, then every single one you use needs to opt in to safety so that you may safely change the readtable at the REPL and call anything that might load a system.
The entire readtable feature is crippled if the coding conventions preclude any significant modification and make any concurrent use of the same character an error.
I think the point you don't see is that the readtable *of the REPL* is going to affect every single system being compiled, and there is no way to opt out.
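A small illustration of that leak, as a sketch only: `read-library-form` and `demo-repl-leak` are made-up names, and `read-from-string` stands in for the reader calls made by `compile-file` and `load`. The simulated user has merely switched the REPL readtable to case-preserving mode.

```lisp
(defun read-library-form (source &key hygienic)
  "Read SOURCE either with the ambient *READTABLE* or, when HYGIENIC is
true, with a fresh copy of the standard readtable."
  (if hygienic
      (let ((*readtable* (copy-readtable nil)))
        (read-from-string source))
      (read-from-string source)))

(defun demo-repl-leak ()
  "Simulate a user whose REPL readtable preserves case, then read the
same library source both ways and return the names of the head symbols."
  (let ((*readtable* (copy-readtable nil))
        (library-source "(car '(1 2 3))"))
    (setf (readtable-case *readtable*) :preserve)
    (list (symbol-name (first (read-library-form library-source)))
          (symbol-name (first (read-library-form library-source
                                                 :hygienic t))))))
```

`(demo-repl-leak)` returns `("car" "CAR")`: read with the user's readtable, the library's `car` is no longer the standard `CAR`, and the library had no way to opt out.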
OK, I have two (separable or combinable) proposals that might provide enough hygiene to allow for radical readtable modification while still allowing for the traditional unhygienic style of development.
My main constraint is as follows:
0- what readtable a system is compiled with must not depend on anything but its declared dependencies, and in particular MUST NOT EVER be affected by whichever readtable the user is currently using at the REPL.
In other words, the proposals must enforce the constraint that the readtable used to compile a system depends only on its declared dependencies.
A compromise situation might be as follows:
1- ASDF maintains an asdf:*global-readtable*, which is the *readtable* object at the time it was loaded.
2- This *global-readtable* is subject to the current restrictions: A- no modifying any standard character; B- no two dependencies assigning different meanings to the same non-standard character; C- libraries need to document any change to the readtable; D- free software libraries will register these changes on the page on cliki.
3- Unhappily, there is no cheap way to enforce these restrictions, but that's no regression with respect to the current situation.
4- ASDF wraps any compile-op and load-source-op in this asdf:*global-readtable*, but probably not load-op, to preserve combine-fasl linking semantics.
5- Systems that want to do crazier things with the readtable that may violate (2) must arrange to use their own private readtable, but can otherwise do it safely. It is an error (unhappily not enforceable) to modify the current readtable in these ways.
6A- ASDF binds *readtable* to the *global-readtable* at the start of every system's compilation (and loading?), and around the entire asdf:operate, leaving the *readtable* unchanged at the end.
This easily supports systems that "modify the current readtable data structure".
However, that doesn't support systems that "bind *readtable* to a new value", because the changes they make will shadow the changes that other systems following this style make and depend on. To allow such an idiom, we must also do the following:
6B- ASDF binds *readtable* to a proper "entry readtable" at the start of every system's compilation, and records an "exit readtable" at the end of the system's loading.
7- maintain a partial order on these readtable objects, assuming that each system's exit readtable supersedes its entry readtable. The least readtable is the *global-readtable*. It's enough to store, for each new exit readtable, the set of its inferior readtables, as a list, an eq hash-table, or an equal hash-table, with each readtable being identified by the name of the system that created it.
8- before a system is compiled or loaded, compute the maximum readtable of all the exit readtables of its dependencies. If this maximum is unique, then it will be the entry readtable of the system. If there is not a unique readtable that is more than all the other ones, that's an error, and we refuse to load the system.
9- after a system is loaded, check its exit readtable: if it already exists, check that this doesn't create a cycle, or else issue an error; if it doesn't already exist, add it to the set of all known exit readtables.
10- ASDF either A- binds *readtable* to the *global-readtable* around the entire asdf:operate, leaving the *readtable* unchanged at the end, or B- always side-effects *readtable* to correspond to the exit readtable of the loaded system, or C- operate does the binding-around thing, but load-system does the side-effect after it's done operate'ing.
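The bookkeeping of items 7 to 9 could be sketched roughly as follows. This is only a model under assumptions: readtables are represented by names alone (`:global` for the *global-readtable*, otherwise the name of the system that created the exit readtable), only direct inferiors are stored, and `*direct-inferiors*`, `supersedes-p`, `entry-readtable` and `record-exit-readtable` are hypothetical names, not ASDF API.

```lisp
(defvar *direct-inferiors* (make-hash-table :test 'equal)
  "Maps a readtable name to the names of the readtables it directly supersedes.")

(defun supersedes-p (a b)
  "True if readtable A is B, or is known to supersede B (item 7).
Terminates because RECORD-EXIT-READTABLE never stores a cycle."
  (or (equal a b)
      (equal b :global)               ; the global readtable is the least one
      (some (lambda (inferior) (supersedes-p inferior b))
            (values (gethash a *direct-inferiors*)))))

(defun entry-readtable (dependency-exits)
  "Item 8: return the unique maximum among the exit readtables of the
dependencies, or signal an error if no readtable supersedes all others."
  (let ((candidates (remove-duplicates (cons :global dependency-exits)
                                       :test #'equal)))
    (or (find-if (lambda (candidate)
                   (every (lambda (other) (supersedes-p candidate other))
                          candidates))
                 candidates)
        (error "No unique maximum readtable among ~S" dependency-exits))))

(defun record-exit-readtable (exit entry)
  "Item 9: record that EXIT supersedes ENTRY, refusing to create a cycle.
A system that leaves the readtable alone has EXIT equal to ENTRY."
  (unless (equal exit entry)
    (when (supersedes-p entry exit)
      (error "Readtable ~S cannot supersede ~S: that would create a cycle"
             exit entry))
    (pushnew entry (gethash exit *direct-inferiors*) :test #'equal))
  exit)
```

For instance, if system "a" starts from `:global` and system "b" starts from the exit readtable of "a", then a system depending on both gets "b" as its entry readtable, whereas a system depending on "b" and on an unrelated readtable-binding system "c" is refused for lack of a unique maximum.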
Does that strike you as complex? Because it is. That's the price of *safely* supporting this "systems can bind a new value to *readtable*" style. Unhappily some of the constraints are not enforceable (2A and 2B), but that's the very same as now.
So my next question is: do you want to safely support these conventions? Do your systems modify the current *readtable* structure, or do they bind *readtable* to a new value?
PS: thank you for making me come up with better solutions. I care about enough hygiene to use readtables safely, and I also care about supporting legacy systems if possible.
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org An insult may sometimes adequately fit the person who is insulted. However, it can only ever possibly tarnish but the person who insults.