Request for comments, including grammatical and stylistic corrections.
-T.
Issue: WITH-READTABLE-ITERATOR
Forum: Common Lisp Document Repository (CDR)
Status: Draft.
References
* CLHS 2.1.1 Readtables
* X3J13 Issue #188
* `with-package-iterator' (macro)
* `with-hash-table-iterator' (macro)
Category: Addition.
Edit History: 02-Oct-2008 by Rittweiler (Draft)
Problem Description:
Even though the ANSI Common Lisp standard provides simple getters for readtables (`get-macro-character', `get-dispatch-macro-character'), the standard does not provide any means to efficiently get at all the macro characters and dispatch macro characters defined in a readtable.
The omission of any iteration facility for readtables makes readtables unnecessarily opaque, and keeps users from writing libraries that try to deal with readtables in a general way. For example, the author discovered that this lack of an iteration form is the main obstacle to writing an otherwise portable library that establishes an organizational namespace for readtables akin to the namespace that is provided for packages.
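To illustrate the problem: as far as the author of this example can tell, the only portable way to enumerate macro characters with the standard getters is an exhaustive scan over every character code. The following is a minimal sketch under that assumption; the function name `list-macro-characters' is hypothetical and not part of the proposal.

```lisp
;; Sketch only: LIST-MACRO-CHARACTERS is a hypothetical helper, not part
;; of the proposal. It probes every character code, which is the kind of
;; inefficiency a real iteration facility would avoid.
(defun list-macro-characters (&optional (readtable *readtable*))
  "Return all macro characters of READTABLE, found by exhaustive search."
  (loop for code from 0 below char-code-limit
        for char = (code-char code)     ; CODE-CHAR may return NIL
        when (and char (get-macro-character char readtable))
          collect char))
```

Discovering the sub-characters of each dispatch macro character would need a second scan of the same kind per dispatch macro character.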
Proposal (WITH-READTABLE-ITERATOR:ADD-GENERATOR)
Add a macro `with-readtable-iterator' that establishes a generator in its scope; each invocation of this generator returns a macro character or a dispatch macro character from the readtable the generator was established for--alongside some additional information.
A detailed specification of `with-readtable-iterator' can be found in the appendix of this document.
Rationale
The proposed macro `with-readtable-iterator' represents a general iteration facility for readtables that can be used to implement a variety of iteration forms.
The proposal is closely modelled on the macros `with-hash-table-iterator' and `with-package-iterator'. The semantics of `with-readtable-iterator' should thus be intuitive to any experienced Common Lisp programmer.
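For readers less familiar with the existing generator macros, here is a small sketch of the protocol as used with the standard `with-hash-table-iterator'. The function name `collect-hash-table-keys' is made up for this illustration.

```lisp
;; Collect the keys of a hash table with the standard generator macro.
;; COLLECT-HASH-TABLE-KEYS is a hypothetical name used for illustration.
(defun collect-hash-table-keys (table)
  (with-hash-table-iterator (next-entry table)
    (let ((keys '()))
      (loop
        ;; First value: true while an entry is returned; then key, value.
        (multiple-value-bind (more? key value) (next-entry)
          (declare (ignore value))
          (unless more? (return keys))
          (push key keys))))))
```

The proposed macro follows the same shape: a locally defined name is invoked repeatedly until its first return value is false.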
Notes
The proposal deliberately says nothing about the home package of the symbol `with-readtable-iterator'. However, implementors are encouraged to export this symbol from their extensions package (often called "`EXT'") or another appropriate package--unless a later CDR document specifies a more explicit location.
Current Practice
No implementation the author is aware of provides a way to iterate through a readtable.
The author implemented the proposal for SBCL (http://www.sbcl.org/), and sent the relevant patches upstream; the patches are currently waiting to be integrated into mainline. Ariel Badichi implemented the proposal for CLISP (http://clisp.cons.org/), and is going to send his work upstream shortly. Stephen Compall did an implementation for Clozure CL (http://ccl.clozure.com/) which needs to be somewhat revised to fully conform to the proposal as presented in this document.
Cost to Implementors
The macro `with-readtable-iterator' should be straightforwardly implementable. Extrapolating from actual experience, people--who were previously not acquainted with the relevant code sections--were able to implement it in a couple of hours.
Discussion
Ariel Badichi proposed coalescing a generator's fourth return value (indicating if the returned character is a dispatch macro character) with its first return value (indicating if the generator is exhausted.)
There is technically no reason that speaks against doing so; in fact, a generator would return one value less this way--which may lead to positive performance characteristics on register-anemic processor architectures.
Stephen Compall and the author opposed such a change mostly for idiomatic reasons, as both `with-hash-table-iterator' and `with-package-iterator', the generator-establishing macros specified by the ANSI standard, return a purely boolean exhaustion flag as first value. In particular, `with-package-iterator' does _not_ coalesce the accessibility type (third return value) with the exhaustion flag (first return value.)
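The idiom referred to here can be seen in a short sketch using the standard `with-package-iterator'. The function name `external-symbol-names' is hypothetical; the point is only that the exhaustion flag and the accessibility type arrive as separate values.

```lisp
;; EXTERNAL-SYMBOL-NAMES is a hypothetical name used for illustration.
(defun external-symbol-names (package)
  "Return the names of the external symbols of PACKAGE."
  (with-package-iterator (next-symbol package :external)
    (let ((names '()))
      (loop
        (multiple-value-bind (more? symbol accessibility) (next-symbol)
          ;; MORE? only says whether a symbol was returned; the kind of
          ;; symbol is reported separately in ACCESSIBILITY.
          (unless more? (return names))
          (when (eq accessibility :external)
            (push (symbol-name symbol) names)))))))
```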
The author notes that allowing `:terminating', and `:non-terminating' as valid MACRO-CHAR-TYPES was considered, but rejected for reasons of simplicity. It is not apparent that there is a real necessity for supporting these out of the box.
Acknowledgements
The author wants to specially credit Ariel Badichi and Stephen Compall who, in the spirit of true hackerism, promptly agreed to hack an early version of the proposal into the implementations of their choice.
Appendix
-- Macro: with-readtable-iterator (name readtable &rest macro-char-types) declaration* form* => results
Arguments and Values
....................
NAME - A symbol.
READTABLE - A form, evaluated once to produce a readtable.
MACRO-CHAR-TYPE - One of the symbols `:macro-char', or `:dispatch-macro-char'.
DECLARATION - A `declare' expression; not evaluated.
FORMS - An implicit progn.
RESULTS - The values of the FORMS.
Description
...........
Within the lexical scope of the body FORMS, the NAME is defined via `macrolet' such that successive invocations of `(name)' will return the macro characters, one by one, from the READTABLE. The order of the macro characters returned is implementation-dependent.
An invocation of `(name)' returns the following five values:
1. A generalized boolean that is true if a macro character is returned.
2. A macro character that is defined in READTABLE.
3. A reader macro function of the macro character returned.
4. A generalized boolean that is true if the macro character is a dispatch macro character.
5. An association list between the "sub-characters" of the dispatch macro character and their reader macro functions.
After all macro characters have been returned by successive invocations of `(name)', only one value is returned, namely `nil'.
It is unspecified what happens if any of the implicit interior state of an iteration is returned outside the dynamic extent of the `with-readtable-iterator' form such as by returning some closure over the invocation form.
In spirit of CLHS 3.6, consequences are undefined if READTABLE is modified except for modification of the current macro character under traversal.
Exceptional Situations
......................
Signals an error of type `program-error' if a MACRO-CHAR-TYPE is supplied that is not recognized by the implementation.
See Also
........
Traversal Rules and Side Effects (CLHS 3.6), `with-package-iterator'
"Tobias C. Rittweiler" tcr-0l3vezLb4dgb1SvskN2V4Q@public.gmane.org writes:
Ariel Badichi proposed coalescing a generator's fourth return value (indicating if the returned character is a dispatch macro character) with its first return value (indicating if the generator is exhausted.)
There is technically no reason that speaks against doing so; in fact, a generator would return one value less this way--which may lead to positive performance characteristics on register-anemic processor architectures.
Stephen Compall and the author opposed such a change mostly for idiomatic reasons, as both `with-hash-table-iterator' and `with-package-iterator', the generator-establishing macros specified by the ANSI standard, return a purely boolean exhaustion flag as first value. In particular, `with-package-iterator' does _not_ coalesce the accessibility type (third return value) with the exhaustion flag (first return value.)
I will briefly summarize my full position on the subissue of whether the fourth value's type should be boolean rather than (member :macro-char :dispatch-macro-char).
There is no antonym in common use for "dispatching" with respect to macro characters among Lisp programmers. As such, the only way to say that a macro character is not dispatching while ensuring that your meaning is clear to others who are working with readtables is to say "not dispatching". So our partition of macro character types is "dispatching" and "not dispatching", which has a linguistically natural mapping to boolean true and false.
In spirit of CLHS 3.6, consequences are undefined if READTABLE is modified except for modification of the current macro character under traversal.
What rules are there for modification when modifying the current macro character? Under the CCL implementation:
  unbind the character¹                             No value changes.
  make non-dispatching macro character dispatching  No value changes.
  add new sub-character                             No value changes.
  set existing sub-character to new function        Mutates relevant CDR in the fifth value to new function.
  unbind existing sub-character¹                    Mutates relevant CDR in the fifth value to NIL.
Likewise, there should also be a rule about modifying structure obtained via the generator.
¹ ANSI for set-macro-character and set-dispatch-macro-character does not seem to specify what passing NIL as the function means, by my brief research, or even to allow it (as NIL is not a function designator).
Stephen Compall writes:
[Issue WITH-READTABLE-ITERATOR:]
In spirit of CLHS 3.6, consequences are undefined if READTABLE is modified except for modification of the current macro character under traversal.
What rules are there for modification when modifying the current macro character? Under the CCL implementation:
Good call. SET-MACRO-CHARACTER, and SET-DISPATCH-MACRO-CHARACTER are allowed to be invoked on the current macro char (second return value.)
All other operations may alter the internals of the readtable in a way that could lead to strange behaviour.
I'll also substitute "consequences are undefined" with "consequences are unspecified"; this makes it clearer that implementations can loosen the constraints, if it suits their representation of readtables.
Likewise, there should also be a rule about modifying structure obtained via the generator.
Indeed, it should!
¹ ANSI for set-macro-character and set-dispatch-macro-character does not seem to specify what passing NIL as the function means, by my brief research, or even to allow it (as NIL is not a function designator).
That is true. The portable way to "disable" a reader macro character is to use
(set-syntax-from-char to-char #\A to-readtable (copy-readtable nil))
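As a minimal sketch of that idiom, the form above can be wrapped in a small helper. The name `disable-macro-character' is hypothetical, and the example works on a scratch copy so that neither the current nor the standard readtable is touched.

```lisp
;; Sketch only: DISABLE-MACRO-CHARACTER is a hypothetical helper name.
(defun disable-macro-character (char &optional (readtable *readtable*))
  "Make CHAR an ordinary constituent in READTABLE by copying the syntax
of #\\A from the standard readtable."
  (set-syntax-from-char char #\A readtable (copy-readtable nil))
  readtable)

;; Define #\! as a macro character in a scratch readtable, then turn it
;; back into a constituent.
(let ((rt (copy-readtable nil)))
  (set-macro-character #\! (lambda (stream char)
                             (declare (ignore stream char))
                             :bang)
                       nil rt)
  (disable-macro-character #\! rt)
  (get-macro-character #\! rt))   ; #\! is no longer a macro character
```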
-T.
"Tobias C. Rittweiler" tcr@freebits.de writes:
Ariel Badichi proposed coalescing a generator's fourth return value (indicating if the returned character is a dispatch macro character) with its first return value (indicating if the generator is exhausted.)
An example of the results returned by the generator, according to my proposal, would be:
:MACRO-DISPATCH-CHAR, #\#, #<FN>, (...)
Rather than:
T, #\#, #<FN>, T, (...)
I find the former more intelligible. I also find, however, that Stephen Compall has made a strong argument in support of the interface proposed in the draft. If I understand him correctly, it is that there are (and always will be) just two kinds of macro characters, dispatching and non-dispatching, and that by using a boolean to discriminate between them, we save the user the work of mapping them herself in many cases. In light of this argument, I am ready to cope with the unpleasant consequences of the draft's proposal.
Exceptional Situations
......................
Signals an error of type `program-error' if a MACRO-CHAR-TYPE is supplied that is not recognized by the implementation.
The outcome of having a non-symbol given as a generator name is not specified. Might it also be the signaling of a `program-error' condition, or should it remain unspecified?
The outcome of having a non-readtable given for iteration is not specified. Should it be the signaling of an error, or should it remain unspecified?
Ariel
Ariel Badichi abadichi-XgcMedQSbuTk1uMJSBkQmQ@public.gmane.org writes:
The outcome of having a non-symbol given as a generator name is not specified. Might it also be the signaling of a `program-error' condition, or should it remain unspecified?
If it were specified at all, program-error would be best. But I don't think it's likely to be a problem regardless.
The outcome of having a non-readtable given for iteration is not specified. Should it be the signaling of an error, or should it remain unspecified?
I offer two extensions to the proposed standard:
1. Make READTABLE a "readtable designator", having its meaning from the CLHS (where NIL indicates the initial readtable).
2. Specify that a type-error shall be signaled when READTABLE does not evaluate to a readtable designator.
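A minimal sketch of what these two points could amount to in an implementation, assuming the CLHS glossary meaning of "readtable designator" (NIL or a readtable). The helper name `resolve-readtable-designator' is hypothetical.

```lisp
;; Sketch only: RESOLVE-READTABLE-DESIGNATOR is a hypothetical helper.
(defun resolve-readtable-designator (designator)
  "Return a readtable for DESIGNATOR, or signal a TYPE-ERROR."
  (check-type designator (or null readtable))
  ;; NIL designates the standard readtable; a fresh copy stands in for it
  ;; here because the standard readtable itself must not be modified.
  (or designator (copy-readtable nil)))
```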
Stephen Compall s11@member.fsf.org writes:
I offer two extensions to the proposed standard:
Make READTABLE a "readtable designator", having its meaning from the CLHS (where NIL indicates the initial readtable).
Specify that a type-error shall be signaled when READTABLE does not evaluate to a readtable designator.
I changed the proposal accordingly.
-T.
Ariel Badichi abadichi@bezeqint.net writes:
The outcome of having a non-symbol given as a generator name is not specified. Might it also be the signaling of a `program-error' condition, or should it remain unspecified?
I'll leave it unspecified. Type violations of the "Arguments and Values" sections result in undefined behaviour, and fall within the implementation's realm.
The outcome of having a non-readtable given for iteration is not specified. Should it be the signaling of an error, or should it remain unspecified?
I'll add that explicitly (it should signal a type error), as it seems useful to be able to rely on it.
-T.
The updated specification of WITH-READTABLE-ITERATOR now looks as follows.
-T.
-- Macro: with-readtable-iterator (name readtable &rest macro-char-types) declaration* form* => results
Arguments and Values
....................
NAME - A symbol.
READTABLE - A form, evaluated once to produce a readtable designator.
MACRO-CHAR-TYPE - One of the symbols `:macro-char', or `:dispatch-macro-char'.
DECLARATION - A `declare' expression; not evaluated.
FORMS - An implicit progn.
RESULTS - The values of the FORMS.
Description
...........
Within the lexical scope of the body FORMS, the NAME is defined via `macrolet' such that successive invocations of `(name)' will return the macro characters, one by one, from the READTABLE. The order of the macro characters returned is implementation-dependent.
The variable MACRO-CHAR-TYPES controls which macro characters are returned:
`:macro-char' All macro characters in READTABLE that are _not_ dispatch macro characters.
`:dispatch-macro-char' All dispatch macro characters in READTABLE.
Multiple occurences of the same symbol are allowed in MACRO-CHAR-TYPES. If MACRO-CHAR-TYPES is null, both `:macro-char' and `:dispatch-macro-char' is assumed.
An invocation of `(name)' returns the following five values:
1. A generalized boolean that is true if a macro character is returned.
2. A macro character that is defined in READTABLE.
3. A reader macro function of the macro character returned.
4. A generalized boolean that is true if the macro character is a dispatch macro character.
5. An association list between the "sub-characters" of the dispatch macro character and their reader macro functions.
After all macro characters have been returned by successive invocations of `(name)', only one value is returned, namely `nil'.
Consequences are undefined if the association list returned as fifth value is modified.
Consequences are undefined if READTABLE is modified in a way that might affect an ongoing traversal operation. Yet conforming programs may modify the current macro character in the readtable under traversal by means of `set-macro-character', and `set-dispatch-macro-character'. [This does not entail the permission to modify the standard readtable; CLHS 2.1.1.2 prevails.]
It is unspecified what happens if any of the implicit interior state of an iteration is returned outside the dynamic extent of the `with-readtable-iterator' form such as by returning some closure over the invocation form.
Any number of invocations of `with-readtable-iterator' can be nested, and the body of the innermost one can invoke all of the locally established macros, provided all those macros have distinct names.
Exceptional Situations
......................
Signals an error of type `program-error' if a MACRO-CHAR-TYPE is supplied that is not recognized by the implementation.
An error of type `type-error' is signalled if READTABLE does not evaluate to a readtable designator.
See Also
........
Readtables (CLHS 2.1.1), Traversal Rules and Side Effects (CLHS 3.6)
Notes
.....
Implementations may extend the syntax of `with-readtable-iterator' by recognizing additional macro character types.
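To make the specification above more concrete, here is a rough portable emulation together with a usage example. It is only a sketch under several assumptions: the names `with-readtable-iterator/sketch', `dispatch-macro-char-p', `sub-char-alist', `readtable-entries' and `dispatch-sub-characters' are all hypothetical; the readtable is snapshotted eagerly by brute force over all character codes (so it is slow and does not observe later modifications); and it relies on `get-dispatch-macro-character' signaling an error for characters that are not dispatch macro characters. A native implementation would walk the readtable's internal tables instead.

```lisp
(defun dispatch-macro-char-p (char readtable)
  "True if CHAR is a dispatch macro character in READTABLE."
  ;; GET-DISPATCH-MACRO-CHARACTER signals an error for other characters.
  (handler-case
      (progn (get-dispatch-macro-character char #\A readtable) t)
    (error () nil)))

(defun sub-char-alist (char readtable)
  "Alist of (SUB-CHAR . FUNCTION) for the dispatch macro character CHAR."
  (loop for code from 0 below char-code-limit
        for sub = (code-char code)
        for fn = (and sub
                      (not (digit-char-p sub))       ; digits are infix parameters
                      (char= sub (char-upcase sub))  ; sub-chars ignore case
                      (get-dispatch-macro-character char sub readtable))
        when fn collect (cons sub fn)))

(defun readtable-entries (designator types)
  "List of (CHAR FUNCTION DISPATCH-P ALIST) for the selected macro characters."
  (check-type designator (or null readtable))  ; signals TYPE-ERROR otherwise
  (let ((readtable (or designator (copy-readtable nil)))
        (types (or types '(:macro-char :dispatch-macro-char))))
    (loop for code from 0 below char-code-limit
          for char = (code-char code)
          for fn = (and char (get-macro-character char readtable))
          for dispatch-p = (and fn (dispatch-macro-char-p char readtable))
          when (and fn (member (if dispatch-p :dispatch-macro-char :macro-char)
                               types))
            collect (list char fn dispatch-p
                          (and dispatch-p (sub-char-alist char readtable))))))

(defmacro with-readtable-iterator/sketch ((name readtable &rest macro-char-types)
                                          &body body)
  ;; Unrecognized macro character types are rejected at macroexpansion time.
  (dolist (type macro-char-types)
    (unless (member type '(:macro-char :dispatch-macro-char))
      (error 'program-error)))
  (let ((entries (gensym "ENTRIES")))
    `(let ((,entries (readtable-entries ,readtable ',macro-char-types)))
       (macrolet ((,name ()
                    ;; Five values while entries remain, then just NIL.
                    '(if ,entries
                         (values-list (cons t (pop ,entries)))
                         nil)))
         ,@body))))

;; Usage: map each dispatch macro character to its defined sub-characters.
(defun dispatch-sub-characters (readtable)
  (with-readtable-iterator/sketch (next-entry readtable :dispatch-macro-char)
    (let ((result '()))
      (loop
        (multiple-value-bind (more? char fn dispatch-p alist) (next-entry)
          (declare (ignore fn dispatch-p))
          (unless more? (return result))
          (push (cons char (mapcar #'car alist)) result))))))
```

Calling `(dispatch-sub-characters nil)' inspects a copy of the standard readtable, in line with the readtable designator convention adopted above.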
"Tobias C. Rittweiler" writes:
The variable MACRO-CHAR-TYPES controls which macro characters are returned:
`:macro-char' All macro characters in READTABLE that are _not_ dispatch macro characters.
`:dispatch-macro-char' All dispatch macro characters in READTABLE.
I'm not perfectly happy with these names. Especially the first one is somewhat misleading. But I don't think I'd be happy with :NON-DISPATCH-MACRO-CHAR either.
But what I wanted to ask is whether these names should perhaps be plural rather than singular?
(with-readtable-iterator (next-entry *readtable* :dispatch-macro-chars) ...)
looks better to my eyes than
(with-readtable-iterator (next-entry *readtable* :dispatch-macro-char) ...)
What do you think?
-T.
"Tobias C. Rittweiler" tcr@freebits.de writes:
The variable MACRO-CHAR-TYPES controls which macro characters are returned: `:macro-char' All macro characters in READTABLE that are _not_ dispatch macro characters. `:dispatch-macro-char' All dispatch macro characters in READTABLE.
I'm not perfectly happy with these names. Especially the first one is somewhat misleading. But I don't think I'd be happy with :NON-DISPATCH-MACRO-CHAR either.
What about using :DISPATCH and :NON-DISPATCH? The absence of a -MACRO-CHAR(S) suffix is analogous to the absence of a -SYMBOL(S) suffix in symbol-types accepted by WITH-PACKAGE-ITERATOR.
Ariel
Some minor corrections:
"Tobias C. Rittweiler" tcr@freebits.de writes:
Multiple occurences of the same symbol are allowed in
Misspelled "occurrences".
MACRO-CHAR-TYPES. If MACRO-CHAR-TYPES is null, both `:macro-char' and
I take it you meant "nil" (or better yet, "the empty list") rather than "null".
- An association list between the "sub-characters" of the dispatch
macro character and their reader macro functions.
"An association list with the sub-chars of the dispatch macro character as keys and their corresponding reader macro functions as values."
An error of type `type-error' is signalled if READTABLE does not
The CLHS uses the alternative spelling, "signaled". Why is this passage in the passive, while the previous one was in the active?
Ariel
Thanks for the corrections!
Ariel Badichi writes:
MACRO-CHAR-TYPES. If MACRO-CHAR-TYPES is null, both `:macro-char' and
I take it you meant "nil" (or better yet, "the empty list") rather than "null".
I used "If M-C-T are not supplied, ...".
An error of type `type-error' is signalled if READTABLE does not
The CLHS uses the alternative spelling, "signaled". Why is this passage in the passive, while the previous one was in the active?
No particular reason.
-T.