Hi,
(I can't remember if this has been suggested or shot down before; I might have a need for something like this soon, so I'd appreciate guidance on it.)
How hard do people think it would be for Drei to provide something like a regexp-token-syntax class, so that it becomes straightforward to define simple tokenizing syntaxes?
An example might help; I'd like to be able to do
(define-token-syntax chord-ontology () ("(([A-G])(s|ss|b|bb)?):(([0-9](,[0-9])*))" ((root pitch accidental) pitches)) (t error-token))
and have that be essentially all that's necessary to be able to use the syntax in interesting ways, such as:
* being able to do (accept 'chord-ontology-token) and have the user input coloured in red unless it's valid, and be given the parse tree on return;
* being able to use Chord Ontology syntax in a buffer, where there is no parse per se, but the buffer is lexed according to the token syntax.
Is this sensible, worthwhile, possible, easy? I would have thought that we would have essentially all the components already, and it was a matter of gluing them together, but maybe I'm forgetting a snag?
Cheers,
Christophe
[ In my example, the nesting in the symbolic names is intended to correspond to register groups; in that example, "As:(1,3)" is intended to generate the equivalent of
(make-instance 'chord-ontology-token 'root (make-instance 'root 'pitch "A" 'accidental "s") 'pitches "1,3")
I hope that my intent is reasonably clear. ]
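To make the intended mapping from register groups to a nested token concrete, here is a minimal sketch in Python rather than in Drei or Common Lisp, so it says nothing about how define-token-syntax would actually be implemented. The names Root, ChordToken and parse_chord are hypothetical. It also assumes that the parentheses around the pitch list are meant literally, so they are escaped in the pattern; the regexp as posted treats them as a group.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: the parentheses around the pitch list are literal characters,
# so they are escaped here (the posted regexp uses them as a group instead).
CHORD_RE = re.compile(r"(([A-G])(s|ss|b|bb)?):\(([0-9](,[0-9])*)\)")


@dataclass
class Root:  # hypothetical stand-in for the 'root' class
    pitch: str
    accidental: Optional[str]


@dataclass
class ChordToken:  # hypothetical stand-in for 'chord-ontology-token'
    root: Root
    pitches: str


def parse_chord(text: str) -> Optional[ChordToken]:
    """Return a nested token for valid input, or None (the error case)."""
    m = CHORD_RE.fullmatch(text)
    if m is None:
        return None
    # Group 1 is the whole root, groups 2 and 3 its parts, group 4 the pitches.
    return ChordToken(Root(m.group(2), m.group(3)), m.group(4))


print(parse_chord("As:(1,3)"))
print(parse_chord("H:(1,3)"))
```

A None result is where an (accept ...) style front end would colour the input red; a non-None result is the parse tree handed back to the caller.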
Christophe Rhodes csr21@cantab.net writes:
> - being able to do (accept 'chord-ontology-token) and have the user input coloured in red unless it's valid, and be given the parse tree on return;
This sounds very useful for all the various kinds of special-purpose textual input formats applications are likely to use.
> - being able to use Chord Ontology syntax in a buffer, where there is no parse per se, but the buffer is lexed according to the token syntax.
I don't understand this. How would you have a buffer syntax with "no parse per se"?
> Is this sensible, worthwhile, possible, easy? I would have thought that we would have essentially all the components already, and it was a matter of gluing them together, but maybe I'm forgetting a snag?
I would think this is possible. We already have a state machine, cl-automaton, and I think it can be adapted for this purpose.
Troels Henriksen athas@sigkill.dk writes:
> Christophe Rhodes csr21@cantab.net writes:
>> - being able to use Chord Ontology syntax in a buffer, where there is no parse per se, but the buffer is lexed according to the token syntax.
> I don't understand this. How would you have a buffer syntax with "no parse per se"?
Sorry, I meant that there would be no parse for the buffer beyond a sequence of tokens.
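As an illustration of a buffer whose only "parse" is a flat sequence of tokens, here is a small Python sketch (not Drei code). The function name lex is hypothetical, and so is the assumption that tokens are separated by whitespace; the pattern again assumes literal parentheses around the pitch list. Anything that does not match is tagged as an error token, in the spirit of the (t error-token) clause.

```python
import re

# Assumption: literal parentheses around the pitch list, as in "As:(1,3)".
CHORD_RE = re.compile(r"[A-G](?:ss|s|bb|b)?:\([0-9](?:,[0-9])*\)")


def lex(buffer: str):
    """Return a flat list of (kind, text, offset) tokens for the buffer.

    Assumption: tokens are whitespace-separated. Any word that is not a
    valid chord becomes an "error" token instead of aborting the lex.
    """
    tokens = []
    for m in re.finditer(r"\S+", buffer):
        word = m.group()
        kind = "chord" if CHORD_RE.fullmatch(word) else "error"
        tokens.append((kind, word, m.start()))
    return tokens


for token in lex("As:(1,3) X:(9) C:(1)"):
    print(token)
```

The offsets are kept so that a redisplay layer could colour each token in place; there is no structure above the token level.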
>> Is this sensible, worthwhile, possible, easy? I would have thought that we would have essentially all the components already, and it was a matter of gluing them together, but maybe I'm forgetting a snag?
> I would think this is possible. We already have a state machine, cl-automaton, and I think it can be adapted for this purpose.
OK, good.
(Annoyingly, I realised after I posted that some of the mini languages that I have a need to support include elements that are not regular: the common case is properly-nesting balanced comments. I don't know the capabilities of cl-automaton; can it represent such things?)
Cheers,
Christophe
Christophe Rhodes csr21@cantab.net writes:
> Sorry, I meant that there would be no parse for the buffer beyond a sequence of tokens.
Oh yeah, this should be trivial. A parse tree of some sort is always necessary for redisplay and other things, hence my question.
> (Annoyingly, I realised after I posted that some of the mini languages that I have a need to support include elements that are not regular: the common case is properly-nesting balanced comments. I don't know the capabilities of cl-automaton; can it represent such things?)
I'm pretty sure cl-automaton as it stands only implements regular expressions. I do not know whether it can reasonably be extended or not, but for simplish things like balancing, I'd wager it can.
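For reference, the reason properly nesting comments are not regular is that the lexer has to count open delimiters, which a finite automaton cannot do for unbounded depth. A depth counter alongside the scanner is enough, though. The Python sketch below shows only that idea; it is not an extension of cl-automaton, and the function name skip_nested_comment and the Lisp-style #| ... |# delimiters are assumptions made for the example.

```python
from typing import Optional


def skip_nested_comment(text: str, start: int,
                        open_: str = "#|", close: str = "|#") -> Optional[int]:
    """Return the index just past the comment opening at `start`.

    Returns None if the comment is never closed. The delimiters are an
    assumption for illustration; any distinct pair works the same way.
    """
    if not text.startswith(open_, start):
        raise ValueError("no comment opener at the given position")
    depth = 0
    i = start
    while i < len(text):
        if text.startswith(open_, i):
            depth += 1           # entering one more nesting level
            i += len(open_)
        elif text.startswith(close, i):
            depth -= 1           # leaving one nesting level
            i += len(close)
            if depth == 0:
                return i         # the outermost comment is now balanced
        else:
            i += 1
    return None                  # ran off the end while still nested


sample = "#| a #| b |# c |# d"
end = skip_nested_comment(sample, 0)
print(sample[:end])
```

A regexp-driven lexer could call such a routine whenever it sees a comment opener and resume regular matching at the returned index.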