On Mon, 14 Feb 2011, William Halliburton wrote:
I know I'm late to the parade but if you need to write grammars that can be easily augmented by other people without them needing to know much, if any, lisp I can recommend the ebnf parser written by Daniel Herring. Its onion of macros expands into pretty understandable code also.
Thank you for the kind words. That was one of my first real CL projects. A great attraction of CL was that I would no longer need to write parser code for DSLs...
I always meant to revisit that library, clean it up, and add some more features (e.g. support for infinite streams and source locations), but I got distracted in the land of continuations and never made it back.
Here's a reply to an earlier post. It was stuck in my drafts folder.
Re the OP: Yes, it sounded like you were abusing the YACC paradigm.
On Sat, 5 Feb 2011, Scott L. Burson wrote:
Boolean grammars solve both these problems. Let's look at how they
work.
...
A few years ago when I was parsing a lot (a motivator towards lisp), I grew to like recursive-descent parsers and EBNF (the ISO 14977 variant).
AFAICT, the tokenization/parsing divide was mostly a performance hack. The tokenizer uses a minimal syntax and thus can run quickly. The parser has more conditions to check, but it compares tokens with EQL, which is faster than comparing strings with STRING=. Thus the two put together run fast, and the factorization also simplifies the specification.
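To make the split concrete, here is a minimal sketch in Python (not code from the ebnf library mentioned above; every name in it is hypothetical). The tokenizer knows only digits and single punctuation characters, and the parser then dispatches on token kinds by identity, which plays roughly the role that EQL on symbols plays in Lisp, instead of re-comparing strings.

```python
import re
from enum import Enum, auto


class Kind(Enum):
    """Token kinds; enum members are singletons, so `is` comparison works."""
    NUM = auto()
    PLUS = auto()
    STAR = auto()
    LPAREN = auto()
    RPAREN = auto()
    END = auto()


PUNCT = {"+": Kind.PLUS, "*": Kind.STAR, "(": Kind.LPAREN, ")": Kind.RPAREN}
TOKEN_RE = re.compile(r"(?P<num>[0-9]+)|(?P<other>\S)")


def tokenize(text):
    """Minimal syntax: runs of ASCII digits, or one non-space character."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        lexeme = m.group()
        if m.lastgroup == "num":
            tokens.append((Kind.NUM, int(lexeme)))
        elif lexeme in PUNCT:
            tokens.append((PUNCT[lexeme], lexeme))
        else:
            raise SyntaxError(f"unexpected {lexeme!r} at offset {m.start()}")
    tokens.append((Kind.END, None))
    return tokens


class Parser:
    """Recursive descent over the token list; evaluates as it parses.

    expr   := term ('+' term)*
    term   := factor ('*' factor)*
    factor := NUM | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() is Kind.PLUS:  # identity test, no string compare
            self.advance()
            value += self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() is Kind.STAR:
            self.advance()
            value *= self.factor()
        return value

    def factor(self):
        kind, payload = self.advance()
        if kind is Kind.NUM:
            return payload
        if kind is Kind.LPAREN:
            value = self.expr()
            if self.advance()[0] is not Kind.RPAREN:
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {kind.name}")


def evaluate(text):
    """Tokenize, parse, and require that all input was consumed."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not Kind.END:
        raise SyntaxError("trailing input")
    return value
```

With these definitions, `evaluate("2 + 3 * (4 + 1)")` returns 17. The parser never looks at the source characters again once the tokenizer has classified them.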
Today, recursive-descent parsers are fast enough for most uses. Add a form that prevents backtracking past a given point, and they can handle infinite streams of data, since input consumed before that point no longer has to be retained. It is easy to convert most formalisms into this form, and nothing is more flexible than "write whatever function you want here".
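As a rough illustration of the "no backtracking past this point" idea, the following Python sketch wraps an arbitrary iterator in a small buffer. It is an assumption about how such a form could look, not the design of any particular library, and the names `BacktrackingStream`, `mark`, `reset`, and `commit` are made up for the example. The toy grammar needs backtracking within a statement, but commits after each one, so memory stays bounded even on an endless input.

```python
import itertools


class BacktrackingStream:
    """Wraps any iterator; retains only items read since the last commit."""

    def __init__(self, source):
        self.source = iter(source)
        self.buffer = []  # items read since the last commit
        self.pos = 0      # read position inside the buffer

    def next(self):
        """Return the next item, or None once the source is exhausted."""
        if self.pos == len(self.buffer):
            self.buffer.append(next(self.source, None))
        item = self.buffer[self.pos]
        self.pos += 1
        return item

    def mark(self):
        return self.pos

    def reset(self, mark):
        self.pos = mark

    def commit(self):
        """Forbid backtracking past here and drop the consumed items.

        Marks taken before a commit become invalid, so this sketch only
        commits between top-level statements.
        """
        del self.buffer[:self.pos]
        self.pos = 0


def literal(stream, expected):
    """Consume `expected` item by item; False on the first mismatch."""
    for ch in expected:
        if stream.next() != ch:
            return False
    return True


def statement(stream):
    """statement := 'ab;' | 'a;'  (ordered choice, with backtracking)."""
    start = stream.mark()
    for alternative in ("ab;", "a;"):
        if literal(stream, alternative):
            return alternative
        stream.reset(start)  # rewind and try the next alternative
    raise SyntaxError("no alternative matched")


def statements(stream):
    """Yield parsed statements indefinitely, committing after each one."""
    while True:
        result = statement(stream)
        stream.commit()
        yield result


# Usage: an endless character stream, parsed lazily.
endless = BacktrackingStream(itertools.cycle("ab;a;"))
first_five = list(itertools.islice(statements(endless), 5))
```

Here `first_five` is `["ab;", "a;", "ab;", "a;", "ab;"]`, and the buffer is empty after each commit. Without the `commit()` call the buffer would grow for as long as the stream keeps producing input.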
If people are looking for inspiration, ANTLR is a good framework to borrow ideas from. Parsing is only half the battle; after that come trees, walkers, output generation, error handling, etc. Boost.Spirit also has some good features.
Later, Daniel