Update of /project/climacs/cvsroot/papers/ilc2005/syntax
In directory common-lisp.net:/tmp/cvs-serv32521

Modified Files:
	climacssyntax.tex
Log Message:
Clean up the COLON of DOOM!

Date: Tue May 24 23:01:56 2005
Author: bmastenbrook

Index: papers/ilc2005/syntax/climacssyntax.tex
diff -u papers/ilc2005/syntax/climacssyntax.tex:1.30 papers/ilc2005/syntax/climacssyntax.tex:1.31
--- papers/ilc2005/syntax/climacssyntax.tex:1.30 Tue May 24 21:41:35 2005
+++ papers/ilc2005/syntax/climacssyntax.tex Tue May 24 23:01:56 2005
@@ -114,18 +114,18 @@
 Climacs' syntax analysis is a flexible protocol which can be
 implemented with a full language lexer and parser. GNU Emacs, the most
 commonly used Emacs-like editor, uses regular expressions for its
-syntax analysis. As well as the issue that regular expressions cannot
-be used to parse the general case of non-regular constructs such as
-Common Lisp's nestable \verb+#| |#+ block comments, the lazy
-application of those regular expressions can lead to additional
-erroneous parses: if the parser starts after the opening \verb+#|+
-then the closing \verb+|#+ will be treated as the start of an escaped
-symbol name. Even if the regular expression parses the whole block
-comment correctly, other expressions can still match on the contents
-of the comment, leading to issues when the first character in a column
-in the block comment is the start of a definition: Emacs users quickly
-learn to insert a space before the open parenthesis to work around
-Emacs' font-lock deficiencies.
+syntax analysis. However regular expressions cannot be used to parse
+the general case of non-regular constructs such as Common Lisp's
+nestable \verb+#| |#+ block comments. The lazy application of those
+regular expressions will also lead to additional erroneous parses even
+when nesting is not taken into account when the parser starts after
+the opening \verb+#|+ then the closing \verb+|#+ will be treated as
+the start of an escaped symbol name. Even if the regular expression
+parses the whole block comment correctly, other expressions can still
+match on the contents of the comment, leading to issues when the first
+character in a column in the block comment is the start of a
+definition. Emacs users quickly learn to insert a space before the
+open parenthesis to work around Emacs' font-lock deficiencies.
 
 The Climacs text editor is a combination of frameworks for buffer
 representation and parsing, loosely coupled with a display engine
@@ -246,7 +246,7 @@
 Climacs uses an unspecialised vector for its storage, which uses one
 machine word per element, either as an immediate value or as a pointer
 to a larger element. More space-efficient buffer implementations are
-possible, should it be necessary: for instance, it is conceivable that
+possible, should it be necessary. For instance, it is conceivable that
 a buffer implementation might choose to compress sections of the
 buffer which are not in use.
 
@@ -296,7 +296,7 @@
 Climacs includes a parser generator that implements the Earley
 \cite{earley} algorithm. There are many advantages of this algorithm
 in the context of text editing. Perhaps most importantly, no grammar
-preprocessing is required: so it is not necessary for the entire
+preprocessing is required, so it is not necessary for the entire
 grammar to be known ahead of time. This means that the user can load
 Lisp files containing additional syntax rules to complete the existing
 ones without having to apply any costly grammar analysis. Other
@@ -305,10 +305,9 @@
 context-free grammars. This feature is crucial in certain
 applications, for instance in a grammar checker for natural languages.
 Implementations of the Climacs syntax protocol may, but are not
-required to, use the provided Earley parser: any algorithm with an
-explicit representation of the parser state (which is a necessary, but
-not sufficient, requirement for making the parsing algorithm
-incremental) is potentially suitable.
+required to, use the provided Earley parser. Any algorithm with an
+explicit representation of the parser state is suitable for use by an
+incremental parse system like that of Climacs' syntax protocol.
 
 It should be noted that the Earley parsing algorithm is relatively
 slow compared to table-based algorithms such as the LR shift/reduce
@@ -350,7 +349,7 @@
 
 Implementing the Prolog syntax proved a good test of the established
 framework. Firstly, and most importantly, ISO Prolog \cite{ISOProlog}
-is not a context-free grammar: \textit{terms} have an implicit
+is not a context-free grammar; \textit{terms} have an implicit
 priority affecting their parse.\footnote{Formally, the grammar could
 be made context-free by introducing a large number of new production
 rules.} The implementation of Earley's algorithm, however, was able
@@ -358,7 +357,7 @@
 
 Another area of difficulty is the fact that parsing a Prolog text can
 change the grammar itself through the use of the \texttt{op/3}
-directive: the inclusion of
+directive. The inclusion of
 \begin{verbatim}
 :- op(100,xfy,<>).
 \end{verbatim}
@@ -465,7 +464,7 @@
 some other element of musical notation (such as a barline); figure
 \ref{fig:besfantlach} shows a fragment of tablature, and demonstrates
 its \TabCode\ encoding. It is also possible to encode more complex
-elements of lute tablature notation in \TabCode: ornaments, fingering
+elements of lute tablature notation in \TabCode. Ornaments, fingering
 marks, beaming, connecting lines and other complex elements can all be
 accommodated (see figure \ref{fig:barley} for examples of some of
 these more complex elements). \TabCode\ has been used to produce
@@ -476,7 +475,7 @@
 The \TabCode\ language itself has developed to provide a terse and
 intuitive encoding of tablature, rather than a well-formed grammar for
 parsing. Simple \TabCode, as in figure \ref{fig:besfantlach},
-presents no problems: each chord is an optional rhythm sign (a capital
+presents no problems. Each chord is an optional rhythm sign (a capital
 letter), followed by zero or more notes as fret--string pairs
 (letter--number combinations). Adding ornaments and fingering marks to
 this structure is simple, as they are merely optional modifiers to
@@ -529,7 +528,7 @@
 of the order of 200--300 words, which requires only little time to
 parse on modern hardware. However, such a parsing scheme would stress
 the display engine if a complete redraw were forced on every edit, so
-we have implemented the obvious optimisations: the extent of the edit,
+we have implemented the obvious optimisations. The extent of the edit,
 along with its typical locality of effect, are used to limit the
 damaged region as before, so preserving the identity of unaffected
 tabwords; this identity can then be used in a cache informing CLIM's
@@ -544,8 +543,8 @@
 for the user's attention.
 
 To assist the editorial process, we have also implemented MIDI audio
-feedback: in addition to a command to render the entire tablature in
-sound, we provide several gestures to play individual chords: one
+feedback. We provide a command to render the entire tablature in
+sound, and several gestures to play individual chords: one
 intended for use during the initial entry of the encoding, to act as a
 rapid error-detection aid, and a motion command and mouse gesture to
 assist revision and navigation. At present, this MIDI support is