Update of /project/climacs/cvsroot/papers/ilc2005/syntax
In directory common-lisp.net:/tmp/cvs-serv12718

Modified Files:
	climacssyntax.tex
Log Message:
Corrected some statements about the current implementation that were
incorrect. Added conclusions and future work.

Date: Mon May 23 06:06:40 2005
Author: rstrandh
Index: papers/ilc2005/syntax/climacssyntax.tex
diff -u papers/ilc2005/syntax/climacssyntax.tex:1.14 papers/ilc2005/syntax/climacssyntax.tex:1.15
--- papers/ilc2005/syntax/climacssyntax.tex:1.14 Mon May 23 03:05:55 2005
+++ papers/ilc2005/syntax/climacssyntax.tex Mon May 23 06:06:39 2005
@@ -42,7 +42,7 @@
 The Climacs text editor is a CLIM implementation of a text editor in
 the Emacs tradition. Climacs was designed to allow for incremental
 parsing of the buffer contents, so that a sophisticated analysis of
- the buffer contents can be performed without impacting performance.
+ the contents can be performed without impacting performance.
 We describe two different syntax modules: a module for a
 sequentially-defined syntax of a typical programming language, doing
 the bulk of its parsing in a per-window function; and an interactive
@@ -103,20 +103,21 @@
 correctly, other expressions can still match on the contents of the
 comment, leading to issues when the first character in a column in the
 block comment is the start of a definition. Emacs users quickly learn
-to insert a space before the open paren to work around Emacs'
+to insert a space before the open parenthesis to work around Emacs'
 font-lock deficiencies.
 The Climacs text editor is a combination of frameworks for buffer
 representation and parsing, loosely coupled with a CLIM-based display
 engine. It includes the Flexichain library \cite{flexichain}, which
 provides an editable sequence representation and mark (cursor)
-management using a simple linked lists used for implementing the
-buffer protocol; and an implementation of a slight modification of the
-Earley parsing algorithm \cite{earley}, to assist in the creation of
-syntax-aware editing modes. An application can combine a particular
-implementation of the buffer protocol, the syntax protocol, and its
-own display methods to produce a sophisticated editor for a particular
-language.
+management using a simple linked list for implementing the buffer
+protocol. Climacs also includes an implementation of a slight
+modification of the Earley parsing algorithm \cite{earley}, to assist
+in the creation of syntax-aware editing modes, though such modes can
+use any appropriate parsing algorithm. An application can combine a
+particular implementation of the buffer protocol, the syntax protocol,
+and its own display methods to produce a sophisticated editor for a
+particular language.
 The Climacs buffer protocol, which provides a standard interface to
 common text editor buffer operations, uses the Flexichain library; we
@@ -125,7 +126,7 @@
 provides a method for interfacing a lexical analyzer and parser with
 the text editor, and provides for defining methods to draw syntax
 objects in the Climacs window. In section \ref{sec:syntaxes} we
-discuss the implementation of syntactic analysis for various
+discuss the implementation of syntactic analyses for various
 programming languages, including Common Lisp; in section
 \ref{sec:tabeditor}, we discuss an application with Climacs at its
 core to support editing a textual representation of lute tablature.
@@ -257,6 +258,36 @@
 approach is appropriate when the parse tree will also be used for some
 other display or analysis of the text in the buffer.
+Climacs includes a parser that uses the Earley \cite{earley} parsing
+algorithm. There are many advantages of this algorithm in the context
+of text editing. Perhaps most importantly, it does not require any
+preprocessing of the grammar, which would make it necessary for the entire
+grammar to be known ahead of time. This means that the user can
+load Lisp files containing additional syntax rules to complete the
+existing ones without having to apply any costly grammar analysis.
+Other advantages include the possibility of using ambiguous grammars,
+since the Earley parsing algorithm accepts the full class of
+context-free grammars. This feature is crucial in certain
+applications, for instance in a grammar checker for natural
+languages. The Climacs syntax protocol can, but is not required
+to, use the provided Earley parser. It can use any algorithm with an
+explicit representation of the parser state, which is a necessary, but
+not sufficient, requirement for making the parsing algorithm
+incremental.
+
+However, the Earley parsing algorithm is relatively slow compared to
+table-based algorithms such as the LR shift/reduce algorithm.
+Worst-case complexity is $O(n^3)$ where $n$ is the size of the input.
+It drops to $O(n^2)$ for unambiguous grammars and to $O(n)$ for a
+large class of grammars suitable for parsing programming languages.
+Even so, the complexity is often proportional to the size of the
+grammar (which is considered a constant by Earley), which can be
+problematic in a text editor. We have yet to determine whether the
+implementation of the Earley algorithm that we provide will turn out
+to be sufficiently fast for most Climacs syntax modules. Other
+possibilities include the Tomita parsing algorithm, which provides more
+generality than LR, but which is nearly as fast in most cases.
+
 \section{Syntaxes}
 \label{sec:syntaxes}
@@ -428,8 +459,55 @@
 \section{Future Work and Conclusions}
 \label{sec:conclusions}
-I like Jello. Jello is good. We should enable people to make better
-Jello.
+Given the relatively small amount of work (only a few person-months)
+that has been put into Climacs so far, it is already a very competent
+and stable editor. Using CLIM (and in particular the McCLIM
+implementation) as the display engine has allowed the project to
+progress much more rapidly than would have been possible
+otherwise. However, Climacs has also revealed some serious
+limitations and performance problems of the McCLIM library. We
+maintain that using CLIM and McCLIM was the best choice, and in fact
+advantageous to other McCLIM users as well, since work is being done
+to correct and improve it for use with Climacs.
+
+Due to its reliance on fairly well-defined protocols, the Climacs text
+editor framework is flexible enough to allow for different future
+directions. Turning the Common Lisp syntax module into an excellent
+programming tool for Lisp programmers is high on the list of
+priorities, for many reasons. First, it will encourage further work
+on Climacs. Second, the Common Lisp syntax module is likely to become
+one of the more advanced ones to exist for Climacs, given that Climacs
+has unique and direct access to the features of the underlying Common
+Lisp implementation. Thus, the Common Lisp syntax module is likely to
+exercise the Climacs protocols to a very high degree. This will allow
+us to improve those protocols as well as their corresponding
+implementations.
+
+Another future direction high up on the list of priorities is the
+planned implementation of the buffer protocol. Representing a line
+being edited as a flexichain can greatly improve the performance of
+some crucial operations that currently require looping over each
+buffer object until a newline character is found. Other operations
+that are currently prohibitive include knowing the line and column
+numbers of a given mark.
+
+However, our plans for Climacs go further than creating an improved
+implementation of Emacs. We intend to make Climacs a fully-integrated
+CLIM application. This implies among other things using the
+presentation-type system of CLIM to exchange objects between Climacs
+and other CLIM applications such as an inspector application, a
+debugger pane, etc. We also hope that implementors of other CLIM
+applications such as mail readers, news readers, etc., will consider
+using Climacs for creating messages.
+
+We are often asked whether applications such as VM and Gnus for GNU
+Emacs will be available for Climacs. Our opinion is that such
+applications currently run as GNU Emacs subsystems, simply because
+GNU Emacs does not have an independent substrate such as CLIM for
+creating user interfaces. Climacs is itself a CLIM application, and
+applications such as mail readers and news readers that do not require
+editable buffers should instead be implemented directly as CLIM
+applications, perhaps calling Climacs to write messages.
\nocite{*}
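
Illustration (not part of the commit): the paragraph added in this revision says that the Earley algorithm needs no grammar preprocessing and accepts ambiguous grammars. The following is a minimal sketch of a textbook Earley recognizer in Python, given only to make those two properties concrete. It is not the Climacs parser, which is written in Common Lisp and whose interface is not shown in this diff. The names `Item` and `recognize`, and the dictionary representation of the grammar, are assumptions made for this example; rules with empty bodies are deliberately left out.

```python
from collections import namedtuple

# Hypothetical item representation: rule head, rule body, dot position,
# and the index of the chart set in which the item was first predicted.
Item = namedtuple("Item", "head body dot origin")


def recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of bodies (tuples of symbols).
    Any symbol that is not a key of `grammar` is treated as a terminal.
    Empty bodies (epsilon rules) are not supported by this sketch.
    """
    chart = [[] for _ in range(len(tokens) + 1)]

    def add(index, item):
        if item not in chart[index]:
            chart[index].append(item)

    for body in grammar[start]:
        add(0, Item(start, body, 0, 0))

    for i in range(len(tokens) + 1):
        j = 0
        # chart[i] can grow while it is processed, hence the index loop.
        while j < len(chart[i]):
            item = chart[i][j]
            j += 1
            if item.dot < len(item.body):
                symbol = item.body[item.dot]
                if symbol in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    for body in grammar[symbol]:
                        add(i, Item(symbol, body, 0, i))
                elif i < len(tokens) and tokens[i] == symbol:
                    # Scanner: the terminal matches the next token.
                    add(i + 1, item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for this head.
                for parent in chart[item.origin]:
                    if (parent.dot < len(parent.body)
                            and parent.body[parent.dot] == item.head):
                        add(i, parent._replace(dot=parent.dot + 1))

    return any(it.head == start and it.dot == len(it.body) and it.origin == 0
               for it in chart[len(tokens)])


# An ambiguous grammar: "n + n + n" has two parse trees, which is fine here.
grammar = {"E": [("E", "+", "E"), ("n",)]}
accepted = recognize(grammar, "E", ["n", "+", "n", "+", "n"])

# Adding a rule at run time needs no table regeneration or other analysis.
grammar["E"].append(("(", "E", ")"))
accepted_after_extension = recognize(grammar, "E", ["(", "n", "+", "n", ")"])
```

Because the recognizer reads the rule dictionary directly, the rule appended in the last lines takes effect on the next call. This mirrors the point made in the paper about loading additional syntax rules without a costly grammar analysis, although the sketch says nothing about incremental reparsing.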