Pretty much since the beginning, Climacs redisplay has been quite slow, and this trait was unfortunately ported over when Drei was created. Over the last few days I've been committing my work on a new redisplay engine, one that is much faster and more powerful than the old one.
The old redisplay engine was never much of an actual engine. Every syntax had to implement its own redisplay code from the bottom up, and these syntaxes just walked across their parse tree, calling CLIM drawing functions and using incremental output for performance. Sometimes, they would call a function (the infamous `handle-whitespace') to move the cursor as demanded by newlines or space characters, and at various points ugly kludges were added to this function to try to record at least a little of the display structure, such as where lines started and ended on the screen. But in general, everything was pretty much a jungle, and it wasn't all that fast, partially because McCLIM's implementation of incremental-redisplay is admittedly not optimal, and partially because I severely doubt CLIM incremental-redisplay is even meant to be used for the kind of real-time performance needed in an editor.
The new redisplay engine was developed based on a key assumption: output records are nice, flexible and useful, but they are too slow and heavy. Hence, the new redisplay engine does not use output records, except for handling cursors and some other exotic cases. Instead, it divides the visible region of the buffer into "strokes". A stroke is a buffer region that can be drawn in a single operation with a single set of drawing options, and that does not cross lines. Put another way, a stroke is a sequence of characters in a line with the same colour and font (strokes can also cover non-characters, but let's ignore that for now). The new redisplay algorithm thus works by fetching strokes from the buffer, starting at the top of the display, and drawing them to the screen until we reach either the end of the buffer or the bottom of the visible part of the output sheet. When strokes are drawn, we remember the location and size of their output.
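To make the stroke definition concrete, here is a minimal sketch in Python (not Drei's actual Lisp code). It assumes the drawing options for each character are already known and are represented by a plain string; the `Stroke` class and the `split_into_strokes` function are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Stroke:
    start: int  # buffer offset of the first character in the stroke
    end: int    # buffer offset one past the last character
    style: str  # stand-in for a full set of drawing options


def split_into_strokes(text, styles):
    """Split text into maximal runs that share one style and stay on one line.

    styles[i] is the (assumed, precomputed) style of text[i]. Newline
    characters end the current stroke and are never part of a stroke.
    """
    strokes = []
    run_start = None
    for i, ch in enumerate(text):
        if ch == "\n":
            if run_start is not None:
                strokes.append(Stroke(run_start, i, styles[run_start]))
                run_start = None
            continue
        if run_start is None:
            run_start = i
        elif styles[i] != styles[run_start]:
            # The drawing options changed, so a new stroke begins here.
            strokes.append(Stroke(run_start, i, styles[run_start]))
            run_start = i
    if run_start is not None:
        strokes.append(Stroke(run_start, len(text), styles[run_start]))
    return strokes


text = "(defun f)\nx"
styles = ["plain"] + ["keyword"] * 5 + ["plain"] * 5
for stroke in split_into_strokes(text, styles):
    print(stroke.start, stroke.end, stroke.style)
```

In the real engine the strokes come from the syntax rather than from a per-character style list; the sketch only shows what "same drawing options, never crossing a line" means in terms of buffer offsets.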
Because strokes cannot cross lines, we can trivially organise strokes into lines: we just check whether a stroke directly precedes a #\Newline character to figure out when to go to the next line. When we do, we look at the strokes of the line we just finished and compute the dimensions of that line from them.
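The line bookkeeping can be sketched roughly as follows, again in Python rather than Drei's Lisp. The sketch assumes each drawn stroke has remembered a width and height in pixels, and that a line is as wide as the sum of its strokes and as tall as its tallest stroke; the names `DrawnStroke`, `group_into_lines` and `line_dimensions` are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class DrawnStroke:
    start: int   # buffer offset of the first character
    end: int     # buffer offset one past the last character
    width: int   # width of the drawn output, remembered at draw time
    height: int  # height of the drawn output, remembered at draw time


def group_into_lines(text, strokes):
    """Group strokes into display lines.

    A line ends after any stroke that is directly followed by a newline.
    """
    lines, current = [], []
    for stroke in strokes:
        current.append(stroke)
        if stroke.end < len(text) and text[stroke.end] == "\n":
            lines.append(current)
            current = []
    if current:
        lines.append(current)
    return lines


def line_dimensions(line):
    """Return (width, height) of a line: summed widths, tallest stroke."""
    return (sum(s.width for s in line), max(s.height for s in line))


text = "ab cd\nef"
strokes = [
    DrawnStroke(0, 2, 20, 12),
    DrawnStroke(2, 5, 30, 14),
    DrawnStroke(6, 8, 20, 12),
]
for line in group_into_lines(text, strokes):
    print(line_dimensions(line))
```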
Stroke objects are kept across redisplays and are mutated by the stroke pump (see below), so we can easily check whether a stroke has changed since the last redisplay: the pump simply checks whether the data it is about to store is already in the stroke object. If it is not, we mark the stroke as "dirty" and "modified". If a stroke is obscured for some reason (for example due to a moving window, or part of the cursor being drawn over it), we also mark it as dirty. Taken together, this gives us incremental redisplay.
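A rough Python sketch of this bookkeeping (not the Drei code itself) might look like the following. The split between the two flags is my assumption for the example: "modified" is taken to mean the stored data changed, while "dirty" just means the stroke must be drawn again. The names `store`, `mark_obscured` and `draw` are hypothetical.

```python
class Stroke:
    """A stroke object that survives from one redisplay to the next."""

    def __init__(self):
        self.start = None
        self.end = None
        self.style = None
        # A fresh stroke has never been drawn, so it starts out dirty.
        self.dirty = True
        self.modified = True


def store(stroke, start, end, style):
    """Called by the pump: only touch the flags if the data really changes."""
    if (stroke.start, stroke.end, stroke.style) == (start, end, style):
        return  # Same data as last time; leave the flags alone.
    stroke.start, stroke.end, stroke.style = start, end, style
    stroke.dirty = True
    stroke.modified = True


def mark_obscured(stroke):
    """The contents are unchanged, but the output on screen was damaged."""
    stroke.dirty = True


def draw(stroke, drawn):
    """Draw the stroke only if it is dirty, then clear both flags."""
    if stroke.dirty:
        drawn.append((stroke.start, stroke.end))  # stand-in for real drawing
        stroke.dirty = False
        stroke.modified = False


stroke, drawn = Stroke(), []
store(stroke, 0, 5, "plain")
draw(stroke, drawn)           # first redisplay: drawn
store(stroke, 0, 5, "plain")
draw(stroke, drawn)           # second redisplay: nothing changed, skipped
print(drawn)
```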
The interesting problem is now how to generate strokes for the redisplay engine. This is done through two generic functions, `pump-state-for-offset' and `stroke-pump', that are used to "pump" stroke data into an already existing stroke object. This is both to implement incremental redisplay, as mentioned above, and to avoid consing (if you read the code, you'll notice that I've been obsessed with minimising consing in general; perhaps this was not always necessary). For the most common case, these functions just relay to the syntax of the view, which results in either the simple stroke pumping defined in fundamental-syntax.lisp, or the terrifying horrors of lr-syntax.lisp. There is nothing special about stroke pumping, except that it has to be really fast, as it is done in full for every redisplay. All the clever caching and "only handle that which has actually changed"-stuff is done at the higher level by looking at the dirtiness of strokes, and at the lower level by the incremental syntax parsers. The stroke pump just has to be fast (and fortunately, it is for the most part). When the view is a pure buffer-view (that is, has no syntax), there is a simple pump defined in drei-redisplay.lisp that just turns each line into a stroke (possibly chopping it up if it's very long). It's not fast, but it's simple, and you can look at it to get a general idea of how it works.
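To give a feel for the shape of such a pump, here is a small Python sketch of a line-per-stroke pump in the spirit of the pure buffer-view one. It is not a translation of the code in drei-redisplay.lisp: the function signatures, the use of a plain offset as the pump state, the `MAX_STROKE_LENGTH` limit and the `pump_all` driver are all assumptions made for the example.

```python
MAX_STROKE_LENGTH = 8  # hypothetical chop limit, kept tiny for the demo


class Stroke:
    def __init__(self):
        self.start = 0
        self.end = 0


def pump_state_for_offset(text, offset):
    """Return a pump state for starting at offset; here just the offset."""
    return offset


def stroke_pump(text, stroke, state):
    """Fill an existing stroke object with the next stroke; return new state."""
    start = state
    newline = text.find("\n", start)
    line_end = len(text) if newline == -1 else newline
    # Chop very long lines into several strokes.
    end = min(line_end, start + MAX_STROKE_LENGTH)
    stroke.start, stroke.end = start, end
    if end == line_end and end < len(text):
        return end + 1  # skip over the newline to the next line
    return end


def pump_all(text, pool):
    """Pump strokes for the whole text, reusing stroke objects from pool."""
    state = pump_state_for_offset(text, 0)
    count = 0
    while state < len(text):
        if count == len(pool):
            pool.append(Stroke())  # allocate only when the pool runs out
        state = stroke_pump(text, pool[count], state)
        count += 1
    return count


pool = []
used = pump_all("short\na much longer line", pool)
print([(s.start, s.end) for s in pool[:used]])
```

Because the pump writes into stroke objects that already exist, running it again on the same text allocates nothing new, which is the point of pumping into existing objects rather than returning fresh ones.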
The major performance issue right now is that a stroke is only considered unmodified when neither its drawing-options (colours, etc) nor its start/end-offsets have been modified. This is suboptimal, because the offsets of subsequent strokes will change when you insert or remove something from a buffer, so every time you insert a character, the buffer from point till the end of the display will be redrawn, while it is most likely not strictly necessary for anything but the current line (unless you significantly modified the syntax parse tree by your change, of course).
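The effect is easy to see in a toy model. The Python sketch below assumes one stroke per line, identified only by its start and end offsets, and compares the strokes before and after inserting a single character; `line_strokes` and `modified_indices` are hypothetical helpers, not part of Drei.

```python
def line_strokes(text):
    """Return one (start, end) stroke per line of text."""
    strokes, start = [], 0
    for line in text.split("\n"):
        strokes.append((start, start + len(line)))
        start += len(line) + 1  # +1 for the newline character
    return strokes


def modified_indices(old, new):
    """Indices of strokes whose offsets differ between two redisplays."""
    return [i for i, (a, b) in enumerate(zip(old, new)) if a != b]


before = "one\ntwo\nthree\nfour"
after = "one\ntwXo\nthree\nfour"  # a single character inserted on line 2

print(modified_indices(line_strokes(before), line_strokes(after)))
```

Only the second line actually looks different on screen, but every stroke from that line onwards has new offsets and therefore counts as modified.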
Anyway, the new engine is not horribly buggy, though it is obviously new and untested, and it's quite fast on my machine. Not Emacs-speed, but significantly faster than the old one, and significantly easier to optimise further.
Oh yeah, and a final note: implemented and tested with McCLIM-Freetype, has bugs without Freetype, and has never been run on a non-CLX backend.