Hello Slime developers,
For the past while, I've been using a Mac OS X program called Quicksilver, which is sort of a multi-purpose quick access tool. It features a fuzzy completion algorithm that I've quite grown to like. For example I can hit Command-Space and type "liswp<return>", and it will match and launch "LISpWorks Personal".
I decided I wanted to add something like this (only better :) to Slime, to go alongside the existing completion algorithm. It can't really be a replacement as they are good at different things. The existing one is good for incrementally typing and completing a symbol. The fuzzy one is good for typing a short version of the symbol, running the completer, and then picking the right result, which hopefully scored at or near the top.
While I've attached a patch, this is more a request for comments than a request to commit the code. As I'll discuss, the Emacs interface is incredibly lacking, mostly on account of this being my first time using elisp for anything but configuration. It's very ugly. Also, the completion core code in Swank should be cleaned up - it still looks somewhat hacky. Finally, everything needs documentation.
The elisp code has some chunks from isearch.el. Since they are both GPL, I figured this is okay. Besides, it will probably change anyway. :)
----------------------------------------------------------------------
The low level core of the algorithm are implemented in Swank, just like the old completer. The interface looks very similar:
; SLIME 2004-06-10 CL-USER> (setq *print-pretty* t) T CL-USER> (swank::completions "m-v-" "CL-USER") (("multiple-value-bind" "multiple-value-call" "multiple-value-list" "multiple-value-prog1" "multiple-value-setq" "multiple-values-limit") "multiple-value-") CL-USER> (swank::fuzzy-completions "mvb" "CL-USER") (("multiple-value-bind" 26.588236 ((0 "m") (9 "v") (15 "b"))) ("most-negative-double-float" 12.416667 ((0 "m") (11 "v") (17 "b"))) ("most-positive-double-float" 12.416667 ((0 "m") (11 "v") (17 "b"))) ("named-variable" 10.833333 ((2 "m") (6 "v") (11 "b"))) ("*compile-verbose*" 10.666667 ((3 "m") (9 "v") (12 "b"))) ("environment-variable" 10.555555 ((7 "m") (12 "v") (17 "b"))) ("remove-symbol-profiler" 3.5 ((2 "m") (4 "v") (10 "b"))))
The return value is obviously a bit more complicated. :) The fields are (completion score chunks), where "chunks" are a description of where the completion matched.
There's a convenience function in Swank that is not used by the Emacs interface to print this information in a human readable form. I used this to tune the scoring algorithm. This shows a little more clearly how the algorithm matches:
CL-USER> (swank::format-fuzzy-completions (swank::fuzzy-completions "mvb" "CL-USER")) Multiple-Value-Bind score 26.59 ((0 m) (9 v) (15 b)) Most-negatiVe-douBle-float score 12.42 ((0 m) (11 v) (17 b)) Most-positiVe-douBle-float score 12.42 ((0 m) (11 v) (17 b)) naMed-VariaBle score 10.83 ((2 m) (6 v) (11 b)) *coMpile-VerBose* score 10.67 ((3 m) (9 v) (12 b)) environMent-VariaBle score 10.56 ((7 m) (12 v) (17 b)) reMoVe-symBol-profiler score 3.50 ((2 m) (4 v) (10 b)) NIL
Here are some more examples, which might show why I think something like this can be useful. (There is an optional third argument to swank::fuzzy-completions, limit. This limits the number of results, which is important for this completer since completing, say, "a" will match many, many symbols. This isn't currently used from Emacs however.)
CL-USER> (swank::format-fuzzy-completions (swank::fuzzy-completions "stdout" "CL-USER" 5)) *STanDard-OUtpuT* score 34.83 ((1 st) (5 d) (10 ou) (15 t)) moST-negative-DOUble-floaT score 22.48 ((2 st) (14 dou) (25 t)) moST-positive-DOUble-floaT score 22.48 ((2 st) (14 dou) (25 t)) leaST-positive-DOUble-floaT score 22.45 ((3 st) (15 dou) (26 t)) leaST-negative-DOUble-floaT score 22.45 ((3 st) (15 dou) (26 t)) NIL CL-USER> (swank::format-fuzzy-completions (swank::fuzzy-completions "prp" "CL-USER" 5)) *PRint-Pretty* score 23.83 ((1 pr) (7 p)) PRint-Profile-list score 23.62 ((0 pr) (6 p)) *PRint-Pprint-dispatch* score 23.48 ((1 pr) (7 p)) *step-PRint-Pretty* score 20.59 ((6 pr) (12 p)) *trace-PRint-Pretty* score 20.56 ((7 pr) (13 p)) NIL CL-USER> (swank::format-fuzzy-completions (swank::fuzzy-completions "norm-df" "CL-USER" 5)) least-positive-NORMalized-Double-Float score 32.31 ((15 norm) ...) least-negative-NORMalized-Double-Float score 32.31 ((15 norm) ...) NIL CL-USER> (swank::format-fuzzy-completions (swank::fuzzy-completions "udc" "CL-USER" 1)) Update-instance-for-Different-Class score 26.30 ((0 u) (20 d) (30 c)) NIL CL-USER> (swank::format-fuzzy-completions (swank::fuzzy-completions "urc" "CL-USER" 1)) Update-instance-for-Redefined-Class score 26.30 ((0 u) (20 r) (30 c)) NIL
Despite being expensive (it tries all combinations of matches up to a relatively high cutoff per symbol to maximize the score) and not optimized at all, the completion speed doesn't seem bad at all on compiled Lisps on my 500mhz Powerbook. I haven't tried it on interpreted Lisps, but I imagine it wouldn't be too fun.
There are some scoring configuration parameters in swank.lisp that are mostly self-explanitory. Adjust to fit your own symbol naming schemes; these seemed to be the five-minute opinion of #lisp. Some way to set these from Emacs would probably be nice in the future.
(defparameter *fuzzy-completion-symbol-prefixes* "*+-%&?<") (defparameter *fuzzy-completion-symbol-suffixes* "*+->") (defparameter *fuzzy-completion-word-separators* "-/.")
The scoring algorithm is currently completely static. It would probably be a nifty project to implement some machine learning on it.
----------------------------------------------------------------------
Now, the Emacs interface. Well, it kind of sucks. I bound slime-fuzzy-complete-symbol to C-c M-i in the Slime map. The problem is that since the first letters of the short form aren't necessarily the first letters of the matches, the existing Emacs completion system will fail miserably with this algorithm.
What I did was, if there was more than one completion, pop up a window with the results (formatted nicely like above, but with a bold face instead of upcasing). Point starts on the highest scoring result, and the result is immediately placed into the Lisp buffer in place of the short form. Then the user can hit 'n' and 'p' to navigate the list of completions. As they do, the replacement is updated. To pick a completion and keep it, hit Return. To abort and revert to the short form, hit almost anything else. It is a recursive edit, like isearch, so doing anything else in Emacs will abort the completion. It's also implemented very poorly.
I don't really like this, but I'm not sure what a better option is. If you have the completion buffer come up normally, it will have embedded state from the Lisp source buffer. If the Lisp source buffer changes before a completion is picked, it may modify the wrong position. And again, it can't really work like the (stateless) normal completion.
----------------------------------------------------------------------
So anyway, I'd appreciate it if people could try it out. If anyone has good interface ideas (or even better, would implement them :) I'd like to hear them, since I'm not so happy with what I've got.
Thanks, -bcd