Greetings! I was looking into improving the efficiency of MATCH-CASE and while working on that I've ended up with a few questions regarding some implementation decisions, and drafted up some patches both to help you see my thoughts and for consideration of inclusion if you think they're in the right direction.
Patches are numbered in the order created, and generally depend on lower-numbered patches (not always intentionally--it's hard to avoid inter-patch dependencies when making multiple changes to a small area). I thought I made the patches in a reasonable order while I was making them, but I find myself referring to them in a different sequence than what I made them in, so apologies if that causes any confusion.
I've also put up a darcs mirror (in darcs2 format) of the CVS repository plus the attached patches, in case you find that more convenient. You can find it at http://repo.kepibu.org/cl-unification/
** Duplicate code
There's a fair bit of code duplication between MATCH, MATCHF, and MATCHING in the form of GENERATE-VAR-BINDINGS and the production of the let forms that surround FORMS in those macros. (Nevermind that MATCH and MATCHF are identical except in what they do with template.)
Patches 0 and 2 help somewhat.
0) Extract template handling of MATCH[ING] into %TEMPLATE-FOR-MATCH Should be fairly self-explanatory; not much of a win in lines.
2) Extract the bits that wrap forms with bindings for template variables This patch extracts the assorted flet GENERATE-VAR-BINDINGS into a single common function. It's slightly less straightforward than that, however, because it also extracts the (let ...) forms which actually used those bindings.
This results in a decent win for code size, but in some cases it swaps the order of execution of %TEMPLATE-FOR-MATCH and COLLECT-TEMPLATE-VARS. I'm pretty sure this doesn't have any noticable effect, but thorough testing is probably wise.
** (unify* ...) and (ignore-errors (unify ...))
The MATCHING macro produces the form (ignore-errors (unify ...)), presumably because it existed before the convenience function UNIFY*. But I'm curious about UNIFY* and/or the ignore-errors form in MATCHING. Is ignore-errors the proper form in either of these places, or was it just easier to type than (handler-case (unify ...) (unification-failure () nil)) ?
Looking at some of the assorted UNIFY methods, I see UNIFICATION-FAILUREs aren't the only possible condition which might be thrown during unification (e.g., LAMBDA-LIST-PARSING-ERROR is also a possibility). However, while MATCHING treats any error as a failure to unify, MATCH, MATCHF, MATCH-CASE, and MATCHF-CASE all limit themselves to only UNIFICATION-FAILUREs. This inconsistency leads me to wonder which is correct.
I assume only UNIFICATION-FAILUREs should be treated as such in patch 5 because all ERRORs seems a bit heavy-handed in a language with such an excellent condition system, but can see the argument that any error /is/ a failure to unify.
1) Use (unify* ...) rather than (ignore-errors (unify ...)) Same thing, so might as well use the convenience function.
Not particularly exciting. And this bit is changed again in patch 5 to ignore only UNIFICATION-FAILUREs.
5) Make MATCHING agree with MATCH[F][-CASE] about the conditions of failure Rather than skipping to the next clause on any error, UNIFICATION-FAILUREs--and /only/ UNIFICATION-FAILUREs--skip to the next clause.
Under the assumption that UNIFY* is implemented as intended, this creates a new utility function UNIFY** (terrible name, I know!) which is like UNIFY* but ignores /only/ UNIFICATION-FAILUREs rather than all ERRORs.
This of course results in a library-user-visible change to how MATCHING operates, which may or may not be considered acceptable.
** (matching (otherwise ...))
3) Fix (matching (otherwise ...)) (matching (otherwise ...)) expands into (cond (otherwise ...)), which generates an unbound-variable error when executed, because COND does not special-case OTHERWISE as CASE does.
Pretty self-explanatory and not very interesting.
** &body vs. &rest
4) Use &body instead of &rest for (arguably) prettier auto-indentation I prefer the smaller indentation provided with using &body instead of &rest. Not a big deal either way, but I was in there so I figured I'd throw it in the mix. Fortunately, if you disagree, this patch shouldn't affect any of the others. :)
** Multiple #:UNIFICATION-ENVs in MATCHING
6) Only use one variable to store the unification environment in MATCHING Because of the way MATCHING expands, and what UNIFY** returns, each (setf #:env (unify** ...)) call will do one of two things: it will set #:env to NIL or it will set #:env to an ENVIRONMENT structure.
If #:env is set to NIL--the same value it entered the (setf) with!--the COND will continue on to the next clause.
If #:env is set to an ENVIRONMENT structure, none of the remaining (setf) clauses will be evaluated.
Thus, because the variable will only ever be set to a non-nil value once, this is perfectly safe.
** MATCH-CASE
MATCH-CASE has the surprising behavior of (ignore-errors (match-case ("foo") ("foo" (error 'unification-failure)) (t :default))) => :default Hardly CASE-like behavior!
I think it sensible to redefine MATCH-CASE in terms of MATCHING, rather than MATCH. E.g., so the above would expand into something like (ignore-errors (let ((#:object-var "foo")) (matching (:errorp nil :default-substitution nil :matching-named nil) (("foo" #:object-var) (error 'unification-failure)) (t :default)))) which greatly simplifies the definition of MATCH-CASE and makes it much more CASE-like in behavior.
7) Redefine MATCH-CASE in terms of MATCHING This both greatly simplifies the MATCH-CASE macro as well as its expansion.
HOWEVER, this version is *NOT* 100% compatible with the previous version. Specifically, UNIFICATION-FAILUREs signalled from within clause-forms will /not/ cause the next unification clause to be attempted, but will instead propogate outward as the -case name suggests they should.
That is, (ignore-errors (match-case ("foo") ("foo" (error 'unification-failure ...)) (t :default))) => :default ;; before patch => nil, #<unification-failure> ;; after patch
Patch 7 suggests that MATCHING, MATCH-CASE, MATCHF-CASE, and the possible future MATCH-COND and MATCHF-COND are all fairly trivial variations on each other. I'll be happy to explore that possibility in future patches if it's of interest, but I'm already asking you to look at too many patches at once, so I'll refrain for now. :)
** MATCH
Though documented, MATCH's behavior when its body forms signal UNIFICATION-FAILURE seems at odds with expectations, for pretty much the same reason the behavior seems nonsensical in MATCH-CASE. Was it a deliberate design decision, or a side-effect of how MATCH was implemented?
If a side effect of implementation, would it make sense to redefine MATCH to have an expansion more like (match ("foo" "foo" :errorp <e?> :error-value <ev>) <body>) => (m-v-b (#:env #:error) (unify** "foo" "foo") (cond ((and #:error <e?>) (error #:error)) (#:error <ev>) (t (let* (#| template vars |#) <body>)))) so UNIFICATION-FAILUREs inside <body> can be independent of errorp?
(No patch for this one yet.)
** :MATCH-NAMED, :MATCHING-NAMED, :MATCH-CASE-NAMED
It seems a bit redundant to have <macro-name>-named arguments for block names rather than simply having a :named argument as in, say, LOOP. Any particular reason it wasn't done that way?
(No patch for this one, either.)
** Apologies
Sorry about the massive brain dump. I /may/ have gotten carried away. :)