Re: [cl-unification-devel] Questions and patches involving cl-unification's MATCH* macros

18 Jan 2010

      Hello there.

thanks a lot for all the work.  I have read through the messages and I  
agree with your analysis with a few comments.

I think all the changes you propose are worth including.

Let me comment on a few of these.

On Jan 16, 2010, at 06:43 , Pixel // pinterface wrote:
...
Greetings!  I was looking into improving the efficiency of MATCH- 
CASE and
while working on that I've ended up with a few questions regarding  
some
implementation decisions, and drafted up some patches both to help  
you see
my thoughts and for consideration of inclusion if you think they're  
in the
right direction.
Patches are numbered in the order created, and generally depend on
lower-numbered patches (not always intentionally--it's hard to avoid
inter-patch dependencies when making multiple changes to a small  
area).  I
thought I made the patches in a reasonable order while I was making  
them,
but I find myself referring to them in a different sequence than  
what I made
them in, so apologies if that causes any confusion.
I've also put up a darcs mirror (in darcs2 format) of the CVS  
repository
plus the attached patches, in case you find that more convenient.   
You can
find it at
 http://repo.kepibu.org/cl-unification/
** Duplicate code
There's a fair bit of code duplication between MATCH, MATCHF, and
MATCHING in the form of GENERATE-VAR-BINDINGS and the production of
the let forms that surround FORMS in those macros.  (Nevermind that
MATCH and MATCHF are identical except in what they do with template.)
Yep.  Most of the code in the MATCH-BLOCK file is cut'n'paste.
...
Patches 0 and 2 help somewhat.
0) Extract template handling of MATCH[ING] into %TEMPLATE-FOR-MATCH
  Should be fairly self-explanatory; not much of a win in lines.
2) Extract the bits that wrap forms with bindings for template  
variables
  This patch extracts the assorted flet GENERATE-VAR-BINDINGS into a
  single common function.  It's slightly less straightforward than  
that,
  however, because it also extracts the (let ...) forms which actually
  used those bindings.
This results in a decent win for code size, but in some cases it
  swaps the order of execution of %TEMPLATE-FOR-MATCH and
  COLLECT-TEMPLATE-VARS.  I'm pretty sure this doesn't have any
  noticable effect, but thorough testing is probably wise.
This is good.  However, I would set up a bunch of regression tests  
before moving on and change things.  I usually use the test suite from  
Franz, but I'd be willing to change to another test suite.
...
** (unify* ...) and (ignore-errors (unify ...))
The MATCHING macro produces the form (ignore-errors (unify ...)),  
presumably
because it existed before the convenience function UNIFY*.  But I'm  
curious
about UNIFY* and/or the ignore-errors form in MATCHING.  Is ignore- 
errors
the proper form in either of these places, or was it just easier to  
type
than
 (handler-case (unify ...)
   (unification-failure () nil))
?
Most implementations expand IGNORE-ERRORS in something like that.  But  
I agree that (as you point out later on) that UNIFY* should be written  
as

(defmethod unify* (a b &optional (env (make-empty-environment)))
      (handler-case (unify a b env)
           (unification-failure () nil))

In this way other conditions are passed on.
...
Looking at some of the assorted UNIFY methods, I see UNIFICATION- 
FAILUREs
aren't the only possible condition which might be thrown during  
unification
(e.g., LAMBDA-LIST-PARSING-ERROR is also a possibility).  However,  
while
MATCHING treats any error as a failure to unify, MATCH, MATCHF,  
MATCH-CASE,
and MATCHF-CASE all limit themselves to only UNIFICATION-FAILUREs.   
This
inconsistency leads me to wonder which is correct.
I assume only UNIFICATION-FAILUREs should be treated as such in  
patch 5
because all ERRORs seems a bit heavy-handed in a language with such an
excellent condition system, but can see the argument that any error / 
is/ a
failure to unify.
1) Use (unify* ...) rather than (ignore-errors (unify ...))
  Same thing, so might as well use the convenience function.
Not particularly exciting.  And this bit is changed again in patch 5
  to ignore only UNIFICATION-FAILUREs.
Yep.  See above.
...
5) Make MATCHING agree with MATCH[F][-CASE] about the conditions of  
failure
  Rather than skipping to the next clause on any error,
  UNIFICATION-FAILUREs--and /only/ UNIFICATION-FAILUREs--skip to the  
next
  clause.
Under the assumption that UNIFY* is implemented as intended, this  
creates
  a new utility function UNIFY** (terrible name, I know!) which is  
like
  UNIFY* but ignores /only/ UNIFICATION-FAILUREs rather than all  
ERRORs.
This of course results in a library-user-visible change to how  
MATCHING
  operates, which may or may not be considered acceptable.
Yep.
...
** (matching (otherwise ...))
3) Fix (matching (otherwise ...))
  (matching (otherwise ...)) expands into (cond (otherwise ...)),  
which
  generates an unbound-variable error when executed, because COND does
  not special-case OTHERWISE as CASE does.
Pretty self-explanatory and not very interesting.
Good catch.
...
** &body vs. &rest
4) Use &body instead of &rest for (arguably) prettier auto-indentation
  I prefer the smaller indentation provided with using &body instead
  of &rest.  Not a big deal either way, but I was in there so I
  figured I'd throw it in the mix.  Fortunately, if you disagree,
  this patch shouldn't affect any of the others. :)
I agree.  It is better self documentation as well.
...
** Multiple #:UNIFICATION-ENVs in MATCHING
6) Only use one variable to store the unification environment in  
MATCHING
  Because of the way MATCHING expands, and what UNIFY** returns, each
    (setf #:env (unify** ...))
  call will do one of two things: it will set #:env to NIL or it  
will set
  #:env to an ENVIRONMENT structure.
If #:env is set to NIL--the same value it entered the (setf)  
with!--the
  COND will continue on to the next clause.
If #:env is set to an ENVIRONMENT structure, none of the remaining  
(setf)
  clauses will be evaluated.
Thus, because the variable will only ever be set to a non-nil  
value once,
  this is perfectly safe.
I think this is ok as well.
...
** MATCH-CASE
MATCH-CASE has the surprising behavior of
 (ignore-errors
   (match-case ("foo")
     ("foo" (error 'unification-failure))
     (t :default)))
 => :default
Hardly CASE-like behavior!
I think it sensible to redefine MATCH-CASE in terms of MATCHING,  
rather than
MATCH.  E.g., so the above would expand into something like
 (ignore-errors
   (let ((#:object-var "foo"))
     (matching (:errorp nil :default-substitution nil :matching- 
named nil)
       (("foo" #:object-var) (error 'unification-failure))
       (t :default))))
which greatly simplifies the definition of MATCH-CASE and makes it  
much more
CASE-like in behavior.
7) Redefine MATCH-CASE in terms of MATCHING
  This both greatly simplifies the MATCH-CASE macro as well as its
  expansion.
HOWEVER, this version is *NOT* 100% compatible with the previous
  version.  Specifically, UNIFICATION-FAILUREs signalled from within
  clause-forms will /not/ cause the next unification clause to be
  attempted, but will instead propogate outward as the -case name
  suggests they should.
That is,
    (ignore-errors
      (match-case ("foo")
        ("foo" (error 'unification-failure ...))
        (t :default)))
    => :default                    ;; before patch
    => nil, #<unification-failure> ;; after patch
Patch 7 suggests that MATCHING, MATCH-CASE, MATCHF-CASE, and the  
possible
future MATCH-COND and MATCHF-COND are all fairly trivial variations  
on each
other.  I'll be happy to explore that possibility in future patches  
if it's
of interest, but I'm already asking you to look at too many patches  
at once,
so I'll refrain for now. :)
I agree with this as well.  The behavior you suggest is the one that's  
needed.
...
** MATCH
Though documented, MATCH's behavior when its body forms signal
UNIFICATION-FAILURE seems at odds with expectations, for pretty much  
the
same reason the behavior seems nonsensical in MATCH-CASE.  Was it a
deliberate design decision, or a side-effect of how MATCH was  
implemented?
Never attribute to malice what can be attributed to incompetence.  Of  
course it is a side-effect of the implementation :)  I knew I needed a  
test suite :)
...
If a side effect of implementation, would it make sense to redefine  
MATCH to
have an expansion more like
 (match ("foo" "foo" :errorp <e?> :error-value <ev>) <body>) =>
 (m-v-b (#:env #:error) (unify** "foo" "foo")
   (cond
     ((and #:error <e?>) (error #:error))
     (#:error <ev>)
     (t (let* (#| template vars |#)
          <body>))))
so UNIFICATION-FAILUREs inside <body> can be independent of errorp?
(No patch for this one yet.)
** :MATCH-NAMED, :MATCHING-NAMED, :MATCH-CASE-NAMED
It seems a bit redundant to have <macro-name>-named arguments for  
block
names rather than simply having a :named argument as in, say, LOOP.   
Any
particular reason it wasn't done that way?
Having the :NAMED option would be preferable, with NIL or the macro  
name as fall-back.
...
(No patch for this one, either.)
** Apologies
Sorry about the massive brain dump.  I /may/ have gotten carried  
away. :)
Absolutely not.  All of this is most welcome.  I am just a little  
swamped (what an euphemism!) and maybe not all that reactive.  But  
let's keep the discussion going.

all the best

Marco

Re: [cl-unification-devel] Questions and patches involving cl-unification's MATCH* macros

Marco Antoniotti