CDR-8 and hash tables

6 Sep 2017

      Greetings, I feel timid sending this email to this list given that it’s
been dormant since 2014. But after a few queries on #lisp, no better place
was offered, and the currently-published information on CDRs directs
readers to this list.

Here is the substance of my inquiry:

I believe CDR-8’s specification of EQUALS with respect to hash tables is
incomplete, and its design may lead to some confusion. Here is an example,
which uses an implementation of EQUALS provided by the library
:generic-comparability[1]:

CL-USER> (ql:quickload :generic-comparability)

(:GENERIC-COMPARABILITY)

CL-USER> (defvar a (make-hash-table :test 'equalp))

A

CL-USER> (defvar b (make-hash-table :test 'equalp))

B

CL-USER> (setf (gethash "x" a) 1)

1

CL-USER> (setf (gethash "X" b) 1)

1

CL-USER> (loop for k being the hash-keys in a

           always (eql (gethash k a) (gethash k b)))

T

CL-USER> (generic-comparability:equals a b)

NIL

Here we see EQUALS disagreeing with a GETHASH-based equality test[2], which
CDR-8 might argue is by design. EQUALS is its own equality test, so it
should override whatever is set in the hash table it’s given. However, the
spec is not explicit on this point, which may be surprising to the
unsuspecting programmer.

Moreover, let’s continue: let’s assume the :by-key argument to EQUALS is
NIL, so the request is to compare only the hash table values. Well, the
only way to correctly compare the values in two tables is to arrange them
by their keys, and then compare them. To do otherwise would be meaningless.
But what function should be used to arrange the keys? If EQUALS is used,
then the result will be identical to :by-key T, and will be inconsistent
with an EQUALP hash-table as per my example.

It would seem more appropriate to use the function designated by
HASH-TABLE-TEST, but CDR-8 is silent on this question, which makes the
specification incomplete.

A secondary problem with CDR-8 is in the example implementation given for
hash tables. The LOOP presented implicitly depends on HT-KEYS returning
keys in some sort of consistent, total ordering, because otherwise it would
be meaningless to do a key-wise comparison. However, the spec is silent on
this requirement. And it’s not obvious what comparison function CDR-8 would
prefer: its own COMPARE, or the user-specified HASH-TABLE-TEST function.

In practice, at least under SBCL, iterating through a hash table with LOOP
will generate the keys in a different order depending on their insertion
order. (I believe the order in which hash table keys are iterated is
unspecified in the standard.)

Furthermore, it seems nonsensical to provide for disjoint comparison of
keys and values. For example, should a test of the form (EQUALS (:a 1 :b 2)
(:a 2 :b 1) :by-keys T :by-value NIL) return T, and (... :by-keys NIL
:by-value T) also return T?

The most common case for comparing hash tables is to ensure they have the
same number of keys, that the set of keys in each table are equal, and that
the values in the slots referred to by the keys are also equal. The
proposed implementation does not provide this.

I hope this exposition is helpful. I’d accept any suggestions on how to
help improve CDR-8 or :generic-comparability.

Best,

anticrisis

[1] https://github.com/pnathan/generic-comparability

[2] Looping through the keys of A is only part of the equality test. Not
shown is checking the other hash table properties, in particular, the hash
table test function, as returned by HASH-TABLE-TEST.

Anticrisis

Antoniotti Marco

tags

participants (2)