The other day, Alessio and I were talking about how unfortunate it is
that we use the reader to create relatively simple structures, such as
lists in our Java classes. To illustrate:
(defun foo ()
  (let ((x '(1 2 3)))
    x))
When the above definition is file-compiled, the literal list (1 2 3)
gets serialized into a string, which is converted back to an object at
load time, like this:
LispObject theObject = Lisp.readObjectFromString("(1 2 3)");
In our minds, it would be faster to generate:
LispObject theObject = new Cons(new Fixnum(1),
    new Cons(new Fixnum(2), new Cons(new Fixnum(3))));
That eliminates the need to set up a stream, read from a string, and
run all the characters through the reader, including any dispatch
functions.
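To make the idea concrete, here is a rough sketch of an emitter that
turns a proper list of fixnums into such a constructor expression. The
ListEmitter class is hypothetical and not part of ABCL; it assumes the
one-argument Cons constructor used above and assumes Lisp.NIL as the
name for the empty list.

```java
import java.util.List;

// Hypothetical helper, not part of ABCL: emits Java source text that
// builds a proper list of fixnums directly, without the reader.
final class ListEmitter {
    static String emit(List<Integer> items) {
        if (items.isEmpty()) {
            return "Lisp.NIL"; // assumed name for the empty list
        }
        StringBuilder sb = new StringBuilder();
        // Open one Cons per element; the last Cons gets a single argument.
        for (int i = 0; i < items.size(); i++) {
            sb.append("new Cons(new Fixnum(").append(items.get(i)).append(")");
            if (i < items.size() - 1) {
                sb.append(", ");
            }
        }
        // Close every Cons that was opened.
        for (int i = 0; i < items.size(); i++) {
            sb.append(")");
        }
        return sb.toString();
    }
}
```

A real version would have to handle nested lists, dotted pairs and
non-fixnum elements, falling back to the reader where needed.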
While that would be a nice optimization, I've run into a correctness
issue while investigating our current serialization.
Now consider the following macro:
(defmacro foo (x)
  `(defun bar ()
     (let ((a ,x)
           (b ,x))
       (eq a b))))
(foo '(1 2 3))
This macro generates an extreme example of two objects sharing
structure: they are the same object. The same object is supposed to be
assigned to both A and B. In the interpreter, a call to BAR returns T.
However, after file-compiling the above forms and loading the
resulting fasl, BAR returns NIL. Suddenly, the two objects are no
longer the same. The explanation is simple: the object is read
separately for A and for B, each with its own readObjectFromString
call, like this:
LispObject theObjectA = Lisp.readObjectFromString("(1 2 3)");
LispObject theObjectB = Lisp.readObjectFromString("(1 2 3)");
The above is clearly wrong. The solution to this case is very simple:
we should detect "duplicates" being serialized by the
DECLARE-OBJECT-AS-STRING function.
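A minimal sketch of that duplicate detection, written in Java purely
for illustration: the real DECLARE-OBJECT-AS-STRING is Lisp code, and
the ObjectDeclarations class and the OBJn field names below are
made up. The point is only that the table is keyed on identity (EQ),
not on printed representation.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for the compiler's bookkeeping of serialized objects.
final class ObjectDeclarations {
    // IdentityHashMap compares keys with ==, mirroring EQ rather than EQUAL.
    private final Map<Object, String> fieldByObject = new IdentityHashMap<>();
    private final StringBuilder setupCode = new StringBuilder();

    // Returns the field holding OBJ, emitting its declaration on first use only.
    // PRINTED is inserted as-is; escaping of quotes is left out of this sketch.
    String declare(Object obj, String printed) {
        String existing = fieldByObject.get(obj);
        if (existing != null) {
            return existing;
        }
        String name = "OBJ" + fieldByObject.size();
        fieldByObject.put(obj, name);
        setupCode.append("LispObject ").append(name)
                 .append(" = Lisp.readObjectFromString(\"")
                 .append(printed).append("\");\n");
        return name;
    }

    String setupCode() {
        return setupCode.toString();
    }
}
```

With this, declaring the same object for A and for B yields one field
and one readObjectFromString call, while two distinct objects that
merely print alike still get separate fields.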
Although the above solution would solve the simple case of object
identity, there are other examples which it won't solve. Consider the
following macro:
(defmacro foo (x)
  `(defun bar ()
     (let ((a ,x)
           (b `(0 ,,x)))
       (eq a (cadr b)))))
The macro above generates two different values to be serialized by the
DECLARE-OBJECT-AS-STRING function, so duplicate detection alone fails
to notice that the two share structure. Possibly, we could come up
with a way to detect objects with shared structure and generate the
right class-setup code to make the above example work.
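One possible way to detect shared structure is to walk all literals of
a class with an identity-based "seen" set and record every cons that
is reached a second time. The sketch below uses a made-up Cell class
instead of ABCL's Cons and only looks at conses; vectors, structures
and so on would need the same treatment.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Minimal stand-in for a cons cell; not ABCL's Cons class.
final class Cell {
    final Object car;
    final Object cdr;

    Cell(Object car, Object cdr) {
        this.car = car;
        this.cdr = cdr;
    }
}

final class SharedStructure {
    // Returns every Cell that is reached more than once while walking ROOTS.
    static Set<Cell> findShared(Object... roots) {
        Set<Cell> seen = Collections.newSetFromMap(new IdentityHashMap<Cell, Boolean>());
        Set<Cell> shared = Collections.newSetFromMap(new IdentityHashMap<Cell, Boolean>());
        for (Object root : roots) {
            walk(root, seen, shared);
        }
        return shared;
    }

    private static void walk(Object obj, Set<Cell> seen, Set<Cell> shared) {
        // Loop down the cdr chain and recurse only into cars.
        while (obj instanceof Cell) {
            Cell cell = (Cell) obj;
            if (!seen.add(cell)) {
                shared.add(cell);
                return; // already walked; this also stops on circular lists
            }
            walk(cell.car, seen, shared);
            obj = cell.cdr;
        }
    }
}
```

Each cell in the resulting set would then need its own variable in the
class-setup code, built once and referenced from every literal that
contains it.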
But that's not all there is to this problem: the objects sharing
structure can be distributed over multiple Java classes; consider the
fact that our local functions are Java classes of their own, each with
their own constants.
One of the ideas that Alessio and I had was that it would be nice to
have a single pool of constants for an entire FASL, for the separate
function classes to refer to when required. Looking at the above
issue, I'd say we really do need such a facility. The only per-fasl
facility we currently have for preserving object identity is the
vector with anonymous symbols.
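To give an idea of what I mean by a single pool, here is a rough
sketch. FaslConstantPool and its methods are hypothetical names, and a
Supplier stands in for whatever code would actually construct the
constant; the only property that matters is that every function class
asking for the same index gets the very same object back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical per-FASL pool shared by all function classes of one fasl.
final class FaslConstantPool {
    private final List<Supplier<Object>> builders = new ArrayList<>();
    private final List<Object> values = new ArrayList<>();

    // Registers a constant and returns the index that function classes use.
    synchronized int register(Supplier<Object> builder) {
        builders.add(builder);
        values.add(null);
        return builders.size() - 1;
    }

    // Builds the constant on first request and returns that same object
    // on every later request. Assumes builders never return null.
    synchronized Object get(int index) {
        Object value = values.get(index);
        if (value == null) {
            value = builders.get(index).get();
            values.set(index, value);
        }
        return value;
    }
}
```

The outer function and each local function class would then call
get() with the same index instead of each reading their own copy.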
Any ideas on how to tackle this issue? I suppose other Lisps don't
have this issue because they don't use separate classes for their
functions and have more influence over the storage format of their
data in the fasl...
Bye,
Erik.