The other day, Alessio and I were talking about how unfortunate it is that we use the reader to create relatively simple structures, such as lists, in our Java classes. To illustrate:
(defun foo () (let ((x '(1 2 3))) x))
In the above definition, when file-compiled, the literal list (1 2 3) gets serialized into a string which is converted back into an object at load time, like this:
LispObject theObject = Lisp.readObjectFromString("(1 2 3)");
In our minds, it would be faster to generate:
LispObject theObject = new Cons(new Fixnum(1), new Cons(new Fixnum(2), new Cons(new Fixnum(3))));
That eliminates the need to set up a stream, read from a string, and run all the characters through the reader, including its dispatch functions.
While that would be a nice optimization, I've run into a correctness issue while investigating our current serialization.
Now consider the following macro:
(defmacro foo (x) `(defun bar () (let ((a ,x) (b ,x)) (eq a b))))
(foo '(1 2 3))
This macro can be used to generate an example of two objects sharing structure to the extreme: they are the same object. The same object is supposed to be assigned to A and B. In the interpreter, a call to BAR returns T. However, after file-compiling the above forms and loading the resulting fasl, BAR returns NIL.
Suddenly, the objects are not the same anymore. The explanation is simple: the object assigned to A and B is read separately for each of them with a readObjectFromString call, like this:
LispObject theObjectA = Lisp.readObjectFromString("(1 2 3)");
LispObject theObjectB = Lisp.readObjectFromString("(1 2 3)");
The above is clearly wrong. The solution to this case is very simple: we should detect "duplicates" being serialized by the DECLARE-OBJECT-AS-STRING function.
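To make the duplicate detection concrete, here is a rough sketch in plain Java. It is not the actual compiler code: the class ConstantDeduper, its method names and the OBJn field naming are all made up for illustration, and an int array stands in for the Lisp list. The point is only that the lookup has to be keyed on identity (what EQ compares), not on the printed representation.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the compiler's constant bookkeeping; not ABCL's actual code.
class ConstantDeduper {
    // Keyed on object identity (==), which is what EQ compares.
    private final Map<Object, String> fieldNames = new IdentityHashMap<>();
    // Field declarations the generated class would need, in order of first use.
    private final List<String> declarations = new ArrayList<>();

    // Returns the field holding `object`; a new field is declared only on first sight.
    String declareObjectAsString(Object object, String printed) {
        String existing = fieldNames.get(object);
        if (existing != null) {
            return existing;
        }
        String name = "OBJ" + fieldNames.size();
        fieldNames.put(object, name);
        declarations.add("LispObject " + name
                + " = Lisp.readObjectFromString(\"" + printed + "\");");
        return name;
    }

    List<String> declarations() {
        return declarations;
    }

    public static void main(String[] args) {
        ConstantDeduper deduper = new ConstantDeduper();
        Object literal = new int[] {1, 2, 3}; // stands in for the list (1 2 3)
        String a = deduper.declareObjectAsString(literal, "(1 2 3)");
        String b = deduper.declareObjectAsString(literal, "(1 2 3)");
        System.out.println(a.equals(b));                   // true
        System.out.println(deduper.declarations().size()); // 1
    }
}
```

With this, A and B in the example would both refer to the same generated field. Two distinct objects that merely print the same still get separate fields, which is the behaviour EQ requires.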
Although the above solution would solve the simple case of object equality, there are other examples which it won't solve. Consider the following macro:
(defmacro foo (x) `(defun bar () (let ((a ,x) (b `(0 ,,x))) (eq a (cdr b)))))
The macro above will generate two different values to be serialized by the DECLARE-OBJECT-AS-STRING function, so a check for duplicates alone fails to detect that the two share structure. Possibly, we could come up with a way to detect objects with shared structure and generate the right class-setup code to make the above example work.
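One possible way to detect the sharing, sketched in plain Java: walk every constant that is about to be serialized and record each cons cell by identity; a cell reached a second time is shared. This is similar in spirit to the bookkeeping the printer needs for #n= and #n# labels when *PRINT-CIRCLE* is true. The Cell class and the SharedStructureFinder name are made up for this sketch (a null cdr plays the role of NIL), so treat it as an illustration of the idea rather than a proposal for the actual code.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the names are illustrative only.
class SharedStructureFinder {
    // Minimal cons cell for the sketch; a null cdr plays the role of NIL.
    static final class Cell {
        final Object car;
        final Object cdr;

        Cell(Object car, Object cdr) {
            this.car = car;
            this.cdr = cdr;
        }
    }

    // Returns the cells that are reached more than once while walking all roots.
    static List<Cell> findShared(Object... roots) {
        Map<Cell, Integer> visits = new IdentityHashMap<>();
        List<Cell> shared = new ArrayList<>();
        for (Object root : roots) {
            walk(root, visits, shared);
        }
        return shared;
    }

    private static void walk(Object object, Map<Cell, Integer> visits, List<Cell> shared) {
        // Iterate along the cdr chain; recurse only into the car.
        while (object instanceof Cell) {
            Cell cell = (Cell) object;
            int count = visits.merge(cell, 1, Integer::sum);
            if (count > 1) {
                if (count == 2) {
                    shared.add(cell); // report each shared cell once
                }
                return; // everything below this cell was walked already
            }
            walk(cell.car, visits, shared);
            object = cell.cdr;
        }
    }

    public static void main(String[] args) {
        Cell a = new Cell(1, new Cell(2, new Cell(3, null))); // (1 2 3)
        Cell b = new Cell(0, a);                              // (0 1 2 3), tail is A itself
        List<Cell> shared = findShared(a, b);
        System.out.println(shared.size());      // 1
        System.out.println(shared.get(0) == a); // true
    }
}
```

Only the head of a shared tail is reported, because the walk stops descending once it meets a cell it has already seen. The shared cells are the ones that would need a field (or label) of their own in the generated setup code.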
But that's not all there is to this problem: the objects sharing structure can be distributed over multiple Java classes; consider the fact that our local functions are Java classes of their own, each with their own constants.
One of the ideas that Alessio and I had was that it would be nice to have a single pool of constants for an entire FASL, for the separate function classes to refer to when required. Looking at the above issue, I'd say we really do need such a facility. The only per-fasl facility for preserving object identity that we currently have is the vector with anonymous symbols.
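To give an idea of the shape such a pool could take, here is a minimal sketch in plain Java. Everything in it is hypothetical: FaslConstantPool, OuterFunction and LocalFunction are made-up names, and the sketch glosses over how the pool contents would be written to and restored from the fasl. It only shows the property we need: two separately compiled classes that ask for the same slot get the very same object.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-FASL pool; the names here are made up for illustration.
class FaslConstantPool {
    private final Map<Object, Integer> indexByIdentity = new IdentityHashMap<>();
    private final List<Object> slots = new ArrayList<>();

    // Compile side: returns the slot for `constant`, allocating one on first sight.
    int intern(Object constant) {
        Integer known = indexByIdentity.get(constant);
        if (known != null) {
            return known;
        }
        int index = slots.size();
        slots.add(constant);
        indexByIdentity.put(constant, index);
        return index;
    }

    // Load side: every function class fetches its constants through the pool.
    Object get(int index) {
        return slots.get(index);
    }

    int size() {
        return slots.size();
    }
}

// Stand-ins for two separately compiled function classes.
class OuterFunction {
    final Object constant;

    OuterFunction(FaslConstantPool pool, int slot) {
        this.constant = pool.get(slot);
    }
}

class LocalFunction {
    final Object constant;

    LocalFunction(FaslConstantPool pool, int slot) {
        this.constant = pool.get(slot);
    }
}

class FaslConstantPoolDemo {
    public static void main(String[] args) {
        FaslConstantPool pool = new FaslConstantPool();
        Object literal = new int[] {1, 2, 3}; // stands in for the list (1 2 3)
        int slotForOuter = pool.intern(literal);
        int slotForLocal = pool.intern(literal);
        OuterFunction outer = new OuterFunction(pool, slotForOuter);
        LocalFunction local = new LocalFunction(pool, slotForLocal);
        System.out.println(outer.constant == local.constant); // true
    }
}
```

The anonymous-symbols vector could presumably become just another set of slots in such a pool, but that is speculation on my part.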
Any ideas on how to tackle this issue? I suppose other lisps don't have this issue because they don't use separate classes for their functions and have more influence over the storage format of their data in the fasl...
Bye,
Erik.