Update of /project/elephant/cvsroot/elephant
In directory common-lisp.net:/tmp/cvs-serv26826
Modified Files:
NOTES
Log Message:
updates
Date: Sun Sep 19 19:39:59 2004
Author: blee
Index: elephant/NOTES
diff -u elephant/NOTES:1.4 elephant/NOTES:1.5
--- elephant/NOTES:1.4 Mon Aug 30 23:37:36 2004
+++ elephant/NOTES Sun Sep 19 19:39:59 2004
@@ -3,10 +3,10 @@
GENERAL
-------
-this has been optimized for use with CMUCL. it has been
-tested and somewhat optimized for allegro. SBCL and OpenMCL
-are definitely also desired targets. Lispworks is a target
-as well but less so: i don't have access to it.
+this has been optimized for use with CMUCL / SBCL. it has
+been tested and somewhat optimized for allegro. OpenMCL is
+definitely also a target. Lispworks is a target as well but
+less so: i don't have access to it.
Theoretically one can port this to any lisp with a decent
FFI and MOP. However since those are two of the less
@@ -46,6 +46,10 @@
slot-boundp-using-class inside of shared-initialize, which
necessitates some work.
+CMUCL doesn't do non-standard allocation types correctly, so
+we've created our own slot definition keyword :transient.
+In the future this will change.
+
Andrew will add some notes here in the future.
-----------
@@ -89,8 +93,25 @@
over ordinary hash-tables from the point of view of
persistence.
-TODO: programmatic way to create secondary indicies
-(probably Lisp-level, since FFI callbacks are nasty.)
+There is a separate table for BTrees. This is because we
+use a hand coded C function for sorting, which understands a
+little of the serialized data. It can handle numbers (up to
+64-bit bignums -- they are approximated by floats) and
+strings (case-insensitive for 8-bit, code-point-order for
+16-bit Unicode.) It should be fast but we don't want a
+performance penalty on objects.
+
+Secondary indices are mostly handled on the lisp side,
+because of our weird table layout (see below) and to avoid
+crossing FFI boundaries. Some unscientific microbenchmarks
+indicated that there was no performance benefit on CMUCL /
+SBCL, and only minor benefit (asymptotically nil) on
+OpenMCL. They have a separate table. Actually two handles
+are opened on this table: one which is plain, and one which
+is associated to the primary btree table by a no-op indexing
+function. Since we maintain the secondary keys ourselves,
+the associated handle is good for gets / cursor traversals.
+We use the unassociated handle for updates.
----------
CONTROLLER
@@ -142,13 +163,15 @@
OID + Slot ID
-Collections use
+Collections use 2 tables, one for primaries and one for
+secondaries (which supports duplicates.) They are keyed on
OID + key
-the root object is a btree with OID = 0. Since keys are
+The root object is a btree with OID = 0. Since keys are
lexicographically ordered, this will create cache locality
-for items in the same persistent object / collection.
+for items in the same persistent object / collection. We
+use a custom C sorter for the btree tables.
Other layout options:
@@ -214,7 +237,7 @@
CMUCL's consing dpb/ldb arithmetic means serializing bignums
conses (but they shouldn't have to!) Serializing everything
else should not cons (with the exception of maybe symbols
-and pathnames.)
+and pathnames.) SBCL seems much better with this.
Deserialization of fixnums is non-consing. floats appear to
cons on CMUCL, i'm not sure if this is just because of
@@ -300,15 +323,17 @@
pointer-arithmetic is bignum and therefore consing.
TODO: write faster, lispier versions of the
-pointer-arithmetic functions. (Definitely possible under
-OpenMCL; maybe possible using SAP arithmetic under CMUCL.
-Dunno about Allegro, Lispworks.)
+pointer-arithmetic functions. This is done for CMUCL /
+SBCL. (Definitely possible under OpenMCL. Dunno about
+Allegro, Lispworks.)
CMUCL et al can't do dynamic-extent buffers, so we use
globals bound to specials, which should be thread-safe if
properly initialized. While we provide functions to talk to
-the DB using strings, Elephant itself only uses foreign char
-buffers.
+the DB using strings, Elephant itself only uses
+"buffer-streams", which are structures which have a
+stream-like interface to foreign char buffers for reading /
+writing C datatypes.
Lispworks is much happier passing back and forth statically
allocated lisp arrays. since the general string will almost