Update of /project/elephant/cvsroot/elephant
In directory common-lisp.net:/tmp/cvs-serv26826

Modified Files:
	NOTES
Log Message:
updates

Date: Sun Sep 19 19:39:59 2004
Author: blee

Index: elephant/NOTES
diff -u elephant/NOTES:1.4 elephant/NOTES:1.5
--- elephant/NOTES:1.4 Mon Aug 30 23:37:36 2004
+++ elephant/NOTES Sun Sep 19 19:39:59 2004
@@ -3,10 +3,10 @@
 GENERAL
 -------

-this has been optimized for use with CMUCL. it has been
-tested and somewhat optimized for allegro. SBCL and OpenMCL
-are definitely also desired targets. Lispworks is a target
-as well but less so: i don't have access to it.
+this has been optimized for use with CMUCL / SBCL. it has
+been tested and somewhat optimized for allegro. OpenMCL is
+definitely also a target. Lispworks is a target as well but
+less so: i don't have access to it.

 Theoretically one can port this to any lisp with a decent
 FFI and MOP. However since those are two of the less
@@ -46,6 +46,10 @@
 slot-boundp-using-class inside of shared-initialize, which
 necessitates some work.
+CMUCL doesn't do non-standard allocation types correctly, so
+we've created our own slot definition keyword :transient.
+In the future this will change.
+
 Andrew will add some notes here in the future.

 -----------
@@ -89,8 +93,25 @@
 over ordinary hash-tables from the point of view of
 persistence.
-TODO: programmatic way to create secondary indicies
-(probably Lisp-level, since FFI callbacks are nasty.)
+There is a separate table for BTrees. This is because we
+use a hand coded C function for sorting, which understands a
+little of the serialized data. It can handle numbers (up to
+64-bit bignums -- they are approximated by floats) and
+strings (case-insensitive for 8-bit, code-point-order for
+16-bit Unicode.) It should be fast but we don't want a
+performance penalty on objects.
+
+Secondary indices are mostly handled on the lisp side,
+because of our weird table layout (see below) and to avoid
+crossing FFI boundaries. Some unscientific microbenchmarks
+indicated that there was no performance benefit on CMUCL /
+SBCL, and only minor benefit (asymptotically nil) on
+OpenMCL. They have a separate table. Actually two handles
+are opened on this table: one which is plain, and one which
+is associated to the primary btree table by a no-op indexing
+function. Since we maintain the secondary keys ourselves,
+the associated handle is good for gets / cursor traversals.
+We use the unassociated handle for updates.

 ----------
 CONTROLLER
@@ -142,13 +163,15 @@
OID + Slot ID
-Collections use
+Collections use 2 tables, one for primaries and one for
+secondaries (which supports duplicates.) They are keyed on
OID + key
-the root object is a btree with OID = 0. Since keys are
+The root object is a btree with OID = 0. Since keys are
 lexicographically ordered, this will create cache locality
-for items in the same persistent object / collection.
+for items in the same persistent object / collection. We
+use a custom C sorter for the btree tables.
Other layout options:
@@ -214,7 +237,7 @@
 CMUCL's consing dpb/ldb arithmetic means serializing bignums
 conses (but they shouldn't have to!) Serializing everything
 else should not cons (with the exception of maybe symbols
-and pathnames.)
+and pathnames.) SBCL seems much better with this.

 Deserialization of fixnums is non-consing. floats appear to
 cons on CMUCL, i'm not sure if this is just because of
@@ -300,15 +323,17 @@
 pointer-arithmetic is bignum and therefore consing.
 TODO: write faster, lispier versions of the
-pointer-arithmetic functions. (Definitely possible under
-OpenMCL; maybe possible using SAP arithmetic under CMUCL.
-Dunno about Allegro, Lispworks.)
+pointer-arithmetic functions. This is done for CMUCL /
+SBCL. (Definitely possible under OpenMCL. Dunno about
+Allegro, Lispworks.)

 CMUCL et al can't do dynamic-extent buffers, so we use
 globals bound to specials, which should be thread-safe if
 properly initialized. While we provide functions talk to
-the DB using strings, Elephant itself only uses foreign char
-buffers.
+the DB using strings, Elephant itself only uses
+"buffer-streams", which are structures which have a
+stream-like interface to foreign char buffers for reading /
+writing C datatypes.

 Lispworks is much happier passing back and forth statically
 allocated lisp arrays. since the general string will almost