Update of /project/elephant/cvsroot/elephant/doc In directory clnet:/tmp/cvs-serv4997/doc
Modified Files: reference.texinfo tutorial.texinfo user-guide.texinfo Log Message: Export bdb performance tweaks; lots more documentation; new ops for libberkeley-db
--- /project/elephant/cvsroot/elephant/doc/reference.texinfo 2007/04/21 17:22:35 1.11 +++ /project/elephant/cvsroot/elephant/doc/reference.texinfo 2007/04/25 02:27:56 1.12 @@ -16,6 +16,7 @@ * Cursors:: Traversing BTrees. * Transactions:: Transactions. * Migration and Upgrading:: Migration and upgrading. +* Miscellaneous:: Other functions and data store specific functions @end menu
@node Store Controllers @@ -109,6 +110,7 @@ @include includes/fun-elephant-find-item.texinfo @include includes/fun-elephant-map-pset.texinfo @include includes/fun-elephant-pset-list.texinfo +@include includes/fun-elephant-drop-pset.texinfo
@c @node Query Interfaces @c @comment node-name, next, previous, up @@ -148,6 +150,7 @@ @include includes/fun-elephant-get-index.texinfo @include includes/fun-elephant-get-primary-key.texinfo @include includes/fun-elephant-remove-index.texinfo +@include includes/fun-elephant-drop-pset.texinfo
@node Cursors @comment node-name, next, previous, up @@ -161,33 +164,35 @@ @include includes/fun-elephant-make-cursor.texinfo @include includes/fun-elephant-cursor-close.texinfo
-@include includes/fun-elephant-cursor-current.texinfo -@include includes/fun-elephant-cursor-delete.texinfo @include includes/fun-elephant-cursor-duplicate.texinfo +@include includes/fun-elephant-cursor-current.texinfo @include includes/fun-elephant-cursor-first.texinfo -@include includes/fun-elephant-cursor-get-both-range.texinfo -@include includes/fun-elephant-cursor-get-both.texinfo @include includes/fun-elephant-cursor-last.texinfo -@include includes/fun-elephant-cursor-next-dup.texinfo -@include includes/fun-elephant-cursor-next-nodup.texinfo @include includes/fun-elephant-cursor-next.texinfo +@include includes/fun-elephant-cursor-prev.texinfo +@include includes/fun-elephant-cursor-set.texinfo +@include includes/fun-elephant-cursor-set-range.texinfo +@include includes/fun-elephant-cursor-get-both.texinfo +@include includes/fun-elephant-cursor-get-both-range.texinfo +@include includes/fun-elephant-cursor-delete.texinfo +@include includes/fun-elephant-cursor-put.texinfo + @include includes/fun-elephant-cursor-pcurrent.texinfo @include includes/fun-elephant-cursor-pfirst.texinfo -@include includes/fun-elephant-cursor-pget-both-range.texinfo -@include includes/fun-elephant-cursor-pget-both.texinfo @include includes/fun-elephant-cursor-plast.texinfo -@include includes/fun-elephant-cursor-pnext-dup.texinfo -@include includes/fun-elephant-cursor-pnext-nodup.texinfo @include includes/fun-elephant-cursor-pnext.texinfo -@include includes/fun-elephant-cursor-pprev-nodup.texinfo @include includes/fun-elephant-cursor-pprev.texinfo -@include includes/fun-elephant-cursor-prev-nodup.texinfo -@include includes/fun-elephant-cursor-prev.texinfo -@include includes/fun-elephant-cursor-pset-range.texinfo @include includes/fun-elephant-cursor-pset.texinfo -@include includes/fun-elephant-cursor-put.texinfo -@include includes/fun-elephant-cursor-set-range.texinfo -@include includes/fun-elephant-cursor-set.texinfo +@include includes/fun-elephant-cursor-pset-range.texinfo +@include includes/fun-elephant-cursor-pget-both.texinfo +@include includes/fun-elephant-cursor-pget-both-range.texinfo + +@include includes/fun-elephant-cursor-next-dup.texinfo +@include includes/fun-elephant-cursor-next-nodup.texinfo +@include includes/fun-elephant-cursor-pnext-dup.texinfo +@include includes/fun-elephant-cursor-pnext-nodup.texinfo +@include includes/fun-elephant-cursor-prev-nodup.texinfo +@include includes/fun-elephant-cursor-pprev-nodup.texinfo
@node Transactions @comment node-name, next, previous, up --- /project/elephant/cvsroot/elephant/doc/tutorial.texinfo 2007/04/21 17:22:35 1.17 +++ /project/elephant/cvsroot/elephant/doc/tutorial.texinfo 2007/04/25 02:27:56 1.18 @@ -442,19 +442,118 @@ The remaining problem outlined in the section on @ref{Serialization} is that operations which mutate collection types do not have persistent side effects. We have solved this problem for objects, but -not for collections such as as arrays, hashes or lists. Elephant's -solution to this problem is the @code{btree} class which provides -persistent addition, deletion and mutation of elements. - -The BTree stores arbitrarily sized sets of key-value pairs ordered by -key. Every key-value pair is stored independantly in Elephant just -like persistent object slots. They inherit all the important -properties of persistent objects: btree identity and fast -serialization / deserialization. They also resolve the mutated -substructure and nested aggregates problem for collections. Every -mutating write to a btree is an independent and persistent operation -and you can serialize or deserialize a btree without serializing any -of it's key-value pairs. +not for collections such as as arrays, hashes or lists. Elephant +provides two solutions to this problem: the @code{pset} and +@code{btree} classes. Each provides persistent addition, deletion and +mutation of elements, but the pset is a simple data structure that may +be more efficient in memory and time than the more general btree. + +@subsection Using PSets + +The persistent set maintains a persistent, unordered collection of +objects. They inherit all the important properties of persistent +objects: identity and fast serialization. They also resolve the +mutated substructure and nested aggregates problem for collections. +Every mutating write to a @code{pset} is an independent and persistent +operation and you can serialize or deserialize a @code{pset} without +serializing any of it's key-value pairs. + +The @code{pset} is also a very convenient data structure for enabling +a persistent slot contain a collection that can be updated without +deserializing and/or reserializing a list, array or hash table on +every access. + +Let's explore this data structure through a (very) simple social +networking example. + +@lisp +(defpclass person () + ((name :accessor person-name :initarg :name)) + ((friends :accessor person-friends :initarg :friends))) +@end lisp + +Our goal here is to store a list of friends that each person has, this +simple graph structure enables analyses such as who are the friends of +my friends, or do I know someone who knows X or what person has the +minimum degree of separation from everyone else? + +Without psets, we would have to do something like this: + +@lisp +(defmethod add-friend ((me person) (them person)) + (let ((friends (person-friends me))) + (pushnew them friends) + (setf (person-friends me) friends))) + +(defmethod remove-friend ((me person) (them person)) + (let ((remaining-friends (delete them (person-friends me)))) + (setf (person-friends me) remaining-friends))) + +(defmethod map-friends (fn (me person)) + (mapc fn (person-friends me))) +@end lisp + +Ouch! This results in a large amount of consing. We have to +deserialize and generate a freshly consed list every time we call +@code{person-friends} and then reserialize and discard it on every +call to @code{(setf person-friends)}. + +Instead, we can simply use a @code{pset} as the value of friends and +implement the add and remove friend operations as follows: + +@lisp +(defpclass person () + ((name :accessor person-name :initarg :name)) + ((friends :accessor person-friends :initarg :friends :initform (make-pset)))) + +(defmethod add-friend ((me person) (them person)) + (insert-item them (person-friends me))) + +(defmethod remove-friend ((me person) (them person)) + (remove-item them (person-friends me))) + +(defmethod map-friends (fn (me person)) + (map-pset fn (person-friends me))) +@end lisp + +If you want a list to be returned when the user calls person-friends +themselves, you can simply rejigger things like this: + +@lisp +(defpclass person () + ((name :accessor person-name :initarg :name)) + ((friends :accessor person-friends-set :initarg :friends :initform (make-pset)))) + +(defmethod person-friends ((me person)) + (pset-list (person-friends-set me))) +@end lisp + +If you just change the person-friends calls in our prior functions, +the new set of functions removes @code{(setf person-friends)}, which +doesn't make sense for a collection slot, allows users to get a list +of the friends for easy list manipulations and avoids all the consing +that plagued our earlier version. + +You can use a @code{pset} in any way you like just like a persistent +object. The only difference is the api used to manipulate it. +Instead of slot accessors, we use insert, remove, map and find. + +There is one drawback to persistent sets and that is that they are not +garbage collected. Over time, orphaned sets will eat up alot of disk +space. Therefore you need to explicitly free the space or resort to +more frequent uses of the migrate procedure to compact your database. +The pset supports the @code{drop-pset} + +However, given that persistent objects have the same explicit storage +property, using psets to create collection slots is a nice match. + +@subsection Using BTrees + +BTrees are collections of key-value pairs ordered by key with a log(N) +random access time and a rich iteration mechanism. Like persistent +sets, they solve all the collection problems of the prior sections. +Every key-value pair is stored independently in Elephant just like +persistent object slots.
The primary interface to @code{btree} objects is through @code{get-value}. You use @code{setf} @code{get-value} to store --- /project/elephant/cvsroot/elephant/doc/user-guide.texinfo 2007/04/24 16:39:30 1.16 +++ /project/elephant/cvsroot/elephant/doc/user-guide.texinfo 2007/04/25 02:27:57 1.17 @@ -9,20 +9,22 @@ * The Store Controller:: Behind the curtain. * Serialization details:: The devil hides in the details. * Persistent Classes and Objects:: All the dirt on persistent objects. -* Class Indices:: In-depth discussion about indexing persistent indices. -* Using BTrees:: Using the native btree. -* Using Cursors:: Low-level access to BTrees. -* The BTree index:: Alternative ways to reference objects in btrees +* Class Indices:: In-depth discussion about indexing objects. +* Persistent Sets:: Using the persistent set. +* Persistent BTrees:: Using the native btree. +* BTree Cursors:: Low-level access to BTrees. +* BTree Indexing:: Alternative ways to reference objects in btrees. +* Index Cursors:: Low-level access to BTree indices. +* Multi-threaded Applications:: What considerations are required for safe multi-threading * Transaction Details:: Develop a deeper understanding of transactions and avoid the pitfalls. * Multi-repository Operation:: Specifying repositories. -* Multi-threaded Applications:: What considerations are required for safe multi-threading * Multiple Processes and Distributed Applications:: Can elephant be run on multiple CPUs and multiple machines? * Repository Migration and Upgrade:: How to move objects from one repository to another. -* Garbage Collection:: How to recover storage and OIDs in long-lived repositories. * Performance Tuning:: How to get the most from Elephant. +* Garbage Collection:: How to recover storage and OIDs in long-lived repositories. * Berkeley DB Data Store:: Commands and concerns specific to the :BDB data store -* CL-SQL Data Store:: Commands and concerns specific to the :CLSQL data store -* Postmodern Data Store:: +* CLSQL Data Store:: Commands and concerns specific to the :CLSQL data store +* Postmodern Data Store:: * Native Lisp Data Store:: @end menu
@@ -46,7 +48,7 @@
@itemize @item Berkeley DB: '(:BDB "/path/to/datastore/directory/") -@item CL-SQL: '(:CLSQL (<sql-db-name> <sql-connect-command>)) +@item CLSQL: '(:CLSQL (<sql-db-name> <sql-connect-command>)) @end itemize
Valid CLSQL database tags for @code{<sql-db-name>} are @@ -84,7 +86,7 @@ @end lisp
The data store sections of the user guide (@ref{Berkeley DB Data -Store} and @ref{CL-SQL Data Store}) list all the data-store specific +Store} and @ref{CLSQL Data Store}) list all the data-store specific options to various elephant functions.
When you finish your application, @code{close-store} will close the @@ -792,37 +794,155 @@ improve error handling for this case, so you can delete the derived index and continue performing the write to a persistent object that flagged the error.}
-@node Using BTrees +@node Persistent Sets +@comment node-name, next, previous, up +@section Persistent Sets + +Persistent sets are fairly straightforward and are well-introduced by +the tutorial, please review the tutorial or read the reference section +for persistent sets. + +@node Persistent BTrees @comment node-name, next, previous, up -@section Using BTrees +@section Persistent BTrees
-A BTree is a data structure designed for on-disk databases. -Traditional binary trees are a tree structure that stores a key and -value and two links in each node. To get to a value, you compare your -query key to the node key. If it is equal, return the value in this -node. If it is less, follow the first link and if it is greater, -follow the second link. The problem here is that every link requires -a disk seek. +A BTree is a data structure designed for on-disk databases. It's +design goal is to minimize the number of disk seeks while traversing +the tree structure. In contrast to a binary tree, the BTree exploits +the properties of memory/disk data heirarchies. Disk seeks are +expensive while loading large blocks of data is relatively inexpensive +and in-memory scanning of a block of memory is much cheaper than a +disk seek. This means a few, large nodes containing many keys is +a more balanced data structure than + +The BTree, or derivatives, are the basis of most record-oriented +database including SQL servers, Berkeley DB and many others. Elephant +directly exposes the BTree structure to the user so the user can +decide how best to manage and traverse it. Many of Elephant's other +facilities, such as the class indexing discussed above, are +implemented on top of the BTree. + +The basic interface to the BTree is via the @code{get-value} method. +Both the key and the value are serialized and then the BTree is +traversed according to the sorted order of the key and the value +inserted in its sorted order. Insertion, access and deletion (via +@code{remove-kv}) are all O(log N) complexity operations. + +Sorting in BTrees requires some discussion. The sorting constraints +on btrees are dictated by the original implementation on Berkeley DB. +The Berkeley DB data store sorts keys based on their serialized +representation. The CLSQL implementation has to sort based on the +deserialized lisp value, so sorted traversals require reading all the +objects into memory. This places some limitations on systems that +exploit the CLSQL implementation (@pxref{CLSQL Data Store} for more +information). + +Sorting is done first by primitive type (string, standard-class, +array, etc) and then by value within that type. The type order and +internal sorting constraint is:
-The BTree exploits the properties of memory/disk data heirarchies. -The key properties are that disk seeks are expensive, loading large -blocks of data is relatively inexpensive after a seek and comparisons -on objects that are stored in memory is cheap. +@enumerate +@item Numbers. All numbers are sorted as a class by their numeric value. Effectively +all numbers are coerced into a double float and sorted relative to each other. +@item Strings. Because the serializer stores strings in variable width structures. Each width type is sorted separately, then sorted lexically. (NOTE: This should get fixed for 1.0. Strings should be sorted together) +@item Pathnames. Sorted by their string radix then lexically. +@item Symbols. Sorted by string radix, then lexically. +@item Aggregates. Sorted by type in the following order, then arbitrarily internally. Persistent instance references, cons, hash-table, standard objects, arrays, structs and then nil. +@end enumerate
+String comparisons are case insensitive today, so @code{"Adam" = +"adam" > "Steve" }. When unicode support is finalized, comparisons +will be case sensitive. + +Like persistent sets, BTrees are not garbage collected so to recover +the storage of a BTree, just run the function @code{drop-btree} to +delete all the key-value pairs and return their storage to the +database for reuse. The oid used by the btree, however, will not be +recovered. + +@c *** FINISH *** +@node BTree Cursors +@comment node-name, next, previous, up +@section BTree Cursors
+Aside from getting, setting and dropping key-value pairs from the +database, you can also traverse the BTree structure one key-value pair +at a time. + +Creating cursors (inside transactions) +Duplicating cursors + +Initializing cursors to location +- first, last, set +Traversing +- next, prev, (set) +Side effects +- delete +- put +- Can also use setf get-value while traversing
-@node Using Cursors +Large cursor operations (locks, resources?) + +@c *** FINISH *** +@node BTree Indexing @comment node-name, next, previous, up -@section Using Cursors +@section BTree Indexing
-- initialized, not-initialized
-@node The BTree Index +@c *** FINISH *** +@node Index Cursors @comment node-name, next, previous, up -@section The BTree Index +@section Index Cursors + +Index cursors are just like BTree cursors except you can get the main +BTree value instead of the index value. There are also a parallel set +of operations such as @code{cursor-pnext} instead of +@code{cursor-next} which returns @code{exists}, @code{key}, @code{primary-btree-value} +and @code{index-value = primary-btree-key}. + +Operations that have the same behavior, but return primary btree values and keys are: + +@c need table here + +@itemize +@item cursor-first -> cursor-pfirst, cursor-last -> cursor-plast +@item cursor-current -> cursor-pcurrent +@item cursor-next -> cursor-pnext, cursor-prev -> cursor-pprev +@item cursor-set-range +@end itemize + +The big difference between btree cursors and index cursors is that +indices can have duplicate key values. This means we have to choose +between incrementing over elements, unique key-values or only within +a duplicate segment. There are cursor operations for each: + +@itemize +@item Simple move. Standard btree operations plus @code{cursor-pnext} and @code{cursor-pprev}. +@item Move to next key value. @code{cursor-pnext-nodup} and @code{cursor-pprev-nodup}. +@item Move to next duplicate. +@end itemize + +@c FINISH +@node Multi-threaded Applications +@comment node-name, next, previous, up +@section Multi-threaded Applications + +Berkeley DB plays well with threads and processes. The store +controller is thread-safe by default, that is, can be shared amongst +threads. This is enabled by the @code{:thread} keyword argument which +defaults to true. Transactions may not be shared amongst threads +except serially. One thing which is NOT thread and process safe is +recovery, which should be run when no one is else is talking to the +database environment. + +The following shared regions of Elephant are protected by standard locks: +- buffer stream pool +- access to serializer circular buffers +- writes to connection db +- where else?
-Empty.
+@c *** FINISH *** @node Transaction Details @comment node-name, next, previous, up @section Transaction Details @@ -838,7 +958,7 @@ ;; ;; User and designer considerations: ;; - *current-transaction* is reserved for use by dynamic transaction context. The default global -;; value must always be null (no transaction). Each backend can set it to a different parameter +;; value must always be null (no transaction). Each data storeend can set it to a different parameter ;; within the dynamic context of an execute-transaction. ;; - Any closures returned from within a transaction cannot bind *current-transaction* ;; - Only a normal return value will result in the transaction being committed, any non-local exit @@ -848,45 +968,7 @@ ;; knowing that the transaction will be aborted ;;
-@node Multi-threaded Applications -@comment node-name, next, previous, up -@section Multi-threaded Applications - -Sleepycat plays well with threads and processes. The store controller -is thread-safe by default, that is, can be shared amongst threads. -This is set by the @code{:thread} key. Transactions may not be shared -amongst threads except serially. One thing which is NOT thread and -process safe is recovery, which should be run when no one is else is -talking to the database environment. Consult the Sleepycat docs for -more information. - -Elephant uses some specials to hold parameters and buffers. If you're -using a natively threaded lisp, you can initialize these specials to -thread-local storage by using the @code{run-elephant-thread} function, -assuming your lisp creates thread-local storage for let-bound -specials. (This functionality is currently broken) - -Persisting ordinary aggregate types (e.g. NOT persistent classes or -btrees) suffers from something called "merge-conflicts." Since -updating one value of an aggregate object requires the entire object -to be written to the database, in heavily threaded situations you may -overwrite changes another thread or process has committed. This is -not protected by transactions! - -Consider two processes operating on the same cons: - -@code{ ------start---read--update-car--write--commit---------------- --start------read--update-cdr-----------------write--commit-- -} - -Although the first process successfully committed its transaction, its -work (writing to the car) will be erased by the second transaction -(which writes both the car and cdr.) - -Persistent classes and persistent collections do not suffer from -merge-conflicts, since each slot / entry is a separate database entry. - +@c *** FINISH *** @node Multi-repository Operation @comment node-name, next, previous, up @section Multi-repository Operation @@ -895,7 +977,7 @@ actual database connections.
If a database spec is a string, it is assumed to be a BerkeleyDB path. -If it is a list, it is a assumed to be a CL-SQL connection specification. +If it is a list, it is a assumed to be a CLSQL connection specification. For example: @lisp ELE-TESTS> *testdb-path* @@ -934,7 +1016,15 @@ @comment node-name, next, previous, up @section Multiple Processes and Distributed Applications
-Can we do this? What do we have to say about this? +Just start up two lisp images and connect to the same database. +Transactions will ensure there is no interaction between processes. +This has not been extensively tested, but should work without any +problem. Any field experience will get reflected in this section of the manual + +Distributed applications may be supported if the underlying SQL server +or an appropriate Berkeley DB database is used. We provide no +documentation nor have we heard of this use-case. This remains +fertile ground for future investigation.
@node Repository Migration and Upgrade @comment node-name, next, previous, up @@ -1070,19 +1160,153 @@ @comment node-name, next, previous, up @section Garbage Collection
-GC is not implemented, but migration (@pxref{Repository Migration and -Upgrade}) will consolidate storage and recover OIDs which emulates GC. -No online solution is currently supported. - +Garbage collection is not implemented as part of the persistent object +protocol. However, the migration (@pxref{Repository Migration and +Upgrade}) mechanism will consolidate storage and recover OIDs which is +an effective offline GC. No online solution is currently anticipated.
@node Berkeley DB Data Store @comment node-name, next, previous, up @section Berkeley DB Data Store
+This section briefly describes special facilities of the Berkeley DB +data store and explains how persistent objects map onto it. Elephant +was originally written targeting only Berkeley DB. As such, the +design of Elephant was heavily influenced by the Berkeley DB architecture. + +Berkeley DB is a C library that very efficiently implements a database +by allowing the application to directly manipulate the memory pools +and disk storage without requiring communication through a server as +in many relational database applications. The library supports +multi-threaded and multi-process transactions through a shared memory +region that provides for shared buffer pools, shared locks, etc. Each +process in a multi-process application is independently linked to the +library, but shares the memory pool and disk storage. + +The following subsections discuss places where Berkeley DB provides +additional facilities to the Elephant interfaces described above. + +@subsection Architecture Overview + +The Berkeley DB data store (indicated by a @code{:BDB} in the data +store specification) supports the Elephant protocols using Berkeley DB +as a backend. The primary features of the BDB library that are used +are BTree databases, the transactional subsystem, a shared buffer pool +and unique ID sequences. + +All data written to the data store ends up in a BTree slot using a +transaction. There are two databases, one for persistent slot values +and one for btrees. The mapping of Elephant objects is quite simple. + +Persistent slots are written to a btree using a unique key and the +serialized value being written. The key is the oid of the persistent +object concatenated to the serialized name of the slot being written. +This ordering groups slots together on the disk + +@subsection Opening a Store + +When opening a store there are several special options you can invoke: + +@itemize +@item @strong{@code{:recover}} tells Berkeley DB to run recovery on the + underlying database. This is reasonably cheap if you do not need + to run recovery, but can take a very long time if you let your log + files get too long. This option must be run in a single-threaded + mode before other threads or processes are accessing the same database. +@item @strong{@code{:recover-fatal}} runs Berkeley DB catastrophic recovery (see BDB documentation). +@item @strong{@code{:thread}} set this to nil if you want to run single threaded, + it avoids locking overhead on the environment. The default is + to run @emph{free-threaded}. +@item The @strong{@code{:deadlock-detect}} launches a background process via + the run-shell commands of lisp. This background process connects to a Berkeley + DB database and runs a regular check for deadlock, freeing locks as appropriate + when it finds them. This can avoid a set of annoying crashes in Berkeley DB, + the very crashes that, in part, motivated Franz to abandon AllegroStore and write + the pure-Lisp AllegroCache. +@end itemize + +@subsection Starting a Transaction + +Berkeley DB transactions have a number of additional keyword +parameters that can help you tune performance or change the semantics +in Berkeley DB applications. They are summaried briefly here, see the +BDB docs for detailed information: + +@itemize +@item @strong{@code{:degree-2}} This option provides for cursor stability, that is whatever + object the cursor is currently at will not change, however prior +values read may change. This can significantly enhance performance if +you frequently map over a btree as it doesn't lock the entire btree, +just the current element. All transactions running concurrently over +the btree can commit without restarting. The global parameter +@code{*map-using-degree2*} determines the default behavior of this +option. It is set to true by default so that map has similar +semantics to lists. This violates both @emph{Atomicity and +Consistency} depending on how it is used. +@item @strong{@code{:read-uncommitted}} Allows reading data that has been written by other
[79 lines skipped]