Update of /project/elephant/cvsroot/elephant/doc
In directory clnet:/tmp/cvs-serv18053

Modified Files:
	tutorial.texinfo user-guide.texinfo
Log Message:
First cut new tutorial
--- /project/elephant/cvsroot/elephant/doc/tutorial.texinfo 2007/03/25 11:04:38 1.8 +++ /project/elephant/cvsroot/elephant/doc/tutorial.texinfo 2007/03/26 03:37:27 1.9 @@ -355,16 +355,9 @@
Using the @code{persistent-metaclass} metaclass declares all slots to be persistent by default. To make a non-persistent slot use the -@code{:transient t} flag. Class slots are never persisted, for either -persistent or ordinary classes. (Someday, if we choose to store class -objects, this policy decision may change). - -Readers, writers, accessors, and @code{slot-value-using-class} are -instrumented, so override these with care. Because @code{slot-value, -slot-unboundp, slot-makunbound} are not generic functions, they are -not guaranteed to work properly with persistent slots. Use the -@code{*-using-class} versions or the @code{closer-to-mop} MOP compliance -layer by Pascal Costanza (we may integrate this in later versions). +@code{:transient t} flag. Class slots (@code{:allocation :class}) are +never persisted, for either persistent or ordinary classes. (Someday, +if we choose to store class objects, this policy may change).
Persistent classes may inherit from other classes. Slots inherited from persistent classes remain persistent. Transient slots and slots @@ -372,47 +365,105 @@ cannot inherit from persistent classes -- otherwise persistent slots could not be stored!
-Note that the database is read every time you access a slot. This is -a feature, not a bug, especially in concurrent situations: you want -the most recent commits, right? Note that this can be used as a weak -form of IPC. But also note that in particular, if your slot value is -not an immediate value, reading will cons or allocate the value. Gets -are not an expensive operation; you can perform thousands to tens of -thousands of primitive reads per second. However, if you're -concerned, cache large values in memory. +@lisp +(defclass stdclass1 () + ((slot1 :initarg :slot1 :accessor slot1))) + +(defclass stdclass2 (stdclass1) + ((slot2 :initarg :slot2 :accessor slot2))) + +(defpclass pclass1 (stdclass2) + ((slot1 :initarg :slot1 :accessor slot1) + (slot3 :initarg :slot3 :accessor slot3))) + +(setf pinst (make-instance 'pclass1 :slot1 1 :slot2 2 :slot3 3)) +=> #<PCLASS1 @{x10deb88a@}> + +(add-to-root 'pinst pinst) +=> #<PCLASS1 @{x10deb88a@}> + +(slot1 pinst) +=> 1 + +(slot2 pinst) +=> 2 + +(slot3 pinst) +=> 3 +@end lisp + +Now we can simulate a new lisp session by flushing the instance cache, +reloading our object, and then seeing which slots remain. Here persistent +slot1 should shadow the standard slot1 and thus be persistent. Slot3 +is persistent by default and slot2, since it is inherited from a +standard class, should be transient. + +@lisp +(elephant::flush-instance-cache *store-controller*) +=> #<EQL hash-table with weak values, 0 entries @{x11198a02@}> + +(setf pinst (get-from-root 'pinst)) +=> #<PCLASS1 @{x1119b652@}> + +(slot1 pinst) +=> 1 + +(slot-boundp pinst 'slot2) +=> nil + +(slot3 pinst) +=> 3 +@end lisp + +Using persistent objects has implications for the performance of your +system. Note that the database is read every time you access a slot. +This is a feature, not a bug, especially in concurrent situations: you +want the most recent commits by other threads, right? This can be +used as a weak form of IPC.
But also note that in particular, if your +slot value is not an immediate value or persistent object, reading +will cons or freshly allocate storage for the value. + +Gets are not an expensive operation; you can perform thousands to tens +of thousands of primitive reads per second. However, if you're +concerned, cache large values in memory and avoid writing them back to +disk as long as you can.
@node Persistent collections @comment node-name, next, previous, up @section Persistent collections
-The remaining problem outlined in @ref{Serialization} is that -operations which mutate aggregate objects are not persistent. While -we solved this problem for objects, there is no collection type such -as arrays, hashes or lists which provide this ability. Elephant -provides two primary types of collections, a @code{btree} and a -@code{indexed-btree}. - -We will focus on the core concepts of BTrees in this section, for a -detailed review including the behavior of indexed BTrees, @pxref{Using -BTrees}, @ref{Secondary Indices} and @ref{Using Cursors} in the -@ref{User Guide}. - -Elephant provides a rich data structure called a BTree for storing -large sets of key-value pairs. Every key-value pair is stored -independantly in Elephant just like persistent object slots. -Therefore they inherit all the nice properties of persistent objects: -identity, fast serialization / deserialization, no merge conflicts, -etc. +The remaining problem outlined in the section on @ref{Serialization} +is that operations which mutate collection types do not have +persistent side effects. We have solved this problem for objects, but +not for collections such as arrays, hashes or lists. Elephant's +solution to this problem is the @code{btree} class which provides +persistent addition, deletion and mutation of elements. + +The BTree stores arbitrarily sized sets of key-value pairs ordered by +key. Every key-value pair is stored independently in Elephant just +like persistent object slots. They inherit all the important +properties of persistent objects: btree identity and fast +serialization / deserialization. They also resolve the mutated +substructure and nested aggregates problem for collections. Every +mutating write to a btree is an independent and persistent operation +and you can serialize or deserialize a btree without serializing any +of its key-value pairs.
The primary interface to @code{btree} objects is through -@code{get-value}. You can also @code{setf} @code{get-value} to store -key-value pairs. +@code{get-value}. You use @code{setf} @code{get-value} to store +key-value pairs. This interface is very similar to @code{gethash}. + +The following example creates a btree called +@code{*friends-birthdays*} and adds it to the root so we can retrieve +it during a later session. We then add two key-value pairs +consisting of the name of a friend and a universal time encoding their +birthday.
@lisp (defvar *friends-birthdays* (make-btree)) => *FRIENDS-BIRTHDAYS*
-(add-to-root "friends-birthdays" *friends-birthdays*) +(add-to-root 'friends-birthdays *friends-birthdays*) => #<BTREE @{4951CF6D@}>
(setf (get-value "Ben" *friends-birthdays*) @@ -445,8 +496,8 @@ due to serialization semantics may be strange for other values like arrays, lists, standard-objects, etc.
-Because elements are sorted by value, we should be able to iterate -over all the elements of the BTree in order. We entered the data in +Because elements are sorted by key, we can iterate over all the +elements of the BTree in order. Notice that we entered the data in reverse alphabetic order, but will read it out in alphabetical order.
@lisp @@ -459,16 +510,21 @@ => NIL @end lisp
-But what if we want to read out our friends from oldest to youngest, -or youngest to oldest? In the @ref{User Guide}, specifically the -section on @ref{Secondary indices} you will discover ways to sort -according to the order defined by a lisp function of the key-value pair. +But what if we want to read out our friends from oldest to youngest? +One way is to employ another btree that maps birthdays to names, but +this will require storing values multiple times for each update and +increases the burden on the programmer. Elephant provides a better +way. + +The next section @ref{Indexing Persistent Classes} shows you how to +order and retrieve persistent classes by one or more slot values. +
@node Indexing Persistent Classes @comment node-name, next, previous, up @section Indexing Persistent Classes
-Class indices simplify the recording and retrieving of persistent +Class indexing simplifies the storing and retrieval of persistent objects. An indexed class stores every instance of the class that is created, ensuring that every object is automatically persisted between sessions. @@ -571,16 +627,20 @@ increase the cost of writes and disk storage, each entry is only slightly larger than the size of the slot value. Numbers, small strings and symbols are good candidate types for indexed slots, but -any value may be used, even different types. +any value may be used, even different types. Once a slot is indexed, +we can use the index to retrieve objects by slot values.
-Once we've indexed a slot, we can use another set of -@code{get-instances} and @code{map} functions to access objects -in-order and by their slot value. +@code{get-instances-by-value} will retrieve all instances that are +equal to the value argument.
@lisp (get-instances-by-value 'friends 'name "Carlos") => (#<Carlos>) +@end lisp
+But more interestingly, we can retrieve objects for a range of values. + +@lisp (get-instances-by-range 'friends 'name "Adam" "Devin") => (#<Adriana> #<Carlos>)
@@ -591,78 +651,67 @@ name: Zaid birthdate: (14 8 1976) name: Adriana birthdate: (24 4 1980) => (#<Zaid> #<Adriana>) +@end lisp + +To retrieve all instances of a class in the order of the index instead +of the arbitrary order returned by @code{get-instances-by-class} you +can use nil in place of the start and end values to indicate the +first or last element. (Note: to retrieve instances with null values, use +@code{get-instances-by-value} with nil as the argument).
-(map-class-index #'print-friend 'friend 'name "Carlos" "Carlos") +@lisp +(get-instances-by-range 'friend 'name nil "Sandra") +=> (#<Adriana> #<Carlos>) + +(get-instances-by-range 'friend 'name nil nil) +=> (#<Adriana> #<Carlos> #<Zaid>) +@end lisp + +There are also functions for mapping over instances of a slot index. +To map over values, use the :value keyword argument. To map by range, +use the :start and :end arguments. + +@lisp +(map-class-index #'print-friend 'friend 'name :value "Carlos") name: Carlos birthdate: (1 1 1972) => NIL
-(map-class-index #'print-friend 'friend 'name "Adam" "Devin") +(map-class-index #'print-friend 'friend 'name :start "Adam" :end "Devin") name: Adriana birthdate: (24 4 1980) name: Carlos birthdate: (1 1 1972) => NIL
(map-class-index #'print-friend 'friend 'birthday - (encode-birthday '(1 1 1974)) - (encode-birthday '(31 12 1984))) + :start (encode-birthday '(1 1 1974)) + :end (encode-birthday '(31 12 1984))) name: Zaid birthdate: (14 8 1976) name: Adriana birthdate: (24 4 1980) => NIL
-(map-class-index #'print-friend 'friend 'birthday nil (encode-birthday '(10 10 1978))) +(map-class-index #'print-friend 'friend 'birthday + :start nil + :end (encode-birthday '(10 10 1978))) name: Carlos birthdate: (1 1 1972) name: Zaid birthdate: (14 8 1976) => NIL
(map-class-index #'print-friend 'friend 'birthday - (encode-birthday '(10 10 1975)) - nil) + :start (encode-birthday '(10 10 1975)) + :end nil) name: Zaid birthdate: (14 8 1976) name: Adriana birthdate: (24 4 1980) => NIL @end lisp
-You can enable/disable class indexing for an entire class. When you disable -indexing all references to instances of that class are lost. If you re-enable -class indexing only newly created classes will be stored in the class index. -You can manually restore them by using @code{find-class-index} to get the -clas index BTree if you have an alternate in-memory index. - -You can add/remove a secondary index for a slot. So long as the class index -remains, this can be done multiple times without losing any data. - -There is also a facility for defining 'derived slots'. These can be non-slot -parameters which are a function of the class's persistent slot values. For -example you can use an index to keep an alternate representation available -for fast indexing. If an object has an x,y coordinate, you could define a -derived index for r,theta which stored references in polar coordinates. -These would be ordered so you could iterate over a class-index to get objects -in order of increasing radius from the origin or over a range of theta. - -Beware, however, that derived indices have to compute their result every -time you update any persistent instance's slot. This is because there is -no way to know which persistent slots the derived index value(s) depends -on. Thus there is a fairly significant computational cost to objects -with frequent updates having derived indices. The storage cost, however, -may be less as all that is added is the index value and an OID reference -into the class index. To add a slot value you add a serialized -OID+class-ref+slotname to index value which can be much larger if you -use long slotnames and package names and unicode. - -Thus, the question of if and how a given class should be indexed is -very flexible and dynamic, and does not need to be determined at the -beginning of your development. This represents the ability to ``late bind'' -the decision of what to index. 
- -In general, there is always a tradeoff: an indexed slot increases storage -associated with that slot and slows down write operations. Reads however remain -as fast as for unindexed persistent slots. The Elephant system -makes it simple to choose where and when one wants to utilize this tradeoff. - -Finally, that file @file{src/elephant/classindex-utils.lisp} documents -tools for handling class redefinitions and the policy that should be -used for synchronizing the classes with the database. This process is -somewhat user customizable; documentation for this exists in the source -file referenced above. +The @ref{User Guide} contains descriptions of the advanced features +of @ref{Class indices} such as ``derived indices'' that allow you to +order classes according to an arbitrary function, a dynamic API for +adding and removing slots and how to set a policy for resolving +conflicts between the code image and a database where the indexing +specification differs. + +This same facility is also available for your own use. For more +information @pxref{Using Indexed BTrees}.
@node Using Transactions @@ -670,24 +719,24 @@ @section Using Transactions
One of the most important features of a database is that operations -satisfy the ACID properties: Atomic, Consistent, Isolated, and +enforce the ACID properties: Atomic, Consistent, Isolated, and Durable. In plainspeak, this means that a set of changes is made all at once, that the database is never partially updated, that each set of changes happens sequentially and that a change, once made, is not lost.
Elephant provides this protection for all primitive operations. For -example, when you write a value to an indexed BTree, the update to the -BTree and all of its indices is protected by a transaction that -peforms atomic updates to all the BTrees, thus maintaining their -consistency. - -Most real applications will need to have explicit transactions because -you will want one or more read-modify-update operations to happen as -an atomic unit. A common motivating example for this is a banking -system. If a thread is going to modify a balance, we don't want -another thread modifying it in the middle of the operation or one of -the modifications may be lost. +example, when you write a value to an indexed slot, the update to the +persistent slot record as well as the slot index is protected by a +transaction that performs all the updates atomically, thus +enforcing consistency. + +Most real applications will need to use explicit transactions rather +than relying on the primitives alone because you will want multiple +read-modify-update operations to act as an atomic unit. A good example +of this is a banking system. If a thread is going to modify a +balance, we don't want another thread modifying it in the middle of +the operation or one of the modifications may be lost.
@lisp (defvar *accounts* (make-btree)) --- /project/elephant/cvsroot/elephant/doc/user-guide.texinfo 2007/03/25 11:04:38 1.2 +++ /project/elephant/cvsroot/elephant/doc/user-guide.texinfo 2007/03/26 03:37:27 1.3 @@ -34,6 +34,12 @@ @code{initforms} are always evaluated, so beware. (What is the current model here?)
+Readers, writers, accessors, and @code{slot-value-using-class} are +employed in redirecting slot accesses to the database, so override +these with care. Because @code{slot-value, slot-boundp, +slot-makunbound} are not generic functions, they are not guaranteed by +the specification to work properly with persistent slots. However, the +proper behavior has been verified on SBCL, Allegro and Lispworks.
@node The Store Controller @comment node-name, next, previous, up