Update of /project/elephant/cvsroot/elephant/doc In directory clnet:/tmp/cvs-serv1234/doc
Modified Files: scenarios.texinfo Log Message: Documentation edits; edi's lispworks patch to ele build
--- /project/elephant/cvsroot/elephant/doc/scenarios.texinfo 2007/04/12 02:47:23 1.4 +++ /project/elephant/cvsroot/elephant/doc/scenarios.texinfo 2007/04/19 05:24:37 1.5 @@ -1,55 +1,292 @@ @c -*-texinfo-*-
-@node Usage Scenarios +@node Design Patterns @comment node-name, next, previous, up -@chapter Usage Scenarios -@cindex Usage Scenarios +@chapter Design Patterns +@cindex Design Patterns
@menu -* File Replacement:: Simple deployment of Elephant as file replacement +* File System Replacement:: Deployment of Elephant as file replacement +* Checkpointing Program State:: How to recover the application state as recorded in a set of interdependant standard classes for purposes of undo, crash recovery and session persistence. * Persistent System Objects:: Making persistent objects a natural part of your system -* Crash Recovery:: How to recover application state from application or system crashes * Elephant as Database:: Using Elephant as a database for records and user data instead of using a SQL relational Database * Multithreaded Web Applications:: Elephant is a natural match for web applications * Graph-oriented Applications:: Elephant is good, but not optimized, for graph-oriented applications. * Real-World Application Examples:: See some real-world applications Elephant has been used for and a brief discussion of how it was used and any novel uses of Elephant. @end menu
-@node File Replacement +This chapter explores different ways that Elephant can be used to +solve common problems in user programs. The term ``Design Pattern'' +may be overkill as there is no formal specification of patterns. +However the goals is similar to classical design patterns: provide a +coherent description of how to approach ceratain common problems using +Elephant as an enabling tool. + +Most of this chapter falls short of a tutorial in the application of a +pattern. Instead it provides a conceptual guide to implementing the +pattern along with some code examples to show how Elephant features +are invoked to support the pattern. + +The authors hope that users of Elephant will find this a good source +of inspiration for how to apply Elephant to their own programs and +that they will be motivated to contribute design patterns of their own. + + +@node File System Replacement @comment node-name, next, previous, up -@section File Replacement +@section File System Replacement
-One of the annoying overheads in writing many programs is persisting -data between lisp sessions or invocations of that program. Elements -such as configuration files, raw data such as graphics and other -formats take time, attention and are a potential source of bugs. -Elephant can ease these concerns and allow you to work directly with -your natural in-memory representations with no work to encode/decode -formats or manage files in the file system. +One of the more annoying time-wasting activities in programming is +saving and restoring data from disk. Data in configuration files, +static data such as graphics and other formats take time and attention +away from solving the main problem and are additional sources of bugs. +Because Elephant's serializer supports most lisp types, Elephant can +greatly simplify ease these concerns and allow you to work directly +with your natural in-memory representations with almost no work to +encode/decode formats or manage files in the file system.
The simplest way to accomplish this is to simply open a store controller and use the root btree as a key-value store instead of a -file system directory. You might hide some of the complexity sort of -like this: +file system directory. You might hide some of the details like this:
@lisp -(defmacro def-resource (name initializer) - (assert (symbolp name)) - `(defparameter name (list nil nil ,initializer))) - -(defun call-initializer (init) - (case init-stmt - (symbol (funcall (symbol-function init-stmt))) - (list (apply (first init) (rest init))))) +(defvar *resources* (make-hash-table))
(defun get-resource (name) - (if (and (symbol-value name) - (symbol-value-name) - (let ((newval (get-from-root name))) - (if newval - (setq name (add-to-root name newval)) - (setq name (add-to-root name (call-initializer -@end lisp + (multiple-value-bind (value foundp) (gethash name *resources*) + (if foundp + value + (multiple-value-bind (value foundp) (get-from-root name) + (if foundp + value + (error "Resource named ~A was not initialized" name)))))) + +(defun set-resource (value name) + (add-to-root name value) + (setf (gethash name *resources*) value)) + +(defsetf get-resource set-resource) +@end lisp + +Another simple metaphor is to use Elephant btrees as persistent hash +tables that persist key-value pairs for you. We'll wrap the Elephant +btree in a simple class to provide a little conceptual isolation. + +@lisp +(defclass phash () + ((btree :accessor phash-btree :initarg :btree :initform (make-btree)))) + +(defun make-persistent-hash (name) + (let ((btree (get-from-root name))) + (if btree + (make-instance 'phash :btree btree) + (let ((phash (make-instance 'phash))) + (add-to-root name (phash-btree phash)) + phash)))) + +(defun getphash (key phash) + (get-value key (phash-btree phash))) + +(defun setphash (value key phash) + (setf (get-value key (phash-btree phash)) value)) + +(defsetf getphash setphash) +@end lisp + +Of course to make a proper abstraction we'd want to provide some +conditions that allowed restarts that initialized values or allowed +users to update the hash in the background and continue computation. + +@footnote{Example provided by Ian Eslick, April 2007} + +@node Checkpointing Program State +@comment node-name, next, previous, up +@section Checkpointing Program State + +Another challenge for many programs is saving some subset of program +state. This could involve checkpointing an evolving computation, +keeping track of state for the purposes of 'undo' or enabling crash +recovery at key points in the program's execution. + +One approach is to transform all our program state into persistent +objects. However if the use of program state is slot-access +intensive, this can have a significant performance impact. To improve +the performance of the application, careful use of transactions is +needed which further complicates program design and operation. + +Can Elephant be used to provide a simple solution that retains the +in-memory performance that we want? Can we do all this without having +to put a ton of persistence assumptions into our main program code? +The answer is yes, assuming you are willing to explicitly checkpoint +your code and adhere to some simple constraints in accessing your +program objects. + +@subsection Assumptions + +To get speed, we want all our objects to be standard lisp objects that +are in memory and have no special harnesses that would interfere with +using the full power of lisp. At some point in execution, we want to +store the current state of a bunch of objects to disk, but make it +easy to reproduce the exact state at a later point in time. For +simplicity, we'll assume that we are talking about collections of CLOS +objects. + +An additional complication is that many programs have sets of +interdependant objects. These could be complex program graphs, the +state of an active search process or a standard OO system that uses a +bunch of program objects to function. This means that we need to +persist not just object state, but also references and any object that +is referred to. + +Using CLOS reflection we can provide a general solution to capturing +objects, slot values and references. However to reproduce references, +we'll need to be able to find the object referenced and the only +general way to do that is to store it as well. Thus a snapshot is a +closed set of self-referential objects. + +The assumptions required to implement the simple checkpointing +implemented here is: + +@itemize +@item @strong{Use standard CLOS objects and references to other CLOS objects.} +We need reflection to +@item @strong{Use standard hash tables to keep track of sets of objects.} +Your program should use the hash table as an entry point to find +objects. When objects are restored, just replace an existing hash +table with the new one and access your objects that way. Any parts of +your program that have pointers into your objects but are not +themselves snapshotted, will need to be able to refresh their pointers +in some way. +@item @strong{Find your root object (s) and know what is ``reachable'' from them.} +Ensure that you aren't referring to standard objects outside those you +want to store as they will be stored too (persistent object references +are fine though). Make sure your root refers to objects that refers +to other objects and so on such that all objects you want to store can +be reached by some set of pointer traversals. Looping references are +fine. +@end itemize + +@subsection Implementation: The Snapshot Set + +To generalize all this behavior, we will define a new class called a +snapshot set. The set itself is a persistent object that wraps a +btree, but provides all the automation to store and recover sets of +objects. + +@subsection Isolating snapshot sets + +A brief note on how to separate out the objects you want to store from +those you don't may be useful. We want to snapshot groups of +inter-referential objects without sucking in the whole system in one +snapshot. These object sets must be closed and fully connected. If +the program consists of a set of subgraphs, a root element of each +graph should be stored in a hash table that is then treated as the +snapshot root. + +@itemize +@item @strong{Manual registration:} +Objects without external references are easy, just @code{register} or +@code{unregister} them from the @code{snapshot-set} as needed and then +map over them to get them back. +@item @strong{Implicit registration:} +Just store objects in a hash that is the root of a @code{snapshot-set} +and you are good to go. +@item @strong{Graphs:} +Graphs are easy to store as they naturally consist of a closed set of +objects. If the graph nodes reference other system objects that you +don't want to store, you'll need to implement something akin to the +indirection provided here. Just store the root of the graph in the +snapshot set root and go from there. +@item @strong{All instances of a type:} +Another easy way to create sets is to overload @code{make-instance} to +store all new objects in a weak hash table that is treated as the root +of a @code{snapshot-set} (NOTE: I have not verified that weak hashes +are properly serialized and reproduced - I suspect they are not so you +might have to copy after a @code{restore}). +@end itemize + +For more complex applications, you can isolate these closed sets of +objects by using @code{snapshot-set} root hash tables as an +indirection mechanism. Instead of storing direct references in an +object slot or hash value, isolation is ensured by storing keys and +indirecting through a hash table to get the target object. This can +be hidden from the programmer in multiple ways. The easiest way is +just to make sure that when you store references you store a key and +overload the slot accessor. A sketch of this follows: + +@lisp +(defparameter *island1-hash* (make-hash-table)) +(defparameter *island2-hash* (make-hash-table)) +(defvar *unique-id* 0) + +(defclass island1-object () + ((pointer-to-island1 :accessor child :initform nil) + (pointer-to-island2 :accessor neighbor :initform nil))) + +(defmethod neighbor :around ((obj island1-object)) + (let ((key (call-next-method))) + (when key (gethash key *island2-hash*)))) + +(defmethod (setf neighbor) :around (ref (obj island1-object)) + (cond ((subtypep (type-of ref) 'island2-object) + (let ((key (find-object ref *island2-hash*))) + (if key + (progn + (call-next-method key obj) + obj) + (progn + (setf (gethash (incf *unique-id*) *island2-hash*) ref) + (call-next-method *unique-id* obj) + obj)))) + (t (call-next-method)))) + +(defun find-object (obj hash) + (map-hash (lambda (k v) + (declare (ignore k)) + (if (eq obj v) + (return-from find-object obj))) + hash)) +@end lisp + +The same template would apply to @code{island2} references to +@code{island1} objects. You could further simplify creating these +hash table indirections with a little macro: + +@lisp +(defmacro def-snapshot-reference-wrapper (accessor-name (source-classname target-classname hashname uid)) + (with-gensysms (obj key ref) + `(progn + (defmethod ,accessorname :around ((,obj ,source-classname)) + (let ((,key (call-next-method))) + (when ,key (gethash ,key ,hashname)))) + (defmethod (setf ,accessorname) :around (,ref (,obj ,source-classname)) + (cond ((subtypep (type-of ,ref) ,target-classname) + (let ((,key (find-object ,ref ,hashname))) + (if ,key + (progn + (call-next-method ,key ,obj) + ,obj) + (progn + (setf (gethash (incf ,uid) ,hashname) ,ref) + (call-next-method ,uid ,obj) + ,obj)))) + (t (call-next-method))))))) + +(defclass island2-object () + ((pointer-to-island2 :accessor child :initform nil) + (pointer-to-island1 :accessor neighbor :initform nil))) + +(def-snapshot-reference-wrapper neighbor (island2 island1 *island1-hash* *unique-id*)) +@end lisp + +Of course this doesn't work for multi-threaded environments, or for +separating more complex collections of types. I am also sure that +more elegant solutions could be generated. + +In most cases, we assume the user will have a natural collection of objects +that is closed + +@footnote{Example provided by Ian Eslick, April 2007}
@node Persistent System Objects @comment node-name, next, previous, up @@ -62,10 +299,6 @@ @item Look up objects using class indices @end itemize
-@node Crash Recovery -@comment node-name, next, previous, up -@section Crash Recovery - @node Elephant as Database @comment node-name, next, previous, up @section Elephant as Database @@ -135,7 +368,7 @@ @item Process Components @item Bulk storage of post-processed web data @item Class indexes on strings -@item Cheap associations +@item User associations @item Inverted document index @end itemize