-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
To all:
This is the beginning of a discussion about what I called meta-level usage of the Cells approach and mechanism, actually using Kenny's Cells package. You had better start at the bottom to get this straight, then go back up and read downwards <g>
Here we go:
On 24.05.2005 at 00:30, Kenny Tilton wrote:
Frank Goenninger - PRION Consulting wrote:
Kenny,
while designing my app here I am stuck with the following problem:
Using Cells I want to build an object management kernel (to be used within my Product Information Management Application). This kernel has to implement a basic set of operations to manage objects (instances of any kind of class) such as Create, Read, Update (using Check-in / Check-Out into / from Vaults) and such.
While Cells is perfect for modeling dependencies between instances of objects (using synapses) I am now about to implement relations and dependencies between classes.
uh-oh, maybe Thomas is right and "synapses" is as confusing as it is clever. And synapses are now just Cells, to make things even worse. :) But I still talk about synapses as a class. Cells model dataflow from one slot of an instance to another slot of the same or different instance. That is, a c-ruled (or is it c-dependent?) can access any combination of other c-ruled or c-input cells. A synapse arises where one wants either to (a) mediate that dependency, say with an f-sensitivity synapse; (b) translate the dataflow, as with a delta synapse; or (c) do either with an arbitrary subform within a cell rule (something I might have pointed out with the new Synapse mechanism, and will shortly in a standalone post).
I think it is worth distinguishing here between two separate domains where the terms CLASS, INSTANCE, RELATION, DEPENDENCY each have different meanings and side effects.
In the Application domain (my app being based on Cells) there are Classes that get defined while adapting the app to a user's needs. A typical class would be "BOM" (short for bill of material). The other one to be used in the use case outlined below is class "PART".
So we have:
+ Application Class BOM
+ Application Class PART
+ Class Relation IS-USED-IN: PART -> BOM (read: a PART can be related to a BOM with the relation IS-USED-IN). This is a relation defined on CLASS (!) level.
+ Instance "As Built BOM, Baseline 001" of application class BOM
+ Instance "High Pressure Turbine, Serial Nr 0001" of class PART
In order to make the use case more verbose we define some attributes on the classes:
+ Attribute PRICE on class BOM
+ Attribute NR-PARTS on class BOM
+ Attribute PRICE on class PART
The relation is a "general" one: if any attribute of an instance of class PART changes then recalculate all attributes of the corresponding instance of class BOM.
Several things have to be noted here:
Adding a part to a BOM is done by the following steps (not using Cells but the, say, traditional approach):
0. BOM instance is already there
1. Create an instance of class PART
2. Create an instance of relation IS-USED-IN
3. Set the LHS (left hand side) of the relation to the object id of the new part
4. Set the RHS of the relation to the object id of the BOM instance
Now, the update mechanism in this traditional approach always crawls all relation instances using the id of the changed object to find all other objects being related to that instance id and sends an update event to all these instances.
This should be avoided by limiting the nr of events being fired. Not every related object has to be updated when a part is added. The same is true when the Price attribute of a part is changed: the BOM need not recalculate the nr of parts in it.
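To make the traditional approach above concrete, here is a minimal sketch in plain Common Lisp (no Cells involved). All names (RELATION, *RELATIONS*, RELATE, RELATED-IDS, EVENTS-FOR-CHANGE) are made up for illustration, and object ids are simply keywords. It shows why every change costs a crawl over all relation instances and fires one event per related object, whatever attribute changed.

```lisp
;; Hypothetical sketch of the traditional approach; not Cells code.
(defstruct relation
  name   ; e.g. IS-USED-IN
  lhs    ; object id of the part
  rhs)   ; object id of the BOM

(defvar *relations* '()
  "Every relation instance known to the kernel.")

(defun relate (name lhs-id rhs-id)
  "Steps 2-4: create a relation instance and set its LHS and RHS."
  (push (make-relation :name name :lhs lhs-id :rhs rhs-id) *relations*))

(defun related-ids (changed-id)
  "Crawl all relation instances and collect the ids related to CHANGED-ID."
  (loop for r in *relations*
        when (eql (relation-lhs r) changed-id) collect (relation-rhs r)
        when (eql (relation-rhs r) changed-id) collect (relation-lhs r)))

(defun events-for-change (changed-id)
  "One update event per related object, no matter which attribute changed."
  (mapcar (lambda (id) (list :update id)) (related-ids changed-id)))
```

For example, after (relate 'is-used-in :part-1 :bom-1), a call to (events-for-change :part-1) yields an update event for :bom-1 even if only an attribute changed that the BOM does not care about.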
The intended use case is:
Define a relation class that says: update attribute PRICE on instances of class BOM if the attribute PRICE of instances of class PART changes - respecting the fact that a particular instance of class PART is always related to particular instances of class BOM (so update only those affected).
A part can be related to more than one BOM. A BOM can have more than one part (well, a simple n:m relation).
So, I want to define a class-level dependency between classes BOM and PART by defining the slots PRICE and NR-PARTS in BOM and PRICE in PART as cells.
Using whatever mechanism out of the Cells package, I define the relation between these Cell slots. As there will be millions of instances of PART and (somewhat fewer) instances of BOM, I want to have to define the dependency just once while still being able to override the firing rules and the dependency rules between any two instances of classes BOM and PART.
The remainder of the email exchange now discusses the various aspects of this and also what Cells really are ;-)
The use case for this will be: I create a dependency between class BOM (bill of material) and Part (individual components). Once that class synapse-connection has been created, every instance created of the classes BOM and Part "knows" of the dependency ...
A couple of things here.
(1) I myself have lately stepped up my understanding of Cells, viz, that a lot of the power comes from being instance-oriented, as in different instances of the same class can have different rules for the same slot. Note that def-c-output compromises this by being class-oriented, so now I have a solid reason for my long-held vague discomfort with def-c-outputs which are not really outputs but instead feed back into the model by setf-ing c-input cells. So this talk of relations between classes has me thinking "you better have a good reason". <g>
(2) I forget the second thing.
and after being related to a specific instance (say, a Part "High Pressure Turbine, Serial Nr 0001" is related to an instance "As Built BOM, Baseline 001" of class BOM) then, when the Part "High Pressure Turbine, Serial Nr 0001" gets changed..
You lost me. Do you mean it gets changed in the sense that some other part is now being used, or in the sense that some attribute of the part (say, the price of the turbine) changes?
Both:
If a new part is added or a part is deleted from the BOM or the price attribute is changed.
the synapse between the two classes triggers the /right/ instance of class BOM, here the instance "As Built BOM , Baseline 001".
"Triggers"? Do you mean, the total on the BOM would have to be changed to reflect the changed turbine price?
Yes.
Anyway, it does not sound as if the two classes require dataflow, it sounds as if each instance of each class will require that dataflow with some other instance.
Agreed.
As much as I talk about instance-oriented programming, hey, if you author a cell in a rule supplied in an initform or default-initarg, guess what? You get class-oriented behavior. :) Just do not override those rules at make-instance time. :)
Aha!
####
It all started with my email:
Kenny,
while designing my app here I am stuck with the following problem:
Using Cells I want to build an object management kernel (to be used within my Product Information Management Application). This kernel has to implement a basic set of operations to manage objects (instances of any kind of class) such as Create, Read, Update (using Check-in / Check-Out into / from Vaults) and such.
While Cells is perfect for modeling dependencies between instances of objects (using synapses) I am now about to implement relations and dependencies between classes.
The use case for this will be: I create a dependency between class BOM (bill of material) and Part (individual components). Once that class synapse-connection has been created, every instance created of the classes BOM and Part "knows" of the dependency, and after being related to a specific instance (say, a Part "High Pressure Turbine, Serial Nr 0001" is related to an instance "As Built BOM, Baseline 001" of class BOM) then, when the Part "High Pressure Turbine, Serial Nr 0001" gets changed, the synapse between the two classes triggers the /right/ instance of class BOM, here the instance "As Built BOM, Baseline 001".
Hmmm - you, as the expert here, would you:
1) Create a new class of synapses
2) Create a new class of cells
3) Implement a new dependency mechanism
4) Do something else
(and for every case I dare to ask: Why ?) ;-)
--- EOM (End Of Mail) ---
I think it is worth distinguishing here between two separate domains where the terms CLASS, INSTANCE, RELATION, DEPENDENCY each have different meanings and side effects.
In the Application domain (my app being based on Cells) there are Classes that get defined while adapting the app to a user's needs. A typical class would be "BOM" (short for bill of material). The other one to be used in the use case outlined below is class "PART".
So we have:
- Application Class BOM
- Application Class PART
- Class Relation IS-USED-IN: PART -> BOM (read: a PART can be related to a BOM with the relation IS-USED-IN). This is a relation defined on CLASS (!) level.
- Instance "As Built BOM, Baseline 001" of application class BOM
- Instance "High Pressure Turbine, Serial Nr 0001" of class PART
In order to make the use case more verbose we define some attributes on the classes:
- Attribute PRICE on class BOM
- Attribute NR-PARTS on class BOM
- Attribute PRICE on class PART
The relation is a "general" one: if any attribute of an instance of class PART changes then recalculate all attributes of the corresponding instance of class BOM.
Several things have to be noted here:
Adding a part to a BOM is done by the following steps (not using Cells but the, say, traditional approach):
0. BOM instance is already there
1. Create an instance of class PART
2. Create an instance of relation IS-USED-IN
3. Set the LHS (left hand side) of the relation to the object id of the new part
4. Set the RHS of the relation to the object id of the BOM instance
Now, the update mechanism in this traditional approach always crawls all relation instances using the id of the changed object to find all other objects being related to that instance id and sends an update event to all these instances.
This should be avoided by limiting the nr of events being fired. Not every related object has to be updated when a part is added. The same is true when the Price attribute of a part is changed: the BOM need not recalculate the nr of parts in it.
The intended use case is:
Define a relation class that says: update attribute PRICE on instances of class BOM if the attribute PRICE of instances of class PART changes - respecting the fact that a particular instance of class PART is always related to particular instances of class BOM (so update only those affected).
A part can be related to more than one BOM. A BOM can have more than one part (well, a simple n:m relation).
So, I want to define a class-level dependency between classes BOM and PART by defining the slots PRICE and NR-PARTS in BOM and PRICE in PART as cells.
As I said, just use an :initform or :default-initarg:
(defclass BOM (priceable)
  ()
  (:default-initargs
   :price (c? (apply '+ (mapcar 'price (^parts))))))
BTW, if this gets into a long-enough list (a) at 64 you will hit a silly Cells limitation which i will look at eliminating shortly but (b) i think we are into new territory which I first noticed on the Dow-Jones Index use case, viz, one slot quite reasonably depending on a kazillion other slots. DJI got solved by ducking the problem -- it turned out each ticker was being depended on twice (for two different values) and a carefully placed without-c-dependency got the dependency count under 64 without sacrificing true dependency (the others were semantically redundant) -- but obviously this bad boy of an issue has to be dealt with.
How long are the parts lists? Even after I relax the 64-link limitation I observed a performance hit from cells internals searching down that long list (gets hit a lot, relatively speaking). I am thinking a clever divide-conquer scheme will be necessary, in which cells internals generate trees of synapses with a maximum of 16 or whatever dependencies at each node. We will see.
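As a rough picture of the divide-and-conquer idea, here is a sketch in plain Common Lisp of summing a long list through a tree in which no node ever adds more than a fixed number of values. It is not Cells code and does no dataflow at all; it only shows the shape of the tree. The names *MAX-FAN-IN*, CHUNK and SUM-TREE are made up, and 16 is just the "16 or whatever" figure from above. A side benefit is that REDUCE avoids the implementation-dependent CALL-ARGUMENTS-LIMIT that (apply '+ ...) can run into on very long lists.

```lisp
;; Hypothetical sketch: bounded fan-in summation, no Cells involved.
(defparameter *max-fan-in* 16
  "Assumed maximum number of values any single node may depend on.")

(defun chunk (items n)
  "Split ITEMS into consecutive sublists of at most N elements."
  (loop while items
        collect (loop repeat n
                      while items
                      collect (pop items))))

(defun sum-tree (numbers &optional (fan-in *max-fan-in*))
  "Total NUMBERS so that no single addition step sees more than FAN-IN values."
  (check-type fan-in (integer 2))       ; a fan-in of 1 would never shrink the list
  (cond ((null numbers) 0)
        ((<= (length numbers) fan-in) (reduce #'+ numbers))
        (t
         ;; Each group becomes one intermediate node; recurse on the node totals.
         (sum-tree (mapcar (lambda (group) (reduce #'+ group))
                           (chunk numbers fan-in))
                   fan-in))))
```

In a real Cells version each intermediate total would presumably be a cell of its own, so a price change would only recompute the nodes on the path from that part up to the BOM.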
Using whatever mechanism out of the Cells package, I define the relation between these Cell slots. As there will be millions of instances of PART and (somewhat fewer) instances of BOM, I want to have to define the dependency just once while still being able to override the firing rules and the dependency rules between any two instances of classes BOM and PART.
A Cell data structure has, inter alia, slots for used cells and cells which use it. These must be instance specific, since in the end we really do have a normal case of instances depending on other instances. The slots for the rule will all contain the same rule instance (unless it closes over a lexical variable). So you could save one slot in each Cell -- if defstruct supported class-allocated slots, but I do not think it does. (You would think I would know for sure, but Lisp still surprises me from time to time.)
Are all these parts loaded into RAM at once? With that kind of demand on memory i do not think Cells will stand out as a burden, though i could be wrong. Have you tried the normal approach and found a problem? I am not completely against early optimization and do it often, but this would be a case where I would let the normal approach fail before working on optimization.
Conceivably we can carve out a lighter-weight Cell, but I am not sure we can save more than a few slots. I think structures are implemented as arrays, so we are just making smaller arrays. I am concerned that that will not help much if a bottleneck is discovered. I would then lean towards your idea of a single Cell capable of managing the dependencies of all class instances. That would save almost all the slots at the cost of adding a hashtable lookup on an instance before getting to its list of used or user cells, and of course save all those make-cell calls themselves.
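A minimal sketch of what such a single class-level Cell might look like, in plain Common Lisp: the rule is stored once, and the used/user lists are found through a hashtable lookup on the instance. Everything here (CLASS-CELL, RECORD-DEPENDENCY, USERS-OF, USED-BY) is a hypothetical name; none of it exists in the Cells package.

```lisp
;; Hypothetical sketch of a class-allocated cell; not part of Cells.
(defstruct class-cell
  rule                                    ; stored once for all instances
  (users (make-hash-table :test #'eq))    ; instance -> instances that use it
  (used  (make-hash-table :test #'eq)))   ; instance -> instances it uses

(defun record-dependency (cell user used)
  "Note that USER depends on USED, in both directions."
  (pushnew used (gethash user (class-cell-used cell)))
  (pushnew user (gethash used (class-cell-users cell))))

(defun users-of (cell instance)
  "Instances to notify when INSTANCE changes (one hashtable lookup)."
  (values (gethash instance (class-cell-users cell))))

(defun used-by (cell instance)
  "Instances that INSTANCE currently depends on."
  (values (gethash instance (class-cell-used cell))))
```

The trade-off is the one described above: per-instance Cell structures disappear, and every access to a dependency list pays for a hashtable lookup instead.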
Hey, if you can set up some dummy classes and bog it down, I will see what I can do about creating class-allocated cells which can still be overridden if necessary.
You lost me. Do you mean it gets changed in the sense that some other part is now being used, or in the sense that some attribute of the part (say, the price of the turbine) changes?
Both:
If a new part is added or a part is deleted from the BOM or the price attribute is changed.
You just need slots Part-BOMs and BOM-Parts. Any rule such as:
(defclass BOM ()
  ()
  (:default-initargs
   :price (c? (apply '+ (mapcar 'price (^parts))))))
Will establish dependencies on (a) the parts slot and (b) the price of each part. Note by the way that this means you cannot use destructive operations to change a parts list, unless you get sneaky and create a destructive function which guarantees the first cons cell is different, perhaps by re-consing the first element back on (assuming it is not being deleted).
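The "sneaky" destructive function could look roughly like this in plain Common Lisp. DELETE-PART is a made-up name; the only point is that the returned list always starts with a freshly allocated cons, so it is never EQ to the old parts list even though the rest of the structure is reused.

```lisp
;; Hypothetical helper; not part of Cells.
(defun delete-part (part parts)
  "Destructively remove PART from PARTS; the result's first cons is always new."
  (let ((remaining (delete part parts)))
    (if remaining
        ;; Re-cons the first element so the list head differs from the old one.
        (cons (first remaining) (rest remaining))
        nil)))
```

Usage would be along the lines of (setf (parts bom) (delete-part p (parts bom))), assuming PARTS is the accessor of the slot in question.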
the synapse between the two classes triggers the /right/ instance of class BOM, here the instance "As Built BOM , Baseline 001".
"Triggers"? Do you mean, the total on the BOM would have to be changed to reflect the changed turbine price?
Yes.
Anyway, it does not sound as if the two classes require dataflow, it sounds as if each instance of each class will require that dataflow with some other instance.
Agreed.
As much as I talk about instance-oriented programming, hey, if you author a cell in a rule supplied in an initform or default-initarg, guess what? You get class-oriented behavior. :) Just do not override those rules at make-instance time. :)
Aha!
Well, let's see if you hit a problem with that approach and then look at optimizations. Might be a fun task. In a sense, the class-allocated Cell is precisely analogous (he said after several seconds of careful analysis) to the RDBMS scheme of setting up an intermediate many-many relation, with the extra feature of automating dataflow from parts to BOM.
kt
Kenny Tilton wrote:
If a new part is added or a part is deleted from the BOM or the price attribute is changed.
You just need slots Part-BOMs and BOM-Parts. Any rule such as:
(defclass BOM ()
  ()
  (:default-initargs
   :price (c? (apply '+ (mapcar 'price (^parts))))))
Will establish dependencies on (a) the parts slot and (b) the price of each part.
Nah, that will not work. We could have a parts slot on the BOM class mediated by c-input, but then how does each part get its BOMs slot updated? In the past I would kludge up an output method (via def-c-output) on the parts slot of BOM to maintain the BOMs slot of Part, but with Cells II we have a Prime Directive which says -- well, it gets complicated, but logically those two updates are one, and output methods do not run until propagation is complete, so the model is inconsistent during propagation of any change to the parts list of a BOM -- any rule that fires will see a BOMs value on any new part which does not show the BOM to which the part has been added.
We can go the RDBMS route and create or destroy instances of Relations, or we can do what AllegroStore does with its persistent CLOS database: define a so-called inverse function on a slot, via a new defmodel slot option. It would work like this:
(defclass BOM ()
  ((parts :cell t
          :inverse-cell part-BOMs
          :initform (c-in nil)
          :initarg :parts
          :accessor parts)))

(defclass part () ())
After which:
(let* ((p (make-be 'part))
       (BOM (make-be 'bom :parts (c-in (list p)))))
  (part-BOMs p))
=> A list containing the BOM instance
...and part-BOMs is a cell like any other cell, except that there is no BOMs slot on part. Now Cells II's new propagation scheme naturally takes care of consistency, since it arranges for just-in-time consistency during propagation.
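To illustrate what "no BOMs slot on part" can mean, here is a sketch in plain CLOS, without Cells and without the proposed :inverse-cell option. The inverse is derived on demand from the parts lists, so it cannot get out of step with them. The registry *ALL-BOMS* and the function PART-BOMS are assumptions made for this sketch only; a real implementation would want something cheaper than a scan over all BOMs.

```lisp
;; Hypothetical sketch of a derived inverse; neither Cells nor AllegroStore code.
(defclass bom ()
  ((parts :initarg :parts :initform '() :accessor parts)))

(defclass part () ())   ; note: no BOMs slot here

(defvar *all-boms* '()
  "Assumed registry of every BOM instance, for this sketch only.")

(defun part-boms (part)
  "Return the BOMs whose parts list currently contains PART."
  (remove-if-not (lambda (bom) (member part (parts bom)))
                 *all-boms*))
```

Because PART-BOMS is computed from PARTS each time, adding a part to a BOM is a single update and there is no window in which the two directions disagree.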
AllegroStore went one more step and supported a "unique" option for the case where a one-to-many relationship is to be modeled. We could then apply this to the Family class, where kids have only one fm-parent. What do we achieve by this? For one, the inverse Cell fm-parent will now return just one parent instance instead of a list of one. For another, an error can be generated if a kid gets pushed onto the kids slot of more than one instance at the same time.
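A sketch of the two effects of such a "unique" option, again in plain Common Lisp and with made-up names (UNIQUE-OWNER and its KEY argument are not AllegroStore or Cells API): the inverse returns a single owner instead of a list, and it signals an error if an item turns up under more than one owner.

```lisp
;; Hypothetical sketch of a "unique" inverse lookup.
(defun unique-owner (item owners &key (key #'identity))
  "Return the one element of OWNERS whose (KEY owner) list contains ITEM.
Returns NIL if there is none and signals an error if there are several."
  (let ((found (remove-if-not (lambda (owner)
                                (member item (funcall key owner)))
                              owners)))
    (when (rest found)
      (error "~S belongs to ~D owners, but the relation is unique."
             item (length found)))
    (first found)))
```

With family instances one would pass something like :key #'kids, assuming KIDS is the accessor of the kids slot; in the simplest case the owners can just be lists of kids.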
This is interesting. Although I have gotten by nicely with the one Family class for a long time, I have started to notice occasions where the special handling given the kids slot might be useful more generally. I have even considered a new cell slot option which would let any slot work like the kids slot. We are talking about a different issue now, but it is interesting that they point back to the same kind of relational slot.
Thoughts?
kt