I think it is worth here to distinguish between two separate domains where the terms CLASS, INSTANCE, RELATION, DEPENDENCY each have different meanings and side effects.
In the Application domain (my app being based on Cells) has Classes that get defined while adapting the app to a user's needs. A typical class would be "BOM" (said bill of material). The other one to be use in the use case outlined below is class "PART".
So we have:
- Application Class BOM
- Application Class PART
- Class Relation IS-USED-IN: PART -> BOM (read: a PART can be related
to a BOM with the relation IS-USED-IN). This is a relation defined on CLASS (!) level.
- Instance "As Built BOM, Baseline 001" of application class BOM
- Instance "High Pressure Turbine, Serial Nr 0001" of class PART
In order to make the use case more verbose we define some attributes on the classes:
Attribute PRICE on class BOM
Attribute NR-PARTS on class BOM
Attribute PRICE on class PART
The relation is a "general" one: if any attribute of an instance of class PART changes then recalculate all attributes of the corresponding instance of class BOM.
Several things have to be noted here:
Adding a part to a Bom is done by the following steps (not using Cells but the - say - traditional approach):
- BOM instance is already there
- Create an instance of class PART
- Create an instance of relation IS-USED-IN
- Set the LHS (left hand side) of the relation to the object id of
the new part 4. Set the RHS of the relation to the object id of the BOM instance
Now, the update mechanism in this traditional approach always crawls all relation instances using the id of the changed object to find all other objects being related to that instance id and sends an update event to all these instances.
This should be avoided by limiting the nr of events being fired. Not every related object has to be updated when a part is added. The same is true when the Price attribute of a part is changed: the BOM need not recalculate the nr of parts in it.
The intended use case is:
Define a relation class that says: update attribute PRICE on instances of class BOM if the attribute PRICE of instances of class PART changes - respecting the fact that a particular instance of class PART is always related to particular instances of class BOM (so update only those affected).
A part can be related to more than one BOM. A BOM can have more than one part (well, a simple n:m relation).
So, I want to define a class-level dependency between classes BOM and PART by defining the slots PRICE in BOM and NR-PARTS, PRICE in PART as cells.
As I said, just use an :initform or :default-initarg:
(defclass BOM (priceable) () (:default-initargs :price (c? (apply '+ (mapcar 'price (^parts))))))
BTW, if this gets into a long-enough list (a) at 64 you will hit a silly Cells limitation which i will look at eliminating shortly but (b) i think we are into new territory which I first noticed on the Dow-Jones Index use case, viz, one slot quite reasonably depending on a kazillion other slots. DJI got solved by ducking the problem -- it turned out each ticker was being depended on twice (for two different values) and a carefully placed without-c-dependency got the dependency count under 64 without sacrificing true dependency (the others were semantically redundant) -- but obviously this bad boy of an issue has to be dealt with.
How long are the parts lists? Even after I relax the 64-link limitation I observed a performance hit from cells internals searching down that long list (gets hit a lot, relatively speaking). I am thinking a clever divide-conquer scheme will be necessary, in which cells internals generate trees of synapses with a maximum of 16 or whatever dependencies at each node. We will see.
Using whatever mechanism out of the Cells package I defined the relation between these Cell slots. As there will be millions of instances of PART and (a few less) instances of BOMs I want to have to define the dependency just once while still being able to overwrite the firing rules and the dependency rules between any two instances of classes BOM and PART.
A Cell data structure has, inter alia, slots for used cells and cells which use it. These must be instance specific, since in the end we really do have a normal case of instances depending on other instances. The slots for the rule will all contain the same rule instance (unless it closes over a lexical variable). So you could save one slot in each Cell -- if defstruct supported class-allocated slots, but I do not think they do. (You would think I would know for sure, but Lisp still surprises me from time to time.)
Are all these parts loaded into RAM at once? With that kind of demand on memory i do not think Cells will stand out as a burden, though i could be wrong. Have you tried the normal approach and found a problem? I am not completely against early optimization and do it often, but this would be a case where I would let the normal approach fail before working on optimization.
Conceivably we can carve out a lighter-weight Cell, but I am not sure we can save more than a few slots. I think structures are implemented as arrays, so we are just making smaller arrays. I am concerned that that will not help much if a bottleneck is discovered. I would then lean towards your idea of a single Cell capable of managing the dependencies of all class instances. That would save almost all the slots at the cost of adding a hashtable lookup on an instance before getting to its list of used or user cells, and of course save all those make-cell calls themselves.
Hey, if you can set up some dummy classes and bog it down, I will see what I can do about creating class-allocated cells which can still be overridden if necessary.
You lost me. Do you mean it gets changed in the sense that some other part is now being used, or in the sense that some attribute of the part (say, the price of the turbine) changes?
Both:
If a new part is added or a part is deleted from the BOM or the price attribute is changed.
You just need slots Part-BOMs and BOM-Parts. Any rule such as:
(defclass BOM () () (:default-initargs :price (c? (apply '+ (mapcar 'price (^parts))))))
Will establish dependencies on (a) the parts slot and (b) the price of each part. Note by the way that this means you cannot use destructive operations to change a parts list, unless you get sneaky and create a destructive function which guarantees the first cons cell is different, perhaps by re-consing the first element back on (assuming it is not being deleted).
the synapse between the two classes triggers the /right/ instance of class BOM, here the instance "As Built BOM , Baseline 001".
"Triggers"? Do you mean, the total on the BOM would have to be changed to reflect the changed turbine price?
Yes.
Anyway, it does not sound as if the two classes require dataflow, it sounds as if each instance of each class will require that dataflow with some other instance.
Agreed.
As much as I talk about instance-oriented programming, hey, if you author a cell in a rule supplied in an initform or default-initarg, guess what? You get class-oriented behavior. :) Just do not override those rules at make-instance time. :)
Aha!
Well, let's see if you hit a problem with that approach and then look at optimizations. Might be a fun task. In a sense, the class-allocated Cell is precisely analogous (he said after several seconds of careful analysis) to the RDBMS scheme of setting up an intermediate many-many relation, with the extra feature of automating dataflow from parts to BOM.
kt