On 9/6/06, Andrew Dalke dalke@dalkescientific.com wrote:
Me [Andrew Dalke]:
I developed a simple dependency system which I've used for several clients. The 2nd one was simpler because no modifications occurred after a property was set or computed. The heart was something like this
One thing it couldn't handle is query optimization.
In two cases I computed properties using an external program. If I wanted property A it took 1 minute, if I wanted property B it took 1 minute, if I wanted properties A and B computed at the same time then it took 1 minute 30 seconds.
My system was all lazy. Set it up with the rules and ask for the desired properties. But that meant asking for them one-at-a-time. There was no way to figure out that the user wanted A and B so go ahead and do the combined request instead of 2 separate requests.
I don't think PyCells can do this either. It's a hard problem without explicit knowledge of the dependencies.
The way I would phrase it is not that PyCells can or cannot do something. Rather: right, PyCells cannot help with any dependency it does not "see". All that means is that Cells/PyCells is not a free lunch after all; we must work around the things it can do to get what we want. I must say I spent quite a while learning to think fluently in the dataflow paradigm, and even now, for hard problems (the ones involving events), I have to think pretty hard to come up with declarative solutions. The only saving grace is that it is the fun kind of hard thinking.
My solution is to provide a hint - when I know I'll do something which needs A+B I'll look up a special property which computes A+B and then sets A and B.
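A minimal sketch of that hint idea, for illustration only. It is not Andrew's actual code (which is not shown in this message), and every name in it (LazyProps, add_rule, the "A+B" property, the compute_* functions) is hypothetical. The external program is replaced by stand-in functions that return fixed strings.

```python
class LazyProps:
    """Lazy property store: a value is computed on first request, then cached."""

    def __init__(self):
        self.rules = {}   # property name -> function(store) returning the value
        self.cache = {}   # property name -> value already set or computed
        self.calls = []   # names of the rules that actually ran, in order

    def add_rule(self, name, func):
        self.rules[name] = func

    def get(self, name):
        if name not in self.cache:
            self.calls.append(name)
            self.cache[name] = self.rules[name](self)
        return self.cache[name]


def compute_a(store):
    return "a-value"      # stand-in for the 1 minute external run


def compute_b(store):
    return "b-value"      # stand-in for the other 1 minute external run


def compute_a_and_b(store):
    # Stand-in for the combined 1 minute 30 second run. As a side effect
    # it sets A and B directly, so later lookups never fire their own rules.
    store.cache["A"] = "a-value"
    store.cache["B"] = "b-value"
    return True


store = LazyProps()
store.add_rule("A", compute_a)
store.add_rule("B", compute_b)
store.add_rule("A+B", compute_a_and_b)

store.get("A+B")          # the hint: "I am going to need both"
a = store.get("A")        # served from the cache
b = store.get("B")        # served from the cache
print(a, b, store.calls)
```

Only the combined rule runs here; without the hint, the two get calls would have fired compute_a and compute_b separately.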
That would work. Or buy into the apparent potential for composite queries lock, stock, and barrel, by creating an "and" operator in your burgeoning language and encouraging users to divide-and-conquer queries as much as possible to get re-use. This might mean moving the query-result cache out into a separate supervisory class. Then you have a query class that gets a pointer to the supervisor to see the search space and cache. Queries can be atomic (A, B) or "operation queries", which have an opcode such as "+" and some number of operand ("opnd") queries as an attribute. Hopefully you can figure out how to cache composites as well as atomics if the operations themselves are expensive enough (and reuse of those is to be expected). Then you are not doing anything special to cache subqueries.
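A rough sketch of that arrangement, under several assumptions: the names Supervisor, Atom and Op are made up for illustration, results are modelled as frozensets drawn from a small search space, and "and" is implemented as set intersection. None of this is PyCells API.

```python
class Supervisor:
    """Owns the search space and one cache shared by all queries."""

    def __init__(self, search_space):
        self.search_space = list(search_space)
        self.cache = {}        # query key -> frozenset of matching items
        self.evaluations = 0   # how many queries were actually evaluated

    def result(self, query):
        key = query.key()
        if key not in self.cache:
            self.evaluations += 1
            self.cache[key] = query.evaluate(self)
        return self.cache[key]


class Atom:
    """Atomic query: a named predicate over the search space."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate

    def key(self):
        return ("atom", self.name)

    def evaluate(self, supervisor):
        return frozenset(x for x in supervisor.search_space if self.predicate(x))


class Op:
    """Operation query: an opcode plus one or more operand queries."""

    def __init__(self, opcode, *opnds):
        if not opnds:
            raise ValueError("an operation query needs at least one operand")
        self.opcode = opcode
        self.opnds = opnds

    def key(self):
        return (self.opcode,) + tuple(q.key() for q in self.opnds)

    def evaluate(self, supervisor):
        # Operands go through the supervisor, so subqueries are cached
        # by the same mechanism as everything else.
        results = [supervisor.result(q) for q in self.opnds]
        if self.opcode == "and":
            return frozenset.intersection(*results)
        if self.opcode == "or":
            return frozenset.union(*results)
        raise ValueError("unknown opcode: %r" % (self.opcode,))


sup = Supervisor(range(1, 31))
even = Atom("even", lambda n: n % 2 == 0)
div3 = Atom("div3", lambda n: n % 3 == 0)

first = sup.result(Op("and", even, div3))   # evaluates the composite and both atoms
again = sup.result(Op("and", even, div3))   # composite served from the cache
evens = sup.result(even)                    # atom served from the cache
print(sorted(first), sup.evaluations)
```

Because the cache key is built from the opcode and the operand keys, an equal composite built later hits the cache without any special casing, and so do its atoms.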
hth, kt