Phillip,
Nice write-up. Random notes:
What I discovered is quite cool. The Cells system *automatically discovers* dynamic dependencies, without having to explicitly specify that X depends on Y, as long as X and Y are both implemented using cell objects.
<g> And that is part of why Cells is pretty much all-or-nothing for a developer: I have not tried to figure out the threshold, but above what I think is a very low one, all application semantics must be expressed declaratively as Cell rules. Otherwise imperative code gets left out of the action as the automatic dataflow engine does its thing. The corollary being your "as long as" qualifier: my declarative rules are crippled if some important datapoint is not a Cell.
For the first seven years of Cells development, when in doubt I started out a new slot/attribute as a non-Cell, bending over backwards, if you will, not to force the mechanism where it should not go. The default :cell meta-attribute was nil until just recently. But in each case, soon enough it turned out I would need them to be Cells. Just recently "true" became the default for :cell.
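The automatic discovery of dynamic dependencies quoted above can be sketched in a few lines of Python. This is a toy written only for illustration, not the Cells or PyCells API: the class `Cell`, its methods, and the `_caller_stack` global are all hypothetical names. It also makes no attempt at the orderly, deterministic propagation discussed further down.

```python
# Hypothetical toy, NOT the Cells or PyCells API.
_caller_stack = []  # rules currently being evaluated, innermost last


class Cell:
    def __init__(self, value=None, rule=None):
        self.rule = rule
        self.value = value
        self.dependents = set()  # cells whose rules read this cell
        self.sources = set()     # cells this cell's rule read last time
        if rule is not None:
            self._recompute()

    def get(self):
        # Reading a cell from inside a rule records the dependency.
        if _caller_stack:
            caller = _caller_stack[-1]
            self.dependents.add(caller)
            caller.sources.add(self)
        return self.value

    def set(self, value):
        if value != self.value:
            self.value = value
            for dep in list(self.dependents):
                dep._recompute()

    def _recompute(self):
        # Forget old links; the rule may take a different branch this time.
        for src in self.sources:
            src.dependents.discard(self)
        self.sources = set()
        _caller_stack.append(self)
        try:
            new = self.rule()
        finally:
            _caller_stack.pop()
        if new != self.value:
            self.value = new
            for dep in list(self.dependents):
                dep._recompute()


use_metric = Cell(True)
km = Cell(10.0)
miles = Cell(6.2)
# Nobody declares what `shown` depends on; reading the cells is enough.
shown = Cell(rule=lambda: km.get() if use_metric.get() else miles.get())

miles.set(7.0)         # not a dependency yet, so `shown` is untouched
use_metric.set(False)  # rule reruns and now depends on `miles`, not `km`
km.set(20.0)           # no longer a dependency, so `shown` is untouched
```

Note how the "as long as" qualifier shows up here: if `km` were a plain variable rather than a `Cell`, the rule would still compute a value but would never hear about changes to it.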
Specifically, the cells system understands how to make event-based updates orderly and deterministic, in a way that peak.events cannot.
It may be of interest that this orderliness is relatively new to Cells. For the longest time and in the most intense applications I got away with murder. Strangely, it was development of a RoboCup client that forced Cells to "grow up".
One especially interesting bit is that the Cells system can "optimize away" dependencies or subscriptions when their subjects are known to be constant values.
I was quite surprised at how much faster this made Cells run.
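One way to picture this optimization, under the assumption that a rule which reads only constant cells can itself be frozen into a constant after its first run. The names below (`Cell`, `constant`, `used`) are made up for the sketch and say nothing about how Cells actually implements it.

```python
# Hypothetical toy, NOT the Cells or PyCells API.
_stack = []  # rule cells currently being evaluated, innermost last


class Cell:
    def __init__(self, value=None, rule=None, constant=False):
        self.rule = rule
        self.value = value
        self.constant = constant
        self.dependents = set()  # rule cells subscribed to this cell
        self.used = set()        # non-constant cells read by this rule
        if rule is not None:
            self._run()

    def get(self):
        # Constants can never change, so no subscription is recorded.
        if _stack and not self.constant:
            _stack[-1].used.add(self)
            self.dependents.add(_stack[-1])
        return self.value

    def _run(self):
        self.used = set()
        _stack.append(self)
        try:
            self.value = self.rule()
        finally:
            _stack.pop()
        if not self.used:
            # The rule read nothing that can change: freeze the result
            # and drop the rule, so readers skip subscribing to us too.
            self.constant = True
            self.rule = None


base = Cell(100, constant=True)
tax_rate = Cell(0.25, constant=True)
price = Cell(200)  # an ordinary input cell that may change later

fixed_fee = Cell(rule=lambda: base.get() * tax_rate.get())
total = Cell(rule=lambda: price.get() + fixed_fee.get())
```

Here `fixed_fee` ends up as a plain constant, and `total` carries a single subscription (to `price`) instead of two.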
I'm also wondering if a Cells-like system couldn't also be used to implement STM (Software Transactional Memory) to allow for atomic operations even in the presence of threads. All reads and writes are controlled by the cells system, so it can in principle abort and retry a "transaction", by waiting until *something changes* that would affect the transaction's ability to succeed.
We have a Google SoC project over on the Lisp side to implement STM, and yes, I am excited about that making Cells viable in a multi-threaded situation. Mind you, I had never heard of STM before this proposal landed on our doorstep, nor do I even have much idea of what is available to applications when it comes to dealing with threads, but looking at how Cells manages data integrity I know it will need help to survive threads. STM looks like a great fix.
However, seeing how the Cells paradigm works, it seems to me that it should be pretty easy to establish the convention that side-effects should be confined to non-rule "observer" code.
Right, it is just a convention, but I think one that gets easier to follow because the engine provides a simple way to say "do this when the time is right".
experience w/e.g. peak.binding attributes shows that it's rare to want to put side-effects into pull-oriented rules.
"We could do it, but it would be wrong."
Really, the principal downside to Cells is wrapping your head around the idea that *everything* should be treated as pull-oriented rules.
Yes, it really is a paradigm shift, one that takes a long time to internalize. What I noticed was that, if I decided to add a significant new mechanism to the system, after about two hours of coding I would be having increasing difficulties and start to get a vague "bad feeling". Then I would realize that I had, from long habit, fallen back into an imperative style. Hence the "bad feeling". Because the code was all new, it did not grow naturally from the Cell-based model. If it had, it would of course have been done originally in the declarative style.
I have encouraged Ryan, the PyCells author, not to allow backdoors to the Cells engine, precisely because of this. The big win comes from the declarative paradigm, and developers will not climb that learning curve if they can avoid it. Simple human nature. Cells makes one think harder up front in return for all sorts of good things later, and that is a tradeoff I have always liked to make as a developer.
There are some operations (such as receiving a command and responding to it) that seem to be more naturally expressed as pushing operations, where you make some decisions and then directly update things or send other commands out.
Exactly! A spreadsheet is a steady-state thing (here are the values, here is the computed other state) and using Cells to express static reality is a snap. Otoh, imperative code is all about change, so it is great for handling events.
We use ephemeral Cells to model events (they take on a value, propagate, then revert to null silently, without propagating), but one still can end up thinking pretty hard when it comes to events. I think the most frightening "rule" I have written was for a Timer class implemented by the Tcl "after" command.
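The ephemeral behaviour described above (take on a value, propagate, then revert silently) can be sketched roughly like this. `EphemeralCell`, `fire`, and `observers` are names invented for the illustration, not anything from Cells or PyCells.

```python
# Hypothetical toy, NOT the Cells or PyCells API.
class EphemeralCell:
    def __init__(self):
        self.value = None
        self.observers = []  # callables invoked with each new value

    def fire(self, value):
        # The event is visible only while it propagates.
        self.value = value
        try:
            for observer in self.observers:
                observer(value)
        finally:
            # Silent revert: observers are NOT notified of the reset.
            self.value = None


log = []
seen_during = []
click = EphemeralCell()
click.observers.append(log.append)
click.observers.append(lambda _: seen_during.append(click.value))

click.fire("button-1")
click.fire("button-2")
```

After both calls, `log` holds the two events in order and `click.value` is back to `None`, with no extra notification for either reset.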
Actually, you can still do that, it's just that those updates or commands force another "virtual moment" of time into being, where if you had made them pull-driven they could've happened in the *same* "virtual moment". So, it's more that pull-based rules are slightly more efficient than push-based ones, which is nice because that means most developers will consider it worth learning how to do it the pull way. ;)
That and the straitjacket I hope PyCells keeps from Cells.
Anyway, there is a *lot* of interesting food for thought, here. For example, you could create object validation rules using cells, and the results would be automatically recomputed when something they depended on changed. Not only that, but it would be possible to do atomic updates, such that the validation wouldn't occur until *after* all the changes were made -- i.e., no false positives. Of course, you'd get the resulting validation errors in the *next* "time quantum", so you'd need to make the response to them event-driven as well.
It's definitely a slippery slope. :)
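To make the "no false positives" point concrete, here is a rough sketch in which validation runs only after a whole batch of changes has been applied. It stands in for the idea only: `Model`, `atomic`, and the validator format are assumptions of the sketch, and a real Cells-style engine would drive this from dependencies rather than from an explicit context manager.

```python
# Hypothetical toy, NOT the Cells or PyCells API.
from contextlib import contextmanager


class Model:
    def __init__(self, **fields):
        self.fields = dict(fields)
        self.validators = []   # (check(fields) -> bool, message) pairs
        self.errors = []
        self._pending = None   # buffered writes while inside atomic()

    def set(self, name, value):
        if self._pending is not None:
            self._pending[name] = value   # defer until the batch ends
        else:
            self.fields[name] = value
            self._validate()

    @contextmanager
    def atomic(self):
        self._pending = {}
        try:
            yield self
            self.fields.update(self._pending)  # apply all changes at once
        finally:
            self._pending = None
        self._validate()  # runs once, after every change is in place

    def _validate(self):
        self.errors = [msg for check, msg in self.validators
                       if not check(self.fields)]


booking = Model(start=1, end=5)
booking.validators.append(
    (lambda f: f["start"] <= f["end"], "start after end"))

booking.set("start", 10)          # one-at-a-time update
errors_without_batch = list(booking.errors)
booking.set("start", 1)           # put it back

with booking.atomic():            # both changes land in one "moment"
    booking.set("start", 10)
    booking.set("end", 20)
errors_with_batch = list(booking.errors)
```

The unbatched write trips the validator on an intermediate state that was never meant to be seen; the batched version validates only the final state. The errors still arrive after the writes, which is the "next time quantum" wrinkle raised above.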
For example, this deterministic model of computation seems to resemble "object prevalence" (e.g. Prevayler) in that everything (even the clock) is deterministic, changes are atomic, and I/O occurs between logical moments of time. I haven't thought this particular link through very much yet, it's just an intriguing similarity.
Nice call. I have heard the Cells data integrity model maps nicely onto the transaction model of AllegroCache, a persistent Lisp object database.
The head-exploding part is figuring out how to get errors to propagate backwards in time, so that validation rules (which run in the "next moment") could appear to cause an error at the point where the values were set.
Sounds like you want at least one Undo. What about a "fail now or forever hold your peace" policy?
cheers, kenny