I have seen the same phenomenon that Attila refers to, across different machines and different Lisp implementations.
I had been chalking it up to differences in sorting and hash table behaviors.
It can be annoying because, as Attila points out, the nondeterminism/partial order (in the sense that the algorithm as coded in CL does not uniquely determine a build order) can mask bugs in system dependency specification -- system definitions with missing dependencies can seem to work.
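To make the masking concrete, here is a toy sketch (in Python rather than CL, purely for brevity). The component names and the `declared`/`actual` tables are made up for illustration, and this is not ASDF's traversal code. Component `c` really needs `b`, but the definition only declares that it needs `a`. Both orders below respect the declared constraints; only one of them respects the real ones.

```python
# Hypothetical three-file system: names and tables are illustrative only.
declared = {"a": [], "b": [], "c": ["a"]}      # what the system definition says
actual = {"a": [], "b": [], "c": ["a", "b"]}   # what the code really needs


def respects(order, deps):
    """True if every component appears after all of its dependencies."""
    position = {name: i for i, name in enumerate(order)}
    return all(position[d] < position[c] for c in deps for d in deps[c])


lucky = ["a", "b", "c"]    # b happens to precede c, so the build works
unlucky = ["a", "c", "b"]  # equally valid per the declaration, but c loads before b

for order in (lucky, unlucky):
    # Print the order, whether it honors the declaration, and whether it honors reality.
    print(order, respects(order, declared), respects(order, actual))
```

A build tool that always happens to pick the first order will never reveal that the `c` -> `b` dependency is missing.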
I do not believe that this problem is acute enough to try to fix with explicit randomization, especially since the partial ordering is something ASDF exploits to minimize rebuilds. I think it would be better to provide groveling support for programmers who wish to check or improve their system definitions. The cost of introducing explicit randomization (added code complexity, the need for repeated test runs, etc.) seems like a bad trade-off.
Actually, an ASDF variant that shuffles partially ordered components randomly, and repeatedly rebuilds, as a tool that one could run on a cloud service or something to check definitions might be a fun project. But it's not something I think should go into core ASDF.
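A rough sketch of what such a checker might do, again in Python for brevity and not based on any existing ASDF code. Everything here is hypothetical: the function names, the `declared`/`actual` tables, and in particular the `actual` table, which stands in for "does the build really succeed in this order" (a real tool would have to run the build to find out). It also assumes every dependency named in the tables is itself a key.

```python
import random


def random_topological_order(deps, rng):
    """Return one linear order consistent with deps, chosen at random.

    deps maps each component to the components it declares it needs.
    """
    remaining = {c: set(ds) for c, ds in deps.items()}
    order = []
    while remaining:
        # Components whose declared dependencies have all been placed already.
        ready = sorted(c for c, ds in remaining.items() if not ds)
        if not ready:
            raise ValueError("dependency cycle or unknown component")
        choice = rng.choice(ready)
        order.append(choice)
        del remaining[choice]
        for ds in remaining.values():
            ds.discard(choice)
    return order


def find_failing_order(declared, actual, trials=100, seed=0):
    """Try random declared-valid orders; return the first one that breaks."""
    rng = random.Random(seed)
    for _ in range(trials):
        order = random_topological_order(declared, rng)
        seen = set()
        for c in order:
            if not set(actual[c]) <= seen:
                return order  # c was reached before something it really needs
            seen.add(c)
    return None


# Made-up example: "c" really needs "b", but only "a" is declared.
declared = {"a": [], "b": [], "c": ["a"]}
actual = {"a": [], "b": [], "c": ["a", "b"]}
print(find_failing_order(declared, actual))
```

The repeated trials are the expensive part, which is why this seems better suited to an offline checking tool than to every build.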
cheers, r
On Jul 17, 2013, at 9:24, Faré fahree@gmail.com wrote:
On Wed, Jul 17, 2013 at 11:56 AM, Attila Lendvai attila.lendvai@gmail.com wrote:
ASDF's process for constructing a build plan from partial-order dependencies is (unless Faré changed something when I wasn't looking) non-deterministic.
does it mean that it's actively randomized?
No, there was never active randomization, but the infamous union-of-dependencies function in practice used to reverse the order in which depends-on is consulted, as compared to the order in which it is specified.
i remember in our busier days (well before ASDF3) we used to have load order anomalies that seemed like they depended on filesystem order or something external to lisp.
There used to be many anomalies related to timestamps. ASDF3 should have that fixed, at least.
the symptom was that on my dev machine stuff compiled/loaded fine reproducibly, then when the recorded changes got pulled to the live server it suddenly failed to compile/load due to a dependency that was not explicitly added to the .asd file.
i don't really remember if it was just partial reloading, or clean recompile/reload, but i think it was the latter, because it was rather baffling and the former wouldn't be too baffling.
This description is not enough for me to identify what bug (or "feature") you may have been hitting. Inter-system dependencies used to just not be done correctly; also, timestamps could be "interesting" when compiling in one image and loading in another. Finally, time skew between filesystem server and local kernel could be very "interesting". All things that should now be fixed with ASDF3.
if the ordering is actually not deterministic (as opposed to merely unspecified), then maybe it's a good idea to actively randomize lists before dependency constraints are applied?
It is merely unspecified, and happened to be reversed in ASDF1 and, after it, ASDF2, as compared to user order (however, components would still be traversed in user order).
(In ASDF2, I incrementally refactored the traversal algorithm until I understood it, and some bugs were fixed in corner cases, but the algorithm mainly stayed the same. In ASDF3, to fix a "last" couple of bugs, I had to completely rewrite the algorithm, twice, and all the traversal bugs and quirks should be gone now. See my ASDF3 talk at ELS 2013.)
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org An apple every eight hours will keep three doctors away.