Earlier I claimed that the new ASDF TRAVERSE computed a conservative approximation to what needed to be recompiled. In particular, if we have
(defsystem X :depends-on (Y) ....)
With the new patch, if we do load-op on X, and Y has changed, X will be recompiled.
In Classic ASDF, this would not happen.
However, even with the new patch, recompilation is not always triggered when it should be.
In particular, this case will not work properly:
(defsystem X :depends-on (Y) ...)
(defsystem Z :depends-on (Y) ...)
Let's imagine I load X and Z, which causes me to load Y.
Now I modify some of the code of Y and reload X. ASDF will notice the change in Y and trigger a recompilation of X.
BUT if I now (asdf:load-system :z), when we check the dependencies of Z and find Y, there will be nothing that needs to be done for Y, so recompilation of Z will /not/ be triggered.
So the problem with trying to do a conservative estimation of what needs to be recompiled is that we don't store, with a system object, information about the state of the things it depends on, so we cannot detect a change in the state of those dependencies.
Information about the last compilation of a system /is/ available in ASDF, but we don't store with X and Z information about the state of Y when they were compiled.
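To make the missing bookkeeping concrete, here is a minimal sketch of the idea. It is a toy model and not ASDF code: the names *stamps*, *seen*, record-build and stale-p are all hypothetical, systems are plain symbols, and stamps are plain integers standing in for build times.

```lisp
;; Toy model only; none of these names exist in ASDF.
(defvar *stamps* (make-hash-table)
  "System name -> stamp of its most recent build.")

(defvar *seen* (make-hash-table :test #'equal)
  "(dependent . dependency) -> stamp the dependency had when the dependent was last built.")

(defun record-build (system deps stamp)
  "Record that SYSTEM was built at STAMP against the current state of DEPS."
  (setf (gethash system *stamps*) stamp)
  (dolist (dep deps)
    (setf (gethash (cons system dep) *seen*)
          (gethash dep *stamps*))))

(defun stale-p (system deps)
  "True if SYSTEM was never built, or some dependency has changed since
SYSTEM last recorded its stamp."
  (or (null (gethash system *stamps*))
      (some (lambda (dep)
              (not (eql (gethash dep *stamps*)
                        (gethash (cons system dep) *seen*))))
            deps)))

;; The X/Z scenario from above.
(record-build 'y '() 1)
(record-build 'x '(y) 2)
(record-build 'z '(y) 3)
(record-build 'y '() 4)    ; Y is modified and rebuilt while reloading X
(record-build 'x '(y) 5)   ; X is rebuilt against the new Y
```

With this record, (stale-p 'z '(y)) is true even though Y itself has nothing left to do, because Z remembers the older Y stamp, while (stale-p 'x '(y)) is false.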
To open YA can of worms, we also don't cache information about systems *across lisp sessions*. But I think I'll just raise that issue and then drop it like a hot potato.
Cheers, r
[I don't expect the following to appear in ASDF. :)]
Many build systems consider dependencies to be a solved problem. Unfortunately, I don't know of any suitable for CL.
There are two proven approaches to detecting changes
- timestamps
- file hashes
The beauty of timestamps is that, on most filesystems, they don't require saving any extra data. If any of an output's dependencies are newer than it is, then it needs to be rebuilt. However, some filesystems (e.g. network shares) exhibit significant time jitter or coarse time resolution, thus breaking this approach. Moving or untarring old files into the build area can also be painful. I suspect the unix "touch" command was initially created to force updates in timestamp-based systems.
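The timestamp rule can be sketched in portable CL roughly as follows. This is only an illustration: out-of-date-p and safe-write-date are hypothetical names, and the :stamp argument is an assumption added so the rule can be exercised without touching the filesystem.

```lisp
(defun safe-write-date (path)
  "Universal time of PATH's last write, or NIL if PATH does not exist."
  (and (probe-file path) (file-write-date path)))

(defun out-of-date-p (output inputs &key (stamp #'safe-write-date))
  "True if OUTPUT is missing, an input is missing, or any of INPUTS is
strictly newer than OUTPUT. STAMP maps a path to a universal time or NIL."
  (let ((out-time (funcall stamp output)))
    (or (null out-time)
        (some (lambda (input)
                (let ((in-time (funcall stamp input)))
                  ;; A missing input cannot be verified, so treat it as stale.
                  (or (null in-time)
                      (> in-time out-time))))
              inputs))))
```

Note that FILE-WRITE-DATE returns a universal time with one-second resolution, and the sketch treats equal stamps as up to date. An input edited within the same second as the output was written would be missed, which is the coarse-resolution hazard described above.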
File hashes require extra computation and disk storage. Existing systems show that their computation can be a reasonably small fraction of the compile time. Rechecks can be optimized by saving the timestamp and a cheap hash for a quick scan, and a longer hash to compare files flagged by the first scan. Time spent determining that one file doesn't need to recompile may be regained by not recompiling files that depend on it. Storage can be shared with a related issue: compile-time dependency tracking.
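A rough sketch of the two-stage recheck, under stated assumptions: the quick scan here compares the saved write date and file size, and 32-bit FNV-1a stands in for the content hash only because it fits in a few lines of portable CL. A real tool would presumably use a stronger digest. All names (fnv1a-32, file-octets, fingerprint, changed-p) are hypothetical.

```lisp
(defun fnv1a-32 (octets)
  "32-bit FNV-1a hash of a sequence of octets."
  (let ((hash 2166136261))                      ; FNV offset basis
    (map nil (lambda (octet)
               (setf hash (logand (* (logxor hash octet) 16777619) ; FNV prime
                                  #xFFFFFFFF)))
         octets)
    hash))

(defun file-octets (path)
  "Contents of PATH as a vector of octets."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let ((buffer (make-array (file-length in)
                              :element-type '(unsigned-byte 8))))
      (read-sequence buffer in)
      buffer)))

;; What gets saved per file between builds.
(defstruct fingerprint write-date size digest)

(defun fingerprint-file (path)
  "Compute the record to save for PATH after a successful build."
  (let ((octets (file-octets path)))
    (make-fingerprint :write-date (file-write-date path)
                      :size (length octets)
                      :digest (fnv1a-32 octets))))

(defun changed-p (path old)
  "Quick scan first: if write date and size both match OLD, assume unchanged.
Otherwise hash the contents and compare digests."
  (let ((size (with-open-file (in path :element-type '(unsigned-byte 8))
                (file-length in))))
    (if (and (eql (file-write-date path) (fingerprint-write-date old))
             (eql size (fingerprint-size old)))
        nil
        (/= (fnv1a-32 (file-octets path)) (fingerprint-digest old)))))
```

The point of the second stage is that a file whose timestamp moved but whose contents did not (a touch, an untar of identical sources) is reported as unchanged, so nothing downstream of it needs to be recompiled.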
System descriptions like ASDF tend to be overly simplistic; indeed that's a good design goal. They specify the minimal information required for a clean first compile -- library dependencies that must be met, obvious internal dependencies, etc. A good compiler can output more information; as each file is compiled, a separate output lists the files which contained definitions used by this one. Thus the next time this system is compiled, the build tools have a better picture of what dependencies to look at.
Automated dependency tracking is fairly straightforward in C/C++; the compiler must simply track the paths of each #include file. Java was designed to make it a non-issue (e.g. runtime resolution of constants across class files).
In CL, dependency semantics are more complicated; but I think they boil down to tracking the source files for each nonstandard (not in the impl) function used by the reader (e.g. reader macros and #.), all macros, all constants, and all inline functions. CL semantics resolve everything else at load/run time. Faré may have a better defined set due to his work on XCVB.
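A small illustration of why macros belong on that list (scale and use-scale are made-up names): a compiled caller keeps the expansion it was compiled with, so changing the macro has no effect until the caller is recompiled, whereas a redefined ordinary function would be picked up at the next call.

```lisp
(defmacro scale (x) `(* 2 ,x))

(defun use-scale (n) (scale n))
(compile 'use-scale)                     ; the macro call is expanded here at the latest

;; "Edit" the macro without recompiling its user.
(defmacro scale (x) `(* 3 ,x))
(defparameter *before* (use-scale 5))    ; 10: still the old expansion

;; Recompile the dependent definition to pick up the change.
(defun use-scale (n) (scale n))
(compile 'use-scale)
(defparameter *after* (use-scale 5))     ; 15
```

Constants, inline functions and reader-time functions behave similarly, in that their definitions may be consumed when the user is read or compiled rather than when it runs.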
Unfortunately, I don't think any CL implementations can output this information. Then again, ISTR XCVB making progress on this front.
Later, Daniel
On 2/10/10 10:08 PM, dherring@tentpost.com wrote:
[I don't expect the following to appear in ASDF. :)]
Many build systems consider dependencies to be a solved problem. Unfortunately, I don't know of any suitable for CL.
There are two proven approaches to detecting changes
- timestamps
- file hashes
The beauty of timestamps is that, on most filesystems, they don't require saving any extra data. If any of an output's dependencies are newer than it is, then it needs to be rebuilt. However, some filesystems (e.g. network shares) exhibit significant time jitter or coarse time resolution, thus breaking this approach. Moving or untarring old files into the build area can also be painful. I suspect the unix "touch" command was initially created to force updates in timestamp-based systems.
....
I'm going to ignore file-hashes. If someone else is really gung-ho for them, great.
System descriptions like ASDF tend to be overly simplistic; indeed that's a good design goal. They specify the minimal information required for a clean first compile -- library dependencies that must be met, obvious internal dependencies, etc. A good compiler can output more information; as each file is compiled, a separate output lists the files which contained definitions used by this one. Thus the next time this system is compiled, the build tools have a better picture of what dependencies to look at.
Automated dependency tracking is fairly straightforward in C/C++; the compiler must simply track the paths of each #include file. Java was designed to make it a non-issue (e.g. runtime resolution of constants across class files).
In CL, dependency semantics are more complicated; but I think they boil down to tracking the source files for each nonstandard (not in the impl) function used by the reader (e.g. reader macros and #.), all macros, all constants, and all inline functions. CL semantics resolve everything else at load/run time. Faré may have a better defined set due to his work on XCVB.
A real issue is that ASDF has provided (insufficiently documented) means for extending the protocol for updating things /without/ clearly boiling this back to something like timestamps. This means that the notion of "dependency" is not operationalized. A subsidiary hindrance is that operations on composites (modules and systems) have odd semantics because of the postorder traversal.
WRT macros, BTW, I believe that the Allegro defsystem has a special dependency relationship that corresponds to a macro definition, :definitions, which imposes more substantial recompilation obligations than a non-macro component. See http://www.franz.com/support/documentation/current/doc/defsystem.htm for possibly helpful discussion.
Another issue is that CL has a notion of RUNTIME state, which make does not have to deal with. That is, in a running lisp image, a CL make-alike must provide the means to go from one good state to another through a combination of compiling and loading. Make need only worry about creating a good state from zero.
Unfortunately, I don't think any CL implementations can output this information. Then again, ISTR XCVB making progress on this front.
It's not clear to me that this information is effectively computable, in which case we have to fall back on the defsystem author providing it.
best, r