What if the previous hierarchy was plain buggy?
Then going forward we have to introduce a NEW class to replace the old class.
Yes, that breaks code, but it only breaks code VISIBLY. Redefinition of existing classes breaks code INVISIBLY.
If we had a new name, say OPERATION-CLASS, then everyone would see that their existing code needed modification. Yes, a nuisance.
I've tried to adhere to this guideline in the past, but sometimes it's just not possible: too many people rely on the old name, for several semantic effects, some of which are preserved and some of which aren't. Why break all the uses that are preserved, or even enhanced, when only a few are broken and, what's more, those few were already broken in corner cases?
In the case of OPERATION, many systems were defining methods on OPERATION. The idea that OPERATION was a base class was deeply rooted in the code. Many systems defined methods on TEST-OP, for which the sideway propagation was an undesired bug, not an actual feature: when I test a system of mine, I do NOT want to test every third-party library that I depend on, some of which have broken tests that I don't care about and don't care to fix. If I want propagation, it's easy enough to write a MONOLITHIC-TEST-OP or such that will propagate sideway and maybe even downward. As I described in my previous audit, there were many operation classes being defined for which sideway and downward propagation was a bug. It was a notable bug for the bundle operations. It was even a bug for LOAD-OP itself, in corner cases: that was the root bug that started the whole refactoring.
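To make the propagation point concrete, here is a minimal toy model in plain CLOS. It is not ASDF's actual code: the classes TOY-SYSTEM, TOY-OPERATION, SIDEWAY-MIXIN, TOY-TEST-OP and TOY-MONOLITHIC-TEST-OP and the generic function SYSTEMS-TO-VISIT are all hypothetical names made up for this sketch. The base operation visits only the system it is given; propagation to dependencies is something a class opts into.

```lisp
;; Toy model only; none of these names belong to ASDF.
(defclass toy-system ()
  ((name :initarg :name :reader toy-system-name)
   (depends-on :initarg :depends-on :initform '()
               :reader toy-system-depends-on)))

(defclass toy-operation () ())

;; Hypothetical opt-in mixin: operations inheriting from it
;; also visit the dependencies of the system.
(defclass sideway-mixin () ())

(defclass toy-test-op (toy-operation) ())
(defclass toy-monolithic-test-op (sideway-mixin toy-operation) ())

(defgeneric systems-to-visit (operation system)
  ;; Default: no propagation, only the system itself.
  (:method ((op toy-operation) (system toy-system))
    (list system))
  ;; Opt-in: the system, then its dependencies, transitively.
  (:method ((op sideway-mixin) (system toy-system))
    (remove-duplicates
     (cons system
           (loop for dep in (toy-system-depends-on system)
                 append (systems-to-visit op dep)))
     :from-end t)))

(defparameter *toy-lib*
  (make-instance 'toy-system :name "third-party-lib"))
(defparameter *toy-app*
  (make-instance 'toy-system :name "my-app"
                             :depends-on (list *toy-lib*)))

(defun visited-names (operation-class)
  (mapcar #'toy-system-name
          (systems-to-visit (make-instance operation-class) *toy-app*)))

;; (visited-names 'toy-test-op)            => ("my-app")
;; (visited-names 'toy-monolithic-test-op) => ("my-app" "third-party-lib")
```

With this shape, testing my own system does not drag in the tests of every library it depends on, and whoever does want that behavior asks for it by name.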
Once again, sometimes the right thing to do IS to keep the same name, and break some client code that was subtly broken.
But I can speak to the fact that code that quietly goes away and does something unexpected or, worse, quietly does not do something expected, is very difficult to debug.
Indeed, the problem here is not that the breakage was invisible; the problem is that, upon seeing the breakage, the author (you in this case) had trouble tracing it back to its root cause.
You're right that maybe we could have at the end of each file a check for new subclasses of OPERATION, and a warning if an unknown one was found. It is probably more portable to use the MOP to walk all defined subclasses than to intercept their definition as it happens.
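A rough sketch of such a check follows, under loud assumptions: DEMO-OPERATION and its subclasses are stand-ins rather than ASDF's classes, *KNOWN-OPERATION-CLASSES* is a hypothetical whitelist, and the MOP function CLASS-DIRECT-SUBCLASSES is reached through reader conditionals for SBCL, CCL and ECL only. A portability layer such as closer-mop would normally replace those conditionals.

```lisp
;; Implementation-specific access to the MOP; only three
;; implementations are covered in this sketch.
(defun direct-subclasses (class)
  #+sbcl (sb-mop:class-direct-subclasses class)
  #+ccl (ccl:class-direct-subclasses class)
  #+ecl (clos:class-direct-subclasses class)
  #-(or sbcl ccl ecl)
  (error "No MOP accessor configured here for ~S." class))

;; Stand-in hierarchy, not ASDF's.
(defclass demo-operation () ())
(defclass demo-load-op (demo-operation) ())
(defclass demo-test-op (demo-operation) ())
(defclass demo-third-party-op (demo-operation) ())

;; Hypothetical whitelist of subclasses whose semantics were audited.
(defparameter *known-operation-classes* '(demo-load-op demo-test-op))

(defun all-subclasses (class)
  "Return every direct or indirect subclass of CLASS, without duplicates."
  (let ((seen '()))
    (labels ((walk (c)
               (dolist (sub (direct-subclasses c))
                 (unless (member sub seen)
                   (push sub seen)
                   (walk sub)))))
      (walk class))
    seen))

(defun unknown-operation-classes (&optional (root 'demo-operation))
  "Names of subclasses of ROOT that are not on the whitelist."
  (remove-if (lambda (name) (member name *known-operation-classes*))
             (mapcar #'class-name (all-subclasses (find-class root)))))

(defun warn-about-unknown-operations (&optional (root 'demo-operation))
  (dolist (name (unknown-operation-classes root))
    (warn "~S is an unaudited subclass of ~S." name root)))

;; (unknown-operation-classes) => (DEMO-THIRD-PARTY-OP)
```

Calling WARN-ABOUT-UNKNOWN-OPERATIONS at the end of a file would then flag any subclass defined so far that nobody has looked at, without having to intercept class definitions as they happen.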
It's particularly bad in ASDF, because people often don't think to look into it (it's like looking for a bug in "make") and because even if you do think to look into it, you may be getting it from your implementation, which may make it difficult to find.
Yes, it's always unsettling when the foundations you rely upon crumble under your feet. But that's precisely why it's important to fix the structural misdesign early rather than late. The challenge is to do it in non-disruptive ways; I admit I have only been so good at that.
This will be a maintenance principle going forward: no changes to user-extendible classes without renaming.
I fear this is a good guideline, but not a strict principle.
This suggests a corollary: in order to prevent horrible *internal* update issues, we should split apart user-extensible classes from internal classes.
E.g., it would make sense to have something like INTERNAL-OPERATION and OPERATION be distinct going forward. That naming policy is obviously not tenable for ASDF now -- OPERATION is used everywhere and must remain -- but adding USER-OPERATION might be a plausible step.
That allows us to incompatibly modify the guts of the program while providing a stable, or visibly-broken, API to the user.
I'm not convinced that adding a class makes things more stable rather than less. If anything the name should be more like a versioned ASDF3-OPERATION than USER-OPERATION. And what of people who want to define methods on ALL operations? Or do you want to document two distinct classes, the base of all the hierarchy OPERATION, versus the base for user-extensibility, with plenty of additional semantics, that would be ASDF3-OPERATION? But what if people precisely want no additional semantics? Worse, what if a third party wants to hook in additional semantics to some (or all) operations? I think it makes sense to define new classes when you add new semantics, and not put all the functionality in the base class; but I don't think it makes sense to create gratuitous new classes just because you believe there might be a need for an entry point.
But once again, I believe it's too late to do anything about OPERATION. So let's consider a potential future cleanup one might want to do. Let's pick an "easy" one: the case of MODULE and SYSTEM. When rewriting ASDF, I initially wanted to make a SYSTEM NOT be a MODULE, but to introduce separate classes PARENT-COMPONENT and CHILD-COMPONENT, such that a MODULE would be both a PARENT- and a CHILD- whereas a SYSTEM would be a PARENT- but not a CHILD-. This wasn't compatible with many systems that define methods on MODULE for the specific purpose of their being run on systems: indeed, that was the recommended backward-compatible way in the manual(!) to override the file type of Lisp files in a system, as opposed to the new way of using (and maybe defining) a subclass of CL-SOURCE-FILE with a different :type. If you want to clean that up, you'll have to make sure no one writes methods on MODULE and expects them to work on SYSTEM. That will require changing a few systems in Quicklisp and providing a transition period; but eventually, you can make that cleanup. And for that cleanup, it would be counter-productive to rename the whole class hierarchy and make everything incompatible, when 99% of the code is compatible, and the offending code is arguably buggy (be it at the suggestion of the buggy manual).
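Here is a toy illustration of the MODULE/SYSTEM cleanup described above, with the two hierarchies placed side by side. All the OLD- and NEW- classes and the generic function TOY-FILE-TYPE are hypothetical stand-ins invented for this sketch; ASDF's real classes and its file-type protocol are not reproduced here.

```lisp
;; "Before": a system is a kind of module.
(defclass old-component () ())
(defclass old-module (old-component) ())
(defclass old-system (old-module) ())

;; "After": a module is both a parent and a child,
;; a system is a parent only.
(defclass new-component () ())
(defclass new-parent-component (new-component) ())
(defclass new-child-component (new-component) ())
(defclass new-module (new-parent-component new-child-component) ())
(defclass new-system (new-parent-component) ())

;; Hypothetical hook standing in for "what is the file type
;; of Lisp files under this container?"
(defgeneric toy-file-type (container)
  (:method ((container t)) "lisp"))

;; Client code written against MODULE but meant to affect systems.
(defmethod toy-file-type ((container old-module)) "cl")
(defmethod toy-file-type ((container new-module)) "cl")

;; (toy-file-type (make-instance 'old-system)) => "cl"
;; (toy-file-type (make-instance 'new-system)) => "lisp"
```

Nothing signals an error in the second case: the client's method simply stops applying to systems. That is why such client code has to be found and fixed during a transition period before the hierarchy changes underneath it.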
Class hierarchy surgery is hard. But "just create a new hierarchy" is sometimes self-defeating. If the goal is a better system at the price of incompatibility, consider XCVB. If the goal is to slowly steer the CL community towards having a saner build system, all the while providing a smooth upgrade path via backward compatibility with recent-enough versions, then software surgery is the rule.
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org Tradition is the matter of which civilization is made. Anyone who rejects tradition per se should be left naked in a desert island. Innovation is the matter with which civilization is built. Anyone who rejects innovation per se should be left naked in a desert island.