I’m asking both how they should be named and how to advertise them for programmatic consumption. For example, an automatic testing program such as the one included in quicklisp should not try to load, stand-alone, systems which are not designed to work stand-alone. We have to work around this by artificially making all systems “work” stand-alone well enough to fool quicklisp.
Can you explain the quicklisp constraint? How does it find all systems?
One simple expedient for this quicklisp issue -- if I understand it correctly -- would be to have a default test-op perform method for all systems that simply succeeds. By default it should probably issue a warning that no "real" test method exists, and that warning should have a particular type so that it can be muffled by quicklisp. We should probably also allow the programmer of the original system to define a no-op test-op method that emits no warning (because the system is intended not to be testable).
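To make the proposal concrete, here is a minimal sketch in plain CLOS of a default test method that succeeds with a typed warning, a silent override, and a harness that muffles the warning. Every name in it (missing-test-warning, basic-system, untestable-system, run-tests, run-tests-quietly) is hypothetical and chosen for illustration; none of this is ASDF's or quicklisp's actual API.

```lisp
;; Hypothetical names throughout: this models the idea with plain CLOS
;; and does not touch ASDF's real classes or generic functions.

;; A dedicated warning type, so a harness can muffle exactly this case.
(define-condition missing-test-warning (warning)
  ((system-name :initarg :system-name :reader missing-test-system-name))
  (:report (lambda (condition stream)
             (format stream "No real test method is defined for system ~A."
                     (missing-test-system-name condition)))))

(defclass basic-system ()
  ((name :initarg :name :reader system-name)))

;; A system whose author has declared it intentionally untestable.
(defclass untestable-system (basic-system) ())

(defgeneric run-tests (system)
  (:documentation "Run the tests of SYSTEM; return true on success."))

;; Default method: succeed, but warn that nothing was actually tested.
(defmethod run-tests ((system basic-system))
  (warn 'missing-test-warning :system-name (system-name system))
  t)

;; Author-supplied override: succeed silently, with no warning.
(defmethod run-tests ((system untestable-system))
  t)

;; What an automated harness could do: muffle only the typed warning
;; and let every other condition propagate as usual.
(defun run-tests-quietly (system)
  (handler-bind ((missing-test-warning #'muffle-warning))
    (run-tests system)))
```

Because the warning has its own type, the harness can ignore it without suppressing unrelated warnings, and a system author who wants silence only needs to define a more specific method.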
As I understand quicklisp, it tries to compile each system in a top-level sbcl, and asserts that the compilation works. As far as I know that is the only test it does. I don’t believe it does anything special with test-op.