Hello.

Thanks to everyone who answered in this thread. It was very helpful.

I now collect test results for CFFI (and other libraries using the RT test framework) at the level of individual test failures, including information about which failures are known (aka "expected"). I am now implementing the same for other test frameworks.

As for CFFI, you can see that of the 14 Lisp / OS combinations we've run the tests on, there are only two where all the failures are known:
http://common-lisp.net/project/cl-test-grid/pivot_lib-lisp_ql.html

I also apply the "known failures" idea in another way: I compare the test results of two consecutive quicklisp distributions on the same Lisp implementation and detect new failures, i.e. failures that were absent in the old version. (Here the "known" failures are the failures in the previous version, something along the lines of keeping the list of "known" failures separate from the tests.)

That way, even if the library test suite already had failures, we can detect and quickly react to new bugs.

Best regards,
- Anton
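
For illustration, the comparison of two consecutive runs described above boils down to a set difference. Below is a minimal sketch in Python, not the actual cl-test-grid code; the function name `new_failures` and the test names are hypothetical, and each run is assumed to be available as a plain list of failed test names.

```python
def new_failures(old_failed, new_failed):
    """Return the tests that fail in the new run but did not fail in the old run."""
    # Failures of the old run play the role of the "known" failures.
    return sorted(set(new_failed) - set(old_failed))


# Hypothetical failed-test names for one library on one Lisp implementation,
# taken from two consecutive quicklisp distributions.
old_run = ["test.1", "test.2"]
new_run = ["test.1", "test.2", "test.3"]

print(new_failures(old_run, new_run))
```

Keeping the old failures as data outside the test suite means the suite itself needs no changes to mark anything as expected.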