I have been backporting Gary's markdown documentation to texinfo, and came across this:
This operation should test the specified component. ASDF does not (currently) provide a method for returning a value from the test-op (or any other operation), so the implementor must ensure that executing test-op has the side effect of printing useful information about the tests to the CL stream *standard-output*.
Question: do we want to specify *standard-output* as the destination for test-op output?
I find that to be undesirable because on many CL implementations, it leads to test output being interspersed with lots of messages from compilation and loading, which makes it hard to find the test output you're looking for.
I don't like specifying *standard-output* as the destination because it seems to me to be committing us to a bad solution. I think here /not/ specifying a destination is better than specifying a bad destination.
I've never found a satisfactory solution to this problem. One suggestion would be to have the test-op take an init arg that specifies a stream (or a filename?) to which test output should be written. It would be left to test op implementers to figure out how to get their test output to go to that destination. I imagine this would typically be achieved by having the PERFORM method dynamically bind some variable that designates the destination.
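A minimal sketch of what that might look like -- the class TEST-OP-WITH-OUTPUT, the :OUTPUT initarg, and the special variable *TEST-OUTPUT* are invented names for illustration, not ASDF API:

    (defvar *test-output* *standard-output*
      "Where test-op output should go (hypothetical variable, not part of ASDF).")

    (defclass test-op-with-output (asdf:test-op)
      ((output :initarg :output :initform *standard-output*
               :reader test-op-output)))

    (defmethod asdf:perform :around ((op test-op-with-output) component)
      (declare (ignore component))
      ;; Bind the destination so that PERFORM methods written against
      ;; *TEST-OUTPUT* send their reports there instead of interleaving them
      ;; with compilation and loading messages.
      (let ((*test-output* (test-op-output op)))
        (call-next-method)))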
So, two proposals:
1. Delete the specification of *standard-output* as destination.
2. Add an initarg and slot for test-op and specify that output should go there.
These are actually two somewhat independent proposals. A 'Yes' on 2 entails a 'Yes' on 1, but it's possible to say 'Yes' to 1 and 'No' to 2.
Best, R
Hi Robert,
Ignoring any thinking about your actual questions <smile>...
This operation should test the specified component. ASDF does not (currently) provide a method for returning a value from the test-op (or any other operation),
This is no longer true; ASDF does return the operation, so if there were also a way for test systems to annotate the operation, some things would work out well.
Gary King wrote:
Hi Robert,
Ignoring any thinking about your actual questions <smile>...
This operation should test the specified component. ASDF does not (currently) provide a method for returning a value from the test-op (or any other operation),
This is no longer true; ASDF does return the operation, so if there were also a way for test systems to annotate the operation, some things would work out well.
I don't believe that this is a general solution, for two reasons:
1. The test frameworks aren't particularly set up to stick stuff onto ASDF objects. I suppose if they write to streams that we create, then we can save the stream contents to a string and save them in the operations, though. It will be a burden on those writing the test-op methods, too.
2. Returning a single operation isn't enough, is it? For example, if I have system X, with sub-systems A, B, and C, I may be testing A, B, and C, so my traversal would have to gather up the three subsidiary test-op entities and either package them into the parent test-op object, or rip the test results out of them and push them into the parent test-op object. I don't believe the plan-then-execute logic of ASDF makes this easy, but I may be wrong.
I will have to have a look into the latest code and see how the operations are returned. Thanks for the pointer.
best, r
Hi Robert,
I don't believe that this is a general solution, for two reasons:
I agree that it isn't a general solution especially since there is no interface/API for clients to do anything with an ASDF operation! It might, however, be a small step in the right direction.
Gary King wrote:
Hi Robert,
I don't believe that this is a general solution, for two reasons:
I agree that it isn't a general solution especially since there is no interface/API for clients to do anything with an ASDF operation! It might, however, be a small step in the right direction.
An alternative solution would be to provide a :stream or :filename init argument for the test-op operation class and bind a dynamic variable around every perform, making the stream or filename available for writing....
Best, r
On Mon, 5 Oct 2009, Robert Goldman wrote:
Gary King wrote:
Hi Robert,
I don't believe that this is a general solution, for two reasons:
I agree that it isn't a general solution especially since there is no interface/API for clients to do anything with an ASDF operation! It might, however, be a small step in the right direction.
An alternative solution would be to provide a :stream or :filename init argument for the test-op operation class and bind a dynamic variable around every perform, making the stream or filename available for writing....
Why serialize the data? Could we design a structured API to be used by other tools?
What if each test logged messages to asdf:*test-stream* and finished by calling (asdf:test-result test-description-string result-keyword)?
DejaGnu has a good description of test results; the keywords from good to bad could be :xpass, :xfail, :untested, :unresolved, :unsupported, :pass, and :fail. http://www.gnu.org/software/dejagnu/manual/x47.html
Then asdf:test-op could return (values worst-result results-list).
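Purely to make the shape of that proposal concrete -- asdf:*test-stream*, asdf:test-result, and worst-result do not exist today; all of the names below are hypothetical:

    (defvar *test-stream* *standard-output*
      "Stream to which tests log their messages (hypothetical).")

    (defparameter *result-order*
      ;; DejaGnu-style outcomes, in the good-to-bad order listed above.
      '(:xpass :xfail :untested :unresolved :unsupported :pass :fail))

    (defvar *results* '()
      "Accumulated (description . result-keyword) pairs for the current run.")

    (defun test-result (description result-keyword)
      ;; A test framework would call this as each test finishes.
      (format *test-stream* "~&~A: ~A~%" result-keyword description)
      (push (cons description result-keyword) *results*)
      result-keyword)

    (defun worst-result (results)
      ;; "Worst" means the outcome that appears latest in *RESULT-ORDER*.
      (elt *result-order*
           (reduce #'max results
                   :key (lambda (entry) (position (cdr entry) *result-order*))
                   :initial-value 0)))

    ;; A test-op could then finish with
    ;; (values (worst-result *results*) (reverse *results*)).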
Of course ASDF doesn't need to reinvent testing; there are plenty of existing frameworks to choose from. http://www.cliki.net/test_framework
Later, Daniel
Daniel Herring wrote:
On Mon, 5 Oct 2009, Robert Goldman wrote:
Gary King wrote:
Hi Robert,
I don't believe that this is a general solution, for two reasons:
I agree that it isn't a general solution especially since there is no interface/API for clients to do anything with an ASDF operation! It might, however, be a small step in the right direction.
An alternative solution would be to provide a :stream or :filename init argument for the test-op operation class and bind a dynamic variable around every perform, making the stream or filename available for writing....
Why serialize the data? Could we design a structured API to be used by other tools?
What if each test logged messages to asdf:*test-stream* and finished by calling (asdf:test-result test-description-string result-keyword)?
DejaGnu has a good description of test results; the keywords from good to bad could be :xpass, :xfail, :untested, :unresolved, :unsupported, :pass, and :fail. http://www.gnu.org/software/dejagnu/manual/x47.html
Then asdf:test-op could return (values worst-result results-list).
Of course ASDF doesn't need to reinvent testing; there are plenty of existing frameworks to choose from. http://www.cliki.net/test_framework
What I am after is an ASDF test-op that will adapt as well as possible to the widest possible set of testing frameworks. I agree that we should not attempt to reinvent testing.
That is why I have been suggesting that we provide a test operation that binds a stream --- because most of the test frameworks I have worked with provide a test report, rather than returning results.
I don't believe that having the test-op return a result will work, given the current ASDF execution model. If we are testing system X that has A and B dependencies, then what will happen will be something like
... testing A's components ... test-op A ... testing B's components ... test-op B ... test other X components ... test-op X
It's not obvious to me how we take the results from test-op A and test-op B and roll them up into test-op X. Nor do I see how, in general, a test-op done in LIFT for A, FiveAM for B and NST for X, can be made to communicate with each other.
It's not ideal, but I think a separate stream may be about as good as we can do.
Best, r
On Mon, 5 Oct 2009, Robert Goldman wrote:
Daniel Herring wrote:
On Mon, 5 Oct 2009, Robert Goldman wrote:
Gary King wrote:
Hi Robert,
I don't believe that this is a general solution, for two reasons:
I agree that it isn't a general solution especially since there is no interface/API for clients to do anything with an ASDF operation! It might, however, be a small step in the right direction.
An alternative solution would be to provide a :stream or :filename init argument for the test-op operation class and bind a dynamic variable around every perform, making the stream or filename available for writing....
Why serialize the data? Could we design a structured API to be used by other tools?
...
Of course ASDF doesn't need to reinvent testing; there are plenty of existing frameworks to choose from. http://www.cliki.net/test_framework
What I am after is an ASDF test-op that will adapt as well as possible to the widest possible set of testing frameworks. I agree that we should not attempt to reinvent testing.
That is why I have been suggesting that we provide a test operation that binds a stream --- because most of the test frameworks I have worked with provide a test report, rather than returning results.
I don't believe that having the test-op return a result will work, given the current ASDF execution model.
...
Given that, how about deprecating/removing test-op? Then people can continue to simply load sysname-test.asd and run tests however they want.
I see value in creating a widely accepted testing framework. I don't see the benefit in simply binding a stream variable.
Later, Daniel
Daniel Herring wrote:
On Mon, 5 Oct 2009, Robert Goldman wrote:
Daniel Herring wrote:
On Mon, 5 Oct 2009, Robert Goldman wrote:
Gary King wrote:
Hi Robert,
I don't believe that this is a general solution, for two reasons:
I agree that it isn't a general solution especially since there is no interface/API for clients to do anything with an ASDF operation! It might, however, be a small step in the right direction.
An alternative solution would be to provide a :stream or :filename init argument for the test-op operation class and bind a dynamic variable around every perform, making the stream or filename available for writing....
Why serialize the data? Could we design a structured API to be used by other tools?
...
Of course ASDF doesn't need to reinvent testing; there are plenty of existing frameworks to choose from. http://www.cliki.net/test_framework
What I am after is an ASDF test-op that will adapt as well as possible to the widest possible set of testing frameworks. I agree that we should not attempt to reinvent testing.
That is why I have been suggesting that we provide a test operation that binds a stream --- because most of the test frameworks I have worked with provide a test report, rather than returning results.
I don't believe that having the test-op return a result will work, given the current ASDF execution model.
...
Given that, how about deprecating/removing test-op? Then people can continue to simply load sysname-test.asd and run tests however they want.
I see value in creating a widely accepted testing framework. I don't see the benefit in simply binding a stream variable.
Let me clarify. I /do/ see the value of providing a test-op.
We have our own testing framework at my company, NST (see presentation hosted at tc-lispers.org for more information). We have provided a back-end for it that hooks into ASDF, so that people can do
(test-system <foo>)
and our NST tests will run and a report will be printed.
This isn't fully ideal, though, since the results of tests will be mingled in the console output together with compilation results, which makes finding the test outputs a little more difficult than it should be. So I think providing a standardized output stream would be helpful, and probably generally helpful, since no matter what your unit-testing library, if it writes a report to the console, and is invoked from ASDF, it's likely to suffer from the same problem of output interleaving.
I don't think that ASDF needs to provide a test framework; all it needs to do is to provide hooks for invoking existing test framework(s).
Best, r
That is why I have been suggesting that we provide a test operation that binds a stream --- because most of the test frameworks I have worked with provide a test report, rather than returning results.
fyi, stefil returns a CLOS object containing the test results (and provides slime inspector customizations to present it specially in the inspector).
as it doesn't work too well through asdf:test-op, it also provides a print-object method that prints a minimal text representation of the data.
when test-op'ed, we print the result object to *standard-output*. when used interactively, the test defun that was used to start the testing simply returns the result value which we inspect in slime when needed.
Attila Lendvai wrote:
That is why I have been suggesting that we provide a test operation that binds a stream --- because most of the test frameworks I have worked with provide a test report, rather than returning results.
fyi, stefil returns a CLOS object containing the test results (and provides slime inspector customizations to present it specially in the inspector).
as it doesn't work too well through asdf:test-op, it also provides a print-object method that prints a minimal text representation of the data.
when test-op'ed, we print the result object to *standard-output*. when used interactively, the test defun that was used to start the testing simply returns the result value which we inspect in slime when needed.
Do you see the same problems I see? I find that my test's output can get interleaved with irrelevant chaff. So some of our code has something like this:
    (defmethod perform :around ((op our-test-op) c)
      (let ((<special variable> (test-stream op)))
        (call-next-method)))
It's actually far messier than this, because the <special variable>'s package needs to be wrangled, and I needed to do fussy things with make-sub-operation (is there some reason this is not a generic function?) to make sure that our-test-op gets propagated through traverse.
Actually, I tell a lie. Getting make-sub-operation to do the right thing was so hard, that I ended up doing something horrible with CHANGE-CLASS. The resulting code was not suitable for sensitive eyes.
when test-op'ed, we print the result object to *standard-output*. when used interactively, the test defun that was used to start the testing simply returns the result value which we inspect in slime when needed.
Do you see the same problems I see? I find that my test's output can get interleaved with irrelevant chaff. So some of our code has something like this:
i think we don't have such problems because in stefil tests are expanded into augmented defun's, so no compilation happens while the tests run (unless compilation at each run is requested for some tests, but that's very rare).
one exception is when the query compiler tests run in cl-perec. then we do get some interleaved compiler and test output, but the test output is only one char per test, so we don't mind so much...
what i do mind though, is that an (asdf:oos 'asdf:test-op :foo) does not return the result value of the test run, which could be inspected in slime.
Robert Goldman writes:
An alternative solution would be to provide a :stream or :filename init argument for the test-op operation class and bind a dynamic variable around every perform, making the stream or filename available for writing....
To me, the most interesting advantage that I see in ASDF providing a test operation, is that it should allow for automatic testing of arbitrary software packages.
I do not see how providing a stream argument is relevant to that. Or do you propose that people should /parse/ a test framework's output?
-T.
On Thu, Oct 8, 2009 at 12:16 PM, Tobias C. Rittweiler tcr@freebits.de wrote:
To me, the most interesting advantage that I see in ASDF providing a test operation, is that it should allow for automatic testing of arbitrary software packages.
Indeed! See http://ecls.sourceforge.net/logs_lib.html
I do not see how providing a stream argument is relevant to that. Or do you propose that people should /parse/ a test framework's output?
Please, please do not force that upon us. It would be so EASY to ask each tester to just return a sequence or number with the tests that failed!
Juanjo
The problem is that the ASDF framework does not lend itself to returning things from ops (particularly if you need to roll up subsidiary results from sub-operations).
On Thu, 8 Oct 2009 12:18:50 +0200, Juan Jose Garcia-Ripoll juanjose.garciaripoll@googlemail.com wrote:
On Thu, Oct 8, 2009 at 12:16 PM, Tobias C. Rittweiler tcr@freebits.de wrote:
To me, the most interesting advantage that I see in ASDF providing a test operation, is that it should allow for automatic testing of arbitrary software packages.
Indeed! See http://ecls.sourceforge.net/logs_lib.html
I do not see how providing a stream argument is relevant to that. Or do you propose that people should /parse/ a test framework's output?
Please, please do not force that upon us. It would be so EASY to ask each tester to just return a sequence or number with the tests that failed!
Juanjo
2009/10/8 Tobias C. Rittweiler tcr@freebits.de:
To me, the most interesting advantage that I see in ASDF providing a test operation, is that it should allow for automatic testing of arbitrary software packages.
I do not see how providing a stream argument is relevant to that. Or do you propose that people should /parse/ a test framework's output?
I mostly agree with this.
If you are running your own tests, (TEST-SYSTEM :FOO) is not a huge improvement over (LOAD-SYSTEM :FOO-TESTS) + (FOO-TESTS:RUN-FOO-TESTS), or whatever the call to run your tests is.
It's interesting for running tests set up by other people, in which case the failure behaviour should be unambiguous -- be it error-on-failure, an object with a PRINT-OBJECT method that makes its meaning obvious, or whatever. Output during the operation does not qualify as unambiguous to me, particularly since I might be redirecting output...
That said, I don't see the ability to capture test output as contrary to this, just orthogonal.
Cheers,
-- Nikodemus
No, nobody's supposed to parse it. It's just so that, if you want, you can look at *only* the output of the tester, not the output of the tester mooshed together with the output of the compiler.
Our tester does a bunch of compilation and, especially on SBCL, the test output proper gets mixed together with lots of chaff, so it's hard to find, e.g., which tests are failing...
On Thu, 08 Oct 2009 12:16:48 +0200, Tobias C. Rittweiler tcr@freebits.de wrote:
Robert Goldman writes:
An alternative solution would be to provide a :stream or :filename init argument for the test-op operation class and bind a dynamic variable around every perform, making the stream or filename available for writing....
To me, the most interesting advantage that I see in ASDF providing a test operation, is that it should allow for automatic testing of arbitrary software packages.
I do not see how providing a stream argument is relevant to that. Or do you propose that people should /parse/ a test framework's output?
-T.
Robert Goldman writes:
- Returning a single operation isn't enough, is it? For example, if I
have system X, with sub-systems A, B, and C, I may be testing A, B, and C, so my traversal would have to gather up the three subsidiary test-op entities and either package them into the parent test-op object, or rip the test results out of them and push them into the parent test-op object. I don't believe the plan-then-execute logic of ASDF makes this easy, but I may be wrong.
Does performing TEST-OP on a system really result in testing all the system's dependencies? Or did you mean something else?
-T.
Tobias C. Rittweiler wrote:
Robert Goldman writes:
- Returning a single operation isn't enough, is it? For example, if I
have system X, with sub-systems A, B, and C, I may be testing A, B, and C, so my traversal would have to gather up the three subsidiary test-op entities and either package them into the parent test-op object, or rip the test results out of them and push them into the parent test-op object. I don't believe the plan-then-execute logic of ASDF makes this easy, but I may be wrong.
Does performing TEST-OP on a system really result in testing all the system's dependencies? Or did you mean something else?
We often have large structured systems where testing system X is done by testing subsidiary systems A, B, and C that X depends on.
Consider, for example, if one were to write a test-op for CLSQL. One might then have subsidiary systems for the various DB backends, and one would have the test-op for CLSQL run the test-op on each backend (or some subset of the backends that are turned on).
best, r
Robert Goldman writes:
Tobias C. Rittweiler wrote:
Robert Goldman writes:
- Returning a single operation isn't enough, is it? For example, if I
have system X, with sub-systems A, B, and C, I may be testing A, B, and C, so my traversal would have to gather up the three subsidiary test-op entities and either package them into the parent test-op object, or rip the test results out of them and push them into the parent test-op object. I don't believe the plan-then-execute logic of ASDF makes this easy, but I may be wrong.
Does performing TEST-OP on a system really result in testing all the system's dependencies? Or did you mean something else?
We often have large structured systems where testing system X is done by testing subsidiary systems A, B, and C that X depends on.
Consider, for example, if one were to write a test-op for CLSQL. One might then have subsidiary systems for the various DB backends, and one would have the test-op for CLSQL run the test-op on each backend (or some subset of the backends that are turned on).
So CLSQL's method on TEST-OP should perform tests on each backend and then merge the results of the subsystems into one result, shouldn't it?
-T.
Tobias C. Rittweiler wrote:
Robert Goldman writes:
Tobias C. Rittweiler wrote:
Robert Goldman writes:
- Returning a single operation isn't enough, is it? For example, if I
have system X, with sub-systems A, B, and C, I may be testing A, B, and C, so my traversal would have to gather up the three subsidiary test-op entities and either package them into the parent test-op object, or rip the test results out of them and push them into the parent test-op object. I don't believe the plan-then-execute logic of ASDF makes this easy, but I may be wrong.
Does performing TEST-OP on a system really result in testing all the system's dependencies? Or did you mean something else?
We often have large structured systems where testing system X is done by testing subsidiary systems A, B, and C that X depends on.
Consider, for example, if one were to write a test-op for CLSQL. One might then have subsidiary systems for the various DB backends, and one would have the test-op for CLSQL run the test-op on each backend (or some subset of the backends that are turned on).
So CLSQL's method on TEST-OP should perform tests on each backend and then merge the results of the subsystems into one result, shouldn't it?
That's one way of doing it, I suppose. We typically do something like:
(:in-order-to ((test-op (test-op comp1)) (test-op (test-op comp2)) ...))
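Spelled out as a complete (hypothetical) system definition -- X, COMP1, and COMP2 are placeholder names -- that pattern is roughly:

    (asdf:defsystem :x
      :depends-on (:comp1 :comp2)
      :in-order-to ((test-op (test-op :comp1)
                             (test-op :comp2))))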
Yes, one can work around that, and hand-code more, but I'd prefer not to.
OTOH, I'm coming to the conclusion that ASDF simply doesn't offer a good enough solution to this problem, and fixing it in ASDF won't benefit enough ASDF users. The different use patterns, and different underlying test libraries seem to diverge enough that we should all fix this ourselves.
best, r
Maybe ASDF is the wrong place to try to standardize testing infrastructure?
I mean, maybe instead the authors of various test infrastructures should have a common list where they discuss interoperability, reporting, and a declarative way of specifying dependencies between test suites, between files and test suites, etc.?
Since for instance XCVB doesn't have testing support yet, but testing support is amongst our eventual goals, I would be eager to have feedback on how a new build system could "do things right" in this regard -- or perhaps avoid doing anything, and just plugging into something else.
Faré wrote:
Maybe ASDF is the wrong place to try to standardize testing infrastructure?
This is the conclusion I have reached, as well. I was hoping that some very weak standard could be arrived at that would make the test-op more generally useful to people installing systems, so that they could simply run the test-op and easily tell whether or not the test operation was successful.
However, it may be that it's just a combination of features of our test framework and the way we have built our systems that makes it difficult for us (the original problem --- we find that we get compilation and loading output mixed together with the test output), so this is not a place where the ASDF community can readily get consensus.
So please consider the suggestion dropped.
I may try to come back to the list with a proposal for modifying the MAKE-SUB-OPERATION function so that it makes specializing operations (including the test-op) easier. That would be a more general proposal that might make many things easier for ASDF system specifiers.
I mean, maybe instead the authors of various test infrastructures should have a common list where they discuss interoperability, reporting, and a declarative way of specifying dependencies between test suites, between files and test suites, etc.?
Since for instance XCVB doesn't have testing support yet, but testing support is amongst our eventual goals, I would be eager to have feedback on how a new build system could "do things right" in this regard -- or perhaps avoid doing anything, and just plugging into something else.
Let me suggest a few desiderata:
1. Make it possible for operation performance to return values. Think about what the types of these values should be while designing this facility.
2. Provide a standard means for specifying how sub-operation values should roll up to the values of their parents (a sketch follows this list).
3. If you adopt a plan-then-execute design, think about how to keep the (direct) child operations of a parent within some kind of procedure-call envelope (dynamic extent).
4. Deal with the difference between how different operations should handle values and failure. E.g., a test operation should simply accumulate failures. A build operation should almost always fail at the first point of failure.
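By way of illustration only (none of these names exist in ASDF, and the operation classes are stand-ins), desiderata 2 and 4 might be captured by a single generic function that each operation class specializes:

    (defclass abstract-operation () ())
    (defclass test-operation (abstract-operation) ())
    (defclass build-operation (abstract-operation) ())

    (defgeneric roll-up-results (operation child-results)
      (:documentation
       "Combine the values returned by OPERATION's child operations into the
    value(s) OPERATION itself should return."))

    (defmethod roll-up-results ((operation test-operation) child-results)
      ;; Desideratum 4: a test operation accumulates failures rather than
      ;; stopping at the first one.
      (reduce #'append child-results :initial-value '()))

    (defmethod roll-up-results ((operation build-operation) child-results)
      ;; A build operation succeeds only if every child succeeded; in practice
      ;; it would usually have signalled an error at the first failure.
      (every #'identity child-results))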
Hope that helps,
r
Robert Goldman writes:
Faré wrote:
Maybe ASDF is the wrong place to try to standardize testing infrastructure?
This is the conclusion I have reached, as well. I was hoping that some very weak standard could be arrived at that would make the test-op more generally useful to people installing systems, so that they could simply run the test-op and easily tell whether or not the test operation was successful.
However, it may be that it's just a combination of features of our test framework and the way we have built our systems that makes it difficult for us (the original problem --- we find that we get compilation and loading output mixed together with the test output), so this is not a place where the ASDF community can readily get consensus.
So please consider the suggestion dropped.
Please excuse me, but I cannot make my peace with this conclusion.
The lost utility is far too great to let this rest:
Juanjo's automatic regression testing work to check for regressions in ECL is dead on, and I'm sure other implementors would be interested in such infrastructure as well.
The same is true for D Herring and his LibCL project which looks very promising, but is also a candidate for automatic regression testing. I think that Daniel is the right guy for the job and I hope he'll have the necessary perseverance for what he has started.
I have to excuse myself again because I'm not familiar with ASDF in any great detail; on the other hand, ASDF -- unlike most other Lisp projects -- has over half a dozen people who do know it pretty well, so I cannot believe that we cannot come up with some solution.
In particular because it's my impression that the problems have been over-stated.
What's wrong with something along the lines of defining a class OPERATION-RESULT which consists of a single flag SUCCESSP, making each OPERATION store such an object in a RESULT slot? Then define an :AROUND method on PERFORM which does
    (setf (operation-result op)
          (multiple-value-call #'make-operation-result op (call-next-method)))
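Filling in the blanks of that sketch -- OPERATION-RESULT, MAKE-OPERATION-RESULT, the RESULT slot, and the RECORDING-OPERATION mixin are all invented here, and the slot is added by subclassing rather than by modifying ASDF's OPERATION class:

    (defclass operation-result ()
      ((successp :initarg :successp :reader operation-result-successp)
       (returned-values :initarg :returned-values
                        :reader operation-result-values)))

    (defun make-operation-result (op &rest returned-values)
      (declare (ignore op))
      (make-instance 'operation-result
                     :successp (and (first returned-values) t)
                     :returned-values returned-values))

    (defclass recording-operation (asdf:operation)
      ((result :accessor operation-result :initform nil)))

    (defmethod asdf:perform :around ((op recording-operation) component)
      (declare (ignore component))
      ;; Record whatever the primary PERFORM method returned; SETF also
      ;; returns the result object, so PERFORM itself now yields it.
      (setf (operation-result op)
            (multiple-value-call #'make-operation-result op (call-next-method))))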
Tobias C. Rittweiler wrote:
Robert Goldman writes:
Faré wrote:
Maybe ASDF is the wrong place to try to standardize testing infrastructure?
This is the conclusion I have reached, as well. I was hoping that some very weak standard could be arrived at that would make the test-op more generally useful to people installing systems, so that they could simply run the test-op and easily tell whether or not the test operation was successful.
However, it may be that it's just a combination of features of our test framework and the way we have built our systems that makes it difficult for us (the original problem --- we find that we get compilation and loading output mixed together with the test output), so this is not a place where the ASDF community can readily get consensus.
So please consider the suggestion dropped.
Please excuse me, but I cannot make my peace with this conclusion.
The lost utility is far too great to let this rest:
Juanjo's automatic regression testing work to check for regressions in ECL is dead on, and I'm sure other implementors would be interested in such infrastructure as well.
The same is true for D Herring and his LibCL project which looks very promising, but is also a candidate for automatic regression testing. I think that Daniel is the right guy for the job and I hope he'll have the necessary perseverance for what he has started.
I have to excuse myself again because I'm not familiar with ASDF in any great detail; on the other hand, ASDF -- unlike most other Lisp projects -- has over half a dozen people who do know it pretty well, so I cannot believe that we cannot come up with some solution.
In particular because it's my impression that the problems have been over-stated.
I think there are some problems which are real, but perhaps not insuperable:
1. Not all regression test frameworks are functional, returning values. Some write reports instead.
2. One needs to come up with a means of combining operation results that takes into account the structure of the plan that traverse produces and that operate then executes. If in order to test-op X, I must test-op A and B, how do I combine together the test results from A, B, and X into a top-level operation result. Perhaps OPERATION-ANCESTOR can be pressed into service.
3. Is there some way to do this such that the regression-testing framework need not be made aware of ASDF? Ideally, ASDF and the regression-testing framework would be defined independently, and some additional combination code would be defined that would link the two together, and that would be loaded downstream. I'd be reluctant to see a solution that demanded our regression-testing frameworks become dependent on ASDF. I believe that this problem could be easily circumvented by defining new systems like ASDF-FIVEAM, ASDF-NST, ASDF-STEFIL. At my company, we have already been working on ASDF-NST, but it has not reached a fully satisfactory state.
best, r
On Mon, Oct 19, 2009 at 12:00 AM, Robert Goldman rpgoldman@sift.info wrote:
In particular because it's my impression that the problems have been over-stated.
I think so. And this is indicated by the following paragraphs
- Not all regression test frameworks are functional, returning values.
Some write reports instead.
That does not mean anything. ASDF may impose one behavior w.r.t. the output and the package maintainer may decide to output T, saying "hey, there might be errors, but I do not know." Later on, other software and test suites will catch up. With RT, for instance, it is trivial to grab the number of failed tests.
- One needs to come up with a means of combining operation results
that takes into account the structure of the plan that traverse produces and that operate then executes. If in order to test-op X, I must test-op A and B, how do I combine together the test results from A, B, and X into a top-level operation result. Perhaps OPERATION-ANCESTOR can be pressed into service.
You are just imposing too much complexity. If I want to test package CL-UNICODE, I do not want to test FLEXI-STREAMS or U-SOCKETS. Tests should be atomic and not generate a tree of actions the way ASDF does for everything.
- Is there some way to do this such that the regression-testing
framework need not be made aware of ASDF?
No, there is none. Regression tests are libraries which are used by people using ASDF. Eventually they may provide their own operations, extending what ASDF provides, but currently it is better to have some bare minimum from ASDF and let each package maintainer decide how they compute the success of their test suite.
Blocking this development just because there are 5 test suites and you do not know how to combine them with ASDF is really absurd. ASDF's specifications cannot depend on what your company or other people's companies are setting up for their workflow.
Juanjo
Juan Jose Garcia-Ripoll writes:
On Mon, Oct 19, 2009 at 12:00 AM, Robert Goldman wrote:
[for the record; the topmost paragraph came from me:]
In particular because it's my impression that the problems have been over-stated.
I think so. And this is indicated by the following paragraphs
- Not all regression test frameworks are functional, returning values.
Some write reports instead.
That does not mean anything. ASDF may impose one behavior w.r.t. the output and the package maintainer may decide to output T, saying "hey, there might be errors, but I do not know." Later on, other software and test suites will catch up. With RT, for instance, it is trivial to grab the number of failed tests.
- One needs to come up with a means of combining operation results
that takes into account the structure of the plan that traverse produces and that operate then executes. If in order to test-op X, I must test-op A and B, how do I combine together the test results from A, B, and X into a top-level operation result. Perhaps OPERATION-ANCESTOR can be pressed into service.
You are just imposing too much complexity. If I want to test package CL-UNICODE, I do not want to test FLEXI-STREAMS or U-SOCKETS. Tests should be atomic and not generate a tree of actions the way ASDF does for everything.
As an interim solution, there can be some merge operation which, for the moment, just bails out if multiple results really have to be combined.
- Is there some way to do this such that the regression-testing
framework need not be made aware of ASDF?
No, there is none. Regression tests are libraries which are used by people using ASDF. Eventually they may provide their own operations, extending what ASDF provides, but currently it is better to have some bare minimum from ASDF and let each package maintainer decide how they compute the success of their test suite.
Right.
In particular: between ASDF and the regression test framework always stands a _human being_, the maintainer of the software package. It's his duty to interconnect his choice of third-party test framework with ASDF -- that is if he really wants to. If he doesn't, well, he just shouldn't.
Luckily, existing practice does not matter much in this case. It's about meta-information that nobody can depend on at the moment because it does not exist. And if it exists, it'll matter to people who care and will do the right thing. (Which may include fixing other people's code; but see that it's in _their_ own interests to do so, so they will.)
Blocking this development just because there are 5 test suites and you do not know how to combine them with ASDF is really absurd. ASDF's specifications cannot depend on what your company or other people's companies are setting up for their workflow.
Blocking is way too strong a word. After all, both of us want to piggy-back onto work of others. We can just hope to make our case clear to those who contribute code---or contribute ourselves.
-T.
On Mon, Oct 19, 2009 at 11:37 AM, Tobias C. Rittweiler tcr@freebits.de wrote:
Juan Jose Garcia-Ripoll writes:
Blocking this development just because there are 5 test suites and you do not know how to combine them with ASDF is really absurd. ASDF's specifications cannot depend on what your company or other people's companies are setting up for their workflow.
Blocking is way too strong a word. After all, both of us want to piggy-back onto work of others. We can just hope to make our case clear to those who contribute code---or contribute ourselves.
I am really sorry if I was too rude in the wording. Subtleties are not my strong point.
What I really meant is that getting stalled trying to match everybody's needs by combining ASDF with 5 test modules, several build systems, and the uses and abuses of tens of Common Lisp developers is not rational. One should break the ice by including a bare minimum.
Suppose one defines TEST-OP's perform method as returning (VALUES x y), where:
- x = NIL signals failure
- x = T signals success
- y = an optional sequence of objects representing the test failures
Different test suites may create their own operation, RT:RT-TEST-OP for instance, which runs all tests and returns the appropriate output. But at the very least the developer may define his own PERFORM method with an EQL specialization which applies to his own system.
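Under that convention, a system maintainer's method might look roughly like the following; MY-SYSTEM and RUN-ALL-TESTS are placeholders, and only the (VALUES x y) protocol is the point:

    (defmethod asdf:perform ((op asdf:test-op)
                             (system (eql (asdf:find-system :my-system))))
      ;; The test runner is looked up at run time so this method can live in
      ;; the .asd file before the test package exists; RUN-ALL-TESTS is
      ;; assumed to return a list of objects describing the failed tests.
      (let ((failures (funcall (intern "RUN-ALL-TESTS" :my-system-tests))))
        (values (null failures)   ; x: T on success, NIL on failure
                failures)))       ; y: sequence describing the failures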
I would do it myself, but I have had problems with ASDF's internals in the past and I still do not understand how it really works or the logic behind the internal functions -- as should be pretty evident from my continuous questions about how best to rewrite our current code for ECL/ASDF integration.
Juanjo
Juan Jose Garcia-Ripoll wrote:
On Mon, Oct 19, 2009 at 12:00 AM, Robert Goldman rpgoldman@sift.info wrote:
In particular because it's my impression that the problems have been over-stated.
I think so. And this is indicated by the following paragraphs
[..snip..]
- One needs to come up with a means of combining operation results
that takes into account the structure of the plan that traverse produces and that operate then executes. If in order to test-op X, I must test-op A and B, how do I combine together the test results from A, B, and X into a top-level operation result. Perhaps OPERATION-ANCESTOR can be pressed into service.
You are just imposing too much complexity. If I want to test package CL-UNICODE, I do not want to test FLEXI-STREAMS or U-SOCKETS. Tests should be atomic and not generate a tree of actions the way ASDF does for everything.
Please see earlier discussion about this topic.
The point I made there is that systems may have COMPONENT subsystems such that you want to test the entire system together.
Consider, e.g., a DB library with multiple backends, each described in a separate system. In order to do the test-op on the DB library you want to do the test on all active backends. Similarly, McCLIM has multiple graphics display backends, not all of which are loaded at a given time. I work actively on three (or four, depending on how you count) large CL-based applications. Each one of them is made up of multiple ASDF subsystems.
The example of CL-UNICODE is a strawman, because it is the case of testing a system and its libraries. There is also the case of testing a system and its subsystems, which is the more interesting one.
For that matter, though, if I am a CL-UNICODE /user/, I may very well want to test to see if it will work in its current installation, in which case I /do/ want to test FLEXI-STREAMS and U-SOCKETS, because I want to know whether my installation works.
[..snip..]
Robert Goldman writes:
Juan Jose Garcia-Ripoll wrote:
You are just imposing too much complexity. If I want to test package CL-UNICODE, I do not want to test FLEXI-STREAMS or U-SOCKETS. Tests should be atomic and not generate a tree of actions the way ASDF does for everything.
Please see earlier discussion about this topic.
The point I made there is that systems may have COMPONENT subsystems such that you want to test the entire system together.
Consider, e.g., a DB library with multiple backends, each described in a separate system. In order to do the test-op on the DB library you want to do the test on all active backends. Similarly, McCLIM has multiple graphics display backends, not all of which are loaded at a given time. I work actively on three (or four, depending on how you count) large CL-based applications. Each one of them is made up of multiple ASDF subsystems.
Yes, that is a valid use case. No one disputes that.
What if we start with the simple case of carrying only one bit of meta-information (successp)? Merging is then nothing more than ORing the results together.
This can be done in a way that satisfies current needs without restricting future improvement (which can then be based on actual experience in the field).
Notice that whatever we come up with does not necessarily have to be part of the exported API from the beginning but can be declared experimental. We cannot, however, gain experience without trying.
The example of CL-UNICODE is a strawman, because it is the case of testing a system and its libraries. There is also the case of testing a system and its subsystems, which is the more interesting one.
For that matter, though, if I am a CL-UNICODE /user/, I may very well want to test to see if it will work in its current installation, in which case I /do/ want to test FLEXI-STREAMS and U-SOCKETS, because I want to know whether my installation works.
Very true. It's related to a previous inquiry of mine about the recursiveness of :force. But it's a slightly different matter, and I'd rather reach agreement on the issue above first.
-T.
Tobias C. Rittweiler wrote:
Robert Goldman writes:
Juan Jose Garcia-Ripoll wrote:
You are just imposing too much complexity. If I want to test package CL-UNICODE, I do not want to test FLEXI-STREAMS or U-SOCKETS. Tests should be atomic and not generate a tree of actions the way ASDF does for everything.
Please see earlier discussion about this topic.
The point I made there is that systems may have COMPONENT subsystems such that you want to test the entire system together.
Consider, e.g., a DB library with multiple backends, each described in a separate system. In order to do the test-op on the DB library you want to do the test on all active backends. Similarly, McCLIM has multiple graphics display backends, not all of which are loaded at a given time. I work actively on three (or four, depending on how you count) large CL-based applications. Each one of them is made up of multiple ASDF subsystems.
Yes, that is a valid use case. No one disputes that.
What if we start with the simple case of carrying only one bit of meta-information (successp)? Merging is then nothing more than ORing the results together.
This can be done in a way that satisfies current needs without restricting future improvement (which can then be based on actual experience in the field).
Actually, I don't think that the hard part is deciding how to merge things together. I agree with you that the merge operation can be made a generic function, solving this problem.
The greater problem, I believe, is determining /what/ operations to merge together.
Consider the case of a system, X, on which I invoke the TEST-OP. Doing so can create a plan (through TRAVERSE) which contains an arbitrary number of operations necessary to perform TEST-OP on X, including potentially many LOAD-OPs and COMPILE-OPs. How many of these operations need to be inspected for success or failure?
One might think "all and only the test-ops", but I can say from sad experience that this is not so --- a large program on which I am now working came to a halt for several hours because it did TEST-OP, and the TEST-OP didn't yield a report because some subsystem failed to build. We spent quite a while looking for the test-op's output which we, of course, did not find. So certainly, one should make sure that the test-op handles this properly. For interactive use, this is not a problem --- if the load fails, an exception will be raised. But for NON-interactive use (in our case on a build-and-test server) it's not so obvious.... A build-and-test server is very handy on medium-to-large projects to verify that someone committing a patch is not breaking the build.
As I said, perhaps the ancestor links among operations may give a solution to this problem: I simply haven't had time to investigate thoroughly.
I will try to get some time in the near future to discuss this matter with my colleague, John Maraist, who has done the most recent work on integrating the NST unit test library and ASDF. I will report back anything useful I learn.
best, Robert