Re: [asdf-devel] Performance hit in TRUENAMIZE
On Fri, Jul 16, 2010 at 12:54 AM, edgar <edgar-rft@web.de> wrote:
I think the main problem here is that some implementations (notably CLISP) distinguish between files and directories, so PROBE-FILE cannot be used to probe files AND directories.
Well, SBCL and ECL do not have that problem, AFAIK. But TRUENAMIZE also seems to be used with files, not directories, from what I could gather during the debugging of ASDF. I would abstract those calls into a function PROBE-FILE-AND-DIRECTORY and use PROBE-FILE internally in this function. The purpose would be clear and sensible implementations might profit from that -- maybe the terrible startup times for ASDF might be decreased this way.
Juanjo
-- Instituto de Física Fundamental, CSIC c/ Serrano, 113b, Madrid 28006 (Spain) http://juanjose.garciaripoll.googlepages.com
These are the differences in execution time for SBCL, comparing a call to PROBE-FILE with IGNORE-ERRORS + TRUENAME. The differences in ECL are more dramatic.
I attach a patch file that implements the change I suggested. Note that in order to produce this patch I had to add a workaround: ASDF calls truenamize with wildcard paths without really caring about where the wildcards are. My wrapper checks for those wildcards, refusing to produce a truename but without signaling an error, which is more efficient -- but it all looks fragile and not future-proof.
Juanjo
(defun test1 ()
  (dotimes (i 100000)
    (probe-file "Inexistent file")))
(defun test2 ()
  (dotimes (i 100000)
    (ignore-errors (truename "Inexistent file"))))
(time (test1))
Evaluation took:
  2.227 seconds of real time
  2.171720 seconds of total run time (1.786088 user, 0.385632 system)
  [ Run times consist of 0.028 seconds GC time, and 2.144 seconds non-GC time. ]
  97.53% CPU
  5,319,066,321 processor cycles
  319,996,672 bytes consed
(time (test2))
Evaluation took:
  3.371 seconds of real time
  3.303481 seconds of total run time (2.888650 user, 0.414831 system)
  [ Run times consist of 0.044 seconds GC time, and 3.260 seconds non-GC time. ]
  97.98% CPU
  8,055,475,317 processor cycles
  470,401,824 bytes consed
On Fri, Jul 16, 2010 at 11:35 PM, Juan Jose Garcia-Ripoll <juanjose.garciaripoll@googlemail.com> wrote:
On Fri, Jul 16, 2010 at 12:54 AM, edgar <edgar-rft@web.de> wrote:
I think the main problem here is that some implementations (notably CLISP) distinguish between files and directories, so PROBE-FILE cannot be used to probe files AND directories.
Well, SBCL and ECL do not have that problem, AFAIK. But TRUENAMIZE also seems to be used with files, not directories, from what I could gather during the debugging of ASDF. I would abstract those calls into a function PROBE-FILE-AND-DIRECTORY and use PROBE-FILE internally in this function. The purpose would be clear and sensible implementations might profit from that -- maybe the terrible startup times for ASDF might be decreased this way.
Juanjo
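The wildcard workaround described above can be sketched roughly as follows. This is only an illustration, not the code from the attached patch: the name PROBE-FILE-NO-WILD is hypothetical, and the sketch assumes an implementation where PROBE-FILE returns NIL for a missing file without signaling anything (IGNORE-ERRORS is kept only as a guard for implementations where PROBE-FILE itself may signal, for instance on directories).

```lisp
;; Hypothetical helper for illustration; not the wrapper from the patch.
(defun probe-file-no-wild (pathspec)
  "Return the truename of PATHSPEC if it names an existing file, else NIL.
Wildcard pathnames yield NIL instead of signaling an error."
  (let ((p (pathname pathspec)))
    (cond
      ;; PROBE-FILE and TRUENAME signal a FILE-ERROR on wild pathnames,
      ;; so reject them before touching the file system.
      ((wild-pathname-p p) nil)
      ;; Existence check first: in the common missing-file case no
      ;; condition is signaled. IGNORE-ERRORS is an assumed safety net
      ;; for implementations where PROBE-FILE can still signal.
      (t (ignore-errors (probe-file p))))))
```

With this sketch, (probe-file-no-wild "Inexistent file") and (probe-file-no-wild "*.lisp") should both return NIL without an error being signaled, while an existing file should yield its truename.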
Dear Juanjo,
thanks for your input and sorry for not looking at the details of your patch earlier.
I don't see what probe-file does that's different from (ignore-errors (truename ...)), but OK for the sake of performance I can see it being done.
Is CLISP the only one to error when probe-file'ing a directory? Can/should we always test for the wildcardness of the pathname?
Can you test/review the following patch based on yours? I intend to make it into ASDF 2.112.
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
If you make people think they're thinking, they'll love you; but if you really make them think, they'll hate you. -- Don Marquis
On Sun, Jul 25, 2010 at 12:48 AM, Faré <fahree@gmail.com> wrote:
I don't see what probe-file does that's different from (ignore-errors (truename ...)),
I explained it in the previous email. Determining whether a file exists has a very small cost: just a call to fstat() or whatever equivalent function the Common Lisp implementation provides. Say this has a cost of 1. This is followed by a call to TRUENAME, which has a cost N proportional to the number of components in the directory part of the pathname, but this cost only applies when the file exists. This is relevant for ASDF because 80% of the calls to TRUENAME may happen with non-existent paths.
Can you test/review the following patch based on yours? I intend to make it into ASDF 2.112.
Seems to work just fine.
Juanjo
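The cost argument above can be written out as a rough estimate. Assume, as in the message, that the existence check costs 1 and TRUENAME costs N, and let q be the fraction of calls made on non-existent paths (q is a symbol introduced here for illustration; the message suggests q may be around 0.8). Probing first and resolving the truename only for existing files then gives:

```latex
\mathbb{E}[\text{cost per call}] \;=\; 1 + (1 - q)\,N \;\approx\; 1 + 0.2\,N
\qquad \text{for } q = 0.8
```

Under these assumptions most calls pay only for the existence check, and the cost N is incurred only for the minority of paths that actually exist.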
Committed as 2.112.
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
An anarchist is a man who is careful to always use pedestrian crossings, because he utterly detests talking with policemen. -- Georges Brassens
On 25 July 2010 09:04, Juan Jose Garcia-Ripoll <juanjose.garciaripoll@googlemail.com> wrote:
On Sun, Jul 25, 2010 at 12:48 AM, Faré <fahree@gmail.com> wrote:
I don't see what probe-file does that's different from (ignore-errors (truename ...)),
I explained it in the previous email. Determining whether a file exists has a very small cost: just a call to fstat() or whatever equivalent function the Common Lisp implementation provides. Say this has a cost of 1. This is followed by a call to TRUENAME, which has a cost N proportional to the number of components in the directory part of the pathname, but this cost only applies when the file exists. This is relevant for ASDF because 80% of the calls to TRUENAME may happen with non-existent paths.
Can you test/review the following patch based on yours? I intend to make it into ASDF 2.112.
Seems to work just fine.
Juanjo
participants (2)
- Faré
- Juan Jose Garcia-Ripoll