On Sun, 28 Sep 2008 21:31:05 +0200 Edi Weitz edi@agharta.de wrote:
On Sun, 28 Sep 2008 14:15:40 -0500, "Matthew D. Swank" akopa.gmane.poster@gmail.com wrote:
I tried using a construct like `(:sequence :start-anchor (:regex ,regex)), where regex is a PCRE string, but matching still takes forever (I gave up after 10 minutes) when slurping a moderately sized file (400k). Note that matching works fine for files under 1k, or if I break the input up into lines for line-oriented processing.
Show us the regex you were using and some test data and then maybe we can help you to optimize it.
I suppose you read this?
http://weitz.de/cl-ppcre/#blabla
Edi.
Well, the regexes are defined in the lexers in this file: http://common-lisp.net/~mswank/apache-ppcre.lisp
The lexer API is in this file: http://common-lisp.net/~mswank/cl-ppcre-lexer.lisp
Finally, the log file I'm lexing: http://lcpug.asternix.com/pub/Main/ApacheLogProject/access.log
Compare

(with-open-file (in "access.log")
  (let ((foo (stream-gen *apache-pcrelex-line* in)))
    (time (loop :for x := (funcall foo)
                :unless x :return nil))))

with

(with-open-file (in "access.log")
  (let ((foo (stream-gen *apache-pcrelex* in)))
    (time (loop :for x := (funcall foo)
                :unless x :return nil))))
When I slurp the entire file into a string, each match seems to take about a tenth of a second per token.
Matt
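For anyone trying to reproduce the whole-file case outside the lexer, here is a minimal sketch, assuming CL-PPCRE is already loaded. The helpers slurp-file and anchored-scanner are hypothetical names, not part of cl-ppcre-lexer.lisp, and the regex string passed in is only a placeholder. It reads the file into one string and builds the anchored parse tree described above, so that a single match attempt can be timed on its own.

```lisp
;; Hypothetical helper (not from the lexer files): read a whole file
;; into a single string.
(defun slurp-file (path)
  (with-open-file (in path)
    (let* ((buffer (make-string (file-length in)))
           (end (read-sequence buffer in)))
      ;; FILE-LENGTH can exceed the number of characters actually read
      ;; (e.g. multi-byte encodings), so trim to what was filled in.
      (subseq buffer 0 end))))

;; Hypothetical helper: wrap a PCRE string in the anchored parse tree
;; from the original message and compile it once.
(defun anchored-scanner (regex-string)
  (cl-ppcre:create-scanner
   `(:sequence :start-anchor (:regex ,regex-string))))

;; Usage sketch: time one match attempt against the slurped file.
;; The regex "[A-Z]+" is a placeholder, not one of the lexer's patterns.
;; (let ((text (slurp-file "access.log"))
;;       (scanner (anchored-scanner "[A-Z]+")))
;;   (time (cl-ppcre:scan scanner text)))
```

Timing a single scan like this separates the cost of the regex itself from whatever the lexer does between tokens, which may help narrow down where the time goes.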