Hello,
Not sure if this is the appropriate forum as the email is not related to the development of cl-ppcre, but I did not find a list for users. Please feel free to redirect me elsewhere.
I could use some help figuring out why this regexp is so slow. As far as I can tell, there is nothing abnormal about it. I currently use the same regexp in Python and it blazes through the input file. Bear in mind, this is the first time that I've used cl-ppcre. It was an experiment to see if I could use Lisp for this little application.
Here is the regexp (at least a small portion of it that exhibits the behavior I am seeing):
^(?:\S+ ){7}(\S+)\s+- commAlarm
Here is the input line it is matching against (note: this is a single line albeit a long one):
1105243660 11 Sun Jan 09 04:07:40 2005 sclax02.ibasis.net - commAlarm ovnyc00p.ov.i\vanet.net [1] private.enterprises.2496.1.1.5.5.1.0 (Integer): 0 [2] private.enterprises.\2496.1.1.5.5.2.0 (Integer): 115 [3] private.enterprises.2496.1.1.5.5.3.0 (OctetString): \ISUP: UNEX ANM [4] private.enterprises.2496.1.1.5.5.4.0 (OctetString): ISDN User Part Un\expected ANM [5] private.enterprises.2496.1.1.5.5.5.0 (Integer): 2 [6] private.enterpri\ses.2496.1.1.5.5.6.0 (Integer): 1 [7] private.enterprises.2496.1.1.5.5.7.0 (Integer): 1 \ [8] private.enterprises.2496.1.1.5.5.8.0 (Integer): 2 [9] private.enterprises.2496.1.1.\1.1.1.1.1.1.1.1376258 (Integer): 1376258 [10] private.enterprises.2496.1.1.1.1.1.1.1.1.2.1376258 (Integer): 21 [11] private.enterprises.2496.1.1.1.1.1.1.1.1.4.1376258 (OctetStr\ing): ss7path-att [12] private.enterprises.2496.1.1.1.1.1.1.1.1.5.1376258 (OctetString):\ SS7 Path For ATT and NGT DPC 5.21.39 [13] private.enterprises.2496.1.1.1.1.1.1.1.1.3.13\76258 (Integer): 1245188 [14] private.enterprises.2496.1.1.5.5.9.0 (Integer): 1105243880;1 .1.3.6.1.4.1.2496.1.1.4.1 0
Stuff 51 of those lines into a file and try to match on that regexp, and I get the following results:
PGW> (time (parse-file "/tmp/sample"))
Evaluation took:
  2.984 seconds of real time
  1.81 seconds of user run time
  1.12 seconds of system run time
  0 page faults and
  228,191,424 bytes consed.
I am hoping to parse a file that has close to 75,000 lines in that format. At this rate, I will never make it in a reasonable amount of time. Here is the PARSE-FILE function I am using:
(defun parse-file (file)
  (with-open-file (in file)
    (do ((line (read-line in nil :eof) (read-line in nil :eof)))
        ((eql line :eof) t)
      (do-register-groups ((#'intern host))
          ("^(?:\S+ ){7}(\S+)\s+- commAlarm" line)
        (format t "Found host ~A~%" host)))))
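A variant worth timing against the function above, offered only as a sketch and not as a confirmed fix: it builds the scanner once with CREATE-SCANNER instead of passing the regexp string on every line, skips INTERN, and collects the hosts rather than printing them. The names *ALARM-SCANNER* and PARSE-FILE-PRECOMPILED are made up for illustration. The backslashes are doubled here because the Lisp reader treats a single backslash inside a string literal as an escape character, so "\S" on its own reads as a plain "S".

```lisp
;; Sketch only: assumes cl-ppcre is already loaded.
;; *ALARM-SCANNER* and PARSE-FILE-PRECOMPILED are hypothetical names.
(defparameter *alarm-scanner*
  ;; Doubled backslashes: the reader turns "\\S" into the two characters \S.
  (cl-ppcre:create-scanner "^(?:\\S+ ){7}(\\S+)\\s+- commAlarm"))

(defun parse-file-precompiled (file)
  "Return a list of the host strings captured from the matching lines of FILE."
  (with-open-file (in file)
    (loop for line = (read-line in nil nil)
          while line
          ;; REGISTER-GROUPS-BIND returns NIL when the line does not match,
          ;; otherwise the value of its body (the captured host string).
          when (cl-ppcre:register-groups-bind (host)
                   (*alarm-scanner* line)
                 host)
            collect it)))
```

Wrapping (parse-file-precompiled "/tmp/sample") in TIME and comparing the seconds and bytes consed with the figures above would show whether scanner construction or INTERN contributes to the cost.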
I've also read the docs and I am unable to find anything to help identify the problem with my regexp. I have tried single-line mode; however, the results were very similar. My platform is SBCL 0.8.18.23 and version 1.0 of cl-ppcre.
Any help would be appreciated.
Thanks, Pete