I'm confused. I must be doing something wrong.
I have a string:
CL-USER> *str*
"1
2
3

4
"
Just to make sure it's really what it seems:
CL-USER> (loop for c across *str* do (format t "~S " c))
#\1 #\Newline #\2 #\Newline #\3 #\Newline #\Newline #\4 #\Newline
NIL
I wanted to match empty lines, so I did:
CL-USER> (cl-ppcre:regex-replace-all (cl-ppcre:create-scanner "^$" :multi-line-mode t) *str* "!")
"1
2
3
!!
4
!!"
Now, I would normally expect this:
"1
2
3
!
4
"
Playing with regex-coach indeed produces the result I'd normally expect. What am I doing wrong? (using CMUCL 19a, the testing version, and CL-PPCRE-0.7.7)
many thanks, --J.
On Tue, 13 Jul 2004 05:57:42 -0700, Jan Rychter jan@rychter.com wrote:
> I'm confused. I must be doing something wrong.
> I have a string:
> CL-USER> *str*
> "1
> 2
> 3
>
> 4
> "
> Just to make sure it's really what it seems:
> CL-USER> (loop for c across *str* do (format t "~S " c))
> #\1 #\Newline #\2 #\Newline #\3 #\Newline #\Newline #\4 #\Newline
> NIL
> I wanted to match empty lines, so I did:
> CL-USER> (cl-ppcre:regex-replace-all (cl-ppcre:create-scanner "^$" :multi-line-mode t) *str* "!")
> "1
> 2
> 3
> !!
> 4
> !!"
> Now, I would normally expect this:
> "1
> 2
> 3
> !
> 4
> "
> Playing with regex-coach indeed produces the result I'd normally expect. What am I doing wrong? (using CMUCL 19a, the testing version, and CL-PPCRE-0.7.7)
Yes, this looks like a bug. I'll try to fix this ASAP. Thanks for the report.
Cheers, Edi.
Should be fixed now. Please try.
> CL-USER> (cl-ppcre:regex-replace-all (cl-ppcre:create-scanner "^$" :multi-line-mode t) *str* "!")
It's shorter to write
(cl-ppcre:regex-replace-all "(?m)^$" *str* "!")
instead. This will also allow the compiler macro to compile the regex at load time.
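For regexes that are not literal strings at the call site, a similar effect can be had by building the scanner once and keeping it around. The following is only a sketch: it assumes CL-PPCRE is already loaded, and the variable and function names are made up for illustration.

```lisp
;; Sketch only; assumes CL-PPCRE is already loaded.
;; *EMPTY-LINE-SCANNER*, MARK-WITH-SCANNER and MARK-WITH-LITERAL are
;; hypothetical names, not part of CL-PPCRE.

;; Build the scanner once and reuse it on every call.
(defparameter *empty-line-scanner*
  (cl-ppcre:create-scanner "^$" :multi-line-mode t))

(defun mark-with-scanner (string)
  (cl-ppcre:regex-replace-all *empty-line-scanner* string "!"))

;; With a literal regex string the compiler macro can arrange for the
;; scanner to be created when the compiled code is loaded.
(defun mark-with-literal (string)
  (cl-ppcre:regex-replace-all "(?m)^$" string "!"))
```

Both functions should return the same string for the same input; only the time at which the scanner gets built differs.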
Cheers, Edi.
> Should be fixed now. Please try.
> > CL-USER> (cl-ppcre:regex-replace-all (cl-ppcre:create-scanner "^$" :multi-line-mode t) *str* "!")
Thank you -- indeed, it is fixed. It now produces:
JWR-TEST> (cl-ppcre:regex-replace-all (cl-ppcre:create-scanner "(?m)^$") *str* "!")
"1
2
3
!
4
!"
I guess it is debatable whether the last "!" should be there. Perl doesn't behave that way, but I guess it _is_ an empty line, now that I think of it. And I wanted to get "!" instead of empty lines. So it actually makes more sense than Perl.
> It's shorter to write
> (cl-ppcre:regex-replace-all "(?m)^$" *str* "!")
> instead. This will also allow the compiler macro to compile the regex at load time.
Nice, thanks!
--J.
On Tue, 13 Jul 2004 23:35:28 -0700, Jan Rychter jan@rychter.com wrote:
> I guess it is debatable whether the last "!" should be there. Perl doesn't behave that way, but I guess it _is_ an empty line, now that I think of it. And I wanted to get "!" instead of empty lines. So it actually makes more sense than Perl.
Hmmm, yes it seems to make more sense. On the other hand, I'm trying to be as close to Perl as possible. Do you see any pattern there? Any idea why Perl doesn't add the last exclamation mark?
Cheers, Edi.
"Edi" == Edi Weitz edi@agharta.de writes:
Edi> On Tue, 13 Jul 2004 23:35:28 -0700, Jan Rychter jan@rychter.com
Edi> wrote:
Edi> > I guess it is debatable whether the last "!" should be there. Perl doesn't behave that way, but I guess it _is_ an empty line, now that I think of it. And I wanted to get "!" instead of empty lines. So it actually makes more sense than Perl.
Edi> Hmmm, yes it seems to make more sense. On the other hand, I'm
Edi> trying to be as close to Perl as possible. Do you see any pattern
Edi> there? Any idea why Perl doesn't add the last exclamation mark?
Uh, well, hmm. I've tried reading "man perlre", but the part about \z, \Z and multiline strings gave me a headache.
I really have no idea why Perl doesn't treat the end of a string as an "$" in this case, because it certainly does so for other expressions (e.g. "^4$" _will_ match at the end of a multiline string ending in "...\n4"). I see no reason to treat a string ending in "\n$" (on UNIX) differently: "^$" should definitely match there, as a new line has begun, and ended, being empty.
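The two behaviors can be written down without any regex engine, which may make the pattern easier to see. The sketch below is plain Common Lisp with made-up function names. It assumes that Perl's multi-line "^" does not match after a newline that ends the string, which is how the PCRE documentation describes its own circumflex; treat that as an assumption rather than a statement about the Perl source.

```lisp
;; Sketch only: plain Common Lisp, no CL-PPCRE involved.
;; LINE-START-P, LINE-END-P and MARK-EMPTY-LINES are hypothetical names
;; that exist only to spell out the two anchor rules.

(defun line-start-p (string pos &key perl-style)
  "True if a multi-line ^ would match at POS in STRING."
  (or (zerop pos)
      (and (char= (char string (1- pos)) #\Newline)
           ;; Assumed Perl/PCRE rule: no match after a string-final newline.
           (not (and perl-style (= pos (length string)))))))

(defun line-end-p (string pos)
  "True if a multi-line $ would match at POS in STRING."
  (or (= pos (length string))
      (char= (char string pos) #\Newline)))

(defun mark-empty-lines (string &key perl-style)
  "Copy STRING, writing ! wherever ^ and $ both match at the same position."
  (with-output-to-string (out)
    (loop for pos from 0 to (length string)
          do (when (and (line-start-p string pos :perl-style perl-style)
                        (line-end-p string pos))
               (write-char #\! out))
             (when (< pos (length string))
               (write-char (char string pos) out)))))

(defparameter *sample* (format nil "1~%2~%3~%~%4~%"))

;; Without :PERL-STYLE the result ends in "!", like the fixed CL-PPCRE.
(mark-empty-lines *sample*)
;; With :PERL-STYLE T only the empty line in the middle is marked.
(mark-empty-lines *sample* :perl-style t)
```

Under that assumption the only difference between the two results is the position after the final newline: "$" matches there either way, but "^" does not, so "^$" cannot match.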
My suggestion would be to document this behavior. A brave soul could report this to the Perl people, but I seriously doubt they'd consider it a bug. It might be one of those DWIM things.
--J.
cl-ppcre-devel@common-lisp.net