Sorry if this is a trivial issue; I'm a Common Lisp newbie.
If I apply the following very simple command (replacing one or more consecutive CR chars with one LF char)
(cl-ppcre:regex-replace-all (concatenate 'string (string #\return) "+") mystring (string #\linefeed))
to my string of 455079 characters (loaded from a UTF-8 file), some of the last #\return characters are not substituted, even though they should be: if I apply the command again to the resulting string, they ARE substituted. It looks as if the search has some sort of length limit, or perhaps there is a string-length mistake connected to the multi-byte character representation?
Cheers.
Mario
This is an issue of the "This should not happen" variety. Certainly, there is no such limit in CL-PPCRE. If you could provide us (i.e. the mailing list) with a self-contained test case that demonstrates the problem in a reproducible way, I'll look into it. Please also make sure to let us know which Lisp on which OS you are using and which version of CL-PPCRE.
Thanks, Edi.
On Sat, Apr 9, 2011 at 5:41 PM, Mario Maio mario.maio@libero.it wrote:
> Sorry if this is a trivial issue, I'm a common lisp newbie. [...]
cl-ppcre-devel site list cl-ppcre-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/cl-ppcre-devel
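A self-contained test case of the kind requested might look like the sketch below. It is only an illustration: the helper names MAKE-TEST-STRING and CRS-LEFT-AFTER-REPLACE are made up for this example, the filler character and the spacing of the CR runs are arbitrary, and loading through Quicklisp is an assumption. It builds the string in memory rather than reading a UTF-8 file, so it does not exercise the file-decoding step.

```lisp
;; Assumption: Quicklisp is installed; any other way of loading CL-PPCRE works too.
(ql:quickload "cl-ppcre")

;; Hypothetical helper: build a string of LENGTH characters made of a
;; non-ASCII filler (e with grave accent) with a two-character CR run
;; every 100 characters, all the way to the end of the string.
(defun make-test-string (length)
  (let ((s (make-string length :initial-element (code-char #xE8))))
    (loop for i from 98 below (1- length) by 100
          do (setf (char s i) #\Return
                   (char s (1+ i)) #\Return))
    s))

;; Hypothetical helper: replace each run of CRs with one LF and count
;; the CR characters that survive the replacement.
(defun crs-left-after-replace (string)
  (count #\Return
         (cl-ppcre:regex-replace-all "\\r+" string (string #\Linefeed))))

;; Same length as the string in the report; a non-zero result would
;; reproduce the problem.
(crs-left-after-replace (make-test-string 455079))
```

Posting the result together with the output of (lisp-implementation-type), (lisp-implementation-version) and the CL-PPCRE version would cover the rest of the request.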
Well, I reinstalled my CLISP/Emacs/SLIME bundle, switching to Lisp Cabinet, and I was not able to replicate the problem, so everything's fine in that regard.
But I have another question: how do I enter Unicode chars in the regexp? For example, I need to replace "whatever" with “whatever”. I tried to replace
"([^"\r\n]*)"
with
\u201c\1\u201d
but it didn't work.
I know I could generate and concatenate Unicode chars with Lisp, e.g. (code-char #x201c), but it'd be cleaner to do it directly inside the regexp.
Thanks.
Mario
On 09/04/2011 18:13, Edi Weitz wrote:
> This is an issue of the "This should not happen" variety. [...]
On Thu, Apr 14, 2011 at 1:12 PM, Mario Maio mario.maio@libero.it wrote:
> But I have another question: how do I enter Unicode chars in the regexp? [...]
For a portable solution, you could give this a try:
Edi.
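One portable option that needs nothing beyond CL-PPCRE itself is to keep the regex as an ordinary Lisp string and build only the replacement string in Lisp, with CODE-CHAR supplying the curly quotes. The sketch below is a minimal illustration and not necessarily the solution being pointed to above; the function name CURL-QUOTES is made up for the example, and loading through Quicklisp is an assumption. As far as I know, CL-PPCRE's regex and replacement syntax has no \uXXXX escape, which would explain why the original attempt did not work.

```lisp
;; Assumption: Quicklisp is installed; any other way of loading CL-PPCRE works too.
(ql:quickload "cl-ppcre")

;; Hypothetical helper: turn "..." into curly-quoted text.
(defun curl-quotes (string)
  (cl-ppcre:regex-replace-all
   ;; In Lisp string syntax, \" is a literal double quote, and \\r, \\n
   ;; reach the regex parser as \r and \n.
   "\"([^\"\\r\\n]*)\""
   string
   ;; Replacement string: U+201C, then \1 (the first register), then U+201D.
   (format nil "~C\\1~C" (code-char #x201C) (code-char #x201D))))

(curl-quotes "say \"whatever\" now")
;; => "say “whatever” now"
```

The backslash doubling is the part that usually trips people up: the Lisp reader consumes one level of escaping before CL-PPCRE ever sees the pattern.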