Thanks very much for the property resolver suggestion. I think there was some feeling that using character classes would be messy. The property resolver approach also looks like it might allow the compiler to inline a custom matcher, provided the matcher carries the right declarations.
Thanks again.
bob
On Mon, Mar 12, 2012 at 11:18 AM, Edi Weitz edi@agharta.de wrote:
If they insist on using "\w", there's no portable way to change this except for patching the code.
Otherwise, they could of course use a character class or add their own property resolver.
Cheers, Edi.
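For readers following the thread, the two alternatives mentioned above (an explicit character class, or a custom property resolver) might look roughly like the sketch below. It assumes CL-PPCRE is already loaded and that the implementation uses ASCII-compatible character codes, as SBCL does. The names ASCII-WORD-CHAR-P, ASCII-WORD-RESOLVER, *ASCII-WORD-SCANNER* and the property name "AsciiWord" are made up for illustration and are not part of CL-PPCRE; only *PROPERTY-RESOLVER*, CREATE-SCANNER and SCAN come from the library.

```lisp
;; Assumes CL-PPCRE is already loaded, e.g. via (ql:quickload "cl-ppcre")
;; if Quicklisp is available.

;; Option 1: spell out the character class instead of writing \w.
(defun ascii-word-p/class (string)
  "True if STRING consists only of a-z, A-Z, 0-9 and underscore."
  (and (cl-ppcre:scan "^[A-Za-z0-9_]+$" string) t))

;; Option 2: a property resolver.
;; Hypothetical predicate implementing the PCRE-style ASCII meaning of \w.
(defun ascii-word-char-p (char)
  (or (char<= #\a char #\z)
      (char<= #\A char #\Z)
      (char<= #\0 char #\9)
      (char= char #\_)))

;; Hypothetical resolver: maps the made-up property name "AsciiWord"
;; to the predicate above and rejects every other name.
(defun ascii-word-resolver (name)
  (if (string= name "AsciiWord")
      #'ascii-word-char-p
      (error "Unknown property name: ~S" name)))

;; The resolver is consulted while the regex is parsed, so it is bound
;; around CREATE-SCANNER rather than around the later call to SCAN.
(defparameter *ascii-word-scanner*
  (let ((cl-ppcre:*property-resolver* #'ascii-word-resolver))
    (cl-ppcre:create-scanner "^\\p{AsciiWord}+$")))

(defun ascii-word-p/property (string)
  "Same test as ASCII-WORD-P/CLASS, using the \\p{AsciiWord} property."
  (and (cl-ppcre:scan *ascii-word-scanner* string) t))
```

With these definitions, both predicates should accept a string such as "foo_42" and reject one containing a non-ASCII letter such as "café", regardless of how ALPHANUMERICP behaves on the host Lisp. The scanner is built once and stored because a regex given to SCAN as a literal string may be compiled at load time, when the LET binding of the resolver is not in effect.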
On Mon, Mar 12, 2012 at 4:10 PM, Robert Brown robert.brown@gmail.com wrote:
Some folks I work with are using cl-ppcre. They've run into an incompatibility between cl-ppcre and the PCRE library that boils down to cl-ppcre's handling of \w. The behavior is documented in cl-ppcre's manual:
CL-PPCRE uses ALPHANUMERICP to decide whether a character matches Perl's "\w", so depending on your CL implementation you might encounter differences between Perl and CL-PPCRE when matching non-ASCII characters.
This reliance on ALPHANUMERICP may be a misfeature. It means that cl-ppcre behaves differently depending on the Lisp implementation it's running on.
My co-workers desire compatibility between cl-ppcre on SBCL (where ALPHANUMERICP follows Unicode) and PCRE for matching Latin-1 encoded strings. They patched the cl-ppcre code to make \w match a-z, A-Z, 0-9, and underscore. Is there a better workaround for them?
bob
cl-ppcre-devel site list cl-ppcre-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/cl-ppcre-devel