Some folks I work with are using cl-ppcre. They've run into an incompatibility between cl-ppcre and the PCRE library that boils down to cl-ppcre's handling of \w. The behavior is documented in cl-ppcre's manual:
CL-PPCRE uses ALPHANUMERICP to decide whether a character matches Perl's "\w", so depending on your CL implementation you might encounter differences between Perl and CL-PPCRE when matching non-ASCII characters.
This reliance on ALPHANUMERICP may be a misfeature. It means that the same regular expression can match different strings depending on the Lisp implementation cl-ppcre is running on.
My co-workers want cl-ppcre on SBCL (where ALPHANUMERICP follows Unicode) to behave like PCRE when matching Latin-1-encoded strings. They patched the cl-ppcre source so that \w matches only a-z, A-Z, 0-9, and underscore. Is there a better workaround for them?
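To make the difference concrete, here is a minimal sketch in plain Common Lisp that does not touch cl-ppcre itself. Both function names are hypothetical and only for illustration: ascii-word-char-p is the ASCII-only test the patch is after, and implementation-word-char-p is my approximation of the stock \w test (ALPHANUMERICP plus underscore), not a copy of cl-ppcre's code. It assumes an implementation with ASCII-compatible character codes.

```lisp
;; Hypothetical helper, not part of cl-ppcre: the ASCII-only notion of a
;; "word" character that the patch is after.
(defun ascii-word-char-p (char)
  "Return true if CHAR is one of a-z, A-Z, 0-9, or underscore."
  (or (char<= #\a char #\z)
      (char<= #\A char #\Z)
      (char<= #\0 char #\9)
      (char= char #\_)))

;; Approximation of the stock \w test: ALPHANUMERICP plus underscore.
;; The result for non-ASCII characters depends on the Lisp implementation.
(defun implementation-word-char-p (char)
  (or (alphanumericp char)
      (char= char #\_)))

;; Code point 233 is LATIN SMALL LETTER E WITH ACUTE in both Latin-1
;; and Unicode.
(let ((e-acute (code-char 233)))
  (format t "ascii: ~A  implementation: ~A~%"
          (ascii-word-char-p e-acute)
          (implementation-word-char-p e-acute)))
```

The first value is NIL by construction. The second depends on how the implementation defines ALPHANUMERICP, and that is exactly the discrepancy my co-workers ran into.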
bob