i have an implementation that reports a char-code-limit smaller than
what it can actually handle -- it's ABCL (running on top of Java): only
256 char codes are officially supported, but it uses Java strings, so it
has no problem handling Unicode strings -- i set *regex-char-code-limit*
to some 10000 (thanks, Edi!).
however, there are characters like 0xFEFF (the BOM), so i'd have to set
*regex-char-code-limit* to 65536 to cover the whole BMP. that seems like
overkill -- i see ppcre creates arrays of that size for matching.
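for reference, the limit is just a special variable, so it can be bound
around scanner creation instead of being raised globally. a sketch of
what i mean (the variable and function names are CL-PPCRE's exported
ones; the BOM-matching pattern itself is just an illustration):

```lisp
;; Bind *REGEX-CHAR-CODE-LIMIT* only while the scanner is built;
;; scanners created elsewhere keep the cheaper default.
(defparameter *bom-scanner*
  (let ((cl-ppcre:*regex-char-code-limit* #x10000)) ; whole BMP, incl. #xFEFF
    (cl-ppcre:create-scanner
     (format nil "^~C" (code-char #xFEFF)))))       ; match a leading BOM

;; Usage: returns start/end positions on a match, NIL otherwise.
(cl-ppcre:scan *bom-scanner*
               (format nil "~Chello" (code-char #xFEFF)))
```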
how do people cope with this on unicode-enabled lisps? (afaik SBCL uses
UCS-4 char codes, so there's definitely no sane char-code-limit there)
does ppcre create that array for each scanner? one global array would be
ok, but an array per scanner is too much.
does *use-bmh-matchers* affect the usage of this array?
if so, would disabling it make matching much slower?
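in case it helps anyone reading along: *use-bmh-matchers* is also a
special variable, so it can be switched off per scanner the same way --
a hedged sketch, assuming the cost of losing the Boyer-Moore-Horspool
fast path for constant substrings is acceptable for the patterns at
hand:

```lisp
;; Build a scanner with BMH matchers disabled, so no
;; *REGEX-CHAR-CODE-LIMIT*-sized skip tables are allocated for it.
(defparameter *no-bmh-scanner*
  (let ((cl-ppcre:*use-bmh-matchers* nil))
    (cl-ppcre:create-scanner "foo.*bar")))

(cl-ppcre:scan *no-bmh-scanner* "some foo and some bar")
```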