Re: [cl-ppcre-devel] *regex-char-code-limit* - cl-ppcre-devel@common-lisp.net - mailman3.common-lisp.net

26 Nov 2006


      ...
> Hmm ... no! I can't think of a single use case where i would need to
> treat the BOM as part of the content. Actually, i can only come to the
> conclusion that a BOM within the content would be a serious bug. After
> all, your application should _never_ deal with the binary representation,
> only with code points. What _code point_ do you get for BOM?

i just download HTML pages using Java functions into Java strings,
then i use CL-PPCRE to extract some information from them. certainly, i
don't care about the BOM, but CL-PPCRE crashes on it, trying to aref an
array beyond char-code-limit.
i can pre-filter the data to remove the BOM, but there's no guarantee
that i won't get some other wild character.
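
a minimal sketch of such a pre-filter on the Java side, assuming the page is already in a Java string. the class name, the method name, and the limit of 256 in the usage below are hypothetical and only for illustration; the real limit depends on the Lisp implementation's char-code-limit. it drops the BOM (U+FEFF) and every UTF-16 code unit at or above the given limit before the string is handed to the regex code:

```java
public final class RegexInputFilter {
    // The byte order mark as a UTF-16 code unit.
    private static final char BOM = '\uFEFF';

    private RegexInputFilter() {}

    /**
     * Returns a copy of input without the BOM and without any UTF-16
     * code unit whose value is greater than or equal to limit.
     * The limit is an assumption supplied by the caller, not a value
     * taken from CL-PPCRE.
     */
    public static String dropCharsAtOrAbove(String input, int limit) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Keep only characters the regex engine can safely index.
            if (c != BOM && c < limit) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical downloaded page: leading BOM plus one character above the limit.
        String page = "\uFEFF<html>caf\u00e9 \u20ac</html>";
        System.out.println(dropCharsAtOrAbove(page, 256));
    }
}
```

this silently throws characters away, so it only suits the quick and dirty case where the dropped characters don't matter for what is being extracted.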

well, there are better ways to tokenize HTML, but i've made a quick and
dirty solution via CL-PPCRE :)


Alex Mizrahi
