[Sending a copy to the openmcl-devel mailing list.]
On Wed, Apr 8, 2009 at 9:51 PM, Dan Weinreb <dlw@itasoftware.com> wrote:
> CCL does not support having a character with code #\udcf0. The reader signals a condition if it sees this. Unfortunately, using #-ccl does not seem to solve the problem, presumably since the #- macro is working by calling "read" and it is not suppressing unhandled conditions, or something like that. It might be hard to fix that in a robust way.
Interesting. It seems that #-ccl works fine with CCL's own #\ but not with Babel's #\, which is defined in babel/src/sharp-backslash.lisp and is what we use within the test suite. That is of course my fault: I now see in the CLHS that *READ-SUPPRESS* should be honoured by every reader macro, and I had missed that.
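To illustrate what honouring *READ-SUPPRESS* means for a #\ reader, here is a rough sketch. It is not the code in babel/src/sharp-backslash.lisp; the names SHARP-BACKSLASH-SKETCH, HEX-TOKEN-CHAR and *SKETCH-READTABLE* are made up for this example, and the token scanning is deliberately simplified.

```lisp
(defun hex-token-char (token)
  "Return the character named by a token like \"udcf0\", or NIL."
  (when (and (> (length token) 1) (char-equal (char token 0) #\u))
    (multiple-value-bind (code end)
        (parse-integer token :start 1 :radix 16 :junk-allowed t)
      (and code
           (= end (length token))
           (< -1 code char-code-limit)
           ;; CODE-CHAR may legitimately return NIL here.
           (code-char code)))))

(defun sharp-backslash-sketch (stream subchar numarg)
  (declare (ignore subchar numarg))
  ;; Always consume the whole token, so the stream ends up in the same
  ;; place whether or not the result is going to be used.
  (let ((token (with-output-to-string (out)
                 ;; The character right after #\ always belongs to the token.
                 (write-char (read-char stream t nil t) out)
                 ;; Simplification: treat alphanumerics and a few
                 ;; punctuation characters as token constituents.
                 (loop for c = (peek-char nil stream nil nil t)
                       while (and c (or (alphanumericp c) (find c "-+_")))
                       do (write-char (read-char stream t nil t) out)))))
    (cond (*read-suppress* nil)          ; skipped form: do not interpret
          ((= (length token) 1) (char token 0))
          ((name-char token))
          ((hex-token-char token))
          (t (error "Unrecognized character name: ~S" token)))))

;; Install the sketch in a private readtable, leaving the global one alone.
(defparameter *sketch-readtable*
  (let ((rt (copy-readtable nil)))
    (set-dispatch-macro-character #\# #\\ #'sharp-backslash-sketch rt)
    rt))
```

With this in place, a form guarded by a feature expression that does not hold is skipped without the token ever being interpreted:

```lisp
(let ((*readtable* *sketch-readtable*))
  (read-from-string "(#+no-such-feature-xyz #\\udcf0 :ok)"))
;; => (:OK)
```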
What's the rationale behind not supporting the surrogate area (D800–DFFF), which is where #\udcf0 lives? I can see how that might make sense, in that Unicode states that this area does not have any character assignments. But, FWIW, the other three Lisps with full Unicode support that I'm familiar with -- SBCL, CLISP and ECL -- handle this area just fine.
The disadvantage of not handling this area is that we can't implement the UTF-8B encoding. What's the advantage?
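To make the UTF-8B point concrete: as I understand the scheme, each byte that cannot be decoded as UTF-8 is mapped to the code point #xDC00 plus the byte, so it can be written back unchanged later. A minimal sketch follows; the function names are hypothetical and not part of Babel's API.

```lisp
(defun utf-8b-escape-code (byte)
  "Code point that stands in for the undecodable BYTE (#x80..#xFF)."
  (check-type byte (integer #x80 #xFF))
  (+ #xDC00 byte))

(defun utf-8b-escape-char (byte)
  "Character for the escaped BYTE. Returns NIL on implementations
whose CODE-CHAR has no character for surrogate code points."
  (code-char (utf-8b-escape-code byte)))
```

For example, (utf-8b-escape-code #xF0) is #xDCF0, which is exactly the character the reader complains about.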
> In order to make progress, I had to just comment these out. I do not suggest merging that into the official sources, but it would be very nice if we could find a way to write tests.lisp in such a way that these tests would apply when the characters are supported, and not when they are not.
I'll fix the #\ reader macro and that should take care of that annoyance. (For some reason, on my system, tests.lisp appears to load fine with an old CCL 1.2 snapshot.)
> The (or (code-char ..) ...) change, on the other hand, I think should be made in the official sources. The Hyperspec says clearly that code-char is allowed to return nil.
I see. For our purposes, though, it seems that if CODE-CHAR returns NIL, we should signal a test failure immediately.
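Something along these lines is what I have in mind. This is only a sketch: CODE-CHAR-OR-FAIL is a made-up name, and plain ERROR stands in for whatever failure mechanism the test framework provides.

```lisp
(defun code-char-or-fail (code)
  "Return the character for CODE, or signal an error if CODE-CHAR returns NIL."
  (or (code-char code)
      (error "CODE-CHAR returned NIL for code point #x~X" code)))
```

A test would then call (code-char-or-fail #xdcf0) instead of writing the #\udcf0 literal, so the failure shows up when the test runs rather than when the file is read.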