Ever since ABCL raised its CHAR-CODE-LIMIT from 256 to #x10000, two tests in the ANSI test suite have been failing: char-upcase.1 and char-upcase.2.
These two tests iterate over all character codes from 0 below CHAR-CODE-LIMIT, checking the property that upcasing and then downcasing (and vice versa) yields the original character again ("round-tripping"). This property is specified in section 13.1.4.3 (http://www.lispworks.com/documentation/lw51/CLHS/Body/13_adc.htm), "Characters with case". In short: the standard characters with case are defined in pairs, and any additional characters with case an implementation provides have to be defined in pairs too.
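To make the failure concrete, here is a rough sketch of the property the tests check. This is an assumption-laden analogue, not the actual test code: it uses Python's default Unicode case mappings as a stand-in for ABCL's CHAR-UPCASE/CHAR-DOWNCASE, and hard-codes #x10000 for the new CHAR-CODE-LIMIT.

```python
# Hypothetical analogue of the round-tripping property: for every
# character that has an upper-case counterpart, downcasing that
# counterpart should give back the original character.
# (Python's str.upper()/str.lower() stand in for CHAR-UPCASE and
# CHAR-DOWNCASE; they follow the Unicode default case mappings.)
violations = []
for cp in range(0x10000):          # ABCL's new CHAR-CODE-LIMIT
    c = chr(cp)
    up = c.upper()
    # Skip one-to-many mappings (e.g. sharp s -> "SS"), which a
    # character-to-character CHAR-UPCASE cannot produce anyway.
    if len(up) == 1 and up != c and up.lower() != c:
        violations.append(cp)

# U+0131 LATIN SMALL LETTER DOTLESS I is among the violations:
print(hex(0x0131) in map(hex, violations))  # True
```

A non-empty `violations` list illustrates that full Unicode simply does not satisfy the pairing property section 13.1.4.3 describes.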
The spec provides char-upcase and char-downcase to convert a character with case to its other-case equivalent.
However, section 13.1.10, "Documentation of implementation-defined scripts", seems to offer an escape hatch. A script is a subtype of CHARACTER, nothing more, nothing less, and an implementation-defined script gets to document its effect on CHAR-UPCASE and CHAR-DOWNCASE.
Now, if we were to define our Unicode script as every character outside the standard set, char-upcase and char-downcase could have different semantics for those characters, as long as the standard characters still behave as specified. That way, the round-tripping requirement need not apply to most of Unicode - which it can't be expected to; see LATIN SMALL LETTER DOTLESS I for an example.
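The dotless-i case shows why the requirement can't hold over all of Unicode. Again sketched with Python's default Unicode case mappings standing in for CHAR-UPCASE/CHAR-DOWNCASE (an assumption; ABCL's behavior is what's actually under discussion):

```python
# U+0131 LATIN SMALL LETTER DOTLESS I upcases to plain I (U+0049),
# but I downcases to the ordinary dotted i (U+0069) -- so the
# round trip lands on a different character than we started with.
dotless_i = "\u0131"               # 'ı'
upcased = dotless_i.upper()
round_tripped = upcased.lower()
print(upcased)                     # I
print(round_tripped)               # i
print(round_tripped == dotless_i)  # False
```

Two lower-case characters ('ı' and 'i') share one upper-case counterpart, so no character-to-character case pairing can round-trip both.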
In light of the above, is it really portable for the tests to assume that all characters must round-trip? I think it's not.
What are your opinions?
Bye,
Erik.