On 4/18/12 7:24 AM, Douglas Crosher wrote:
- UTF-8 detection removed to save on the line count. Still detects a UTF-8 BOM, and reads a UTF-8 encoding file option. Adding a
UTF-8 encoding file option will help other tools too anyway.
Does anyone use a BOM with utf8? It's not required, doesn't do anything except consume 3 octets, but it's not disallowed either.
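For reference, here is a minimal sketch of the detection order described above: check for a UTF-8 BOM first (the three octets EF BB BF), then look for an encoding file option on the first line. It is written in Python purely for illustration and is not the code under discussion. The function name `sniff_encoding`, the regular expression, the Emacs-style `-*- coding: ... -*-` syntax, and the iso-8859-1 fallback are all assumptions made for the example.

```python
import codecs
import re

# Hypothetical pattern for an Emacs-style file option on the first line, e.g.
#   ;;; -*- coding: utf-8 -*-
# The exact syntax accepted by the real code is an assumption here.
CODING_RE = re.compile(rb"-\*-.*?coding:\s*([A-Za-z0-9_.\-]+).*?-\*-")


def sniff_encoding(data: bytes, default: str = "iso-8859-1") -> str:
    """Guess the encoding of a source file from its leading bytes.

    Order: UTF-8 BOM, then a coding file option on the first line,
    then the (assumed) default.
    """
    # codecs.BOM_UTF8 is the three-octet sequence EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # Only the first line is inspected in this sketch.
    first_line = data.split(b"\n", 1)[0]
    match = CODING_RE.search(first_line)
    if match:
        return match.group(1).decode("ascii").lower()
    return default
```

Note that the name returned from the file option is passed through as written; mapping it to an external-format is a separate step.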
- Removed the lengthy external-format translation table to save on the line count. It should be easy for CL implementations to
include more aliases, so this moves the burden of maintaining the aliases to the CL implementation. CLISP and the Scieneer CL already support an extensive range of codes and aliases, and an updated set of aliases has been sent for CMUCL.
The updated set of aliases for CMUCL will be in the next snapshot. Thanks!
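As an illustration of what "aliases maintained by the implementation" looks like in practice, here is a small Python sketch. Python is used only as an analogy: its codec registry already resolves many alias spellings to one canonical name, much as a CL implementation would resolve them to one external-format. The helper name `canonical_encoding` is hypothetical.

```python
import codecs


def canonical_encoding(name):
    """Return the canonical codec name for an alias, or None if unknown.

    The alias table lives in the language implementation (here, Python's
    codec registry), not in the calling tool.
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None
```

With this approach the tool only has to pass the name along; two spellings refer to the same encoding exactly when their canonical names match, and an unknown name is reported rather than guessed at.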
- Added a large set of test files that exercise the reading of the encoding file option and try to include enough characters from
each character set to check that the encoding option has been successfully mapped to an appropriate external-format. This includes all the encodings listed by 'iconv -l' on Linux that support the characters needed by CL, but excludes the EBCDIC codes. All 629 tests pass on the Scieneer CL, 628 on CLISP, and a much smaller but useful set on CCL, CMUCL, ECL, and SBCL due to their limited aliases and limited encoding support. See: http://www.scieneer.com/files/coding-tests.tar.bz2
Is there any "mandatory" set of encodings that ASDF wants? Presumably that means at least iso8859-1 and utf8. Anything else is a bonus?
Ray