Stelian Ionescu stelian.ionescu-zeus@poste.it writes:
> On Mon, 2009-04-13 at 22:24 +0200, james anderson wrote:
>> [ironic in this discussion, is that utf-8b is non-conformant - by definition.]
> I don't think so. See http://www.unicode.org/versions/Unicode5.1.0/ paragraph E: "in processing the UTF-8 code unit sequence <F0 80 80 41>, the only requirement on a converter is that the <41> be processed and correctly interpreted as <U+0041>."
I think James' point is that UTF-8B is not specified by any standard so it has nothing to conform to.
You are right, though, that the UTF-8B decoding process is compatible/conformant with UTF-8. Not so for the encoding process: a UTF-8B encoder turns the escape code points back into the malformed bytes they stand for, so it might generate invalid UTF-8.
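
To make the asymmetry concrete, here is a minimal sketch in Python 3. It assumes that Python's "surrogateescape" error handler is an acceptable stand-in for UTF-8B (PEP 383 describes it as that technique); the choice of Python and the helper name is_valid_utf8 are mine, not anything from this thread.

```python
# The <F0 80 80 41> sequence from the passage quoted above.
raw = b"\xf0\x80\x80A"

# Decoding side: each malformed byte is mapped to a lone surrogate in
# U+DC80..U+DCFF, and the trailing 0x41 is still interpreted as U+0041.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding side: the escape code points are turned back into the
# original bytes, malformed or not.
back = text.encode("utf-8", errors="surrogateescape")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if a strict UTF-8 decoder accepts data."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(ascii(text))          # escaped view of the decoded string
print(back == raw)          # the round trip preserves the bytes
print(is_valid_utf8(back))  # a strict decoder rejects the encoder's output
```

The decoder handles <41> as the quoted passage requires, while the encoder's output fails strict validation. That is the sense in which only the encoding direction is the problem.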