Hi,
I've noticed the following in ACL7 on the attached document:
CL-USER(115): (cxml:parse-file "C:\graph.xml" (rune-dom:make-dom-builder))
#<RUNE-DOM::DOCUMENT @ #x212c5a82>
CL-USER(116): (dom:map-document (cxml:make-namespace-normalizer (cxml:make-octet-stream-sink *standard-output*)) *)
<svg xmlns="http://www.w3.org/2000/svg"> <script type="text/css"> </script> </svg>
#<MULTIVALENT stream socket connected from localhost/3813 to localhost/3817 @ #x205003d2>
This doesn't seem to be a problem for the JavaScript processor in Firefox, which still correctly processes lines, quotes, and so on that have been escaped this way. However, the XML 1.1 proposal, in its description of CDATA, states that the only markup recognized by the XML processor is the CDATA end:
``Within a CDATA section, only the CDEnd string is recognized as markup, so that left angle brackets and ampersands may occur in their literal form; they need not (and cannot) be escaped using "&lt;" and "&amp;". CDATA sections cannot nest.''
Can cxml please correctly follow this requirement?
Thanks,
Sunil
Quoting Sunil Mishra (smishra@sfmishras.com):
CL-USER(116): (dom:map-document (cxml:make-namespace-normalizer (cxml:make-octet-stream-sink *standard-output*)) *)
Note that make-octet-stream-sink defaults to canonical mode for historical reasons.
<svg xmlns="http://www.w3.org/2000/svg"> <script type="text/css"> </script> </svg>
#<MULTIVALENT stream socket connected from localhost/3813 to localhost/3817 @ #x205003d2>
Sorry, I don't see a bug. The serializer in canonical mode outputs character references for the newlines here, but it doesn't output a CDATA section in the first place, so that's fine.
If you want to see a CDATA section, use non-canonical mode:
cl-user(43): (dom:map-document (cxml:make-octet-stream-sink *standard-output* :canonical nil) (cxml:parse-file "~/graph.xml" (cxml-dom:make-dom-builder)))
<?xml version="1.0" encoding="UTF-8"?>
<svg> <script type="text/css"> <![CDATA[
]]>
</script>
</svg>
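To compare the two modes without the attached file, here is a minimal sketch that serializes the same DOM twice into a string. It assumes CXML is already loaded (for instance through Quicklisp); the inline document and the variable name *doc* are made up for illustration and are not taken from graph.xml.

```lisp
;; Assumption: CXML is loaded, e.g. with (ql:quickload :cxml).
;; The inline document below is a hypothetical stand-in for graph.xml.
(defparameter *doc*
  (cxml:parse "<svg><script><![CDATA[if (a < b && c) run();]]></script></svg>"
              (cxml-dom:make-dom-builder)))

;; Canonical mode: the text is written as ordinary escaped character data,
;; so no CDATA section appears in the result.
(defparameter *canonical-output*
  (dom:map-document (cxml:make-string-sink :canonical t) *doc*))

;; Non-canonical mode: the CDATA section is written out as a CDATA section.
(defparameter *plain-output*
  (dom:map-document (cxml:make-string-sink :canonical nil) *doc*))
```

Printing the two strings side by side should show the difference described above. Passing :canonical explicitly avoids relying on whatever default the installed version uses.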
``Within a CDATA section, only the CDEnd string is recognized as markup, so that left angle brackets and ampersands may occur in their literal form; they need not (and cannot) be escaped using "&lt;" and "&amp;". CDATA sections cannot nest.''
Can cxml please correctly follow this requirement?
It follows this requirement while parsing.
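A quick way to check the parsing side is to parse a CDATA section containing a literal left angle bracket and ampersands, then read the text back from the DOM. This is only a sketch: it assumes CXML is loaded, the sample string and the variable name *cdata-text* are invented for illustration, and on builds where runes are not characters dom:data may return a rune vector rather than a string.

```lisp
;; Assumption: CXML is loaded, e.g. with (ql:quickload :cxml).
;; The sample document is hypothetical.
(defparameter *cdata-text*
  (let* ((doc (cxml:parse "<script><![CDATA[a < b && c]]></script>"
                          (cxml-dom:make-dom-builder)))
         ;; The only child of the root element is the CDATA section node.
         (cdata (dom:first-child (dom:document-element doc))))
    ;; DOM:DATA returns the character data held by that node.
    (dom:data cdata)))
```

If the parser treats only the CDEnd string as markup, *cdata-text* holds the angle bracket and ampersands unchanged.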
Only in serialization is there one little "problem" (unrelated to your question):
A document constructed in memory might include a CDATA section with characters not representable in a CDATA section. That is a user error, and CXML should signal an error when told to serialize such a document in non-canonical mode; right now I believe it does not signal that error and outputs the user data as-is, resulting in output that isn't well-formed. (But I'm taking patches. :-))
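As an illustration of the kind of check described above, here is a minimal sketch in plain Common Lisp. The function names cdata-serializable-p and check-cdata-content are hypothetical and are not part of CXML; the sketch only covers the CDEnd string "]]>" and says nothing about where such a check would belong in the serializer.

```lisp
;; Hypothetical helpers, not part of CXML.
(defun cdata-serializable-p (data)
  "Return true if the string DATA does not contain the CDEnd string."
  (not (search "]]>" data)))

(defun check-cdata-content (data)
  "Signal an error if DATA cannot be written inside one CDATA section;
otherwise return DATA unchanged."
  (unless (cdata-serializable-p data)
    (error "CDATA section content contains \"]]>\": ~S" data))
  data)
```

User code that builds CDATA sections in memory could call check-cdata-content on the text before creating the node, so the problem shows up at construction time instead of as output that isn't well-formed.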
d.
Hi David,
Thanks for the clarification. I didn't notice that CDATA wasn't being output.
I'll try the non-canonical mode.
Sunil
David Lichteblau wrote:
[...]