cxml:parse is stuck waiting for an EOF even after the last closing tag when parsing from a stream (in my case it's an http stream in a server entry point: the client waits for the server to react, the server waits for the EOF).
looking at the code i get the impression that it was never meant to be used like this; an EOF is pretty much mandatory.
could someone comment on this? was cxml ever meant to support such a use case, and it's just that no one has really needed it yet?
any plans for supporting this?
any hints are appreciated,
On Mon, Oct 03, 2011 at 04:14:09PM +0200, Attila Lendvai wrote:
> cxml:parse is stuck waiting for an EOF even after the last closing tag when parsing from a stream (in my case it's an http stream in a server entry point: the client waits for the server to react, the server waits for the EOF).
> looking at the code i get the impression that it was never meant to be used like this; an EOF is pretty much mandatory.
Of course, since any valid XML parser needs to be told when to finish parsing.
> could someone comment on this? was cxml ever meant to support such a use case, and it's just that no one has really needed it yet?
What use case exactly? When and how should the parser "stop"? (Keep in mind that the "last closing tag" does not necessarily signal the end of a well-formed XML document.)
Cheers, Ralf Mattes
> any plans for supporting this?
> any hints are appreciated,
> -- attila
>> could someone comment on this? was cxml ever meant to support such a use case, and it's just that no one has really needed it yet?
> What use case exactly? When and how should the parser "stop"? (Keep in mind that the "last closing tag" does not necessarily signal the end of a well-formed XML document.)
i see, thanks. for some reason i assumed that an xml document may only have one toplevel element.
the use-case is simple xml messaging over http.
what i would need is an option to tell cxml:parse to parse one toplevel xml element for me and return, regardless of what else is in the stream. what would also help is a hard limit on how much to read from the stream, to deal with DoS attacks, but i guess one can also do that with a wrapper stream.
but i'll deal with this another way for now: just read the string in from the http stream and feed it to cxml. i just thought i'd write up my use case, in case it makes sense to add some TODO entries for it.
thanks for the clarification!
Quoting Attila Lendvai (attila.lendvai@gmail.com):
>>> could someone comment on this? was cxml ever meant to support such a use case, and it's just that no one has really needed it yet?
>> What use case exactly? When and how should the parser "stop"? (Keep in mind that the "last closing tag" does not necessarily signal the end of a well-formed XML document.)
> i see, thanks. for some reason i assumed that an xml document may only have one toplevel element.
Only one _element_, but comments, processing instructions, and whitespace can follow it, and they need to be parsed too.
> the use-case is simple xml messaging over http.
There is code in cl-xmpp which deals with a similar situation: XMPP opens something that looks like an XML document using a start tag, but then the actual messages are the children of this "infinite" document element. cl-xmpp solves it using cxml, IIRC using klacks to read the individual child elements. Don't know if that helps you, but perhaps you might want to take a look (start reading code at READ-STANZA).
No matter whether you're using klacks or SAX, in general you can just stop parsing when you're done. In the case of klacks, you simply avoid making further klacks calls that read events. In the case of SAX, you need to define a method on the sax:end-element generic function and perform a non-local transfer of control out of the parser.
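A minimal sketch of the SAX variant, assuming cxml is already loaded. The class name first-element-handler, the catch tag document-element-closed and the function document-element-name are made up for illustration; only sax:default-handler, sax:start-element, sax:end-element and cxml:parse come from cxml. The handler counts nesting depth and throws out of the parser as soon as the document element is closed. It only shows the control transfer; the buffering caveat in the next paragraph still applies on a live stream.

```lisp
;; Hypothetical handler: tracks how deeply nested the parser currently is.
(defclass first-element-handler (sax:default-handler)
  ((depth :initform 0 :accessor handler-depth)))

(defmethod sax:start-element ((handler first-element-handler)
                              namespace-uri local-name qname attributes)
  (declare (ignore namespace-uri local-name qname attributes))
  (incf (handler-depth handler)))

(defmethod sax:end-element ((handler first-element-handler)
                            namespace-uri local-name qname)
  (declare (ignore namespace-uri local-name))
  ;; Depth back at zero means the document element was just closed:
  ;; leave the parser via a non-local exit instead of waiting for EOF.
  (when (zerop (decf (handler-depth handler)))
    (throw 'document-element-closed qname)))

(defun document-element-name (input)
  "Parse INPUT until the document element is closed and return its qname."
  (catch 'document-element-closed
    (cxml:parse input (make-instance 'first-element-handler))))
```

Something like (document-element-name "<msg><body>hi</body></msg>") should then return the qname of the document element. A real handler would collect whatever it needs from the events before throwing.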
If you want to read more data from the stream afterwards, one important trick while doing the above is the :speed 1 setting that cl-xmpp uses. It sets cxml's buffer to a single character, effectively disabling that buffer. The xstream API that deals with these details is undocumented, but rather straightforward.
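For the klacks variant, a sketch along the following lines might do. It assumes STREAM is a binary (or bivalent) input stream, and it assumes that the installed cxml version accepts :buffering nil in cxml:make-source as a way of requesting the unbuffered reading described above; please check that against your cxml sources before relying on it. read-one-element is a hypothetical name.

```lisp
(defun read-one-element (stream)
  "Parse exactly one element from STREAM and return it as a DOM document.
The source is deliberately not closed, so the caller can keep using STREAM."
  ;; Assumption: :buffering nil asks cxml not to read ahead of what it parses.
  (let ((source (cxml:make-source stream :buffering nil)))
    ;; Skip the XML declaration, comments etc. up to the first start tag.
    (unless (klacks:find-event source :start-element)
      (error "No element found before end of stream."))
    ;; Replay this one element into a DOM builder. No klacks calls are made
    ;; afterwards, so nothing beyond the end tag is requested from the parser.
    (klacks:serialize-element source (cxml-dom:make-dom-builder))))
```

The element itself would then be available via (dom:document-element (read-one-element stream)).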
d.