Hi,
Quoting Ivan Shvedunov (ivan4th@gmail.com):
I've noticed that parsing XML that contains many entity references may cause CXML to eat up all available heap space and crash. This can be easily reproduced:
thank you for the report. I have replace tail recursion in p/content with a loop and also removed the call to append, which serves no purpose in the current code base.
(In addition, I have fixed string normalization in the DOM builder, reducing the parse time for your example from 9s to around 2s for me.)
Please test cxml CVS again.
Overall I don't think this is a proper solution, most likely xml-parse.lisp needs some serious rework to make it more reliable.
In general, recursion depth in the SAX parser should be at most proportional to the nesting depth of XML elements being parsing (as well as the nesting depth of entity references). This level of recursion should not usually be a problem. Most processing of in-memory models like DOM uses similar algorithms.
If any recursive function exceeds this complexity (in the way p/content did prior to your patch), I would regard that as a bug.
If you find any other similar problems, please report them.
Note that the Klacks parser is available as an alternative to SAX for applications that need stricter guarantees. The klacks parser will only report one event at a time and does not use most of the recursive functions found in the SAX parser.
Thanks, David