Hi.
I've noticed that parsing XML that contains many entity references
may cause CXML to eat up all available heap space and crash. This can
be easily reproduced:
(cxml:parse-rod (format nil "<xml>~{~a~}</xml>" (loop repeat 4000
collect "<p>zz")) (cxml-dom:make-dom-builder))
(tested on SBCL / Linux)
Looking at the code, I've noticed that xml/xml-parse.lisp relies on
the tail recursion a lot (though always having proper tail recursion
is frankly speaking somewhat dubious expectation when working with
Common Lisp). What's worse, P/CONTENT function is non-tail-recursive
when an entity reference is encountered, and this is exactly the cause
of aforementioned crash. I've attached the patch which helped me solve
the problem (it's made against cxml-2007-08-05). I've just noticed
that recurse-on-entity returns NIL most of time and used that fact to
cut off most of recursion. As of now my brain is somewhat foggy due to
continuing coding race, so I didn't learn what recurse-on-entity
actually does and how often can it return a non-NIL value. Overall I
don't think this is a proper solution, most likely xml-parse.lisp
needs some serious rework to make it more reliable. But I think at
least this hack will help in most cases.
Ivan