I am currently faced with the task of parsing HTML emails generated by
Outlook. My frustrations with that thing could fill an entire email of
their own, so I won't go into them here.
Anyway, one thing it keeps doing is creating lots of non-standard tags of
the form <o:p></o:p> and the like. The problem is that when Closure-HTML
parses these, they end up like this: "#BAD TAGp>".
I worked around the problem by adding the following check to the function
NAME-RUNE-P: (rune= char #/:). This makes the colon a valid character in a
node name, and thus causes such nodes to be ignored in the generated
output.
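
To illustrate the effect of that check, here is a minimal sketch. It is a
simplified stand-in and not the actual Closure-HTML source: it uses plain
characters instead of runes, the names SKETCH-NAME-CHAR-P and
SKETCH-READ-NAME are hypothetical, and the set of other accepted characters
is an assumption.

```lisp
;; Simplified stand-in for the check described above; NOT the actual
;; Closure-HTML code. Plain characters are used instead of runes, and
;; the other accepted characters are an assumption for illustration.
(defun sketch-name-char-p (char)
  "Return true if CHAR may appear inside a tag name in this sketch."
  (or (alphanumericp char)
      (char= char #\-)
      (char= char #\.)
      (char= char #\:)))   ; the added check, so that "o:p" is one name

(defun sketch-read-name (string &optional (start 0))
  "Return the longest run of name characters in STRING starting at START."
  (let ((end (or (position-if-not #'sketch-name-char-p string :start start)
                 (length string))))
    (subseq string start end)))

;; (sketch-read-name "o:p></o:p>")  ; => "o:p"
```

Without the colon clause the same call would stop at the colon and return
"o", leaving ":p>" behind as stray input, which is roughly the kind of
split that seems to produce the bad tag above.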
Would it be reasonable to include this fix in an update to Closure-HTML?
Regards,
Elias