I posted this request a few years ago but got no reply at the time. Since there has been some activity here recently, I'm asking again.

Currently, closure-html throws an error when trying to parse HTML generated by Microsoft Outlook. The reason for this is that Outlook generates a lot of tags with a colon (:) in them, which closure-html considers a syntax error.

I don't know whether it actually is a syntax error, but closure-html should be lenient here and simply ignore these tags. That's what my fix does. So far, I have had to advise my users to patch closure-html manually. It would be very nice if this were integrated into the official version.
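To make the attribute check in the html-parser.lisp hunk easier to review, here is the same test pulled out into a standalone function. This is only a sketch: the name `xmlns-attribute-name-p` is hypothetical and does not exist in closure-html, and it uses nothing beyond standard Common Lisp.

```lisp
;; Hypothetical helper, not part of closure-html: returns true when NAME
;; (a string or symbol) is "xmlns" or starts with "xmlns:", ignoring case.
(defun xmlns-attribute-name-p (name)
  (let ((s (string name))
        (prefix "xmlns:"))
    (or (string-equal s "xmlns")
        ;; Compare only the first (length prefix) characters of S.
        ;; MIN keeps :END1 within bounds when S is shorter than PREFIX.
        (string-equal s prefix :end1 (min (length s) (length prefix))))))

;; Attribute names that would be skipped:
(xmlns-attribute-name-p "xmlns")       ; true
(xmlns-attribute-name-p :|xmlns:o|)    ; true
(xmlns-attribute-name-p "XMLNS:V")     ; true

;; Attribute names that would be kept:
(xmlns-attribute-name-p "xml")         ; false
(xmlns-attribute-name-p "xmlnsfoo")    ; false
(xmlns-attribute-name-p "href")        ; false
```

The `min` is needed because `string-equal` signals an error if `:end1` exceeds the length of the first string, which would otherwise happen for short attribute names such as `id`.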

Here is the patch:

diff --git a/src/parse/html-parser.lisp b/src/parse/html-parser.lisp
index 1fdd457..4e45b81 100644
--- a/src/parse/html-parser.lisp
+++ b/src/parse/html-parser.lisp
@@ -106,7 +106,10 @@
      for (name value) on plist by #'cddr
      unless
        ;; better don't emit as HAX what would be bogus as SAX anyway
-       (string-equal name "xmlns")
+       (let ((s (string name))
+             (prefix "xmlns:"))
+         (or (string-equal s "xmlns")
+             (string-equal s prefix :end1 (min (length s) (length prefix)))))
      collect
      (let* ((n #+rune-is-character (coerce (symbol-name name) 'rod)
 	       #-rune-is-character (symbol-name name))
diff --git a/src/parse/sgml-parse.lisp b/src/parse/sgml-parse.lisp
index faa9029..a277ece 100644
--- a/src/parse/sgml-parse.lisp
+++ b/src/parse/sgml-parse.lisp
@@ -182,7 +182,8 @@
   (or (name-start-rune-p char)
       (digit-rune-p char) 
       (rune= char #/.) 
-      (rune= char #/-)))
+      (rune= char #/-)
+      (rune= char #/:)))
 
 (definline sloopy-name-rune-p (char)
   (or (name-rune-p char)