Hello!
When trying to parse HTML from articles on BBC, like:
http://www.bbc.co.uk/news/world-africa-21734036
I get the following error:
prefix/URI mismatch for `xml' namespace
Backtrace is:
(6FA0924) : 0 (PRINT-CALL-HISTORY :CONTEXT NIL :PROCESS NIL :ORIGIN NIL :DETAILED-P NIL :COUNT 536870911 :START-FRAME-NUMBER 0 :STREAM #<STRING-OUTPUT-STREAM #x1A11E036> :PRINT-LEVEL 2 :PRINT-LENGTH 5 :SHOW-INTERNAL-FRAMES NIL :FORMAT :TRADITIONAL) 727
(6FA09D8) : 1 (PRINT-BACKTRACE-TO-STREAM #<STRING-OUTPUT-STREAM #x1A11E036>) 71
(6FA09F0) : 2 (GET-BACKTRACE) 311
(6FA0A24) : 3 (FUNCALL #'#<(:INTERNAL (HUNCHENTOOT:HANDLE-REQUEST (HUNCHENTOOT:ACCEPTOR HUNCHENTOOT:REQUEST)))> #<CXML-STP:STP-ERROR #x1A11E04E>) 95
(6FA0A3C) : 4 (SIGNAL #<CXML-STP:STP-ERROR #x1A11E04E>) 871
(6FA0A64) : 5 (%ERROR #<CXML-STP:STP-ERROR #x1A11E04E> (:FORMAT-CONTROL "prefix/URI mismatch for `xml' namespace" :FORMAT-ARGUMENTS NIL) 29262494) 111
(6FA0A78) : 6 (STP-ERROR "prefix/URI mismatch for `xml' namespace") 103
(6FA0A88) : 7 (RENAME-ATTRIBUTE #<error printing object> "xml" "") 287
(6FA0A9C) : 8 (MAKE-ATTRIBUTE "en-GB" "xml:lang" "") 303
(6FA0AC0) : 9 (FUNCALL #'#<#<STANDARD-METHOD SAX:START-ELEMENT (CXML-STP-IMPL::BUILDER T T T T)>> #<CXML-STP-IMPL::BUILDER #x1A00D50E> "http://www.w3.org/1999/xhtml" "div" "div" (#<HAX:STANDARD-ATTRIBUTE #x1A11E206> #<HAX:STANDARD-ATTRIBUTE #x1A11E1D6>)) 279
(6FA0AEC) : 10 (FUNCALL #'#<(:INTERNAL CLOSURE-HTML::RECURSE CLOSURE-HTML:SERIALIZE-PT)> #<SGML:PT DIV ..>) 455
(6FA0B10) : 11 (FUNCALL #'#<(:INTERNAL CLOSURE-HTML::RECURSE CLOSURE-HTML:SERIALIZE-PT)> #<SGML:PT DIV ..>) 559
(6FA0B40) : 12 (FUNCALL #'#<(:INTERNAL CLOSURE-HTML::RECURSE CLOSURE-HTML:SERIALIZE-PT)> #<SGML:PT BODY ..>) 559
(6FA0B70) : 13 (FUNCALL #'#<(:INTERNAL CLOSURE-HTML::RECURSE CLOSURE-HTML:SERIALIZE-PT)> #<SGML:PT HTML ..>) 559
(6FA0BA0) : 14 (SERIALIZE-PT #<SGML:PT HTML ..> #<CXML-STP-IMPL::BUILDER #x1A00D50E> :NAME "HTML" :PUBLIC-ID NIL :SYSTEM-ID NIL :DOCUMENTP T) 343
(6FA0BD8) : 15 (PARSE-XSTREAM #<RUNES:XSTREAM NIL> #<CXML-STP-IMPL::BUILDER #x1A00D50E>) 263
The parse call in my program is:
(closure-html:parse article-page (stp:make-builder))
As far as I understand, the problem has something to do with the xml prefix in attribute names (here xml:lang, used for the language), but I cannot work out how to fix it or work around it.
Could anybody please give me a hint on where to look for the cause and its solution?
Thanks, Victor
On Sun, 10 Mar 2013 20:13:41 +0200, Victor bobbie@ua.fm wrote:
When trying to parse HTML from articles on BBC, like:
http://www.bbc.co.uk/news/world-africa-21734036
I get the following error:
prefix/URI mismatch for `xml' namespace
A brief search of the archives turned up an earlier discussion of a similar problem:
http://lists.common-lisp.net/pipermail/closure-devel/2011-March/000108.html
As far as I understand, a proper and clean solution is still in the works.
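In the meantime, one possible stopgap might be to filter out the offending attributes before they reach the STP builder. Frame 8 of the backtrace shows MAKE-ATTRIBUTE being called with the qualified name "xml:lang" and an empty namespace URI, which is what STP rejects. The following is an untested sketch, not a confirmed fix. It assumes that cxml:sax-proxy accepts a :chained-handler initarg and forwards events to that handler, and that closure-html will drive such a proxy the same way it drives the STP builder. The names xml-prefixed-p, drop-xml-attributes and parse-dropping-xml-attributes are made up for illustration.

```lisp
;; Untested sketch; assumes closure-html, cxml and cxml-stp are loaded.
;; XML-PREFIXED-P, DROP-XML-ATTRIBUTES and PARSE-DROPPING-XML-ATTRIBUTES
;; are hypothetical names, not part of any of these libraries.

(defun xml-prefixed-p (qname)
  "Return true if QNAME starts with the reserved prefix \"xml:\" (any case)."
  (and (>= (length qname) 4)
       (string-equal "xml:" qname :end2 4)))

(defclass drop-xml-attributes (cxml:sax-proxy)
  ()
  (:documentation
   "SAX proxy that strips xml:* attributes before forwarding start-element."))

(defmethod sax:start-element ((handler drop-xml-attributes)
                              namespace-uri local-name qname attributes)
  ;; Pass the event on to the chained handler, minus attributes such as
  ;; xml:lang that arrive with the xml prefix but no namespace URI.
  (call-next-method handler namespace-uri local-name qname
                    (remove-if (lambda (attribute)
                                 (xml-prefixed-p
                                  (sax:attribute-qname attribute)))
                               attributes)))

(defun parse-dropping-xml-attributes (article-page)
  "Parse ARTICLE-PAGE into an STP document, dropping xml:* attributes."
  (closure-html:parse article-page
                      (make-instance 'drop-xml-attributes
                                     :chained-handler (stp:make-builder))))
```

The obvious cost is that the xml:lang information is lost from the resulting document, so this would only suit cases where the language attribute is not needed.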
Thanks, Victor