Thank you very much Russ! It works as expected! I have one last question. Running the parser with the command:
(with-open-file (out #P"teste.xml" :if-exists :supersede :direction :output)
  (let ((h (make-instance 'preproc
                          :chained-handler (cxml:make-character-stream-sink out))))
    (cxml:parse #P"harem.xml" h :validate t)))
where the file harem.xml begins with (see the doctype):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE colHAREM SYSTEM "harem.dtd">
<colHAREM versao="Segundo_dourada_com_relacoes_14Abril2010">
<DOC DOCID="H2-dftre765">
<p>...
the command produces in the teste.xml output file:
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE colHAREM SYSTEM "harem.dtd"<!ELEMENT EM #PCDATA> <!ATTLIST EM ID CDATA #REQUIRED> <!ATTLIST EM CATEG CDATA #IMPLIED> <!ATTLIST EM TIPO CDATA #IMPLIED> <!ATTLIST EM COMENT CDATA #IMPLIED> <!ATTLIST EM SUBTIPO CDATA #IMPLIED> <!ELEMENT ALT (#PCDATA|EM)*> <!ELEMENT OMITIDO (#PCDATA|EM|ALT|p)*> <!ELEMENT colHAREM (DOC)*> <!ATTLIST colHAREM versao CDATA #REQUIRED> <!ELEMENT p (#PCDATA|EM|OMITIDO|ALT)*> <!ATTLIST p xml:space (default|preserve) "default"> <!ELEMENT DOC (#PCDATA|p|OMITIDO)*> <!ATTLIST DOC DOCID CDATA #REQUIRED>
<colHAREM versao="Segundo_dourada_com_relacoes_14Abril2010"> ...
That is, the handler writes the DTD declarations into the output, but in the wrong way: without the enclosing [ ]. Is this a bug in the library or in my code?
Thank you very much for this additional help!
Best,
---- Alexandre Rademaker http://arademaker.github.com
On Nov 3, 2014, at 1:35 PM, Russ Tyndall russ@acceleration.net wrote:
Howdy,
You will need to issue sax:start-element and sax:end-element calls instead of doing a string replace. Essentially, you will replace the single sax:characters call with a series of characters and element calls.
EG:
(defclass preproc (cxml:sax-proxy) ())

(defmethod sax:characters ((handler preproc) data)
  (let ((chunks (cl-ppcre:split "\\|" data)))
    (if (= 1 (length chunks))
        (call-next-method)
        (loop for c in chunks
              for first? = t then nil
              do (unless first?
                   (sax:start-element handler nil nil "bar" nil)
                   (sax:end-element handler nil nil "bar"))
                 (sax:characters handler c)))))
(cxml:parse "<test>content | ola</test>"
            (make-instance 'preproc :chained-handler (cxml:make-string-sink)))
=> "<?xml version=\"1.0\" encoding=\"UTF-8\"?> <test>content <bar/> ola</test>"
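One edge case may be worth checking: with its default arguments, cl-ppcre:split drops trailing empty strings, so a text block that ends in | (for example "a|") appears to yield a single chunk and would be passed through unchanged. The following is only a sketch of a variant that scans with the standard POSITION function instead of a regex. It assumes cxml is loaded and that character data arrives as a Lisp string (the default build); PREPROC-STRICT is a hypothetical name used for illustration, not part of cxml.

```lisp
;; Sketch only: PREPROC-STRICT is a hypothetical name, not part of cxml.
;; Assumes cxml is loaded and that DATA is a Lisp string.
(defclass preproc-strict (cxml:sax-proxy) ())

(defmethod sax:characters ((handler preproc-strict) data)
  (loop with start = 0
        for bar = (position #\| data :start start)
        do ;; Forward the text before the next | (or the rest of DATA),
           ;; skipping empty runs so no empty character events are sent.
           (when (< start (or bar (length data)))
             (call-next-method handler (subseq data start bar)))
           (if bar
               (progn
                 ;; Same element events as in the example above.
                 (sax:start-element handler nil nil "bar" nil)
                 (sax:end-element handler nil nil "bar")
                 (setf start (1+ bar)))
               (return))))
```

Used the same way as the example above, for instance (cxml:parse "" (make-instance 'preproc-strict :chained-handler (cxml:make-string-sink))), this should emit a <bar/> element for every |, including leading, trailing, and consecutive ones. Attribute values are untouched because they never pass through sax:characters.
<test>
(let ((out (cxml:parse "<test>a|b|</test>"
                       (make-instance 'preproc-strict
                                      :chained-handler (cxml:make-string-sink)))))
  (assert (search "a<bar/>b<bar/></test>" out)))
</test>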
Hope this helps, Russ Tyndall Acceleration.net
On 11/03/2014 07:47 AM, Alexandre Rademaker wrote:
Hi,
I need to transform every | character into a <bar/> tag in all text blocks of a big XML file. That is, whenever I find
<test att="one|two">content | something more | and done</test>
I need to transform to
<test att="one|two">content <bar/> something more <bar/> and done</test>
Note that | can also occur in attribute values and, in that case, it must be kept unchanged. Reading the slide http://common-lisp.net/project/cxml/saxoverview/pages/11.html I wrote
===
(defclass preproc (cxml:sax-proxy) ())

(defmethod sax:characters ((handler preproc) data)
  (call-next-method handler (cl-ppcre:regex-replace "\\|" data "<bar/>")))
===
But of course, it produces an escaped string, not a tag, in the final XML.
WML> (cxml:parse "<test>content | ola</test>"
                 (make-instance 'preproc :chained-handler (cxml:make-string-sink)))
"<?xml version=\"1.0\" encoding=\"UTF-8\"?> <test>content &lt;bar/&gt; ola</test>"
Any idea or directions?
Best,
Alexandre Rademaker http://arademaker.github.com
Cxml-devel mailing list Cxml-devel@common-lisp.net http://mailman.common-lisp.net/cgi-bin/mailman/listinfo/cxml-devel