hi!
i've been playing with a document builder that may be interesting for the list. the basic idea is to parse the xml into CLOS objects, each node with its own class, so that method dispatch can be used to process the document. i need this for verrazano, to process the gccxml output. the first part of the file contains the generic flexml document builder and at the end there's some specialized code for gccxml.
how it works: you can give it a default lisp package and also register namespace-uri->lisp-package mappings. when a node is encountered then a symbol is looked up in the appropriate package and make-instance is used to make the node. there's also a slot-missing specialization on it, so that it can store random slots (attributes) in a hashtable.
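a minimal sketch of the lookup and the slot-missing fallback described above, in plain common lisp. all names here (sketch-node, extra-attributes-of, make-node-for-element) are made up for illustration and are not the actual flexml code; falling back to a generic node class when no class is found is also just an assumption.

```lisp
;; hypothetical names throughout; this is not the actual flexml code
(defclass sketch-node ()
  (;; attributes that have no dedicated slot end up in this table
   (extra-attributes :initform (make-hash-table :test #'eq)
                     :reader extra-attributes-of)))

(defmethod slot-missing ((class t) (node sketch-node) slot-name operation
                         &optional new-value)
  ;; called by slot-value and friends when SLOT-NAME is not a real slot
  (let ((table (extra-attributes-of node)))
    (ecase operation
      (slot-value (values (gethash slot-name table)))
      (setf (setf (gethash slot-name table) new-value))
      (slot-boundp (nth-value 1 (gethash slot-name table)))
      (slot-makunbound (remhash slot-name table) node))))

(defun make-node-for-element (local-name package)
  "Look up LOCAL-NAME as a symbol in PACKAGE and instantiate that class.
Assumption: unknown elements fall back to a plain SKETCH-NODE."
  (let ((class-name (find-symbol (string-upcase local-name) package)))
    (if (and class-name (find-class class-name nil))
        (make-instance class-name)
        (make-instance 'sketch-node))))
```

with this in place, (setf (slot-value node 'color) "red") on a sketch-node stores the value in the hashtable instead of signalling an error, while real slots keep behaving as usual.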
features currently:
- can resolve crossreferences, see cross-referenced-node slot type and friends
- handles some slot types specially: cross-referenced-node, cross-referenced-nodes, integer, boolean, etc
- can store attributes in a mixed slot/hashtable mode where simple attributes can go into the hashtable and special ones can be full-featured slots
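to illustrate the special handling of slot types: attribute values arrive as strings, so a declared slot type can drive the conversion. the function below is only a sketch under assumptions; convert-attribute-value is a hypothetical name, and treating "1" and "true" as boolean true is a guess at the input encoding rather than something taken from the flexml source.

```lisp
;; hypothetical helper, not part of flexml
(defun convert-attribute-value (string type)
  "Convert the attribute STRING according to the declared slot TYPE."
  (cond ((eq type 'boolean)
         ;; assumption: "1" or "true" mean true, anything else is false
         (and (member string '("1" "true") :test #'string=) t))
        ((or (eq type 'integer)
             (equal type '(or null integer)))
         (parse-integer string))
        ;; any other type: keep the raw string
        (t string)))
```

a builder could call something like this once per attribute, using the type declared on the matching slot, and store the raw string when there is no such slot.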
the code for now is only a proof of concept, but i wonder if there would be a place for it somewhere in a repo around cxml? it won't be bigger than a few pages, so i think the most practical distribution would be to have its own .asd/package in the cxml repo/directory. but i don't want to run ahead... it could even be that i'm duplicating existing work with this.
opinions?
Hi
I think this is cool. I would definitely use something like it for, e.g., a revamped CL-SBML parser. Moreover, since the CXML parsers follow a pretty standard pattern (create the parser as a class), it'd be worth packaging it.
Cheers
Marco
On Nov 23, 2007, at 16:02 , Attila Lendvai wrote:
<xml-parsing.lisp>
-- Marco Antoniotti, Associate Professor DISCo, Università Milano Bicocca U14 2043 Viale Sarca 336 I-20126 Milan (MI) ITALY
Please note that I am not checking my Spam-box anymore.
dear list,
as a followup, please find the current version attached. unfortunately i was drawn away from this code, but it works well when working with gccxml outputs.
the following code fragments are from a new, half-done version of verrazano. it parses gccxml output into clos objects that are later used in multiple dispatch. the classes must be precreated for safety/sanity, but they could be automatically created, too. stuff that is gccxml specific is prefixed with gccxml: for clarity.
slots with 'flexml:cross-referenced-node(s) type are automatically resolved to the referenced nodes through the id attribute.
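a rough sketch of how such id-based resolution could work, using two passes so that forward references are fine. this is not the flexml implementation; sketch-ref-node and resolve-cross-references are hypothetical names, and a single type slot stands in for any slot declared as a cross reference.

```lisp
;; hypothetical stand-in for a parsed node; not the flexml classes
(defstruct (sketch-ref-node (:conc-name ref-node-))
  id     ; the value of the id attribute, a string
  type)  ; initially the referenced id (a string), later the node itself

(defun resolve-cross-references (nodes)
  "Replace id strings in the TYPE slot of NODES with the nodes they name."
  (let ((id->node (make-hash-table :test #'equal)))
    ;; first pass: index every node by its id
    (dolist (node nodes)
      (setf (gethash (ref-node-id node) id->node) node))
    ;; second pass: swap each id string for the referenced node
    (dolist (node nodes nodes)
      (let ((ref (ref-node-type node)))
        (when (stringp ref)
          (setf (ref-node-type node)
                (or (gethash ref id->node)
                    (error "dangling reference ~S" ref))))))))
```

a slot holding several references would presumably split the attribute value on whitespace first and resolve each id the same way.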
(defclass gccxml-parser (flexml:flexml-builder)
  ((macro-name->macro-node :initform (make-hash-table :test #'equal)
                           :accessor macro-name->macro-node-of)
   (type->name->node :initform (make-hash-table :test #'eq)
                     :accessor type->name->node-of)
   (id->file-node :initform (make-hash-table :test #'equal)
                  :accessor id->file-node-of)
   (input-files :initform (make-hash-table :test #'eq)
                :accessor input-files-of)))

(defun make-gccxml-parser ()
  (make-instance 'gccxml-parser :default-package "GCCXML"))
(defclass gccxml:node (flexml:flexml-node)
  ((gccxml:file :initform nil
                :type flexml:cross-referenced-node
                :accessor gccxml:file-of)
   (gccxml:line :initform nil
                :type (or null integer)
                :accessor gccxml:line-of)
   (gccxml:context :initform nil
                   :type flexml:cross-referenced-node
                   :accessor gccxml:context-of)))

(defclass gccxml:node-with-name (gccxml:node)
  ((gccxml:name :initform nil :accessor gccxml:name-of)))

(defclass gccxml:node-with-type (gccxml:node)
  ((gccxml:type :type flexml:cross-referenced-node
                :accessor gccxml:type-of)))

(defclass gccxml:node-with-members (gccxml:node)
  ((gccxml:members :initform #()
                   :type flexml:cross-referenced-nodes
                   :accessor gccxml:members-of)))

(defclass gccxml:definition (gccxml:node-with-name)
  ())

(defclass gccxml:externable-node (gccxml:definition)
  ((gccxml:extern :type boolean :accessor gccxml:extern?)))
(macrolet ((define (&body entries)
             `(progn
                ,@(iter (for entry :in entries)
                        (destructuring-bind (name &optional (supers '(gccxml:node)) &body slots)
                            (ensure-list entry)
                          (collect `(defclass ,name ,supers
                                      (,@slots))))))))
  (define
    gccxml:gcc_xml
    (gccxml:namespace (gccxml:node-with-name gccxml:node-with-type gccxml:node-with-members))
    (gccxml:variable (gccxml:externable-node gccxml:node-with-type))
    (gccxml:function (gccxml:externable-node)
      (gccxml:returns :type flexml:cross-referenced-node :accessor gccxml:returns-of))
    (gccxml:argument (gccxml:node-with-name gccxml:node-with-type))
    gccxml:ellipsis
    (gccxml:enumeration (gccxml:definition))
    (gccxml:enumvalue (gccxml:node-with-name))
    (gccxml:struct (gccxml:definition gccxml:node-with-members)
      (gccxml:incomplete :initform nil :type boolean :accessor gccxml:incomplete?))
    (gccxml:union (gccxml:definition gccxml:node-with-members))
    (gccxml:typedef (gccxml:definition gccxml:node-with-type))
    (gccxml:fundamentaltype (gccxml:node-with-name))
    (gccxml:pointertype (gccxml:node-with-type))
    (gccxml:arraytype (gccxml:node-with-type))
    (gccxml:functiontype (gccxml:definition)
      (gccxml:returns :type flexml:cross-referenced-node))
    (gccxml:cvqualifiedtype (gccxml:node-with-type))
    gccxml:referencetype
    (gccxml:field (gccxml:node-with-name gccxml:node-with-type)
      (gccxml:bits :initform nil :type (or null integer) :accessor gccxml:bits-of)
      (gccxml:offset :type integer :accessor gccxml:offset-of))
    gccxml:constructor
    (gccxml:file (gccxml:node-with-name))
    (gccxml:macro (gccxml:definition)
      (gccxml:name :accessor gccxml:name-of)
      (gccxml:arguments :initform nil :accessor gccxml:arguments-of)
      (gccxml:body :accessor gccxml:body-of)
      (gccxml:raw-body :accessor gccxml:raw-body-of))))
(defun parse-gccxml-output (gccxml-file &optional macros-file)
  (bind ((*parser* (make-gccxml-parser)))
    (cxml:parse gccxml-file *parser*)
    (when macros-file
      (parse-macro-definitions macros-file))
    *parser*))
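to give an idea of what "used in multiple dispatch" can look like once the nodes are clos objects, here is a self-contained sketch. the classes and the emit-definition generic function are hypothetical stand-ins, not verrazano code; the point is only that a method is selected on both the node class and a second argument.

```lisp
;; hypothetical stand-ins for parsed node classes
(defclass sketch-definition ()
  ((name :initarg :name :reader definition-name)))
(defclass sketch-struct (sketch-definition) ())
(defclass sketch-typedef (sketch-definition) ())

;; hypothetical output backends
(defclass sketch-backend () ())
(defclass sketch-cffi-backend (sketch-backend) ())

(defgeneric emit-definition (node backend)
  (:documentation "Return a form describing NODE for BACKEND."))

(defmethod emit-definition ((node sketch-definition) (backend sketch-backend))
  ;; fallback for node/backend pairs without a more specific method
  (list :unsupported (definition-name node)))

(defmethod emit-definition ((node sketch-struct) (backend sketch-cffi-backend))
  ;; chosen only when both arguments match
  (list 'defcstruct (definition-name node)))
```

adding support for a new node type or a new backend then means adding methods, without touching the parser.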
Hi,
Quoting Attila Lendvai (attila.lendvai@gmail.com):
> as a followup, please find the current version attached. unfortunately i was drawn away from this code, but it works well when working with gccxml outputs.
that sounds interesting.
However, these days I try to package up extensions to cxml as separate projects. My suggestion would be to do the same for your code.
(For example, cxml-stp, cxml-rng, Plexippus XPath, and Xuriella XSLT are strictly add-ons to cxml. In addition, projects like Closure Common and Closure HTML were originally a part of Closure and are now maintained separately.)
I am still undecided whether it would be a good idea to go further and actually split up the existing cxml code base. The DOM implementation would be a good candidate.
For now, I will probably not split up cxml, but also not make it much larger.
d.
>> as a followup, please find the current version attached. unfortunately i was drawn away from this code, but it works well when working with gccxml outputs.
> that sounds interesting.
> However, these days I try to package up extensions to cxml as separate projects. My suggestion would be to do the same for your code.
i'd also package it as a standalone asdf system, making it a plugin to cxml. but even though it's a standalone lib, i'd also put it in the cxml repo, under let's say a plugins/ directory to minimize the effort needed by users to get a whole package.
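for illustration, a standalone system definition living in such a directory could be as small as the sketch below. the path, the file names and the system name are assumptions, nothing like this exists in the cxml repo.

```lisp
;;;; plugins/flexml/flexml.asd (hypothetical path and file names)
(asdf:defsystem :flexml
  :description "CLOS-based document builder for cxml (sketch)."
  :depends-on (:cxml)
  :components ((:file "package")
               (:file "flexml" :depends-on ("package"))))
```

users who only want cxml would never load it, and the rest would get it from the same checkout.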
imho, having it in a random unadvertised repo maintained by someone hardly related to cxml is not much better (or maybe even worse) than sending the code to the mailing list, which i've already done.
anyways, the code is archived here now, so i'm a bit less concerned about my laptop burning to dust... :)
Quoting Attila Lendvai (attila.lendvai@gmail.com):
> imho, having it in a random unadvertised repo maintained by someone hardly related to cxml is not much better (or maybe even worse) than sending the code to the mailing list, which i've already done.
The "Add-on features" section on the homepage is my idea of making sure those projects aren't left unadvertised.
I can add flexml if there's a stable URL for a flexml project.
Requirements for an entry in that list include:
- well-maintained stand-alone project somewhere (c-l.net or elsewhere)
- cliki entry, plus support for asdf-install
- at least some documentation
- works with the latest cxml release
For example, Plexippus and Xuriella are mentioned only in cxml CVS, not the real homepage, because at this point they don't fulfill all those criteria yet. Once we have well-documented releases for those, the homepage will also mention them.
d.
> The "Add-on features" section on the homepage is my idea of making sure those projects aren't left unadvertised.
> I can add flexml if there's a stable URL for a flexml project.
> Requirements for an entry in that list include:
> - well-maintained stand-alone project somewhere (c-l.net or elsewhere)
> - cliki entry, plus support for asdf-install
> - at least some documentation
> - works with the latest cxml release
i'm sorry but i can't fulfill these requirements. i consider it too much trouble for 4 pages of code and i already have too many projects to maintain.
if i were you i'd open a directory called "handlers" or "document-models" and collect things like flexml there in their separate directories but in the same repository as cxml. it has the advantage of locality of related code, single point of update and a packaged whole with all the options (imho, these all help the users).
but these are only some 0.02...