Hello,
I would like to ask whether, and how, a means of support for XML schemas may have been proposed for CXML.
Certainly, I could direct my inquiry towards a search across the cxml-devel list at clnet. I had thought that a more direct inquiry would be appropriate.
* Approaching an implementation
I am aware that CXML supports a representation of DTDs.
I intend to become more familiar with the code by which CXML provides support for DTD representation. I consider that I will have to, in order to continue with some intended projects.
I have been pressing along with some work on the Tioga project -- now having the documentation done for the first, albeit trivial, release of the Tioga Auxiliary Library's tal-base system. Before releasing the documentation, I have been trying to get some things designed, for supporting how that documentation will be built -- across all the documentation in the project.
Shortly before beginning to write this message, I had not been sure whether I would be able to use CXML, immediately, in managing the system of documentary items for the Tioga project. It had appeared that it was going to result in a bunch of DTD-hacking -- at some point, hopefully, dovetailing with an application of CXML.
Presently, I have recalled a matter, upon which I had wanted to raise the above inquiry.
I do not mean to sound angered about SGML or XML. SGML and XML are the stuff of a big system, whether taken together or separately.
In all, XML is widely popular -- even at such as xml.house.gov, for instance, let alone in ebXML, WebDAV, and so on -- and SGML has been the object of quite some popularity and quite some work, in its time, as in HyTime, DSSSL, ISMID, and so on.
Upon the prospect of having an efficient Common Lisp programming system that supports XML -- namely, CXML -- there is one thing that I recall, now, that I can appreciate about it: I may be able to stop using the generally awkward (i.e. involved) DTD syntax, and not have to use the more awkward (i.e. involved, and not in CL code) XSD syntax, in order to represent structure in markup. We can use Lisp for it.
Consider how the following appears. It was an item, jotted down as if on a restaurant napkin, within a resource-references document being written for the Tioga project:
<!-- e.g. <seg><cvs-archive cvsroot=":pserver:common-lisp.net:/project/cxml/cvsroot" user="anonymous" password="anonymous" id="sccm.cvs.clnet.cxml"/></seg>
<seg><archive-component name="cxml"><archive><xref linkend="sccm.cvs.clnet.cxml"/></archive></archive-component></seg>
<!ELEMENT archive-component ((archive | xref)?)>
<!ATTLIST archive-component name CDATA #REQUIRED>
-->
(That may not be a valid DTD-segment, there; I'm still a bit rusty on DTD syntax. The segment would be applied as in extension onto the DocBook DTD.)
It was on looking at that text that I came to recall how much I would rather use CL for it, even for the initialization of DTD information.
* Constraining the Markup Definition - DTD or XSD
For representation of information apart from the markup-element names, attribute names, and content models -- e.g. representation of attribute value types more specific than 'CDATA' -- it may be done in a format compatible with a DTD, using <?processing instructions?> within the DTD markup. A more succinct representation of the information may be possible, however, if it were implemented onto XML schemas.
So, I thought I would inquire whether and how XML schemas may have been proposed to be supported in CXML.
If there is existing work about such, and if there are proposals toward how it would be approached, I should be glad to direct my attention to it.
If there is no existing work about such, I would propose that I can try to "take the matter on".
* Initial proposals for a design of XSD support onto CXML
** Shared Functionality - DTDs and XSDs, as documentary schemas
I would propose that the CXML DTD support code be reviewed, to determine what degree of its functionality could be shared between DTD support and XSD support.
** Operations in the Parser
In the operations of the XSD parser, there may be applied some operations for a type-actuated value-translation mechanism, e.g.
UNMARSHAL-VALUE TYPE IN-SUBSTRATE [generic function]
MARSHAL-VALUE TYPE OUT-SUBSTRATE IN-SUBSTRATE [generic function]
Regarding UNMARSHAL-VALUE, in the case of an XSD parser:
- the TYPE would be an object representative of a type -- a class metaobject, if not such as a CMUCL/SBCL CTYPE object. (A CTYPE-driven approach would be implementation-specific; I am not aware whether or how the code for it may be ported to other implementations. I do not know how any implementations besides CMUCL and SBCL would approach 'type handling' and 'type representation' at any level finer than that of a CLASS metaobject. Regardless of whether it would be implementation-specific, it would be very convenient, w.r.t. type translation, to use CTYPE classes as specializers.)
The TYPE value would indicate the type of the object that must be initialized by the method. (Some extension onto MOP method specialization may be approached for some optimization -- to ensure that the method's return-value type, and the return-value type of the effective method ending at that method, would be denoted to the compiler as being of the same type as the TYPE argument; this optimization would require that only a class-typed specialization of methods would be supported on the generic function. I would propose this optimization as a "feasible, though not directly necessary" kind of 'milestone' step, as in a project 'roadmap'.)
- the IN-SUBSTRATE value, I suspect, would be either:
(1, w.r.t. a SAX approach) a stream, with the cursor positioned somewhere about the input that would have resulted in a certain SAX event
(2, w.r.t. a DOM approach) an instance of a class in the DOM type system; an object that would be representative of an XML element.
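A minimal sketch of how the two generic functions might be declared, offered only as an illustration. The function names and argument lists are taken from the proposal; the two methods are assumptions made for the example, in which a plain string stands in for a DOM node and the TYPE is passed as a class metaobject.

```lisp
;; Hypothetical sketch. Only the two generic-function names and their
;; argument lists come from the proposal; the methods are assumptions.

(defgeneric unmarshal-value (type in-substrate)
  (:documentation
   "Read IN-SUBSTRATE and return a Lisp object of the kind named by TYPE."))

(defgeneric marshal-value (type out-substrate in-substrate)
  (:documentation
   "Write the Lisp object IN-SUBSTRATE onto OUT-SUBSTRATE, as directed by TYPE."))

;; TYPE is passed as a class metaobject, so the methods dispatch on it
;; with EQL specializers. A string stands in for a DOM node here.
(defmethod unmarshal-value ((type (eql (find-class 'integer)))
                            (in-substrate string))
  (parse-integer in-substrate))

(defmethod marshal-value ((type (eql (find-class 'integer)))
                          (out-substrate stream)
                          (in-substrate integer))
  (format out-substrate "~D" in-substrate))
```

With these definitions, (unmarshal-value (find-class 'integer) "42") reads the text as an integer, and MARSHAL-VALUE writes an integer back out to a stream. A CTYPE-based TYPE argument would need a different dispatch scheme, which this sketch does not attempt.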
*** DOM or SAX? (Proposed: DOM)
A directly SAX-driven approach may be appropriate, there.
I consider that a DOM-driven approach may be the more appropriate. I consider that it would be easier to make a DOM-driven approach, and easier to make that approach in parallel to other systems that may use a DOM-driven mechanism on the same input information.
In the case of an approach directly utilizing the SAX API, the input infoset -- the in-substrate -- would be a stream, parsed across once (and it would not be determinable until "late" whether the infoset is valid and well-formed). Then, the in-substrate object would still be available, as a stream (whether or not that stream would support repositioning of the stream cursor -- as a socket stream may not, and in "Linux space", would not).
In the case of a DOM-driven approach, the input infoset would be parsed into a DOM representation -- ensured, then, to be well-formed (and valid, if a validating parser was used), if it may even be represented in a DOM node-tree. To the UNMARSHAL-VALUE method, the IN-SUBSTRATE would then be a DOM object.
After processing in UNMARSHAL-VALUE, the DOM representation of the initial infoset may then be disregarded, or may be retained for other uses; perhaps the original DOM information may be processed for representation of that same information onto a CLIM pane.
I would propose that an XSD parser in CXML would be addressed onto a DOM-driven approach.
*** UNMARSHAL-VALUE
In the case of the UNMARSHAL-VALUE operation, the TYPE value would be representative of something about the type (and element-name) of the "target DOM node". The IN-SUBSTRATE would be something similar to what would have been produced of MARSHAL-VALUE (e.g. a class, or a CTYPE). The OUT-SUBSTRATE would be the DOM node supposed to contain the generated DOM node, i.e. the DOM node that UNMARSHAL-VALUE would generate.
Barring a non-local return from the method, the process calling the UNMARSHAL-VALUE method would be responsible for "linking" the resulting DOM node into the containing node. This would involve a step outside of the UNMARSHAL-VALUE method; it should result in a greater modularization of the code.
*** MARSHAL-VALUE
I would approach the implementation of MARSHAL-VALUE as a later 'milestone'.
*** The calling process
What would call UNMARSHAL-VALUE, in the unmarshaling of an XML schema object: another UNMARSHAL-VALUE method, specialized on an object that would represent the 'root document' of an infoset, and a TYPE representative of a "container" for the XSD information, such as an XML-SCHEMA object.
What would call MARSHAL-VALUE, in the marshaling of CL information onto an XML schema object: another MARSHAL-VALUE method, with the following arguments:
- IN-SUBSTRATE representing a container of XSD information
- OUT-SUBSTRATE representing either a DOM node, a stream, or a pathname; the DOM-area specialization would be the first I'd suggest to take
- TYPE object being NULL; the type information for operations of methods resulting from the call should be determinable from the IN-SUBSTRATE.
*** regarding the TYPE argument in MARSHAL-VALUE
Retaining the TYPE value as an argument to MARSHAL-VALUE would serve to retain some consistency if the method would be specialized onto CFFI system objects.
**** Example onto CFFI
A CL INTEGER value may be represented as a value written into a buffer of any given width at or beyond the integer-length of the value -- with allowance for the encoding of a negative value. To export an INTEGER onto a raw, malloc'd memory block, it would be necessary to specify a numeric type for the output, in order to ensure that the value could be correctly marshaled (i.e. encoded onto the external substrate).
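To illustrate why the width must be specified, here is a small sketch in portable CL. It does not use CFFI itself; an octet vector stands in for the malloc'd block, and the function name ENCODE-INTEGER and the little-endian byte order are assumptions made for the example.

```lisp
;; Hypothetical helper: an octet vector stands in for foreign memory.
(defun encode-integer (value width-in-octets)
  "Return VALUE as a little-endian, two's-complement octet vector
of WIDTH-IN-OCTETS octets, or signal an error if it does not fit."
  (let ((bits (* 8 width-in-octets)))
    ;; The width is the 'numeric type' of the output: the same integer
    ;; may fit one width and overflow another.
    (unless (typep value `(signed-byte ,bits))
      (error "~D does not fit in ~D bits." value bits))
    (let ((buffer (make-array width-in-octets
                              :element-type '(unsigned-byte 8))))
      ;; LDB treats a negative integer as an infinite two's-complement
      ;; bit string, so the sign is encoded without special handling.
      (dotimes (i width-in-octets buffer)
        (setf (aref buffer i) (ldb (byte 8 (* 8 i)) value))))))
```

For instance, (encode-integer 300 2) succeeds, while (encode-integer 300 1) signals an error, since 300 cannot be held in a signed eight-bit field.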
**** Example onto CXML
An XML schema may include any number of type definitions, and may depend on any number of type definitions from another schema.
For example: A US citizen's SSN may be represented as a nine-digit value. That nine-digit value may be represented -- typically -- in a conventional CL environment, using a (VECTOR (UNSIGNED-BYTE 4) 9).
Typically, in an XML schema, that nine-digit value may be represented as an object of type 'SSN'.
An object may be initialized so as to represent an XSD-defined type 'SSN'. Given a mechanism for it, that SSN type may be mapped, explicitly, onto the type (VECTOR (UNSIGNED-BYTE 4) 9). Such a mechanism may be implemented directly onto an XML schema, but would have to be operable without requiring modification of an XML schema.
Perhaps the XSD-to-CL type-translation mechanism would be sufficiently operable, when operating in an automated manner. Some specialization might still be appropriate.
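The SSN mapping might be sketched as follows, with the translation table kept outside of the schema so that the schema itself is not modified. Every name here (SSN-DIGITS, PARSE-SSN, *TYPE-TRANSLATIONS*, TRANSLATE-VALUE) is hypothetical, as is the assumption that the lexical form may contain hyphens.

```lisp
;; Hypothetical names throughout; nothing here is part of CXML.
(deftype ssn-digits ()
  '(vector (unsigned-byte 4) 9))

(defun parse-ssn (lexical)
  "Convert text such as \"123-45-6789\" into a vector of nine digits."
  (let* ((chars (remove #\- lexical))
         (digits (map 'list #'digit-char-p chars)))
    ;; DIGIT-CHAR-P returns NIL for a non-digit character.
    (unless (and (= (length digits) 9) (every #'identity digits))
      (error "~S is not a nine-digit value." lexical))
    (make-array 9 :element-type '(unsigned-byte 4)
                  :initial-contents digits)))

(defvar *type-translations* (make-hash-table :test #'equal)
  "Maps an XSD type name to a function from lexical text to a Lisp value.")

(setf (gethash "SSN" *type-translations*) #'parse-ssn)

(defun translate-value (xsd-type-name lexical)
  "Apply the registered translation, or return LEXICAL unchanged."
  (let ((translator (gethash xsd-type-name *type-translations*)))
    (if translator
        (funcall translator lexical)
        lexical)))
```

Types with no registered translation fall through as text, which is one way that an automated default and an explicit specialization could coexist.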
** XML-SCHEMA, slot TYPES ; class XML-SCHEMA-TYPE
To represent a schema's contained body of type definitions within a single unit, it may be approached with an index initialized into an XML-SCHEMA instance, that index containing a set of values all of the same type -- perhaps, of a class XML-SCHEMA-TYPE.
Given an appropriate mechanism for it, such an index may be defined directly onto a VECTOR-typed object, with appropriate key functions being cached in the thing.
** Class XML-SCHEMA-TYPE
An XML-SCHEMA-TYPE object would contain information representative of:
1) the schema-local representation of the value -- supportive of marshaling of the XML-SCHEMA-TYPE onto an XSD document
2) the CL-local representation of the value -- supportive of unmarshaling of an XML node in a document using the associated schema.
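As a rough sketch of the two classes: the class names and the TYPES slot follow the proposal, while the remaining slot names, the accessors, and the two helper functions are assumptions. The index is a plain adjustable vector searched with a key function; no caching of key functions is attempted here.

```lisp
;; Class names and the TYPES slot follow the proposal; the other slot
;; names, the readers, and the helper functions are hypothetical.
(defclass xml-schema-type ()
  ((schema-name :initarg :schema-name :reader schema-type-name
                :documentation "Name of the type as written in the XSD.")
   (lisp-type :initarg :lisp-type :reader schema-type-lisp-type
              :documentation "CL type specifier used when unmarshaling.")))

(defclass xml-schema ()
  ((types :initform (make-array 0 :adjustable t :fill-pointer t)
          :reader schema-types
          :documentation "Vector of XML-SCHEMA-TYPE objects.")))

(defun add-schema-type (schema type)
  "Append TYPE to the index of SCHEMA and return TYPE."
  (vector-push-extend type (schema-types schema))
  type)

(defun find-schema-type (schema name)
  "Return the type named NAME in SCHEMA, or NIL if there is none."
  (find name (schema-types schema)
        :key #'schema-type-name :test #'string=))
```

A hash table keyed on the type name would be the obvious alternative once a schema holds many types.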
** Questions
1) How would a document and a schema be associated, in the CL environment?
Each document is mapped to zero or one DTDs.
Each *element* in a document *may* be mapped to one schema. (NB: I'm not sure if that's officially in the XSD spec, but it would be feasible. It would require some sort of conventional approach for the identification of the schema that would be intended to be assigned to a node; one could propose an attribute such as foo:schema for it. One could use any namespace, in the development of the proposal. One would have to require that the element containing the foo:schema attribute be valid in the identified schema; something would have to be done, in regards to namespaces, to ensure that the element would also be valid within the schema for the node containing the element.)
(You know, I've tried to use Trac for project whiteboarding. I was thrown by the syntax used in the Trac wiki pages; it doesn't appear to support HTML markup in the wiki pages, either. I stopped short of requesting that the CLNet maintainers consider providing a Cliki instance for each project; perhaps they would consider it a viable proposal, but I have not wanted to increase their workload at all.)
I think there's an XML processing instruction, specifically applicable for mapping an XML document to an XSD schema -- similar to the DTD declaration, though using a different syntax, like an <?xfoo whatfoo="URI"?> PI.
** Possible adaptation on a ??? class -- slot SCHEMA
A ??? class would have to be modified so as to contain a slot SCHEMA.
slot value type for the SCHEMA slot: (OR NULL XML-SCHEMA)
initial form: NIL
** Possible Adaptation on a DOM Parser
To handle an XML schema on a document -- to associate an XML schema with an object representing the document, or something within the document -- a DOM parser would have to watch for that <?xfoo?> XML-schema PI.
One could specialize a method on XML processing-instruction DOM objects; one could check the name of the processing instruction, then dispatch when the name matches the name of the <?xfoo?> XML-schema PI.
At that point, the XML schema would have to be already initialized -- then retrieved -- or would have to be newly initialized, if the schema identified by whatfoo="URI" would be available.
*** Exceptional Situation
If the whatfoo="URI" identified schema would *not* be available, a condition should be signaled for it -- just a type of CONDITION.
The document must still be parsed, though it could not be validated.
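One way this might look in CL, as a sketch only: a condition that is not an ERROR, signaled with SIGNAL, lets the caller take note of the missing schema while the parse carries on. The names SCHEMA-UNAVAILABLE and RESOLVE-SCHEMA are hypothetical, and an association list stands in for whatever index of schemas would really be used.

```lisp
;; Hypothetical condition: a plain CONDITION rather than an ERROR, so
;; that an unhandled signal does not interrupt parsing.
(define-condition schema-unavailable (condition)
  ((uri :initarg :uri :reader schema-unavailable-uri))
  (:report (lambda (condition stream)
             (format stream "No XML schema is available for ~A."
                     (schema-unavailable-uri condition)))))

(defun resolve-schema (uri known-schemas)
  "Return the schema paired with URI in the alist KNOWN-SCHEMAS.
If there is none, signal SCHEMA-UNAVAILABLE and return NIL."
  (or (cdr (assoc uri known-schemas :test #'string=))
      (progn
        ;; SIGNAL returns if no handler transfers control, so the
        ;; caller goes on to parse without validation.
        (signal 'schema-unavailable :uri uri)
        nil)))
```

A caller that cares can wrap the parse in HANDLER-BIND to log or collect the unavailable URIs; a caller that does not care simply receives NIL and proceeds without validating.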
** Identification of XML schema objects - XML-SCHEMA, slot URI
# related items : PURI; object indexing; CXML XML catalogues API
An XML-SCHEMA object may be identified according to a URI.
The mechanism for that identification -- for indexing an XML-SCHEMA object by its identity, and retrieving an object by its identity (or triggering the initialization of a new object, then) -- should be approached in integration with the XML catalogues mechanism.
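The retrieve-or-initialize behaviour might be sketched as below. This assumes that URIs are compared as plain strings and leaves out both PURI and the XML catalogues integration; *SCHEMAS* and ENSURE-SCHEMA are hypothetical names.

```lisp
;; Hypothetical registry: URIs are compared as strings with EQUAL.
(defvar *schemas* (make-hash-table :test #'equal)
  "Maps a schema URI to the object that represents that schema.")

(defun ensure-schema (uri initializer)
  "Return the object indexed under URI. On the first request for URI,
call INITIALIZER with URI and index the object that it returns."
  (multiple-value-bind (schema foundp) (gethash uri *schemas*)
    (if foundp
        schema
        (setf (gethash uri *schemas*) (funcall initializer uri)))))
```

Comparing URIs as strings treats two spellings of the same resource as different schemas, which is one reason that a URI library and catalogue resolution would be wanted in a fuller implementation.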
* Onto Implementation
I'd like to approach this as a prototype in the TAL codebase of the Tioga project. I would propose to approach it so, in order to ensure:
1) that I would use the TAL-base system with it, with which I am familiar
2) that I would be able to implement it without requiring any more work from the project administrators on the CXML project
3) that it would be integral with the documentary system proposed (*cough*) and being developed onto the Tioga projects
Of the said documentary system, the design of it is the last hangup before I may make a first release of the TAL codebase. Without a sufficient mechanism for processing the documentation and presenting it in HTML form, the release would be incomplete.
* Onto Conclusion
At that point, I am reminded of why I had wanted to address the initial inquiry, regarding support for XML schemas in CXML. I want to extend the DocBook DTD; I intend to make the extension, firstly, with Common Lisp -- then generating any DTD contents, in the end.
If the code I propose to implement for it would require an application of code that I have not released, I must take the implementation as being, effectively, inoperable.
I cannot hold up a release over the documentation, and then over this item. Approaching this item would, however, serve to support development of the system by which the documentation would be handled.
I will have to take the text, above, as it constituting a plain/text edition, as a first draft towards some items of reference documentation.
To approach it into implementation, I will have to have migrated some measure more of my own code into the TAL archives -- code that I am familiar with, and which I have tested and know the intended efficiency of, besides that it may require some minor refactoring and cleanup, before the source will be checked into the project's source archive.
Upon doing so, I must document each item now, while my attention is directed to it. I can glue the reference documents together later -- I will be producing one reference document per distinct system definition. Each system definition may be associated with a roadmap about milestones proposed for the implementation of the system, furthermore. Then, there will be a body of resource reference pages, a body of glossary entry pages, a body of bibliographic entries, and I may extract the code-item refentry pages into individual files.
I should address the above proposal about an XML-schema system as being the body of a system that would be named tdoc-xml-xsd.
I will have to make an index-reference pointing to the documentation that would be made about it.
I can send across what will be the Arch archive identifier for it, and what information would be necessary for making sure that one can access the archives. (There's a point in regards to GPG signing on the archives, requiring that a certain script be available in one's Arch configuration, with some range as for how that would be approached -- side-barred with a point about something called 'agpg' and the 'quintuple-agent' system containing it. Then, there's a point in regards to available interfaces on Arch -- e.g. xtla, xetla -- and a sidebar about key management, e.g. via Seahorse, and there's the clnet keyring, which one should want to have imported into one's GPG configuration in order to access the archives.)
That documentation should include explanation of each of those points, about how one would access and use the archives to the Tioga project -- using Arch, using some stuff in regards to GPG signing on the archives, and using probably XTLA or XETLA as an interface. Once those tools are installed and configured, then they are readily usable.
I will transpose the above design proposals into some DocBook SGML. The documents can be converted to XML, with no lossage in the conversion -- no SGML-specific features being used in the documents. As being SGML, the documents may be processed without issue, using the DSSSL stylesheets for DocBook.
After the material denoted above is finally transposed into SGML, I should want to put the material onto a side shelf; I should then move in the code that I would intend to use for it, finally checking that code into the TAL codebase. That would be a more viable approach than if I were to use source code from archives that I have not published, in the XSD implementation.
After that TAL-area material is imported into the TAL archives, I may side-shelve that material and return my attention to the XSD work.
At some point, I will have to make a roadmap-milestone for testing the work onto the W3C's test code.
I may apologize if I have not edited the above to enough of a state of "doneness". I consider that it may be to the interest of the CXML project, if I would mention the design proposed for it, openly.
I regard the proposal, above, as it being material like onto a whiteboard. It is material about to be drafted into a working document. I consider that I have figured out some of how I may approach the matter. I should welcome response about the proposals.
If there is work already proposed about XSD support in CXML, I hope to denote: I do not intend that this would seem as if it had run over that work. I would be glad to hear of it, in consideration towards the design of an approach for it.
In regards to how the tdoc-xml-xsd system would be made most finally available, I propose to develop it in the TDoc codebase. In so doing, I will be able to develop it, using the tools that I will be using onto other Tioga projects. It could be made available from within that codebase, as a stand-alone distribution -- asdf-installable, if not debian-packaged.
Upon all consideration of the licensing terms about the item -- Franz LLGPL, namely -- it may be mirrored from within that codebase, furthermore.
I've an errand to follow-up about, before I will continue with the above.
Good evening
-- Sean Champ