*** Rationale ***
§1. A significant problem I perceive with CL-JSON in its present state is that the operations ENCODE-JSON and DECODE-JSON are not reasonably inverse. E.g.:
(ENCODE-JSON-TO-STRING (DECODE-JSON-FROM-STRING "{\"foo\":null,\"bar\":null}")) => "[[\"foo\"],[\"bar\"]]"
and
(ENCODE-JSON-TO-STRING (DECODE-JSON-FROM-STRING "{}")) => "null"
§2. This is the most painful kind of example, because a JSON object with even a single non-null field results in a proper value. Thus, the type of a JSON value emitted by an application that uses ENCODE-JSON can depend not only on the type of the source Lisp object but on its internal structure as well. This is not to say that ENCODE-JSON fusing Lisp vectors with lists, and symbols and characters with strings, or DECODE-JSON fusing false with null, is particularly pleasant; still, in these cases we are able to predict typing across the JSON interface. Note, however, that
(ENCODE-JSON-TO-STRING (DECODE-JSON-FROM-STRING "[]"))
would also yield "null". This has to be dealt with as well.
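Both collapses follow from how alists are built. A minimal sketch in plain Common Lisp, without CL-JSON, assuming (as described above) that null decodes to NIL and arrays decode to lists; the variable names are made up for illustration:

```lisp
;; An object whose fields are all null decodes to an alist of (name . NIL) pairs.
(defparameter *object-alist* (list (cons "foo" nil) (cons "bar" nil)))

;; An array of one-element arrays decodes to a list of one-element lists.
(defparameter *array-of-arrays* (list (list "foo") (list "bar")))

;; (name . NIL) and (name) are the same cons structure, so an encoder that
;; sees only the Lisp value cannot tell which JSON value produced it.
(defparameter *same-structure-p* (equal *object-alist* *array-of-arrays*))

;; The empty alist and the empty list are both NIL, as is the decoded null.
(defparameter *empty-is-nil-p* (null (list)))
```

Both flags evaluate to T, which is why the encoder has no way to recover the original JSON type.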
§3. The root of the problem is the design decision of representing JSON objects as Lisp alists, and the solution I propose is to revert that decision and represent JSON objects as CLOS objects. This is quite feasible if we make use of meta-object protocol capabilities implemented in a more or less standardized way across most contemporary Common Lisps. In the simple form, decoding a well-formed JSON object should result in the creation of an anonymous class (i.e. an instance of STANDARD-CLASS) with such slots as found in the JSON object. Next, we create an instance of the anonymous class and populate its slots. Conversely, encoding a CLOS object Z should be done by iterating over the value of (CLASS-SLOTS (CLASS-OF Z)) and emitting a name:value pair for each slot bound in Z.
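The simple form described in §3 can be sketched roughly as follows. This is not the code from the patches: the function names are hypothetical, the input is assumed to be an alist of already interned field names, and the MOP symbols are taken from SBCL's SB-MOP package (other implementations export them from a different package).

```lisp
;; Sketch only; SBCL-specific because of the SB-MOP package prefix.
(defun fields-to-anonymous-instance (fields)
  "FIELDS is an alist of (symbol . value). Return an instance of a fresh
anonymous class that has one slot per field."
  (let* ((class (make-instance
                 'standard-class
                 :direct-superclasses (list (find-class 'standard-object))
                 ;; Each direct slot is given as a canonicalized slot
                 ;; specification, i.e. a plist with at least :NAME.
                 :direct-slots (mapcar (lambda (field)
                                         (list :name (car field)))
                                       fields)))
         (instance (make-instance class)))
    (loop for (name . value) in fields
          do (setf (slot-value instance name) value))
    instance))

(defun bound-slots-alist (object)
  "Return (name . value) for every slot that is bound in OBJECT, which is
what an encoder would emit as name:value pairs."
  (loop for slot in (sb-mop:class-slots (class-of object))
        for name = (sb-mop:slot-definition-name slot)
        when (slot-boundp object name)
          collect (cons name (slot-value object name))))
```

With this representation an object with no fields is still an instance (of a class with no slots) rather than NIL, and a slot bound to NIL is still a slot, so neither case can be confused with null or with an array.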
§4. Attached herewith is a series of patches implementing the idea. However, I have ventured a step further by allowing JSON objects to include metadata which specify that CLOS objects are to belong to (or else inherit from) non-anonymous CLOS classes. The metadata are passed as the field of the object whose name is ``prototype''. (More accurately, the name is determined by the symbol which is the value of the special variable *PROTOTYPE-NAME*: the value (FUNCALL *SYMBOL-TO-STRING-FN* *PROTOTYPE-NAME*) must be STRING= to the name of the field. I have judged the default value 'PROTOTYPE more or less developer-friendly, as the field ``prototype'' in JavaScript objects is already reserved for special purposes and so is unlikely to conflict with anyone's user-level naming schemes. Below, I invariably call this field the prototype field.) The value of the prototype field should be an object, called the prototype, in which the following fields are meaningful: "lispClass":C, "lispSuperclasses":[C1,...,Cn] and "lispPackage":P (all other fields are ignored). The following rules are employed when decoding a JSON object X to a CLOS object Z:
(D1) If the prototype of X has a "lispClass":C field, Z shall be an instance of the class identified by the name C, and all fields of X which have no correspondence among the slots of the class C are discarded.
(D2) If the prototype of X has a "lispSuperclasses":[C1,...,Cn] field, Z shall be an instance of an anonymous class C whose superclasses are identified by the names C1,...,Cn. The fields of X which have no correspondence among any of the slots of the classes C1,...,Cn are defined in C as direct slots. As a special case, if n=1 and all of X's fields (omitting the prototype) are defined as slots in C1 then Z shall be an instance of C1.
(D3) If the prototype has both a "lispClass":C and a "lispSuperclasses":[C1,...,Cn] field then the rule D1 applies and the latter field is ignored.
(D4) If the prototype has a "lispPackage":P field, then the names of the classes in both other fields and the names of the fields in X are interned in the package specified by P instead of the default package KEYWORD. Of course, all names (including P itself) are converted from camel case before using them in Lisp.
(D5) If the class of the resulting object Z provides for a slot whose name is the value of the special variable *PROTOTYPE-NAME* (JSON::PROTOTYPE by default) then that slot shall be bound to the object which internally represents the prototype of X (an instance of the class JSON::PROTOTYPE, q.v. the code). NB: this slot is never created by the decoder on its own authority but always inherited from the class or superclasses specified.
(D6) If the prototype has a "lispClass":"cons" field and such "lispPackage":P field that the interned class name is COMMON-LISP:CONS, the rule D1 does not apply. Instead, Z shall be an alist whose conses' cars are the names and whose conses' cdrs are the respective values of the fields in X. The field names are interned in P, and the prototype field itself is omitted.
(D7) If the prototype has a "lispClass":"hashTable" field and such "lispPackage":P field that the interned class name is COMMON-LISP:HASH-TABLE, the rule D1 does not apply. Instead, Z shall be a hash table whose keys are the names and whose values are the respective values of the fields in X. The field names are interned in P, and the prototype field itself is omitted.
(D8) The value of null for any of the three fields of a prototype is equivalent to the field being absent. X lacking or having a null prototype is equivalent to the prototype having all null fields.
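The precedence among these rules can be summarised as a single dispatch. The sketch below is only an illustration of the ordering, not the decoder from the patches: the function name and the returned keywords are hypothetical, and the arguments are assumed to be the prototype's class names after they have been converted from camel case and interned per D4 (NIL when a field is absent or null, per D8).

```lisp
(defun decoding-strategy (class-symbol superclass-symbols)
  "Say which kind of Lisp value rules D1-D8 call for. CLASS-SYMBOL is the
interned lispClass name or NIL; SUPERCLASS-SYMBOLS is the list of interned
lispSuperclasses names or NIL."
  (cond ((eq class-symbol 'cl:cons) :alist)                 ; D6 overrides D1
        ((eq class-symbol 'cl:hash-table) :hash-table)      ; D7 overrides D1
        (class-symbol :instance-of-named-class)             ; D1; D3 if both given
        (superclass-symbols :instance-of-anonymous-subclass) ; D2
        (t :instance-of-anonymous-class)))                  ; D8: no usable prototype
```

The special case at the end of D2 (a single superclass that already defines every field) is left out; it would refine the :INSTANCE-OF-ANONYMOUS-SUBCLASS branch once the fields of X are known.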
§5. Conversely, when a CLOS object Z is encoded, the encoded JSON object X shall include a prototype reconstructed from Z per the following rules:
(E1) If the class of Z has a name, the prototype shall have a "lispClass":C field, where C is that name (it is assumed here and below that all names are converted to camel case).
(E2) If the class of Z does not have a name, the prototype shall have a "lispSuperclasses":[C1,...,Cn] field, where C1,...,Cn are the names of that class's superclasses.
(E3) If Z is an alist, the prototype shall have a "lispClass":"cons" field.
(E4) If Z is a hash table, the prototype shall have a "lispClass":"hashTable" field.
(E5) The prototype shall have a "lispPackage":P field, where P is the name of a package such that any one of the names C or C1,...,Cn, and any one of the names of the slots bound in Z, is a symbol (either direct or inherited) in the package. (Mutatis mutandis if Z is an alist or a hash table.) If there is no such package, P shall be an unspecific package name, and the program shall signal a warning condition.
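Rules D4 and E1 rely on converting names between camel case and Lisp style. A deliberately simplified sketch of such a conversion is given below; the function names are hypothetical and CL-JSON's own converters may treat digits, underscores and runs of capitals differently. It only covers names like those used in this message ("lispClass", "hashTable", "standardObject").

```lisp
(defun camel-case-to-lisp-name (string)
  "\"lispClass\" -> \"LISP-CLASS\": put a hyphen before each capital, then upcase."
  (with-output-to-string (out)
    (loop for char across string
          do (when (upper-case-p char)
               (write-char #\- out))
             (write-char (char-upcase char) out))))

(defun lisp-name-to-camel-case (string)
  "\"LISP-CLASS\" -> \"lispClass\": drop each hyphen and capitalize what follows."
  (with-output-to-string (out)
    (let ((capitalize-next nil))
      (loop for char across string
            do (cond ((char= char #\-)
                      (setf capitalize-next t))
                     (capitalize-next
                      (write-char (char-upcase char) out)
                      (setf capitalize-next nil))
                     (t
                      (write-char (char-downcase char) out)))))))
```

For names of this shape the two functions are inverse to each other, which is what lets a class name written by the encoder be found again by the decoder.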
§6. Below are some examples of the modified decoder / encoder. I have marked the printout with double bars, and the result with =>. This is from an SBCL session; I have also tested the examples in OpenMCL.
(IN-PACKAGE JSON) => #<PACKAGE "JSON">
(DECODE-JSON-FROM-STRING "{\"foo\":null,\"bar\":null}") => #<#<STANDARD-CLASS NIL {1268AC79}> {1268D979}>
(DESCRIBE *)
|| #<#<STANDARD-CLASS NIL {1268AC79}> {1268D979}>
|| is an instance of class #<STANDARD-CLASS NIL {1268AC79}>.
|| The following slots have :INSTANCE allocation:
|| BAR NIL
|| FOO NIL
(DESCRIBE (CLASS-OF **))
|| #<STANDARD-CLASS NIL {1268AC79}> is a class. It is an instance of
|| STANDARD-CLASS.
|| It has no name (the name is NIL).
|| The direct superclasses are: (STANDARD-OBJECT), and the direct subclasses
|| are: ().
|| The class is finalized; its class precedence list is:
|| (#<STANDARD-CLASS NIL {1268AC79}> STANDARD-OBJECT SB-PCL::SLOT-OBJECT T).
|| There are 0 methods specialized for this class.
(ENCODE-JSON-TO-STRING ***) => "{\"bar\":null,\"foo\":null,\"prototype\":{\"lispClass\":null,\"lispSuperclasses\":[\"standardObject\"],\"lispPackage\":\"json\"}}"
(DESCRIBE (DECODE-JSON-FROM-STRING "{}"))
|| #<STANDARD-OBJECT {11D01F71}>
|| is an instance of class #<STANDARD-CLASS STANDARD-OBJECT>.
(DEFSTRUCT FOO BAR BAZ) => FOO
(ENCODE-JSON-TO-STRING (MAKE-FOO :BAR 10 :BAZ 20)) => "{\"bar\":10,\"baz\":20,\"prototype\":{\"lispClass\":\"foo\",\"lispSuperclasses\":null,\"lispPackage\":\"json\"}}"
(DECODE-JSON-FROM-STRING *) => #S(FOO :BAR 10 :BAZ 20)
(DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"quux\":50,\"prototype\":{\"lispClass\":\"foo\",\"lispPackage\":\"json\"}}") => #S(FOO :BAR 10 :BAZ 20)
(DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"quux\":50,\"prototype\":{\"lispClass\":\"cons\",\"lispPackage\":\"json\"}}") => ((BAR . 10) (BAZ . 20) (QUUX . 50))
(MAPHASH (LAMBDA (K V) (PRINT K) (PRINT V)) (DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"quux\":50,\"prototype\":{\"lispClass\":\"hashTable\",\"lispPackage\":\"json\"}}"))
||
|| BAR
|| 10
|| BAZ
|| 20
|| QUUX
|| 50
=> NIL
§7. The following additional names are exported from the package JSON:
*PROTOTYPE-NAME* special variable (default: JSON::PROTOTYPE)
A symbol which determines the name of the prototype field in a JSON object and the name of a slot in a CLOS object which may hold prototype information. As a special case, if *PROTOTYPE-NAME* is NIL the encoder does not add a prototype field when encoding an object that lacks a prototype slot or key, and the decoder does not search for or treat in a special manner a prototype field in the input. The latter behaviour results in the ``simple semantics'' of §3:
(LET ((*PROTOTYPE-NAME* NIL)) (DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"prototype\":{\"lispClass\":\"foo\",\"lispPackage\":\"json\"}}")) => #<#<STANDARD-CLASS NIL {12524FA1}> {125290A9}>
(DESCRIBE *)
|| #<#<STANDARD-CLASS NIL {12524FA1}> {125290A9}>
|| is an instance of class #<STANDARD-CLASS NIL {12524FA1}>.
|| The following slots have :INSTANCE allocation:
|| BAR 10
|| BAZ 20
|| PROTOTYPE #<#<STANDARD-CLASS NIL {12521579}> {12524219}>
(DESCRIBE (SLOT-VALUE ** 'PROTOTYPE))
|| #<#<STANDARD-CLASS NIL {12521579}> {12524219}>
|| is an instance of class #<STANDARD-CLASS NIL {12521579}>.
|| The following slots have :INSTANCE allocation:
|| LISP-CLASS "foo"
|| LISP-PACKAGE "json"
*JSON-ARRAY-TYPE* special variable (default: VECTOR)
A type specifier which determines what type to coerce JSON arrays to. The default value has been chosen to cope with the kind of problem mentioned in §2. Thus, unless *JSON-ARRAY-TYPE* is set to LIST the value of (DECODE-JSON-FROM-STRING "[]") shall be #(), yielding "[]" when re-encoded.
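The effect of the array type can be illustrated with COERCE alone. This is a sketch, not the decoder's code: *ARRAY-TYPE-SKETCH* is a hypothetical stand-in for *JSON-ARRAY-TYPE*, and FINISH-ARRAY is a made-up name for the step that turns the decoded members into the final value.

```lisp
(defvar *array-type-sketch* 'vector
  "Hypothetical stand-in for *JSON-ARRAY-TYPE*.")

(defun finish-array (elements)
  "ELEMENTS is the list of already decoded array members. Coerce it to the
sequence type currently selected by *ARRAY-TYPE-SKETCH*."
  (coerce elements *array-type-sketch*))

;; With the default, an empty JSON array becomes the empty vector #(),
;; which is distinct from NIL and can be written back as "[]".
(defparameter *empty-as-vector* (finish-array '()))

;; With LIST, the empty array is NIL again and the ambiguity of §2 returns.
(defparameter *empty-as-list*
  (let ((*array-type-sketch* 'list))
    (finish-array '())))
```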
WITH-OLD-DECODER-SEMANTICS macro
Ensures backward compatibility with the old CL-JSON decoder. It can be called as (WITH-OLD-DECODER-SEMANTICS FORM*) where FORMs are an implicit PROGN. FORMs are executed in an environment where JSON objects are invariably decoded to alists, JSON arrays to lists, object keys interned in the package KEYWORD, and object fields called ``prototype'' (or whatever is specified by *PROTOTYPE-NAME*) receive no special treatment. E.g.:
(WITH-OLD-DECODER-SEMANTICS (DECODE-JSON-FROM-STRING "{\"bar\":10,\"baz\":20,\"prototype\":{\"lispSuperclasses\":[\"foo\",\"bar\"],\"lispPackage\":\"json\"}}")) => ((:BAR . 10) (:BAZ . 20) (:PROTOTYPE (:LISP-SUPERCLASSES "foo" "bar") (:LISP-PACKAGE . "json")))
*** Summary of patches ***
Sun Dec 23 18:10:00 MSK 2007 boris.smilga@gmail.com * Added CLOS / MOP infrastructure.
Sun Dec 23 18:13:34 MSK 2007 boris.smilga@gmail.com * Added decoding of objects to CLOS objects, hash tables or alists.
Sun Dec 23 18:16:10 MSK 2007 boris.smilga@gmail.com * Added decoding of arrays to vectors or lists.
Sun Dec 23 18:17:05 MSK 2007 boris.smilga@gmail.com * Added WITH-OLD-DECODER-SEMANTICS.
Sun Dec 23 18:17:44 MSK 2007 boris.smilga@gmail.com * Added encoding of CLOS objects, and prototypes for alists and hash tables.
This seems to be a very useful addition; I know there have been some people complaining on comp.lang.lisp that cl-json does not decode to objects. I will look at the patches and integrate them soon. I want to make sure, if possible, that it is backwards compatible and that there is an option to have it any way you want. I see that there is a backwards compatibility macro included, so I guess it is simple.
But anyway, I wanted to say thanks first before digging into the code, also for the great documentation!
/Henrik
About Boris's CLOS encoder/decoder patches: they are pushed to darcs, but need some more integration and testing. So the old json functionality is still the default in the codebase. I've added a TODO file about this.
/Henrik
Please find attached a patch bundle with testcases for CLOS semantics and prototypes.
I have added a score of new tests, and fixed some existing tests so that their results do not depend on the global (more precisely, dynamic) settings of the semantics variables, *JSON-SYMBOLS-PACKAGE* and *PROTOTYPE-NAME*. All tests now run OK (tested under both Clozure CL and SBCL on Darwin/PPC).
Sincerely, B. Smilga.