Hello all,
CL-JSON does not allow the user to customize the means used to decode the keys for object literals. It may be important to avoid interning in a web setting, for example, since interns of many unique symbols could potentially use a lot of memory. An attack could exploit this by submitting something that is passed through cl-json that has many very large, unique symbols.
There used to be a way to get around this with the factory method customization, but the current library does not include a means of changing the decoding behavior for a key to avoid interning it. Unless I am missing something, could this functionality be added?
Red
On 25 Jun 2009, at 22:27, Red Daly wrote:
CL-JSON does not allow the user to customize the means used to decode the keys for object literals. It may be important to avoid interning in a web setting, for example, since interns of many unique symbols could potentially use a lot of memory. An attack could exploit this by submitting something that is passed through cl-json that has many very large, unique symbols.
Indeed, thank you for pointing this out.
There used to be a way to get around this with the factory method customization, but the current library does not include a means of changing the decoding behavior for a key to avoid interning it. [...]
It is the same thing in the new version, except that customization works in a somewhat different way. Broadly speaking, you have to redefine the way a level of JSON Object structure is accumulated to form the corresponding Lisp structure. E. g., in the following example new (KEY . VALUE) pairs are clipped onto the end of a list accumulator, to form an alist:
(defvar *accumulator* nil) (defvar *accumulator-last* nil)
(defun init-accumulator () (setq *accumulator* (cons nil nil) *accumulator-last* *accumulator*))
(defun collect-key (key) (setq *accumulator-last* (setf (cdr *accumulator-last*) (cons (cons key nil) nil))))
(defun collect-value (value) (setf (cdar *accumulator-last*) value))
(defun accumulator-get-value () (cdr *accumulator*))
(json:bind-custom-vars (:beginning-of-object #'init-accumulator :object-key #'collect-key :object-value #'collect-value :end-of-object #'accumulator-get-value :object-scope '(*accumulator* *accumulator-last*)) (json:decode-json-from-string "{"foo": [{"bar": "xyzzy"}, {"baz": true}], "quux": 123}"))
=> (("foo" (("bar" . "xyzzy")) (("baz" . T))) ("quux" . 123))
[...] Unless I am missing something, could this functionality be added?
No problem, but could you maybe provide a sample of phantasy code which showed how this kind of customization interface should look from the user's side? With that, we'll be more able to meet your expectations. (I cannot, of course, guarantee that your interface shall be reproduced 100% exactly, rather it'll serve as a guideline.)
Yours, - B. Smilga.
I created this patch before I got your email. What my usage is now the following:
(let ((json:*json-identifier-name-to-lisp* 'identity) (json:*json-symbolize-lisp-key* 'identity)) (json:decode-json-from-string json))
The patch is attached:
I have not taken the time to read through how the accumulator works. This approach modifies the default behavior in a more fine-grained way.
On Thu, Jun 25, 2009 at 1:29 PM, Boris Smilga boris.smilga@gmail.comwrote:
On 25 Jun 2009, at 22:27, Red Daly wrote:
CL-JSON does not allow the user to customize the means used to decode the
keys for object literals. It may be important to avoid interning in a web setting, for example, since interns of many unique symbols could potentially use a lot of memory. An attack could exploit this by submitting something that is passed through cl-json that has many very large, unique symbols.
Indeed, thank you for pointing this out.
There used to be a way to get around this with the factory method
customization, but the current library does not include a means of changing the decoding behavior for a key to avoid interning it. [...]
It is the same thing in the new version, except that customization works in a somewhat different way. Broadly speaking, you have to redefine the way a level of JSON Object structure is accumulated to form the corresponding Lisp structure. E. g., in the following example new (KEY . VALUE) pairs are clipped onto the end of a list accumulator, to form an alist:
(defvar *accumulator* nil) (defvar *accumulator-last* nil)
(defun init-accumulator () (setq *accumulator* (cons nil nil) *accumulator-last* *accumulator*))
(defun collect-key (key) (setq *accumulator-last* (setf (cdr *accumulator-last*) (cons (cons key nil) nil))))
(defun collect-value (value) (setf (cdar *accumulator-last*) value))
(defun accumulator-get-value () (cdr *accumulator*))
(json:bind-custom-vars (:beginning-of-object #'init-accumulator :object-key #'collect-key :object-value #'collect-value :end-of-object #'accumulator-get-value :object-scope '(*accumulator* *accumulator-last*)) (json:decode-json-from-string "{"foo": [{"bar": "xyzzy"}, {"baz": true}], "quux": 123}"))
=> (("foo" (("bar" . "xyzzy")) (("baz" . T))) ("quux" . 123))
Thanks for an example of a custom accumulator modification. Just out of curiosity, have you seen the accumulator paradigm crop up in other contexts?
[...] Unless I am missing something, could this functionality be added?
No problem, but could you maybe provide a sample of phantasy code which showed how this kind of customization interface should look from the user's side? With that, we'll be more able to meet your expectations. (I cannot, of course, guarantee that your interface shall be reproduced 100% exactly, rather it'll serve as a guideline.)
It would be nice to have something like:
(let ((json:*json-symbolize-lisp-key* 'identity)) ...)
This is still a little confusing unless you are familiar with the *json-identifier-name-to-lisp* variable, in which case it makes a little more sense.
Yours,
- B. Smilga.
-Red
patch: diff -rN old-cl-json/src/common.lisp new-cl-json/src/common.lisp 94a95,98
(defvar *json-symbolize-lisp-key* 'json-intern "Designator for a function which, during decoding, maps the
*json-identifier-name-to-lisp*
-transformed key to the value it will have in the result object.")
\ No newline at end of file diff -rN old-cl-json/src/decoder.lisp new-cl-json/src/decoder.lisp 474c474 < (let ((key (json-intern (funcall *json-identifier-name-to-lisp* key)))) ---
(let ((key (funcall *json-identifier-name-to-lisp* key)))
604,605c604,605 < (string (json-intern < (funcall *json-identifier-name-to-lisp* value))) ---
(string (funcall *json-symbolize-lisp-key* (funcall *json-identifier-name-to-lisp* value)))
609c609 < collect (cons (json-intern key) value)))) ---
collect (cons (funcall *json-symbolize-lisp-key* key)
value)))) diff -rN old-cl-json/src/package.lisp new-cl-json/src/package.lisp 16a17
#:*json-symbolize-lisp-key*
On Thu, Jun 25, 2009 at 11:09 PM, Red Dalyreddaly@gmail.com wrote:
I created this patch before I got your email. What my usage is now the following:
(let ((json:*json-identifier-name-to-lisp* 'identity) (json:*json-symbolize-lisp-key* 'identity)) (json:decode-json-from-string json))
The patch is attached:
I agree with the problem and solution. Thanks for the patch.
I just want all code for features to have at least one testcase, so I'll apply it later when I have done one.
Thanks, Henrik
Just a minor cavil: does not *STRING-TO-KEY* (or maybe *IDENTIFIER- NAME-TO-KEY*) seem like a better name for this variable?
Rationale: I would say that the -JSON- and -LISP- parts of the original *JSON-SYMBOLIZE-LISP-KEY* are not in place here because, unlike *JSON-IDENTIFIER-NAME-TO-LISP*, the function does not translate between JSON and Lisp conventions. Further, -SYMBOLIZE- is a very imperative form, and imperatives should, in my humble opinion, be limited to functions with side effects.
What do you think?
- B. Sm.
On Fri, Jun 26, 2009 at 3:20 PM, Boris Smilga boris.smilga@gmail.comwrote:
Just a minor cavil: does not *STRING-TO-KEY* (or maybe *IDENTIFIER-NAME-TO-KEY*) seem like a better name for this variable?
I would prefer *IDENTIFIER-NAME-TO-KEY* since it gives more context about what the function is used for. *STRING-TO-KEY* is confusing since it's unclear where the "string" is coming from.
Rationale: I would say that the -JSON- and -LISP- parts of the original *JSON-SYMBOLIZE-LISP-KEY* are not in place here because, unlike *JSON-IDENTIFIER-NAME-TO-LISP*, the function does not translate between JSON and Lisp conventions. Further, -SYMBOLIZE- is a very imperative form, and imperatives should, in my humble opinion, be limited to functions with side effects.
What do you think?
- B. Sm.
-Red
On Tue, Jun 30, 2009 at 5:41 PM, Red Dalyreddaly@gmail.com wrote:
On Fri, Jun 26, 2009 at 3:20 PM, Boris Smilga boris.smilga@gmail.com wrote:
Just a minor cavil: does not *STRING-TO-KEY* (or maybe *IDENTIFIER-NAME-TO-KEY*) seem like a better name for this variable?
I would prefer *IDENTIFIER-NAME-TO-KEY* since it gives more context about what the function is used for. *STRING-TO-KEY* is confusing since it's unclear where the "string" is coming from.
I have now applied this (with one slight difference) and added a testcase. Also I have made a function safe-json-intern that I think should be safe from attacks. I pasted the testcases below.
Thanks! -Henrik
(test custom-identifier-name-to-key "Interns of many unique symbols could potentially use a lot of memory. An attack could exploit this by submitting something that is passed through cl-json that has many very large, unique symbols. See the safe-symbols-parsing function here for a cure." (with-decoder-simple-list-semantics (flet ((safe-symbols-parsing (name) (or (find-symbol name *json-symbols-package*) (error "unknown symbols not allowed")))) (let ((good-symbols "{"car":1,"cdr":2}") (bad-symbols "{"could-be":1,"a-denial-of-service-attack":2}") (*json-symbols-package* (find-package :cl)) (*identifier-name-to-key* #'safe-symbols-parsing)) (is (equal '((car . 1) (cdr . 2)) (decode-json-from-string good-symbols))) (signals error (decode-json-from-string bad-symbols))))))
(test safe-json-intern (with-decoder-simple-list-semantics (let ((good-symbols "{"car":1,"cdr":2}") (bad-symbols "{"could-be":1,"a-denial-of-service-attack":2}") (*json-symbols-package* (find-package :cl)) (*identifier-name-to-key* #'safe-json-intern)) (is (equal '((car . 1) (cdr . 2)) (decode-json-from-string good-symbols))) (signals unknown-symbol-error (decode-json-from-string bad-symbols)))))