One of the web sites started to give me cookies with commas and drakma:get-cookies just crashes on those requests. I distilled my case into a small example like this:
    (drakma::get-cookies
     '((:CONTENT-TYPE . "text/html; charset=utf-8")
       (:LOCATION . "http://www.test")
       (:SERVER . "Microsoft-IIS/7.0")
       (:CONTENT-LENGTH . "46")
       (:DATE . "Sat, 12 Sep 2009 14:58:04 GMT")
       (:CONNECTION . "close")
       (:SET-COOKIE . "domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=6,Direct,placeholder,test.com;")
       (:CACHE-CONTROL . "private"))
     (puri:parse-uri "http://www.test.com"))
It'll throw an exception trying to parse the "session=6,Direct,placeholder,test.com" pair and will complain about the commas. I tried capturing the same page with the Live HTTP Headers extension in Firefox, which has no problems with it. Do you think we could change Drakma to be able to digest it as well?
Thank you, Andrei
On Wed, Sep 16, 2009 at 10:59 PM, Andrei Stebakov lispercat@gmail.com wrote:
It'll throw an exception trying to parse "session=6,Direct,placeholder,test.com" pair and will complain about the commas. I tried to capture the same page with FF Live Http Headers and it has no problems with that. Do you think we could change drakma to be able to digest it as well?
Sorry for the late reply.
I was going to write that IIS sends a wrong header according to the RFCs, but after re-reading them I now think that one might interpret them in a different way and that Drakma's general handling of commas has to be reworked to accommodate this interpretation.
Stay tuned, I'll think about how this can best be achieved.
Edi.
On Wed, Sep 30, 2009 at 11:09 AM, Edi Weitz edi@agharta.de wrote:
I was going to write that IIS sends a wrong header according to the RFCs, but after re-reading them I now think that one might interpret them in a different way and that Drakma's general handling of commas has to be reworked to accommodate this interpretation.
No. In the meantime, I think this cookie really looks fishy.
In RFC 2109 the syntax for "Set-Cookie" is defined as "1#cookie", which, according to the HTTP specification this RFC refers to, means a comma-separated list of values, i.e. if a comma is not quoted, it separates one Set-Cookie header from the next one. I understand that this is kind of sloppy already, because lots of servers use a syntax where the date in "expires" contains a comma in the wrong place, and Drakma caters to that. The question is how to deal with commas in general.
Consider this example:
Set-Cookie: domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=foo,bar=baz
If sent by IIS this probably means (?) that the cookie "domain" has an attribute "session" with the value "foo,bar=baz", right?
But it could also mean (see RFC) that the value of "session" is "foo" and that there's a second cookie "bar" with the value "baz". In fact, if Drakma reads two header lines like so
Set-Cookie: domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=foo
Set-Cookie: bar=baz
it will actually join them with a comma before parsing them (in accordance with the HTTP RFC).
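To make the ambiguity concrete, here is a minimal sketch. It is written in Python purely for illustration and is not Drakma's code; the helper name `join_header_values` is hypothetical. It joins the values of repeated header lines with a comma, as HTTP permits for list-valued headers, and shows that the result cannot be told apart from a single header whose value contains a comma:

```python
def join_header_values(values):
    """Join the values of repeated header lines into one comma-separated value."""
    return ",".join(values)


# Case 1: two separate Set-Cookie header lines, i.e. two cookies.
two_headers = join_header_values([
    "domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=foo",
    "bar=baz",
])

# Case 2: one Set-Cookie header line whose "session" value contains a comma.
one_header = (
    "domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; session=foo,bar=baz"
)

# After joining, a parser sees exactly the same string in both cases.
print(two_headers == one_header)  # True
```

Once the lines have been joined, the information about where one header ended is gone, so any rule for commas has to work on the combined string alone.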
So, we could probably provide some special variable to make cookie parsing less restrictive, but I wonder what the exact semantics of this should be.
Any suggestions?
Thanks, Edi.
Looks like according to RFC 2109, "=" takes priority over "," so probably when we encounter something like session=foo,bar=baz, the parser should analyze sequences on both sides of an "=" character, so in this case comma becomes a separator of two different pairs.
drakma-devel mailing list drakma-devel@common-lisp.net http://common-lisp.net/cgi-bin/mailman/listinfo/drakma-devel
On Wed, Sep 30, 2009 at 5:35 PM, Andrei Stebakov lispercat@gmail.com wrote:
Looks like according to RFC 2109, "=" takes priority over "," so probably when we encounter something like session=foo,bar=baz, the parser should analyze sequences on both sides of an "=" character, so in this case comma becomes a separator of two different pairs.
Ah, that's something I've been missing so far. Can you point to where exactly this can be found in the RFC? That should make the cookie parsing code clearer and I should be able to get rid of the comma workaround which is already in there.
Thanks, Edi.
I think those are two different issues we are talking about. In your case, yes, most likely bar=baz makes a new cookie (according to the RFC). In my case "session=6,Direct,placeholder,test.com;" is an obvious attribute-value pair followed by a ";" (as per the RFC: "av-pairs = av-pair *(";" av-pair)"). What I am trying to say is that Drakma shouldn't stumble upon the comma after "6", since the next construct is not "name=value", but only a token. I agree that they break a rule set by RFC 2068, which defines a token as
    token = 1*<any CHAR except CTLs or tspecials>
where
    tspecials = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" | "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" | "}" | SP | HT
So, having a comma in the token "6,Direct,placeholder,test.com" is against the rule, but in this case it is still easily identifiable as the value of the "session=6,Direct,placeholder,test.com;" av-pair, so either this needs to be fixed or some API should be provided so that the issue can be worked around.
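The rule proposed above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not Drakma's implementation: the names `is_token` and `split_cookies` are hypothetical, and the rule "a comma separates cookies only when it is followed by something that looks like name=" is just one reading of the suggestion in this thread.

```python
import re

# tspecials from RFC 2068 (including SP and HT); a token may contain none of them.
TSPECIALS = set('()<>@,;:\\"/[]?={} \t')


def is_token(text):
    """Return True if text is a non-empty RFC 2068 token (no CTLs, no tspecials)."""
    return bool(text) and all(
        32 < ord(ch) < 127 and ch not in TSPECIALS for ch in text
    )


# Assumed heuristic: a comma is a cookie separator only if the text after it
# starts with a run of characters followed by "=", i.e. looks like "name=".
_COOKIE_SEPARATOR = re.compile(r",\s*(?=[^\s=;,]+\s*=)")


def split_cookies(header_value):
    """Split a (possibly joined) Set-Cookie value into individual cookie strings."""
    return [part.strip() for part in _COOKIE_SEPARATOR.split(header_value)]


iis_value = ("domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; "
             "session=6,Direct,placeholder,test.com;")
joined_value = ("domain=test.com; expires=Thu, 12-Sep-2109 14:58:04 GMT; "
                "session=foo,bar=baz")

print(is_token("6,Direct,placeholder,test.com"))  # False: commas are tspecials
print(len(split_cookies(iis_value)))              # 1: no comma is followed by name=
print(split_cookies(joined_value)[1])             # bar=baz
```

With this rule the comma inside the "expires" date and the commas in the IIS value are left alone, while "session=foo,bar=baz" is still split into two cookies. The obvious limitation is that an unquoted value which itself contains ",name=" would be split as well, so the heuristic trades strictness for tolerance rather than resolving the ambiguity.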
Thank you, Andrei