Hello,
I believe the following behaviour is a bug:
  (drakma::parse-set-cookie "session=\"1,2,3\"; domain=example.com; path=/")
generates an error:
  While parsing cookie header "session=\"1,2,3\"; domain=example.com; path=/": Read character #\,, but expected #\=.
     [Condition of type SIMPLE-ERROR]

  Restarts:
   0: [ABORT] Return to SLIME's top level.
   1: [TERMINATE-THREAD] Terminate this thread (#<THREAD "new-repl-thread" {1004560551}>)

  Backtrace:
   0: (CHUNGA::SIGNAL-UNEXPECTED-CHARS #\, #\=)
   1: (CHUNGA:ASSERT-CHAR #<SB-IMPL::STRING-INPUT-STREAM {1004562A51}> #\=)
   2: (CHUNGA:READ-NAME-VALUE-PAIR #<SB-IMPL::STRING-INPUT-STREAM {1004562A51}>)[:EXTERNAL]
   3: (DRAKMA::PARSE-SET-COOKIE "session=\"1,2,3\"; domain=example.com; path=/")
According to RFC2965[1],
  av-pairs = av-pair *(";" av-pair)
  av-pair  = attr ["=" value]    ; optional value
  attr     = token
  value    = token | quoted-string
value is either a token or a quoted string. According to RFC2616[2],
  quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
  qdtext        = <any TEXT except <">>
so a quoted string can contain any text except ", including commas.
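To make the grammar concrete, here is a minimal sketch of a reader that accepts either form of value. The function name read-token-or-quoted-string is hypothetical; it is not part of Drakma or Chunga, and it simplifies the token rule to "stop at a semicolon, comma, or whitespace".

```lisp
;; Hypothetical helper, NOT part of Drakma or Chunga: reads a cookie value
;; that is either a token or a quoted-string, following the grammar above.
(defun read-token-or-quoted-string (stream)
  "Read a value from STREAM.  If it starts with a double quote, read a
quoted-string and return its contents without the surrounding quotes.
Otherwise read a token up to the next semicolon, comma, or whitespace."
  (if (eql (peek-char nil stream nil nil) #\")
      (with-output-to-string (out)
        (read-char stream)                     ; consume the opening quote
        ;; READ-CHAR signals END-OF-FILE if the closing quote is missing.
        (loop for char = (read-char stream)
              until (char= char #\")
              do (write-char (if (char= char #\\)
                                 (read-char stream) ; quoted-pair: next char is literal
                                 char)
                             out)))
      (with-output-to-string (out)
        (loop for char = (peek-char nil stream nil nil)
              while (and char (not (find char '(#\; #\, #\Space #\Tab))))
              do (write-char (read-char stream) out)))))

;; Usage: commas inside the quotes are part of the value.
(with-input-from-string (in "\"1,2,3\"; domain=example.com; path=/")
  (read-token-or-quoted-string in))
;; => "1,2,3"
```

Whether the surrounding quotes should be stripped or kept in the returned value is a design choice; this sketch strips them.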
I know a particular web-site (reddit.com) that sets cookies of that form, and Firefox parses them without any problem.
The problem seems to be in read-name-value-pair from CHUNGA:
  DRAKMA> (with-input-from-string (in "session=\"1,2,3\"; domain=example.com; path=/") (read-name-value-pair in :cookie-syntax t))
  ("session" . "\"1")
Best Regards, Victor.
[1] http://www.faqs.org/rfcs/rfc2965.html
[2] http://www.faqs.org/rfcs/rfc2616.html
On Sat, 15 Mar 2008 22:59:09 -0500, Victor Kryukov victor.kryukov@gmail.com wrote:
> According to RFC2965[1]
Unfortunately, it has been my experience that you can't rely on RFCs when parsing cookies as there are too many deviations out there.
Having said that, I'd be happy to accept a patch which fixed this particular issue without breaking "compatibility" with other widespread ways of setting cookies.
Edi.
Edi Weitz edi@agharta.de writes:
> On Sat, 15 Mar 2008 22:59:09 -0500, Victor Kryukov victor.kryukov@gmail.com wrote:
>> According to RFC2965[1]
> Unfortunately, it has been my experience that you can't rely on RFCs when parsing cookies as there are too many deviations out there.
First of all, let me correct myself: not only RFC 2965 (the newer standard) but also RFC 2109 (the older cookie standard, which seems to be widely used) requires cookie values to be either tokens or quoted strings.
I understand why you may want to be less strict about following the standard and allow some deviations from it, but current DRAKMA behaviour is broken for the sites that _do follow the standard_, which I think is not acceptable, at least not as a default behaviour.
As a reference point, I've checked that Python >= 2.4 follows RFC 2109 more or less (see lib/Cookie.py, where _CookiePattern is the regexp for parsing name/value pairs), and Perl also parses quoted cookie values without any problems (I've used WWW::Mechanize, which in turn uses HTTP::Cookie for cookie handling).
> Having said that, I'd be happy to accept a patch which fixed this particular issue without breaking "compatibility" with other widespread ways of setting cookies.
I'd be happy to provide a patch once I better understand what the "compatibility" requirements are. Could you please point me in the right direction? Do you mean old-style Netscape cookies[1]?
Also, there are multiple strategies to fix that. One is to follow RFC 2109 by default, but allow for Netscape-style cookies after setting a certain parameter (I'd prefer this one). Another is to try to follow RFC 2109, but then fall back to Netscape-style cookies if a parsing error occurs. There are probably other possible approaches. Which one would you prefer?
Regards, Victor.
On Mon, 17 Mar 2008 18:04:18 -0500, Victor Kryukov victor.kryukov@gmail.com wrote:
> I understand why you may want to be less strict about following the standard and allow some deviations from it, but current DRAKMA behaviour is broken for the sites that _do follow the standard_, which I think is not acceptable, at least not as a default behaviour.
Agreed.
> I'd be happy to provide a patch once I better understand what the "compatibility" requirements are. Could you please point me in the right direction?
I wanted to provide some examples this morning already, but unfortunately, I can't come up with any right now. I /do/ know that at one time I tested a lot of "real-life" cookies with Drakma, though, and quite a few weren't RFC-compliant.
> Do you mean old-style Netscape cookies[1]?
Allowing old-style Netscape cookies - at least optionally - would be a good idea, yes.
> Also, there are multiple strategies to fix that. One is to follow RFC 2109 by default, but allow for Netscape-style cookies after setting a certain parameter (I'd prefer this one). Another is to try to follow RFC 2109, but then fall back to Netscape-style cookies if a parsing error occurs. There are probably other possible approaches. Which one would you prefer?
The ideal solution, I think, would be a combination of the first two strategies you mentioned. Follow RFC 2109 by default, but provide one or more sensible restarts in case a parsing error occurs. Additionally, have some parameter (probably a global special variable) which, if set, automatically invokes the "use old Netscape-style" restart.
Does that sound reasonable?
Edi Weitz edi@agharta.de writes:
> On Mon, 17 Mar 2008 18:04:18 -0500, Victor Kryukov victor.kryukov@gmail.com wrote:
>> I understand why you may want to be less strict about following the standard and allow some deviations from it, but current DRAKMA behaviour is broken for the sites that _do follow the standard_, which I think is not acceptable, at least not as a default behaviour.
> Agreed.
>> Do you mean old-style Netscape cookies[1]?
> Allowing old-style Netscape cookies - at least optionally - would be a good idea, yes.
It seems to me that this is exactly what read-cookie-value is doing. So we just need to wrap it up nicely.
>> Also, there are multiple strategies to fix that. One is to follow RFC 2109 by default, but allow for Netscape-style cookies after setting a certain parameter (I'd prefer this one). Another is to try to follow RFC 2109, but then fall back to Netscape-style cookies if a parsing error occurs. There are probably other possible approaches. Which one would you prefer?
> The ideal solution, I think, would be a combination of the first two strategies you mentioned. Follow RFC 2109 by default, but provide one or more sensible restarts in case a parsing error occurs. Additionally, have some parameter (probably a global special variable) which, if set, automatically invokes the "use old Netscape-style" restart.
> Does that sound reasonable?
Yes, that must be the optimal solution. I'll try to implement it as a patch soon, most likely early next week.
Best Regards, Victor
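For readers following the thread, the design agreed on above (strict parsing by default, a restart for the lenient path, and a special variable that invokes the restart automatically) could be sketched roughly as follows. This is only an illustration, not the patch that was eventually written: every name here (cookie-parse-error, *auto-netscape-fallback*, use-netscape-style, and the three functions) is hypothetical, and the two parsers are deliberately simplified stand-ins that work on the value string alone.

```lisp
;; All names below are hypothetical; none of them exist in Drakma or Chunga.
(define-condition cookie-parse-error (error)
  ((text :initarg :text :reader cookie-parse-error-text))
  (:report (lambda (condition stream)
             (format stream "Cannot parse cookie value ~S according to RFC 2109."
                     (cookie-parse-error-text condition)))))

(defvar *auto-netscape-fallback* nil
  "When true, the USE-NETSCAPE-STYLE restart is invoked automatically.")

(defun parse-value-strictly (text)
  "Simplified stand-in for an RFC 2109 parser: accept a quoted-string, or a
token without quotes, semicolons, commas, or spaces.  Signal
COOKIE-PARSE-ERROR otherwise."
  (let ((len (length text)))
    (cond ((and (>= len 2)
                (char= (char text 0) #\")
                (char= (char text (1- len)) #\"))
           (subseq text 1 (1- len)))          ; strip the surrounding quotes
          ((notany (lambda (char) (find char "\";, ")) text)
           text)
          (t (error 'cookie-parse-error :text text)))))

(defun parse-value-netscape-style (text)
  "Simplified stand-in for a lenient parser: take the raw text as the value."
  text)

(defun parse-cookie-value (text)
  "Parse TEXT strictly, offering a restart that falls back to lenient parsing."
  (flet ((parse ()
           (restart-case (parse-value-strictly text)
             (use-netscape-style ()
               :report "Re-parse the value as an old Netscape-style cookie."
               (parse-value-netscape-style text)))))
    (if *auto-netscape-fallback*
        ;; The handler runs while the restart is still active, so it can
        ;; select the fallback without unwinding to the caller first.
        (handler-bind ((cookie-parse-error
                         (lambda (condition)
                           (declare (ignore condition))
                           (invoke-restart 'use-netscape-style))))
          (parse))
        (parse))))

;; Usage:
(parse-cookie-value "\"1,2,3\"")              ; => "1,2,3"
(let ((*auto-netscape-fallback* t))
  (parse-cookie-value "Mon, 17 Mar 2008"))    ; => "Mon, 17 Mar 2008"
```

With the variable left at NIL, a non-compliant value signals cookie-parse-error and the restart shows up in the debugger next to ABORT, so an interactive user can still choose the lenient path by hand.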