Hi Hans,
What I want is a stable hunchentoot server that is compatible with other web clients, and a stable drakma client that is compatible with other web servers. Whether those clients and servers follow the HTTP protocol closely should not be the reason hunchentoot/drakma fails outright. I hope you can understand that if drakma/hunchentoot fails outright, my commercial business will fail too.
I must say that Edi's code is of very high quality, and I have great respect for Edi and you for that.
In this case, I hope chunga/drakma/hunchentoot could accept a special feature or a special variable that makes them accept a content type header that does not follow the HTTP protocol closely, as cl-http does.
-----------------------------------------------------------------------------
(parse-mime-content-type-header "application/x-www-form-urlencoded; text/html; charset=UTF-8")
==> (:APPLICATION :X-WWW-FORM-URLENCODED :CHARSET :UTF-8)
-----------------------------------------------------------------------------
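The lax behaviour asked for here can be sketched in plain Common Lisp. This is only an illustration under stated assumptions: it is not cl-http's parser and not part of chunga, drakma, or hunchentoot, and the names split-and-trim and lax-content-type are hypothetical. It keeps the first type/subtype and every attribute=value pair, and drops any other segment.

```lisp
;; Hypothetical helper: split STRING on semicolons and trim each piece.
(defun split-and-trim (string)
  "Split STRING on #\\; and trim surrounding spaces and tabs from each piece."
  (loop with start = 0
        for pos = (position #\; string :start start)
        collect (string-trim '(#\Space #\Tab) (subseq string start pos))
        while pos
        do (setf start (1+ pos))))

;; Hypothetical helper: tolerate a stray second type/subtype in the header.
(defun lax-content-type (header)
  "Keep the first segment and all attribute=value segments of HEADER."
  (let ((parts (split-and-trim header)))
    (format nil "~{~A~^; ~}"
            (cons (first parts)
                  ;; Only segments containing #\= count as parameters.
                  (remove-if-not (lambda (part) (find #\= part))
                                 (rest parts))))))

(lax-content-type "application/x-www-form-urlencoded; text/html; charset=UTF-8")
;; => "application/x-www-form-urlencoded; charset=UTF-8"
```

A filter like this would have to run before the strict parser sees the header, which is why a special variable or a hook in the libraries is being requested.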
I think your solution (request-with-bad-content-type) will be a little trivial for me.
If you accept my suggestion, I can give you a patch for these three packages (chunga/drakma/hunchentoot).
With Best Regards,
At Sun, 26 May 2013 08:04:15 +0200, Hans Hübner wrote:
Jingtao,
please refer to http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7, it clearly describes that a media type consists of exactly one type/subtype indicator followed by optional attribute=value pairs. The content type that you have presented is not valid according to these rules. Neither a lax parser like the one in CL-HTTP nor the fact that a large site sends these bogus headers makes them valid. I do not want to include code in Hunchentoot that tries to interpret such bogus data.
However, if you cannot get your trading partner to fix their client, I can offer this solution:
(defclass request-with-bad-content-type (hunchentoot:request) ())
(defmethod hunchentoot:header-in :around ((name (eql :content-type))
                                          (request request-with-bad-content-type))
  ;; Drop a second type/subtype that follows the first one; keep everything else.
  (alexandria:when-let (content-type (call-next-method))
    (ppcre:regex-replace-all "^([^/]+/[^/]+); *[^/]+/[^/;]+" content-type "\\1")))
You'll then have to use the :request-class argument to your acceptor instantiation to make it use the request-with-bad-content-type class. You also want to review the regular expression carefully and maybe profile your application to see whether you need to cache or otherwise improve performance.
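As a rough sketch of the :request-class step, assuming Hunchentoot is already loaded and the class and method above have been evaluated. The port number and the *acceptor* variable name are placeholders, not anything taken from this thread:

```lisp
;; Assumes (ql:quickload :hunchentoot) or equivalent has been run and that
;; REQUEST-WITH-BAD-CONTENT-TYPE is defined as shown above.
(defvar *acceptor*
  (make-instance 'hunchentoot:easy-acceptor
                 :port 4242   ; placeholder port
                 ;; Every incoming request is instantiated from this class,
                 ;; so the :around method on HEADER-IN applies to it.
                 :request-class 'request-with-bad-content-type))

(hunchentoot:start *acceptor*)
```

Only acceptors created with this initarg get the rewritten header; any other acceptor keeps the strict default behaviour.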
-Hans
On Sun, May 26, 2013 at 5:07 AM, Jingtao Xu <jingtaozf@gmail.com> wrote:
Hi Hans,

I don't agree with you that this content type header is just bogus. As the content-type is sent by the largest B2B/B2C site in China, there must be a reason for it. And if you try cl-http, you will find that cl-http parses such a content type correctly.

-----------------------------------------------------------------------------
(parse-mime-content-type-header "application/x-www-form-urlencoded; text/html; charset=UTF-8")
==> (:APPLICATION :X-WWW-FORM-URLENCODED :CHARSET :UTF-8)
-----------------------------------------------------------------------------

You can find the definition in cl-http/server/headers.lisp:

-----------------------------------------------------------------------------
(define-header-type :content-type-header (:header)
  :parse-function parse-mime-content-type-header
  :print-function print-mime-content-type-header)
-----------------------------------------------------------------------------

Even if this content-type header is bogus (actually I don't think so), hunchentoot/drakma should parse the header without raising an error if a special variable like *accept-bogus-content-type* is true.

With Best Regards,
jingtao.

On Sat, May 25, 2013 at 8:11 PM, Hans Hübner <hans.huebner@gmail.com> wrote:
> Jingtao,
>
> the content-type header "application/x-www-form-urlencoded; text/html;
> charset=UTF-8" is just bogus. I do not want to include code that makes
> Hunchentoot work with clearly broken clients. Better error reporting would
> be acceptable, though.
>
> -Hans
>
>
> On Sat, May 25, 2013 at 12:38 PM, Jingtao Xu <jingtaozf@gmail.com> wrote:
>>
>> Hi all,
>>
>> I found the content type header which raises the bug in my message.log
>> generated by hunchentoot.
>> It happened when hunchentoot got the following content type header:
>>
>> -----------------------------------------------------------------------------------------
>> application/x-www-form-urlencoded; text/html; charset=UTF-8
>> -----------------------------------------------------------------------------------------
>>
>> I noticed that in package drakma's file read.lisp, the function
>> 'get-content-type' also assumes "/" as a token separator.
>>
>> I hope package chunga/drakma/hunchentoot could accept such a content type
>> header without raising an exception. As Edi said, a new special variable
>> similar to *accept-bogus-eols* or *treat-semicolon-as-continuation*
>> which only assumes " ,;" as token separators may be a good idea and
>> will fix my problem.
>>
>> Anyway, the RFC standard does not fit well with the real world.
>>
>> Thanks very much.
>>
>> With Best Regards,
>> jingtao.
>>
>>
>> On Thu, May 23, 2013 at 2:01 PM, Edi Weitz <edi@agharta.de> wrote:
>> > I'm not the maintainer anymore, but my take is that if some Ruby or
>> > Java client misinterprets the RFC I wouldn't change Hunchentoot's (or
>> > rather Chunga's) default behavior because of that. I'd rather
>> > introduce a new special variable similar to *accept-bogus-eols* or
>> > *treat-semicolon-as-continuation*.
>> >
>> > Just my .02 Euros,
>> > Edi.
>> >
>> >
>> > On Thu, May 23, 2013 at 2:52 AM, Jingtao Xu <jingtaozf@gmail.com> wrote:
>> >> Hi All,
>> >>
>> >> 1. The function `read-name-value-pair' is called by
>> >> `parse-content-type' in hunchentoot/util.lisp, not by my code.
>> >> 2. The slash is a token constituent in the java/ruby implementations, and I
>> >> think some web clients/servers treat it as a token constituent too,
>> >> but I am waiting for the hunchentoot log to give us a live example.
>> >>
>> >> With Best Regards,
>> >> jingtao
>> >>
>> >>
>> >> On Wed, May 22, 2013 at 11:40 PM, Edi Weitz <edi@agharta.de> wrote:
>> >>> If I'm not mistaken, the slash is a "separator" and thus not a token
>> >>> constituent according to RFC 2616 which means "path=/foo" is not legal
>> >>> input for READ-NAME-VALUE-PAIR.
>> >>>
>> >>> On Wed, May 22, 2013 at 5:27 PM, Ron Garret <ron@flownet.com> wrote:
>> >>>> Very likely Jingtao's code is calling READ-NAME-VALUE-PAIR without
>> >>>> being wrapped in this macro
>> >>>>
>> >>>> But there's still a bug in READ-NAME-VALUE-PAIR:
>> >>>>
>> >>>> ? (WITH-INPUT-FROM-VECTOR (S (MAP '(VECTOR (UNSIGNED-BYTE 8))
>> >>>>                                   'CHAR-CODE "path=/foo"))
>> >>>>     (chunga:with-character-stream-semantics
>> >>>>       (CHUNGA:READ-NAME-VALUE-PAIR S)))
>> >>>> ("path" . "")
>> >>>>
>> >>>> On May 22, 2013, at 8:19 AM, Edi Weitz wrote:
>> >>>>
>> >>>>> On Wed, May 22, 2013 at 4:18 PM, Ron Garret <ron@flownet.com> wrote:
>> >>>>>> I found a bug in CHUNGA:READ-NAME-VALUE-PAIR.
>> >>>>>
>> >>>>> It's not quite clear to me yet what the bug is supposed to be.
>> >>>>>
>> >>>>> The documentation clearly says that calls to READ-NAME-VALUE-PAIR
>> >>>>> and friends must be wrapped with this macro:
>> >>>>>
>> >>>>> http://weitz.de/chunga/#with-character-stream-semantics
>> >>>>>
>> >>>>> (You might argue that this isn't very user-friendly, but Chunga
>> >>>>> wasn't really intended to be used that way.)