New subject: A bug in function parse-content-type.

8 Apr 2015

      Hi Hans,

All I want is a stable hunchentoot server that is compatible with other web clients
and a stable drakma client that is compatible with other web servers. Whether those web clients/servers
follow the HTTP protocol well or not should not be a reason for hunchentoot/drakma to fail outright.
I hope you can understand that if drakma/hunchentoot fails outright, my commercial business will fail too.

I must say that Edi's code is of very high quality, and I have great respect for Edi and you for that.

For this case, I hope chunga/drakma/hunchentoot could accept a special feature or a special variable
that makes them accept a content-type header that does not follow the HTTP protocol well, as cl-http does.

     -----------------------------------------------------------------------------
     (parse-mime-content-type-header "application/x-www-form-urlencoded;
     text/html; charset=UTF-8")
        ==> (:APPLICATION :X-WWW-FORM-URLENCODED :CHARSET :UTF-8)
     -----------------------------------------------------------------------------
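
As a rough illustration of the kind of switch being asked for, here is a minimal sketch in plain Common Lisp. It is not code from chunga, drakma, hunchentoot, or cl-http. The variable name *accept-bogus-content-type* is the one proposed earlier in this thread, and the helper names split-on-semicolons and lenient-content-type are hypothetical.

```lisp
;; Hypothetical switch; it does not exist in chunga, drakma, or hunchentoot.
(defvar *accept-bogus-content-type* nil)

(defun split-on-semicolons (string)
  "Split STRING at every semicolon and trim whitespace from each piece."
  (loop with start = 0
        for end = (position #\; string :start start)
        collect (string-trim '(#\Space #\Tab #\Return #\Newline)
                             (subseq string start end))
        while end
        do (setf start (1+ end))))

(defun lenient-content-type (header)
  "Return (values type subtype parameters) for HEADER.
Segments that are not attribute=value pairs are skipped only when
*ACCEPT-BOGUS-CONTENT-TYPE* is true; otherwise they signal an error."
  (destructuring-bind (media-type &rest segments) (split-on-semicolons header)
    (let ((slash (position #\/ media-type))
          (parameters '()))
      (unless slash
        (error "No type/subtype in ~S" header))
      (dolist (segment segments)
        (let ((equals (position #\= segment)))
          (cond (equals
                 (push (cons (subseq segment 0 equals)
                             (subseq segment (1+ equals)))
                       parameters))
                ((string= segment ""))         ; tolerate a trailing semicolon
                (*accept-bogus-content-type*)  ; skip the stray segment
                (t (error "Bad content-type segment ~S in ~S"
                          segment header)))))
      (values (subseq media-type 0 slash)
              (subseq media-type (1+ slash))
              (nreverse parameters)))))
```

With the variable bound to true, the header from this thread yields "application", "x-www-form-urlencoded" and (("charset" . "UTF-8")); with the default of NIL the stray "text/html" segment signals an error, so the strict behavior stays the default.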

I think your solution (request-with-bad-content-type) will be a little trivial for me.

If you accept my suggestion, I can give you a patch for these three packages (chunga/drakma/hunchentoot).

With Best Regards,

At Sun, 26 May 2013 08:04:15 +0200,
Hans Hübner wrote:
...
Jingtao,

please refer to http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7, it clearly
describes that a media type consists of exactly one type/subtype indicator followed by
optional attribute=value pairs.  The content type that you have presented is not valid
according to these rules.   Neither a lax parser like the one in CL-HTTP nor the fact
that a large site sends these bogus headers makes them valid.  I do not want to include
code in Hunchentoot that tries to interpret such bogus data.

However, if you cannot get your trading partner to fix their client, I can offer this
solution:

(defclass request-with-bad-content-type (hunchentoot:request)
  ())

(defmethod hunchentoot:header-in :around ((name (eql :content-type))
                                          (request request-with-bad-content-type))
  (alexandria:when-let (content-type (call-next-method))
    (ppcre:regex-replace-all "^([^/]+/[^/]+); *[^/]+/[^/;]+" content-type "\\1")))

You'll then have to use the :request-class argument to your acceptor instantiation to
make it use the request-with-bad-content-type class.  You also want to review the regular
expression carefully and maybe profile your application to see whether you need to cache
or otherwise improve performance.
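
A minimal sketch of those two steps follows. It assumes Quicklisp is installed and that the class and method above have already been loaded. The variable name *acceptor* and the port number 8080 are arbitrary choices for illustration.

```lisp
;; Assumes Quicklisp; load the libraries used by the workaround.
(ql:quickload '(:hunchentoot :cl-ppcre :alexandria))

;; Check the regular expression against the header from this thread.
(ppcre:regex-replace-all
 "^([^/]+/[^/]+); *[^/]+/[^/;]+"
 "application/x-www-form-urlencoded; text/html; charset=UTF-8"
 "\\1")
;; first value: "application/x-www-form-urlencoded; charset=UTF-8"

;; Create an acceptor whose requests are instances of the workaround class.
;; REQUEST-WITH-BAD-CONTENT-TYPE must already be defined as shown above.
(defvar *acceptor*
  (make-instance 'hunchentoot:easy-acceptor
                 :port 8080
                 :request-class 'request-with-bad-content-type))
```

Calling (hunchentoot:start *acceptor*) then starts the server. Headers that already follow the RFC do not match the expression and pass through unchanged.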
-Hans
On Sun, May 26, 2013 at 5:07 AM, Jingtao Xu <jingtaozf@gmail.com> wrote:
Hi Hans,

I don't agree that this content-type header is just bogus.
As the content type is sent by the largest B2B/B2C site in China, there
must be a reason for it.

And if you try cl-http, you will find that cl-http parses such a
content type correctly.

    -----------------------------------------------------------------------------
    (parse-mime-content-type-header "application/x-www-form-urlencoded;
    text/html; charset=UTF-8")
       ==> (:APPLICATION :X-WWW-FORM-URLENCODED :CHARSET :UTF-8)
    -----------------------------------------------------------------------------

You can find the definition in cl-http/server/headers.lisp:

    -----------------------------------------------------------------------------
    (define-header-type :content-type-header (:header)
      :parse-function parse-mime-content-type-header
      :print-function print-mime-content-type-header)
    -----------------------------------------------------------------------------

Even if this content-type header is bogus (actually I don't think so),
hunchentoot/drakma should parse the header without raising an error
if a special variable like *accept-bogus-content-type* is true.

With Best Regards,
jingtao.

On Sat, May 25, 2013 at 8:11 PM, Hans Hübner <hans.huebner@gmail.com> wrote:
    > Jingtao,
    >
    > the content-type header "application/x-www-form-urlencoded; text/html;
    > charset=UTF-8" is just bogus.  I do not want to include code that makes
    > Hunchentoot work with clearly broken clients.  Better error reporting would
    > be acceptable, though.
    >
    > -Hans
    >
    >
    > On Sat, May 25, 2013 at 12:38 PM, Jingtao Xu <jingtaozf@gmail.com> wrote:
    >>
    >> Hi all,
    >>
    >> I found the content type header which raises the bug in my message.log
    >> generated by hunchentoot.
    >> It happened when hunchentoot got the following content type header:
    >>
    >> -----------------------------------------------------------------------------
    >> application/x-www-form-urlencoded; text/html; charset=UTF-8
    >> -----------------------------------------------------------------------------
    >>
    >> I noticed that in package drakma's file read.lisp, the function
    >> 'get-content-type' also treats "/" as a token separator.
    >>
    >> I hope the chunga/drakma/hunchentoot packages could accept such a content
    >> type header without raising an exception. As Edi said, a new special
    >> variable similar to *accept-bogus-eols* or
    >> *treat-semicolon-as-continuation* which only assumes " ,;" as token
    >> separators may be a good idea and would fix my problem.
    >>
    >> Anyway, the RFC standard does not fit the real world well.
    >>
    >> Thanks very much.
    >>
    >> With Best Regards,
    >> jingtao.
    >>
    >>
    >> On Thu, May 23, 2013 at 2:01 PM, Edi Weitz <edi@agharta.de> wrote:
    >> > I'm not the maintainer anymore, but my take is that if some Ruby or
    >> > Java client misinterprets the RFC I wouldn't change Hunchentoot's (or
    >> > rather Chunga's) default behavior because of that.  I'd rather
    >> > introduce a new special variable similar to *accept-bogus-eols* or
    >> > *treat-semicolon-as-continuation*.
    >> >
    >> > Just my .02 Euros,
    >> > Edi.
    >> >
    >> >
    >> >
    >> > On Thu, May 23, 2013 at 2:52 AM, Jingtao Xu <jingtaozf@gmail.com> wrote:
    >> >> Hi All,
    >> >>
    >> >> 1. The function `read-name-value-pair' is called by
    >> >> `parse-content-type' in hunchentoot/util.lisp, not by my code.
    >> >> 2. The slash is a token constituent in Java/Ruby implementations, and I
    >> >> think some web clients/servers treat it as a token constituent too,
    >> >>     but I am waiting for the hunchentoot log to give us a live example.
    >> >>
    >> >> With Best Regards,
    >> >> jingtao
    >> >>
    >> >>
    >> >> On Wed, May 22, 2013 at 11:40 PM, Edi Weitz <edi@agharta.de> wrote:
    >> >>> If I'm not mistaken, the slash is a "separator" and thus not a token
    >> >>> constituent according to RFC 2616 which means "path=/foo" is not legal
    >> >>> input for READ-NAME-VALUE-PAIR.
    >> >>>
    >> >>> On Wed, May 22, 2013 at 5:27 PM, Ron Garret <ron@flownet.com> wrote:
    >> >>>> Very likely Jingtao's code is calling READ-NAME-VALUE-PAIR without
    >> >>>> being wrapped in this macro
    >> >>>>
    >> >>>> But there's still a bug in READ-NAME-VALUE-PAIR:
    >> >>>>
    >> >>>> ? (WITH-INPUT-FROM-VECTOR (S (MAP '(VECTOR (UNSIGNED-BYTE 8))
    >> >>>> 'CHAR-CODE "path=/foo"))
    >> >>>>   (chunga:with-character-stream-semantics
    >> >>>>       (CHUNGA:READ-NAME-VALUE-PAIR S)))
    >> >>>> ("path" . "")
    >> >>>>
    >> >>>> On May 22, 2013, at 8:19 AM, Edi Weitz wrote:
    >> >>>>
    >> >>>>> On Wed, May 22, 2013 at 4:18 PM, Ron Garret <ron@flownet.com> wrote:
    >> >>>>>> I found a bug in CHUNGA:READ-NAME-VALUE-PAIR.
    >> >>>>>
    >> >>>>> It's not quite clear to me yet what the bug is supposed to be.
    >> >>>>>
    >> >>>>> The documentation clearly says that calls to READ-NAME-VALUE-PAIR
    >> >>>>> and friends must be wrapped with this macro:
    >> >>>>>
    >> >>>>>  http://weitz.de/chunga/#with-character-stream-semantics
    >> >>>>>
    >> >>>>> (You might argue that this isn't very user-friendly, but Chunga
    >> >>>>> wasn't really intended to be used that way.)
    >> >>>>
    >> >>
    >
    >
