Re: A bug in function parse-content-type.

8 Apr 2015

      Hi Hans,

What I want is a stable hunchentoot server that is compatible with other web clients,
and a stable drakma client that is compatible with other web servers. Whether those clients/servers
follow the HTTP protocol well or not should not be a reason for hunchentoot/drakma to fail outright.
I hope you can understand that if drakma/hunchentoot fails outright, my commercial business will fail too.

I must say that the code from Edi is of very high quality, and I have great respect for Edi and you for that.

In this case, I hope chunga/drakma/hunchentoot could accept a special feature or a special variable
that lets them accept a content type header which does not follow the HTTP protocol well, like cl-http does.

     -----------------------------------------------------------------------------
     (parse-mime-content-type-header "application/x-www-form-urlencoded; text/html; charset=UTF-8")
        ==> (:APPLICATION :X-WWW-FORM-URLENCODED :CHARSET :UTF-8)
     -----------------------------------------------------------------------------
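
To make the requested lenient behavior concrete, here is a minimal sketch in plain Common Lisp. It is not code from cl-http, chunga, drakma, or hunchentoot; the function names `split-on-semicolons` and `lenient-parse-content-type` are hypothetical, and the sketch assumes that any segment after the first without an equals sign (such as the stray "text/html") can simply be ignored. It does not handle quoted parameter values.

```lisp
;; Hypothetical helpers for illustration only; not part of any library
;; mentioned in this thread.

(defun split-on-semicolons (string)
  "Split STRING at semicolons and trim whitespace around every part."
  (loop with start = 0
        for end = (position #\; string :start start)
        collect (string-trim '(#\Space #\Tab #\Return #\Newline)
                             (subseq string start end))
        while end
        do (setf start (1+ end))))

(defun lenient-parse-content-type (header)
  "Return three values: type, subtype, and an alist of parameters.
Segments after the first that contain no equals sign (for example a
stray second media type) are skipped instead of signalling an error."
  (let* ((parts (split-on-semicolons header))
         (media (first parts))
         (slash (position #\/ media)))
    (values (subseq media 0 slash)
            (and slash (subseq media (1+ slash)))
            (loop for part in (rest parts)
                  for equals = (position #\= part)
                  when equals
                    collect (cons (subseq part 0 equals)
                                  (subseq part (1+ equals)))))))
```

Called on the header from my log, this returns "application", "x-www-form-urlencoded", and the parameter list (("charset" . "UTF-8")), which carries the same information as the cl-http result shown above.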

I think your solution (request-with-bad-content-type) will be a little trivial for me.

If you accept my suggestion, I can give you a patch for these three packages (chunga/drakma/hunchentoot).

With Best Regards,

At Sun, 26 May 2013 08:04:15 +0200,
Hans Hübner wrote:
...
Jingtao,
please refer to http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7, it clearly
describes that a media type consists of exactly one type/subtype indicator followed by
optional attribute=value pairs.  The content type that you have presented is not valid
according to these rules.   Neither a lax parser like the one in CL-HTTP nor the fact
that a large site sends these bogus headers makes them valid.  I do not want to include
code in Hunchentoot that tries to interpret such bogus data.
However, if you cannot get your trading partner to fix their client, I can offer this
solution:
(defclass request-with-bad-content-type (hunchentoot:request)
  ())

(defmethod hunchentoot:header-in :around ((name (eql :content-type))
                                          (request request-with-bad-content-type))
  (alexandria:when-let (content-type (call-next-method))
    (ppcre:regex-replace-all "^([^/]+/[^/]+); *[^/]+/[^/;]+" content-type "\\1")))
You'll then have to use the :request-class argument to your acceptor instantiation to
make it use the request-with-bad-content-type class.  You also want to review the regular
expression carefully and maybe profile your application to see whether you need to cache
or otherwise improve performance.
-Hans
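The :request-class step described in the message above might look like the following sketch. It assumes Hunchentoot is loaded and that the request-with-bad-content-type class from that message has already been defined; the variable name *acceptor* and port 8080 are arbitrary choices for illustration.

```lisp
;; Assumes Hunchentoot is loaded and REQUEST-WITH-BAD-CONTENT-TYPE
;; (from the message above) is already defined.
;; *ACCEPTOR* and the port number are illustrative choices.
(defvar *acceptor*
  (make-instance 'hunchentoot:easy-acceptor
                 :port 8080
                 :request-class 'request-with-bad-content-type))

;; Start listening; requests are now instantiated from the subclass,
;; so the :around method on HEADER-IN applies to them.
(hunchentoot:start *acceptor*)
```

Only acceptors created with this :request-class see the rewritten header; other acceptors keep the strict default behavior.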
On Sun, May 26, 2013 at 5:07 AM, Jingtao Xu <jingtaozf@gmail.com> wrote:
Hi Hans,
I don't agree with you that this content type header is just bogus.
As the content type is sent by the largest B2B/B2C site in China, there
must be a reason for it.
And if you try cl-http, you will find that cl-http parses such a
content type correctly.
    -----------------------------------------------------------------------------
    (parse-mime-content-type-header "application/x-www-form-urlencoded; text/html; charset=UTF-8")
       ==> (:APPLICATION :X-WWW-FORM-URLENCODED :CHARSET :UTF-8)
    -----------------------------------------------------------------------------
You can find the definition in cl-http/server/headers.lisp
    -----------------------------------------------------------------------------
    (define-header-type :content-type-header (:header)
      :parse-function parse-mime-content-type-header
      :print-function print-mime-content-type-header)
    -----------------------------------------------------------------------------
Even if this content-type header is bogus (actually I don't think so),
hunchentoot/drakma should parse the header without raising an error
if a special variable like *accept-bogus-content-type* is true.
With Best Regards,
jingtao.
On Sat, May 25, 2013 at 8:11 PM, Hans Hübner <hans.huebner@gmail.com> wrote:
    > Jingtao,
    >
    > the content-type header "application/x-www-form-urlencoded; text/html;
    > charset=UTF-8" is just bogus.  I do not want to include code that makes
    > Hunchentoot work with clearly broken clients.  Better error reporting would
    > be acceptable, though.
    >
    > -Hans
    >
    >
    > On Sat, May 25, 2013 at 12:38 PM, Jingtao Xu <jingtaozf@gmail.com> wrote:
    >>
    >> Hi all,
    >>
    >> I found the content type header that triggers the bug in the message.log
    >> generated by hunchentoot.
    >> It happened when hunchentoot got the following content type header:
    >>
    >> -----------------------------------------------------------------
    >> application/x-www-form-urlencoded; text/html; charset=UTF-8
    >> -----------------------------------------------------------------
    >>
    >> I noticed that the function 'get-content-type' in drakma's file read.lisp
    >> also assumes "/" to be a token separator.
    >>
    >> I hope chunga/drakma/hunchentoot could accept such a content type header
    >> without raising an exception. As Edi said, a new special variable
    >> similar to *accept-bogus-eols* or *treat-semicolon-as-continuation*,
    >> one which treats only " ,;" as token separators, may be a good idea
    >> and would solve my problem.
    >>
    >> Anyway, the RFC standard does not fit the real world well.
    >>
    >> Thanks very much.
    >>
    >> With Best Regards,
    >> jingtao.
    >>
    >>
    >> On Thu, May 23, 2013 at 2:01 PM, Edi Weitz <edi@agharta.de> wrote:
    >> > I'm not the maintainer anymore, but my take is that if some Ruby or
    >> > Java client misinterprets the RFC I wouldn't change Hunchentoot's (or
    >> > rather Chunga's) default behavior because of that.  I'd rather
    >> > introduce a new special variable similar to *accept-bogus-eols* or
    >> > *treat-semicolon-as-continuation*.
    >> >
    >> > Just my .02 Euros,
    >> > Edi.
    >> >
    >> >
    >> >
    >> > On Thu, May 23, 2013 at 2:52 AM, Jingtao Xu <jingtaozf@gmail.com> wrote:
    >> >> Hi All,
    >> >>
    >> >> 1. The function `read-name-value-pair' is called by
    >> >> `parse-content-type' in hunchentoot/util.lisp, not by my code.
    >> >> 2. The slash is a token constituent in Java/Ruby implementations, and I
    >> >> think some web clients/servers treat it as a token constituent too,
    >> >>     but I am waiting for the hunchentoot log to give us a live example.
    >> >>
    >> >> With Best Regards,
    >> >> jingtao
    >> >>
    >> >>
    >> >> On Wed, May 22, 2013 at 11:40 PM, Edi Weitz <edi@agharta.de> wrote:
    >> >>> If I'm not mistaken, the slash is a "separator" and thus not a token
    >> >>> constituent according to RFC 2616 which means "path=/foo" is not legal
    >> >>> input for READ-NAME-VALUE-PAIR.
    >> >>>
    >> >>> On Wed, May 22, 2013 at 5:27 PM, Ron Garret <ron@flownet.com> wrote:
    >> >>>> Very likely Jingtao's code is calling READ-NAME-VALUE-PAIR without
    >> >>>> being wrapped in this macro
    >> >>>>
    >> >>>> But there's still a bug in READ-NAME-VALUE-PAIR:
    >> >>>>
    >> >>>> ? (WITH-INPUT-FROM-VECTOR (S (MAP '(VECTOR (UNSIGNED-BYTE 8))
    >> >>>> 'CHAR-CODE "path=/foo"))
    >> >>>>   (chunga:with-character-stream-semantics
    >> >>>>       (CHUNGA:READ-NAME-VALUE-PAIR S)))
    >> >>>> ("path" . "")
    >> >>>>
    >> >>>> On May 22, 2013, at 8:19 AM, Edi Weitz wrote:
    >> >>>>
    >> >>>>> On Wed, May 22, 2013 at 4:18 PM, Ron Garret <ron@flownet.com> wrote:
    >> >>>>>> I found a bug in CHUNGA:READ-NAME-VALUE-PAIR.
    >> >>>>>
    >> >>>>> It's not quite clear to me yet what the bug is supposed to be.
    >> >>>>>
    >> >>>>> The documentation clearly says that calls to READ-NAME-VALUE-PAIR
    >> >>>>> and friends must be wrapped with this macro:
    >> >>>>>
    >> >>>>>  http://weitz.de/chunga/#with-character-stream-semantics
    >> >>>>>
    >> >>>>> (You might argue that this isn't very user-friendly, but Chunga
    >> >>>>> wasn't really intended to be used that way.)
    >> >>>>
    >> >>
    >
    >
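
The point made earlier in the thread about RFC 2616 separators can be illustrated with a small sketch in plain Common Lisp. This is written directly from the token grammar in RFC 2616 section 2.2 and is not Chunga's actual implementation; the names `*rfc2616-separators*`, `rfc2616-token-char-p`, and `read-rfc2616-token` are hypothetical, and the code assumes the Lisp implementation uses ASCII-compatible character codes.

```lisp
;; Illustration of the RFC 2616 token grammar; not Chunga's code.
;; All names here are hypothetical.

(defparameter *rfc2616-separators*
  (concatenate 'string "()<>@,;:\\\"/[]?={}" (string #\Space) (string #\Tab))
  "The separator characters listed in RFC 2616 section 2.2.")

(defun rfc2616-token-char-p (char)
  "True if CHAR may appear in a token: US-ASCII, not a control
character, and not one of the separators."
  (let ((code (char-code char)))
    (and (< 31 code 127)
         (not (find char *rfc2616-separators*)))))

(defun read-rfc2616-token (string &optional (start 0))
  "Return the longest token in STRING beginning at START, and the
index at which that token ends."
  (let ((end (or (position-if-not #'rfc2616-token-char-p string :start start)
                 (length string))))
    (values (subseq string start end) end)))
```

With "path=/foo", reading from index 0 yields the token "path" and stops at the equals sign; reading again from index 5 yields an empty token because the slash is a separator. Under that strict reading of the grammar, an empty value after "path=" is the expected outcome rather than a parser failure.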

Jingtao Xu