[drakma-devel] Unwanted url-encoding of GET parameters.

Hello folks,

I'm trying to use drakma to fetch URLs that contain UTF-8 characters, but HTTP-REQUEST automatically URL-encodes any characters outside ASCII/Latin-1. On my cursory reading of the RFCs, this seems to be conforming behavior, but in this case it is definitely unwanted.

For example, the following URL

http://translate.google.com/translate_tts?tl=ru&q=вы

correctly returns the text-to-speech audio file if entered directly into the browser, but when I use HTTP-REQUEST, the URL is encoded into

http://translate.google.com/translate_tts?tl=ru&q=%D0%B2%D1%8B

which Google does not URL-decode, so it fails to return the correct data. So, for this case, the URL-encoding is unwanted.

I am willing to submit a patch that adds an argument to HTTP-REQUEST to disable the encoding.

Thoughts?

Thank you,
William

You are not allowed to send arbitrary characters in the request line. FWIW, I just tried your example with Firefox and this is what the browser sends according to LiveHttpHeaders:

http://translate.google.com/translate_tts?tl=ru&q=%D0%B2%D1%8B

And Google returns the requested audio file.

Edi.

In reply to William Halliburton <whalliburton@gmail.com>, Sat, Mar 10, 2012 at 1:08 AM.
_______________________________________________ drakma-devel mailing list drakma-devel@common-lisp.net http://lists.common-lisp.net/cgi-bin/mailman/listinfo/drakma-devel
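
To make the point of this exchange concrete: the encoded query value is just the UTF-8 bytes of вы written as percent-escapes, which is the form a request line may carry. The sketch below uses Python's standard urllib.parse rather than Lisp, purely as an illustration of the encoding; it says nothing about how drakma implements it internally.

```python
from urllib.parse import quote, unquote

# The raw query value from the example URL in this thread.
raw = "вы"

# Percent-encode the UTF-8 bytes of each non-ASCII character.
encoded = quote(raw, encoding="utf-8")
print(encoded)  # %D0%B2%D1%8B

# Decoding restores the original text, so nothing is lost by encoding.
decoded = unquote(encoded, encoding="utf-8")

# The full URL as it appears on the wire.
url = "http://translate.google.com/translate_tts?tl=ru&q=" + encoded
print(url)
```

A server that follows the usual rules decodes the escapes back to the same two characters, so the encoded and unencoded forms name the same resource.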

Thanks much. After some wiresharking, I found that Google doesn't like the drakma user-agent, and changing it to Firefox did the trick.

In reply to Edi Weitz <edi@agharta.de>, Sat, Mar 10, 2012 at 2:46 AM.
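
The resolution above comes down to sending a browser-like User-Agent header with an otherwise unchanged, percent-encoded URL. As a rough, language-neutral sketch (Python standard library, not drakma; in drakma itself the corresponding knob is the :user-agent keyword argument of HTTP-REQUEST), the request could be built like this. The header value shown is a hypothetical Firefox-style string, since the thread does not give the exact one used, and the helper name build_request is likewise made up for illustration.

```python
from urllib.request import Request

# Percent-encoded URL from the thread.
URL = "http://translate.google.com/translate_tts?tl=ru&q=%D0%B2%D1%8B"

# Hypothetical browser-like value; the thread does not state the exact string.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0"


def build_request(url: str, user_agent: str) -> Request:
    """Build (but do not send) a GET request with an explicit User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


req = build_request(URL, BROWSER_UA)
print(req.get_method())              # GET
print(req.get_header("User-agent"))  # urllib stores header names capitalized this way
```

The request is only constructed here, not sent, so the sketch makes no claim about how the server responds today.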