I'm using url-encode to encode strings that I use for keys (see e.g. http://octopodial-chrome.com/tasting-notes/beer/Fuller%26rsquo%3Bs/2000%20Vintage%20Ale; the brewer and beer name strings are both Unicode/HTML and contain characters meaningful in a URL). While running my pages through the w3c validator, I discovered that url-encode doesn't encode apostrophes ('), so now I run my strings through (cl-who:encode-string (hunchentoot:url-encode string)); is this the appropriate way to do things?
It might be: url-encode turns a string into a string suitable for a URL and encode-string turns a string into a string suitable for an HTML attribute value. Still, it seems a bit...complex.
> While running my pages through the w3c validator, I discovered that url-encode doesn't encode apostrophes ('), so now I run my strings through (cl-who:encode-string (hunchentoot:url-encode string)); is this the appropriate way to do things?
I'm trying to follow along here, but I wasn't aware that cl-who had an encode-string function. What version are you using?
Second question: why does the apostrophe need to be encoded? Is it that the w3c validator wants it to be?
Cheers, Chris Dean
On Fri, 21 Dec 2007 13:57:15 -0700, Robert Uhl eadmund42@gmail.com wrote:
> I'm using url-encode to encode strings that I use for keys (see e.g. http://octopodial-chrome.com/tasting-notes/beer/Fuller%26rsquo%3Bs/2000%20Vintage%20Ale; the brewer and beer name strings are both Unicode/HTML and contain characters meaningful in a URL). While running my pages through the w3c validator, I discovered that url-encode doesn't encode apostrophes ('), so now I run my strings through (cl-who:encode-string (hunchentoot:url-encode string)); is this the appropriate way to do things?
> It might be: url-encode turns a string into a string suitable for a URL and encode-string turns a string into a string suitable for an HTML attribute value. Still, it seems a bit...complex.
It seems complex, but it is the right way to do it. FWIW, the CL-WHO function is called ESCAPE-STRING, not ENCODE-STRING, and that shows its intent more clearly.
As you said, you URL-encode the string to make it suitable for a URL. You might want to use the result for a header value in an HTTP reply, or as a URL you're giving to a client like Drakma. That's fine. But if you want to put the URL-encoded string into an HTML page, then the HTML rules apply, and you might have to escape the string in order not to create conflicts. That's life... :)
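The two steps described above can be sketched roughly as follows. This is only an illustration, assuming Quicklisp is available and that Hunchentoot and CL-WHO load from it; HREF-FOR-KEY and the "/tasting-notes/beer/" prefix are made-up names for this example, not part of either library.

```lisp
;; Assumption: Quicklisp is installed and can load both libraries.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload '(:hunchentoot :cl-who) :silent t))

;; HREF-FOR-KEY is a hypothetical helper, not part of Hunchentoot or CL-WHO.
(defun href-for-key (base key)
  "Return BASE followed by KEY, URL-encoded first and HTML-escaped second."
  ;; Step 1: URL-ENCODE makes KEY safe to use as a URL component.
  ;; Step 2: ESCAPE-STRING makes the finished URL safe inside an
  ;;         HTML attribute value.
  (cl-who:escape-string
   (concatenate 'string base (hunchentoot:url-encode key))))

;; Print a link whose href and link text are both escaped for HTML.
(format t "<a href=\"~A\">~A</a>~%"
        (href-for-key "/tasting-notes/beer/" "Fuller's")
        (cl-who:escape-string "Fuller's"))
```

The order matters: the URL-encoded string is what you would hand to an HTTP header or a client like Drakma, and only the copy that goes into the HTML page gets the extra escaping.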
Still, I'm surprised to see parts like "r%26rsquo%3Bs" in your example URL. With recent Hunchentoot and CL-WHO I get this:
CL-USER 4 > (hunchentoot:url-encode "Fuller's")
"Fuller's"

CL-USER 5 > (cl-who:escape-string *)
"Fuller&#039;s"
And if I put a link like
<a href="http://weitz.de/foo?a=Fuller's">click me</a>
into an HTML file, then Firefox will go to
http://weitz.de/foo?a=Fuller%27s
if I click the link. Your example almost looks as if you have it the other way around.
Edi.
On Sat, 22 Dec 2007 04:03:53 +0100, Edi Weitz edi@agharta.de wrote:
> FWIW, the CL-WHO function is called ESCAPE-STRING
And, I forgot to mention, you don't need CL-WHO for this:
http://weitz.de/hunchentoot/#escape-for-html
Edi.
Edi Weitz edi@agharta.de writes:
> And, I forgot to mention, you don't need CL-WHO for this:
Ah, thanks. I'm using CL-WHO and Hunchentoot, but in this particular place within my code it's a lot more logical to use Hunchentoot. I hadn't found ESCAPE-FOR-HTML because I was searching for 'encode' rather than 'escape' (silly me).
Thanks again for the good work. I couldn't have gotten where I am without it.
Edi Weitz edi@agharta.de writes:
> It seems complex, but it is the right way to do it. FWIW, the CL-WHO function is called ESCAPE-STRING, not ENCODE-STRING, and that shows its intent more clearly.
Doh! Typing from memory. Because of course with a computer capable of billions of operations per second it's just too painful to look up these minor details *grin*
> As you said, you URL-encode the string to make it suitable for a URL. You might want to use the result for a header value in an HTTP reply, or as a URL you're giving to a client like Drakma. That's fine. But if you want to put the URL-encoded string into an HTML page, then the HTML rules apply, and you might have to escape the string in order not to create conflicts. That's life... :)
Yeah, it makes sense--just a bit surprising I guess.
> Still, I'm surprised to see parts like "r%26rsquo%3Bs" in your example URL. With recent Hunchentoot and CL-WHO I get this:
> CL-USER 4 > (hunchentoot:url-encode "Fuller's")
> "Fuller's"
> CL-USER 5 > (cl-who:escape-string *)
> "Fuller&#039;s"
The '&rsquo;' is part of the original string (for hysterical raisins all strings in the database are HTML strings); it then gets encoded into %26rsquo%3B by URL-ENCODE; OTOH when I have an ASCII apostrophe (_not_ a right single quote) then it gets passed as-is by URL-ENCODE but gets escaped by ESCAPE-STRING.
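To make the two cases concrete, here is a small sketch that runs both kinds of name through URL-ENCODE and then through Hunchentoot's own ESCAPE-FOR-HTML. It assumes Quicklisp can load Hunchentoot; ENCODE-THEN-ESCAPE is a hypothetical helper name used only for this illustration.

```lisp
;; Assumption: Quicklisp is installed and can load Hunchentoot.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :hunchentoot :silent t))

;; ENCODE-THEN-ESCAPE is a hypothetical helper, not a Hunchentoot function.
(defun encode-then-escape (name)
  "Return two values: NAME URL-encoded, and that result escaped for HTML."
  (let ((encoded (hunchentoot:url-encode name)))
    (values encoded (hunchentoot:escape-for-html encoded))))

;; First name: an HTML entity stored in the database string.
;; Second name: a plain ASCII apostrophe.
(dolist (name '("Fuller&rsquo;s" "Fuller's"))
  (multiple-value-bind (encoded escaped) (encode-then-escape name)
    (format t "~S~%  url-encode      -> ~S~%  escape-for-html -> ~S~%"
            name encoded escaped)))
```

In the first case the "&" and ";" of the entity are percent-encoded, so nothing is left for the HTML escaping step to change; in the second the apostrophe survives URL-ENCODE and is only handled by the HTML escaping step.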
On Fri, 21 Dec 2007 21:30:53 -0700, Robert Uhl eadmund42@gmail.com wrote:
> OTOH when I have an ASCII apostrophe (_not_ a right single quote) then it gets passed as-is by URL-ENCODE
Yes, that's correct behaviour. You can test it here:
http://www.albionresearch.com/misc/urlencode.php
Edi.