On Tue, 29 Nov 2005 00:18:08 +0200, Ignas Mikalajunas ignas.mikalajunas@gmail.com wrote:
Content length is calculated by calling (length content) which produces wrong results with unicode characters in the string. Piso on #lisp proposed a solution - using (length (string-to-octets string :external-format :utf-8)) which translates to just (length (string-to-octets string :external-format)) in the code.
I won't do that because it's most likely a terrible performance hog if you convert each page to octets by default (assuming that most users already send octets).
I also don't understand why
(length (string-to-octets string :external-format :utf-8))
translates to
(length (string-to-octets string :external-format))
The true way to solve this would be using (file-string-length), but the function is not working properly on sbcl yet.
Huh? How is that supposed to work (even if it worked on SBCL)? *TBNL-STREAM* is a binary stream which accepts octets, isn't it?
So could you please fix the (send-output),
IMHO there's nothing to "fix" because TBNL works as expected. The docs clearly say that you're supposed to send octets, see for example here:
Note that the UTF-8 example that comes with TBNL sends a correct header.
FWIW, I've just released a new version where you can manually set the CONTENT-LENGTH slot of the REPLY object. If it is not NIL TBNL won't bother to compute the content length so you can set it to any value you want. Note, though, that you'll run into trouble w.r.t. TBNL/Apache interaction if you set a wrong value there.
because with the current setup browsers that strictly adhere to the content-length (IE 6.0, Opera) would trim 1 character of the response's body for each UTF-8 character in it.
Nope, that's not how UTF-8 works.
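To illustrate the point: in UTF-8 an ASCII character takes one octet and any other character takes between two and four, so the gap between (length string) and the octet count depends on which characters the string contains. The following is only a minimal sketch, assuming SBCL with Unicode support and its SB-EXT:STRING-TO-OCTETS function; the names UTF-8-CONTENT-LENGTH and *BODY* are made up for this example and are not part of TBNL.

```lisp
;; Sketch only: assumes SBCL. SB-EXT:STRING-TO-OCTETS is SBCL-specific;
;; other implementations need their own equivalent.
(defun utf-8-content-length (string)
  "Return the number of octets STRING occupies when encoded as UTF-8."
  (length (sb-ext:string-to-octets string :external-format :utf-8)))

;; Hypothetical body: "g", "r", u-umlaut (code point 252), "n".
;; Built with CODE-CHAR so the example does not depend on the
;; encoding of the source file.
(defparameter *body*
  (coerce (list #\g #\r (code-char 252) #\n) 'string))

(length *body*)                 ; character count: 4
(utf-8-content-length *body*)   ; octet count: 5 (u-umlaut needs two octets)
```

If you already send octets, as the docs ask, the content length is simply the length of that octet vector and no extra conversion is needed.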
Cheers, Edi.