Hello,
I am trying to make Shibboleth authentication work for an application of mine. For this I am using the Apache2 Shibboleth module (my server sits behind Apache). Everything works fine except for one thing. The authentication data is passed in the headers section, probably in UTF-8 format.
What I receive using (CDR (ASSOC :SN (HEADERS-IN*))) in my server is wrong: my last name, which is correctly spelled "Neuß" (with a German sharp s), is read as "NeuÃ" followed by a control character, which looks as if UTF-8 is read as LATIN-1.
I don't quite know how this arises. I have
SB-IMPL::*DEFAULT-EXTERNAL-FORMAT* = :UTF-8
HUNCHENTOOT:*HUNCHENTOOT-DEFAULT-EXTERNAL-FORMAT* = #<FLEXI-STREAMS::FLEXI-UTF-8-FORMAT (:UTF-8 :EOL-STYLE :LF) {B1595E9}>
Can anyone give me a hint what I could try?
Thank you, Nicolas
P.S.: I am using Hunchentoot-1.1.0/SBCL 1.0.18 on this machine.
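The symptom described in this message can be reproduced without any server involved. The following is a minimal sketch in plain Common Lisp, assuming that the header really carries UTF-8 octets and that the receiving side decodes them as ISO-8859-1. UTF-8-OCTETS and LATIN-1-STRING are hypothetical helper names written for this illustration; they are not part of Hunchentoot or FLEXI-STREAMS.

```lisp
;; Sketch only: hand-rolled helpers so the example needs no libraries.
;; UTF-8-OCTETS and LATIN-1-STRING are hypothetical names.

(defun utf-8-octets (string)
  "Return a list of the UTF-8 octets encoding STRING (surrogates not handled)."
  (loop for char across string
        for code = (char-code char)
        append (cond ((< code #x80)
                      (list code))
                     ((< code #x800)
                      (list (logior #xC0 (ash code -6))
                            (logior #x80 (logand code #x3F))))
                     ((< code #x10000)
                      (list (logior #xE0 (ash code -12))
                            (logior #x80 (logand (ash code -6) #x3F))
                            (logior #x80 (logand code #x3F))))
                     (t
                      (list (logior #xF0 (ash code -18))
                            (logior #x80 (logand (ash code -12) #x3F))
                            (logior #x80 (logand (ash code -6) #x3F))
                            (logior #x80 (logand code #x3F)))))))

(defun latin-1-string (octets)
  "Decode OCTETS as ISO-8859-1: each octet becomes the character with that code."
  (map 'string #'code-char octets))

;; "Neuß", built with CODE-CHAR so the source file encoding does not matter.
(defparameter *surname* (format nil "Neu~C" (code-char #xDF)))

;; What a UTF-8 sender would put on the wire.
(defparameter *wire-octets* (utf-8-octets *surname*))
;; => (78 101 117 195 159)

;; What a Latin-1 reader makes of those octets: five characters, not four.
(defparameter *misread* (latin-1-string *wire-octets*))
```

The sharp s becomes the two octets C3 9F, and a Latin-1 reader turns them into U+00C3 ("Ã") and the control character U+009F, which matches the garbled name reported above.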
Hi Nicolas,
On Thu, Oct 7, 2010 at 13:35, Nicolas Neuss neuss@kit.edu wrote:
I am trying to make Shibboleth authentication work for an application of mine. For this I am using the Apache2 Shibboleth module (my server sits behind Apache). Everything works fine except for one thing. The authentication data is passed in the headers section, probably in UTF-8 format.
Before going into any depth on this, can you have a look at the headers and find out in what character set they are encoded? HTTP headers must be encoded as ISO-8859-1. If a client wants to send data in a different encoding, the quoted-printable-style encoding described in RFC 2047 must be used.
Thus, it would be a client error if you saw UTF-8 encoded data in headers. If the data is properly encoded using RFC 2047 rules, the behavior that you see would be a bug.
-Hans
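For reference, an RFC 2047 encoded-word has the form =?charset?encoding?encoded-text?=. The sketch below builds one with the "Q" encoding from a list of octets. It is an illustration in plain Common Lisp, not something Apache, Shibboleth, or Hunchentoot is known to provide; Q-ENCODE-OCTETS and ENCODED-WORD are hypothetical names, and the 75-character limit per encoded-word is not enforced.

```lisp
;; Sketch only: Q-ENCODE-OCTETS and ENCODED-WORD are hypothetical helpers.

(defun q-encode-octets (octets)
  "Q-encode OCTETS conservatively: ASCII letters and digits stay as they are,
every other octet is written as =XX with uppercase hex digits."
  (with-output-to-string (out)
    (dolist (octet octets)
      (let ((char (code-char octet)))
        (if (and (< octet 128) (alphanumericp char))
            (write-char char out)
            ;; ~:@( ... ~) forces uppercase hex digits
            (format out "=~:@(~2,'0X~)" octet))))))

(defun encoded-word (charset octets)
  "Wrap the Q-encoded OCTETS in the =?charset?Q?...?= syntax."
  (format nil "=?~A?Q?~A?=" charset (q-encode-octets octets)))

;; The UTF-8 octets of "Neuß" are 78 101 117 195 159.
(encoded-word "UTF-8" '(78 101 117 195 159))
;; => "=?UTF-8?Q?Neu=C3=9F?="
```

The result is pure ASCII, so it survives any Latin-1 or ASCII handling of the header line; the receiver still has to decode the encoded-word to get the original string back.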
Thanks Hans,
I think this helps to pin down the problem.
For information about the headers: I called the server information page while authenticated via Shibboleth and obtained the following:
http://ruprecht.mathematik.uni-karlsruhe.de/misc/info.html
I have the following in my Apache configuration file
<VirtualHost *:443>
  ...
  <Location /sso>
    AuthType shibboleth
    ShibRequestSetting requireSession 1
    require valid-user
  </Location>

  ProxyVia On
  ProxyPassInterpolateEnv On
  ProxyPass /sso http://127.0.0.1:8004
  ProxyPassReverse /sso http://127.0.0.1:8004
  RequestHeader set sn %{sn}e
  ...
</VirtualHost>
Maybe the data in the sn environment variable is in UTF-8 format and is not encoded correctly for the header?
Nicolas
Hi Nicolas,
I'm sorry, but I don't know anything about "Shibboleth authentification" and not enough about Apache configuration to be able to help you from here.
On Thu, Oct 7, 2010 at 3:42 PM, Nicolas Neuss neuss@kit.edu wrote:
RequestHeader set sn %{sn}e ... </VirtualHost>
Maybe the data in the sn environment variable is in UTF-8 format and is not encoded correctly for the header?
Maybe, and if that is the case, it would be the source of the problem.
-Hans
On 7 October 2010 15:42, Nicolas Neuss neuss@kit.edu wrote:
Nicolas,
I think a log from the livehttpheaders Firefox plug-in or from a logging web proxy would be more useful. With that, you can check what is sent from the browser to the server directly at the HTTP layer.
Best regards,
Tomek Lipski
IIUC, that doesn't help. The result is:
Response headers - https://ruprecht.mathematik.uni-karlsruhe.de/sso/test-shibboleth
Date: Thu, 07 Oct 2010 14:09:45 GMT
Server: Hunchentoot 1.1.0
Content-Length: 4687
Content-Type: text/html; charset=utf-8
Via: 1.1 ruprecht.mathematik.uni-karlsruhe.de
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
which shows that the encoding of the page shown in the browser should be correct.
Thanks, Nicolas
Nicolas,
I'd use Wireshark or another packet sniffing tool to capture the request as it is exchanged between Apache and Hunchentoot. That capture would show which encoding is used for the headers. Failing that, you could also try setting HUNCHENTOOT:*HEADER-STREAM* to *STANDARD-OUTPUT* or a stream of your liking and inspect its contents. Using an external sniffer would be safer, though, in that you'd know that nothing has processed the headers.
Given that you already suspect that the environment variable Apache uses holds data in the wrong encoding, why don't you change that to use proper RFC 2047 encoding?
-Hans
tbnl-devel site list tbnl-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/tbnl-devel
Hans,
thanks for the hints. I'll try it, but it may take a while.
As for why I don't teach Apache to properly encode, I simply don't know how:-) Googling didn't help.
Nicolas
P.S.: A simple preliminary solution might be to convert the string back myself using something like
(flexi-streams:octets-to-string (flexi-streams:string-to-octets *x* :external-format :latin1) :external-format :utf-8)
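The round trip in the P.S. could be wrapped in a small function that leaves a value alone when it cannot be a misread UTF-8 string. This is only a sketch: it assumes FLEXI-STREAMS is already loaded (Hunchentoot depends on it) and that FLEXI-STREAMS signals an error on input it cannot encode or decode, which should be the case as long as FLEXI-STREAMS:*SUBSTITUTION-CHAR* is not set. REDECODE-HEADER-VALUE is a hypothetical name.

```lisp
;; Sketch only: assumes FLEXI-STREAMS is loaded, e.g. as a dependency of Hunchentoot.
;; REDECODE-HEADER-VALUE is a hypothetical helper, not part of any library.

(defun redecode-header-value (value)
  "Reinterpret VALUE, which was decoded as Latin-1, as UTF-8.
Return VALUE unchanged if the conversion signals an error, for example
because VALUE contains characters outside Latin-1 or its octets are
not valid UTF-8."
  (handler-case
      (flexi-streams:octets-to-string
       (flexi-streams:string-to-octets value :external-format :iso-8859-1)
       :external-format :utf-8)
    ;; Catching ERROR is deliberately broad for this sketch.
    (error ()
      value)))

;; Possible use inside a Hunchentoot handler, with the accessor from the
;; first message:
;;   (redecode-header-value (cdr (assoc :sn (hunchentoot:headers-in*))))
```

Pure ASCII values pass through unchanged because ASCII is encoded identically in Latin-1 and UTF-8. The function remains a workaround on the receiving side; it does not change what Apache puts on the wire.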