Has anyone else noticed excessive visitation of pages served dynamically by Hunchentoot? On a couple of small sites virtually every page gets hit every five minutes by http://www.google.com/bot.html, regular as clockwork.
Normally, I set up robots.txt to allow Google to walk these sites, but unless I can think of a way to tell them to back off to once a week or something, I'm about to Disallow them. They're just little private sites, and Google ends up being 99% of the traffic. It's silly.
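(For reference, the Disallow route I'm threatening is just the standard two-line robots.txt, which blocks all compliant crawlers from the whole site:)

# Block all well-behaved crawlers from everything.
User-agent: *
Disallow: /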
--Jeff
On Sun, Jun 29, 2008 at 04:31:47PM -0700, Jeff Cunningham wrote:
Has anyone else noticed excessive visitation of pages served dynamically by Hunchentoot? On a couple of small sites virtually every page gets hit every five minutes by http://www.google.com/bot.html, regular as clockwork.
Normally, I set up robots.txt to allow Google to walk these sites, but unless I can think of a way to tell them to back off to once a week or something, I'm about to Disallow them. They're just little private sites, and Google ends up being 99% of the traffic. It's silly.
FWIW, I haven't had any trouble like that with two different Hunchentoot sites.
Zach
Zach Beane wrote:
On Sun, Jun 29, 2008 at 04:31:47PM -0700, Jeff Cunningham wrote:
Has anyone else noticed excessive visitation of pages served dynamically by Hunchentoot? On a couple of small sites virtually every page gets hit every five minutes by http://www.google.com/bot.html, regular as clockwork.
Normally, I set up robots.txt to allow Google to walk these sites, but unless I can think of a way to tell them to back off to once a week or something, I'm about to Disallow them. They're just little private sites, and Google ends up being 99% of the traffic. It's silly.
FWIW, I haven't had any trouble like that with two different Hunchentoot sites.
Is there something you do to indicate that the page has a fixed date? I was wondering whether dynamically generated pages look "changed" to Google, so they get thrown back in the queue to be rechecked.
On Sun, Jun 29, 2008 at 05:01:06PM -0700, Jeff Cunningham wrote:
Is there something you do to indicate that the page has a fixed date? I was wondering whether dynamically generated pages look "changed" to Google, so they get thrown back in the queue to be rechecked.
Nope, I don't do anything like that.
Zach
Zach Beane wrote:
On Sun, Jun 29, 2008 at 05:01:06PM -0700, Jeff Cunningham wrote:
Is there something you do to indicate that the page has a fixed date? I was wondering whether dynamically generated pages look "changed" to Google, so they get thrown back in the queue to be rechecked.
Nope, I don't do anything like that.
Well, here's another hypothesis: maybe Google isn't happy with the response it is getting from my sites. My error log shows this sequence:
[2008-06-29 16:25:23 [INFO]] No session for session identifier '491:2BB8EA11136C90E6BA7D7F466951E370' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-29 16:25:23 [WARNING]] Warning while processing connection: Unexpected character , after <meta
[2008-06-29 16:25:43 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {195190D9}>.
[2008-06-29 16:27:27 [INFO]] No session for session identifier '481:C9244EC27C31213FFE797F9E2ABE1535' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-29 16:27:27 [WARNING]] Warning while processing connection: Unexpected character , after <meta
I've looked high and low for bad syntax in the generation of a meta tag, but it isn't there. I think it is an artifact of the timeout or something. Anyway, I'm wondering if the googlebot doesn't like the response my server gives it and doesn't respond; the server waits and times out, and by the time the googlebot finally gets back to it the session identifier is bad. Any thoughts?
--Jeff
On Mon, Jun 30, 2008 at 2:15 AM, Jeff Cunningham jeffrey@cunningham.net wrote:
[2008-06-29 16:25:23 [INFO]] No session for session identifier '491:2BB8EA11136C90E6BA7D7F466951E370' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-29 16:25:23 [WARNING]] Warning while processing connection: Unexpected character , after <meta
[2008-06-29 16:25:43 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {195190D9}>.
[2008-06-29 16:27:27 [INFO]] No session for session identifier '481:C9244EC27C31213FFE797F9E2ABE1535' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-29 16:27:27 [WARNING]] Warning while processing connection: Unexpected character , after <meta
I've looked high and low for bad syntax in the generation of a meta tag, but it isn't there. I think it is an artifact of the timeout or something. Anyway, I'm wondering if the googlebot doesn't like the response my server gives it and doesn't respond; the server waits and times out, and by the time the googlebot finally gets back to it the session identifier is bad. Any thoughts?
I'd set HUNCHENTOOT:*HEADER-STREAM* to *STANDARD-OUTPUT* and *BREAK-ON-SIGNALS* to 'WARNING, then wait for the Googlebot request to come in. The headers printed to the console may give you a clue what the request looks like, and maybe a way to initiate such a failing request yourself, perhaps with Drakma or wget. You may also be able to get a clue from looking at the backtrace in a debugger.
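Concretely, that setup is just two assignments at the REPL (*HEADER-STREAM* is documented Hunchentoot API; *BREAK-ON-SIGNALS* is standard Common Lisp):

;; Echo all incoming and outgoing headers to the console.
(setf hunchentoot:*header-stream* *standard-output*)
;; Drop into the debugger whenever a WARNING is signalled.
(setf *break-on-signals* 'warning)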
I find it curious that Google retries every five minutes. Did you verify that the request is coming from a Google IP address? It may also be a prankster's script gone wild, in which case I'd block the IP address.
Or ignore the issue. The Internet _is_ silly, after all.
-Hans
Hans Hübner wrote:
I'd set HUNCHENTOOT:*HEADER-STREAM* to *STANDARD-OUTPUT* and *BREAK-ON-SIGNALS* to 'WARNING, then wait for the Googlebot request to come in. The headers printed to the console may give you a clue what the request looks like, and maybe a way to initiate such a failing request yourself, perhaps with Drakma or wget. You may also be able to get a clue from looking at the backtrace in a debugger.
Good suggestion, Hans. I found a contraction inside a meta content string that was delimited with single quotes; the apostrophe in the contraction ended the attribute value early, which explains the "Unexpected character" warning.
I find it curious that Google retries every five minutes. Did you verify that the request is coming from a Google IP address? It may also be a prankster's script gone wild, in which case I'd block the IP address.
It is curious. I'd checked it right off - it is a Google IP address. Seems like a terrible waste of their bandwidth.
--Jeff
On 6/30/08, Jeff Cunningham jeffrey@cunningham.net wrote:
Has anyone else noticed excessive visitation of pages served dynamically by Hunchentoot?
My site is not very popular, and Googlebot visits it from time to time. The shortest interval was 36 hours.
Stas Boukarev wrote:
On 6/30/08, Jeff Cunningham jeffrey@cunningham.net wrote:
Has anyone else noticed excessive visitation of pages served dynamically by Hunchentoot?
My site is not very popular, and Googlebot visits it from time to time. The shortest interval was 36 hours.
Hmmm. Now I'm feeling really "special", and I'll bet my sites are even less popular.
Normally, I set up robots.txt to allow Google to walk these sites, but unless I can think of a way to tell them to back off to once a week or something, I'm about to Disallow them. They're just little private sites, and Google ends up being 99% of the traffic. It's silly.
You can set the crawl rate on your site a couple of ways. I think (but am not sure) that Google supports the Crawl-delay robots.txt directive. You can also set the Googlebot crawl rate for 90 days in the Webmaster Tools.
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=4862...
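If Crawl-delay does work for you, the robots.txt entry would look something like this (the 60 is illustrative, and since the directive is non-standard, support varies by crawler):

User-agent: Googlebot
# Ask for at least 60 seconds between requests (non-standard; may be ignored).
Crawl-delay: 60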
BTW, I have not experienced this problem myself.
Cheers, Chris Dean
Jeff Cunningham jeffrey@cunningham.net writes:
Has anyone else noticed excessive visitation of pages served dynamically by Hunchentoot? On a couple of small sites virtually every page gets hit every five minutes by http://www.google.com/bot.html, regular as clockwork.
Normally, I set up robots.txt to allow Google to walk these sites, but unless I can think of a way to tell them to back off to once a week or something, I'm about to Disallow them. They're just little private sites, and Google ends up being 99% of the traffic. It's silly.
--Jeff
I do have the same problem: my log file is 26 MB now, and most of it is entries like
...
[2008-06-30 09:15:26 [NOTICE]] No session for session identifier '29894:F42573291C09820112231789C68D301A' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-30 09:15:26] 127.0.0.1 (66.249.72.235) - "GET /vws/select-event?hunchentoot-session=29894%3AF42573291C09820112231789C68D301A HTTP/1.1" 200 1783 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
[2008-06-30 09:17:20 [NOTICE]] No session for session identifier '30162:09996937F640DAF47EE8B2855A18D04F' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-30 09:17:20] 127.0.0.1 (66.249.72.235) - "GET /vws/select-event?hunchentoot-session=30162%3A09996937F640DAF47EE8B2855A18D04F HTTP/1.1" 200 1783 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
[2008-06-30 09:19:15 [NOTICE]] No session for session identifier '30110:EE7334ECC6F70F8D5E6D1FB65CE62636' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-30 09:19:15] 127.0.0.1 (66.249.72.235) - "GET /vws/select-event?hunchentoot-session=30110%3AEE7334ECC6F70F8D5E6D1FB65CE62636 HTTP/1.1" 200 1783 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
[2008-06-30 09:20:46 [NOTICE]] No session for session identifier '27870:BFFDF4E7BA37F6508E41566C1E84BE70' (User-Agent: 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X; de; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14', IP: '127.0.0.1')
...
Unfortunately, the site is heavily used for administering courses at our institute (http://www.vorlesungsverwaltung.de), so I don't want to experiment at the moment. Maybe we should contact Google and ask what the reason could be. I have not done so yet, mostly because I am quite short on time.
Yours, Nicolas
I've cleaned up a couple of small problems using page validators, but the net result has been that the Googlebot has gone from visiting my site every minute or so to every second or so. No kidding.
Before I block them altogether, there is one thing I don't understand that I'm hoping someone can explain to me. What does it mean exactly when I get a "No session for session identifier" INFO message in my error_log? There is one of these for each of the Googlebot hits.
--Jeff
Here is a snippet of the access_log (followed by the corresponding error_log snippet):
127.0.0.1 (66.249.66.195) - [2008-07-04 07:39:32] "GET /zippy.html?hunchentoot-session=482%3A1424DD49911CD85E35CE168E602686DC HTTP/1.1" 200 4284 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:39:32] "GET /robogames-2008.html?hunchentoot-session=483%3AFBDE26FEB3370A5DC63A565E2611DD5A HTTP/1.1" 200 12724 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:39:33] "GET /robogames-2008.html?hunchentoot-session=482%3A1424DD49911CD85E35CE168E602686DC HTTP/1.1" 200 12724 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:39:34] "GET /zippy.html?hunchentoot-session=485%3A97280661B72EBFC07C48B8F100FCB60C HTTP/1.1" 200 4284 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:39:34] "GET /robogames-2008.html?hunchentoot-session=456%3A211AA204AD4D3A3703E0FA80DF3D0D51 HTTP/1.1" 200 12724 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:39:35] "GET /zippy.html?hunchentoot-session=289%3A8676A2453C327D610DDB4F79A48A61F6 HTTP/1.1" 200 4284 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:39:36] "GET /zippy.html?hunchentoot-session=177%3A6EF66F58D2E0A5919AACB700DDB5D435 HTTP/1.1" 200 4284 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:39:37] "GET /robogames-2008.html?hunchentoot-session=464%3AC5740E8F90480BCF319EFCD9C7A8EE88 HTTP/1.1" 200 12724 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:42:39] "GET /zippy.html?hunchentoot-session=483%3AFBDE26FEB3370A5DC63A565E2611DD5A HTTP/1.1" 200 4284 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:46:28] "GET /zippy.html?hunchentoot-session=478%3AB00C2E319A15E5658D72E232C36F30DD HTTP/1.1" 200 4284 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:50:18] "GET /zippy.html?hunchentoot-session=230%3A85588DC64DDDBF9219B71F4DFA746B50 HTTP/1.1" 200 4284 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 (66.249.66.195) - [2008-07-04 07:54:08] "GET /zippy.html?hunchentoot-session=145%3A4C7B146A126FC14ACFC7BFA8100C07D8 HTTP/1.1" 200 4284 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
HERE ARE THE CORRESPONDING ERROR LOG ENTRIES:
[2008-07-04 07:39:32 [INFO]] No session for session identifier '483:FBDE26FEB3370A5DC63A565E2611DD5A' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:39:33 [INFO]] No session for session identifier '482:1424DD49911CD85E35CE168E602686DC' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:39:34 [INFO]] No session for session identifier '485:97280661B72EBFC07C48B8F100FCB60C' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:39:34 [INFO]] No session for session identifier '456:211AA204AD4D3A3703E0FA80DF3D0D51' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:39:35 [INFO]] No session for session identifier '289:8676A2453C327D610DDB4F79A48A61F6' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:39:36 [INFO]] No session for session identifier '177:6EF66F58D2E0A5919AACB700DDB5D435' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:39:37 [INFO]] No session for session identifier '464:C5740E8F90480BCF319EFCD9C7A8EE88' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:39:57 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {133B61D9}>.
[2008-07-04 07:42:38 [INFO]] No session for session identifier '483:FBDE26FEB3370A5DC63A565E2611DD5A' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:42:59 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {13C504E1}>.
[2008-07-04 07:46:28 [INFO]] No session for session identifier '478:B00C2E319A15E5658D72E232C36F30DD' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:46:48 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {13C740D1}>.
[2008-07-04 07:50:18 [INFO]] No session for session identifier '230:85588DC64DDDBF9219B71F4DFA746B50' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:50:38 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {13C50F59}>.
[2008-07-04 07:54:08 [INFO]] No session for session identifier '145:4C7B146A126FC14ACFC7BFA8100C07D8' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:54:28 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {13C84E89}>.
[2008-07-04 07:57:59 [INFO]] No session for session identifier '478:B00C2E319A15E5658D72E232C36F30DD' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 07:58:19 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {13C5BF61}>.
[2008-07-04 08:01:49 [INFO]] No session for session identifier '493:640F2D5F184258D62527F8BD4AA0B91B' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 08:02:09 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {13C735C9}>.
[2008-07-04 08:05:40 [INFO]] No session for session identifier '308:FA7CD1A4C4592F69DA33BE6826383D46' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-07-04 08:06:00 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {13C624A9}>.
On 7/4/08, Jeff Cunningham jeffrey@cunningham.net wrote:
Before I block them altogether, there is one thing I don't understand that I'm hoping someone can explain to me. What does it mean exactly when I get a "No session for session identifier" INFO message in my error_log? There is one of these for each of the Googlebot hits.
It means that the Googlebot presented, as a hunchentoot-session parameter, a session identifier string that is not valid. You are probably using sessions very frequently, and the Google crawler managed to hit one of the URLs on your server that starts a session. As the crawler did not accept the cookie that Hunchentoot sent, Hunchentoot fell back to attaching the session identifier to all URLs in the outgoing HTML as a parameter. The crawler saved the URLs it saw, including the session identifiers, and now tries to crawl using these identifiers, which are probably old and no longer valid.
First off, I would recommend that you switch off URL-REWRITE (http://weitz.de/hunchentoot/#*rewrite-for-session-urls*). I am not using it myself, precisely because it confuses simple crawlers. If a user does not accept the cookies my site sends, they will not be able to use it with sessions; for me, this has never been a problem. Switching it off will probably not help with your current problem, but it will make things easier in the future.
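For reference, that is a one-liner with the special variable documented at the link above:

;; Stop appending hunchentoot-session parameters to URLs in outgoing
;; HTML; sessions then work only for clients that accept cookies.
(setf hunchentoot:*rewrite-for-session-urls* nil)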
In general, crawlers do not support cookies or session IDs in GET parameters. Thus, if you want to support crawlers, you need to make your pages work without sessions. Note that if you do nothing except switch off URL-REWRITE, every request from a crawler will create a new session. This may or may not be a problem.
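One way to avoid piling up sessions would be to check the User-Agent before starting a session. A rough sketch, not tested: CRAWLERP and MAYBE-START-SESSION are made-up helpers, while USER-AGENT and START-SESSION are Hunchentoot's own functions.

;; Made-up helper: a crude User-Agent test for common crawlers.
(defun crawlerp (user-agent)
  (and user-agent
       (or (search "Googlebot" user-agent)
           (search "Slurp" user-agent))))

;; Call this in your handlers instead of START-SESSION directly.
(defun maybe-start-session ()
  (unless (crawlerp (hunchentoot:user-agent))
    (hunchentoot:start-session)))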
I guess that Google now has a lot of your URLs it wants to crawl, because the different session identifiers made it think that all of them point to different resources. I am kind of wondering whether that is standard Googlebot behaviour.
Lastly, I would vote for switching off URL-REWRITE by default.
-Hans
Hans Hübner wrote:
It means that the Googlebot presented, as a hunchentoot-session parameter, a session identifier string that is not valid. You are probably using sessions very frequently, and the Google crawler managed to hit one of the URLs on your server that starts a session. As the crawler did not accept the cookie that Hunchentoot sent, Hunchentoot fell back to attaching the session identifier to all URLs in the outgoing HTML as a parameter. The crawler saved the URLs it saw, including the session identifiers, and now tries to crawl using these identifiers, which are probably old and no longer valid.
First off, I would recommend that you switch off URL-REWRITE (http://weitz.de/hunchentoot/#*rewrite-for-session-urls*). I am not using it myself, precisely because it confuses simple crawlers. If a user does not accept the cookies my site sends, they will not be able to use it with sessions; for me, this has never been a problem. Switching it off will probably not help with your current problem, but it will make things easier in the future.
In general, crawlers do not support cookies or session IDs in GET parameters. Thus, if you want to support crawlers, you need to make your pages work without sessions. Note that if you do nothing except switch off URL-REWRITE, every request from a crawler will create a new session. This may or may not be a problem.
I guess that Google now has a lot of your URLs it wants to crawl, because the different session identifiers made it think that all of them point to different resources. I am kind of wondering whether that is standard Googlebot behaviour.
Lastly, I would vote for switching off URL-REWRITE by default.
Thanks for the excellent explanation. It fits all the available facts. I've turned off *REWRITE-FOR-SESSION-URLS*, so presumably Google should eventually figure out that the URLs it has are bad and drop them in favor of the sessionless ones (I hope).
I switched to a non-googlebotted site to experiment with, and for some reason, even when I'm not using sessions, I see a "Fake session identifier" message in the error log when I browse a page myself. I cleared my cache; here's an example:
[2008-07-04 14:46:34 [WARNING]] Fake session identifier '1:D5C66E2968BE2162C3164B39B9029F13' (User-Agent: 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.14) Gecko/20080404 Iceweasel/2.0.0.14 (Debian-2.0.0.14-2)', IP: '127.0.0.1')
That error message corresponds to this access log entry and this header output:
127.0.0.1 (192.168.1.1) - [2008-07-04 14:46:34] "GET / HTTP/1.1" 200 9195 "-" "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.14) Gecko/20080404 Iceweasel/2.0.0.14 (Debian-2.0.0.14-2)"
GET / HTTP/1.1
Host: 127.0.0.1:4242
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.14) Gecko/20080404 Iceweasel/2.0.0.14 (Debian-2.0.0.14-2)
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Cookie: hunchentoot-session=1%3AD5C66E2968BE2162C3164B39B9029F13
Max-Forwards: 10
X-Forwarded-For: 192.168.1.1
X-Forwarded-Host: cunningham.homeip.net
X-Forwarded-Server: test.com
Connection: Keep-Alive
HTTP/1.1 200 OK
Content-Length: 9195
Date: Fri, 04 Jul 2008 21:46:34 GMT
Server: Hunchentoot 1.0.0
Keep-Alive: timeout=20
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
--Jeff
Never mind on the last question: I still had a cookie set in the browser, which it was sending back to the server. I thought I'd cleared it. Sorry about the noise.
--Jeff