Jeff Cunningham <jeffrey@cunningham.net> writes:
Has anyone else noticed excessive crawling of pages served dynamically by Hunchentoot? On a couple of small sites, virtually every page gets hit every five minutes by Googlebot (http://www.google.com/bot.html), regular as clockwork.
Normally, I set up robots.txt to allow Google to walk these sites, but unless I can think of a way to tell them to back off to once a week or so, I'm about to Disallow them. They're just little private sites, and Google ends up being 99% of the traffic. It's silly.
--Jeff
I have the same problem: my log file is 26 MB now, and most of it consists of entries like
...
[2008-06-30 09:15:26 [NOTICE]] No session for session identifier '29894:F42573291C09820112231789C68D301A' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-30 09:15:26] 127.0.0.1 (66.249.72.235) - "GET /vws/select-event?hunchentoot-session=29894%3AF42573291C09820112231789C68D301A HTTP/1.1" 200 1783 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
[2008-06-30 09:17:20 [NOTICE]] No session for session identifier '30162:09996937F640DAF47EE8B2855A18D04F' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-30 09:17:20] 127.0.0.1 (66.249.72.235) - "GET /vws/select-event?hunchentoot-session=30162%3A09996937F640DAF47EE8B2855A18D04F HTTP/1.1" 200 1783 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
[2008-06-30 09:19:15 [NOTICE]] No session for session identifier '30110:EE7334ECC6F70F8D5E6D1FB65CE62636' (User-Agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', IP: '127.0.0.1')
[2008-06-30 09:19:15] 127.0.0.1 (66.249.72.235) - "GET /vws/select-event?hunchentoot-session=30110%3AEE7334ECC6F70F8D5E6D1FB65CE62636 HTTP/1.1" 200 1783 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
[2008-06-30 09:20:46 [NOTICE]] No session for session identifier '27870:BFFDF4E7BA37F6508E41566C1E84BE70' (User-Agent: 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X; de; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14', IP: '127.0.0.1')
...
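To get a rough picture of how much of such a log is Googlebot, and whether every request carries a different hunchentoot-session value in the URL, something like the following Python sketch could be run over the log file. The regular expression is only an assumption derived from the access-log lines quoted above, and the names ACCESS_RE and summarize are made up for this illustration; a different log format would need a different pattern.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Pattern guessed from the access-log lines quoted above; it deliberately
# does not match the "[... [NOTICE]] No session ..." lines.
ACCESS_RE = re.compile(
    r'\[(?P<time>[\d\- :]+)\] \S+ \((?P<ip>[\d.]+)\) - '
    r'"GET (?P<url>\S+) HTTP/[\d.]+" (?P<status>\d+) (?P<size>\d+) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Return (number of Googlebot requests, set of session ids they carried)."""
    hits = 0
    sessions = set()
    for line in lines:
        match = ACCESS_RE.search(line)
        if match is None or "Googlebot" not in match.group("agent"):
            continue
        hits += 1
        # parse_qs also decodes the %3A in the session id back to ':'.
        query = parse_qs(urlsplit(match.group("url")).query)
        sessions.update(query.get("hunchentoot-session", []))
    return hits, sessions


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py /path/to/logfile
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        hits, sessions = summarize(log)
    print(f"{hits} Googlebot requests, {len(sessions)} distinct session ids")
```

If the number of distinct session ids comes out close to the number of requests, that would suggest the crawler keeps requesting the same page under ever-changing URLs, which might be one reason for the volume. That is only a guess from the excerpt, though.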
Unfortunately, the site is heavily used for administering courses at our institute (http://www.vorlesungsverwaltung.de), so I don't want to experiment at the moment. Maybe we should contact Google and ask what the reason could be. I have not done so yet, mostly because I am quite short on time.
Yours, Nicolas