On Wed, Feb 20, 2008 at 08:09:18AM -0800, Jeff Cunningham wrote:
I am having a problem that I can't replicate on demand but has been happening with increasing frequency. I am hoping some of you might suggest ways to troubleshoot it. I don't know that it is a Hunchentoot problem, per se, but it may be an interaction problem between Hunchentoot and Apache2 via mod_lisp2.
The symptom is that the server is hung first thing in the morning when I check it. The CPU is at 99% for the server image (sbcl). When I look in the error_log I see dozens of these:
[snip]
I had trouble like this in the past, and so far in every case the cause was a mistake of mine in a handler that sent it into an infinite loop. I don't know if that's the case for you, but I had good luck debugging it by adding extra info (such as the request being served) to thread names, so I could identify handlers that might be stuck.
http://xach.livejournal.com/132391.html has a writeup.
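As a rough illustration of the thread-naming idea, here is a minimal SBCL-only sketch. It is not the code from the writeup: WITH-REQUEST-THREAD-NAME is a hypothetical helper name, and the label format is just an assumption about what would be useful to see in a thread listing.

```lisp
;; SBCL-specific sketch. WITH-REQUEST-THREAD-NAME is a hypothetical helper,
;; not part of Hunchentoot or SBCL.
(defmacro with-request-thread-name ((label) &body body)
  "Run BODY with LABEL appended to the current thread's name,
restoring the previous name when BODY exits (normally or not)."
  (let ((thread (gensym "THREAD"))
        (old-name (gensym "OLD-NAME")))
    `(let* ((,thread sb-thread:*current-thread*)
            ;; Unnamed threads report NIL; fall back to a placeholder string.
            (,old-name (or (sb-thread:thread-name ,thread) "unnamed")))
       (setf (sb-thread:thread-name ,thread)
             (format nil "~A [~A]" ,old-name ,label))
       (unwind-protect (progn ,@body)
         (setf (sb-thread:thread-name ,thread) ,old-name)))))
```

Wrapping a handler body as (with-request-thread-name ("/some/uri") ...) means a stuck handler shows up in (sb-thread:list-all-threads) with the request it was serving. How you get the URI string depends on your Hunchentoot version.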
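You can also troubleshoot by interrupting every thread and having each one print its backtrace (best to write them to a file or save them in a table, so they survive after the image is restarted). A thread stuck in a loop will show the same frames each time you do this, which helps identify the problem.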
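Something along these lines might work for the backtrace dump. It is only a sketch for a reasonably recent SBCL (it assumes sb-debug:print-backtrace is available; older releases used a different function), and DUMP-ALL-BACKTRACES, its PATH argument, and the FRAMES keyword are hypothetical names chosen for this example.

```lisp
;; SBCL-specific sketch. DUMP-ALL-BACKTRACES is a hypothetical helper.
(defun dump-all-backtraces (path &key (frames 50))
  "Ask every thread except the caller to append its backtrace to PATH.
Interrupts are asynchronous, so the file fills in shortly after this returns."
  ;; One lock shared by all interrupted threads keeps their output
  ;; from interleaving in the file.
  (let ((lock (sb-thread:make-mutex :name "backtrace dump")))
    (dolist (thread (sb-thread:list-all-threads))
      (unless (eq thread sb-thread:*current-thread*)
        (handler-case
            (sb-thread:interrupt-thread
             thread
             (lambda ()
               ;; This runs inside the interrupted thread.
               (sb-thread:with-mutex (lock)
                 (with-open-file (out path :direction :output
                                           :if-exists :append
                                           :if-does-not-exist :create)
                   (format out "~&=== ~A ===~%"
                           (sb-thread:thread-name sb-thread:*current-thread*))
                   (sb-debug:print-backtrace :stream out :count frames)
                   (terpri out)))))
          ;; The thread may have exited between listing and interrupting.
          (sb-thread:interrupt-thread-error () nil))))))
```

Call it from a REPL attached to the hung image, for example (dump-all-backtraces "/tmp/backtraces.txt"), wait a moment, then read the file. Combined with descriptive thread names, the section headers tell you which request each backtrace belongs to.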
Zach