On Nov 16, 2008, at 11:08 AM, Hans Hübner wrote:
Hi Cyrus,
when you see worker threads accumulate, do you also see that there is a large number of connections to Hunchentoot that are not being closed? Or are there just threads, but no connections? Are you running Hunchentoot behind a proxy/frontend, or standalone? How many dead workers did you see?
So, how do I see if the connections are still open? It's only two so far, but here's what it looks like:
CL-USER> (sb-thread:list-all-threads)
(#<SB-THREAD:THREAD "repl-thread" RUNNING {5CBD0411}>
 #<SB-THREAD:THREAD "reader-thread" RUNNING {5CBD02C1}>
 #<SB-THREAD:THREAD "control-thread" RUNNING {5CBD0169}>
 #<SB-THREAD:THREAD "Hunchentoot worker (client: 66.255.53.123:35892)" RUNNING {5CDF8E91}>
 #<SB-THREAD:THREAD "Hunchentoot worker (client: 66.255.53.123:34650)" RUNNING {5CD329A1}>
 #<SB-THREAD:THREAD "auto-flush-thread" RUNNING {5CBBFF79}>
 #<SB-THREAD:THREAD "Hunchentoot acceptor (*:443)" RUNNING {5A5C7019}>
 #<SB-THREAD:THREAD "Hunchentoot acceptor (*:80)" RUNNING {5A3D80B9}>
 #<SB-THREAD:THREAD "Swank 4005" RUNNING {5A2304A1}>
 #<SB-THREAD:THREAD "initial thread" RUNNING {598D7BE1}>)
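A small helper can make lingering workers easier to spot than scanning the full thread list by eye. This is only a sketch: the function name hunchentoot-worker-threads is hypothetical, and it assumes worker threads keep the "Hunchentoot worker" name prefix shown in the listing above. It uses only sb-thread:list-all-threads and sb-thread:thread-name from SBCL.

```lisp
;; Hypothetical helper, not part of Hunchentoot.
;; Assumes worker thread names start with "Hunchentoot worker",
;; as in the thread listing above.
(defun hunchentoot-worker-threads ()
  "Return the live SBCL threads whose name starts with \"Hunchentoot worker\"."
  (remove-if-not
   (lambda (thread)
     (let ((name (sb-thread:thread-name thread)))
       ;; THREAD-NAME may be NIL for unnamed threads.
       (and name
            (eql 0 (search "Hunchentoot worker" name)))))
   (sb-thread:list-all-threads)))

;; Example use at the REPL: count the workers currently alive.
;; (length (hunchentoot-worker-threads))
```

Calling it periodically and comparing the counts would show whether workers accumulate over time; the client address in each thread name can then be matched against the open sockets reported by the operating system.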
Here's a backtrace from thread 4:
0: (SB-DEBUG::MAP-BACKTRACE #<CLOSURE (LAMBDA #) {5A4E49D5}>)[:EXTERNAL]
1: (BACKTRACE 536870911 #<SYNONYM-STREAM :SYMBOL *TERMINAL-IO* {5812FA99}>)
2: ((FLET SB-UNIX::INTERRUPTION))
3: ((FLET #:WITHOUT-INTERRUPTS-BODY-[INVOKE-INTERRUPTION]10))
4: (SB-SYS:INVOKE-INTERRUPTION #<FUNCTION (FLET SB-UNIX::INTERRUPTION) {58049035}>)
5: ("foreign function: call_into_lisp")
6: ("foreign function: post_signal_tramp")
7: ("foreign function: nanosleep")
8: ((LAMBDA ()))
9: (SB-C::INVOKE-WITH-SAVED-FP-AND-PC #<CLOSURE (LAMBDA #) {5A4D929D}>)
10: (SB-UNIX:NANOSLEEP 0 150000016)
11: (SLEEP 0.15)
12: (SWANK-BACKEND::FLUSH-STREAMS)
13: (SWANK-BACKEND::FLUSH-STREAMS)[:EXTERNAL]
14: ((FLET SB-THREAD::WITH-MUTEX-THUNK))
15: ((FLET #:WITHOUT-INTERRUPTS-BODY-[CALL-WITH-MUTEX]477))
16: (SB-THREAD::CALL-WITH-MUTEX #<CLOSURE (FLET SB-THREAD::WITH-MUTEX-THUNK) {29E3DDE5}> #S(SB-THREAD:MUTEX :NAME "thread result lock" :%OWNER #<SB-THREAD:THREAD "auto-flush-thread" RUNNING {5CBBFF79}> :STATE 1) #<SB-THREAD:THREAD "auto-flush-thread" RUNNING {5CBBFF79}> T)
17: ((LAMBDA ()))
18: ("foreign function: call_into_lisp")
19: ("foreign function: funcall0")
20: ("foreign function: new_thread_trampoline")
21: ("foreign function: _pthread_create")
Also, I don't know if it's related, but I see lots of these messages in the log:
[2008-11-17 06:59:47 [ERROR]] Error while processing connection: I/O timeout reading #<SB-SYS:FD-STREAM for "a socket" {5A465A29}>.
Thanks for taking a look at this!
Cyrus
NB: I do certainly think that it is possible to write threaded web servers that work, I just believe that it is hard, in many respects. Threads, in my personal opinion, are not the best hammer to solve the I/O multiplexing problem that needs to be solved in a web server. Certainly, the unbounded worker thread creation strategy that Hunchentoot uses is not suitable for servers that see load peaks, which is why I recommend not using threaded Hunchentoot for such sites.
That said: I do have stability issues with my non-threaded Hunchentoot installation on FreeBSD, too. I use a multi-threaded SBCL, but run Hunchentoot in a single thread behind squid. In some situations, no new connections are accepted for no apparent reason, but I failed to properly analyze the problem the last time it happened, as the customer was already nervous. I have seen this happen twice in the last four months, on a moderately busy site. Thus, the problem may not actually be related to Hunchentoot's thread usage (I'm running non-threaded, you run threaded), but could just as well be located in SBCL's thread implementation on FreeBSD, in usocket, or somewhere else.
Any further data points would help getting down to the bottom of this.
As for future work on Hunchentoot: We do have the new connection manager class in place which is meant to support the implementation of thread pools. Thread pools would help putting limits on the number of threads created, helping with getting through load peaks. I do not personally need such a connection manager, but rather want to spend some time on making Hunchentoot be able to use single threaded I/O multiplexing using select/kpoll/whatever.
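To illustrate the idea of capping the number of workers, here is a minimal sketch using SBCL's semaphores. It is not Hunchentoot's connection manager API and not a true pool (threads are still created per task); it only shows how a counting semaphore can bound how many workers exist at once. The names *max-workers*, *worker-slots* and call-with-bounded-worker are hypothetical.

```lisp
;; Hypothetical sketch, not Hunchentoot code: bound the number of
;; concurrently running worker threads with a counting semaphore.
(defvar *max-workers* 8
  "Assumed upper limit on simultaneous workers.")

(defvar *worker-slots*
  (sb-thread:make-semaphore :name "worker slots" :count *max-workers*))

(defun call-with-bounded-worker (function)
  "Run FUNCTION in a new thread, blocking first until a worker slot is free.
Returns the new thread."
  ;; Blocks the caller (e.g. an acceptor loop) while all slots are taken.
  (sb-thread:wait-on-semaphore *worker-slots*)
  (sb-thread:make-thread
   (lambda ()
     (unwind-protect
          (funcall function)
       ;; Give the slot back even if FUNCTION signals an error.
       (sb-thread:signal-semaphore *worker-slots*)))
   :name "bounded worker"))
```

With this arrangement a load peak makes the accepting side wait for a free slot instead of creating threads without limit; a real pool would additionally reuse its threads rather than start a fresh one per connection.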
-Hans
On Sun, Nov 16, 2008 at 18:03, Cyrus Harmon ch-tbnl@bobobeach.com wrote:
I seem to recall that, a long time ago, TBNL/hunchentoot required threads. At some point Edi rigged up a single-threaded version but claimed that that was just for development/testing work and shouldn't be used in the real world. Now I understand that Hans has an aversion to threads as the multiplexing abstraction for webservers and that single-threaded is "the one true way." I bring this up as background to my real problem, which is that, at least on FreeBSD, I seem to have a number of zombie threads that stick around after requests are made. Before I dig into the problem, I'd be curious to hear what folks' recommendations for running a low-traffic, but nevertheless hopefully reliable, site are. I had reasonably good luck with SBCL+threads and hunchentoot before the big rewrite, but since then, it has been reliably refusing to accept new connections after a week or two of use.
Everything starts out fine and I can field some traffic and all is good:
CL-USER> (sb-thread:list-all-threads)
(#<SB-THREAD:THREAD "reader-thread" RUNNING {59B6B241}>
 #<SB-THREAD:THREAD "repl-thread" RUNNING {59B6B0F1}>
 #<SB-THREAD:THREAD "control-thread" RUNNING {59B60359}>
 #<SB-THREAD:THREAD "auto-flush-thread" RUNNING {59B601D9}>
 #<SB-THREAD:THREAD "initial thread" RUNNING {598D79B1}>)
However, after some period, I see an accumulation of worker threads that won't die and eventually I'm unable to start a new thread to field the request. Who knows where the problem is... Could be in hunchentoot proper, in usocket, in SBCL's FreeBSD threads, etc... At the moment I don't have any debugging info on the problem, as all seems to be working ok since I restarted the web server a few minutes ago. The next time the problem crops up, I'll reply to this with some debugging information. This seems to be the one major "regression" from the old hunchentoot that has me considering going back to the stable release (although I've already ported my code forward for the new API, so I'd hate to have to do that). Better yet would be to either track down and fix the problem or to be convinced that this is just broken and is never going to work and figure out how to make my stuff work in a single threaded lisp.
One more question regarding threads: is it possible to use a threaded lisp to act like the single-threaded server does? That is, to use threads, but to only have one thread doing the hard work, using serve-event for the multiplexing? My initialization code relies on setting some state after starting to listen for requests, so I can't just flip the switch to single-threaded mode and have everything work out of the box.
Thanks for all of the hunchentoot rewrite efforts! On the whole it seems like a good thing, but fixing this issue would make me a much happier hunchentoot user.
Cyrus
tbnl-devel site list tbnl-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/tbnl-devel