Hi Hans,
Faré on #lisp pointed out that sb-ext:run-program has a number of race conditions, and goes so far as to suggest that the whole SIGCHLD handling of child processes is pretty fundamentally broken. Indeed, that does seem to be the case. The good (?) news is that I can reliably trigger a failure using ab (ApacheBench).
Sorry to blame hunchentoot. I'll take this up on sbcl-devel.
thanks,
Cyrus
On Dec 25, 2008, at 12:58 AM, Hans Huebner wrote:
Slyrus,
from looking at the failure description and the stack backtrace, I would think that the root cause of the problem is in the FreeBSD port of SBCL, maybe related to signal handling with subprocesses. I know that this is not a very satisfying response. Maybe you can try to verify that subprocess handling works properly in a multithreaded environment by spawning a lot of them from multiple threads, in a loop, and see if that triggers the same problem.
-Hans
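A minimal sketch of the stress test suggested above, assuming a threaded SBCL build and that /bin/echo exists as a trivial child program. The function name stress-run-program and its keyword parameters are hypothetical, chosen only for illustration; this is not code from hunchentoot or SBCL.

```lisp
;; Hypothetical helper: spawn a trivial child process many times from
;; several threads at once and count how many spawns fail.
;; Assumes a threaded SBCL (sb-thread) and that /bin/echo exists.
(defun stress-run-program (&key (threads 8) (iterations 100)
                                (program "/bin/echo") (args '("hello")))
  "Run PROGRAM ITERATIONS times in each of THREADS threads.
Returns the number of runs that signalled an error or exited non-zero."
  (let ((failures 0)
        (lock (sb-thread:make-mutex :name "failure-count")))
    (flet ((worker ()
             (loop repeat iterations
                   do (let ((ok (ignore-errors
                                 ;; :search is left at its default, so
                                 ;; PROGRAM must be a full path.
                                 (let ((p (sb-ext:run-program
                                           program args
                                           :output nil :wait t)))
                                   (prog1 (eql (sb-ext:process-exit-code p) 0)
                                     (sb-ext:process-close p))))))
                        (unless ok
                          ;; The counter is shared between threads.
                          (sb-thread:with-mutex (lock)
                            (incf failures)))))))
      (let ((workers (loop repeat threads
                           collect (sb-thread:make-thread #'worker))))
        (mapc #'sb-thread:join-thread workers)))
    failures))
```

Calling (stress-run-program :threads 16 :iterations 1000) from the REPL returns the failure count. If the problem is in the runtime's signal handling, the image may hang or drop into ldb before the function returns at all, which is itself the result being looked for.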
On Wed, Dec 24, 2008 at 20:09, Cyrus Harmon ch-tbnl@bobobeach.com wrote:
FWIW,
here's a backtrace:
0: Foreign function ldb_monitor, fp = 0x2ad87f98, ra = 0x8055be3
1: Foreign function lose, fp = 0x2ad87fb8, ra = 0x80534ef
2: Foreign function handle_trap, fp = 0x2ad88028, ra = 0x80552c9
3: Foreign fp = 0x2ad8837c, ra = 0xbfbfff94
4: SB-KERNEL::INTERNAL-ERROR
5: Foreign function call_into_lisp, fp = 0x2ad88438, ra = 0x806041c
6: Foreign function funcall2, fp = 0x2ad88458, ra = 0x805113c
7: Foreign function interrupt_internal_error, fp = 0x2ad884a8, ra = 0x8053bef
8: Foreign function handle_trap, fp = 0x2ad88518, ra = 0x8055201
9: Foreign fp = 0x2ad88868, ra = 0xbfbfff94
10: SB-IMPL::REFILL-INPUT-BUFFER
11: SB-IMPL::INPUT-UNSIGNED-8BIT-BYTE
12: (SB-C::HAIRY-ARG-PROCESSOR CHUNGA::READ-CHAR*)
13: (SB-C::HAIRY-ARG-PROCESSOR CHUNGA::READ-LINE*)
14: (SB-C::HAIRY-ARG-PROCESSOR HUNCHENTOOT-CGI::HANDLE-CGI-SCRIPT)
15: HUNCHENTOOT::PROCESS-REQUEST
16: (SB-PCL::FAST-METHOD HUNCHENTOOT::PROCESS-CONNECTION (COMMON-LISP::T COMMON-LISP::T))
17: (SB-PCL::FAST-METHOD HUNCHENTOOT::PROCESS-CONNECTION KEYWORD::AROUND (COMMON-LISP::T COMMON-LISP::T))
18: (COMMON-LISP::FLET SB-THREAD::WITH-MUTEX-THUNK)
19: (COMMON-LISP::FLET WITHOUT-INTERRUPTS-BODY-[CALL-WITH-MUTEX]477)
20: SB-THREAD::CALL-WITH-MUTEX
21: (COMMON-LISP::LAMBDA ())
22: Foreign function call_into_lisp, fp = 0x2ad88f58, ra = 0x806041c
23: Foreign function funcall0, fp = 0x2ad88f78, ra = 0x80510fe
24: Foreign function new_thread_trampoline, fp = 0x2ad88f98, ra = 0x8059985
25: Foreign function _pthread_create, fp = 0x2ad88fe8, ra = 0x2809d5cf
On Dec 24, 2008, at 8:57 AM, Cyrus Harmon wrote:
So, just when I managed to squash one persistent failure (running out of threads due to sb-ext:run-program not returning), I seem to have awoken another. I've got two, possibly unrelated, issues. First, the more benign one:
I see the following message quite often in the logs:
[2008-12-23 18:47:38 [ERROR]] Error while processing connection: The value 1058 is not of type (MOD 1025).
Occasionally the message is:
[2008-12-23 18:46:06 [ERROR]] Error while processing connection: The value 1065 is not of type (UNSIGNED-BYTE 10).
Any idea where these might be coming from?
The second, more serious, problem is that I seem to end up in ldb after a few days of usage:
- Argh! corrupted error depth, halting
fatal error encountered in SBCL pid 60417(tid 134658560):
%PRIMITIVE HALT called; the party is over.
Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb> Argh! corrupted error depth, halting
fatal error encountered in SBCL pid 60417(tid 134659072):
%PRIMITIVE HALT called; the party is over.
Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb> Argh! corrupted error depth, halting
fatal error encountered in SBCL pid 60417(tid 136982272):
%PRIMITIVE HALT called; the party is over.
Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb> >
This is a fairly unhelpful error, but I'm wondering if the above-mentioned error isn't tickling some bug in the sbcl/freebsd error handling code. This ldb droppage is fairly new, AFAICT. There have certainly been other failure modes, but this one is new and I'm wondering if it might be due to recent changes in either SBCL or hunchentoot. In any event, I'll try to get some more info from ldb to see if I can track this down.
Cyrus
tbnl-devel site list tbnl-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/tbnl-devel