Not sure if this is going to get threading hooked up correctly as I just joined the list.
Recently when I evaluate a relatively big function in slime (more than 6 lines of code) I lose connection with swank. I tried different versions of Emacs and different versions of slime, so I don't know where else I can look for an error. Could it be something with the network on this particular workstation?
I had this same behavior when evaluating a ~30k section of code. After a bit of debugging, I found the following.
My setup: sbcl 1.0.29 (both binary x86 and locally built 64bit - no threads enabled), slime/swank from git (or was that cvs?)
1) Threads were not available.
2) read-packet was getting interrupted by another swank event. It would read the packet length from the stream, but the first event had not finished yet, so it was reading random characters from the first event to determine the length of the next packet, and it crashed in parse-integer.
3) wait-for-event was being called again before the event had been completely decoded.
4) I had turned logging on and saw (READ: 003999 - 6 digit packet length) (wait-for-event: ~s ~s~%) (READ: strjunk) being printed, instead of (READ:003999) (READ:validpacket). (I added some other logging points as well.)
5) I think the problem was somewhere in wait-for-event/event-loop, but I didn't understand the code well enough (and didn't have enough time) to debug any further.
6) I enabled threads in sbcl and the problem went away.
Sorry I could not provide a patch or more information. If you have something for me to try, please let me know and I will test.
Kelly McDonald
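The failure described in points 2) to 4) can be reproduced in miniature without swank. The sketch below is not swank's code: read-packet-sketch, encode-packet-sketch and misaligned-read-fails-p are hypothetical names, and the header is assumed to be six hexadecimal digits giving the payload length. It shows that once a header has been consumed without its payload, the next read takes payload characters for a header and parse-integer signals an error.

```lisp
;; Hypothetical sketch of a length-prefixed packet reader (not swank's code).
;; Assumption: the header is 6 hex digits giving the payload length.

(defun encode-packet-sketch (payload)
  "Return PAYLOAD prefixed with its length as 6 zero-padded hex digits."
  (format nil "~6,'0x~a" (length payload) payload))

(defun read-packet-sketch (stream)
  "Read one header plus payload from STREAM and return the payload string."
  (let ((header (make-string 6)))
    (unless (= (read-sequence header stream) 6)
      (error "Short read while reading the header"))
    ;; parse-integer signals a parse-error if HEADER is not a number.
    (let* ((length (parse-integer header :radix 16))
           (payload (make-string length)))
      (unless (= (read-sequence payload stream) length)
        (error "Short read while reading the payload"))
      payload)))

(defun misaligned-read-fails-p (wire)
  "Consume only the header of the first packet in WIRE, then try to read
a whole packet. Return T if that second read hits a parse-error."
  (with-input-from-string (s wire)
    (let ((header (make-string 6)))
      (read-sequence header s))      ; header consumed, payload left unread
    (handler-case (progn (read-packet-sketch s) nil)
      (parse-error () t))))
```

For example, (misaligned-read-fails-p (encode-packet-sketch "(:emacs-rex (foo))")) tries to parse "(:emac" as a length, which matches the junk seen in the log after the first READ.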
* Kelly McDonald [2009-06-26 02:03+0200] writes:
- Threads were not available
- the read-packet was getting interrupted by another swank event, it would read the packet length from the stream, but the first event had not finished yet so it was reading random characters in the first event to determine the length of the next packet and it crashed in parse-integer
- wait-for-event was being recalled before the event had completely decoded.
In theory, interrupts are queued during wait-for-event and processed at the next safe point. If the queue grows too large (3) we invoke the debugger immediately. In other words, the user needs to press C-c C-b three times in a row to get a partially decoded packet. At least that's the theory.
Helmut.
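The queueing scheme Helmut describes can be modelled roughly as follows. This is a simplified sketch, not swank's implementation: every name in it is hypothetical, interrupts are modelled as plain closures, and where swank would invoke the debugger on overflow the sketch simply runs the queued closures at once.

```lisp
;; Hypothetical model of deferred interrupts (not swank's code).

(defvar *pending-interrupts* '() "Queued interrupt closures, newest first.")
(defvar *interrupts-deferred-p* nil "True while inside a critical section.")
(defparameter *queue-limit* 3 "Assumed limit, taken from the '(3)' above.")

(defun run-pending-interrupts ()
  "Run and clear all queued interrupts, oldest first."
  (let ((fns (reverse *pending-interrupts*)))
    (setf *pending-interrupts* '())
    (mapc #'funcall fns)))

(defun signal-interrupt (fn)
  "Run FN now if interrupts are not deferred. Otherwise queue it, and
stop waiting for the safe point once the queue reaches *QUEUE-LIMIT*."
  (cond ((not *interrupts-deferred-p*)
         (funcall fn))
        (t
         (push fn *pending-interrupts*)
         (when (>= (length *pending-interrupts*) *queue-limit*)
           (run-pending-interrupts)))))

(defmacro with-deferred-interrupts (&body body)
  "Run BODY with interrupts queued, then process the queue (the safe point)."
  `(multiple-value-prog1
       (let ((*interrupts-deferred-p* t))
         ,@body)
     (run-pending-interrupts)))
```

In this model one or two interrupts arriving during the body wait until the body has finished, while a third runs in the middle of it, which is the case where a packet could be left partially decoded.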
Helmut, Thanks for your reply. I believe what is happening is that read-sequence is running out of bytes to read from the socket, resulting in a top-level error, causing it to go back to process-requests which tries to decode-event on a socket that has a partially read packet queued up.
I'm guessing that something needs to be done in swank-sbcl/socket-fd to turn the socket into a somewhat friendlier stream, so that you don't get the toplevel error when read-sequence out-runs the socket. But that is just a guess.
any pointers would be welcome!
Thanks, Kelly
On Fri, Jun 26, 2009 at 11:39 AM, Helmut Eller <heller@common-lisp.net> wrote:
* Kelly McDonald [2009-06-26 02:03+0200] writes:
- Threads were not available
- the read-packet was getting interrupted by another swank event, it would read the packet length from the stream, but the first event had not finished yet so it was reading random characters in the first event to determine the length of the next packet and it crashed in parse-integer
- wait-for-event was being recalled before the event had completely decoded.
In theory, interrupts are queued during wait-for-event and processed at the next safe point. If the queue grows too large (3) we invoke the debugger immediately. In other words, the user needs to press C-c C-b three times in a row to get a partially decoded packet. At least that's the theory.
Helmut.
* Kelly McDonald [2009-06-27 12:50+0200] writes:
Helmut, Thanks for your reply. I believe what is happening is that read-sequence is running out of bytes to read from the socket, resulting in a top-level error, causing it to go back to process-requests which tries to decode-event on a socket that has a partially read packet queued up.
I'm guessing that something needs to be done in swank-sbcl/socket-fd to turn the socket into a somewhat friendlier stream, so that you don't get the toplevel error when read-sequence out-runs the socket. But that is just a guess.
any pointers would be welcome!
Judging from your backtrace, and assuming you have *communication-style* set to :fd-handler, it does indeed look like read-sequence is running out of bytes and then calls sb-sys:wait-until-fd-usable and then serve-event, which recursively calls Swank's fd-handler.
I think we have no other choice than to prevent our fd-handlers from being called recursively. Can you reproduce the problem and check whether it goes away with the following modification in swank-sbcl.lisp?
(defimplementation add-fd-handler (socket fn)
  (declare (type function fn))
  (let ((fd (socket-fd socket))
        (handler nil))
    (labels ((add ()
               (setq handler (sb-sys:add-fd-handler fd :input #'run)))
             (run (fd)
               (sb-sys:remove-fd-handler handler) ; prevent recursion
               (unwind-protect
                    (funcall fn)
                 (when (sb-unix:unix-fstat fd) ; still open?
                   (add)))))
      (add))))
Helmut.
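The idea behind the modification (remove the handler while it runs, put it back in the cleanup form) can be tried without SBCL's serve-event. The sketch below is a portable toy model, not the swank-sbcl code: *handlers*, dispatch-sketch, add-guarded-handler and demo-max-depth are hypothetical names, and a plain list stands in for the fd-handler table.

```lisp
;; Toy model of the recursion guard (not the swank-sbcl code).

(defvar *handlers* '() "Stand-in for the table of registered fd-handlers.")

(defun dispatch-sketch ()
  "Call every registered handler once, like one serve-event pass."
  (dolist (h (copy-list *handlers*))
    (funcall h)))

(defun add-guarded-handler (fn)
  "Register FN so that it is unregistered for the duration of each call."
  (let ((handler nil))
    (labels ((add ()
               (setq handler #'run)
               (push handler *handlers*))
             (run ()
               (setf *handlers* (remove handler *handlers*)) ; prevent recursion
               (unwind-protect
                    (funcall fn)
                 (add))))               ; re-register even on non-local exit
      (add))))

(defun demo-max-depth (guarded)
  "Return the deepest nesting reached by a handler that re-enters the
dispatcher, the way a blocking read can re-enter serve-event."
  (let ((*handlers* '())
        (depth 0)
        (max-depth 0))
    (flet ((fn ()
             (incf depth)
             (setf max-depth (max max-depth depth))
             (when (< depth 3)
               (dispatch-sketch))       ; nested dispatch from inside the handler
             (decf depth)))
      (if guarded
          (add-guarded-handler #'fn)
          (push #'fn *handlers*))
      (dispatch-sketch)
      max-depth)))
```

With the guard, the nested dispatch finds no handler registered and the handler never runs inside itself; without it, the handler nests until the demo's own depth cutoff.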
Sorry about that-
Forgot to append the stack trace
(14 (SWANK::PROCESS-REQUESTS T))
(15 (SB-IMPL::SUB-SUB-SERVE-EVENT NIL NIL))
(16 (SB-IMPL::SUB-SERVE-EVENT NIL NIL NIL))
(17 (SB-SYS:WAIT-UNTIL-FD-USABLE 5 :INPUT NIL))
(18 (SB-IMPL::REFILL-INPUT-BUFFER #<SB-SYS:FD-STREAM for "a socket" {1003DF1011}>))
(19 (SB-IMPL::INPUT-CHAR/LATIN-1 #<SB-SYS:FD-STREAM for "a socket" {1003DF1011}> NIL :EOF))
(20 (SB-IMPL::ANSI-STREAM-READ-CHAR #<SB-SYS:FD-STREAM for "a socket" {1003DF1011}> NIL :EOF #<unavailable argument>))
(21 (SB-IMPL::ANSI-STREAM-READ-SEQUENCE ..))
(22 (READ-SEQUENCE ..)[:EXTERNAL])
(23 (SWANK::READ-CHUNK #<SB-SYS:FD-STREAM for "a socket" {1003DF1011}> 31591))
(24 (SWANK::READ-PACKET #<SB-SYS:FD-STREAM for "a socket" {1003DF1011}>))
(25 (SWANK::DECODE-MESSAGE #<SB-SYS:FD-STREAM for "a socket" {1003DF1011}>))
(26 (SWANK::WAIT-FOR-EVENT/EVENT-LOOP (OR (:EMACS-REX . SWANK::_) (:EMACS-CHANNEL-SEND . SWANK::_)) T))
(27 (SWANK::WAIT-FOR-EVENT (OR (:EMACS-REX . SWANK::_) (:EMACS-CHANNEL-SEND . SWANK::_)) T))
(28 (SWANK::PROCESS-REQUESTS T))