I am having trouble expanding frames to see local variables in SLDB using SLIME in XEmacs 21.4.13 with CLISP 2.33.1 on Cygwin. If I define a function with a break expression,
(defun foo (n) (break) n)
then call that function,
(foo 1)
SLDB presents the backtrace:
,---- | Break | [Condition of type SIMPLE-CONDITION] | | Restarts: | 0: [CONTINUE] Return from BREAK loop | 1: [ABORT] Abort handling SLIME request. | | Backtrace: | 0: frame binding variables (~ = dynamically): | | ~ SYSTEM::*FASOUTPUT-STREAM* <--> NIL | 1: EVAL frame for form (BREAK) | 2: EVAL frame for form (PROGN (BREAK) N) | 3: APPLY frame for call (FOO '1) | 4: EVAL frame for form (FOO 1) | 5: EVAL frame for form (SWANK:INTERACTIVE-EVAL "(foo 1)") | 6: EVAL frame for form (SWANK:START-SERVER "/tmp/seh/slime.800" :EXTERNAL-FORMAT :ISO-LATIN-1-UNIX) `----
Moving the cursor down to, say, the "1:" line and pressing enter to expand the frame causes XEmacs to hang. The minibuffer reports:
; pipelined request... (swank:frame-locals-for-emacs 1)
Interrupting XEmacs with C-g produces the following backtrace:
Debugger entered--Lisp error: (quit) accept-process-output(nil 0 10000) (while t (accept-process-output nil 0 10000)) (let ((debug-on-quit t) (inhibit-quit nil)) (while t (accept-process-output nil 0 10000))) (catch tag (slime-rex (tag sexp) (sexp package) (... ... ...) (... ...)) (let (... ...) (while t ...))) (apply (function funcall) (catch tag (slime-rex ... ... ... ...) (let ... ...))) (let* ((tag ...) (slime-stack-eval-tags ...)) (apply (function funcall) (catch tag ... ...))) slime-eval((swank:frame-locals-for-emacs 1)) sldb-frame-locals(1) (let* ((i 0) (G43782 ...) (var nil)) (while (consp G43782) (setq var ...) (destructuring-bind ... var ... ...) (setq i ...) (setq G43782 ...)) nil) (catch (quote --cl-block-nil--) (let* (... ... ...) (while ... ... ... ... ...) nil)) (cl-block-wrapper (catch (quote --cl-block-nil--) (let* ... ... nil))) (block nil (let* (... ... ...) (while ... ... ... ... ...) nil)) (loop for i from 0 for var in (sldb-frame-locals frame) do (destructuring-bind (&key name id value) var (slime-propertize-region ... ... ... ...) (insert "\n"))) sldb-insert-locals(1 " ") (progn (sldb-insert-frame frame t) (insert indent1 (in-sldb-face section "Locals:") "\n") (sldb-insert-locals frame-number indent2) (when sldb-show-catch-tags (let ... ...)) (unless sldb-enable-styled-backtrace (terpri)) (point)) (prog1 (progn (sldb-insert-frame frame t) (insert indent1 ... "\n") (sldb-insert-locals frame-number indent2) (when sldb-show-catch-tags ...) (unless sldb-enable-styled-backtrace ...) (point)) (add-text-properties G43777 (point) (backquote ...))) (let ((G43777 ...)) (prog1 (progn ... ... ... ... ... ...) (add-text-properties G43777 ... ...))) (slime-propertize-region (backquote (frame ... details-visible-p t)) (sldb-insert-frame frame t) (insert indent1 (in-sldb-face section "Locals:") "\n") (sldb-insert-locals frame-number indent2) (when sldb-show-catch-tags (let ... ...)) (unless sldb-enable-styled-backtrace (terpri)) (point)) (let* ((props ...) (frame ...) (frame-number ...) (standard-output ...) (indent1 " ") (indent2 " ")) (delete-region start end) (slime-propertize-region (backquote ...) (sldb-insert-frame frame t) (insert indent1 ... "\n") (sldb-insert-locals frame-number indent2) (when sldb-show-catch-tags ...) (unless sldb-enable-styled-backtrace ...) (point))) (save-excursion (goto-char start) (let* (... ... ... ... ... ...) (delete-region start end) (slime-propertize-region ... ... ... ... ... ... ...))) (let* ((G43776 ...) (start ...) (end ...)) (save-excursion (goto-char start) (let* ... ... ...))) (multiple-value-bind (start end) (sldb-frame-region) (save-excursion (goto-char start) (let* ... ... ...))) sldb-show-frame-details() (if (or on (not ...)) (sldb-show-frame-details) (sldb-hide-frame-details)) (let ((inhibit-read-only t) (column ...)) (if (or on ...) (sldb-show-frame-details) (sldb-hide-frame-details)) (move-to-column column)) sldb-toggle-details() funcall(sldb-toggle-details) (if fn (funcall fn)) (let ((fn ...)) (if fn (funcall fn))) (lambda nil "Invoke the action at point." (interactive) (let (...) (if fn ...)))() call-interactively(sldb-default-action)
If I press "c" continue from that backtrace, focus returns to the SLDB buffer, from which I can still invoke restarts. That is, the Lisp connection is still alive and synchronized with SLIME.
Recent activity in the *slime-events* buffer is:
,---- | (:emacs-rex | (swank:interactive-eval "(foo 1)") | nil t 32) | (:debug 0 1 | ("Break" " [Condition of type SIMPLE-CONDITION]" nil nil) | (("CONTINUE" "Return from BREAK loop") | ("ABORT" "Abort handling SLIME request.")) | ((0 "frame binding variables (~ = dynamically):\n | ~ SYSTEM::*FASOUTPUT-STREAM* <--> NIL") | (1 "EVAL frame for form (BREAK)") | (2 "EVAL frame for form (PROGN (BREAK) N)") | (3 "APPLY frame for call (FOO '1)") | (4 "EVAL frame for form (FOO 1)") | (5 "EVAL frame for form (SWANK:INTERACTIVE-EVAL "(foo 1)")") | (6 "EVAL frame for form (SWANK:START-SERVER "/tmp/seh/slime.800" :EXTERNAL-FORMAT :ISO-LATIN-1-UNIX)")) | (32)) | (:debug-activate 0 1) | (:emacs-rex | (swank:frame-locals-for-emacs 1) | nil 0 33) | (:return | (:ok | ((:name "N" :id 0 :value "1"))) | 33) `----
"Steven E. Harris" seh@panix.com writes:
Moving the cursor down to, say, the "1:" line and pressing enter to expand the frame causes XEmacs to hang. The minibuffer reports:
; pipelined request... (swank:frame-locals-for-emacs 1)
Sick. Does this happen consistently? I don't this problem with CLISP 2.33.2 on Debian.
The backtrace makes it look like Emacs is waiting to be told what the local variables are, but the event trace shows that this has been given. The "pipelined request" message is what you get when you make a request to Lisp before the result of the previous one has arrived.
Possibly the Emacs side is failing to recognise the RPC result, but that would be pretty extreme.
Luke Gorrie luke@synap.se writes:
Sick. Does this happen consistently?
Yes. I can start up a fresh XEmacs, start SLIME, and evaluate those two forms, getting stuck in the same way every time.
The backtrace makes it look like Emacs is waiting to be told what the local variables are, but the event trace shows that this has been given.
It does? I would expect to see some indication that "n" is bound to 1. Where do you see confirmation that the info has been given?
The "pipelined request" message is what you get when you make a request to Lisp before the result of the previous one has arrived.
Whenever I see that, I figure my SLIME connection is now messed up beyond repair. Here, it's strange that I can seemingly recover.
Possibly the Emacs side is failing to recognise the RPC result, but that would be pretty extreme.
What other diagnostic information can I provide?
"Steven E. Harris" seh@panix.com writes:
Luke Gorrie luke@synap.se writes:
Sick. Does this happen consistently?
Yes. I can start up a fresh XEmacs, start SLIME, and evaluate those two forms, getting stuck in the same way every time.
I fixed the "pipelined request" message, but that was harmless.
Overall the problem seems very weird and doesn't happen for me in CLISP 2.33.2 with GNU or XEmacs on Debian.
Please try making this change in slime-dispatch-event in the elisp code. You can C-M-x the definition without needing to restart Emacs. This should put more debug info in the *slime-events* buffer so it will be interesting to see what that looks like after reproducing the problem again.
--- slime.el.~1.471.~ 2005-03-13 20:38:20.000000000 +0100 +++ slime.el 2005-03-13 20:54:34.000000000 +0100 @@ -2195,6 +2195,7 @@ (slime-send `(:emacs-rex ,form ,package ,thread ,id)))) ((:return value id) (let ((rec (assq id (slime-rex-continuations)))) + (slime-log-event (list 'DEBUG id value rec)) (cond (rec (setf (slime-rex-continuations ) (remove rec (slime-rex-continuations))) (when (null (slime-rex-continuations))
Luke Gorrie luke@synap.se writes:
Overall the problem seems very weird and doesn't happen for me in CLISP 2.33.2 with GNU or XEmacs on Debian.
Of course, I'd rather be working with a fresher CLISP and using Debian, but today I'm stuck with this lot.
it will be interesting to see what that looks like after reproducing the problem again.
How's this? I see the debug output at the end.
(:emacs-rex (swank:interactive-eval "(foo 1)") nil t 14) (:debug 0 1 ("Break" " [Condition of type SIMPLE-CONDITION]" nil nil) (("CONTINUE" "Return from BREAK loop") ("ABORT" "Abort handling SLIME request.")) ((0 "frame binding variables (~ = dynamically):\n | ~ SYSTEM::*FASOUTPUT-STREAM* <--> NIL") (1 "EVAL frame for form (BREAK)") (2 "EVAL frame for form (PROGN (BREAK) N)") (3 "APPLY frame for call (FOO '1)") (4 "EVAL frame for form (FOO 1)") (5 "EVAL frame for form (SWANK:INTERACTIVE-EVAL "(foo 1)")") (6 "EVAL frame for form (SWANK:START-SERVER "/tmp/seh/slime.4468" :EXTERNAL-FORMAT :ISO-LATIN-1-UNIX)")) (14)) (:debug-activate 0 1) (:emacs-rex (swank:frame-locals-for-emacs 1) nil 0 15) (:return (:ok ((:name "N" :id 0 :value "1"))) 15) (DEBUG 15 (:ok ((:name "N" :id 0 :value "1"))) (15 lambda (&rest --cl-rest--) (apply '(lambda (G43299 G43300 G43289) (let* ... ...)) '--sexp-- '--tag-- --cl-rest--)))
"Steven E. Harris" seh@panix.com writes:
Luke Gorrie luke@synap.se writes:
Overall the problem seems very weird and doesn't happen for me in CLISP 2.33.2 with GNU or XEmacs on Debian.
Of course, I'd rather be working with a fresher CLISP and using Debian, but today I'm stuck with this lot.
it will be interesting to see what that looks like after reproducing the problem again.
How's this? I see the debug output at the end.
Everything looks as expected. I don't understand why Emacs would be hanging. Do you have another copy of Emacs you can test with - e.g. GNU Emacs?
Luke Gorrie luke@synap.se writes:
I don't understand why Emacs would be hanging. Do you have another copy of Emacs you can test with - e.g. GNU Emacs?
Not installed here, no. Given more time, I could set up the Cygwin Emacs package. I'll try this same test at work tomorrow morning (same software set, different hardware) to see if the same problem arises.
Unfortunately, I can't say how recently this behavior started. I haven't had to invoke the SLIME debugger on this computer for at least a month.
I tried some other commands in the debugger to see which elicited strange behavior. Evaluating a form in a frame produced the "pipelined request" message in the minibuffer, but did display the result without hanging. Inspect worked fine. Toggling details, listing locals, and attempting to expand any single frame (all related commands, I'd assume) all hang as described. Perhaps there's something specific about the message exchange to list local bindings.
Can you help interpret what's failing here? My assumption is that SLIME is hanging waiting for some reply to arrive from the swank server, and it finds no such reply to read. What SLIME component is responsible for writing to *slime-events*?
"Steven E. Harris" seh@panix.com writes:
Can you help interpret what's failing here? My assumption is that SLIME is hanging waiting for some reply to arrive from the swank server, and it finds no such reply to read. What SLIME component is responsible for writing to *slime-events*?
The pattern is that asynchronous RPCs succeed by synchronous ones fail.
Under the hood the Emacs RPC interface is asynchronous - you tell Lisp what to do and pass a callback function to be called with the result. Looks basically like:
(slime-eval-async '(my-lisp-function arg1 arg2) (lambda (result) ...do something...))
but sometimes you want to make a synchronous RPC where instead of passing a callback you just wait and have the value returned. For this we have a simpler interface, basically:
(slime-eval '(my-function arg1 arg2)) => result
We use a trick to build synchronous evaluation on top of asynchronous. slime-eval is implemented similar to this:
(catch request-id (slime-eval-async form (lambda (result) (throw request-id result))) (loop (poll-for-response)))
i.e. it sends an asynchronous request and then enters an "infinite" loop. The loop will actually terminate when the RPC's result is delivered to the 'lambda' function because it will throw it up to the 'catch' which is outside the loop.
For some reason this trick doesn't seem to be working in your Emacs. The loop is not terminating so apparently no result is being thrown. Here is the code that makes the throw, from slime-dispatch-event, with added commentary:
(destructure-case event ... ((:return value id)
;; Lisp has sent us the result (value) for an RPC identified by `id' ;; This `id' is the same tag in slime-eval's 'catch' on the stack. (let ((rec (assq id (slime-rex-continuations))))
;; Here is the extra event trace we added. ;; The output from this showed that the 'assq' above did indeed ;; find the continuation function - that's the function that ;; will 'throw' when we call it with the value. (slime-log-event (list 'DEBUG id value rec))
;; We also know from the output that `rec' is not nil so the ;; first cond-clause will be selected: (cond (rec (setf (slime-rex-continuations ) (remove rec (slime-rex-continuations))) (when (null (slime-rex-continuations)) (slime-set-state ""))
;; Here is where we end up: call the continuation ;; function with the value. This function is ;; supposed to 'throw' its way out of the loop. (funcall (cdr rec) value)) (t (error "Unexpected reply: %S %S" id value)))))
I'm tempted to accuse your Emacs of having a broken throw/catch, but that's dangerous talk. Here's another debug info patch that's worth a try - it will print a message when the throw and catch are used:
--- slime.el.~1.471.~ 2005-03-13 20:38:20.000000000 +0100 +++ slime.el 2005-03-13 22:54:35.000000000 +0100 @@ -2102,6 +2102,7 @@ (slime-stack-eval-tags (cons tag slime-stack-eval-tags))) (apply #'funcall + (prog1 (catch tag (slime-rex (tag sexp) (sexp package) @@ -2109,12 +2110,14 @@ (unless (member tag slime-stack-eval-tags) (error "tag = %S eval-tags = %S sexp = %S" tag slime-stack-eval-tags sexp)) + (message "Throwing result to ~S" tag) (throw tag (list #'identity value))) ((:abort) (throw tag (list #'error "Synchronous Lisp Evaluation aborted.")))) (let ((debug-on-quit t) (inhibit-quit nil)) - (while t (accept-process-output nil 0 10000))))))) + (while t (accept-process-output nil 0 10000)))) + (message "Caught %S" tag)))))
(defun slime-eval-async (sexp &optional cont package) "Evaluate EXPR on the superior Lisp and call CONT with the result."
Luke Gorrie luke@synap.se writes:
I'm tempted to accuse your Emacs of having a broken throw/catch, but that's dangerous talk.
Maybe not. Macroexpanding catch/throw reveals that the trow tag is (symbol-value tag). It seems that by the time that throw gets invoked, tag has no symbol-value, so an error is thrown (silently) and gets trapped somewhere.
I've tried all sorts of ways of printing out the tag's value at various points inside the slime-rex expression. Any attempt to print the tag's value works, but printing its symbol-value fails.
I have to call it quits on the investigation for tonight, and maybe I'm barking up the wrong tree, but my best guess right now is that the throw expression turns out to be a no-op because the destination tag does not exist. Actually, it's simpler than that: The callback function fails (again, silently) before throw even executes, because looking up the tag's symbol-value fails first. Maybe.
Thanks for your help so far.
Luke Gorrie luke@synap.se writes:
I'm tempted to accuse your Emacs of having a broken throw/catch, but that's dangerous talk.
Forget my previous message.
I found the problem, or at least I found a remedy that makes the problem go away. In slime-process-available-input, there's an extra attempt to read each buffer associated with slime-net-processes after the first set of events have been dispatched. If some input is available in these buffers, a follow-on call to slime-process-available-input gets scheduled.
,----[ slime-process-available-input ] | (dolist (p slime-net-processes) | (with-current-buffer (process-buffer p) | (when (slime-net-have-input-p) | (run-at-time 0 nil 'slime-process-available-input)))) `----
If I comment out this second attempt at reading, XEmacs no longer hangs. Putting some instrumentation in slime-process-available-input showed that slime-net-have-input always returns true (at least after reading the result of a frame-locals query per my example), indicating there's another event ready for reading.
If I replace the above form by:
,----[ slime-process-available-input (modified) ] | (dolist (p slime-net-processes) | (with-current-buffer (process-buffer p) | (when (slime-net-have-input-p) | (message "s-p-a-i: more available in process %s: %s." p | (slime-net-decode-length)) | ; (run-at-time 0 nil 'slime-process-available-input) | ))) `----
I see the following log message after each "throw" in slime-eval:
,---- | s-p-a-i: more available in process SLIME Lisp: 22. `----
The 22 count would match events like this:
,----[ *slime-events* sample ] | (:debug-activate 0 2) `----
that appear in *slime-events* after each local listing. Here are the events produced by hitting enter on a frame in SLDB:
,----[ *slime-events* sample ] | (:emacs-rex | (swank:frame-locals-for-emacs 1) | nil 0 86) | (:debug-activate 0 2) | (:return | (:ok | ((:name "N" :id 0 :value "1"))) | 86) | (:emacs-rex | (swank:frame-catch-tags-for-emacs 1) | nil 0 87) | (:debug-activate 0 2) | (:return | (:ok nil) | 87) `----
The corresponding messages in *Message Log* are:
,---- | s-p-a-i: have input in process SLIME Lisp. | s-p-a-i: have input in process SLIME Lisp. | Throwing result nil to tag slime-result-48471 | s-p-a-i: more available in process SLIME Lisp: 22. | Caught slime-result-48471 `----
Perhaps there's no appropriate handler for these "debug-activate" events. Please let me know if you need more information to help further diagnose the problem.
"Steven E. Harris" seh@panix.com writes:
I found the problem, or at least I found a remedy that makes the problem go away. In slime-process-available-input, there's an extra attempt to read each buffer associated with slime-net-processes after the first set of events have been dispatched. If some input is available in these buffers, a follow-on call to slime-process-available-input gets scheduled.
Sorry but I need to drop the thread here and go impersonate a telecom technician at short notice. :-) Back early next month.
Don't know if I'll have a chance to do anything slimey between now and then but if I do it'll be trying to help make a new release.
If really necessary Peter can make his own "1.0-peter" release for Lisp-in-a-Box and set a tag for it in CVS and we worry about 1.1 later.
P.S. any Lispers in Muskat or Amman fancy a coffe? :-)
P.P.S. I *really* wish SLIME worked as well with SBCL as with CMUCL.
"Steven E. Harris" seh@panix.com writes:
If I replace the above form by:
,----[ slime-process-available-input (modified) ] | (dolist (p slime-net-processes) | (with-current-buffer (process-buffer p) | (when (slime-net-have-input-p) | (message "s-p-a-i: more available in process %s: %s." p | (slime-net-decode-length)) | ; (run-at-time 0 nil 'slime-process-available-input) | ))) `----
The problem is with `run-at-time' on XEmacs. Please see my bug report¹ posted to the xemacs-beta mailing list.
Until there's a change to `run-at-time' or `start-itimer' on XEmacs, we can work around the problem by substituting the 0 in the `run-at-time' call above with `itimer-short-interval'.
I noticed that there are a few places in SLIME that use `run-at-time' with a expiration of 0 that should be subject to similar failure, but the debugger hang described in this thread is the only problem I've experienced so far.
Footnotes: ¹ http://thread.gmane.org/gmane.emacs.xemacs.beta/18637 http://thread.gmane.org/gmane.emacs.xemacs.beta/18638 (Accidentally posted the message twice.)
"Steven E. Harris" seh@panix.com writes:
Until there's a change to `run-at-time' or `start-itimer' on XEmacs, we can work around the problem by substituting the 0 in the `run-at-time' call above with `itimer-short-interval'.
Thank you for the detective work. Should be fixed in CVS.
Helmut.