I'm trying to use CXML, specifically the Klacks parser, to read streaming XML from a network socket (I'm writing a Jabber library), but I'm having problems with buffering. I'm using this code for testing:
(defun cxml-test (host port &key (buffering nil))
  (let ((socket (usocket:socket-connect host port
                                        :element-type '(unsigned-byte 8))))
    (cxml:make-source (usocket:socket-stream socket)
                      :buffering buffering
                      :pathname "foo")))
And then I run (klacks:consume src) repeatedly.
On the sending end, I use "nc -l -w 5 -p 42000", either typing in XML by hand or by redirecting a real XML file to netcat.
With buffering enabled, this basically works. However, I need to process "stanzas" (Jabber term for complete elements that are children of the root element) as soon as they come in, not when the buffer is full or when the stream or the root element is closed. But when I disable buffering, I get (using CLISP from recent CVS):
AREF: index 1 for #(60) is out of range
   [Condition of type SIMPLE-TYPE-ERROR]

Restarts:
  0: [ABORT] Return to SLIME's top level.
  1: [CLOSE-CONNECTION] Close SLIME connection

Backtrace:
  [ SLIME parts skipped ]
  5: INVOKE-DEBUGGER
  6: AREF
  7: REPLACE
  8: #<COMPILED-FUNCTION #:|257 305 (DEFMETHOD XSTREAM-UNDERFLOW (#) ...)-25-1-1|>
  9: CXML:MAKE-SOURCE
 10: CXML:MAKE-SOURCE
 11: (CXML:MAKE-SOURCE (USOCKET:SOCKET-STREAM SOCKET) :BUFFERING BUFFERING :PATHNAME "foo")
 12: LET
 13: (CXML-TEST '"localhost" '42000 ':BUFFERING 'NIL)
SBCL 1.0.8.46 gets a similar error:
The bounding indices 0 and 2 are bad for a sequence of length 1.
   [Condition of type SB-KERNEL:BOUNDING-INDICES-BAD-ERROR]
See also:
  Common Lisp Hyperspec, bounding index designator [glossary]
  Common Lisp Hyperspec, SUBSEQ-OUT-OF-BOUNDS:IS-AN-ERROR [issue]

Restarts:
  0: [ABORT] Return to SLIME's top level.
  1: [ABORT] Exit debugger, returning to top level.

Backtrace:
  0: (SB-IMPL::SIGNAL-BOUNDING-INDICES-BAD-ERROR #(60) 0 2)
  1: (SB-IMPL::SIGNAL-BOUNDING-INDICES-BAD-ERROR #(60) 0 2)
  2: ((SB-PCL::FAST-METHOD RUNES::XSTREAM-UNDERFLOW (RUNES:XSTREAM)) #<unavailable argument> #<unavailable argument> #<RUNES:XSTREAM [main document :MAIN file://+/home/foo]>)
  3: (CXML:MAKE-SOURCE #<RUNES:XSTREAM [main document :MAIN file://+/home/foo]>)
  4: (NIL)
  5: (SB-INT:SIMPLE-EVAL-IN-LEXENV (SETF SRC (CXML-TEST "localhost" 42000 :BUFFERING NIL)) #<NULL-LEXENV>)
It would be nice if this worked, and if additionally KLACKS:PEEK would do a non-blocking read.
Magnus
Quoting Magnus Henoch (mange@freemail.hu):
> With buffering enabled, this basically works. However, I need to process "stanzas" (Jabber term for complete elements that are children of the root element) as soon as they come in, not when the buffer is full or when the stream or the root element is closed. But when I disable buffering, I get (using CLISP from recent CVS):
Thanks for the report. I have committed a bugfix to CVS. Please test.
> It would be nice if this worked, and if additionally KLACKS:PEEK would do a non-blocking read.
I will have to think about that one. Would a function similar to LISTEN also be okay, which would return T if data is available and NIL otherwise?
(Note that it would not guarantee a non-blocking read if the server has already sent some characters but not an entire event, and implementing that cleanly would be a lot more difficult to do.)
d.
David Lichteblau <david@lichteblau.com> writes:
> Quoting Magnus Henoch (mange@freemail.hu):
>> With buffering enabled, this basically works. However, I need to process "stanzas" (Jabber term for complete elements that are children of the root element) as soon as they come in, not when the buffer is full or when the stream or the root element is closed. But when I disable buffering, I get (using CLISP from recent CVS):
> Thanks for the report. I have committed a bugfix to CVS. Please test.
It works. Thanks!
>> It would be nice if this worked, and if additionally KLACKS:PEEK would do a non-blocking read.
> I will have to think about that one. Would a function similar to LISTEN also be okay, which would return T if data is available and NIL otherwise?
> (Note that it would not guarantee a non-blocking read if the server has already sent some characters but not an entire event, and implementing that cleanly would be a lot more difficult to do.)
Thinking about it, it struck me that I could as well call LISTEN myself; that would work for me in almost all cases. The remaining case is when I have received some character data - CONSUME could immediately return what it has received so far, without waiting for the CHARACTERS event to end. Or could I work around that with PEEK-CHAR and READ-CHAR on the underlying stream, discarding characters until the next <, without confusing CXML?
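For reference, the "call LISTEN myself" idea could be sketched roughly as below. CONSUME-IF-READY is a hypothetical helper, not part of CXML or usocket, and the sketch assumes that the implementation's LISTEN gives a meaningful answer on the socket stream (for an (unsigned-byte 8) stream that may depend on the implementation). It also inherits the caveat mentioned above: if only part of an event has arrived, the consume call can still block.

```lisp
;; Hypothetical helper (not part of CXML or usocket).
;; STREAM is the underlying input stream; CONSUME-FN is a function of
;; no arguments that reads the next event, e.g. a closure around
;; KLACKS:CONSUME.
(defun consume-if-ready (stream consume-fn)
  "Call CONSUME-FN and return its values if LISTEN reports pending
input on STREAM; otherwise return :NO-DATA without reading."
  (if (listen stream)
      (funcall consume-fn)
      :no-data))
```

With the test function from the start of the thread, this would presumably be called as (consume-if-ready (usocket:socket-stream socket) (lambda () (klacks:consume src))), provided the socket is kept around next to the source.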
Magnus