Hi Chun,

I've had a bit more time to look into this and I think I've got a better idea of what's going on.

Firstly, I had a really obvious typo above (doh!) which caused the error I saw in socket-send (2, above). Putting in the count parameter and the socket-send succeeds, as expected.

Replacing the handler-case with handler-bind and dropping into the debugger, I identified where the strange error I was seeing with socket-receive comes from (4, above). It seems the recvfrom is returning an error code of 10035, (WSAEWOULDBLOCK), which you are signalling with a condition. But the ns-try-again-condition only has a :HOST-OR-IP initarg, not a :SOCKET initarg. So I commented out that bit of code and the condition is signalled ok (although I'm not handling it, but that's fine). 

So my test function is 

(defun usocket-test ()
  (let ((s (usocket:socket-connect nil nil
                                   :protocol :datagram
                                   :element-type '(unsigned-byte 8)
                                   :local-port 8001)))
    (unwind-protect
        (do ((i 0 (1+ i))
             (buffer (make-array 1024 :element-type '(unsigned-byte 8)
                                 :initial-element 0))
             (now (get-universal-time))
             (done nil))
            ((or done (= i 4))
             nil)
          (format t "~Ds ~D Waiting state ~S~%" (- (get-universal-time) now) i (usocket::state s))
          (when (usocket:wait-for-input s :ready-only t :timeout 5)
            (format t "~D state ~S~%" i (usocket::state s))
            (handler-bind 
                ((error (lambda (c) 
                          (format t "socket-receive error: ~A~%" c)
                          (break)
                          nil)))
              (multiple-value-bind (buffer count remote-host remote-port)
                  
                  (usocket:socket-receive s buffer 1024)
                (handler-bind
                    ((error (lambda (c)
                               (format t "socket-send error: ~A~%" c)
                               (break))))                             
                  (when buffer 
                    (usocket:socket-send s (subseq buffer 0 count) count
                                         :host remote-host
                                         :port remote-port)))))))
      (usocket:socket-close s))))



My output is now:

0s 0 Waiting state NIL
0 state :READ
2s 1 Waiting state :READ
1 state :READ
2s 2 Waiting state :READ
2 state :READ
2s 3 Waiting state :READ
3 state :READ
NIL


We are getting closer. The read state is still staying in :READ even after a successful socket-receive call, so the loop is still spinning. I can fix this easily by passing in the socket instance directly to the receive-message function and manually setting the %ready-p flag to nil. This gives me the correct behaviour:

0s 0 Waiting state NIL
0 state :READ
4s 1 Waiting state :READ
9s 2 Waiting state NIL
14s 3 Waiting state NIL
NIL

I've attached a diff of backend/lispworks.lisp showing what I describe above, feel free to use/ignore it.


Frank.



On 24 May 2015 at 20:57, Frank James <frank.a.james@gmail.com> wrote:
Hi Chun,

Thanks for looking into this. I've just pulled the most recent codes from github and have had a little play with them. I now get an error signalled on the socket-receive when I expect an error (rather than a simple -1 count as before). This is good.

However, I am still seeing the socket-state behaviour I described before, along with a new bug that has probably been introduced by the recent changes.

Consider the following simple function which listens for UDP port 8001 for up to 4 iterations:

(defun usocket-test ()
 (let ((s (usocket:socket-connect nil nil 
  :protocol :datagram
 :element-type '(unsigned-byte 8)
 :local-port 8001)))
   (unwind-protect 
       (do ((i 0 (1+ i))
            (buffer (make-array 1024 :element-type '(unsigned-byte 8) :initial-element 0))
            (done nil))
           ((or done (= i 4))
            nil)
         (when (usocket:wait-for-input s :ready-only t :timeout 10)
           (format t "~D state ~S~%" i (usocket::state s))
           (handler-case 
               (multiple-value-bind (buffer count remote-host remote-port)               
                   (usocket:socket-receive s buffer 1024)
                 (handler-case 
                     (usocket:socket-send s (subseq buffer 0 count)
                                          :host remote-host
                                          :port remote-port)
                   (error (c)
                     (format t "socket-send error: ~A~%" c))))
             (error (c)
               (format t "socket-receive error: ~A~%" c)))))
     (usocket:socket-close s))))


after calling this from your repl, from another process send a UDP packet to port 8001 to get things going. I get the following output:

0 state :READ
socket-send error: #<STANDARD-GENERIC-FUNCTION USOCKET:SOCKET-SEND 21CADCFA> is called with unpaired keyword in (2130706433 :PORT 58279).
1 state :READ
socket-receive error: MAKE-INSTANCE is called with keyword :SOCKET among the arguments (USOCKET:NS-TRY-AGAIN-CONDITION :SOCKET 2604) which is not one of (:HOST-OR-IP).
2 state :READ
socket-receive error: MAKE-INSTANCE is called with keyword :SOCKET among the arguments (USOCKET:NS-TRY-AGAIN-CONDITION :SOCKET 2604) which is not one of (:HOST-OR-IP).
3 state :READ
socket-receive error: MAKE-INSTANCE is called with keyword :SOCKET among the arguments (USOCKET:NS-TRY-AGAIN-CONDITION :SOCKET 2604) which is not one of (:HOST-OR-IP).
NIL

My observations:
1. The initial socket-receive succeeds (as expected).
2. The initial socket-send fails with an error I've not seen before and can't diagnose... 
3. Subsequent calls to wait-for-input return immediately because the socket is still in a :READ state, even though I already successfully called socket-receive in the first iteration.
4. Subsequent calls to socket-receive now fail with a different error I can't diagnose.

If I didn't have the max iteration cap, this would spin using 100% of a CPU core and adding around 2MB to the heap per second (on my machine), maxing out the 1GB personal edition heap limit in a short space of time. So I think it's a pretty serious issue. I mainly use SBCL, I use LispWorks occasionally to make sure my codes work on at least 1 other implementation, so I'm not exactly a LispWorks expert.

If you want any more explanations, clarifications or examples I'll be happy to do my best to help. 

As before, I'm using LispWorks personal edition 6.1.1 on Windows 8.1. 

Frank.



On 22 May 2015 at 09:24, Chun Tian (binghe) <binghe.lisp@gmail.com> wrote:
Hi Frank,

Today I have modified SOCKET-SEND and SOCKET-RECEIVE for LispWorks, to be able to detect and report socket errors.  I think in this way, these APIs on LispWorks could behavior closer to other platforms.

Now on Windows, your test code should return a USOCKET:CONNECTION-RESET-ERROR condition, which equals to WSA error ECONNRESET.  However, when I test the same code on Mac OS X, it instead returns USOCKET:CONNECTION-REFUSED-ERROR, I think this is a behavior of BSD sockets.

I hope you can try latest usocket code from Git [1], to see if it works better for your case now. My code changes commit can be seen here [2].

For wait-for-input, I can’t see what you see (the socket object remains in a :READ state), if you still think this is an issue, I hope you can create another test code to demonstrate this issue.

Regards,

Chun

[1] https://github.com/usocket/usocket
[2] https://github.com/usocket/usocket/commit/13386639889fa812540fc4f77824c47e7616db37

Il giorno 10/apr/2015, alle ore 23:00, Frank James <frank.a.james@gmail.com> ha scritto:

> I've been testing some UDP codes on Lispworks (personal 32bit Windows version) and have encountered some undocumented behaviour, I think it's a matter of opinion whether it's a bug or not but it should probably have a sentence or two documenting it somewhere.
>
> Basically I'm sending a UDP packet to a host and listening for a reply, using socket-send and socket-receive. If the host in question is not listening for UDP traffic on that particular port, the Windows socket API says it should return an ECONNRESET error immediately on the socket-receive call, this is indicated by a -1 return value from the underlying recvfrom call.
>
> When this happens the Lispworks backend code returns a length of -1 (and buffer nil). This is perfectly acceptable behaviour I think (although it'd be somewhat nicer to signal an error, but that's a matter of taste). But it should be documented that checking the length here is the correct way to detect an error occurred.
>
> Example code would be something like:
>
> (let ((sock (usocket:socket-connect "localhost" 1234 :protocol :datagram :element-type '(unsigned-byte 8))))
>   (unwind-protect
>     (progn
>       (usocket:socket-send sock (make-array 16 :element-type '(unsigned-byte 8) :initial-element 0) 16)
>       (let ((buffer (make-array 16 :element-type '(unsigned-byte 8) :initial-element 0)))
>         (usocket:socket-receive sock buffer 16)))
>     (usocket:socket-close sock)))
>
>
> What is somewhat more annoying is that the socket object remains in a :READ state. This means that a polling loop using wait-for-input spins continuously, with each socket-receive returning -1 (as explained above). Probably the socket state should be cleared if a socket-receive fails.
>
> Apologies if this is all well-known to those reading this list, but it caused me 10 minutes of head scratching earlier today and thought it was worth mentioning.
>
> Frank.
>