On Sun, 06 Nov 2011 12:13:07 -0500, Helmut Eller heller@common-lisp.net wrote:
> Counting characters was problematic, especially with Lisps that use
> UTF-16 internally (Allegro, CMUCL, JVM-based Lisps). Emacs counts the
> length of strings in Unicode code points, while in UTF-16 a single
> code point may occupy either 1 or 2 indexes (code units), so CL:LENGTH
> may return something different from what Emacs expects. For the same
> reason we can't use READ-SEQUENCE to read a specified number of code
> points.
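For example (an illustrative sketch in Python, not SLIME code; Python, like Emacs, measures string length in code points):

```python
# A character outside the Basic Multilingual Plane takes 2 UTF-16 code
# units (a surrogate pair), so code-point and code-unit counts diverge.
s = "a\U0001F600"  # 'a' plus U+1F600, which is outside the BMP

code_points = len(s)                           # counts code points: 2
utf16_units = len(s.encode("utf-16-le")) // 2  # 2 bytes per code unit: 3
utf8_bytes = len(s.encode("utf-8"))            # 1 + 4 bytes: 5

print(code_points, utf16_units, utf8_bytes)
```

A UTF-16-based Lisp whose CL:LENGTH returns code units would report 3 here, while Emacs would expect 2.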
> The new format looks like this:
> | byte0 | 3-byte length | ... payload ... |
> The 3-byte length header specifies the length of the payload in bytes.
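Concretely, the framing described above might be sketched like this (a hypothetical illustration, not SLIME code; the meaning of the tag byte and the big-endian byte order of the length are my assumptions, since the message doesn't specify them):

```python
def frame(payload_sexp: str, tag: int = 0) -> bytes:
    # One tag byte, a 3-byte (assumed big-endian) payload length in
    # bytes, then the payload itself as UTF-8 text.
    body = payload_sexp.encode("utf-8")
    assert len(body) < 1 << 24, "payload too large for a 3-byte length"
    return bytes([tag]) + len(body).to_bytes(3, "big") + body

def unframe(data: bytes) -> str:
    # Read the length from the header and decode that many bytes.
    length = int.from_bytes(data[1:4], "big")
    return data[4:4 + length].decode("utf-8")
```

Because the header counts bytes, both sides agree on where the payload ends regardless of whether their internal strings use code points or UTF-16 code units.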
Is there a reason to start using a binary encoding of the message length? It makes the messages harder to inspect by eye and harder to write integration tests for.
> The payload is an s-expression encoded as UTF-8 text.
Normalising on UTF-8 and counting bytes sounds like it would solve the original issue without switching to a binary encoding of the message length.
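That alternative might look like this (a sketch under the assumption that the existing protocol's header is a human-readable hex digit count; only the unit being counted changes, from characters to UTF-8 bytes):

```python
def frame_hex(payload_sexp: str) -> bytes:
    # Keep a textual length header (six hex digits here, as an assumed
    # stand-in for the existing format) but count UTF-8 *bytes* rather
    # than characters, so UTF-16-based Lisps and Emacs agree on the
    # payload boundary while the wire format stays readable.
    body = payload_sexp.encode("utf-8")
    assert len(body) < 16 ** 6, "payload too large for a 6-hex-digit length"
    return f"{len(body):06x}".encode("ascii") + body
```

The resulting messages are still plain text, so they remain easy to inspect and to construct by hand in tests.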