* Hugo Duncan [2011-11-07 04:04] writes:
On Sun, 06 Nov 2011 12:13:07 -0500, Helmut Eller <heller@common-lisp.net> wrote:
Counting characters was problematic, especially with Lisps that use UTF-16 internally (Allegro, CMUCL, JVM-based Lisps). Emacs counts the length of strings in Unicode code points, while in UTF-16 a single code point may occupy either one or two code units (string indexes), so CL:LENGTH may return something different from what Emacs expects. For the same reason we can't use READ-SEQUENCE to read a specified number of code points.
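[A minimal illustration of the mismatch, not from the original mail. U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP, so UTF-16 needs a surrogate pair for it:

;; On a Lisp whose strings hold full code points (e.g. SBCL):
(length (string (code-char #x1D11E)))  ; => 1, same as Emacs' count
;; On a UTF-16 Lisp the same character is stored as two code units
;; (a surrogate pair), so LENGTH returns 2 while Emacs counts 1.
]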
The new format looks like this:
| byte0 | 3 bytes length |
| ... payload ... |
The 3-byte length header specifies the length of the payload in bytes.
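[A sketch of a writer for this framing, for illustration only, not the actual Swank code. It assumes the tag octet byte0 is zero, uses the babel library for UTF-8 encoding, and puts the length in network byte order; the reply below suggests the implementation did not in fact use network byte order:

(defun write-frame (payload stream)
  ;; STREAM must be a binary stream of (unsigned-byte 8) elements.
  (let* ((octets (babel:string-to-octets payload :encoding :utf-8))
         (len (length octets)))
    (write-byte 0 stream)                      ; byte0: tag octet (assumed value)
    (write-byte (ldb (byte 8 16) len) stream)  ; length, most significant octet
    (write-byte (ldb (byte 8 8) len) stream)
    (write-byte (ldb (byte 8 0) len) stream)   ; length, least significant octet
    (write-sequence octets stream)))
]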
Is there a reason to start using a binary encoding of the message length?
No deep reason. We actually used binary encoding before we used hex-strings. That worked fine with latin-1 but not with utf-8. I guess it's just instinct; now that we explicitly work on a byte stream it's even more natural. Should probably have used network byte order.
This makes the messages less easy to inspect, and less easy to write integration tests for.
Only marginally. Shifting 3 bytes together is not exactly rocket science.
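[For reference, the reassembly in question on the reading side, under the same big-endian assumption as the sketch above:

(defun read-frame-length (stream)
  ;; CL evaluates arguments left to right, so the octets are read in order.
  (logior (ash (read-byte stream) 16)
          (ash (read-byte stream) 8)
          (read-byte stream)))
]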
The payload is an s-exp encoded as UTF-8 text.
Normalising on utf-8 and counting bytes sounds like it would solve the original issue without changing to a binary encoding of the message length.
Right. It would not be backward compatible, though.
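[For comparison, a sketch of Hugo's suggestion: keep a printable length header but make it count UTF-8 octets rather than characters. The six-hex-digit width is an assumption carried over from the old hex-string format:

(defun write-hex-framed (payload stream)
  ;; Header: six ASCII hex digits giving the payload length in octets.
  (let* ((octets (babel:string-to-octets payload :encoding :utf-8))
         (header (format nil "~6,'0x" (length octets))))
    (write-sequence (babel:string-to-octets header :encoding :utf-8) stream)
    (write-sequence octets stream)))
]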
Helmut