James Bielman wrote:
>I'm working on (to begin with), a UTF8-STRING type
>which converts Lisp strings to/from UTF-8 on Unicode Lisps.
>(Does the CLISP FFI provide something like memcpy?)
No. Some voice in me says I should rather implement the vector<->memory functions for CLISP instead of playing around with cffi, Iterate, reading interesting papers etc. when I have a little time.
>So, I think I need that block interface we've talked about.
Reviewing the proposal has been on my TODO list for a long time as well, I'm sorry.
> (ffi:with-foreign-string (ptr chars bytes s :encoding charset:utf-8)
> (let ((buf (foreign-alloc :unsigned-char :count bytes)))
> (memcpy buf ptr bytes)
I think I'd rather use ext:convert-string-to-bytes, then use my non-existent vector->memory function. Sometimes I feel uneasy about stack-allocating possibly huge strings (e.g. 1MB or more!).
In the meantime,
(let* ((bytes (ext:convert ...))
       (len (length bytes))
       (buf (foreign-alloc :unsigned-char :count len)))
  (setf (memory-as buf (parse-c-type `(c-array uint8 ,len)))
        bytes))
should be among the fastest approaches (CLISP can copy the (c-array uint8 N) type fast). And people keep saying the generational GC should be able to free the garbage vector easily.
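Spelled out with CLISP's own allocator instead of cffi, the idea might look roughly like the sketch below. STRING-TO-FOREIGN-UTF8 is a made-up name, not an existing function, and I assume ffi:foreign-allocate is available in your CLISP version. Empty strings are not treated specially.

```lisp
;; Sketch only: STRING-TO-FOREIGN-UTF8 is a hypothetical helper, not part of CLISP.
(defun string-to-foreign-utf8 (string)
  "Copy STRING, encoded as UTF-8, into freshly malloc'ed foreign memory.
Returns the foreign address and the byte count. No terminating zero byte
is written. The caller must release the memory with FFI:FOREIGN-FREE."
  (let* ((bytes (ext:convert-string-to-bytes string charset:utf-8))
         (len (length bytes))
         ;; Heap allocation, so large strings do not end up on the stack.
         (var (ffi:foreign-allocate (ffi:parse-c-type 'ffi:uint8) :count len))
         (buf (ffi:foreign-address var)))
    ;; One block copy of the whole octet vector into foreign memory.
    (setf (ffi:memory-as buf (ffi:parse-c-type `(ffi:c-array ffi:uint8 ,len)))
          bytes)
    (values buf len)))
```

The temporary octet vector becomes garbage as soon as the function returns; only the foreign buffer needs explicit freeing.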
>However, I haven't been able to find an inverse for
>FFI:WITH-FOREIGN-STRING.
Indeed, I should ... (see above) and implement string-from-foreign and (foreign-string-length :encoding) as a means to interface to mblen().
>I'd like to be able to convert a pointer back
>to a Lisp string without looping in bytecode to create a vector of
>octets from the pointer.
I suggest the converse of the above, via an array of (unsigned-byte 8).
> I tried a
>whole bunch of combinations of FFI:MEMORY-AS with FFI:C-ARRAY-PTR types
>and got nothing but segfaults.
Please report a bug, but possibly you just did not re-read recent impnotes closely enough. For instance, while the FFI now accepts non 1:1 encodings, its use with c-array-ptr and c-array[-max] is mostly broken and some parts revert to a 1:1 encoding (see *foreign-8bit-encoding*). I would have left the 1:1 restriction in place and taken more time to think about the problems. :-(
The operators that explicitly take an :encoding are safe, like with-foreign-string.
Also, custom:*foreign-encoding* is a symbol macro. Thus (let ((custom:*foreign-encoding* charset:foo))) won't work as expected; you need setq and unwind-protect. :-(
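If you need this in more than one place, the setq/unwind-protect dance could be wrapped in a small macro. WITH-FOREIGN-ENCODING below is a name I made up for illustration; nothing like it is provided by CLISP as far as I know.

```lisp
;; Sketch only: WITH-FOREIGN-ENCODING is a hypothetical macro, not part of CLISP.
(defmacro with-foreign-encoding ((encoding) &body body)
  "Evaluate BODY with CUSTOM:*FOREIGN-ENCODING* set to ENCODING,
restoring the previous value on any exit, normal or non-local."
  (let ((old (gensym "OLD-ENCODING")))
    `(let ((,old custom:*foreign-encoding*))
       (unwind-protect
            ;; SETQ, not LET: the variable is a symbol macro, so it
            ;; cannot be rebound like a special variable.
            (progn (setq custom:*foreign-encoding* ,encoding)
                   ,@body)
         (setq custom:*foreign-encoding* ,old)))))
```

Note that this changes the encoding globally for the duration of the body, not per thread or per dynamic scope as a real special binding would.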
> Is there something I can use to convert
>the pointer to either a vector of octets (which I can pass to
>EXT:CONVERT-STRING-FROM-BYTES, or to a Lisp string directly?
(memory-as pointer (parse-c-type `(c-array uint8 ,len)))
Looking at the clisp sources, (c-array character N) -> Lisp string seems ok as well. But e.g. don't use (c-array-max character N) with UTF-16!
As you can see, there's room for improvement.
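For the pointer -> string direction with a known byte length, the octet-vector route could be sketched as follows. FOREIGN-UTF8-TO-STRING is a hypothetical name, and I assume POINTER is an ffi:foreign-address and LEN counts bytes, not characters.

```lisp
;; Sketch only: FOREIGN-UTF8-TO-STRING is a hypothetical helper, not part of CLISP.
(defun foreign-utf8-to-string (pointer len)
  "Read LEN bytes starting at POINTER and decode them as UTF-8."
  (let ((bytes (ffi:memory-as pointer
                              (ffi:parse-c-type `(ffi:c-array ffi:uint8 ,len)))))
    ;; BYTES is an ordinary Lisp vector of octets at this point; decoding
    ;; it does not go through *FOREIGN-ENCODING* at all.
    (ext:convert-string-from-bytes bytes charset:utf-8)))
```

Because the encoding is passed explicitly to ext:convert-string-from-bytes, this avoids the c-array character problems mentioned above.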
Summary:
ptr -> Lisp string:
  either ext:convert + memory-as uint8
  or
    (let ((old *foreign-encoding*))
      (unwind-protect
           (progn (setq *foreign-encoding* utf-8)
                  (memory-as pointer (parse-c-type `(c-array character ,known-length))))
        (setq *foreign-encoding* old))) ; looks scary :-(
  Do you need unknown-length as well?
Lisp string -> ptr:
  ext:convert + memory-as uint8 as mentioned above.
Regards,
Jörg Höhle.