James Bielman wrote:
I'm working on (to begin with) a UTF8-STRING type, which converts Lisp strings to/from UTF-8 on Unicode Lisps.
(Does the CLISP FFI provide something like memcpy?)
No. Some voice in me says I should rather implement the vector<->memory functions for CLISP instead of playing around with cffi, Iterate, reading interesting papers etc. when I have a little time.
So, I think I need that block interface we've talked about.
Reviewing the proposal has been on my TODO list for a long time as well, I'm sorry.
(ffi:with-foreign-string (ptr chars bytes s :encoding charset:utf-8) (let ((buf (foreign-alloc :unsigned-char :count bytes))) (memcpy buf ptr bytes)
I think I'd rather use ext:convert-string-to-bytes, then use my non-existent vector->memory function. Sometimes I feel uneasy about stack-allocating possibly huge strings (e.g. 1MB or more!). In the meantime,
  (let* ((bytes (ext:convert ...))
         (len (length bytes))
         (buf (foreign-alloc :unsigned-char :count len)))
    (setf (memory-as buf (parse-c-type `(c-array uint8 ,len))) bytes))
should be among the fastest options (CLISP can copy the (c-array uint8 N) type fast). And people keep saying the generational GC should be able to free the garbage vector easily.
However, I haven't been able to find an inverse for FFI:WITH-FOREIGN-STRING.
Indeed, I should ... (see above) and implement string-from-foreign and (foreign-string-length :encoding) as a means to interface to mblen().
I'd like to be able to convert a pointer back to a Lisp string without looping in bytecode to create a vector of octets from the pointer.
I suggest the converse of the above, via an array of (unsigned-byte 8).
I tried a whole bunch of combinations of FFI:MEMORY-AS with FFI:C-ARRAY-PTR types and got nothing but segfaults.
Please report a bug, but possibly you just did not re-read the recent impnotes closely enough. For instance, while the FFI now accepts non-1:1 encodings, its use with c-array-ptr and c-array[-max] is mostly broken, and some parts revert to a 1:1 encoding (see *foreign-8bit-encoding*). I would have left the 1:1 restriction in place and taken more time to think about the problems. :-(
The operators that explicitly take an :encoding are safe, like with-foreign-string.
Also, custom:*foreign-encoding* is a symbol macro. Thus (let ((custom:*foreign-encoding* charset:foo))) won't work as expected, you need setq and unwind-protect. :-(
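To make the setq/unwind-protect point concrete, here is a minimal sketch of a wrapper macro. The name WITH-FOREIGN-ENCODING is hypothetical (it is not part of CLISP or cffi); the sketch only assumes that custom:*foreign-encoding* can be assigned with setq, as described above.

```lisp
;; Hypothetical helper, CLISP only: temporarily change the foreign encoding.
;; A plain LET would not rebind the symbol macro, so the old value is saved
;; by hand and restored in the cleanup form of UNWIND-PROTECT.
(defmacro with-foreign-encoding ((encoding) &body body)
  (let ((old (gensym "OLD")))
    `(let ((,old custom:*foreign-encoding*))
       (unwind-protect
            (progn (setq custom:*foreign-encoding* ,encoding)
                   ,@body)
         ;; Runs on normal exit and on non-local exit alike.
         (setq custom:*foreign-encoding* ,old)))))
```

Usage would look like (with-foreign-encoding (charset:iso-8859-1) ...), with the FFI calls that depend on the encoding placed in the body.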
Is there something I can use to convert the pointer to either a vector of octets (which I can pass to EXT:CONVERT-STRING-FROM-BYTES), or to a Lisp string directly?
(memory-as pointer (parse-c-type `(c-array uint8 ,len)))
Looking at the clisp sources, (c-array character N) -> Lisp string seems ok as well. But e.g. don't use (c-array-max character N) with UTF-16! As you can see, there's room for improvement.
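As a rough sketch of the octet-vector route for a known length: the function name FOREIGN-UTF8-TO-STRING and its parameters are made up for illustration, and it assumes POINTER is a CLISP ffi:foreign-address and that LENGTH counts bytes, not characters.

```lisp
;; Hypothetical helper, CLISP only: read LENGTH bytes at POINTER as an
;; (unsigned-byte 8) vector, then decode that vector as UTF-8.
(defun foreign-utf8-to-string (pointer length)
  (let ((octets (ffi:memory-as
                 pointer
                 (ffi:parse-c-type `(ffi:c-array ffi:uint8 ,length)))))
    (ext:convert-string-from-bytes octets charset:utf-8)))
```

Because the decoding is done by ext:convert-string-from-bytes with an explicit charset, this path does not depend on the current value of custom:*foreign-encoding*.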
Summary:
ptr -> Lisp string: either ext:convert + memory-as uint8, or
  (let ((old *foreign-encoding*))
    (unwind-protect
        (progn (setq *foreign-encoding* utf-8)
               (memory-as pointer (parse-c-type `(c-array character ,known-length))))
      (setq *foreign-encoding* old))) ; looks scary :-(
Do you need unknown-length as well?
Lisp string -> ptr: ext:convert + memory-as uint8 as mentioned above.
Regards, Jörg Höhle.
On Mon, 2006-01-02 at 14:18 +0100, Hoehle, Joerg-Cyril wrote:
Summary:
ptr -> Lisp string: either ext:convert + memory-as uint8, or
  (let ((old *foreign-encoding*))
    (unwind-protect
        (progn (setq *foreign-encoding* utf-8)
               (memory-as pointer (parse-c-type `(c-array character ,known-length))))
      (setq *foreign-encoding* old))) ; looks scary :-(
Do you need unknown-length as well?
Lisp string -> ptr: ext:convert + memory-as uint8 as mentioned above.
Thanks for the detailed explanation. I did need unknown length, here's what I'm using now to convert to/from:
#+clisp
(defmethod translate-to-foreign ((s string) (name (eql 'utf8-string)))
  ;; Encode to UTF-8 octets, then copy them into a freshly allocated,
  ;; NUL-terminated foreign buffer.
  (let* ((bytes (ext:convert-string-to-bytes s charset:utf-8))
         (length (length bytes))
         (buf (foreign-alloc :unsigned-char :count (1+ length))))
    (setf (ffi:memory-as buf (ffi:parse-c-type `(ffi:c-array ffi:uint8 ,length)))
          bytes)
    (setf (mem-aref buf :unsigned-char length) 0)
    buf))
#+clisp (defcfun "strlen" :unsigned-int (s :pointer))
#+clisp
(defmethod translate-from-foreign (ptr (name (eql 'utf8-string)))
  (let* ((length (strlen ptr))
         (bytes (ffi:memory-as ptr (ffi:parse-c-type `(ffi:c-array ffi:uint8 ,length)))))
    (ext:convert-string-from-bytes bytes charset:utf-8)))
Now that this is working, I will start pulling the implementation-specific code into an encoding-aware %FOREIGN-STRING-TO-LISP function in CFFI-SYS.
James
Hoehle, Joerg-Cyril wrote:
Also, custom:*foreign-encoding* is a symbol macro. Thus (let ((custom:*foreign-encoding* charset:foo))) won't work as expected, you need setq and unwind-protect. :-(
ext:letf[*] does that.
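For illustration, a minimal sketch of the ext:letf route. The function name CALL-WITH-FOREIGN-ENCODING is hypothetical; the only assumption is that ext:letf accepts custom:*foreign-encoding* as a place, which is what the reply above suggests.

```lisp
;; Hypothetical helper, CLISP only: EXT:LETF binds places rather than
;; variables, so it can rebind the symbol macro for the duration of the
;; call and restore the previous value on exit.
(defun call-with-foreign-encoding (encoding thunk)
  (ext:letf ((custom:*foreign-encoding* encoding))
    (funcall thunk)))
```

A call such as (call-with-foreign-encoding charset:iso-8859-1 (lambda () ...)) would then take the place of the explicit setq/unwind-protect dance.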
Thanks!