A little background: I am trying to get RDNZL working on Clozure CL 1.4 running on my Win32 box using CFFI. I get an "exception occurred while executing foreign code" error that causes the listener to drop into the kernel debugger. All I can tell is that the error occurs in %invoke-static-member, and I believe it happens because the function expects a UTF-16 (wchar) string, but for some reason the encoding is not working as expected in CFFI.
I went and did a little experimentation and found the following:
CFFI> *default-foreign-encoding*
:UTF-8
CFFI> (foreign-funcall "_wcsdup" (:string :encoding :utf-16) "foo" :string)
"f" ;; That looks like the manual
CFFI> (let ((*default-foreign-encoding* :utf-16))
        (foreign-funcall "_wcsdup" (:string :encoding :utf-16) "foo" :string))
"" ;; Oops, foo in Sumerian cuneiform, maybe?
Note that strdup won't work with UTF-16 on Windows. You use _wcsdup instead, apparently.
CL-USER> (with-encoded-cstrs :utf-16 ((x "foo")) (#__wcsdup x))
#<A Foreign Pointer #x117C730>
CL-USER> (ccl::%get-native-utf-16-cstring *)
"foo" ;; Almost got it!

CL-USER> (ccl::with-native-utf-16-cstr (x "foo") (#__wcsdup x))
#<A Foreign Pointer #x1171590>
CL-USER> (ccl::%get-native-utf-16-cstring *)
"foo" ;; Bulls-eye!
Any idea why this is happening?
Many thanks, John Miller
On Mon, Aug 31, 2009 at 7:47 PM, John Miller <millejoh@mac.com> wrote:
CFFI> *default-foreign-encoding*
:UTF-8
CFFI> (foreign-funcall "_wcsdup" (:string :encoding :utf-16) "foo" :string)
"f" ;; That looks like the manual
CFFI> (let ((*default-foreign-encoding* :utf-16))
        (foreign-funcall "_wcsdup" (:string :encoding :utf-16) "foo" :string))
" " ;; Oops, foo in Sumerian cuneiform, maybe?
The problem is that the :UTF-16 encoding assumes big-endianness when a BOM is not present, and that clearly is not what we want to happen in this case. Perhaps, in the context of CFFI, we should assume native endianness? In any case, it should be possible to specify the desired endianness: native, big, or little.
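To make the endianness point concrete, here is a minimal sketch in portable Common Lisp. It does not use Babel or CFFI; DECODE-UTF-16-OCTETS and *FOO-UTF-16LE* are hypothetical names made up for illustration, and the decoder handles only single 16-bit code units (no surrogate pairs, no BOM). It assumes a Lisp with Unicode characters, and it assumes the octets of "foo" are laid out in memory the way a little-endian machine such as your Win32 box would store them.

```lisp
;; Hypothetical helper, not part of Babel or CFFI: decode a vector of
;; UTF-16 octets using an explicitly chosen byte order.
;; Simplification: BMP code units only, no surrogates, no BOM check.
(defun decode-utf-16-octets (octets endianness)
  (let ((string (make-string (floor (length octets) 2))))
    (loop for i from 0 below (length string)
          for b0 = (aref octets (* 2 i))
          for b1 = (aref octets (1+ (* 2 i)))
          do (setf (char string i)
                   (code-char (ecase endianness
                                (:little (logior b0 (ash b1 8)))
                                (:big    (logior (ash b0 8) b1))))))
    string))

;; "foo" as UTF-16 in little-endian byte order, without a BOM.
(defparameter *foo-utf-16le* #(#x66 #x00 #x6F #x00 #x6F #x00))

;; Decoding with the matching byte order recovers the string.
(decode-utf-16-octets *foo-utf-16le* :little)
;; => "foo"

;; Decoding the same octets as big-endian yields unrelated code points.
(map 'list #'char-code (decode-utf-16-octets *foo-utf-16le* :big))
;; => (26112 28416 28416), i.e. U+6600 U+6F00 U+6F00

;; The second octet is already zero, so a reader that treats the
;; buffer as a NUL-terminated 8-bit string stops after one character.
(position 0 *foo-utf-16le*)
;; => 1
```

The big-endian result is consistent with the unreadable string in the second call quoted above. The last form would also explain the "f" in the first call, on the assumption that the :string return type was decoded with the default :UTF-8 encoding and therefore stopped at the first zero octet.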
This is an essential feature I've been meaning to add to Babel. However, I'll only be able to get to it later next week.
If you need a quick workaround, you can edit the DEFINE-DECODER :UTF-16 definition in babel/src/enc-unicode.lisp by changing the default CASE clause (t #+little-endian t) to (t #+little-endian NIL).
HTH.