Hi, [I'm not a member of cl-ssl@ and cross-posting is hairy, so I'd prefer this to remain in cffi-devel. I've CC'ed David Lichteblau, author of cl+ssl, for comments.]
CL+SSL's using that macro caused me to think as follows:
CFFI-SYS says:
;;;# Shareable Vectors
;;; This interface is very experimental. WITH-POINTER-TO-VECTOR-DATA
;;; should be defined to perform a copy-in/copy-out if the Lisp
;;; implementation can't do this.
cffi-sys::with-pointer-to-vector-data, as is, is highly problematic. Trying to get the base address of a Lisp vector in memory is unportable and subject to subtle errors with a moving GC. I'll try and suggest a better API below.
On CMUCL, it's implemented using sys:without-gcing, which is sign enough of a problem.
This reminds me of similar misuse, 20 years ago, of SI::DISABLE-XYZ (I forgot the name) on Symbolics machines, and on the Amiga computer, where programmers were repeatedly told not to (ab)use Forbid() (disable multitasking) or Disable() (disable interrupts).
I could understand it if it were restricted to CFFI internal use, for a couple of highly optimized vector copying operations. Everything else should be a big NO-NO. Its use is worse than the often repeated "EVAL is EVIL".
Sadly, its mere existence encourages some use. E.g. CL+SSL uses it :-(
This is a very bad idea. Let me explain. Cl+ssl:stream-read-byte (in stream.lisp) uses it for ssl-read and installs a callback handler. UNIX signaling seems also involved, as the error codes and loop structure indicate.
Now consider what happens when a user has installed a SERVE-FD-EVENT handler (e.g. running SLIME). While cl+ssl is looping and waiting for ssl-read to complete, arbitrary Lisp code can be called by the signal handlers and the event dispatcher. All of this within the context of sys:without-gcing. This is asking for trouble. And I have not even mentioned other threads getting to run.
Back on the Amiga computer, Forbid()/Disable() were accepted in a few cases, and programmers were told to quickly exit the protected section. People used them mostly from assembly, but also C. OS calls within this section are almost always a bad idea, since they might either break out of Forbid() and reenable multitasking, or hang because important interrupts would not get served. This could break invariants of the OS or the application, which would cause random crashes some random time later.
As noted, I could understand use of gc:without-gcing and with-pointer-to-data when kept strictly inside CFFI and used locally, e.g. to copy data between foreign memory and a Lisp array.
Any other use is leading to problems, typically hard to debug. I hope I made this clear.
There is IMHO no reason for CL+SSL to use such a function. If it wants objects at a fixed address, it should use foreign-alloc'ed memory.
Try to explain your C/Java/Perl/tcl programmer friend: "I can only use SSL when resorting to sys::without-gc". "Huh? In my language & environment, no such dirty hack is necessary. It's clearly superior".
BTW, cl+ssl:stream-read-sequence uses (replace thing buf :start1 start :end1 (+ start length)) anyway. So there's copying even when sharing. Instead it could copy from the foreign buffer to the user supplied "thing". This raises the question of whether copying to an arbitrary array-element-type is supported by CFFI's emerging memory<->vector block copy API.
Now let's move toward a better design:
From a CFFI perspective, the following comment:
;;; WITH-POINTER-TO-VECTOR-DATA should be defined to perform a
;;; copy-in/copy-out if the Lisp implementation can't do this.
IMHO shows an inverted design.
With the interface and recommendation as is, nobody knows which of a copy-in and/or copy-out is needed. The macro would do both, just to be safe. Implementations would suffer a double speed penalty.
The need for copy-in or copy-out must be indicated by the programmer, similarly to :in and :out parameter modes.
I believe the design should be the opposite: an efficient copy-in or copy-out may resort to with-pointer-to-vector-data and possibly to si::without-gcing (is that thread-safe at all?) to quickly copy the vector and do nothing more than that (no callbacks, no signals, etc.).
To return to the CL+SSL example, I suggest using a foreign-alloc'ed buffer for ssl-read etc., then copying that into a Lisp vector.
This copying could use CFFI's emerging memory block interface and rely on CFFI to be fast.
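A minimal sketch of this suggestion, using only CFFI's existing `with-foreign-object` and `mem-aref` and an element-by-element loop rather than the emerging block copy interface. The names `read-into-vector` and `fill-fn` are hypothetical; `fill-fn` stands in for a foreign call such as ssl-read and is assumed to take a pointer and a size and to return the number of bytes it wrote.

```lisp
;; Sketch only: READ-INTO-VECTOR and FILL-FN are hypothetical names,
;; not part of CFFI or CL+SSL.
(defun read-into-vector (fill-fn vector &key (start 0) (end (length vector)))
  "Let FILL-FN write into a foreign scratch buffer, then copy the bytes
it produced into VECTOR between START and END. Returns the byte count."
  (let ((size (- end start)))
    (cffi:with-foreign-object (buf :unsigned-char size)
      ;; FILL-FN receives (pointer size) and returns how many bytes it wrote.
      ;; The Lisp vector is never exposed to foreign code, so no GC tricks
      ;; are needed while FILL-FN blocks, runs callbacks, or gets interrupted.
      (let ((count (min size (max 0 (funcall fill-fn buf size)))))
        (dotimes (i count)
          (setf (aref vector (+ start i))
                (cffi:mem-aref buf :unsigned-char i)))
        count))))

;; Usage with a stand-in for the foreign call:
(let ((v (make-array 8 :element-type '(unsigned-byte 8) :initial-element 0)))
  (read-into-vector (lambda (ptr size)
                      (declare (ignore size))
                      (setf (cffi:mem-aref ptr :unsigned-char 0) 42)
                      1)
                    v :start 2)
  v)
```

The cost is one extra copy per read, in exchange for never holding the GC off during a potentially long foreign call.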
To implement that block copy, CFFI could use with-pointer-to-vector-data. However I believe it's now superfluous: In implementations where such a function is available at all (e.g. cmucl), the native code compiler can already translate Lisp code to an efficient loop from one array to the other. Safely.
I mean that what w-p-t-v-d does (sys:vector-SAP etc.) can be restricted to a few internal functions within cffi-{cmucl,*}.lisp and needs no dangerous general macro wrapper.
I think this is the best one can achieve portably, without resorting to very specialised features like IIRC Allegro's ability to allocate Lisp vectors at non-moving locations. I haven't yet investigated how one could make transparent use of such a feature, even though any Allegro user would find it suboptimal if cl+ssl (or any other library) would not make best use of her/his Lisp implementation. (Oh well, I'll rant about portable libraries another time. :)
On a final note, cffi-cmucl contains
(let ((,ptr-var (sys:vector-sap ,vector)))
This won't work with displaced etc. (non simple) vectors. It's not the common case, but it should be supported, or the restriction documented (not that it matters if w-p-t-v-d is dropped anyway).
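One possible way to support displaced vectors is to follow the displacement chain with the standard function `array-displacement` and take the address of the underlying array plus an element offset. The helper name `underlying-vector` is hypothetical, and this is only a sketch: vectors with fill pointers or adjustable vectors that are not displaced may still need implementation-specific handling before something like sys:vector-sap can be applied.

```lisp
;; Sketch only: UNDERLYING-VECTOR is a hypothetical helper name.
(defun underlying-vector (vector)
  "Return the non-displaced array that ultimately holds VECTOR's elements
and the index offset of VECTOR's first element within it."
  (let ((offset 0))
    (loop
      ;; ARRAY-DISPLACEMENT returns NIL and 0 for arrays that are not displaced.
      (multiple-value-bind (target index) (array-displacement vector)
        (if target
            (setf vector target
                  offset (+ offset index))
            (return (values vector offset)))))))
```

A backend could then compute the data address from the returned array and add the offset scaled by the element size, or simply signal an error for displaced arguments and document the restriction.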
I always welcome comments.
Regards, Jörg Höhle
"Hoehle, Joerg-Cyril" Joerg-Cyril.Hoehle@t-systems.com writes:
> cffi-sys::with-pointer-to-vector-data, as is, is highly problematic. Trying to get the base address of a Lisp vector in memory is unportable and subject to subtle errors with a moving GC. I'll try and suggest a better API below.
I think you make a lot of good points here, and I mostly agree with your conclusions about using the block interface instead.
I had another idea though---what if, instead of WITH-SHAREABLE-VECTOR or whatever, there was a CFFI VECTOR foreign type with an appropriate type translator to allow passing a Lisp vector as a pointer to a C function? Something like (just making up syntax as I go along here):
(defcfun ("read" unix-read) :int
  (fd :int)
  (buffer (vector (unsigned-byte 8) :out))
  (bufsize :long))
The VECTOR type translator would now have enough information to safely disable the GC, pin the vector, or whatever is necessary only during the duration of the foreign function call. And since we know the direction (and presumably it would be required) we can do the correct copy in or out on Lisps that can't do the optimization. I don't know how this would work on Allegro though if a vector _must_ be created in a static area in order to do this.
Obviously all bets are off if the vector isn't big enough, etc, but that's the risk you take when you call C functions anyway...
Thoughts?
James
On Wed, 25 Jan 2006 11:30:33 -0800, James Bielman jamesjb@jamesjb.com said:
> "Hoehle, Joerg-Cyril" Joerg-Cyril.Hoehle@t-systems.com writes:
>> cffi-sys::with-pointer-to-vector-data, as is, is highly problematic. Trying to get the base address of a Lisp vector in memory is unportable and subject to subtle errors with a moving GC. I'll try and suggest a better API below.
> I think you make a lot of good points here, and I mostly agree with your conclusions about using the block interface instead.
> I had another idea though---what if, instead of WITH-SHAREABLE-VECTOR or whatever, there was a CFFI VECTOR foreign type with an appropriate type translator to allow passing a Lisp vector as a pointer to a C function? Something like (just making up syntax as I go along here):
> (defcfun ("read" unix-read) :int
>   (fd :int)
>   (buffer (vector (unsigned-byte 8) :out))
>   (bufsize :long))
> The VECTOR type translator would now have enough information to safely disable the GC, pin the vector, or whatever is necessary only during the duration of the foreign function call.
This design interacts badly with multithreading if it has to disable the GC.
__Martin
Quoting Hoehle, Joerg-Cyril (Joerg-Cyril.Hoehle@t-systems.com):
> cffi-sys::with-pointer-to-vector-data, as is, is highly problematic. Trying to get the base address of a Lisp vector in memory is unportable and subject to subtle errors with a moving GC. I'll try and suggest a better API below.
> On CMUCL, it's implemented using sys:without-gcing, which is sign enough of a problem.
I think you have argued quite convincingly that WITHOUT-GCING is a very bad implementation strategy for pinned vectors.
However, I don't think that the concept of WITH-POINTER-TO-VECTOR-DATA is entirely bogus just because some backends use a bad implementation.
I believe that, on implementations which don't support this idiom natively, the right thing is to temporarily create a foreign vector and copy from/to that foreign scratch space in the implementation of the macro. That's not as fast as possible, but it is safe and portable.
Note that Java's JNI uses this strategy, too.
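The scratch-space strategy described above can be sketched as a macro that copies through foreign memory and lets the caller state the direction. Everything here is an assumption for illustration: the name `with-vector-copy` and its `:direction` keyword are hypothetical and not part of CFFI, the sketch handles only (unsigned-byte 8) vectors, and the direction must be a literal keyword because it is examined at macroexpansion time.

```lisp
;; Sketch only: WITH-VECTOR-COPY is a hypothetical name, not part of CFFI.
(defmacro with-vector-copy ((ptr-var vector &key (direction :in-out)) &body body)
  "Bind PTR-VAR to a foreign copy of the (unsigned-byte 8) VECTOR around BODY.
DIRECTION is :IN (copy to foreign memory before BODY), :OUT (copy back
after BODY), or :IN-OUT (both). The copy back is skipped on non-local exit."
  (let ((vec (gensym "VEC"))
        (len (gensym "LEN"))
        (i (gensym "I")))
    `(let* ((,vec ,vector)
            (,len (length ,vec)))
       (cffi:with-foreign-object (,ptr-var :unsigned-char ,len)
         ;; Copy-in is emitted only when the caller asked for it.
         ,@(when (member direction '(:in :in-out))
             `((dotimes (,i ,len)
                 (setf (cffi:mem-aref ,ptr-var :unsigned-char ,i)
                       (aref ,vec ,i)))))
         (multiple-value-prog1 (progn ,@body)
           ;; Copy-out likewise, so a pure :IN or :OUT use pays for one copy.
           ,@(when (member direction '(:out :in-out))
               `((dotimes (,i ,len)
                   (setf (aref ,vec ,i)
                         (cffi:mem-aref ,ptr-var :unsigned-char ,i))))))))))
```

Because this works on any vector, including ones created by user code, a backend that can pin objects could replace the copies with pinning without changing callers.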
However, there is a big difference between JNI and CFFI's current proposal: For JNI, the primitives in question can be used for *any* array, even arrays that were created by random user code. So there is an equivalent of WITH-POINTER-TO-VECTOR-DATA, but no equivalent of MAKE-SHARABLE-BYTE-VECTOR is necessary.
And here's why this matters:
> BTW, cl+ssl:stream-read-sequence uses (replace thing buf :start1 start :end1 (+ start length)) anyway. So there's copying even when sharing. [...]
... the problem is that we would *like* to simply pass the user's vector into those functions, but we cannot expect the user to have created a sharable vector. (So we are currently copying the vector manually instead, but that's obviously bogus.)
To make the sharable vector interface in CFFI useful for CL+SSL, the interface would have to be changed so that WITH-POINTER-TO-VECTOR-DATA is guaranteed to work for any vector.
(It could make sense to keep MAKE-SHARABLE-BYTE-VECTOR, and say that applications will have a greater chance to actually get non-copying behaviour if they create their vectors this way. As you mentioned, ACL has such vectors. To make that work with the new interface, there would have to be some way to look at a vector someone else created, and find out whether it is sharable or not, so that WITH-POINTER-TO-VECTOR-DATA could decide at runtime which implementation strategy to use.)
> I believe the design should be the opposite: an efficient copy-in or copy-out may resort to with-pointer-to-vector-data and possibly to si::without-gcing (is that thread-safe at all?) to quickly copy the vector and do nothing more than that (no callbacks, no signals, etc.).
> To return to the CL+SSL example, I suggest to use a foreign-alloc'ed buffer for ssl-read etc., then copy that into a Lisp vector.
Well, I would prefer the interface I've explained above.
(Perhaps with the :in and :out arguments you mentioned.)
> I think this is the best one can achieve portably, without resorting to very specialised features like IIRC Allegro's ability to allocate Lisp vectors at non-moving locations.
[...]
I didn't think about Allegro when I started using this interface. It's only now that I realize how Allegro's allocation strategies must have influenced the current proposal.
What I had in mind was SBCL, which has a macro called WITH-PINNED-OBJECTS that does exactly what CFFI needs.
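For illustration, a sketch of how pinning looks on SBCL, using `sb-sys:with-pinned-objects`, `sb-sys:vector-sap` and `sb-sys:sap-ref-8`. The helper name `call-with-pinned-bytes` is hypothetical, and the sketch assumes a simple (unsigned-byte 8) vector; it is not how any CFFI backend is known to implement this.

```lisp
;; SBCL only. CALL-WITH-PINNED-BYTES is a hypothetical helper name.
#+sbcl
(defun call-with-pinned-bytes (function vector)
  "Call FUNCTION with a system-area pointer to the data of VECTOR,
which must be a (simple-array (unsigned-byte 8) (*)). The vector is
pinned, so the GC will not move it while FUNCTION runs."
  (check-type vector (simple-array (unsigned-byte 8) (*)))
  (sb-sys:with-pinned-objects (vector)
    (funcall function (sb-sys:vector-sap vector))))

;; Usage: write one byte through the raw pointer.
#+sbcl
(let ((v (make-array 4 :element-type '(unsigned-byte 8) :initial-element 0)))
  (call-with-pinned-bytes
   (lambda (sap) (setf (sb-sys:sap-ref-8 sap 1) 7))
   v)
  v)
```

Unlike a without-gcing approach, only the pinned object is held in place; the collector itself stays enabled for the rest of the heap.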
So is it a good idea to add an interface to CFFI that would, currently, be guaranteed to be efficient on some Lisps only?
That's a question the CFFI maintainers would have to decide. But my impression was that CFFI, in contrast to UFFI, is willing to implement features even if not every Lisp supports them.
Perhaps it would influence the decision to know which Lisps there are that can pin objects in memory.
* SBCL (only on gencgc ports, but the gencgc porting committee is meant to fix that real soon now... Implementing the macro on non-conservative gencgc ports would have to be done a little differently, but the page pinning mechanism should in theory work on those ports, too.)
* CMUCL does not have the macro SBCL offers, but having the same GC, it should be trivial to implement. I guess nobody has done that yet because CMUCL doesn't have preemptive threads and therefore less need for it.
* ECL with Boehm GC should have no trouble because objects don't move at all (?)
* CLISP: I guess not, but would it be hard to implement?
* OpenMCL: don't know
* The commercial Lisps: Don't know, but they'll implement it if their paying customers start asking for it. ;-)
d.