Liam,
I have tried this (without wrapping the call in (the double-float (...)), since that is unnecessary for cffi:mem-aref with a hardcoded :double data type) and get the following timing data:
using the standard grid:gref routine:

  Evaluation took:
    0.542 seconds of real time
    0.536671 seconds of total run time (0.525229 user, 0.011442 system)
    [ Run times consist of 0.037 seconds GC time, and 0.500 seconds non-GC time. ]
    99.08% CPU
    1,189,729,211 processor cycles
    153,334,416 bytes consed
using the standard grid:gref* routine:

  Evaluation took:
    0.067 seconds of real time
    0.066006 seconds of total run time (0.063179 user, 0.002827 system)
    [ Run times consist of 0.008 seconds GC time, and 0.059 seconds non-GC time. ]
    98.51% CPU
    146,192,222 processor cycles
    27,060,416 bytes consed
using the modified grid:gref* routine (specialized to grid:vector-double-float):

  Evaluation took:
    0.070 seconds of real time
    0.068041 seconds of total run time (0.065151 user, 0.002890 system)
    [ Run times consist of 0.011 seconds GC time, and 0.058 seconds non-GC time. ]
    97.14% CPU
    152,642,611 processor cycles
    27,061,456 bytes consed
Apparently my system didn't notice the difference. Also, SBCL complains about the argument (grid::foreign-pointer object) to cffi:mem-aref being of type NUMBER instead of integer or fixnum. Furthermore, it says it has to do float to pointer coercion to <return-value>. I checked that the redefined method is actually used by doing a second run with a print statement added to the defmethod.
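For what it is worth, SBCL's "float to pointer coercion" note normally means that a raw double-float is being boxed (heap-allocated) so that it can be returned through an ordinary full function call, which would be consistent with the consing seen above. A minimal sketch of one way to avoid that, assuming CFFI is loaded: an inlined, type-declared accessor. The name fast-ref is hypothetical and is not part of GSLL or GSD.

```lisp
;; Sketch only. Assumes CFFI is already loaded, e.g. via (asdf:load-system :cffi).
;; FAST-REF is a hypothetical name, not an existing GSLL/GSD function.
(declaim (inline fast-ref))
(defun fast-ref (pointer index)
  "Read the INDEX-th double from the foreign memory at POINTER."
  (declare (type cffi:foreign-pointer pointer)
           (type fixnum index))
  ;; The literal :double lets CFFI resolve the element type at compile time.
  (cffi:mem-aref pointer :double index))
```

Because the function is declared inline, a caller compiled after this definition can keep the result unboxed; a generic function such as gref* cannot be inlined this way, since it is dispatched at run time.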
using cffi:mem-aref directly:

  Evaluation took:
    0.002 seconds of real time
    0.002483 seconds of total run time (0.002482 user, 0.000001 system)
    100.00% CPU
    5,442,756 processor cycles
    0 bytes consed
I guess that even using compiler macros or other trickery one would have to remove the allocation of linearized indices and foreign ptr addresses from the inner loops as I have done in my example by using auxiliary variables zvector-fptr and output-fptr. Maybe one can define something like (with-foreign-array (name-of-array :double) ...) that locally redefines (grid:gref name-of-array ...) and (grid:gref* name-of-array ...) as macros evaluating to cffi:mem-aref and storing the respective linearized indices and memory pointers at the level of with-foreign-array? Although not as convenient as some 'self-optimizing' grid:gref, I would consider this a satisfactory solution. Don't know how to do that without getting lost in a forest of commas and backquotes, though.
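A rough sketch of what such a with-foreign-array could look like, assuming CFFI is loaded. This is an illustration, not GSLL code: the macro takes the pointer form explicitly (for a grid object it would be something like (grid::foreign-pointer object)), evaluates it once outside the loop, and binds a local macro named after the array that expands into cffi:mem-aref. It does not shadow grid:gref itself, and building the inner expansion with list keeps the nested backquotes away.

```lisp
;; Sketch only. Assumes CFFI is already loaded, e.g. via (asdf:load-system :cffi).
;; WITH-FOREIGN-ARRAY and SUM-DOUBLES are hypothetical names for illustration.
(defmacro with-foreign-array ((name pointer-form element-type) &body body)
  "Evaluate POINTER-FORM once; inside BODY, (NAME index) reads element INDEX
directly through CFFI:MEM-AREF with the literal ELEMENT-TYPE."
  (let ((ptr (gensym "PTR")))
    `(let ((,ptr ,pointer-form))          ; pointer is fetched once, outside any loop
       (macrolet ((,name (index)
                    ;; Expands to (cffi:mem-aref <ptr> <element-type> index).
                    (list 'cffi:mem-aref ',ptr ',element-type index)))
         ,@body))))

(defun sum-doubles (pointer n)
  "Sum the first N doubles stored at POINTER."
  (declare (type fixnum n))
  (with-foreign-array (z pointer :double)
    (let ((sum 0d0))
      (declare (type double-float sum))
      (dotimes (i n sum)
        (incf sum (z i))))))
```

Since (z i) macroexpands into a cffi:mem-aref form, (setf (z i) value) should also work for writing. Handling multi-dimensional indices would still need the linearization to be done, ideally once per loop nest, before calling the local accessor.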
best regards, Sebastian
On 27.10.2010, at 05:22, Liam Healy wrote:
Sebastian,
Can you temporarily define this and find the timing/consing for your test case:
  (defmethod gref* ((object vector-double-float) linearized-index)
    (cffi:mem-aref (foreign-pointer object) :double linearized-index))
(I think you don't use any matrices but if you do, define an analogous function for matrix-double-float.)
As you can see, it has the literal type declaration, and I'm hopeful that CFFI will pick that up and make this competitive in speed with the best that you saw. If that's so, it should be fairly easy for me to make this generic and incorporate it into GSD. I'm still interested in making the linearization more efficient if that's still significant, but let's try this for now to see how much speed we can squeeze out of gref*.
Thanks,
Liam
Sebastian,
I'm very puzzled by this. You should now have the type :double hard-wired in the CFFI call. Can you run your big test with and without this function defined, instead of just the grid:gref* test? If it still shows no improvement, that says that the problem is not in the binding of type at runtime in the cffi:mem-aref call. Or if you'd like, send me the test (I know you posted pieces before but just to be sure, send the whole file again) and I will try it.
Liam
On Thu, Oct 28, 2010 at 10:51 AM, Sebastian Sturm Sebastian.Sturm@itp.uni-leipzig.de wrote:
[...]
GSLL-devel mailing list GSLL-devel@common-lisp.net http://common-lisp.net/cgi-bin/mailman/listinfo/gsll-devel