OK, I've now committed what I think is my last attempt at this. This latest version includes optimization of matrices and of setting values in foreign arrays. The latter comes with a caveat, however. The compiler will macroexpand setf, and for SBCL at least, that means making temporary variables to bind the actual values. Those temporaries of course don't carry any of the declarations made on the original variables, so no compiler macro expansion is done. The way around this is to funcall #'(setf gref*) instead of using the setf macro, something that I bet most people won't want to do. If I get some more energy to pursue this, I could define a setf macro to shadow cl:setf, but I think I'll let this be for now.
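To make the two calling styles concrete, here is a minimal sketch in plain Common Lisp. It does not use the library: my-ref and (setf my-ref) are hypothetical stand-ins for gref* and its setf function, and they have no compiler macro, so this only shows the shape of the two forms, not the speedup. What setf expands into is implementation-dependent, which is why the expansion is printed rather than stated.

```lisp
;; Hypothetical stand-ins for gref* and (setf gref*); not the library's code.
(defun my-ref (array index)
  (aref array index))

(defun (setf my-ref) (new-value array index)
  (setf (aref array index) new-value))

(defun demo ()
  (let ((a (make-array 3 :element-type 'double-float :initial-element 0d0)))
    (declare (type (simple-array double-float (3)) a))
    ;; Macro form: the expansion may rebind A and the index to fresh
    ;; temporaries, which do not carry the declaration above.
    (setf (my-ref a 0) 1d0)
    ;; Direct form: call the setf function yourself, so A is passed as
    ;; written. Note that the new value is the FIRST argument.
    (funcall #'(setf my-ref) 2d0 a 1)
    a))

;; Print what this implementation turns the macro form into.
(pprint (macroexpand-1 '(setf (my-ref a 0) 1d0)))
```

Both forms store the same value; the only difference is whether the arguments reach the setf function under their original, declared names.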
I've written a "timing test" function loosely based on yours. It is in foreign-array/tests/timing.lisp. I hope in the next few days to run a few of these tests to see the benefit of the optimizations that have been put in place. You can see I have commented out the funcall #'(setf grid:gref*) form at the end; in this example there are so few sets that the speed difference is undetectable.
Liam