At Fri, 08 Mar 2013 13:09:00 +0100, Nicolas Neuss wrote:
[I added some more people to the CC]
Akshay Srinivasan <akshaysrinivasan@gmail.com> writes:
At Thu, 07 Mar 2013 10:35:46 +0100, Nicolas Neuss wrote:
Akshay Srinivasan <akshaysrinivasan@gmail.com> writes:
I wanted to polish the whole "static object" thing I'm doing right now, by writing a thin layer over defclass itself. This seems an awful lot like the MOP.
What I'm doing right now is essentially that every tensor class has a set of inlined functions associated with it, which are then used inside macros everywhere to define type specific functions.
From what I understand of the MOP, I could define a metaclass which holds all these functions, and then define defmethod over instances of the metaclass (which sort of does the equivalent of macroexpansion?). Essentially, I don't want runtime dispatch; can I define, in some sense, a meta-version of defmethod/defgeneric on a class, so that it wraps the body of the method inside a symbol-macrolet which replaces things like "+" and "-" with something specialised for the class?
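Roughly the kind of expansion I have in mind, as a toy sketch (define-specialized and .+ are invented names, not Matlisp code): a defining macro that binds the arithmetic with macrolet, so the compiler sees fully declared calls instead of generic dispatch.

```lisp
;; Toy sketch: DEFINE-SPECIALIZED and .+ are invented names for
;; illustration.  The MACROLET rewrites the "generic" operator .+ into a
;; declared call to OP, so the compiler can open-code the arithmetic for
;; TYPE; no dispatch remains at runtime.
(defmacro define-specialized (name (a b) type op)
  `(defun ,name (,a ,b)
     (declare (type ,type ,a ,b)
              (optimize speed))
     (macrolet ((.+ (x y) `(the ,',type (,',op ,x ,y))))
       (.+ ,a ,b))))

;; Each expansion is a DEFUN whose body contains only declared,
;; type-specific arithmetic:
(define-specialized df+ (a b) double-float +)
(define-specialized sf+ (a b) single-float +)
```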
Although I already can imagine what you want quite well - could you maybe sketch an example (as simple as possible)?
I can't find any real resource on MOP, so forgive me if this doesn't make any sense whatsoever.
Chapters 5 and 6 of the AMOP book should be freely available; see
http://www.clisp.org/impnotes/mop-chap.html
or google for "mop_spec.pdf".
Nicolas: I know this is probably more up your alley. Do you think this sort of thing is possible with the MOP?
I doubt a little that the MOP is sufficient for achieving this, but I also would not find it very bad to use something like DEFMETHOD* instead of DEFMETHOD, and then you have all the liberty you want. As I see it, the MOP was created for achieving high flexibility, not high performance. Implementations like CMUCL and SBCL have a somewhat similar but more low-level mechanism (compiler transforms, IIRC) by which one can instruct the compiler to optimize operations using type information known at compile time.
Sigh. I was hoping to avoid doing all the superclass ordering and such, oh well. I essentially want to do things like the macro generate-typed-copy! in the file https://github.com/enupten/matlisp/blob/tensor/src/level-1/copy.lisp, without having to read things from a hash table every time. Maybe writing a new object system is overkill, and I should just use a macro like with-slots.
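Something like this is what I mean by a with-slots-like macro (all names here are invented for illustration; Matlisp stores this information differently): the per-class "inlined function" lives in a compile-time table, and a macrolet splices it in, so nothing is fetched from a hash table at runtime.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  ;; The per-class "inlined function", stored where the macro below can
  ;; read it at macroexpansion time.  A plist on the class symbol is used
  ;; here purely for illustration.
  (setf (get 'real-tensor 'value-writer)
        '(lambda (value store index)
           (setf (aref (the (simple-array double-float (*)) store) index)
                 (the double-float value)))))

(defmacro with-tensor-ops ((class) &body body)
  "Bind VALUE-WRITER locally so BODY expands against CLASS's specialized
writer.  The table lookup happens at macroexpansion time, not at runtime."
  (let ((writer-form (get class 'value-writer)))
    `(macrolet ((value-writer (v s i)
                  (list ',writer-form v s i)))
       ,@body)))

;; Usage: this expands into direct AREF writes on a declared simple-array.
(defun fill-store (store n)
  (with-tensor-ops (real-tensor)
    (dotimes (i n store)
      (value-writer (random 1d0) store i))))
```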
What about doing a type-specialized compilation on demand, as I do in femlisp/src/matlisp/blas-basic.lisp?
Alternatively, this could also be triggered when NO-APPLICABLE-METHOD is called.
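A minimal sketch of that NO-APPLICABLE-METHOD trigger (the generated body here is just REPLACE on sequences as a placeholder; in Matlisp it would come from something like generate-typed-copy!):

```lisp
;; When COPY! is called on a class pair it has never seen, build and
;; install a method for exactly that pair, then retry the call.  All
;; names are illustrative; the kernel is a placeholder.
(defgeneric copy! (from to))

(defmethod no-applicable-method ((gf (eql #'copy!)) &rest args)
  (destructuring-bind (from to) args
    ;; EVAL a DEFMETHOD specialized on the concrete argument classes.
    (eval `(defmethod copy! ((from ,(class-name (class-of from)))
                             (to ,(class-name (class-of to))))
             (replace to from)          ; placeholder for the real kernel
             to))
    (apply gf args)))

;; The first call installs a (CONS CONS) method, then runs it:
;; (copy! (list 1 2 3) (list 0 0 0)) => (1 2 3)
```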
I glanced through the code. It looks like you're incrementally encoding information about the loop ordering and what to do inside the loop? Am I right? Is there a version of m^* which uses such a thing, though? I've only read the source for t* (which was the inspiration for mod-idxtimes :)
Yes, I probably should've done that; but it got painful when I had to enumerate all the loop orderings for GEMM. The macros which generate the basic BLAS functions can be replaced with more elegant code in time.
Akshay
Apropos: I am still trying to build and run your Matlisp, without success. First, I had difficulties because f77 did not know the "exit" command used in "iladlr.f", for example. Using gfortran at least compiled the Fortran code; however, after compilation I am left in a state with apparently nothing new available.
Is this the Intel compiler?
From the manpages on my system:
gfortran - GNU Fortran compiler
f77=fort77 - invoke f2c Fortran translator transparently, like a compiler
I think this has changed recently. Some time ago, f77 was the GNU Fortran compiler.
I'll try compiling it with f77 over the weekend and report back.
[...]
; /home/neuss/.cache/common-lisp/sbcl-1.1.5.5-203e2ac-linux-x64/home/neuss/matlisp/src/sugar/ASDF-TMP-seq.fasl written
; compilation finished in 0:00:00.022
;
; compilation unit finished
;   printed 8 notes
** MATLISP is loaded. Type (HELP MATLISP) to see a list of available symbols. To use matlisp:
(use-package "MATLISP") or (in-package "MATLISP-USER")
- (help matlisp)
; in: HELP MATLISP
;     (HELP MATLISP)
;
; caught STYLE-WARNING:
;   undefined function: HELP
;
; caught WARNING:
;   undefined variable: MATLISP
;
; compilation unit finished
;   Undefined function:
;     HELP
;   Undefined variable:
;     MATLISP
;   caught 1 WARNING condition
;   caught 1 STYLE-WARNING condition
debugger invoked on a UNBOUND-VARIABLE in thread #<THREAD "main thread" RUNNING {10029D9833}>: The variable MATLISP is unbound.
Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.
((LAMBDA ())) 0] 0
- (apropos "matlisp")
COMMON-LISP-USER::MATLISP
:MATLISP (bound)
:MATLISP-TESTS (bound)
:MATLISP-USER (bound)
*MATLISP-VERSION* (bound)
MATLISP
MATLISP-HERALD (fbound)
MATLISP-NAME
MATLISP-VERSION (fbound)
SAVE-MATLISP (fbound)
MATLISP-FFI::MATLISP-SPECIALIZED-ARRAY
MATLISP-SYSTEM::MATLISP
MATLISP-SYSTEM::MATLISP-CONDITIONS
MATLISP-SYSTEM::MATLISP-CONFIG
MATLISP-SYSTEM::MATLISP-PACKAGES
MATLISP-SYSTEM::MATLISP-TESTS
MATLISP-SYSTEM::MATLISP-UTILITIES
Yes, the old help system isn't incorporated yet. I'm sorry that you have to tread through my undocumented code.
Assuming you want to test the GEMM, you'd want to do something like:
(in-package :matlisp)
(let ((A (make-real-tensor 1000 1000))
      (B (make-real-tensor 1000 1000)))
  ;; Slow and dynamic
  (time (mod-dotimes (idx (dimensions A))
          do (progn
               (setf (tensor-ref A idx) (random 1d0)
                     (tensor-ref B idx) (random 1d0)))))
  ;; Faster (although RANDOM slows it down quite a bit).
  #+nil
  (time (let-typed ((sto-a (store A) :type real-store-vector)
                    (sto-b (store B) :type real-store-vector))
          (mod-dotimes (idx (dimensions A))
            with (linear-sums (of-a (strides A) (head A))
                              (of-b (strides B) (head B)))
            do (progn
                 (real-typed.value-writer (random 1d0) sto-a of-a)
                 (real-typed.value-writer (random 1d0) sto-b of-b)))))
  ;; Use Lisp
  (let ((*real-l3-fcall-lb* 1000))
    (time (gemm 1d0 A B nil nil)))
  ;; Use Fortran
  (let ((*real-l3-fcall-lb* 0))
    (time (gemm 1d0 A B nil nil))))
I realised I haven't actually added all my "test" files into the repository. I'll add them to the repo today.
On my computer the timings are something like:
  Lisp: 3.2s
  C (tests/mm.c): 2.2s
  Goto: 0.2s
Akshay
OK, this works.
The timings were on SBCL, by the way. CCL sadly tends to be extremely slow; I don't know about other compiled Lisps.
Some further questions and remarks:
- Do you have also a reader macro like [...] in old Matlisp? And could you illustrate how slicing works?
No, there isn't. I'm trying to tweak Mark Kantrowitz's infix package to add slicing and the [..] declaration to it.
You can do the slicing in Lisp by doing things like:
(defvar X (make-real-tensor 10 10 10))
X
;; Get (:, 0, 0)
(sub-tensor~ X '((* * *) (0 * 1) (0 * 1)))
;; Get (:, 2:5, :)
(sub-tensor~ X '((* * *) (2 * 5)))
;; Get (:, :, 0:2:10) (Python's 0:10:2, i.e. [i : 0 <= i < 10, i mod 2 = 0])
(sub-tensor~ X '((* * *) (* * *) (0 2 10)))
The semantics of the slicing operator resemble Python's, except that the "step" of the slice always goes in the middle rather than at the end. I know this function is ugly, but it was written to serve as the backend for the infix reader (the parsing will then move into the reader).
- Looking at how complicated e.g. "gemm.lisp" is, I am not sure if doing this in CL is really worthwhile. Optimizing for small matrices might be the wrong idea from the beginning.
It's actually optimized for everything. The code in gemm.lisp is extraordinarily hairy because it contains the code for every loop order (of which there are 3), to take advantage of SSE when possible. It's not too bad, though. I know I could've written some code to automate this, but I only have so much time.
It calls BLAS when the size of the matrix exceeds *real-l3-fcall-lb*, and so works very well on matrices of all sizes. Calling Fortran for matrices of size less than 10 is quite expensive, though. Even this is not entirely clear: it appears that if you call the same Fortran function repeatedly in a loop, the overhead for subsequent foreign calls tends to be much less than for the first one. The power user can bind the variables in src/base/tweakable.lisp for fine-tuned optimization.
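In outline, the dispatch amounts to something like this (a sketch with stand-in kernel names, not the actual gemm.lisp code):

```lisp
;; The *REAL-L3-FCALL-LB* switch in outline: below the threshold stay in
;; Lisp, above it pay the foreign-call overhead and let Fortran do the
;; work.  LISP-GEMM! and BLAS-GEMM! are stand-ins for the real kernels.
(defvar *real-l3-fcall-lb* 100
  "Smallest matrix dimension for which the foreign BLAS call pays off.")

(defun lisp-gemm! (alpha a b n)          ; stand-in for the Lisp loops
  (declare (ignore alpha a b n)) :lisp)
(defun blas-gemm! (alpha a b n)          ; stand-in for the DGEMM call
  (declare (ignore alpha a b n)) :fortran)

(defun gemm-dispatch (alpha a b n)
  "Multiply the N x N matrices A and B by ALPHA, picking a backend by size."
  (if (< n *real-l3-fcall-lb*)
      (lisp-gemm! alpha a b n)
      (blas-gemm! alpha a b n)))

;; (gemm-dispatch 1d0 a b 10)   => :LISP
;; (gemm-dispatch 1d0 a b 1000) => :FORTRAN
```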
- I would be interested in the minimal amount of code necessary for adding some new LAPACK routine. If possible, the stub should be even smaller than in femlisp/src/matlisp/ggev.lisp and femlisp/src/matlisp/hgev.lisp (solutions of generalized eigenvalue problems).
I don't think writing the LAPACK routines themselves in Lisp is feasible; it doesn't look like Femlisp does that either, though I know the Lisplab project has its own code for LU ..
The code in src/lapack/getrs.lisp is not really all that bad. Sure, it's probably not as neat as that in lisp-matrix or Femlisp, but it's essentially doing the same thing.
The whole "polishing the code" thing I was referring to before is the phase where I steal code and ideas from each of your projects and build them around the current structure of Matlisp.
- I really am interested in single-float stuff too, because I will look more closely at generating high-performance code in the near future. In this domain, using single-float is often interesting, because it needs only half the memory, which can double efficiency in situations where memory bandwidth is the limiting factor.
Yes, single-float (and complex single-float) tensors should be quite useful. They should be very easy to add in the sense of basic functionality: all you have to do is define a new tensor type, as in src/classes/symbolic-tensor.lisp, and call each of the method-generation macros like generate-typed-gemm!.
I think it is trivial to generate methods when the arguments are all of the same type, but it will take some work when they are not (for instance, gemm with a real-matrix and a complex-matrix). Again, I have an idea of how to go about it, but I only have so much time on my hands. This part very much resembles how an object system would work. The combinatorial explosion also means that hand-coding the methods is going to be an utter pain. I want to get all the method-generation machinery working before I even bother with this.
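One way the method-generation machinery can tame that pairwise explosion, as a toy sketch (the names, the axpy kernel, and the naming scheme are all invented for illustration):

```lisp
;; Loop over the element types at macroexpansion time and emit one
;; specialized AXPY kernel per ordered (x-type, y-type) pair, named
;; AXPY/<TX>/<TY>!.  Purely illustrative, not Matlisp code.
(defmacro generate-axpy-kernels (&rest element-types)
  `(progn
     ,@(loop for tx in element-types
             append
             (loop for ty in element-types
                   collect
                   (let ((name (intern (format nil "AXPY/~a/~a!"
                                               (symbol-name tx)
                                               (symbol-name ty)))))
                     `(defun ,name (alpha x y)
                        ;; y <- alpha*x + y, coercing across element types.
                        (declare (type (simple-array ,tx (*)) x)
                                 (type (simple-array ,ty (*)) y))
                        (dotimes (i (length y) y)
                          (setf (aref y i)
                                (coerce (+ (* alpha (aref x i)) (aref y i))
                                        ',ty)))))))))

;; Two element types already give all four mixed kernels:
(generate-axpy-kernels single-float double-float)
;; e.g. AXPY/SINGLE-FLOAT/DOUBLE-FLOAT! reads single-floats and
;; accumulates into a double-float store.
```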
Akshay