Re: [Gsll-devel] practical differences between native CL array and the GSLL marray type?

15 Apr 2009


      I certainly have no objections to any contributions.  Go ahead and
write a method and post it here.
I would make a simple (numerical-equal (cl-array a) (cl-array b) ...)
kind of test.

Liam

On Wed, Apr 15, 2009 at 10:46 AM, Malcolm Reynolds
<malcolm.reynolds@gmail.com> wrote:
...
Thanks for adding this so quick Thomas! Just pulled down the new
version and it looks good. The specialisation for gsll-marrays looks
like it should be quite easy, I might even tackle it myself if Liam
has no objections..
One thing that would also be useful to me in my current work is to
compare matrices for some degree of approximate equality - eg if Y is
the pseudoinverse of the matrix X, then the properties XYX = X and YXY
= Y should hold, but there is likely to be some numerical aberrations.
I thought about a function for numerical-approximately-equal, with a
tolerance parameter, but now I think probably the best way to handle
this is to pass in a custom :test function. Would you agree?
Malcolm
On Wed, Apr 15, 2009 at 5:02 AM, Thomas M. Hermann <tmh.public@gmail.com> wrote:
...
The NUMERICAL-EQUAL function has been implemented as a generic function
in LISP-UNIT. It is specialized for the system classes NUMBER, LIST,
VECTOR and ARRAY. It has been lightly tested. The documentation has not
been updated. Please note that when you specialize it for your data
type, a congruent argument list is (RESULT1 RESULT2 &KEY TEST). This
will be clear when the documentation is updated.
Also note that the versions specialized for the CL system classes are
for RESULT1 and RESULT2 of the same class, so, for example
(NUMERICAL-EQUAL LIST ARRAY) ; Not defined => an error
will throw an error. I'll get around to defining the mixed class
versions later. Sooner if there is a real need for them.
Please feel free to email me bugs and further suggestions.
Cheers,
Tom
Liam Healy wrote:
...
I haven't defined such a predicate, but I think it's a good idea.  As
you have seen, in order to do tests on marrays, I turn all the marrays
into CL arrays and do the comparisons on the result.  I don't know if
it's the best possible way to do it, but it works.  You get an error
from assert-numerical-equal because it uses #'numerical-equal as a
test, and that does not handle marrays.   Your suggestion spurred me
to take a look at the lisp-unit source; what I'm thinking about is
changing  #'numerical-equal to a generic function with some methods
pre-defined for CL classes; this would permit applications like GSLL
to add methods for their own objects.
I've cc'ed Tom Hermann on this email; he is behind the modified
version of lisp-unit more so than I am.
Tom: what do you think of this idea?
Liam
On Mon, Apr 13, 2009 at 11:05 AM, Malcolm Reynolds
<malcolm.reynolds@gmail.com> wrote:
...
On a related point, I'm writing some unit tests using your modified
version of lisp-unit (the one from repo.or.cz) and I haven't as yet
been able to figure out a good way to compare two matrices for
equality. In the automatically generated tests in the GSLL code it
appears you do (cl-array ...) to convert the matrices back to native
form. Is this the best way possible, or is there some numerical
equality predicate in the gsll library that I've missed? It would seem
to be ideal to be able to
define a test by (assert-numerical-equal <marray 1> <marray 2>) but I
get an error message when I try this...
Malcolm
On Mon, Apr 13, 2009 at 2:24 PM, Malcolm Reynolds
<malcolm.reynolds@gmail.com> wrote:
...
Hi Liam, thanks for your response.
Glad to hear that marrays are recommended, and especially that they
are implemented well on SBCL as that is what I'm using.
As for the million element square matrices then yes, clearly on
reflection that won't work. Some of the matrices I'm using will be
sparse, but some will not be sparse at all - without wanting to get
bogged down in details, I'm going to need to represent the Graph
Laplacian, which is a matrix that is equal to the degree matrix minus
the adjacency matrix (and so in a fairly sparse graph this matrix is
sparse) but I will also need the graph Kernel, which is the
pseudoinverse of the Laplacian. This will not be sparse at all, so
clearly a direct representation of it isn't possible. I know of
someone in my university's department who has a fast method for this
kind of situation which means that calculating the full kernel matrix
is unnecessary, so I guess I can look into that if and when I find
that performance is a limiting factor.
Anyway thanks again for your response, and for your hard work on GSLL.
I'm pretty new to lisp and I'm continuously impressed by the quality
of the libraries!
Malcolm
On Sat, Apr 11, 2009 at 3:46 PM, Liam Healy <lhealy@common-lisp.net> wrote:
...
Yes, it is definitely my recommendation to use marrays
everywhere.  I am starting to get in the habit of doing so
myself.  One of my ideas is to extend CL array operations to
marrays so that you need not convert back and forth.  It is
also my intent that other math software libraries would also
use marrays (or an extension of them), so that they become a
standard way to interchange data between libraries.
There is more overhead for marrays than plain arrays, but on
SBCL where the native arrays are used in GSL directly, there
is very little, and there is no copying.  I recommend that
you build it with marray and then see if you have
performance problems.  However, if you are talking matrices
of 1000000x1000000, I think you will have problems in any
representation on almost any kind of hardware.  For single
floats, one such matrix is 4TB.  In that case, I recommend
you rethink the problem structure.  If these are sparse
matrices, you should use a sparse matrix representation.
GSL doesn't have such a representation, but others have been
interested in having it so it might be worth posting
something to the GSL mailing list -- perhaps someone has
written a GSL-compatible sparse array representation.
Liam
On Sat, Apr 11, 2009 at 8:12 AM, Malcolm Reynolds
<malcolm.reynolds@gmail.com> wrote:
> Hi all, hope this is the right place for this kind of request.
>
> I'm writing a program which will be doing Machine Learning over graph
> data structures. In it I'm going to be needing to represent a bunch of
> matrices, some sparse, some definitely not sparse, and do fairly basic
> operations on these matrices (add/ subtract/multiply, pseudoinverse,
> probably some others). I've started a basic prototype using the
> standard CL arrays but I always knew I would need GSLL to calculate a
> pseudo inverse.
>
> My question is, would it be a reasonably sensible decision to just use
> marrays for the whole program rather than worrying
> about converting to/ from CL-type arrays all the time? Are there any
> places where CL-arrays beat marrays either in terms of memory usage,
> access speed, or anything else? I might be needing to scale this
> system up be dealing with N*N square matrices where N is on the order
> of hundreds of thousands or even millions, if that makes any
> difference to the recommendation.
>
> Thanks in advance for any help!
>
> Malcolm Reynolds
--
=== Thomas M. Hermann ===