The timing is unfortunate; I started on GSD a while ago, once I re-discovered Tamas's AFFI, and was wrapping it up when he announced xarray. I have looked over xarray to figure out what we have in common, but not in depth. I think any help from interested/energetic/knowledgeable volunteers who could help figure out a merge strategy would be most welcome in this area.
One of the things I did to integrate GSD with GSLL was to pull out the foreign-array access as the system "c-array", with type conversion between languages. This was motivated by the work of a colleague, who essentially tore off pieces of GSLL in order to make a port to a robotics library in C++, and who had requested certain array construction features in GSLL. Now it should be easier to build an interface to a new library, and I hope eventually to smoothly change array objects to make them compatible with different libraries. Unfortunately I haven't had the time I thought I would in January to pursue this, but there is certainly much more work to be done in this area. I pushed to the GSLL master branch because everything there works as before, and GSD is usable.
I think all of our goals are basically the same. Per Tamas's points:
1 - Exactly the same, present in GSD "grid" now
2 - Not yet in GSD, but that is the goal as I say
3 - Not there; would be nice to integrate with GSL's views as we have discussed
4 - Present in GSD; see "compose.lisp". Perhaps not everything everyone is looking for, but with some thought and knowledge of AFFI, the missing functions are not too hard to write.
The general functions that make most of the array composition functions possible are map-grid and map-n-grids. One thing I would like to have that isn't there yet is reduction over an axis, as for instance with dot product and matrix multiplication. Also, I would like to be able to treat a scalar as a grid, but that isn't possible yet.
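To make the missing reduction concrete, here is a minimal sketch on plain CL arrays. It does not use GSD or GSLL; reduce-axis and dot are hypothetical names chosen for illustration, and the sketch assumes a rank-2 array and a binary function with a sensible initial value.

```lisp
;; Hypothetical helpers, not part of GSD or GSLL.
(defun reduce-axis (function array axis &key (initial-value 0))
  "Reduce the rank-2 ARRAY along AXIS (0 or 1) with the binary FUNCTION.
Returns a vector indexed by the remaining axis."
  (destructuring-bind (rows cols) (array-dimensions array)
    (let* ((n-out (if (= axis 0) cols rows))   ; length of the result
           (n-in (if (= axis 0) rows cols))    ; length of the reduced axis
           (result (make-array n-out :initial-element initial-value)))
      (dotimes (k n-out result)
        (dotimes (i n-in)
          (setf (aref result k)
                (funcall function
                         (aref result k)
                         (if (= axis 0)
                             (aref array i k)
                             (aref array k i)))))))))

;; Dot product viewed as "multiply elementwise, then reduce over the axis".
(defun dot (u v)
  (reduce #'+ (map 'vector #'* u v)))

;; (reduce-axis #'+ #2A((1 2 3) (4 5 6)) 0)  ; => #(5 7 9)
;; (reduce-axis #'+ #2A((1 2 3) (4 5 6)) 1)  ; => #(6 15)
;; (dot #(1 2 3) #(4 5 6))                   ; => 32
```

Matrix multiplication fits the same pattern: an elementwise product over a shared index followed by a sum over that index.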
Liam
On Sun, Jan 10, 2010 at 2:39 AM, Tamas Papp tkpapp@gmail.com wrote:
Hi Mirko,
I only had a cursory look at Liam's library. Slices seem to be based on affine indexing, which in theory is faster than xarray's views, but it is less general.
At the moment, xarray does four things:
- rank/dimension query and element access. If it is an array-like object, you can just write methods and access it via a general interface. Essentially redoing CL's array interface, but with generic functions.
- object conversion framework. You can use (take 'array object) to convert your object to a Lisp array. TAKE is a generic function, you just specialize if you introduce new classes. It takes options (like element-type for arrays, etc). Think of R's as.* functions.
- views. This is the only part I blogged about, but it may be the least important. It is just that the rest is still under development.
- convenience functions for generating, mapping, manipulating array-like objects. If your objects have a basic xarray interface (basically xdims and xref), xarray can take outer products, map into other objects, etc. You can specialize methods for speed, too.
There are a few details I am still ironing out, but basically everything is functional now. I consider 1. and 2. above the most important features of xarray.
My goal is to arrive at a state like R, where I don't have to think about whether my matrix is R's standard matrix, or implemented by the Matrix library, if I don't want to. Things should "just work", and I should only have to think about details when I optimize my code.
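As a rough illustration of the "generic interface" idea in the list above, the sketch below defines a tiny dimension/element-access protocol with generic functions. The names sketch-dims, sketch-ref, diagonal-matrix and sum-elements are hypothetical stand-ins; they are assumptions for this example and not xarray's actual xdims/xref definitions.

```lisp
;; Hypothetical stand-ins for an xdims/xref-style protocol; names and
;; argument lists are assumptions, not xarray's actual API.
(defgeneric sketch-dims (object)
  (:documentation "Return the list of dimensions of OBJECT."))

(defgeneric sketch-ref (object &rest subscripts)
  (:documentation "Return the element of OBJECT at SUBSCRIPTS."))

;; Plain CL arrays satisfy the protocol directly.
(defmethod sketch-dims ((object array))
  (array-dimensions object))

(defmethod sketch-ref ((object array) &rest subscripts)
  (apply #'aref object subscripts))

;; A user-defined class: a diagonal matrix that stores only its diagonal.
(defclass diagonal-matrix ()
  ((diagonal :initarg :diagonal :reader diagonal)))

(defmethod sketch-dims ((object diagonal-matrix))
  (let ((n (length (diagonal object))))
    (list n n)))

(defmethod sketch-ref ((object diagonal-matrix) &rest subscripts)
  (destructuring-bind (i j) subscripts
    (if (= i j) (aref (diagonal object) i) 0)))

;; Code written against the protocol works for both representations.
(defun sum-elements (object)
  "Sum all elements of a rank-2 array-like OBJECT."
  (destructuring-bind (rows cols) (sketch-dims object)
    (let ((sum 0))
      (dotimes (i rows sum)
        (dotimes (j cols)
          (incf sum (sketch-ref object i j)))))))
```

With these definitions, sum-elements accepts either #2A((1 2) (3 4)) or a diagonal-matrix instance without knowing which one it was given; a faster method could later be specialized on diagonal-matrix.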
Caveat: xarray is still experimental. 1. above should be stable, 2. more or less, 3. might change, and 4. too. I want to get it right. Discussion about people's needs/preferences would be welcome.
Best,
Tamas
PS.: I am on the GSLL list, so no need to cc me. My Princeton e-mail is deprecated now.
On Sat, 09 Jan 2010, Mirko Vukovic wrote:
Any comments on how that compares to Tamas Papp's xarray?
I am very excited about, happy with, and grateful for the availability of numerical libraries in CL. But I am concerned that we may end up with several non-conformant libraries for array accessing.
In an ideal world, we would have a common library for array manipulation, and also a common architecture for accessing libraries such as gsl (C), netlib (fortran), and others (sundials for example).
Respectfully,
Mirko
On Sun, Dec 27, 2009 at 10:31 PM, Liam Healy lhealy@common-lisp.net wrote:
Based on several expressed wishes for the ability to create, compose and extract array-like objects and pieces of such objects, I have introduced the "Grid Structured Data" collection at http://repo.or.cz/w/gsd.git, and rewritten GSLL to be built on top of that. The GSLL interface is the same as before, but now it is possible to do subarrays, concatenation, slices, transpose, etc. on arrays (both CL arrays and GSLL marrays). There is some documentation for gsd in gsd/documentation/grid/index.html which describes how it works. There is more work to be done, but as it is now, it provides functions that people have asked for to create and manipulate marrays. If you don't need that, you can go on as before and everything is the same.
Liam
Gsll-devel mailing list Gsll-devel@common-lisp.net http://common-lisp.net/cgi-bin/mailman/listinfo/gsll-devel
My cursory look suggests a fair amount of similarity as well, and I find attractive features in both -- but want just one API!
A reasonable proposal, from my highly biased perspective, would be a 3rd party merge of the better components of each.
(I've spent time with the xarray interface, which is the source of my biases -- it mostly works for what I want to do, despite being a simplification of the lisp-matrix access approach -- which isn't bad; there is some lisp-matrix functionality which is strictly edge-case relevant...
On Sun, Jan 10, 2010 at 3:28 PM, Liam Healy lhealy@common-lisp.net wrote:
The timing is unfortunate; I started on GSD a while ago, once I re-discovered Tamas's AFFI, and was wrapping it up when he announced xarray. I have looked over xarray to figure out what we have in common, but not in depth. I think any help from interested/energetic/knowledgeable volunteers who could help figure out a merge strategy would be most welcome in this area.
One of the things I did to integrate GSD with GSLL was to pull out the foreign-array access as the system "c-array", with type conversion between languages. This was motivated by the work of a colleague, who essentially tore off pieces of GSLL in order to make a port to a robotics library in C++, and who had requested certain array construction features in GSLL. Now it should be easier to build an interface to a new library, and I hope eventually to smoothly change array objects to make them compatible with different libraries. Unfortunately I haven't had the time I thought I would in January to pursue this, but there is certainly much more work to be done in this area. I pushed to the GSLL master branch because everything there works as before, and GSD is usable.
I think all of our goals are basically the same. Per Tamas's points:
1 - Exactly the same, present in GSD "grid" now
2 - Not yet in GSD, but that is the goal as I say
3 - Not there; would be nice to integrate with GSL's views as we have discussed
4 - Present in GSD; see "compose.lisp". Perhaps not everything everyone is looking for, but with some thought and knowledge of AFFI, the missing functions are not too hard to write.
The general functions that make most of the array composition functions possible are map-grid and map-n-grids. One thing I would like to have that isn't there yet is reduction over an axis, as for instance with dot product and matrix multiplication. Also, I would like to be able to treat a scalar as a grid, but that isn't possible yet.
Liam
I don't think there's any doubt we all want one all-singing all-dancing interface that provides array utility for everything in CL that needs it. The reason there are two starts is that we both started in parallel, without either being aware of the other's work. The reason that GSD only got as far as it did was because I had addressed the complaints on this mailing list and elsewhere (including me) and I had no more time. I do not think of it as complete.
So the real work here is going through both sets of code and figuring out how to unify them. I think all the help we can get would be welcome; I'm certainly willing to work toward the goal.
To get the ball rolling, start with a feature at a user's level that you need; I mean by this some very specific thing that you've coded in CL already and found to be clumsy. Then see how each package implements it or would implement it. Then post to the list your findings, together with your example if possible, to start a discussion. Anyone can do this; I don't think there's a need to restrict to a "third party"; we lack a first party at the moment. By picking single feature(s) and working from there, we will incrementally get to the goal we all seek. If we try to do everything at once or at the most abstract level, we're likely not going to get there as quickly.
Liam
On Sun, Jan 10, 2010 at 11:50 AM, A.J. Rossini blindglobe@gmail.com wrote:
My cursory look suggests a fair amount of similarity as well, and I find attractive features in both -- but want just one API!
A reasonable proposal, from my highly biased perspective, would be a 3rd party merge of the better components of each.
(I've spent time with the xarray interface, which is the source of my biases -- it mostly works for what I want to do, despite being a simplification of the lisp-matrix access approach -- which isn't bad; there is some lisp-matrix functionality which is strictly edge-case relevant...
The only point of having a 3rd party do it is so that the current excellent efforts that you and Tamas are putting into this don't stall into an argumentative mess :-).
(can I have my cake and eat it, too? only if I and others put in the efforts...).
On Sun, Jan 10, 2010 at 9:20 PM, Liam Healy lhealy@common-lisp.net wrote:
I don't think there's any doubt we all want one all-singing all-dancing interface that provides array utility for everything in CL that needs it. The reason there are two starts is that we both started in parallel, without either being aware of the other's work. The reason that GSD only got as far as it did was because I had addressed the complaints on this mailing list and elsewhere (including me) and I had no more time. I do not think of it as complete.
So the real work here is going through both sets of code and figuring out how to unify them. I think all the help we can get would be welcome; I'm certainly willing to work toward the goal.
To get the ball rolling, start with a feature at a user's level that you need; I mean by this some very specific thing that you've coded in CL already and found to be clumsy. Then see how each package implements it or would implement it. Then post to the list your findings, together with your example if possible, to start a discussion. Anyone can do this; I don't think there's a need to restrict to a "third party"; we lack a first party at the moment. By picking single feature(s) and working from there, we will incrementally get to the goal we all seek. If we try to do everything at once or at the most abstract level, we're likely not going to get there as quickly.
Liam
Features that I would like:
1. Compatibility with GSLL and LAPACK (and NETLIB for that matter). When doing numerics, I would use grid (or xarray) and forget about cl-arrays.
2. More forgiving interface in array creation: coerce supplied values to declared type
3. syntactic sugar: refer to array subscripts using underscores: a_i_j or a_2:5_*
On that last point, I have a small utility that does the first example. I am not sure where to post it for your review. I think github is overkill to post three files (asd, package, and lisp).
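For readers wondering what such subscript sugar could look like, here is a minimal sketch. It is not Mirko's utility, which is not shown in the thread; split-on-underscore, subscript-form and sub are hypothetical names. The sketch covers only the a_i_j form. The range form a_2:5_* is left out, since under the standard readtable a colon inside a token is read as a package marker.

```lisp
;; Hypothetical sketch; not the utility mentioned in the message.
(defun split-on-underscore (string)
  "Split STRING at each underscore into a list of substrings."
  (loop with start = 0
        for end = (position #\_ string :start start)
        collect (subseq string start end)
        while end
        do (setf start (1+ end))))

(defun subscript-form (symbol)
  "Turn a symbol such as A_I_J into the form (AREF A I J).
Parts made only of digits become integers; other parts become symbols
interned in SYMBOL's home package (or *PACKAGE* if it has none)."
  (let ((package (or (symbol-package symbol) *package*)))
    (destructuring-bind (array-name &rest subscripts)
        (split-on-underscore (symbol-name symbol))
      `(aref ,(intern array-name package)
             ,@(mapcar (lambda (part)
                         (if (and (plusp (length part))
                                  (every #'digit-char-p part))
                             (parse-integer part)
                             (intern part package)))
                       subscripts)))))

(defmacro sub (symbol)
  "Expand (SUB A_I_J) into (AREF A I J) at macroexpansion time."
  (subscript-form symbol))

;; (subscript-form 'a_i_j)  ; => (AREF A I J)
;; (subscript-form 'm_0_1)  ; => (AREF M 0 1)
```

Doing the parsing in a macro keeps the run-time cost at zero: the compiled code contains an ordinary aref call.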
I am intrigued by xarray's generic interface, which lets xarray interface with `any type of object'. I fail to see its use now, but that is just my lack of imagination.
On a `lack of imagination' topic, can someone give me an example of indexing that xarray has, and that the affine indexing cannot accomplish?
Thanks,
Mirko
On Tue, 12 Jan 2010, Mirko Vukovic wrote:
Features that I would like:
- Compatibility with GSLL and LAPACK (and NETLIB for that matter). When doing numerics, I would use grid (or xarray) and forget about cl-arrays.
LLA, a Lisp interface to LAPACK, already supports xarray. LLA is on github, but it is undergoing a major rewrite at the moment --- drop me an e-mail if you are interested in the latest version (which I will push soon anyway).
- More forgiving interface in array creation: coerce supplied values to declared type
- syntactic sugar: refer to array subscripts using underscores: a_i_j or a_2:5_*
On that last point, I have a small utility that does the first example. I am not sure where to post it for your review. I think github is overkill to post three files (asd, package, and lisp).
paste.lisp.org
I am intrigued by xarray's generic interface, which lets xarray interface with `any type of object'. I fail to see its use now, but that is just my lack of imagination.
Imagine that you write your
(do-something-to-a matrix)
method, which is supposed to work on, say, GSLL matrices. You can define a fallback method as
(defmethod do-something-to-a (matrix)
  (do-something-to-a (take 'gsll:matrix matrix)))
and from then on, your method will work with all kinds of matrix-like objects for which you have defined take methods. Of course, conversion may not be fast, but for exploratory programming, you should not worry about this.
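The fallback pattern described above can be tried without xarray or GSLL installed. In the sketch below, convert-to is a hypothetical stand-in for a TAKE-style conversion generic function and trace-of plays the role of do-something-to-a; both names, and the choice of plain CL arrays as the target representation, are assumptions made for this example.

```lisp
;; CONVERT-TO is a hypothetical stand-in for a TAKE-style conversion;
;; it is not xarray's definition of TAKE.
(defgeneric convert-to (target-type object)
  (:documentation "Convert OBJECT to the representation named TARGET-TYPE."))

;; Arrays are already arrays.
(defmethod convert-to ((target-type (eql 'array)) (object array))
  object)

;; A nested list of rows becomes a rank-2 array.
(defmethod convert-to ((target-type (eql 'array)) (object list))
  (make-array (list (length object) (length (first object)))
              :initial-contents object))

(defgeneric trace-of (matrix)
  (:documentation "Sum of the diagonal elements of MATRIX."))

;; Primary method: specialized on the representation we compute with.
(defmethod trace-of ((matrix array))
  (let ((sum 0))
    (dotimes (i (min (array-dimension matrix 0)
                     (array-dimension matrix 1))
                sum)
      (incf sum (aref matrix i i)))))

;; Fallback: convert anything else, then dispatch again.
(defmethod trace-of (matrix)
  (trace-of (convert-to 'array matrix)))

;; (trace-of #2A((1 2) (3 4)))  ; => 5
;; (trace-of '((1 2) (3 4)))    ; => 5, via the fallback method
```

Supporting a new matrix-like class then only requires one new convert-to method; trace-of itself stays untouched.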
On a `lack of imagination' topic, can someone give me an example of indexing that xarray has, and that the affine indexing cannot accomplish?
E.g., if you want the first, second, and fifth rows of a matrix.
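A small sketch of why that selection is out of reach for a purely affine index map: such a map sends position i to offset + stride * i, so the selected indices must be equally spaced. Rows 0, 1 and 4 have gaps of 1 and 3. The helpers affine-index, affine-p and select-rows are hypothetical and written only for this illustration; they are not AFFI's or xarray's API.

```lisp
;; Hypothetical helpers for illustration only (not AFFI's API).
(defun affine-index (offset stride i)
  "Map position I of a slice to a parent index: OFFSET + STRIDE * I."
  (+ offset (* stride i)))

(defun affine-p (indices)
  "True if the list of integers INDICES is an arithmetic progression,
i.e. can be written as OFFSET + STRIDE * I for I = 0, 1, 2, ..."
  (or (null (rest indices))
      (let ((stride (- (second indices) (first indices))))
        (loop for (a b) on indices
              while b
              always (= (- b a) stride)))))

(defun select-rows (matrix rows)
  "Copy the listed ROWS of the rank-2 MATRIX into a new array.
Works for any list of row indices, affine or not."
  (let* ((cols (array-dimension matrix 1))
         (result (make-array (list (length rows) cols))))
    (loop for row in rows
          for k from 0
          do (dotimes (j cols)
               (setf (aref result k j) (aref matrix row j))))
    result))

;; (affine-p '(0 2 4))  ; => T    every other row: offset 0, stride 2
;; (affine-p '(0 1 4))  ; => NIL  first, second and fifth rows
```

The general selection in select-rows copies data here; a view-based design would instead keep the index list and translate subscripts on access.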
Tamas
Some thoughts on the two interfaces (grid, xarray) discussed here ...
I am trying to figure out if we can classify different types of usage of vector and matrix data. The classification below is very rough with much gray area in-between.
At some basic level, collections of numbers are either
1. vectors and arrays to be processed by numerical algorithms
2. just collections of numbers that will be parsed and processed by some semi-numerical algorithms
Packages such as GSL and LAPACK will deal mostly with the first kind.
For other uses, like when dealing with results from multiple experiments, we are using vectors and arrays as indexed storage with fast access, but there may not be anything `algebraic' (in the sense of linear algebra) to those collections.
In this second case, we may choose to process all the numbers in the collection, or some random subset of them. (In either case, vectorized processing of those collections may be desired - Tamas has published a package that does that).
It seems to me that Tamas' (now abandoned) `affi' package, on top of which `grid' is built, is a natural fit for case 1 above, while xarray is a natural fit for case 2 above.
In addition, someone noted that affi is probably faster than xarray (to be verified), which is of paramount importance for the number crunching libraries (We first use non-numeric tools at the top level when parsing the data, which then may pass the data to the number-crunchers in gsll, lla, where speed is important).
In that case, the two packages may each have a valid role. What would be optimal would be a unified notation, in which the notation of grid would be a subset of that of xarray.
Mirko
I agree there are different classes of usage, and it's certainly my hope that whatever we adopt will be usable and convenient for both cases. I'm not sure there will be a dramatic difference in efficiency; I think this might be a case of premature optimization.
Anyway, I'd like to make a distinction between surface syntax and core implementation. I think any algorithm can be implemented with any surface syntax that we want, and it seems that a lot of the syntax of xarray and grid is similar; where they differ, sometimes it's because I ran out of time and didn't carry up through the layers some of what Tamas did in affi that I see in xarray. Other times it's just a missing feature, like reduction. As far as core implementation goes, it seems like there ought to be a choice between affi and xarray, presuming there's a difference in efficiency or some other useful quality.
Liam
On Sun, Jan 24, 2010 at 7:24 PM, Mirko Vukovic mirko.vukovic@gmail.com wrote:
Some thoughts on the two interfaces (grid, xarray) discussed here ...
I am trying to figure out if we can classify different types of usage of vector and matrix data. The classification below is very rough with much gray area in-between.
At some basic level, collections of numbers are either
1. vectors and arrays to be processed by numerical algorithms
2. just collections of numbers that will be parsed and processed by some semi-numerical algorithms
Packages such as GSL and LAPACK will deal mostly with the first kind.
For other uses, like when dealing with results from multiple experiments, we are using vectors and arrays as indexed storage with fast access, but there may not be anything `algebraic' (in the sense of linear algebra) to those collections.
In this second case, we may choose to process all the numbers in the collection, or some random subset of them. (In either case, vectorized processing of those collections may be desired - Tamas has published a package that does that).
It seems to me that Tamas' (now abandoned) `affi' package, on top of which `grid' is built, is a natural fit for case 1 above, while xarray is a natural fit for case 2 above.
In addition, someone noted that affi is probably faster than xarray (to be verified), which is of paramount importance for the number crunching libraries (We first use non-numeric tools at the top level when parsing the data, which then may pass the data to the number-crunchers in gsll, lla, where speed is important).
In that case, the two packages may each have a valid role. What would be optimal would be a unified notation, in which the notation of grid would be a subset of that of xarray.
Mirko