Re: [Gsll-devel] Introducing "Grid Structured Data"

21 Feb 2010

I agree there are different classes of usage, and it's certainly
my hope that whatever we adopt will be usable and convenient for both
cases.  I'm not sure there will be a dramatic difference in
efficiency; I think this might be a case of premature optimization.

Anyway, I'd like to make a distinction between surface syntax and core
implementation.  I think any algorithm can be implemented with any
surface syntax that we want, and it seems that a lot of the syntax of
xarray and grid is similar.  Where they differ, sometimes it's because
I ran out of time and didn't carry up through the layers some of what
Tamas did in affi that I see in xarray.  Other times it's
just a missing feature, like reduction.  As far as core implementation
goes, it seems like there ought to be a choice between affi and
xarray, presuming there's a difference in efficiency or some other
useful quality.

Liam

On Sun, Jan 24, 2010 at 7:24 PM, Mirko Vukovic <mirko.vukovic@gmail.com> wrote:
...
Some thoughts on the two interfaces (grid, xarray) discussed here ...
I am trying to figure out if we can classify different types of usage of
vector and matrix data.  The classification below is very rough with much
gray area in-between.
At some basic level, collections of numbers are either
  1. vectors and arrays to be processed by numerical algorithms, or
  2. just collections of numbers that will be parsed or processed by some
     semi-numerical algorithms.
Packages such as GSL and LAPACK will deal mostly with the first kind.
For other uses, like when dealing with results from multiple experiments, we
are using vectors and arrays as indexed storage with fast access, but there
may not be anything `algebraic' (in the sense of linear algebra) to those
collections.
In this second case, we may choose to process all the numbers in the
collection, or some random subset of them.  (In either case, vectorized
processing of those collections may be desired - Tamas has published a
package that does that).
It seems to me that Tamas' (now abandoned) `affi' package, on top of which
`grid' is built, is a natural fit for case 1 above, while xarray is a natural
fit for case 2 above.
In addition, someone noted that affi is probably faster than xarray (to be
verified), which is of paramount importance for the number-crunching
libraries.  (We first use non-numeric tools at the top level when parsing the
data, which then may pass the data to the number-crunchers in gsll, lla,
where speed is important.)
In that case, each of the two packages may have a valid role.  What would be
optimal would be a unified notation, in which case that of grid would be a
subset of that of xarray.
Mirko

Liam Healy