No binary operators for structured arrays in Numpy?

Question


After going through the tutorials on NumPy's structured arrays, I can create some simple examples:

    from numpy import array, ones

    names = ['scalar', '1d-array', '2d-array']
    formats = ['float64', '(3,)float64', '(2,2)float64']
    my_dtype = dict(names=names, formats=formats)
    struct_array1 = ones(1, dtype=my_dtype)
    struct_array2 = array([(42., [0., 1., 2.], [[5., 6.], [4., 3.]])], dtype=my_dtype)

(My intended use case would have more than three fields and use very long 1d arrays.) Everything goes well until we try to do some basic math. I get errors for all of the following:

    struct_array1 + struct_array2
    struct_array1 * struct_array2
    1.0 + struct_array1
    2.0 * struct_array2

Apparently, simple operators (+, -, *, /) are not supported even for the simplest structured arrays. Or am I missing something? Should I look at another package (and please don't say Pandas, because it is total overkill for this)? This seems like an obvious capability, so I'm a little dumbfounded, yet it's hard to find any discussion of it on the net. Doesn't this greatly limit the usefulness of structured arrays? Why would anyone use a structured array rather than arrays packed into a dict? Is there a technical reason why this might be intractable? Or, if the right solution is to do the hard work of overloading, how is that done while keeping the operations fast?

+7

python binary-operators numpy structured-array

user2789194 Oct 13 '14 at 21:25


3 answers

On the NumPy structured array doc pages, most examples use mixed data types: floats, ints, and strings. On SO, most questions about structured arrays involve loading mixed data from CSV files. In your example, on the other hand, the main purpose of the structure seems to be to give names to the "columns".

You can do math on the named columns, for example:

    struct_array1['scalar'] + struct_array2['scalar']
    struct_array1['2d-array'] + struct_array2['2d-array']

You can also iterate over the fields:

    for n in my_dtype['names']:
        print(struct_array1[n] + struct_array2[n])

And yes, for this purpose, storing these arrays as values in a dictionary, or as attributes of an object, works just as well.
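The field-by-field approach can be wrapped in a small helper. The following is only a sketch: `fieldwise` is a hypothetical name, not a NumPy function, and it assumes both operands share the same structured dtype (or that the second operand is a plain scalar).

```python
import operator
import numpy as np

my_dtype = np.dtype({
    'names': ['scalar', '1d-array', '2d-array'],
    'formats': ['float64', '(3,)float64', '(2,2)float64'],
})

def fieldwise(op, a, b):
    """Hypothetical helper: apply a binary operator to each named field of `a`.

    `b` may be a structured array with the same field names or a plain scalar.
    """
    out = np.empty_like(a)
    b_is_struct = isinstance(b, np.ndarray) and b.dtype.names is not None
    for name in a.dtype.names:
        rhs = b[name] if b_is_struct else b
        out[name] = op(a[name], rhs)
    return out

a = np.array([(1., [1., 1., 1.], [[1., 1.], [1., 1.]])], dtype=my_dtype)
b = np.array([(42., [0., 1., 2.], [[5., 6.], [4., 3.]])], dtype=my_dtype)

total = fieldwise(operator.add, a, b)      # a + b, one field at a time
doubled = fieldwise(operator.mul, b, 2.0)  # 2.0 * b, one field at a time
```

Each field operation is still an ordinary vectorized NumPy call, so the Python-level loop runs once per field rather than once per element.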

However, thinking about the CSV case, sometimes we want to talk about specific "rows" of a CSV file or structured array, e.g. struct_array[0]. Such a "row" is a tuple of values.

In any case, the primary data structures in NumPy are multidimensional arrays of numeric values, and most of the code revolves around the numeric data types: float, int, etc. Structured arrays are a generalization of this, using elements that are, in essence, just fixed sets of bytes. How those bytes are interpreted is determined by the dtype.
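To make the "fixed sets of bytes" idea concrete, here is a small sketch that inspects the byte layout of the dtype from the question. It assumes the default packed layout (no alignment padding was requested).

```python
import numpy as np

my_dtype = np.dtype({
    'names': ['scalar', '1d-array', '2d-array'],
    'formats': ['float64', '(3,)float64', '(2,2)float64'],
})

# Each element occupies 8 + 3*8 + 4*8 = 64 bytes.
print(my_dtype.itemsize)  # 64

# dtype.fields maps each name to (field dtype, byte offset[, title]).
offsets = [my_dtype.fields[name][1] for name in my_dtype.names]
print(offsets)  # [0, 8, 32]
```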

Think about how MATLAB evolved. First came matrices, then cells (like Python lists), then structures, and finally classes and objects. Python already had lists, dictionaries, and objects; NumPy adds arrays. It does not need to reinvent the general Python structures.

I am inclined to define such a class:

    import numpy as np

    class Foo(object):
        def __init__(self):
            self.scalar = 1
            self._1d_array = np.arange(10)
            self._2d_array = np.array([[1, 2], [3, 4]])

and implement only the binary operations that the application really needs.
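As a rough illustration of that idea, the sketch below overloads only `+` and scalar `*`. The constructor arguments and the attribute names `arr1d` and `arr2d` are assumptions made for this example; they are not part of the class above.

```python
import numpy as np

class Foo:
    """Hypothetical container; only + and scalar * are implemented."""

    def __init__(self, scalar, arr1d, arr2d):
        self.scalar = float(scalar)
        self.arr1d = np.asarray(arr1d, dtype=float)
        self.arr2d = np.asarray(arr2d, dtype=float)

    def __add__(self, other):
        if isinstance(other, Foo):
            return Foo(self.scalar + other.scalar,
                       self.arr1d + other.arr1d,
                       self.arr2d + other.arr2d)
        if np.isscalar(other):
            return Foo(self.scalar + other,
                       self.arr1d + other,
                       self.arr2d + other)
        return NotImplemented

    __radd__ = __add__  # addition is commutative here, so 1.0 + foo works

    def __mul__(self, factor):
        if np.isscalar(factor):
            return Foo(self.scalar * factor,
                       self.arr1d * factor,
                       self.arr2d * factor)
        return NotImplemented

    __rmul__ = __mul__  # so 2.0 * foo works

x = Foo(1, [1, 1, 1], [[1, 1], [1, 1]])
y = Foo(42, [0, 1, 2], [[5, 6], [4, 3]])
z = x + y
w = 2.0 * y
```

Returning `NotImplemented` for unsupported operands lets Python raise its usual `TypeError`, which keeps the set of supported operations explicit.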

+4

hpaulj Oct 13 '14 at 23:05


Well, after more research, I came across an answer. (No fault of hpaulj; the question was not posed very well.) I wanted to post it in case anyone else runs into the same frustration.

The answer comes from the numpy documentation on ndarray.view. They specifically provide an example in which they "[create] a view on a structured array so that it can be used in calculations."

I was frustrated that I could not do arithmetic on my example structured arrays. After all, I "see" my structured array as just a collection of floating point numbers! In the end, all I needed was to tell NumPy about this abstraction using "view". The errors in the question can be avoided by using:

    (struct_array1.view(dtype='float64') + struct_array2.view(dtype='float64')).view(dtype=my_dtype)
    (struct_array1.view(dtype='float64') * struct_array2.view(dtype='float64')).view(dtype=my_dtype)
    (1.0 + struct_array2.view(dtype='float64')).view(dtype=my_dtype)
    (2.0 * struct_array2.view(dtype='float64')).view(dtype=my_dtype)

It is not as elegant as one might like, but at least the capability is there.
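To cut down on the repetition, the view round trip can be wrapped in a helper. This is a sketch only: `via_float_view` is a hypothetical name, and the trick is valid only when every field has a float64 base type and the layout is packed, which the helper checks before proceeding.

```python
import operator
import numpy as np

my_dtype = np.dtype({
    'names': ['scalar', '1d-array', '2d-array'],
    'formats': ['float64', '(3,)float64', '(2,2)float64'],
})

def _is_struct(x):
    return isinstance(x, np.ndarray) and x.dtype.names is not None

def via_float_view(op, a, b):
    """Hypothetical helper: compute op(a, b) on float64 views of the operands
    and reinterpret the result with the original structured dtype."""
    struct_dtype = a.dtype if _is_struct(a) else b.dtype
    # Guard: all fields must be float64-based and packed without padding.
    fields = [struct_dtype.fields[n][0] for n in struct_dtype.names]
    if any(f.base != np.float64 for f in fields):
        raise TypeError("all fields must have a float64 base type")
    if sum(f.itemsize for f in fields) != struct_dtype.itemsize:
        raise TypeError("dtype must be packed without padding")
    lhs = a.view('float64') if _is_struct(a) else a
    rhs = b.view('float64') if _is_struct(b) else b
    return op(lhs, rhs).view(struct_dtype)

a = np.array([(1., [1., 1., 1.], [[1., 1.], [1., 1.]])], dtype=my_dtype)
b = np.array([(42., [0., 1., 2.], [[5., 6.], [4., 3.]])], dtype=my_dtype)

product = via_float_view(operator.mul, a, b)
shifted = via_float_view(operator.add, 1.0, b)
```

With mixed field types (say, an int field next to a float field) a single flat view would misinterpret the bytes, so a per-field approach would be needed instead.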

0

user2789194 Oct 17 '14 at 0:49


hpaulj · Accepted Answer · 2014-10-24T16:23:19+0000

Another way to work with the array as a whole is to use the "union" dtype described in the documentation. In your example, you can extend your dtype by adding a "union" field and specifying overlapping "offsets":

    from numpy import array, ones, zeros

    names = ['scalar', '1d-array', '2d-array', 'union']
    formats = ['float64', '(3,)float64', '(2,2)float64', '(8,)float64']
    offsets = [0, 8, 32, 0]
    my_dtype = dict(names=names, formats=formats, offsets=offsets)
    struct_array3 = zeros((4,), dtype=my_dtype)

struct_array3['union'] now gives access to all of the data as an (n,8) array:

    struct_array3['union']        # == struct_array3.view('(8,)f8')
    struct_array3['union'].shape  # (4,8)

You can work with the "union" or any other fields:

    struct_array3['union'] += 2
    struct_array3['scalar'] = 1

The union field may have any other compatible shape, for example '(2,4)float64'. A "row" of such an array might look like this:

    array([(3.0, [0.0, 0.0, 0.0], [[2.0, 2.0], [0.0, 0.0]],
            [[3.0, 0.0, 0.0, 0.0], [2.0, 2.0, 0.0, 0.0]])],
          dtype={'names': ['scalar', '1d-array', '2d-array', 'union'],
                 'formats': ['<f8', ('<f8', (3,)), ('<f8', (2, 2)), ('<f8', (2, 4))],
                 'offsets': [0, 8, 32, 0],
                 'itemsize': 64})
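The union field can also stand in for the missing binary operators. The sketch below adds two structured arrays through their overlapping field; the variable names and sample values are made up for illustration, and the approach again assumes that every field is float64.

```python
import numpy as np

union_dtype = np.dtype({
    'names': ['scalar', '1d-array', '2d-array', 'union'],
    'formats': ['float64', '(3,)float64', '(2,2)float64', '(8,)float64'],
    'offsets': [0, 8, 32, 0],  # 'union' overlaps the other three fields
})

a = np.zeros(2, dtype=union_dtype)
b = np.zeros(2, dtype=union_dtype)
a['scalar'] = 1.0
b['1d-array'] = [0., 1., 2.]  # broadcast to every row

# One vectorized addition over all 8 floats of every row.
total = np.zeros_like(a)
total['union'] = a['union'] + b['union']
```

Because the fields share memory, writing to `total['union']` also fills in `total['scalar']`, `total['1d-array']`, and `total['2d-array']`.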
