If N×M is large (say, 100×100), the cost of iterating over A is essentially amortized away.
Say the array is 1000 × 100 × 100.
The iteration itself is O(1000), but the cumulative cost of the inner function is O(1000 × 100 × 100), i.e. 10,000 times as much work as the loop itself, so the loop overhead is negligible. (My terminology is a bit loose here, but the point stands.)
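If you want to check that ratio on your own machine, here is a quick timing sketch; foo is just a stand-in for whatever per-slice computation you actually run:

import timeit
import numpy

data = numpy.random.rand(1000, 100, 100)

def foo(slab):
    # stand-in for the real per-slice work
    return slab.sum()

# cost of the bare Python loop, doing no work per slice
loop_only = timeit.timeit(lambda: [None for s in data], number=10)

# cost of the loop plus the per-slice computation
loop_plus_foo = timeit.timeit(lambda: [foo(s) for s in data], number=10)

print(loop_only, loop_plus_foo)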
I'm not sure it will help, but you could try the following:
import numpy

# preallocate the output array instead of building a Python list
result = numpy.empty(data.shape[0])
for i in range(len(data)):
    result[i] = foo(data[i])
You would save the memory of building an intermediate list, though the overhead of the explicit loop is slightly higher.
Or you could write a parallel version of the loop and split it across several processes. This can be much faster, depending on how expensive foo is (its per-slice work has to outweigh the cost of shipping the data between processes).
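A minimal sketch of that idea with multiprocessing; foo here is a placeholder, and it must be a picklable top-level function for Pool.map to work:

from multiprocessing import Pool

import numpy

def foo(slab):
    # placeholder for the real, expensive per-slice computation
    return slab.sum()

if __name__ == "__main__":
    data = numpy.random.rand(1000, 100, 100)
    with Pool() as pool:
        # each 100x100 slice is pickled and sent to a worker process
        result = numpy.array(pool.map(foo, data))

If foo is cheap, the pickling and inter-process transfer will dominate and this can end up slower than the plain loop.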