You can always use np.einsum:
>>> a = np.arange(11*5*5).reshape(11,5,5)
>>> np.einsum('...ijk->...i',a)/(a.shape[-1]*a.shape[-2])
array([ 12, 37, 62, 87, 112, 137, 162, 187, 212, 237, 262])
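As a quick sanity check (my own sketch, not part of the original timings; `b` is just a float copy of the same data so the comparison is exact), the einsum expression matches the built-in reduction over both trailing axes:

>>> b = np.arange(11*5*5, dtype=np.float64).reshape(11, 5, 5)
>>> np.allclose(np.einsum('...ijk->...i', b) / (b.shape[-1]*b.shape[-2]), b.mean(axis=(1, 2)))
True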
It also works on arrays with more dimensions (as will all of these methods, provided the axis labels are adjusted accordingly):
>>> a = np.arange(10*11*5*5).reshape(10,11,5,5)
>>> (np.einsum('...ijk->...i',a)/(a.shape[-1]*a.shape[-2])).shape
(10, 11)
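To spell out what "adjusting the axis labels" looks like (my own sketch; the original only averages over the last two axes), the explicit subscripts simply move to whichever axes you want to reduce. For example, the mean over the two leading axes instead:

>>> a = np.arange(10*11*5*5).reshape(10, 11, 5, 5)
>>> (np.einsum('ij...->...', a) / (a.shape[0] * a.shape[1])).shape
(5, 5)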
It is also faster:
a = np.arange(11*5*5).reshape(11,5,5)

%timeit a.reshape(11, 25).mean(axis=1)
10000 loops, best of 3: 21.4 us per loop

%timeit a.mean(axis=(1,2))
10000 loops, best of 3: 19.4 us per loop

%timeit np.einsum('...ijk->...i',a)/(a.shape[-1]*a.shape[-2])
100000 loops, best of 3: 8.26 us per loop
It also scales slightly better than the other methods as the size of the array increases.
Using dtype=np.float64 does not change the timings noticeably, but just to double-check:
a = np.arange(110*50*50,dtype=np.float64).reshape(110,50,50)

%timeit a.reshape(110,2500).mean(axis=1)
1000 loops, best of 3: 307 us per loop

%timeit a.mean(axis=(1,2))
1000 loops, best of 3: 308 us per loop

%timeit np.einsum('...ijk->...i',a)/(a.shape[-1]*a.shape[-2])
10000 loops, best of 3: 145 us per loop
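If you want to rerun this comparison outside IPython, here is a sketch of my own using the standard library timeit module (the labels and loop structure are just for illustration; absolute numbers will differ by machine and NumPy version):

import timeit
import numpy as np

a = np.arange(110*50*50, dtype=np.float64).reshape(110, 50, 50)

# Three equivalent ways of taking the mean over the last two axes.
candidates = {
    'reshape + mean': lambda: a.reshape(110, 2500).mean(axis=1),
    'mean(axis=(1, 2))': lambda: a.mean(axis=(1, 2)),
    'einsum': lambda: np.einsum('...ijk->...i', a) / (a.shape[-1] * a.shape[-2]),
}

for label, func in candidates.items():
    # Average time per call in microseconds; only the relative ordering
    # is meaningful across machines.
    t = timeit.timeit(func, number=1000) / 1000 * 1e6
    print('%-18s %.1f us per call' % (label, t))

Varying the array size in this sketch is also an easy way to check the scaling claim on your own machine.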
Also interesting:
%timeit np.sum(a)
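The output of that last timing is not shown above. For anyone who wants to rerun the comparison, the einsum counterpart of a full sum is below (my own sketch of the expressions being compared, with no timing numbers implied):

>>> a = np.arange(110*50*50, dtype=np.float64).reshape(110, 50, 50)
>>> total_einsum = np.einsum('ijk->', a)   # einsum equivalent of np.sum(a)
>>> np.allclose(np.sum(a), total_einsum)
True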