Numid Sum of Antidiagonal Array

Given the number of ndarray, I would like to take the first two axes and replace them with a new axis, which is the sum of their anti-diagonals.

In particular, suppose I have variables x, y, z, ..., and the entries in my array represent the probability

array[i,j,k,...] = P(x=i, y=j, z=k, ...) 

I would like to get

 new_array[l,k,...] = P(x+y=l, z=k, ...) = sum_i P(x=i, y=li, z=k, ...) 

ie, new_array[l,k,...] is the sum of all array[i,j,k,...] such that i+j=l .

What is the most efficient and / or cleanest way to do this in numpy?

EDIT add: Following @hpaulj's recommendation, here is an obvious iterative solution:

 array = numpy.arange(30).reshape((2,3,5)) array = array / float(array.sum()) # make it a probability new_array = numpy.zeros([array.shape[0] + array.shape[1] - 1] + list(array.shape[2:])) for i in range(array.shape[0]): for j in range(array.shape[1]): new_array[i+j,...] += array[i,j,...] new_array.sum() # == 1 
+2
python arrays numpy scipy multidimensional-array
source share
1 answer

There is a trace function that gives the sum of the diagonal. You can specify the offset and 2 axes (0 and 1 are the default values). And to get anti-diagonal, you just need to flip one dimension. np.flipud does this, although it simply indexes [::-1,...] .

Putting them together

 np.array([np.trace(np.flipud(array),offset=k) for k in range(-1,3)]) 

matches your new_array .

It still goes over the possible values ​​of l (in this case 4). trace itself compiled.

In this small case, it is actually slower than your double loop (2x3 steps). Even if I move flipud from the inner loop, it is still slower. I do not know how this scales for large arrays.

Part of the problem with vectorizing this even further is the fact that each diagonal has a different length.

 In [331]: %%timeit array1 = array[::-1] np.array([np.trace(array1,offset=k) for k in range(-1,3)]) .....: 10000 loops, best of 3: 87.4 µs per loop In [332]: %%timeit new_array = np.zeros([array.shape[0] + array.shape[1] - 1] + list(array.shape[2:])) for i in range(2): for j in range(3): new_array[i+j] += array[i,j] .....: 10000 loops, best of 3: 43.5 µs per loop 

scipy.sparse has a dia format in which values ​​of nonzero diagonals are stored. It saves a populated array of values ​​along with offsets.

 array([[12, 0, 0, 0], [ 8, 13, 0, 0], [ 4, 9, 14, 0], [ 0, 5, 10, 15], [ 0, 1, 6, 11], [ 0, 0, 2, 7], [ 0, 0, 0, 3]]) array([-3, -2, -1, 0, 1, 2, 3]) 

Although this is a way around the problem of variable diagonal lengths, I don’t think it helps in this case when you just need their sums.

+3
source share

All Articles