Efficient 2D cumsum

Say I have an array like this:

    >>> a = np.arange(1,8).reshape((1,-1))
    >>> a
    array([[1, 2, 3, 4, 5, 6, 7]])

and I want to compute, for each element of a, the "cumsum of the next 4 items". That is, my expected result is

    1,       2,       3, 4, 5, 6, 7, 8
    1+2,     2+3,     ...
    1+2+3    2+3+4    ...
    1+2+3+4  2+3+4+5  ...

i.e. a matrix containing

    1,  2,  3,  4,  5,  0, 0, 0
    3,  5,  7,  9,  11, 0, 0, 0
    6,  9,  12, 15, 18, 0, 0, 0
    10, 14, 18, 22, 26, 0, 0, 0

Since the cumsum operation cannot be performed correctly for the last 3 elements, I expect 0 there. I know how to do a single cumsum. In fact, the result is the arrays

 a[:4].cumsum().reshape((-1,1)); a[1:5].cumsum().reshape((-1,1))... 

stacked horizontally. However, I do not know how to do this efficiently. What would be a nice vectorized NumPy way of doing this? I am also open to SciPy packages if they beat NumPy in terms of efficiency or readability.

+7
python arrays numpy scipy cumsum
3 answers

One possible way is to use a moving-window approach in conjunction with cumsum().

For example:

    import numpy as np
    from numpy.lib.stride_tricks import as_strided

    a = np.arange(1, 9)  # the starting array
    slice_length = 4

Then you could write:

 arr = as_strided(a, (slice_length, len(a)), (a.strides[0], a.strides[0])).cumsum(axis=0) 

This gets you most of the way there. The last slice_length - 1 columns come from windows that run past the end of a, so they hold meaningless values; use slice assignment to set them to 0 and get the desired result:

 arr[:, (1-slice_length):] = 0 

Then you have an array:

    >>> arr
    array([[ 1,  2,  3,  4,  5,  0,  0,  0],
           [ 3,  5,  7,  9, 11,  0,  0,  0],
           [ 6,  9, 12, 15, 18,  0,  0,  0],
           [10, 14, 18, 22, 26,  0,  0,  0]])

I don't know if there is a way to produce exactly your desired result using one single vectorized method in NumPy (i.e. without slicing). (accumulateat, a bit like reduceat, might be an interesting addition to NumPy ufuncs...)
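As a hedged alternative sketch, the same moving-window idea can be written with sliding_window_view, which only yields full windows and so never reads past the end of the array. This assumes NumPy 1.20 or later; the names windows, full and arr_safe are made up for illustration and are not part of the answer above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

a = np.arange(1, 9)   # same starting array as above
slice_length = 4

# Each row of `windows` is one full window of a; shape (5, 4) here.
windows = sliding_window_view(a, slice_length)

# Transpose so each window runs down a column, then accumulate down the columns.
full = windows.T.cumsum(axis=0)   # shape (4, 5)

# Append slice_length - 1 zero columns on the right.
arr_safe = np.pad(full, ((0, 0), (0, slice_length - 1)), mode="constant")
print(arr_safe)
```

The zero columns are created by the padding step, so no values need to be overwritten afterwards.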

+1

You can perform your calculations efficiently using a simpler version of the summed-area table method, also known as an integral image in image-processing applications. First, compute and store the summed-area table, which here is the full cumsum of your first row with a 0 prepended:

    a = np.arange(1, 8)
    cs = np.concatenate(([0], np.cumsum(a)))

And now you can compute each "cumsum of the next n items" as cs[n:] - cs[:-n]:

    >>> for n in range(1, 5):
    ...     print(n, '-->', (cs[n:] - cs[:-n])[:4])
    ...
    1 --> [1 2 3 4]
    2 --> [3 5 7 9]
    3 --> [ 6  9 12 15]
    4 --> [10 14 18 22]

You still need to arrange them into the right shape, but once the initial cumsum is computed, each element of your output takes a single subtraction, which is about as efficient as it can get.
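One way the arranging step might look is sketched below. The helper name windowed_cumsum and its parameters are hypothetical, and it assumes a 1-D input with at least window elements. Row n - 1 holds the sums of n consecutive items, and start positions without a full window stay 0.

```python
import numpy as np

def windowed_cumsum(a, window):
    """Hypothetical helper: build the zero-padded matrix from one cumsum table."""
    a = np.asarray(a)
    n_valid = a.size - window + 1               # start positions with a full window
    cs = np.concatenate(([0], np.cumsum(a)))    # cs[i] == a[:i].sum()
    out = np.zeros((window, a.size), dtype=cs.dtype)
    for n in range(1, window + 1):
        # The sum of a[j:j+n] is cs[j+n] - cs[j]; keep only the valid starts.
        out[n - 1, :n_valid] = (cs[n:] - cs[:-n])[:n_valid]
    return out

print(windowed_cumsum(np.arange(1, 9), 4))
```

The loop runs once per row (window times), not once per element, so the work inside it is still vectorized.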

+1

You can use broadcasting like this:

    In [53]: a
    Out[53]: array([ 4, 13,  4, 18,  1,  2, 11, 15])

    In [54]: WSZ = 4 # Window size

    In [55]: idx = np.arange(WSZ)[:,None] + np.arange(a.size-WSZ+1) # Broadcasted indices

    In [56]: a[idx].cumsum(axis=0) # Index into "a" & perform cumsum along axis-0
    Out[56]:
    array([[ 4, 13,  4, 18,  1],
           [17, 17, 22, 19,  3],
           [21, 35, 23, 21, 14],
           [39, 36, 25, 32, 29]], dtype=int32)

Pad with zeros if needed:

    In [57]: np.pad(a[idx].cumsum(0),((0,0),(0,WSZ-1)),'constant',constant_values=0)
    Out[57]:
    array([[ 4, 13,  4, 18,  1,  0,  0,  0],
           [17, 17, 22, 19,  3,  0,  0,  0],
           [21, 35, 23, 21, 14,  0,  0,  0],
           [39, 36, 25, 32, 29,  0,  0,  0]], dtype=int32)
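As a rough sanity check, the sketch below compares the broadcasting result against the explicit per-window cumsum described in the question, using the array from the session above. The names padded and reference are illustrative only.

```python
import numpy as np

a = np.array([4, 13, 4, 18, 1, 2, 11, 15])  # array from the session above
WSZ = 4                                      # window size

# Broadcasting approach, padded on the right with WSZ - 1 zero columns.
idx = np.arange(WSZ)[:, None] + np.arange(a.size - WSZ + 1)
padded = np.pad(a[idx].cumsum(axis=0), ((0, 0), (0, WSZ - 1)), mode="constant")

# Reference: one explicit cumsum per full window, stored as a column.
reference = np.zeros_like(padded)
for start in range(a.size - WSZ + 1):
    reference[:, start] = a[start:start + WSZ].cumsum()

print(np.array_equal(padded, reference))
```

Note that a[idx] materializes a WSZ-by-(a.size - WSZ + 1) copy of the data, so memory use grows with the window size.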
0