Find diagonal sums in numpy (faster)

I have several numpy board arrays like this:

 array([[0, 0, 0, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1, 0, 1], [0, 0, 0, 0, 0, 0, 0, 1], [0, 1, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 1, 0, 0, 0], [0, 0, 1, 0, 0, 0, 1, 0], [1, 0, 0, 0, 0, 1, 0, 0]]) 

And I use the following code to find the sum of the elements on each nth diagonal from -7 to 8 boards (and its mirror version).

 n = 8 rate = [b.diagonal(i).sum() for b in (board, board[::-1]) for i in range(-n+1, n)] 

After some profiling, this operation takes about 2/3 of the total working time, and it seems that this is due to two factors:

  • The .diagonal method creates a new array instead of a view (it looks like numpy 1.7 will have a new .diag method to solve this problem)
  • Iteration is performed in python inside a list comprehension.

So, are there ways to find these amounts faster (perhaps in the C numpy layer)?


After several tests, I was able to reduce 7.5x the total time by caching this operation ... Perhaps I was looking for the wrong bottleneck?


One more thing:

I just found the .trace method, which replaces the diagonal(i).sum() thing and ... There was no big improvement in performance (from 2 to 4%).

Therefore, the problem must be understanding. Any ideas?

+7
python numpy sum
source share
2 answers

Possible solution using stride_tricks . This is partly based on the wealth of information available in answering this question , but the problem, it seems to me, is simply not the same as a duplicate. Here's the basic idea applied to a square matrix - see below for a function that implements a more general solution.

 >>> cols = 8 >>> a = numpy.arange(cols * cols).reshape((cols, cols)) >>> fill = numpy.zeros((cols - 1) * cols, dtype='i8').reshape((cols - 1, cols)) >>> stacked = numpy.vstack((a, fill, a)) >>> major_stride, minor_stride = stacked.strides >>> strides = major_stride, minor_stride * (cols + 1) >>> shape = (cols * 2 - 1, cols) >>> numpy.lib.stride_tricks.as_strided(stacked, shape, strides) array([[ 0, 9, 18, 27, 36, 45, 54, 63], [ 8, 17, 26, 35, 44, 53, 62, 0], [16, 25, 34, 43, 52, 61, 0, 0], [24, 33, 42, 51, 60, 0, 0, 0], [32, 41, 50, 59, 0, 0, 0, 0], [40, 49, 58, 0, 0, 0, 0, 0], [48, 57, 0, 0, 0, 0, 0, 0], [56, 0, 0, 0, 0, 0, 0, 0], [ 0, 0, 0, 0, 0, 0, 0, 7], [ 0, 0, 0, 0, 0, 0, 6, 15], [ 0, 0, 0, 0, 0, 5, 14, 23], [ 0, 0, 0, 0, 4, 13, 22, 31], [ 0, 0, 0, 3, 12, 21, 30, 39], [ 0, 0, 2, 11, 20, 29, 38, 47], [ 0, 1, 10, 19, 28, 37, 46, 55]]) >>> diags = numpy.lib.stride_tricks.as_strided(stacked, shape, strides) >>> diags.sum(axis=1) array([252, 245, 231, 210, 182, 147, 105, 56, 7, 21, 42, 70, 105, 147, 196]) 

Of course, I have no idea how fast it really will be. But I'm sure it will be faster than understanding Python.

For convenience, there is a fully general diagonals function. It is assumed that you want to move the diagonal along the longest axis:

 def diagonals(a): rows, cols = a.shape if cols > rows: a = aT rows, cols = a.shape fill = numpy.zeros(((cols - 1), cols), dtype=a.dtype) stacked = numpy.vstack((a, fill, a)) major_stride, minor_stride = stacked.strides strides = major_stride, minor_stride * (cols + 1) shape = (rows + cols - 1, cols) return numpy.lib.stride_tricks.as_strided(stacked, shape, strides) 
+6
source share

As I wrote in a comment, I would not become in C code.

Try to go with PyPy . In fact, numpy support is very good (however, it does not directly support array.diagonal). I did not check if there is another buidin method for this. Not serious, I tried the following code:

 try: import numpypy # required by PyPy except ImportError: pass import numpy board = numpy.array([[0, 0, 0, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1, 0, 1], [0, 0, 0, 0, 0, 0, 0, 1], [0, 1, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 1, 0, 0, 0], [0, 0, 1, 0, 0, 0, 1, 0], [1, 0, 0, 0, 0, 1, 0, 0]]) n=len(board) def diag_sum(i, b): s = 0 if i>=0: row = 0 end = n else: row = -i end = n+i i = 0 while i<end: s += b[row, i] i+=1 row+=1 return s import time t=time.time() for i in xrange(50000): # rate = [b.diagonal(i).sum() # for b in (board, board[::-1]) # for i in range(-n+1, n)] rate = [diag_sum(i,b) for b in (board, board[::-1]) for i in range(-n+1, n)] print time.time() - t 

Results:

  • 0.64s PyPy with diag_sum
  • version 6.01s of CPython with diag_sum
  • version 5.60s of CPython with b.diagonal
+2
source share

All Articles