How can I vectorize this triple loop over 2d arrays in numpy?

Is it possible to exclude all Python loops in this calculation:

result[i,j,k] = (x[i] * y[j] * z[k]).sum() 

where x[i] , y[j] , z[k] are vectors of length N and x , y , z are the first sizes with length A , B , C street exit has the form (A,B,C) , and each element the sum of the triple product (elementary).

I can get it from 3 to 1 cycles (code below), but I am stuck trying to exclude the last cycle.

If necessary, I can do A=B=C (after a small amount of indentation).

 # Example with 3 loops, 2 loops, 1 loop (testing omitted) N = 100 # more like 100k in real problem A = 2 # more like 20 in real problem B = 3 # more like 20 in real problem C = 4 # more like 20 in real problem import numpy x = numpy.random.rand(A, N) y = numpy.random.rand(B, N) z = numpy.random.rand(C, N) # outputs of each variant result_slow = numpy.empty((A,B,C)) result_vec_C = numpy.empty((A,B,C)) result_vec_CB = numpy.empty((A,B,C)) # 3 nested loops for i in range(A): for j in range(B): for k in range(C): result_slow[i,j,k] = (x[i] * y[j] * z[k]).sum() # vectorize loop over C (2 nested loops) for i in range(A): for j in range(B): result_vec_C[i,j,:] = (x[i] * y[j] * z).sum(axis=1) # vectorize one C and B (one loop) for i in range(A): result_vec_CB[i,:,:] = numpy.dot(x[i] * y, z.transpose()) numpy.testing.assert_almost_equal(result_slow, result_vec_C) numpy.testing.assert_almost_equal(result_slow, result_vec_CB) 
+7
source share
2 answers

If you use numpy> 1.6, there is an awesome np.einsum function:

 np.einsum('im,jm,km->ijk',x,y,z) 

This is equivalent to your looping versions. I'm not sure how this will be fair in terms of efficiency, as soon as you get to the size of your arrays in a real problem (I really get segfault on my machine when I switch to these sizes). Another solution that I often prefer for such problems is to rewrite the method using cython.

+9
source

Using einsum makes a lot of sense in your case; but you can do it quite easily by hand. The trick is to make the arrays broadcast against each other. This means that they modify them so that each array changes independently along its own axis. Then multiply them together, letting numpy take care of the broadcast; and then summed along the last (rightmost) axis.

 >>> x = numpy.arange(2 * 4).reshape(2, 4) >>> y = numpy.arange(3 * 4).reshape(3, 4) >>> z = numpy.arange(4 * 4).reshape(4, 4) >>> (x.reshape(2, 1, 1, 4) * ... y.reshape(1, 3, 1, 4) * ... z.reshape(1, 1, 4, 4)).sum(axis=3) array([[[ 36, 92, 148, 204], [ 92, 244, 396, 548], [ 148, 396, 644, 892]], [[ 92, 244, 396, 548], [ 244, 748, 1252, 1756], [ 396, 1252, 2108, 2964]]]) 

You can make this more generalized by using slice notation, the value of newaxis (which is None , so it will work with None below), and the fact that sum takes the values ​​of the negative axis (with -1 , denoting the last, -2 , denoting the next and the last), etc.). Thus, you do not need to know the original shape of the arrays; as long as their last axes are compatible, this will broadcast the first three together:

 >>> (x[:, numpy.newaxis, numpy.newaxis, :] * ... y[numpy.newaxis, :, numpy.newaxis, :] * ... z[numpy.newaxis, numpy.newaxis, :, :]).sum(axis=-1) array([[[ 36, 92, 148, 204], [ 92, 244, 396, 548], [ 148, 396, 644, 892]], [[ 92, 244, 396, 548], [ 244, 748, 1252, 1756], [ 396, 1252, 2108, 2964]]]) 
+8
source

All Articles