import numpy as np
n, k = 30, 40
U = np.random.random((n, n, k))
K = np.random.random((n, n, n))
def using_loops(U, K):
S = np.empty((n, n, n))
for i in xrange(n):
temp = np.zeros((n, n))
for j in xrange (n):
if j != i:
temp += np.dot(U[j], U[j].T)
S[i] = np.dot(temp, K[i])
return S
def using_einsum(U, K):
uut = np.einsum('ijk,ilk->ijl', U, U)
total = uut.sum(axis=0)
total = total - uut
S = np.einsum('ijk,ikl->ijl', total, K)
return S
This verifies that using_loopsthey using_einsumgive the same result.
In [260]: np.allclose(using_loops(U, K), using_einsum(U, K))
Out[260]: True
This shows that using_einsum is faster; how much the size depends on nand k:
In [262]: %timeit using_loops(U, K)
100 loops, best of 3: 17.1 ms per loop
In [263]: %timeit using_einsum(U, K)
1000 loops, best of 3: 1.92 ms per loop
In general, whenever you see the sums of products, there is a good chance that
np.einsum
will be a fairly quick way to get results. It will almost certainly beat Python for-loops.