I have two multidimensional NumPy arrays, A and B, with A.shape = (K, d, N) and B.shape = (K, N, d). I would like to perform an element-wise operation over axis 0 (K), with this operation being a matrix multiplication over axes 1 and 2 (d, N and N, d). Thus, the result should be a multidimensional array C with C.shape = (K, d, d), so that C[k] = np.dot(A[k], B[k]). A naive implementation would look like this:
C = np.vstack([np.dot(A[k], B[k])[np.newaxis, :, :] for k in range(K)])
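For concreteness, here is a minimal sketch of the setup; the dimensions K = 5, d = 3, N = 4 and the random data are arbitrary example values chosen only for illustration:

import numpy as np

K, d, N = 5, 3, 4           # example dimensions, illustration only
A = np.random.randn(K, d, N)
B = np.random.randn(K, N, d)

# Naive loop: one np.dot per k, stacked along axis 0
C = np.vstack([np.dot(A[k], B[k])[np.newaxis, :, :] for k in range(K)])
print(C.shape)  # (5, 3, 3), i.e. (K, d, d)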
but this implementation is slow. A slightly faster approach is as follows:
C = np.dot(A, B)[:, :, 0, :]
which uses the default behavior of np.dot for multidimensional arrays, giving me an array of shape (K, d, K, d). However, this approach computes the required answer K times (each of the entries along axis 2 is the same). Asymptotically this will be slower than the first approach, but the overhead is much smaller. I also know of the following approach:
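To make that default behavior concrete: for these 3-D inputs np.dot returns a 4-D array, with np.dot(A, B)[i, j, k, m] being the sum over n of A[i, j, n] * B[k, n, m]. A quick shape check, reusing the example arrays above (D is just a temporary name):

D = np.dot(A, B)
print(D.shape)  # (5, 3, 5, 3), i.e. (K, d, K, d)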
from numpy.core.umath_tests import matrix_multiply
C = matrix_multiply(A, B)
but I can't guarantee that this feature will be available. So my question is: does NumPy provide a standard way to do this efficiently? An answer that applies to multidimensional arrays in general would be perfect, but an answer specific to this case would also be great.
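For reference, matrix_multiply from the snippet above is a generalized ufunc with signature (m,n),(n,p)->(m,p), so it applies the matrix product over the leading K axis. A quick check against the naive loop, reusing the example arrays above and assuming the module can actually be imported (it is not part of the documented public API, which is exactly my concern):

from numpy.core.umath_tests import matrix_multiply
C_gufunc = matrix_multiply(A, B)
print(C_gufunc.shape)            # (5, 3, 3)
print(np.allclose(C_gufunc, C))  # True: matches the naive loop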
Edit: As @Juh_ pointed out, the second approach is incorrect. The correct version is:
C = np.dot(A, B).diagonal(axis1=0, axis2=2).transpose(2, 0, 1)
but the added overhead makes it slower than the first approach, even for small matrices. The matrix_multiply approach wins by a long shot in all my timing tests, for both small and large matrices. I am now strongly considering using it if no better solution turns up, even if that means copying the numpy.core.umath_tests library (written in C) into my project.
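For completeness, a quick sanity check (again with the example arrays above, D and C_diag being temporary names) that the corrected diagonal version agrees with the naive loop, with a comment on where the overhead comes from:

D = np.dot(A, B)                                          # materializes the full (K, d, K, d) array
C_diag = D.diagonal(axis1=0, axis2=2).transpose(2, 0, 1)  # keep only the k == k blocks
print(np.allclose(C_diag, C))  # True, but roughly K times more work and memory than needed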