You all complicate things unnecessarily ... Since you have one operation, multiplication, which is commutative, you can change the order in which you perform them at will, i.e. you do not need to multiply matrix1 elements with matrix1 elements, and once you have calculated all of them, multiply them together. Instead, you can first multiply all the corresponding elements of matrix1 together, then all the corresponding elements of matrix2 together, and then multiply the two resulting values. This way you can write your function as very simple:
def fast_geometric_matrix_multiplication(matrix1, matrix2): return np.prod(matrix1, axis=1)[:, None] * np.prod(matrix2, axis=0)
It has an additional advantage: if you multiply the matrices of the forms (m, k) and (k, n) , you expect that you will have to do the multiplications m*n*2*k , whereas this method requires only m*k + n*k + m*n , which is bound to be much smaller than what you are currently doing for almost any array form.
And of course:
In [24]: a = np.random.rand(100, 200) ...: b = np.random.rand(200, 50) ...: In [25]: np.allclose(geometric_matrix_multiplication(a, b), ...: fast_geometric_matrix_multiplication(a, b)) Out[25]: True In [26]: %timeit geometric_matrix_multiplication(a, b) 1 loops, best of 3: 1.39 s per loop In [27]: %timeit fast_geometric_matrix_multiplication(a, b) 10000 loops, best of 3: 74.2 us per loop In [28]: %timeit np.prod(a[:, None, :]*b[..., None].T, axis=2) 100 loops, best of 3: 5.97 ms per loop