I could use einsum here:
    >>> import numpy as np
    >>> a = np.random.randint(0, 10, (3,3))
    >>> b = np.random.randint(0, 10, (3,3))
    >>> a
    array([[9, 2, 8],
           [5, 4, 0],
           [8, 0, 6]])
    >>> b
    array([[5, 5, 0],
           [3, 5, 5],
           [9, 4, 3]])
    >>> a.dot(b)
    array([[123,  87,  34],
           [ 37,  45,  20],
           [ 94,  64,  18]])
    >>> np.diagonal(a.dot(b))
    array([123,  45,  18])
    >>> np.einsum('ij,ji->i', a,b)
    array([123,  45,  18])
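For large arrays, this is much faster than computing the full product and then taking its diagonal, because einsum evaluates only the n diagonal entries instead of all n×n entries of the product: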
    >>> a = np.random.randint(0, 10, (1000,1000))
    >>> b = np.random.randint(0, 10, (1000,1000))
    >>> %timeit np.diagonal(a.dot(b))
    1 loops, best of 3: 7.04 s per loop
    >>> %timeit np.einsum('ij,ji->i', a, b)
    100 loops, best of 3: 7.49 ms per loop
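As a sanity check, the subscript string 'ij,ji->i' computes sum over j of a[i,j] * b[j,i] for each i, which is exactly entry (i, i) of the product. Here is a minimal sketch that confirms the result matches the direct approach. The helper name `diag_of_product`, the seed, and the 50x50 size are my own choices for illustration, and the `(a * b.T).sum(axis=1)` line is just an equivalent formulation I added for comparison, not something from the answer above.

```python
import numpy as np

def diag_of_product(a, b):
    """Diagonal of a.dot(b), computed without forming the full product.

    Hypothetical helper name, used here only for illustration.
    """
    return np.einsum('ij,ji->i', a, b)

# Assumed setup: fixed seed and a small size so the check runs quickly.
rng = np.random.default_rng(0)
a = rng.integers(0, 10, (50, 50))
b = rng.integers(0, 10, (50, 50))

reference = np.diagonal(a.dot(b))   # full product, then diagonal
via_einsum = diag_of_product(a, b)  # only the diagonal entries
via_sum = (a * b.T).sum(axis=1)     # same sum written with broadcasting

print(np.array_equal(reference, via_einsum))
print(np.array_equal(reference, via_sum))
```

Both comparisons should agree on integer input. With floating-point arrays I would compare with `np.allclose` instead, since the summation order can differ.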
[Note: I originally wrote the elementwise version, 'ii,ii->i', instead of the matrix-multiplication version. The same einsum tricks apply.]
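For comparison, this is roughly what the elementwise variant from the note does: 'ii,ii->i' multiplies the two diagonals entry by entry, so it gives the diagonal of `a * b` rather than of `a.dot(b)`. A small sketch reusing the 3x3 arrays from the first example (the variable name `elementwise_diag` is mine):

```python
import numpy as np

# Same values as the 3x3 example earlier in this answer.
a = np.array([[9, 2, 8], [5, 4, 0], [8, 0, 6]])
b = np.array([[5, 5, 0], [3, 5, 5], [9, 4, 3]])

# A repeated index within one operand selects that operand's diagonal,
# so this is a[i, i] * b[i, i] for each i.
elementwise_diag = np.einsum('ii,ii->i', a, b)
print(elementwise_diag)  # [45 20 18]
```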