Calculate "vᵀ A v" for each column v of a matrix

I have a k×n matrix X and a k×k matrix A. For each column of X I would like to calculate the scalar

 X[:, i].T.dot(A).dot(X[:, i]) 

(or, mathematically, Xi' * A * Xi ).

I currently have a for loop:

 out = np.empty((n,))
 for i in xrange(n):
     out[i] = X[:, i].T.dot(A).dot(X[:, i])

but since n is large, I would like to do this faster if possible (i.e. use some NumPy functions instead of a loop).

3 answers

This works nicely:

 (X.T.dot(A) * X.T).sum(axis=1)

Edit: this is a little faster: np.einsum('...i,...i->...', X.T.dot(A), X.T). Both work better if X and A are Fortran-contiguous.
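A quick sanity check of both expressions against the loop (the shapes and seed here are arbitrary, chosen just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 4, 6  # arbitrary small sizes
X = rng.random((k, n))
A = rng.random((k, k))

# Loop baseline: out[i] = X[:, i].T @ A @ X[:, i]
loop = np.array([X[:, i].T.dot(A).dot(X[:, i]) for i in range(n)])

# Vectorized: X.T.dot(A) has one row per column of X;
# multiply elementwise by X.T and sum each row over the k axis
vec = (X.T.dot(A) * X.T).sum(axis=1)

# einsum variant: sums the shared last axis, keeps the leading axis
eins = np.einsum('...i,...i->...', X.T.dot(A), X.T)

assert np.allclose(loop, vec)
assert np.allclose(loop, eins)
```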


You can use numpy.einsum:

 np.einsum('ji,jk,ki->i',x,a,x) 

This will give the same result. Let's see if it will be much faster:

(benchmark plot: runtime of the loop, einsum, and dot kernels as a function of n = k)

It seems that dot is still faster, mainly because it uses a threaded BLAS, unlike einsum, which runs on a single core.
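A minimal check that the 'ji,jk,ki->i' signature matches the loop (sizes are arbitrary). Note that on NumPy 1.12+, passing optimize=True lets einsum pick a cheaper contraction order, often routed through BLAS-backed intermediate products, which can narrow the gap:

```python
import numpy as np

rng = np.random.default_rng(1)
k = n = 8  # arbitrary small size
x = rng.random((k, n))
A = rng.random((k, k))

loop = np.array([x[:, i].T.dot(A).dot(x[:, i]) for i in range(n)])

# 'ji,jk,ki->i': sum x[j,i] * A[j,k] * x[k,i] over j and k,
# leaving one scalar per column i
eins = np.einsum('ji,jk,ki->i', x, A, x)

# same contraction, but with path optimization enabled
eins_opt = np.einsum('ji,jk,ki->i', x, A, x, optimize=True)

assert np.allclose(loop, eins)
assert np.allclose(loop, eins_opt)
```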

 import perfplot
 import numpy as np

 def setup(n):
     k = n
     x = np.random.random((k, n))
     A = np.random.random((k, k))
     return x, A

 def loop(data):
     x, A = data
     n = x.shape[1]
     out = np.empty(n)
     for i in range(n):
         out[i] = x[:, i].T.dot(A).dot(x[:, i])
     return out

 def einsum(data):
     x, A = data
     return np.einsum('ji,jk,ki->i', x, A, x)

 def dot(data):
     x, A = data
     return (x.T.dot(A) * x.T).sum(axis=1)

 perfplot.show(
     setup=setup,
     kernels=[loop, einsum, dot],
     n_range=[2**k for k in range(10)],
     logx=True,
     logy=True,
     xlabel='n, k'
 )

You cannot make it much faster without parallelizing the whole thing: one thread per column. You will still be looping - there is no getting away from that.

Map-reduce is a good way to look at this problem: map the multiply steps, reduce the sum steps.
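The map-reduce view can be sketched directly in NumPy (illustrative sizes): the matrix product A @ X "maps" every column at once, and the elementwise multiply followed by a column sum "reduces" each column to its scalar:

```python
import numpy as np

rng = np.random.default_rng(2)
k, n = 3, 5  # arbitrary sizes
X = rng.random((k, n))
A = rng.random((k, k))

Y = A.dot(X)               # "map": Y[:, i] = A @ X[:, i] for every column
out = (X * Y).sum(axis=0)  # "reduce": out[i] = X[:, i] . Y[:, i]

loop = np.array([X[:, i].dot(A).dot(X[:, i]) for i in range(n)])
assert np.allclose(out, loop)
```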

