Numpy cov (covariance), what exactly does it calculate?

I assume numpy.cov(X) calculates a sample covariance matrix as:

 1/(N-1) * Sum (x_i - m)(x_i - m)^T (where m is the mean) 

I.e. the sum of outer products. But nowhere in the documentation does it say this; it simply says: "Estimate the covariance matrix."

Can anyone confirm whether this is what it does internally? (I know that I can change the constant in front with the bias parameter.)
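For reference, the effect of the bias parameter mentioned above can be checked directly. This is a minimal sketch on made-up random data; the array shape and the names `unbiased` and `biased` are illustrative assumptions, not anything taken from the NumPy source.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))   # assumed layout: 3 variables (rows), 50 samples (columns)
n_samples = X.shape[1]

unbiased = np.cov(X)             # default: divides by n_samples - 1
biased = np.cov(X, bias=True)    # divides by n_samples instead

# The two estimates differ only by the constant factor in front.
assert np.allclose(biased, unbiased * (n_samples - 1) / n_samples)
```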

+6
2 answers

As you can see from the source, in the simplest case, with no masks and N variables with M samples each, it returns the (N, N) covariance matrix calculated as:

 (x - m) * (x - m).T.conj() / (M - 1) 

Where * represents the matrix product [1] and M, the number of samples, is what the question calls N.

It is implemented approximately like this:

 X = X - X.mean(axis=1, keepdims=True)  # center each variable (row) on its mean
 M = X.shape[1]                         # number of samples (columns)
 fact = float(M - 1)
 return dot(X, X.T.conj()) / fact

If you want to view the source, look here rather than at the link from Mr E, unless you are interested in masked arrays. As you mentioned, the documentation isn't great.
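The centering and matrix-product steps described in this answer can be checked against numpy.cov with a short, self-contained sketch. The function name `cov_sketch` and the random test data are hypothetical, and the code assumes the default layout (variables in rows, samples in columns, no masks or weights); it is not a copy of NumPy's actual implementation.

```python
import numpy as np

def cov_sketch(X):
    """Sample covariance of the rows of X (variables in rows, samples in columns)."""
    X = np.asarray(X)
    Xc = X - X.mean(axis=1, keepdims=True)   # subtract each variable's mean
    n_samples = X.shape[1]
    return Xc @ Xc.T.conj() / (n_samples - 1)

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 100))                # 4 variables, 100 samples

# Agrees with numpy.cov up to floating point rounding.
assert np.allclose(cov_sketch(X), np.cov(X))
```

If the data is stored with samples in rows instead, passing `rowvar=False` to numpy.cov (or transposing first) gives the same matrix.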

[1] which in this case is effectively (but not exactly) an outer product, because (x - m) has M column vectors of length N and, therefore, (x - m).T has as many row vectors. The end result is the sum of all the outer products. The same * gives inner (scalar) products if the order is reversed. But, technically, these are both just standard matrix multiplications, and the true outer product is only the product of a column vector with a row vector.
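The footnote's point, that the matrix product is the sum of one outer product per sample, can be illustrated with a small sketch. The data and the names `Xc`, `outer_sum` and `gram` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(3, 20))                 # 3 variables, 20 samples
Xc = X - X.mean(axis=1, keepdims=True)       # centered data

# One (3, 3) outer product per sample (column), summed over all samples.
outer_sum = sum(np.outer(col, col) for col in Xc.T)

assert np.allclose(outer_sum, Xc @ Xc.T)
assert np.allclose(outer_sum / (X.shape[1] - 1), np.cov(X))

# Reversing the order gives the (20, 20) matrix of inner products between samples.
gram = Xc.T @ Xc
assert np.isclose(gram[0, 1], np.dot(Xc[:, 0], Xc[:, 1]))
```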

+3

Yes, this is what numpy.cov computes. FWIW, I compared the output of numpy.cov with an explicit iteration over the samples (as in the pseudocode you gave) to compare performance, and the difference between the resulting output arrays is what you would expect from floating point precision.
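A comparison of this kind might look roughly like the following. This is only a sketch of the idea, not the code that was actually run for this answer; the data size and variable names are made up.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(5, 200))        # 5 variables, 200 samples
n_vars, n_samples = X.shape

m = X.mean(axis=1)                   # mean vector, one entry per variable
acc = np.zeros((n_vars, n_vars))
for i in range(n_samples):
    d = X[:, i] - m                  # deviation of sample i from the mean
    acc += np.outer(d, d)            # accumulate (x_i - m)(x_i - m)^T
loop_cov = acc / (n_samples - 1)

# Largest elementwise difference from numpy.cov; only rounding error is expected.
max_diff = np.abs(loop_cov - np.cov(X)).max()
print(max_diff)
```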

0