Python OpenCV PCACompute Eigenvalue

Using Python 2.7.5 with OpenCV (OSX), I run PCA on an image sequence (columns are pixels, rows are frames, as suggested in this answer).

How do I get the eigenvalues corresponding to the eigenvectors? They appear to be a property of the PCA object in C++, but the Python equivalent, PCACompute(), is a plain function that does not return them.

It seems strange to omit such a key part of PCA.

1 answer

matmul.cpp confirms that PCACompute() uses PCA::operator(), but the eigenvalues are discarded. So I did this:

    import cv2
    import numpy

    # The following mimics the PCA::operator() implementation from OpenCV's
    # matmul.cpp, which is wrapped by Python's cv2.PCACompute(). We can't
    # use PCACompute() though, as it discards the eigenvalues.

    # Scrambled is faster for nVariables >> nObservations. Bitmask is 0 and
    # therefore default / redundant, but included to abide by online docs.
    covar, mean = cv2.calcCovarMatrix(PCAInput, cv2.cv.CV_COVAR_SCALE |
                                                cv2.cv.CV_COVAR_ROWS |
                                                cv2.cv.CV_COVAR_SCRAMBLED)

    eVal, eVec = cv2.eigen(covar, computeEigenvectors=True)[1:]

    # Conversion + normalisation required due to 'scrambled' mode
    eVec = cv2.gemm(eVec, PCAInput - mean, 1, None, 0)
    # apply_along_axis() slices 1D rows, but normalize() returns 4x1 vectors
    eVec = numpy.apply_along_axis(lambda n: cv2.normalize(n).flat, 1, eVec)

(Simplifying assumptions: rows = observations, cols = variables, and there are many more variables than observations. Both are true in my case.)
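For readers without the old cv2.cv constants, the same "scrambled" idea can be sketched in plain NumPy. This is a minimal illustration under the assumptions above, not the OpenCV implementation: the function name pca_scrambled, the random test data, and the choice to scale the covariance by the number of observations are all assumptions made for the example.

```python
import numpy as np


def pca_scrambled(data):
    """PCA for rows = observations, cols = variables (n_obs << n_vars).

    Returns (mean, eigenvalues, eigenvectors), with eigenvectors as rows
    sorted by descending eigenvalue. Hypothetical helper, not an OpenCV API.
    """
    data = np.asarray(data, dtype=np.float64)
    n_obs = data.shape[0]
    mean = data.mean(axis=0, keepdims=True)
    centered = data - mean

    # "Scrambled" covariance: n_obs x n_obs instead of n_vars x n_vars.
    # Assumption: scale by the number of observations.
    small_covar = centered.dot(centered.T) / n_obs

    # eigh() is meant for symmetric matrices; it returns ascending
    # eigenvalues, so reorder to descending.
    e_val, e_vec_small = np.linalg.eigh(small_covar)
    order = np.argsort(e_val)[::-1]
    e_val = e_val[order]
    e_vec_small = e_vec_small[:, order]

    # Map the small eigenvectors back to variable space, then normalise
    # each row to unit length.
    e_vec = e_vec_small.T.dot(centered)
    norms = np.linalg.norm(e_vec, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero
    e_vec = e_vec / norms
    return mean, e_val, e_vec


rng = np.random.RandomState(0)
frames = rng.rand(5, 40)  # hypothetical data: 5 frames, 40 pixels
mean, e_val, e_vec = pca_scrambled(frames)
print(e_val)
```

Because the data is mean-centred, the smallest eigenvalue is numerically zero, so the last row of e_vec is the normalisation of a near-zero vector and carries little meaning.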

This pretty much works. In the output below, old_eVec is the result of cv2.PCACompute():

    In [101]: eVec
    Out[101]:
    array([[  3.69396088e-05,   1.66745325e-05,   4.97117583e-05, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -7.23531536e-06,  -3.07411122e-06,  -9.58259793e-06, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [  1.01496237e-05,   4.60048715e-06,   1.33919606e-05, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           ...,
           [ -1.42024751e-04,   5.21386198e-05,   3.59923394e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -5.28685812e-05,   8.50139472e-05,  -3.13278542e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [  2.96546917e-04,   1.23437674e-04,   4.98598461e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00]])

    In [102]: old_eVec
    Out[102]:
    array([[  3.69395821e-05,   1.66745194e-05,   4.97117981e-05, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -7.23533140e-06,  -3.07411415e-06,  -9.58260534e-06, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [  1.01496662e-05,   4.60050160e-06,   1.33920075e-05, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           ...,
           [ -1.42029530e-04,   5.21366564e-05,   3.60067672e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -5.29163444e-05,   8.50261567e-05,  -3.13150231e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -7.13724992e-04,  -8.52700090e-04,   1.57953508e-03, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00]], dtype=float32)

Some loss of precision is visible towards the end of the output (although a quick plot of the absolute difference reveals no pattern to the imprecision).

57% of the elements have a non-zero absolute difference (AD).
Of these, 95% differ by less than 2e-16 and the mean AD is 5.3e-4. However, the AD can reach 0.059, which is a lot considering that all eigenvector values lie between -0.048 and 0.045.
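Statistics of this kind can be gathered with a few lines of NumPy. The following is only a sketch of one way to do it: the helper name abs_diff_summary and the toy arrays are hypothetical, and taking the mean over the non-zero differences only is an assumption about how the figures above were computed.

```python
import numpy as np


def abs_diff_summary(a, b, tiny=2e-16):
    """Summarise the element-wise absolute difference (AD) of two arrays.

    Hypothetical helper. Both inputs are promoted to float64 first, so a
    float32 array can be compared with a float64 one.
    """
    ad = np.abs(np.asarray(a, dtype=np.float64) -
                np.asarray(b, dtype=np.float64))
    nonzero = ad[ad > 0]
    if nonzero.size == 0:
        return {"fraction_nonzero": 0.0, "fraction_tiny": 0.0,
                "mean_ad": 0.0, "max_ad": 0.0}
    return {
        # Share of all elements that differ at all.
        "fraction_nonzero": nonzero.size / float(ad.size),
        # Share of the differing elements whose AD is below `tiny`.
        "fraction_tiny": float(np.mean(nonzero < tiny)),
        # Assumption: mean and max are taken over differing elements only.
        "mean_ad": float(nonzero.mean()),
        "max_ad": float(nonzero.max()),
    }


# Toy stand-ins for eVec and old_eVec.
new_vecs = np.array([[1.0, 2.0], [3.0, 4.0]])
old_vecs = np.array([[1.0, 2.5], [3.0, 4.0]], dtype=np.float32)
summary = abs_diff_summary(new_vecs, old_vecs)
print(summary)
```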

PCA::operator() contains code that converts to the largest C type; on the other hand, old_eVec was float32, whereas my own code produces float64. It is worth mentioning that when compiling numpy, I did get some precision-related errors.
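To get a feel for how much of the discrepancy a float32 versus float64 mismatch alone could explain, one can round-trip a few values through float32. This is a rough sketch, not a diagnosis of the results above; the sample values are simply the first column of the eVec output shown earlier.

```python
import numpy as np

# Machine epsilon: spacing between 1.0 and the next representable value.
eps32 = np.finfo(np.float32).eps  # 2**-23, about 1.19e-07
eps64 = np.finfo(np.float64).eps  # 2**-52, about 2.22e-16

# A few float64 values taken from the first column of eVec.
values = np.array([3.69396088e-05, -7.23531536e-06, 1.01496237e-05])

# Store as float32, then promote back to float64 for comparison.
round_trip = values.astype(np.float32).astype(np.float64)
rel_err = np.abs(round_trip - values) / np.abs(values)

print(eps32, eps64)
print(rel_err)
```

The relative error introduced by the cast is bounded by the float32 epsilon, so differences much larger than that in relative terms have to come from somewhere other than storage precision alone.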

Overall, the loss of precision seems to be associated with the eigenvectors that have low eigenvalues, which again points to rounding error or similar. The above implementation gives results similar to PCACompute(), having duplicated its behaviour.
