How to interpret the results of a singular value decomposition (Python 3)?

I am trying to learn how to reduce the dimensionality of datasets. I came across some tutorials on Principal Component Analysis and Singular Value Decomposition. I understand that it finds the direction of largest variance and then successively captures the directions of the next largest variance (oversimplified).

I am confused about how to interpret the output matrices. I looked at the documentation, but it did not help. I followed some textbooks and still was not sure what exactly ends up in each matrix. I have provided some code below to get an idea of the distribution of each variable in a dataset (from sklearn.datasets).

My original input array is an (n x m) matrix of n samples and m attributes. I can make a basic PCA plot of PC1 against PC2, but how do I know which of the original attributes are represented on each PC?

Sorry if this is a basic question. Many resources are quite math-heavy, which is fine, but a more intuitive answer would also be helpful. Nowhere have I seen an explanation of how to interpret the output in terms of the original labeled data.

I am open to using sklearn.decomposition.PCA as well.

    import numpy as np

    # Singular Value Decomposition
    U, s, V = np.linalg.svd(X, full_matrices=True)
    print(U.shape, s.shape, V.shape, sep="\n")

    # Output:
    # (442, 442)
    # (10,)
    # (10, 10)
1 answer

As you said above, a matrix M can be decomposed as the product of three matrices: U · S · V*. The geometric meaning is this: any linear transformation can be viewed as a sequence of a rotation (V*), a scaling (S), and another rotation (U). Here's a good description and animation.
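To make that concrete, here is a minimal sketch (using a small random matrix rather than your dataset) showing that the three factors returned by np.linalg.svd multiply back to the original matrix. Note that NumPy already returns V* (often written Vh), so no extra transpose is needed:

    import numpy as np

    # Small random stand-in matrix (not the asker's data).
    M = np.random.rand(6, 4)

    # full_matrices=False gives the compact shapes: (6, 4), (4,), (4, 4).
    U, s, Vh = np.linalg.svd(M, full_matrices=False)

    # Rotation (Vh), then scaling (np.diag(s)), then rotation (U):
    # multiplied together they reproduce M up to floating-point error.
    M_rebuilt = U @ np.diag(s) @ Vh
    print(np.allclose(M, M_rebuilt))  # True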

What is important to us? The matrix S is diagonal: all its values off the main diagonal are 0.

Like this:

    np.diag(s)
    array([[2.00604441, 0., 0., 0., 0., 0., 0., 0., 0., 0.],
           [0., 1.22160478, 0., 0., 0., 0., 0., 0., 0., 0.],
           [0., 0., 1.09816315, 0., 0., 0., 0., 0., 0., 0.],
           [0., 0., 0., 0.97748473, 0., 0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0.81374786, 0., 0., 0., 0., 0.],
           [0., 0., 0., 0., 0., 0.77634993, 0., 0., 0., 0.],
           [0., 0., 0., 0., 0., 0., 0.73250287, 0., 0., 0.],
           [0., 0., 0., 0., 0., 0., 0., 0.65854628, 0., 0.],
           [0., 0., 0., 0., 0., 0., 0., 0., 0.27985695, 0.],
           [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.09252313]])

Geometrically, each value is a scale factor along a specific axis. For our purposes (classification and regression), these values show how much a particular axis contributes to the overall result.
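If you want that contribution as a share of the total variance (the quantity sklearn.decomposition.PCA reports as explained_variance_ratio_), here is a sketch. It assumes the data come from sklearn.datasets.load_diabetes(), which matches the (442, 10) shapes in the question, and that the matrix is mean-centered first; only for a centered matrix are the squared singular values proportional to the variance along each axis:

    import numpy as np
    from sklearn.datasets import load_diabetes

    # Assumption: the (442, 10) shapes in the question match the diabetes data.
    X = load_diabetes().data

    # Center each column, then decompose.
    X_centered = X - X.mean(axis=0)
    U, s, Vh = np.linalg.svd(X_centered, full_matrices=False)

    # Fraction of total variance carried by each component.
    print(s**2 / np.sum(s**2))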

As you can see, these values decrease from 2.0 down to 0.093. One of the most important applications of this is a cheap low-rank approximation of the matrix to a given accuracy. If you do not need an ultra-precise decomposition (and for ML problems you usually do not), you can drop the smallest values and keep only the important ones. This lets you refine your solution step by step: evaluate the quality on a test set, drop the smallest values, and repeat. The result is a lightweight and robust model.
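Here is a sketch of that "drop the smallest values" idea; the truncation level k and the random test matrix are only illustrative:

    import numpy as np

    def low_rank_approx(X, k):
        """Rebuild X keeping only its k largest singular values."""
        U, s, Vh = np.linalg.svd(X, full_matrices=False)
        return U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

    # Illustrative use with a random matrix: the reconstruction error is
    # exactly the part contributed by the dropped singular values.
    X = np.random.rand(442, 10)
    for k in (10, 5, 2):
        X_k = low_rank_approx(X, k)
        print(k, np.linalg.norm(X - X_k))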

[Image: plot of the singular values in decreasing order.]

Here the good candidates to drop are values 8 and 9, then 5 through 7, and as a last resort you can approximate the model with only the first value.
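Finally, to address the original question about what each PC "contains": for a centered X, the rows of V* (the right singular vectors, which sklearn.decomposition.PCA exposes as components_) are the principal directions, and the magnitude of each entry tells you how strongly the corresponding original attribute loads on that component. A sketch, again assuming the diabetes dataset:

    import numpy as np
    from sklearn.datasets import load_diabetes

    data = load_diabetes()
    X_centered = data.data - data.data.mean(axis=0)
    U, s, Vh = np.linalg.svd(X_centered, full_matrices=False)

    # Row i of Vh is the i-th principal direction; its entries are the
    # weights ("loadings") of the original attributes on that component.
    for i in range(2):  # PC1 and PC2
        order = np.argsort(np.abs(Vh[i]))[::-1][:3]
        top = [(data.feature_names[j], round(float(Vh[i, j]), 3)) for j in order]
        print(f"PC{i + 1} top attributes:", top)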

