I want to train an SVM classifier to categorize images using scikit-learn, and I want to use the SIFT implementation in opencv-python to extract the image features. The situation is as follows:
1. The input to the scikit-learn SVM classifier is a 2D array: each row represents one image, and every row must have the same number of elements.
2. In opencv-python, the SIFT algorithm returns a list of keypoints together with a NumPy array of descriptors, and the number of keypoints differs from image to image.
So my question is:
How can I convert the SIFT features into the input format that the SVM classifier expects? Can you help me?
Update 1:
Thanks to pyan for the advice. I have adapted my plan as follows:
1. Get the SIFT feature vectors from each image.
2. Perform k-means clustering across all vectors.
3. Create a feature dictionary, aka codebook, based on the cluster centers.
4. Re-represent each image based on the feature dictionary, so that every image ends up with a representation of the same size.
5. Train and evaluate my SVM classifier.
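The five steps above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the helper names (`sift_descriptors`, `build_codebook`, `encode`) are made up for illustration, `cv2.SIFT_create` assumes a recent opencv-python build (older builds exposed SIFT as `cv2.xfeatures2d.SIFT_create`), the codebook size of 8 is arbitrary, and random arrays stand in for real SIFT output so that the snippet runs without any image files.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC


def sift_descriptors(image_path):
    """Step 1: return an (n_keypoints, 128) array of SIFT descriptors for one image."""
    import cv2  # imported lazily so the rest of the sketch runs without OpenCV
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)
    _, desc = cv2.SIFT_create().detectAndCompute(gray, None)
    # detectAndCompute returns None for the descriptors when no keypoints are found
    return desc if desc is not None else np.empty((0, 128), dtype=np.float32)


def build_codebook(descriptor_list, n_words, seed=0):
    """Steps 2-3: cluster all descriptors; the cluster centers form the codebook."""
    all_desc = np.vstack(descriptor_list)
    return KMeans(n_clusters=n_words, n_init=10, random_state=seed).fit(all_desc)


def encode(descriptors, codebook):
    """Step 4: fixed-length, L1-normalized histogram of visual-word counts."""
    n_words = codebook.n_clusters
    if len(descriptors) == 0:
        return np.zeros(n_words)
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=n_words).astype(float)
    return hist / hist.sum()


# Synthetic stand-in for real SIFT output: every "image" has a different
# number of 128-dimensional descriptors, just like real images do.
rng = np.random.default_rng(0)
train_desc = [
    rng.normal(loc=label * 5.0, size=(rng.integers(20, 40), 128)).astype(np.float32)
    for label in (0, 1)
    for _ in range(5)
]
train_labels = np.array([0] * 5 + [1] * 5)

codebook = build_codebook(train_desc, n_words=8)
X_train = np.array([encode(d, codebook) for d in train_desc])  # shape (10, 8)

# Step 5: every row now has the same length, so the SVM accepts it.
clf = SVC(kernel="rbf").fit(X_train, train_labels)
```

With real data, `train_desc` would be built as `[sift_descriptors(p) for p in image_paths]`, and a new image would be encoded with the same fitted `codebook` before calling `clf.predict`.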
Update 2:
I collected the SIFT feature vectors of all images into a single array of shape (x, 128), which is very large, and now I need to cluster it.
The problem is:
If I use k-means, the number of clusters must be set in advance, and I do not know how to choose the optimal value. If I don't use k-means, which algorithm would be suitable for this?
Note: I want to use scikit-learn to perform the clustering.
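One possible way to approach both the array size and the unknown number of clusters, sketched under assumptions: `MiniBatchKMeans` from scikit-learn is fitted on a random subsample for a handful of candidate values of k, and the candidates are then compared. The function name `fit_codebooks`, the candidate values, the sample size, and the random stand-in data are all hypothetical; inertia is printed only as a rough indicator, and the accuracy of the final SVM on held-out images is probably the more meaningful criterion.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def fit_codebooks(descriptors, candidate_ks, sample_size=100_000, seed=0):
    """Fit one MiniBatchKMeans codebook per candidate k on a random subsample."""
    rng = np.random.default_rng(seed)
    if len(descriptors) > sample_size:
        # Subsample rows so that clustering stays tractable on a huge array
        idx = rng.choice(len(descriptors), size=sample_size, replace=False)
        descriptors = descriptors[idx]
    models = {}
    for k in candidate_ks:
        models[k] = MiniBatchKMeans(
            n_clusters=k, batch_size=1024, n_init=3, random_state=seed
        ).fit(descriptors)
    return models


# Random stand-in for the real (x, 128) array of SIFT vectors
rng = np.random.default_rng(0)
descriptors = rng.random((2000, 128)).astype(np.float32)

models = fit_codebooks(descriptors, candidate_ks=(8, 16, 32))
for k, model in models.items():
    print(k, model.inertia_)  # lower inertia means tighter clusters
```

Each fitted model in `models` can be used as a codebook through its `predict` method, so the downstream SVM can be trained once per candidate k and the best-scoring k kept.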
My proposal:
1. Run DBSCAN clustering on the vectors, which gives me the number of labels (label_size) and the label of each vector.
2. Because DBSCAN in scikit-learn cannot predict labels for new data, train a new classifier A on the DBSCAN result.
3. Classifier A then acts as the codebook: I can use it to label the SIFT vectors of every image. After that, each image can be re-represented as a fixed-length vector.
4. Based on the above work, I can train my final classifier B.
Note: to predict a new image, its SIFT vectors must first be transformed by classifier A into the vector that classifier B takes as input.
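A minimal sketch of what this proposal could look like, with several assumptions: a 1-nearest-neighbor classifier stands in for classifier A (any scikit-learn classifier could be used), the `eps` and `min_samples` values are tuned to the synthetic data and would need adjusting for real SIFT vectors, points that DBSCAN marks as noise (label -1) are dropped before training, and the function name `encode` is made up for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the stacked SIFT vectors: three tight, well-separated groups
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 128))
descriptors = np.vstack([c + rng.normal(scale=0.1, size=(50, 128)) for c in centers])

# Step 1: DBSCAN assigns a label to every vector; -1 marks noise
labels = DBSCAN(eps=3.0, min_samples=5).fit(descriptors).labels_
keep = labels != -1
n_words = len(set(labels[keep]))

# Step 2: classifier A learns to reproduce the DBSCAN labels for unseen vectors
classifier_a = KNeighborsClassifier(n_neighbors=1).fit(descriptors[keep], labels[keep])


def encode(image_descriptors):
    """Step 3: histogram of the labels that classifier A assigns to one image."""
    if len(image_descriptors) == 0:
        return np.zeros(n_words)
    words = classifier_a.predict(image_descriptors)
    hist = np.bincount(words, minlength=n_words).astype(float)
    return hist / hist.sum()


# A new "image" with 7 descriptors becomes a vector of length n_words,
# which is the input format for classifier B (step 4).
new_image = centers[1] + rng.normal(scale=0.1, size=(7, 128))
vec = encode(new_image)
```

Classifier B would then be trained on the rows produced by `encode` for every training image, exactly as in the k-means variant.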
Can you give me some advice?