How to use SIFT / SURF as features for a machine learning algorithm?

I'm working on a problem of automatic image annotation in which I am trying to associate tags with images. For this, I am trying to use SIFT features for training. But the problem is that the SIFT features of an image are a set of keypoints, each of which has a 2-D array, and the number of keypoints is also huge. How many of them should I give to my learning algorithm, and how, given that it typically accepts only 1-D features?

+4
image-processing opencv machine-learning feature-extraction sift
4 answers

You can represent each SIFT descriptor as a “visual word”, which is a single number, and use the visual words as SVM input; I think this is what you need. This is usually done with k-means clustering.

This method is called bag of words and is described in this article.

A brief overview of the method.
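To make the idea concrete, here is a minimal sketch of turning a variable number of descriptors per image into one fixed-length vector. It is only an illustration under stated assumptions: the descriptors are random stand-ins rather than real SIFT output, the vocabulary size of 16 is arbitrary, and the helper names (`kmeans`, `assign_words`, `bow_histogram`) are made up for this example. With real data, each `(n, 128)` array would come from a SIFT extractor instead.

```python
import numpy as np

def assign_words(descriptors, centers):
    """Index of the nearest center (the 'visual word') for each descriptor."""
    # Squared Euclidean distances, shape (n_descriptors, k)
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the (k, d) array of cluster centers."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign_words(data, centers)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old center if a cluster goes empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Fixed-length, L1-normalized histogram of visual words for one image."""
    words = assign_words(descriptors, centers)
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    return hist / max(hist.sum(), 1.0)

# Stand-in data (assumption): random vectors instead of real SIFT descriptors.
# Each image has a different number of keypoints, each with a 128-D descriptor.
rng = np.random.default_rng(42)
images = [rng.random((n, 128)) for n in (300, 450, 120)]

# Build the vocabulary from all descriptors, then encode every image.
vocab = kmeans(np.vstack(images), k=16)
X = np.array([bow_histogram(d, vocab) for d in images])
print(X.shape)  # (3, 16)
```

Every row of `X` has the same length regardless of how many keypoints the image had, so it can be passed to an SVM or any other classifier that expects 1-D feature vectors.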

+1

You should read the original SIFT paper; it tells you what SIFT is and how to use it. Read chapter 7 and the rest carefully to understand how to use it in practice. Here is the link to the original paper.

+1

You can use the Bag of Words approach, which you can read about in the following post:

http://gilscvblog.wordpress.com/2013/08/23/bag-of-words-models-for-visual-categorization/

+1

SIFT and SURF are invariant feature extractors. Matching their features therefore helps solve many problems.

  • But there is a matching problem, since not all points will be the same in two different images (this also applies to similarity problems). Therefore, you should use only the features that match across images.

  • Another problem is that these algorithms extract so many features that matching them becomes impractical on large datasets.

There is a good solution to these problems called “bag of visual words”.

In https://github.com/dermotte/LIRE the complete bag of visual words approach is fully implemented. Below is the LIRE website.

The code is very simple; if you know bag of visual words, you can also modify it.

After obtaining the visual words, you should use the information retrieval methods used in search engines. By the way, LIRE also includes an information retrieval library called Lucene. You should try LIRE out until you get the full idea, and then implement your own.
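As a rough illustration of what "information retrieval methods" can mean here, the sketch below treats each image as a document of visual words, weights the word counts with TF-IDF, and ranks images by cosine similarity. This is an assumption about one common choice, not LIRE's or Lucene's actual scoring; the smoothed IDF formula, the toy count matrix, and the function names `tfidf` and `rank_by_cosine` are all made up for the example.

```python
import numpy as np

def tfidf(counts):
    """Weight raw visual-word counts (n_images, n_words) by TF-IDF."""
    counts = np.asarray(counts, dtype=float)
    # Term frequency: share of each word within its image
    tf = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1.0)
    # Document frequency: number of images containing each word
    df = (counts > 0).sum(axis=0)
    # Smoothed inverse document frequency (one common variant)
    idf = np.log((1 + len(counts)) / (1 + df)) + 1.0
    return tf * idf

def rank_by_cosine(query_vec, index_vecs):
    """Return (order, sims): indexed images sorted most similar first."""
    q = query_vec / max(np.linalg.norm(query_vec), 1e-12)
    norms = np.maximum(np.linalg.norm(index_vecs, axis=1, keepdims=True), 1e-12)
    sims = (index_vecs / norms) @ q
    return np.argsort(-sims), sims

# Toy visual-word counts for three images and a four-word vocabulary.
counts = np.array([
    [5, 0, 1, 0],
    [0, 4, 0, 2],
    [4, 1, 1, 0],
])

weighted = tfidf(counts)
order, sims = rank_by_cosine(weighted[0], weighted)
print(order)  # [0 2 1]
```

Image 0 is used as the query, so it ranks first; image 2 shares most of its visual words and comes next, while image 1 shares none.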

+1
