I want to perform a classification task in which I match a given image of an object to one of a list of predefined configurations in which the object can appear (i.e. find the most likely match). To get image descriptors (on which I will run machine learning algorithms), it was suggested that I use SIFT with the VLFeat implementation.
First of all, my main question: I would like to skip SIFT's keypoint detection stage and use SIFT only to compute descriptors at keypoints I supply myself. In the documentation, I saw that it is possible to do just that by calling
[f,d] = vl_sift(I,'frames',fc) ;
where fc specifies the keypoints. My problem is that I want to explicitly specify the bounding box over which the descriptor is computed around each keypoint, but it seems I can only specify a scale parameter, which is still somewhat cryptic to me and does not let me set the bounding box directly. Is there any way to achieve this?
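One way to approximate this, as a minimal sketch: VLFeat's descriptor geometry ties the spatial support to the scale, with roughly 4 spatial bins each about magnif * scale pixels wide (magnif is the 'Magnif' option, default 3), so you can invert that relation to pick a scale from a desired box width. The image path, keypoint location, and boxWidth below are illustrative, not from the original question.

% Sketch: choose the SIFT scale so the descriptor support approximates a
% desired bounding box. Assumes VLFeat's geometry: the descriptor covers
% roughly 4 spatial bins, each magnif * scale pixels wide.
I  = single(rgb2gray(imread('object.png')));  % vl_sift needs single, grayscale
cx = size(I,2) / 2;                   % example keypoint at the image center
cy = size(I,1) / 2;

boxWidth = 64;                        % desired descriptor support, in pixels
magnif   = 3;                         % VLFeat default magnification factor
scale    = boxWidth / (4 * magnif);   % invert: support ~ 4 * magnif * scale

fc = [cx; cy; scale; 0];              % frame: [x; y; scale; orientation]
[f, d] = vl_sift(I, 'Frames', fc, 'Magnif', magnif);

Since the descriptor window is Gaussian-weighted rather than a hard box, this only approximates a bounding box, but it is the handle VLFeat exposes.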
The second question: does setting the scale manually and computing descriptors this way actually make sense (i.e., yield a good descriptor)? Any suggestions for a better way to get descriptors (other SIFT implementations, or descriptors other than SIFT)? I should mention that my object is always the only object in the image, it is centered, the illumination is constant, and it varies only through certain rotations of its internal parts. That is why I thought SIFT would work: as I understand it, it is based on orientation gradients, which would change accordingly when parts of the object rotate.
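One alternative worth considering, given that the object is always centered with constant illumination, is dense SIFT on a fixed grid via VLFeat's vl_dsift, where the bin size sets the support explicitly instead of going through a scale parameter. A minimal sketch; the image path and parameter values are illustrative:

% Sketch: dense SIFT with vl_dsift. Descriptors are computed on a regular
% grid; 'Size' sets the side of each of the 4x4 spatial bins directly, so
% each descriptor's support is about 4 * binSize pixels.
I = single(rgb2gray(imread('object.png')));   % vl_dsift needs single, grayscale

binSize = 8;    % 8-px bins -> roughly 32x32 px descriptor support
step    = 16;   % grid spacing between descriptor centers

[frames, descrs] = vl_dsift(I, 'Size', binSize, 'Step', step);
% descrs is 128 x N (uint8); e.g. stack all columns into one feature vector:
featureVector = double(descrs(:)) / 255;

Note that vl_dsift computes unoriented descriptors, which may actually help here if the rotations of the internal parts are exactly the signal you want the descriptor to pick up.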
Thanks
image-processing computer-vision classification sift
Dan