Spatial Pyramid Matching (SPM) for SIFT, then input to SVM in C++

Question

I am trying to classify MRI images of brain tumors as benign or malignant using C++ and OpenCV. I plan to use the bag-of-words (BoW) method after clustering SIFT descriptors with k-means. That is, I will represent each image as a histogram with the entire "codebook"/dictionary on the x-axis and the number of occurrences of each visual word in the image on the y-axis. These histograms will then be the input to my SVM (with an RBF kernel).
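To make the BoW representation concrete, here is a minimal sketch in plain C++ (no OpenCV), assuming the k-means centres are already available as a codebook. The names `Descriptor`, `nearestWord` and `bowHistogram` are made up for illustration, and dividing by the number of descriptors is an added assumption so that images with different numbers of keypoints stay comparable.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// A descriptor or a codebook centre, e.g. a 128-dimensional SIFT vector.
using Descriptor = std::vector<float>;

// Index of the codebook entry closest to d (squared Euclidean distance).
std::size_t nearestWord(const Descriptor& d, const std::vector<Descriptor>& codebook) {
    assert(!codebook.empty());
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t k = 0; k < codebook.size(); ++k) {
        assert(codebook[k].size() == d.size());
        float dist = 0.0f;
        for (std::size_t j = 0; j < d.size(); ++j) {
            const float diff = d[j] - codebook[k][j];
            dist += diff * diff;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = k;
        }
    }
    return best;
}

// BoW histogram: bin k holds the fraction of descriptors assigned to word k.
// (Normalisation is an assumption; drop the division to keep raw counts.)
std::vector<float> bowHistogram(const std::vector<Descriptor>& descriptors,
                                const std::vector<Descriptor>& codebook) {
    std::vector<float> hist(codebook.size(), 0.0f);
    for (const Descriptor& d : descriptors) {
        hist[nearestWord(d, codebook)] += 1.0f;
    }
    if (!descriptors.empty()) {
        for (float& h : hist) {
            h /= static_cast<float>(descriptors.size());
        }
    }
    return hist;
}
```

Each image then yields one vector of length K (the codebook size), which is what would be passed to the SVM as a feature vector.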

However, the disadvantage of BoW is that it ignores the spatial information of the descriptors in the image. Someone suggested using SPM instead. I read about it and came across this link, which gives the following steps:

Compute K visual words from the training set and map every local feature to its visual word.
For each image, initialize K multi-resolution coordinate histograms to zero. Each coordinate histogram consists of L levels, and each level i has 4^i cells that evenly partition the image.
For each local feature in the image (say its visual word ID is k), pick the k-th coordinate histogram and add one count to each of the L corresponding cells in this histogram, according to the coordinates of the local feature. The L cells are the cells that the local feature falls into at the L different resolutions.
Concatenate the K multi-resolution coordinate histograms to form the final "long" histogram of the image. When concatenating, the k-th histogram is weighted by the probability of the k-th visual word.
To compute the kernel value over two images, sum all the cells of the intersection of their "long" histograms.
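The steps above can be sketched roughly as follows in plain C++. This is only an illustration under some assumptions: the features have already been assigned a visual-word ID, the `Feature` struct and all function names are hypothetical, and the per-word weighting from the fourth step is left out because the quoted text does not define how that probability is computed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One quantised local feature: pixel position plus visual-word ID in [0, K).
struct Feature {
    float x;
    float y;
    int word;
};

// Number of cells in all levels below `level` (1 + 4 + 16 + ...).
std::size_t levelOffset(int level) {
    std::size_t offset = 0;
    for (int l = 0; l < level; ++l) {
        offset += std::size_t(1) << (2 * l);  // 4^l cells at level l
    }
    return offset;
}

// Builds the "long" histogram of one image.
// Layout: K blocks (one per visual word), each holding levels 0..L-1 back to back.
std::vector<float> spatialPyramidHistogram(const std::vector<Feature>& features,
                                           float width, float height, int K, int L) {
    assert(width > 0 && height > 0 && K > 0 && L > 0);
    const std::size_t cellsPerWord = levelOffset(L);
    std::vector<float> hist(static_cast<std::size_t>(K) * cellsPerWord, 0.0f);
    for (const Feature& f : features) {
        assert(f.word >= 0 && f.word < K);
        for (int level = 0; level < L; ++level) {
            const int side = 1 << level;  // grid is side x side cells
            // Clamp so that points on the image border stay inside the grid.
            const int cx = std::min(side - 1, std::max(0, static_cast<int>(f.x / width * side)));
            const int cy = std::min(side - 1, std::max(0, static_cast<int>(f.y / height * side)));
            const std::size_t idx = static_cast<std::size_t>(f.word) * cellsPerWord +
                                    levelOffset(level) +
                                    static_cast<std::size_t>(cy * side + cx);
            hist[idx] += 1.0f;
        }
    }
    return hist;
}

// Histogram intersection of two "long" histograms: the sum of bin-wise minima.
float intersectionKernel(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += std::min(a[i], b[i]);
    }
    return sum;
}
```

Each keypoint contributes one count per level, so the position of a feature is encoded by which cell of the finer grids receives the count.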

Now I have the following questions:

What is a coordinate histogram? Doesn't a histogram just show the counts for each bin along the x-axis? How would it provide information about the coordinates of a point?
How would I calculate the probability of the k-th visual word?
What is the use of the "kernel value" that I get? How will I use it as input to the SVM? If I understand correctly, the kernel value is used in the testing phase and not in the training phase? If so, how will I train my SVM?
Or do you think I don't need to bother with spatial information and should just stick to plain BoW for my situation (benign vs. malignant tumors)?

Someone please help this poor little student. If you do, you will have my eternal gratitude. If you need any clarification, please feel free to ask.

c++ image-processing opencv machine-learning

noobalert Aug 13 '15 at 20:18

1 answer

Bharat · Accepted Answer · 2015-08-14T19:31:39+0000

Here is a link to the original paper: http://www.csd.uwo.ca/~olga/Courses/Fall2014/CS9840/Papers/lazebnikcvpr06b.pdf

MATLAB code is available here: http://web.engr.illinois.edu/~slazebni/research/SpatialPyramid.zip

The coordinate histogram (mentioned in your post) is just a histogram computed over a subregion of the image. These slides illustrate it visually: http://web.engr.illinois.edu/~slazebni/slides/ima_poster.pdf

So you have several histograms, one for each subregion of the image. The probability (or the count) depends on the SIFT points that fall in that subregion.

I think you need to define your pyramid kernel as shown in the slides.
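As a rough sketch of what defining the kernel could look like, the code below computes a weighted sum of per-level histogram intersections and fills a kernel (Gram) matrix over the training images. The level weights follow my reading of the linked paper (1/2^L for level 0 and 1/2^(L-l+1) for level l >= 1, with L the finest level), so please check them against the paper. The `Pyramid` type and all function names are hypothetical, and the Gram matrix is only useful if your SVM implementation accepts a precomputed kernel (LIBSVM documents such an option).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// pyramid[l] is the level-l histogram of one image: for every visual word,
// the counts in each of the 4^l grid cells, flattened into one vector.
using Pyramid = std::vector<std::vector<float>>;

// Histogram intersection: sum of bin-wise minima.
float intersection(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += std::min(a[i], b[i]);
    }
    return sum;
}

// Level weight (assumed from the paper): finer levels count more.
float levelWeight(int level, int finest) {
    const int exponent = (level == 0) ? finest : finest - level + 1;
    return std::ldexp(1.0f, -exponent);  // 2^(-exponent)
}

// Pyramid match kernel between two images.
float pyramidMatchKernel(const Pyramid& a, const Pyramid& b) {
    assert(!a.empty() && a.size() == b.size());
    const int finest = static_cast<int>(a.size()) - 1;
    float k = 0.0f;
    for (int l = 0; l <= finest; ++l) {
        k += levelWeight(l, finest) * intersection(a[l], b[l]);
    }
    return k;
}

// Kernel matrix over all training images: G[i][j] = K(image i, image j).
std::vector<std::vector<float>> gramMatrix(const std::vector<Pyramid>& images) {
    const std::size_t n = images.size();
    std::vector<std::vector<float>> G(n, std::vector<float>(n, 0.0f));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i; j < n; ++j) {
            G[i][j] = G[j][i] = pyramidMatchKernel(images[i], images[j]);
        }
    }
    return G;
}
```

With a precomputed kernel, training uses the matrix of kernel values between all pairs of training images, and prediction uses the kernel values between the test image and the training images, so the kernel is needed in both phases. Since min(w*a, w*b) = w*min(a, b) for w >= 0, another option is to multiply each level of the histogram by its weight up front and use a plain histogram intersection kernel on the concatenated vector.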

A convolutional neural network may be better suited to your task if you have enough training samples. You could take a look at Torch or Caffe.
