I want to detect objects inside cells in microscopy images. I have a lot of annotated images (50,000 images with an object and 500,000 without an object).
So far I have tried extracting features using HOG and classifying with logistic regression and LinearSVC. I tried several HOG parameters and color spaces (RGB, HSV, LAB), but I do not see much difference; the prediction accuracy is about 70%.
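For reference, here is a minimal sketch of this kind of pipeline. It assumes 64x64 grayscale crops and uses synthetic placeholder images; the image size, the HOG parameters, and the `make_image` helper are illustrative only and are not my actual data or settings.

```python
import numpy as np
from skimage.feature import hog
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)


def make_image(with_object):
    """Hypothetical placeholder: 64x64 noise image, optionally with a bright square."""
    img = rng.normal(0.5, 0.05, size=(64, 64))
    if with_object:
        r, c = rng.integers(8, 40, size=2)
        img[r:r + 16, c:c + 16] += 0.5  # the square stands in for an object
    return img


def hog_features(img):
    # 9 orientations, 8x8-pixel cells, 2x2-cell blocks (assumed values)
    return hog(img, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))


# Alternate negative (0) and positive (1) examples.
labels = np.array([0, 1] * 100)
X = np.array([hog_features(make_image(bool(y))) for y in labels])

# Scale the features, then fit a linear SVM; class_weight="balanced"
# reweights the classes, which matters when negatives far outnumber positives.
clf = make_pipeline(StandardScaler(), LinearSVC(class_weight="balanced"))
clf.fit(X[:150], labels[:150])

accuracy = clf.score(X[150:], labels[150:])
print(f"held-out accuracy: {accuracy:.2f}")
```

With real images, `make_image` would be replaced by loading and resizing the annotated crops.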
I have a few questions. How many images should be used to train the classifier? How many images should be used to validate the predictions?
I tried about 1,000 images for training, which gives me about 55% positive detections, and 5,000, which gives me about 72%. However, the result also depends heavily on the test set; on some test sets 80-90% of the positive images are detected.
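One way I could quantify how much the score depends on the split is stratified k-fold cross-validation, which reports one score per fold. The sketch below assumes synthetic features from `make_classification` as a stand-in for HOG vectors, with roughly the same 1:10 class ratio as my data; the sample count, feature count, and number of folds are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for HOG feature vectors (about 10 negatives per positive).
X, y = make_classification(
    n_samples=2200,
    n_features=50,
    n_informative=10,
    weights=[10 / 11, 1 / 11],
    random_state=0,
)

# Stratified folds keep the positive/negative ratio the same in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = LogisticRegression(class_weight="balanced", max_iter=1000)

# Balanced accuracy averages the recall of both classes, so always
# predicting "no object" does not look like a good score.
scores = cross_val_score(clf, X, y, cv=cv, scoring="balanced_accuracy")
print("per-fold scores:", np.round(scores, 3))
print(f"mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```

A large standard deviation across folds would suggest that the differences between my test sets are mostly sampling noise rather than a real change in the model.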
Here are two examples containing an object and two images without an object:




Another problem is that sometimes images contain several objects:

Should I try to increase the number of training examples? How should I choose images for the training set, just at random? What else could I try?
Any help would be greatly appreciated; I have just started learning machine learning. I am using Python (scikit-image and scikit-learn).
snowflake