Symbol / Image Classification Tips

I am working on a project that requires classification of characters and symbols (essentially OCR that has to handle individual ASCII characters as well as symbols such as music notation). I work with vector graphics (Paths and Glyphs in WPF), so the images can be rendered at any resolution and rotation will be negligible. The classifier will have to classify (and possibly learn from) fonts and paths that are not in the training set. Performance is important, although high accuracy takes precedence.

I have looked at some examples of image detection using Emgu CV (a .NET wrapper for OpenCV). However, the examples and tutorials I find deal specifically with image detection, not classification. I do not need to find instances of an image within a larger image, just determine the type of symbol in the image.

There seems to be a wide range of methods that could work, and I'm not sure where to start. Any tips or useful links would be appreciated.

+4
3 answers

You should probably take a look at the paper "Gradient-Based Learning Applied to Document Recognition" (LeCun et al.), although it deals with handwritten letters and digits. You should also read about Shape Context by Belongie and Malik. The keyword you should be searching for is digit / character / shape recognition (not detection, not classification).

+2

If you are using EmguCV, the SURF features example (the StopSign detector) would be a good place to start. Another (possibly complementary) approach is to use the MatchTemplate(..) method.

You wrote: "However, the examples and tutorials I find seem to deal with image detection, not classification. I do not need to find instances of an image within a larger image, just determine the type of symbol in the image."

When you find instances of a symbol in an image, you are actually classifying it. I'm not sure why you think this is not what you need.

    using System.Drawing;
    using Emgu.CV;
    using Emgu.CV.Structure;

    // imgSource, imgTemplate and myThreshold are assumed to be defined elsewhere.
    Image<Gray, float> imgMatch = imgSource.MatchTemplate(
        imgTemplate, Emgu.CV.CvEnum.TM_TYPE.CV_TM_CCOEFF_NORMED);

    double[] min, max;
    Point[] pointMin, pointMax;
    imgMatch.MinMax(out min, out max, out pointMin, out pointMax);

    // max[0] is the score of the best match
    if (max[0] >= (double) myThreshold)
    {
        // Outline the best match in the source image.
        Rectangle rect = new Rectangle(pointMax[0],
            new Size(imgTemplate.Width, imgTemplate.Height));
        imgSource.Draw(rect, new Bgr(Color.Aquamarine), 1);
    }

Here max[0] gives the score of the best match.
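To make the link between matching and classification concrete: if the input image and every template are rendered at the same size, the match result collapses to a single score per template, and classifying means picking the label whose template scores highest. Below is a minimal sketch of that idea in plain C#, without Emgu CV. It assumes each image is a flat array of grayscale intensities, and the names TemplateClassifier, Score and Classify are hypothetical, chosen only for this illustration. The score is the normalized correlation coefficient, the same quantity that the CCOEFF_NORMED mode reports.

```csharp
using System;
using System.Collections.Generic;

public static class TemplateClassifier
{
    // Normalized correlation coefficient between two equally sized grayscale
    // images stored as flat arrays. The result lies in [-1, 1].
    public static double Score(double[] image, double[] template)
    {
        if (image.Length != template.Length)
            throw new ArgumentException("Images must have the same size.");

        double meanI = 0, meanT = 0;
        for (int k = 0; k < image.Length; k++)
        {
            meanI += image[k];
            meanT += template[k];
        }
        meanI /= image.Length;
        meanT /= template.Length;

        double num = 0, varI = 0, varT = 0;
        for (int k = 0; k < image.Length; k++)
        {
            double di = image[k] - meanI;
            double dt = template[k] - meanT;
            num += di * dt;
            varI += di * di;
            varT += dt * dt;
        }

        double denom = Math.Sqrt(varI * varT);
        // A completely flat image has no shape to correlate with.
        return denom == 0 ? 0 : num / denom;
    }

    // Returns the label of the best-scoring template, or null when no
    // template reaches the threshold.
    public static string Classify(double[] image,
        IDictionary<string, double[]> templates, double threshold)
    {
        string best = null;
        double bestScore = threshold;
        foreach (var pair in templates)
        {
            double s = Score(image, pair.Value);
            if (s >= bestScore)
            {
                bestScore = s;
                best = pair.Key;
            }
        }
        return best;
    }
}
```

With one template per symbol this is only as good as the templates; for unseen fonts you would probably need several templates per class or a learned classifier.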

+2

Bring all your images to a standard resolution (appropriately scaled and centered).
Divide the canvas into n square or rectangular blocks.

For each block, you can measure the number of black pixels, or the ratio of black to white pixels in that block, and treat it as a feature.

Now that you can represent the image as a feature vector (each feature computed from a different block), you can use many standard classification algorithms to predict which class the image belongs to.
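The steps above can be sketched in plain C# as follows. This is an illustration under stated assumptions, not a tuned implementation: the image is assumed to be already rasterized into a bool[,] where true means a black pixel, the class and method names (BlockFeatures, Extract, Nearest) are hypothetical, and a 1-nearest-neighbor rule stands in for whichever standard classifier you end up choosing. Scaling and centering are handled by computing the block grid over a square centered on the symbol's bounding box, so the aspect ratio is preserved.

```csharp
using System;
using System.Collections.Generic;

public static class BlockFeatures
{
    // Fraction of black pixels in each cell of a blocks x blocks grid laid
    // over a square centered on the symbol's bounding box.
    public static double[] Extract(bool[,] img, int blocks)
    {
        int rows = img.GetLength(0), cols = img.GetLength(1);
        int top = rows, left = cols, bottom = -1, right = -1;
        for (int y = 0; y < rows; y++)
            for (int x = 0; x < cols; x++)
                if (img[y, x])
                {
                    top = Math.Min(top, y); bottom = Math.Max(bottom, y);
                    left = Math.Min(left, x); right = Math.Max(right, x);
                }

        var features = new double[blocks * blocks];
        if (bottom < 0) return features; // blank image: all-zero vector

        int h = bottom - top + 1, w = right - left + 1;
        int side = Math.Max(h, w);
        int y0 = top - (side - h) / 2;   // square origin, may lie outside the image
        int x0 = left - (side - w) / 2;

        var black = new int[blocks * blocks];
        var total = new int[blocks * blocks];
        for (int y = 0; y < side; y++)
            for (int x = 0; x < side; x++)
            {
                int cell = (y * blocks / side) * blocks + (x * blocks / side);
                total[cell]++;
                int sy = y0 + y, sx = x0 + x;
                // Pixels outside the image count as white.
                if (sy >= 0 && sy < rows && sx >= 0 && sx < cols && img[sy, sx])
                    black[cell]++;
            }

        for (int i = 0; i < features.Length; i++)
            features[i] = total[i] == 0 ? 0 : (double)black[i] / total[i];
        return features;
    }

    // 1-nearest-neighbor on squared Euclidean distance between feature vectors.
    public static string Nearest(double[] query,
        IEnumerable<(string Label, double[] Features)> training)
    {
        string best = null;
        double bestDist = double.MaxValue;
        foreach (var sample in training)
        {
            double d = 0;
            for (int i = 0; i < query.Length; i++)
            {
                double diff = query[i] - sample.Features[i];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; best = sample.Label; }
        }
        return best;
    }
}
```

A typical use would be to call Extract on every training glyph once, keep the labeled vectors in a list, and call Nearest on the vector of each new glyph. The grid size is a trade-off: more blocks capture finer detail but make the features more sensitive to small shifts between fonts.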

Search for "Viola-Jones" to find more sophisticated methods of this type.

+1

Source: https://habr.com/ru/post/1314625/

