C# - Tesseract: recognizing spaces between numbers

I am new to Tesseract and am working on a class project in which I need to scan number matrices. I managed to read the numbers from the image file, but I have not yet found out how to recognize the spaces between the numbers. For example, I currently get 14610 for 1 4 6 10.


The code I'm currently using is:

    Bitmap myBmp = new Bitmap(file);
    var image = myBmp;
    var ocr = new Tesseract();
    ocr.SetVariable("tessedit_char_whitelist", "0123456789"); // If digit only
    ocr.Init(@"C:\Users\MuhammadShahroz\Documents\Visual Studio 2013\Projects\ConsoleApplication3\tessdata", "eng", false);
    var results = ocr.DoOCR(image, Rectangle.Empty);
    foreach (Word word in results)
    {
        Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
        mystring = String.Format("{0} ", word.Text);
    }
1 answer

I think you need to set the variable preserve_interword_spaces to 1 (see the documentation).
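As a rough sketch of that suggestion, the variable can be set the same way the question sets the whitelist, before calling Init. This assumes the wrapper from the question (the calls match tessnet2, so the `using tessnet2;` line is a guess) and that the underlying engine knows this variable. As far as I know it was added in Tesseract 3.04, so an older engine may not honor it. The image and tessdata paths are placeholders.

```csharp
using System;
using System.Drawing;
using tessnet2; // assumption: the wrapper used in the question

class Program
{
    static void Main(string[] args)
    {
        // Hypothetical paths: replace with your own image and tessdata folder.
        string imagePath = args.Length > 0 ? args[0] : "matrix.png";
        string tessdataPath = args.Length > 1 ? args[1] : "tessdata";

        using (var image = new Bitmap(imagePath))
        {
            var ocr = new Tesseract();

            // Restrict recognition to digits, as in the question.
            ocr.SetVariable("tessedit_char_whitelist", "0123456789");

            // Ask the engine to keep the spacing between words.
            // Assumption: the engine behind the wrapper supports this variable.
            ocr.SetVariable("preserve_interword_spaces", "1");

            ocr.Init(tessdataPath, "eng", false);

            // Print each recognized word with its confidence value.
            foreach (Word word in ocr.DoOCR(image, Rectangle.Empty))
            {
                Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
            }
        }
    }
}
```

If the output still comes back as a single word such as 14610, the engine is probably not seeing the gaps as word breaks at all, and the variable alone will not help.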
