Pytesseract does not work with a single-digit image

I have code that uses pytesseract and works perfectly, except when the image I'm trying to recognize contains a single digit from 0 to 9. If the image has only one digit, it doesn't return any result.

This is a sample image I'm working on https://drive.google.com/folderview?id=0B68PDhV5SW8BdFdWYVRwODBVZk0&usp=sharing

And this is the code I use:

    import pytesseract
    from PIL import Image  # needed for Image.open

    varnum = pytesseract.image_to_string(Image.open('images/table/img.jpg'))
    varnum = float(varnum)
    print varnum

Thanks!!!!

With this code I can read all the numbers:

    import time

    import pytesseract
    from PIL import Image

    start_time = time.clock()
    y = pytesseract.image_to_string(Image.open('images/table/1.jpg'), config='-psm 10000')
    x = pytesseract.image_to_string(Image.open('images/table/1.jpg'), config='-psm 10000')
    print y
    print x
    y = pytesseract.image_to_string(Image.open('images/table/68.5.jpg'), config='-psm 10000')
    x = pytesseract.image_to_string(Image.open('images/table/68.5.jpg'), config='-psm 10000')
    print y
    print x
    print time.clock() - start_time, "seconds"

Result:

    >>>
    1
    1
    68.5
    68.5
    0.485644155358 seconds
    >>>
1 answer

You need to set the page segmentation mode so that Tesseract reads a single character or digit.

From the tesseract-ocr manual (Tesseract is what pytesseract uses internally), you can set the page segmentation mode with:

-psm N

Set Tesseract to only run a subset of layout analysis and assume a certain form of image. Options for N include:

10 = Treat the image as a single character.

So you should set the -psm parameter to 10. For example:

    varnum = pytesseract.image_to_string(Image.open('images/table/img.jpg'), config='-psm 10')
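If it helps, here is a minimal sketch that wraps the call above and also guards the `float()` conversion from the question, which raises `ValueError` when the OCR returns an empty string. The helper names (`build_config`, `parse_number`, `read_number`) are hypothetical and not part of pytesseract. The `digits_only` option assumes your Tesseract build honours the `tessedit_char_whitelist` variable, and the `legacy_flag` switch assumes that newer Tesseract releases want `--psm` instead of the single-dash `-psm` shown above.

```python
def build_config(psm=10, digits_only=False, legacy_flag=True):
    """Build the config string passed to pytesseract.image_to_string.

    legacy_flag=True emits the single-dash `-psm` form used in this answer;
    legacy_flag=False emits `--psm` (assumed to be what newer releases expect).
    """
    flag = "-psm" if legacy_flag else "--psm"
    parts = ["{} {}".format(flag, psm)]
    if digits_only:
        # Assumption: restricts the characters Tesseract may return.
        parts.append("-c tessedit_char_whitelist=0123456789.")
    return " ".join(parts)


def parse_number(text):
    """Convert OCR output to float; return None if it is empty or not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def read_number(path, **config_kwargs):
    """OCR one image and return its numeric value, or None if nothing usable came back."""
    # Imported here so the two helpers above can be used without Tesseract installed.
    import pytesseract
    from PIL import Image

    text = pytesseract.image_to_string(
        Image.open(path), config=build_config(**config_kwargs)
    )
    return parse_number(text)
```

Usage would look like `varnum = read_number('images/table/img.jpg')`, followed by a check for `None` before using the value. For images with several characters, such as 68.5, a mode like 7 (treat the image as a single text line) may be a better fit than 10, so `read_number(path, psm=7)` is worth trying.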