I am trying to work with the Tesseract API. I am new to image processing and have been struggling with this for the last few days. I tried simple algorithms and achieved almost 70% accuracy, but I want the accuracy to be around 90%. The problem with the images is that they are at 72 DPI. I also tried increasing the resolution (a sketch of what I did is below), but did not get good results. The images I'm trying to process are attached.
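The upscaling I tried is roughly the following (a minimal OpenCV sketch; the 3x scale factor and the file names are just example values, not necessarily what I used):

    #include <opencv2/opencv.hpp>

    int main() {
        // Load the 72 DPI ticket image as grayscale.
        cv::Mat src = cv::imread("ticket.jpg", cv::IMREAD_GRAYSCALE);
        if (src.empty()) return 1;

        // Upscale ~3x with cubic interpolation so the text size is closer
        // to what Tesseract normally expects (around 300 DPI).
        cv::Mat upscaled;
        cv::resize(src, upscaled, cv::Size(), 3.0, 3.0, cv::INTER_CUBIC);

        cv::imwrite("ticket_upscaled.png", upscaled);
        return 0;
    }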
Any help would be appreciated, and I'm sorry if I'm asking about something very simple.



EDIT
I forgot to mention that I need to do all the processing and recognition within 2-2.5 seconds on Linux, and the method for detecting text mentioned in this answer takes too long. I would also prefer not to use a command-line solution; I would rather stick with Leptonica or OpenCV.
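For context, the way I currently call Tesseract is essentially the standard C++ API usage with a Leptonica Pix (sketch below; the file name and language are placeholders):

    #include <tesseract/baseapi.h>
    #include <leptonica/allheaders.h>
    #include <cstdio>

    int main() {
        tesseract::TessBaseAPI api;
        // Initialize with the English traineddata ("eng" is a placeholder).
        if (api.Init(nullptr, "eng")) {
            fprintf(stderr, "Could not initialize Tesseract\n");
            return 1;
        }

        // Load the ticket image with Leptonica and hand it to Tesseract.
        Pix *image = pixRead("ticket.jpg");
        api.SetImage(image);

        // Run recognition and print the result.
        char *text = api.GetUTF8Text();
        printf("%s", text);

        delete[] text;
        pixDestroy(&image);
        api.End();
        return 0;
    }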
You can download most of the images here.
I tried the following things to binarize the tickets, but with no luck. The tickets have
- somewhat poor lighting
- non-text areas
- low resolution
I tried passing the image directly to the Tesseract API, and it gives me roughly 70% good results in about 1 second on average. But I want to increase accuracy while keeping the time constraint in mind. So far I have tried
- image edge detection
- blob analysis
- adaptive-threshold binarization of the ticket (see the sketch after this list)
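The adaptive-threshold step is roughly this (OpenCV sketch; the block size and offset are example values I have been tuning, not final ones):

    #include <opencv2/opencv.hpp>

    int main() {
        cv::Mat gray = cv::imread("ticket.jpg", cv::IMREAD_GRAYSCALE);
        if (gray.empty()) return 1;

        // Light smoothing before thresholding to reduce noise from
        // the uneven lighting on the ticket.
        cv::Mat blurred;
        cv::GaussianBlur(gray, blurred, cv::Size(3, 3), 0);

        // Gaussian adaptive threshold; 31 / 15 are example block size
        // and offset values.
        cv::Mat binary;
        cv::adaptiveThreshold(blurred, binary, 255,
                              cv::ADAPTIVE_THRESH_GAUSSIAN_C,
                              cv::THRESH_BINARY, 31, 15);

        cv::imwrite("ticket_binarized.png", binary);
        return 0;
    }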
Then I passed these binarized images to Tesseract, and the accuracy dropped to 50-60%, even though the binarized images look perfect.