Is it possible to train tesseract for characters without a font?

I am curious how I can more reliably recognize the value and suit of playing cards. Here are two examples:

[two example card images]

There may be some noise in the images, but I have a large data set that I could use for training (about 10 thousand images, covering all values and suits).

I can reliably recognize images that I have manually classified when there is an exact match, using a hashing method. But since I hash images based on their contents, the slightest noise changes the hash and causes the image to be treated as unknown. This is what I am hoping to address with further automation.
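To illustrate the failure mode, here is a minimal sketch of this kind of exact-match hashing (MD5 over the raw file bytes and the file names are assumptions for illustration, not necessarily my exact method):

```python
import hashlib

def image_hash(path):
    # Digest of the raw file bytes: any change at all, including a
    # single noisy pixel, produces a completely different hash.
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

# Hashes of manually classified reference images (paths are placeholders).
known = {image_hash("references/4c.png"): "4c"}

def classify(path):
    # Exact-match lookup; any noise pushes the image to "unknown".
    return known.get(image_hash(path), "unknown")
```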

I looked at the tesseract 3.05 training documentation: https://github.com/tesseract-ocr/tesseract/wiki/Training-Tesseract#automated-method

Is it possible to train tesseract with images rather than fonts? Or could I use it to recognize the suits on these cards?

I was hoping I could specify that all the images in a given folder correspond to 4c (for example, the sample images above), and that tesseract would see the similarity in any future instance of that image (regardless of noise level) and read it as 4c as well. Is that possible? Does anyone have experience with this?

image ocr tesseract macos
1 answer

This was my solution, moving away from tesseract, until someone proves there is a better way. My setup: Caffe with NVIDIA DIGITS.

Getting these installed and working was the most difficult part. Then I used my dataset to train a new Caffe network. I prepared my dataset in a one-level-deep folder structure:

./card
./card/2c ./card/2d ./card/2h ./card/2s
./card/3c ./card/3d ./card/3h ./card/3s
./card/4c ./card/4d ./card/4h ./card/4s
./card/5c ./card/5d ./card/5h ./card/5s
./card/6c ./card/6d ./card/6h ./card/6s
./card/7c ./card/7d ./card/7h ./card/7s
./card/8c ./card/8d ./card/8h ./card/8s
./card/9c ./card/9d ./card/9h ./card/9s
./card/Tc ./card/Td ./card/Th ./card/Ts
./card/Jc ./card/Jd ./card/Jh ./card/Js
./card/Qc ./card/Qd ./card/Qh ./card/Qs
./card/Kc ./card/Kd ./card/Kh ./card/Ks
./card/Ac ./card/Ad ./card/Ah ./card/As
./card/_noise
./card/_table
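For anyone reproducing this, a small script along these lines can sort labeled images into that layout (the labels.csv input format and the raw folder name are assumptions; DIGITS only cares that there is one subfolder per class):

```python
import csv
import os
import shutil

# Assumed input: labels.csv with rows like "img_0001.png,4c",
# plus the unsorted images in ./raw (both names are placeholders).
with open("labels.csv") as f:
    for filename, label in csv.reader(f):
        dest = os.path.join("card", label)  # e.g. ./card/4c
        os.makedirs(dest, exist_ok=True)
        shutil.copy(os.path.join("raw", filename), dest)
```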

Within DIGITS, I did the following:

  • Opened the Datasets tab
  • New Dataset > Images
  • Classification
  • Pointed it at the card folder, e.g. /path/to/card
  • Set the validation % to 13.0%, based on the discussion here: https://stackoverflow.com/a/4648/ (see the split sketch after this list)
  • After the dataset was created, I opened the Models tab
  • Selected the new dataset
  • Selected GoogLeNet under Standard Networks and left it to train
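DIGITS performs that 13% hold-out itself once you set the percentage; for reference, the split it makes is equivalent to something like this sketch:

```python
import os
import random

random.seed(0)  # make the split reproducible
for label in sorted(os.listdir("card")):
    files = os.listdir(os.path.join("card", label))
    random.shuffle(files)
    n_val = int(len(files) * 0.13)  # the 13.0% validation share
    val, train = files[:n_val], files[n_val:]
    print(f"{label}: {len(train)} train / {len(val)} val")
```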

I did this several times, each time with new images added to the dataset. Each training run took 6-10 hours, but at this point I can use my caffemodel to programmatically evaluate each image using the logic in this example: https://github.com/BVLC/caffe/blob/master/examples/cpp_classification/classification.cpp
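The linked example is C++; a rough Python equivalent using pycaffe is sketched below. The file names are placeholders for whatever DIGITS produced for your model, and mean subtraction is omitted for brevity:

```python
import caffe

# deploy.prototxt, the .caffemodel snapshot, and labels.txt come from
# the DIGITS model download; these paths are placeholders.
net = caffe.Net("deploy.prototxt", "snapshot.caffemodel", caffe.TEST)
labels = [line.strip() for line in open("labels.txt")]

# Preprocess to match training: HWC -> CHW, [0,1] -> [0,255], RGB -> BGR.
# (Subtracting the dataset mean from mean.binaryproto is skipped here.)
transformer = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
transformer.set_transpose("data", (2, 0, 1))
transformer.set_raw_scale("data", 255)
transformer.set_channel_swap("data", (2, 1, 0))

image = caffe.io.load_image("card.png")  # placeholder input
net.blobs["data"].data[...] = transformer.preprocess("data", image)
probs = net.forward()[net.outputs[0]][0]

best = probs.argmax()
# The >90% confidence rule from above: lower scores are treated as suspect.
verdict = labels[best] if probs[best] > 0.9 else "uncertain"
print(verdict, labels[best], float(probs[best]))
```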

The results are either a card (2c, 7h, etc.), noise, or table. Any prediction with a confidence above 90% is most likely correct. The last run correctly recognized 300 of 400 images, with three errors. I keep adding new images to the dataset and retraining the existing model, further improving its accuracy. Hopefully this is valuable to others!

While I have only given the high-level steps here, all of this was done with huge thanks to David Humphrey and his GitHub post. I really recommend reading it and trying it out if you are interested in more detail: https://github.com/humphd/have-fun-with-machine-learning
