Java OCR produces no output

Question


I am using this OCR library, http://sourceforge.net/projects/javaocr/, to detect numbers in an image. I also tried Tesseract and had the same problem: sometimes it didn't work. Java OCR never worked. When I used it, it produced no output except \n.

The image background is completely white and the digits are black. The only artifacts are two lines near the top and bottom borders, which do not touch the characters. The text is aligned like typed text: no handwriting and no skew.

 BufferedImage image2 = ImageIO.read(new File("moneyImage" + ".bmp"));
 ImageManipulator.show(image2, 5);
 OCRScanner scanner = new OCRScanner();
 String items = scanner.scan(image2, 0, 0, 0, 0, null);
 System.out.println(items);

image2 displays correctly, and this example was taken from someone else who published it in this form. I am not doing anything complicated, and I don't see why it shouldn't work. It is a simple grayscale image.

When I run the standalone Java OCR program, it works and prints the correct numbers. I don't know how to extract the characters from within my own Java project, or why it doesn't work there.

My test image:

Also this

 String lastText = null;
 Tesseract instance = Tesseract.getInstance();
 try {
     lastText = instance.doOCR(imageFile);
 } catch (TesseractException ex) {
     Logger.getLogger(ActionAbstraction.class.getName()).log(Level.SEVERE, null, ex);
 }

produces absolutely no output, even when I give it an image of a single digit as extracted by Java OCR. Both libraries seem to work, but neither outputs anything when I run the actual scan.

In addition, I am using TIFF images, and as I said before, character extraction works fine. What doesn't work is the Java code that invokes the scan on the image. I have linked the appropriate libraries (otherwise there would be compiler errors).
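One thing worth ruling out before blaming either OCR library is whether the image was decoded at all. `ImageIO.read` returns `null` rather than throwing when no registered reader recognises the file, and the standard ImageIO TIFF reader only ships with Java 9 and later. The following is a minimal sketch of such a check; the class and method names are made up for illustration, and it says nothing about how Java OCR or Tesseract load images internally.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import javax.imageio.ImageIO;

public class ImageLoadCheck {

    /** Reads an image and fails loudly if ImageIO has no reader for it. */
    static BufferedImage loadOrFail(File file) throws IOException {
        BufferedImage image = ImageIO.read(file);
        if (image == null) {
            // null means "no registered reader could decode this input"
            throw new IOException("No ImageIO reader could decode " + file
                    + "; available formats: "
                    + Arrays.toString(ImageIO.getReaderFormatNames()));
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ImageLoadCheck <image file>");
            return;
        }
        BufferedImage image = loadOrFail(new File(args[0]));
        // Print basic facts so a blank or mis-sized image is easy to spot.
        System.out.println(image.getWidth() + "x" + image.getHeight()
                + ", type=" + image.getType());
    }
}
```

Running it against the file passed to the scanner (for example `moneyImage.bmp`) shows whether a usable `BufferedImage` ever reaches the OCR call.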


java ocr

JamesTR Mar 07 '14 at 11:24


2 answers

monojohnny · Answer 1 · 2014-03-12T15:09:56+0000

Not sure, but aren't you telling the scanner to look only at the top-left corner of your image with this line:

 String items = scanner.scan(image2, 0, 0, 0, 0, null);

Maybe change it to (something like):

 String items = scanner.scan(image2, 0, 0, 80, 20, null);

[Change 80, 20 to the actual width and height of your image. You can probably get Java to do this for you: I think there is a method in the Image class, if I remember correctly.]

I got this (possibly wrong) idea from cloning the source with git:
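As a minimal sketch of that suggestion: `BufferedImage` exposes `getWidth()` and `getHeight()`, so the bounds do not need to be hard-coded. The class and helper names here are made up for illustration, and whether the scanner expects the right and bottom values to be the width and height or one less is an assumption to check against the library.

```java
import java.awt.image.BufferedImage;

public class FullImageRegion {

    /** Returns {left, top, right, bottom} covering the whole image. */
    static int[] fullRegion(BufferedImage image) {
        return new int[] {0, 0, image.getWidth(), image.getHeight()};
    }

    public static void main(String[] args) {
        // Stand-in for the image read from disk in the question.
        BufferedImage image2 = new BufferedImage(80, 20, BufferedImage.TYPE_BYTE_GRAY);
        int[] r = fullRegion(image2);

        // With the javaocr jar on the classpath, the call from the question
        // would then become:
        //   String items = scanner.scan(image2, r[0], r[1], r[2], r[3], null);
        System.out.println(r[0] + ", " + r[1] + ", " + r[2] + ", " + r[3]);
    }
}
```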

git clone git://git.code.sf.net/p/javaocr/source javaocr-source

And in the javaocr-source\core\src\main\java directory, the interface in "java.net.sourceforge.javaocr.ImageScanner.java" declares the "scan" method as follows:

//

 void scan(Image image, DocumentScannerListener listener,
           int left, int top, int right, int bottom);
 }

//

Mahdi El Masaoudi · Answer 2 · 2016-03-02T04:43:39+0000

This is the Javadoc I found for the scan function in the project source:

 /**
  * Scan an image and return the decoded text.
  * @param image The <code>Image</code> to be scanned.
  * @param x1 The leftmost pixel position of the area to be scanned, or
  * <code>0</code> to start scanning at the left boundary of the image.
  * @param y1 The topmost pixel position of the area to be scanned, or
  * <code>0</code> to start scanning at the top boundary of the image.
  * @param x2 The rightmost pixel position of the area to be scanned, or
  * <code>0</code> to stop scanning at the right boundary of the image.
  * @param y2 The bottommost pixel position of the area to be scanned, or
  * <code>0</code> to stop scanning at the bottom boundary of the image.
  * @param acceptableChars An array of <code>CharacterRange</code> objects
  * representing the ranges of characters which are allowed to be decoded,
  * or <code>null</code> to not limit which characters can be decoded.
  * @return The decoded text.
  */

So,

 String items = scanner.scan(image2, 0, 0, 0, 0, null);

seems OK according to the code documentation. However, I tried it, and it does not work. This is one of the worst pieces of documentation I've ever seen.
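For reference, the convention the Javadoc describes (a 0 for x2 or y2 meaning "stop at the image boundary") can be written out as below. This is a hypothetical helper that only illustrates the documented rule; it is not the library's code, and the class and method names are invented. Resolving the bounds yourself and passing them explicitly avoids depending on whether the scanner really honours the 0 shortcut.

```java
public class ScanRegion {

    /**
     * Hypothetical helper: applies the documented "0 means image boundary"
     * rule to x2 and y2, given the image width and height.
     */
    static int[] resolve(int x1, int y1, int x2, int y2, int width, int height) {
        int right = (x2 == 0) ? width : x2;    // 0 -> right boundary
        int bottom = (y2 == 0) ? height : y2;  // 0 -> bottom boundary
        return new int[] {x1, y1, right, bottom};
    }

    public static void main(String[] args) {
        // scan(image2, 0, 0, 0, 0, null) on an 80x20 image would, per the
        // Javadoc, be expected to cover this region:
        int[] r = resolve(0, 0, 0, 0, 80, 20);
        System.out.println(r[0] + ", " + r[1] + ", " + r[2] + ", " + r[3]);
    }
}
```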
