How do I scale an image up to 300 DPI?

The accepted answer to the question "C++ Library for Image Recognition: Images containing words in a line" recommended that you:

  • Upsize/downsize the input image to 300 DPI.

How would I do this ... I got the impression that DPI is for monitors, not image formats.

+4
5 answers

I think the more accurate term here is resampling. You want the pixel resolution to be high enough to support accurate OCR. Font size (for example, in points) is usually measured in units of length, not pixels. Since 72 points = 1 inch, we need 300/72 pixels per point for a resolution of 300 DPI ("pixels per inch"). This means that a typical 12-point font has a height (or, more precisely, a baseline-to-baseline distance in single-spaced text) of 50 pixels.
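As a quick check of that arithmetic, here is a minimal Python sketch. The function name points_to_pixels is only illustrative and does not come from any library.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def points_to_pixels(points: float, dpi: float) -> float:
    """Convert a length in points to pixels at the given resolution."""
    return points * dpi / POINTS_PER_INCH


# Line height of a 12-point font at 300 DPI
print(points_to_pixels(12, 300))  # 50.0
```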

Ideally, your source documents should be scanned at a resolution appropriate for the font size, so that the text in the image is about 50 pixels tall. If the resolution is too high or too low, you can easily rescale the image using a graphics program (such as GIMP). You can also do this programmatically through a graphics library such as ImageMagick, which has interfaces for many programming languages.
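Whichever tool does the actual resampling, it needs the target pixel dimensions. A rough sketch of that calculation follows, assuming you know the resolution the image was scanned at; rescaled_size is a hypothetical helper, not part of GIMP or ImageMagick.

```python
def rescaled_size(width_px, height_px, source_dpi, target_dpi=300):
    """Return the (width, height) in pixels after resampling to target_dpi."""
    factor = target_dpi / source_dpi
    return round(width_px * factor), round(height_px * factor)


# A US Letter page (8.5 x 11 in) scanned at 150 DPI is 1275 x 1650 pixels.
print(rescaled_size(1275, 1650, source_dpi=150))  # (2550, 3300)
```

The resulting size would then be handed to the resize routine of whatever graphics library you use.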

+2

DPI makes sense whenever you relate an image in pixels to a physical device or a physical size. In the case of OCR, this usually means the scanning resolution, that is, how many pixels you get for every inch of the scan. A 12-point font is designed to print at 12/72 inch per line, and an uppercase character might fill about 80% of that; thus it would be approximately 40 pixels tall when scanned at 300 DPI.
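To make the relationship concrete, here is a small sketch. Both function names are made up for illustration, and the 0.8 cap-height ratio is the rough estimate from the paragraph above, not a property of any particular font.

```python
def scan_dpi(pixels: int, inches: float) -> float:
    """Effective resolution: pixels captured per inch of the original."""
    return pixels / inches


def cap_height_px(font_points: float, dpi: float, cap_ratio: float = 0.8) -> float:
    """Approximate height in pixels of an uppercase letter."""
    return font_points * dpi * cap_ratio / 72


# An 8.5-inch-wide page that came out 2550 pixels wide was scanned at 300 DPI.
dpi = scan_dpi(2550, 8.5)
print(dpi)                            # 300.0
print(round(cap_height_px(12, dpi)))  # 40
```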

Many image formats have a DPI value recorded in them. If the image was scanned, this should be the actual setting from the scanner. If it comes from a digital camera, it always says 72 DPI, which is the default value specified by the EXIF specification; this is because the camera cannot know the physical size of the original subject. When you create an image using an image processing program, you may be able to set the DPI to any arbitrary value. This is a convenience for you to indicate how you intend to use the final image, and it has nothing to do with the detail contained in the image.

Here is the previous question that asks for the details of resizing an image: How do I scale image quality?

+2

OCR software is typically designed to work with "normal" font sizes. From the point of view of the image, this means that it will look for letters, possibly in the range from 30 to 100 pixels tall. Images with much higher resolution will produce letters that appear too large for the OCR software to process efficiently. Similarly, lower resolution images will not provide enough pixels for the software to recognize letters.
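A rough way to act on that is sketched below. The 30 to 100 pixel bounds are the tentative figures from this answer, the 50 pixel target is an assumption borrowed from the 12-point-at-300-DPI case, and resize_factor is a hypothetical helper; real OCR engines may have different limits.

```python
MIN_LETTER_PX = 30   # lower bound suggested above (approximate)
MAX_LETTER_PX = 100  # upper bound suggested above (approximate)


def resize_factor(letter_height_px: float, target_px: float = 50) -> float:
    """Return 1.0 if letters are already in range, else the scale
    factor that would bring them to target_px."""
    if MIN_LETTER_PX <= letter_height_px <= MAX_LETTER_PX:
        return 1.0
    return target_px / letter_height_px


print(resize_factor(40))   # 1.0  (already in range)
print(resize_factor(200))  # 0.25 (letters too large, shrink the image)
```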

+1

"How do I do this ... I got the impression that dpi is for monitors, not image formats."

DPI stands for dots per inch. What does this have to do with monitors? Well, each pixel is made up of three RGB subpixels. The higher the DPI, the more detail you squeeze into the same space.

DPI is a useful measurement for displays and prints, but it means nothing useful, in fact nothing at all, for the image formats themselves.

The reason DPI is stored inside some formats is to instruct devices to display at that resolution, but from what I understand, practically everything ignores this instruction and does whatever it can to optimize the image for a specific output.

You can change 72 dpi to 1 dpi or 6000 dpi in the image format, and it will make no difference on a monitor. "Upsize / downsize to 300 dpi" does not make sense. Resampling does not change the DPI either. Try it in Photoshop: uncheck "Resample" when changing the DPI, and you will not see any difference. The image will NOT get bigger or smaller.
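To illustrate the point, here is a toy model in Python. ImageInfo and retag are hypothetical names that stand in for an image file's header; this is not how Photoshop or any real library represents images.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ImageInfo:
    width_px: int
    height_px: int
    dpi: float  # metadata tag only; says nothing about the pixel data

    def print_width_in(self) -> float:
        """Width in inches a printer would use if it honored the tag."""
        return self.width_px / self.dpi


def retag(img: ImageInfo, new_dpi: float) -> ImageInfo:
    """Change only the DPI tag, leaving the pixel dimensions untouched."""
    return replace(img, dpi=new_dpi)


original = ImageInfo(3000, 2000, dpi=72)
retagged = retag(original, 300)
print(retagged.width_px, retagged.height_px)  # 3000 2000
print(retagged.print_width_in())              # 10.0
```

Only the implied print size changes; the pixel count, which is all a monitor or an OCR engine sees, stays the same.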

DPI is absolutely pointless for image formats, IMO.

+1

If your goal is OCR, DPI makes sense as the number of dots in your image for every inch of the original scanned document. If your DPI is too low, the information is gone forever, and even bicubic interpolation does not do a brilliant job of recovering it. If your DPI is too high, it's easy to throw bits away.

To do the job, I am a big fan of the netpbm/pbmplus toolkit; the tool to start with is pnmscale, although if you have a bitmap, you will want to consider related tools like pbmreduce.

0
