How to determine image orientation (text)

My program works with faxes stored as separate raster images.
I wonder if there is a way to automatically determine the page orientation (portrait or landscape) so that the preview image can be shown to the user the right way up (rotating it if necessary).

Any advice is greatly appreciated!

EDIT: Clarification:
When the fax machine receives a multi-page document, it saves each page as a separate TIFF file.
My application has a built-in viewer that displays these files. All files are scaled to A4 format and saved as TIFF (so there is no chance of determining the orientation from the height/width parameters).
By default, my viewer shows images in portrait mode.

What I would like to do is automatically detect when the original document was printed in landscape mode (for example, Excel spreadsheets) and then show a rotated preview to the end user, to speed up the preview process.

Obviously, there are 4 possible orientations of a fax page: portrait or landscape, each with 2 possible rotations.

I'm even more interested in a simplified solution that only detects whether the original document was landscape or portrait (I noticed that most landscape documents need to be rotated clockwise).

EDIT2: Idea
I think this might be one way to do it:
Suppose I draw horizontal and vertical lines across the page and check whether each line crosses any (black) pixel. Then I could compare which type of line (horizontal or vertical) there are more of, and let that decide the page orientation.
What do you think?
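
One possible reading of the idea above, as a minimal sketch: count the scan lines in each direction that cross no black pixel at all. Upright text leaves blank horizontal gaps between text lines, so a page with many more blank rows than blank columns probably has horizontal text. The sketch is in Python rather than C# for brevity; the function names and the synthetic test page are assumptions made for illustration, and real faxes would need noise removal and margin cropping first.

```python
def count_blank_lines(image):
    """Return (blank_rows, blank_cols) for a binary image where 1 = black pixel."""
    blank_rows = sum(1 for row in image if not any(row))
    blank_cols = sum(1 for col in zip(*image) if not any(col))
    return blank_rows, blank_cols


def guess_text_direction(image):
    """Guess the direction in which text lines run, from blank scan lines."""
    blank_rows, blank_cols = count_blank_lines(image)
    if blank_rows == blank_cols:
        return "unknown"
    return "horizontal" if blank_rows > blank_cols else "vertical"


def make_synthetic_page(height=40, width=40):
    """Toy page (hypothetical data): 3 px tall 'text lines' separated by 5 px gaps."""
    return [[1 if (y % 8) < 3 and 2 <= x < width - 2 else 0 for x in range(width)]
            for y in range(height)]


page = make_synthetic_page()
rotated = [list(col) for col in zip(*page[::-1])]  # page rotated 90 degrees clockwise

print(guess_text_direction(page))     # horizontal
print(guess_text_direction(rotated))  # vertical
```

Note that this only separates "text runs horizontally" from "text runs vertically"; it cannot tell an upright page from an upside-down one.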

+7
c# image image-processing bitmap
4 answers

For this you need OCR. Rolling your own OCR would be rather complicated, but maybe there is a library out there that is worth a look? Also, even with good OCR, this will not be a 100% reliable solution.

+2

You can run a fast Fourier transform (FFT) to convert your spatial image into a frequency/angle representation, and then find the angle with the most prominent frequency. It sounds complicated, but it really isn't. It is quite efficient, and it effectively analyzes every possible angle at once, instead of being a hard-coded hack that only works for certain angles. You can find example implementations with search terms such as "Numerical Recipes" and "FFT".
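
A minimal sketch of a restricted version of this, assuming only the two axis directions matter (0 and 90 degrees). Along each axis, the 2-D spectrum equals the 1-D Fourier transform of the corresponding projection profile, so the sketch transforms the row sums and the column sums and compares their strongest periodic component. It is written in Python with a naive O(n^2) DFT to stay self-contained; a real implementation would use an FFT library, and the function names and test page are assumptions for illustration.

```python
import cmath


def dft_magnitudes(signal):
    """Magnitudes of the 1-D discrete Fourier transform (naive O(n^2) version)."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]


def periodic_peak(profile):
    """Strongest non-DC component in the lower half of the spectrum."""
    mags = dft_magnitudes(profile)
    return max(mags[1:len(mags) // 2 + 1])


def dominant_text_direction(image):
    """Compare periodicity of row sums and column sums of a binary image."""
    row_profile = [sum(row) for row in image]        # black pixels per row
    col_profile = [sum(col) for col in zip(*image)]  # black pixels per column
    # Evenly spaced text lines make the profile across them strongly periodic.
    if periodic_peak(row_profile) > periodic_peak(col_profile):
        return "horizontal"
    return "vertical"


# Hypothetical test page: 3 px tall 'text lines' separated by 5 px gaps.
page = [[1 if (y % 8) < 3 and 2 <= x < 38 else 0 for x in range(40)]
        for y in range(40)]
rotated = [list(col) for col in zip(*page[::-1])]  # rotated 90 degrees clockwise

print(dominant_text_direction(page))
print(dominant_text_direction(rotated))
```

Like any spectrum-based test, this finds the direction of the text lines but not which way is up.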

+3

I wonder if there are any text properties that you could use to help you do this.

For example, at a quick glance, text contains many more vertical strokes (l, j, k, m, n, etc.) than horizontal ones, so maybe you can start with that.

But even finding them is not easy: you need to use some kind of edge filter, for example Sobel or Prewitt. Both come in horizontal and vertical versions; see here for more information.

Of course, the vertical and horizontal grid lines of an Excel spreadsheet will be the strongest edges, so you would have to ignore them and look only at the text.
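
As a rough sketch of the filtering step, assuming a binary image stored as a list of rows: apply the two Sobel kernels and compare the total response. The horizontal-gradient kernel responds to vertical edges and vice versa. This is plain Python for illustration only; the function names and the toy image of thin vertical strokes are assumptions, and the sketch does nothing about spreadsheet grid lines.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # responds to horizontal edges


def edge_energy(image, kernel):
    """Sum of absolute 3x3 filter responses over all interior pixels."""
    height, width = len(image), len(image[0])
    total = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response = sum(kernel[j][i] * image[y + j - 1][x + i - 1]
                           for j in range(3) for i in range(3))
            total += abs(response)
    return total


def dominant_stroke_direction(image):
    """Report whether vertical or horizontal edges carry more energy."""
    vertical = edge_energy(image, SOBEL_X)
    horizontal = edge_energy(image, SOBEL_Y)
    if vertical == horizontal:
        return "unknown"
    return "vertical" if vertical > horizontal else "horizontal"


# Hypothetical toy image: 1 px wide, 6 px tall strokes, like a row of letter 'l'.
strokes = [[1 if 3 <= y < 9 and x % 4 == 2 else 0 for x in range(20)]
           for y in range(12)]
rotated = [list(col) for col in zip(*strokes[::-1])]  # rotated 90 degrees clockwise

print(dominant_stroke_direction(strokes))  # vertical
print(dominant_stroke_direction(rotated))  # horizontal
```

If the vertical-stroke observation holds for your documents, "vertical" would suggest the text is upright or upside down, and "horizontal" would suggest a page turned by 90 degrees.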

Alternative: can't you just give the user a simple way to rotate images, such as the arrows in the Windows Picture Viewer, or show 4 thumbnails (one per rotation) that they can click on? You may need to cache the 4 rotated versions so that it is fast, but only if speed proves to be a problem.

+2

Here's a paper titled "Combined Script and Page Orientation Estimation using the Tesseract OCR engine" [pdf]

I could not find an implementation of their work, but the approach looks good to me:

The basic idea behind the proposed approach is simple.

A shape classifier is trained on characters (classes) from all the scripts of interest. At run time, the classifier is run independently on each connected component (CC) in the image, and the process is repeated after rotating each CC into three other candidate orientations (90°, 180°, and 270° from the input orientation).

The algorithm keeps track of the estimated number of characters in each script for a given orientation, and of the accumulated classifier score for each candidate orientation. The estimate of page orientation is chosen as the one with the highest cumulative score, and the estimate of script is chosen as the one with the highest number of characters in that script for the best orientation estimate.
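
To make the voting loop concrete, here is a toy sketch in Python. It is not the paper's implementation: the classifier is a hypothetical pixel-matching stand-in with two made-up 3x3 templates, the script tracking is left out, and all names are assumptions. Only the structure follows the description: score every component at four rotations, accumulate per rotation, pick the best.

```python
def rotate_cw(glyph):
    """Rotate a small bitmap (list of rows) by 90 degrees clockwise."""
    return [list(col) for col in zip(*glyph[::-1])]


# Hypothetical upright glyph templates standing in for a trained classifier.
TEMPLATES = {
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}


def classify(glyph):
    """Toy score: best fraction of pixels matching any same-sized template."""
    best = 0.0
    for template in TEMPLATES.values():
        if len(template) != len(glyph) or len(template[0]) != len(glyph[0]):
            continue
        matches = sum(a == b for t_row, g_row in zip(template, glyph)
                      for a, b in zip(t_row, g_row))
        best = max(best, matches / (len(glyph) * len(glyph[0])))
    return best


def estimate_orientation(components):
    """Return (best_angle, scores): the clockwise rotation that best fixes the page."""
    scores = {0: 0.0, 90: 0.0, 180: 0.0, 270: 0.0}
    for glyph in components:
        candidate = glyph
        for angle in (0, 90, 180, 270):
            scores[angle] += classify(candidate)
            candidate = rotate_cw(candidate)
    return max(scores, key=scores.get), scores


# Simulate a page that arrived rotated 90 degrees clockwise.
components = [rotate_cw(TEMPLATES["L"]), rotate_cw(TEMPLATES["T"])]
best_angle, scores = estimate_orientation(components)
print(best_angle)  # 270: rotate a further 270 degrees clockwise to get upright
```

Unlike the line-counting approaches, this kind of classifier-based vote can distinguish all four orientations, at the cost of needing a trained classifier.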

+2