Algorithm for detecting corners of a paper sheet in a photograph

What is the best way to detect the corners of an invoice / receipt / sheet of paper in a photograph? This will be used for subsequent perspective correction, prior to OCR.

My current approach:

RGB > Gray > Canny edge detection with threshold > Dilate(1) > Remove small objects(6) > Clear border objects > Select largest blob based on convex area > [corner detection - not implemented]
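For reference, a rough OpenCV (Python) sketch of this same pipeline, since that is the planned target; every threshold and size value below is an illustrative guess, not one taken from the MATLAB prototype:

    import cv2
    import numpy as np

    img = cv2.imread('invoice.jpg')                  # hypothetical input path
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)     # RGB -> Gray
    edges = cv2.Canny(gray, 50, 150)                 # Canny edge detection
    edges = cv2.dilate(edges, np.ones((3, 3), np.uint8), iterations=1)  # Dilate(1)

    # small-object removal, border clearing and blob selection via contours
    # (OpenCV 4.x signature; OpenCV 3.x returns an extra value first)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    h, w = edges.shape

    def touches_border(c):
        x, y, cw, ch = cv2.boundingRect(c)
        return x <= 0 or y <= 0 or x + cw >= w or y + ch >= h

    candidates = [c for c in contours
                  if cv2.contourArea(c) > 500 and not touches_border(c)]
    if candidates:
        # select the largest blob by convex area
        blob = max(candidates, key=lambda c: cv2.contourArea(cv2.convexHull(c)))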

I can't help but think there must be a more robust, “intelligent” statistical approach to handle this type of segmentation. I have no training examples, but I could probably collect 100 images.

Wider context:

I am using MATLAB for the prototype and plan to implement the system in OpenCV and Tesseract OCR. This is the first of a number of image processing problems I need to solve for this particular application, so I am looking to roll my own solution and re-familiarize myself with image processing algorithms.

Here is a sample image that I would like the algorithm to handle (if you want to take up the challenge, the large images are at http://madteckhead.com/tmp):

case 1 http://madteckhead.com/tmp/IMG_0773_sml.jpg
case 2 http://madteckhead.com/tmp/IMG_0774_sml.jpg
case 3 http://madteckhead.com/tmp/IMG_0775_sml.jpg
case 4 http://madteckhead.com/tmp/IMG_0776_sml.jpg

In the best case, this gives:

case 1 - canny http://madteckhead.com/tmp/IMG_0773_canny.jpg
case 1 - post canny http://madteckhead.com/tmp/IMG_0773_postcanny.jpg
case 1 - largest blob http://madteckhead.com/tmp/IMG_0773_blob.jpg

However, it fails easily in other cases:

case 2 - canny http://madteckhead.com/tmp/IMG_0774_canny.jpg
case 2 - post canny http://madteckhead.com/tmp/IMG_0774_postcanny.jpg
case 2 - largest blob http://madteckhead.com/tmp/IMG_0774_blob.jpg

Thanks for all the great ideas! I love SO!

EDIT: Hough Transform progress

Q: What algorithm will cluster the Hough lines to find the corners? Following advice from the answers, I was able to use the Hough Transform, pick lines, and filter them. My current approach is rather crude. I have made the assumption that the invoice will always be less than 15 degrees out of alignment with the image. If that holds, I end up with reasonable results for the lines (see below). But I am not entirely sure of a suitable algorithm to cluster the lines (or vote) in order to extrapolate the corners. The Hough lines are not continuous, and in the noisy images there can be parallel lines, so some form of distance-from-line-origin metric is required. Any ideas?

case 1 http://madteckhead.com/tmp/IMG_0773_hough.jpg
case 2 http://madteckhead.com/tmp/IMG_0774_hough.jpg
case 3 http://madteckhead.com/tmp/IMG_0775_hough.jpg
case 4 http://madteckhead.com/tmp/IMG_0776_hough.jpg

+88
image-processing opencv edge-detection image-segmentation hough-transform
Jul 02 '11 at 7:12
9 answers

I'm a friend of Martin's who was working on this earlier this year. This was my first ever coding project, and it ended in a bit of a rush, so the code needs some... decoding. I'll give a few tips based on what I've seen you doing already, and then sort out my code over the weekend.

First tip: OpenCV and Python are awesome; switch to them as soon as possible. :D

Instead of removing small objects and/or noise, lower the Canny thresholds so that it accepts more edges, and then find the largest closed contour (in OpenCV, use findContours() with some simple parameters; I think I used CV_RETR_LIST). It may still struggle when the paper is on a white background, but it was definitely giving the best results.
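A minimal sketch of that idea with the modern cv2 API (the constant is now cv2.RETR_LIST; the input path is a placeholder):

    import cv2

    img = cv2.imread('page.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder input
    edges = cv2.Canny(img, 30, 90)   # deliberately low thresholds: accept more edges

    # RETR_LIST retrieves every contour without building a hierarchy
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    page = max(contours, key=cv2.contourArea)  # largest closed contour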

For the HoughLines2() transform, try CV_HOUGH_STANDARD as opposed to CV_HOUGH_PROBABILISTIC: it gives rho and theta values, defining each line in polar coordinates, and you can then group the lines within a certain tolerance of each other.

My grouping worked as a lookup table: for each line produced by the Hough transform, it would give a rho and theta pair. If these values were within, say, 5% of an existing pair in the table, the line was discarded; if they were outside that 5%, a new entry was added.

This makes analysis of parallel lines, or of the distance between lines, much easier.
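A sketch of that lookup-table grouping, written against the modern cv2.HoughLines (the standard transform) rather than the old HoughLines2/CV_HOUGH_STANDARD API; the absolute tolerances here stand in for the 5% rule:

    import cv2
    import numpy as np

    edges = cv2.Canny(cv2.imread('page.jpg', cv2.IMREAD_GRAYSCALE), 30, 90)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, 100)  # each entry is (rho, theta)

    table = []  # lookup table of representative (rho, theta) pairs
    for rho, theta in lines[:, 0]:
        for t_rho, t_theta in table:
            # discard lines too close to an existing entry
            if abs(rho - t_rho) < 20 and abs(theta - t_theta) < np.deg2rad(5):
                break
        else:
            table.append((rho, theta))  # genuinely new line: add an entry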

Hope this helps.

+26
Jul 10 '11

A student group at my university recently demoed an iPhone app (and a Python OpenCV app) that they had written to do exactly this. As I remember, the steps were something like this:

  • Median filter to completely remove the text on the paper (this was handwritten text on white paper with fairly good lighting; it may not work with printed text, though here it worked very well). The reason is that it makes the corner detection much easier.
  • Hough Transform for lines
  • Find the peaks in the Hough Transform accumulator space and draw each line across the entire image.
  • Analyse the lines and remove any that are very close to each other and at a similar angle (cluster the lines into one). This is necessary because the Hough Transform is not perfect, since it works in a discrete sample space.
  • Find pairs of lines that are roughly parallel and that intersect other pairs, to see which lines form quads.

It looked pretty good, and they were able to take a photo of a sheet of paper or book, run the corner detection, and then map the document in the image onto a flat plane in almost real time (there was a single OpenCV function to perform the mapping). There was no OCR involved when I saw it working.
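The mapping function they mention is presumably the getPerspectiveTransform / warpPerspective pair; a minimal sketch, assuming the four corners have already been found (the coordinates below are made up):

    import cv2
    import numpy as np

    img = cv2.imread('photo.jpg')  # placeholder input
    # hypothetical corner coordinates, ordered TL, TR, BR, BL
    corners = np.float32([[104, 60], [520, 85], [540, 700], [80, 690]])
    w, h = 480, 640  # chosen output size of the flattened document
    target = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

    M = cv2.getPerspectiveTransform(corners, target)
    flat = cv2.warpPerspective(img, M, (w, h))  # document on a flat plane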

+17
Jul 02 '11

Here is what I came up with after several experiments:

    import cv, cv2, numpy as np
    import sys

    def get_new(old):
        new = np.ones(old.shape, np.uint8)
        cv2.bitwise_not(new, new)
        return new

    if __name__ == '__main__':
        orig = cv2.imread(sys.argv[1])

        # these constants are carefully picked
        MORPH = 9
        CANNY = 84
        HOUGH = 25

        img = cv2.cvtColor(orig, cv2.COLOR_BGR2GRAY)
        cv2.GaussianBlur(img, (3, 3), 0, img)

        # this is to recognize white on white
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (MORPH, MORPH))
        dilated = cv2.dilate(img, kernel)
        edges = cv2.Canny(dilated, 0, CANNY, apertureSize=3)

        lines = cv2.HoughLinesP(edges, 1, 3.14/180, HOUGH)
        for line in lines[0]:
            cv2.line(edges, (line[0], line[1]), (line[2], line[3]), (255, 0, 0), 2, 8)

        # finding contours
        contours, _ = cv2.findContours(edges.copy(), cv.CV_RETR_EXTERNAL,
                                       cv.CV_CHAIN_APPROX_TC89_KCOS)
        contours = filter(lambda cont: cv2.arcLength(cont, False) > 100, contours)
        contours = filter(lambda cont: cv2.contourArea(cont) > 10000, contours)

        # simplify contours down to polygons
        rects = []
        for cont in contours:
            rect = cv2.approxPolyDP(cont, 40, True).copy().reshape(-1, 2)
            rects.append(rect)

        # that's basically it
        cv2.drawContours(orig, rects, -1, (0, 255, 0), 1)

        # show only contours
        new = get_new(img)
        cv2.drawContours(new, rects, -1, (0, 255, 0), 1)
        cv2.GaussianBlur(new, (9, 9), 0, new)
        new = cv2.Canny(new, 0, CANNY, apertureSize=3)

        cv2.namedWindow('result', cv2.WINDOW_NORMAL)
        cv2.imshow('result', orig)
        cv2.waitKey(0)
        cv2.imshow('result', dilated)
        cv2.waitKey(0)
        cv2.imshow('result', edges)
        cv2.waitKey(0)
        cv2.imshow('result', new)
        cv2.waitKey(0)
        cv2.destroyAllWindows()

Not perfect, but at least works for all samples:

[result images for cases 1-4]

+16
Sep 20 '13 at 2:42

Instead of starting from edge detection, you could use corner detection.

The Marvin Framework provides an implementation of the Moravec algorithm for this purpose. You could find the corners of the paper as a starting point. Below is the output of the Moravec algorithm:

[Moravec algorithm output]
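Marvin is a Java framework; if you stay in OpenCV/Python, Harris corner detection (a close relative of Moravec's method) is a readily available substitute. A minimal sketch, where the response threshold factor is a guess to tune:

    import cv2
    import numpy as np

    gray = cv2.imread('page.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder input
    response = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
    # keep only strong responses; the 0.01 factor is an assumption to tune per image
    corners = np.argwhere(response > 0.01 * response.max())  # (row, col) pairs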

+8
May 25 '13 at 15:42

You can also use MSER (Maximally Stable Extremal Regions) on the result of the Sobel operator to find stable regions of the image. For each region returned by MSER, you can then apply the convex hull and polygon approximation to obtain results like this:

But this kind of detection is more useful for live detection than for a single picture, where it does not always return the best result.

[MSER detection result]
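A sketch of this approach, assuming the modern MSER_create API and using a Sobel gradient magnitude as the input (parameters left at their defaults):

    import cv2
    import numpy as np

    gray = cv2.imread('page.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder input
    # gradient magnitude via Sobel, scaled back to 8-bit for MSER
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    grad = cv2.convertScaleAbs(cv2.magnitude(gx, gy))

    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(grad)
    for pts in regions:
        hull = cv2.convexHull(pts.reshape(-1, 1, 2))
        poly = cv2.approxPolyDP(hull, 0.02 * cv2.arcLength(hull, True), True)
        if len(poly) == 4:  # quadrilateral: candidate outline for the sheet
            cv2.polylines(gray, [poly], True, 255, 2)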

+4
Aug 17 '15 at 10:36

After edge detection, use the Hough Transform. Then put those points into an SVM (support vector machine) with their labels: if the examples have smooth lines on them, the SVM will have no difficulty separating the required parts of the example from the rest. My advice for the SVM is to use features such as connectivity and length: if points are connected and the segment is long, they are likely to belong to a line of the receipt. You can then eliminate all the other points.
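A rough sketch of how such features could be fed to an SVM, using scikit-learn; the feature set and the placeholder labels are hypothetical, and in practice you would first hand-label segments from the ~100 images mentioned in the question:

    import cv2
    import numpy as np
    from sklearn.svm import SVC

    edges = cv2.Canny(cv2.imread('page.jpg', cv2.IMREAD_GRAYSCALE), 50, 150)
    segs = cv2.HoughLinesP(edges, 1, np.pi / 180, 50, minLineLength=30, maxLineGap=10)

    def features(seg):
        x1, y1, x2, y2 = seg
        length = np.hypot(x2 - x1, y2 - y1)   # "length" feature
        angle = np.arctan2(y2 - y1, x2 - x1)  # orientation, a crude connectivity proxy
        return [length, angle]

    X = [features(s[0]) for s in segs]
    # placeholder rule standing in for real hand-made 0/1 labels
    y = [1 if f[0] > 100 else 0 for f in X]
    clf = SVC(kernel='rbf').fit(X, y)
    keep = [s for s, label in zip(segs, clf.predict(X)) if label == 1]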

+3
Jul 02 '11 at 10:17

Here is @Vanuan's code, ported to C++:

    cv::cvtColor(mat, mat, CV_BGR2GRAY);
    cv::GaussianBlur(mat, mat, cv::Size(3, 3), 0);
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Point(9, 9));
    cv::Mat dilated;
    cv::dilate(mat, dilated, kernel);
    cv::Mat edges;
    cv::Canny(dilated, edges, 84, 3);

    std::vector<cv::Vec4i> lines;
    lines.clear();
    cv::HoughLinesP(edges, lines, 1, CV_PI/180, 25);
    std::vector<cv::Vec4i>::iterator it = lines.begin();
    for (; it != lines.end(); ++it) {
        cv::Vec4i l = *it;
        cv::line(edges, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]),
                 cv::Scalar(255, 0, 0), 2, 8);
    }

    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(edges, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_TC89_KCOS);

    std::vector<std::vector<cv::Point> > contoursCleaned;
    for (int i = 0; i < contours.size(); i++) {
        if (cv::arcLength(contours[i], false) > 100)
            contoursCleaned.push_back(contours[i]);
    }

    std::vector<std::vector<cv::Point> > contoursArea;
    for (int i = 0; i < contoursCleaned.size(); i++) {
        if (cv::contourArea(contoursCleaned[i]) > 10000) {
            contoursArea.push_back(contoursCleaned[i]);
        }
    }

    std::vector<std::vector<cv::Point> > contoursDraw(contoursCleaned.size());
    for (int i = 0; i < contoursArea.size(); i++) {
        cv::approxPolyDP(Mat(contoursArea[i]), contoursDraw[i], 40, true);
    }

    Mat drawing = Mat::zeros(mat.size(), CV_8UC3);
    cv::drawContours(drawing, contoursDraw, -1, cv::Scalar(0, 255, 0), 1);
+3
Sep 29 '13 at 21:10
  • Convert to Lab color space

  • Segment using k-means with 2 clusters

  • Then use contours or Hough lines on one of the clusters (the internal one); a rough sketch follows below
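A sketch of those three steps in OpenCV/Python, using cv2.kmeans with k=2 (the input path and the choice of which cluster to keep are assumptions):

    import cv2
    import numpy as np

    img = cv2.imread('page.jpg')                    # placeholder input
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)      # 1. convert to Lab space

    pixels = lab.reshape(-1, 3).astype(np.float32)  # 2. k-means with 2 clusters
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, _ = cv2.kmeans(pixels, 2, None, criteria, 5, cv2.KMEANS_RANDOM_CENTERS)
    mask = (labels.reshape(lab.shape[:2]) == 1).astype(np.uint8) * 255

    # 3. contours (or Hough) on one of the clusters
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    sheet = max(contours, key=cv2.contourArea)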
+1
Oct 29 '14 at 6:37

A Java version using OpenCV is below:

    package testOpenCV;

    import org.opencv.core.Core;
    import org.opencv.core.CvType;
    import org.opencv.core.Mat;
    import java.awt.BorderLayout;
    import java.awt.Container;
    import java.awt.Image;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;
    import javax.swing.BoxLayout;
    import javax.swing.ImageIcon;
    import javax.swing.JFrame;
    import javax.swing.JLabel;
    import javax.swing.JPanel;
    import javax.swing.JSlider;
    import javax.swing.event.ChangeEvent;
    import javax.swing.event.ChangeListener;
    import org.opencv.core.MatOfPoint;
    import org.opencv.core.Point;
    import org.opencv.core.Scalar;
    import org.opencv.core.Size;
    import org.opencv.highgui.HighGui;
    import org.opencv.imgcodecs.Imgcodecs;
    import org.opencv.imgproc.Imgproc;

    class FindContours {
        private Mat srcGray = new Mat();
        private Mat src_orig_ref = new Mat();
        private Mat srcGray_after = new Mat();
        private JFrame frame;
        private JLabel imgSrcLabel;
        private JLabel imgContoursLabel;
        private static final int MAX_THRESHOLD = 255;
        private int threshold = 40;
        private Random rng = new Random(12345);
        private Mat src = new Mat();

        public FindContours(String[] args) {
            String filename = "C:\\Desktop\\opencv\\4.PNG";
            src = Imgcodecs.imread(filename);
            src_orig_ref = Imgcodecs.imread(filename);
            if (src.empty()) {
                System.err.println("Cannot read image: " + filename);
                System.exit(0);
            }

            Imgproc.cvtColor(src, srcGray, Imgproc.COLOR_BGR2GRAY);
            Imgproc.blur(srcGray, srcGray, new Size(3, 3));
            Mat kernel = new Mat(new Size(3, 3), CvType.CV_8UC1, new Scalar(255));
            Imgproc.morphologyEx(srcGray, srcGray, Imgproc.MORPH_OPEN, kernel);
            Imgproc.morphologyEx(srcGray, srcGray, Imgproc.MORPH_CLOSE, kernel);

            // Create and set up the window.
            frame = new JFrame("Finding contours in your image demo");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

            // Set up the content pane (uses the default BorderLayout).
            Image img = HighGui.toBufferedImage(src);
            addComponentsToPane(frame.getContentPane(), img);

            // Display the window.
            frame.pack();
            frame.setVisible(true);
            update();
        }

        private void addComponentsToPane(Container pane, Image img) {
            if (!(pane.getLayout() instanceof BorderLayout)) {
                pane.add(new JLabel("Container doesn't use BorderLayout!"));
                return;
            }

            JPanel sliderPanel = new JPanel();
            sliderPanel.setLayout(new BoxLayout(sliderPanel, BoxLayout.PAGE_AXIS));
            sliderPanel.add(new JLabel("Canny threshold: "));
            JSlider slider = new JSlider(0, MAX_THRESHOLD, threshold);
            slider.setMajorTickSpacing(20);
            slider.setMinorTickSpacing(10);
            slider.setPaintTicks(true);
            slider.setPaintLabels(true);
            slider.addChangeListener(new ChangeListener() {
                @Override
                public void stateChanged(ChangeEvent e) {
                    JSlider source = (JSlider) e.getSource();
                    threshold = source.getValue();
                    update();
                }
            });
            sliderPanel.add(slider);
            pane.add(sliderPanel, BorderLayout.PAGE_START);

            JPanel imgPanel = new JPanel();
            imgSrcLabel = new JLabel(new ImageIcon(img));
            imgPanel.add(imgSrcLabel);
            Mat blackImg = Mat.zeros(srcGray.size(), CvType.CV_8U);
            imgContoursLabel = new JLabel(new ImageIcon(HighGui.toBufferedImage(blackImg)));
            imgPanel.add(imgContoursLabel);
            pane.add(imgPanel, BorderLayout.CENTER);
        }

        private void update() {
            Mat cannyOutput = new Mat();
            Imgproc.Canny(srcGray, cannyOutput, threshold, threshold * 2);
            List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
            Mat hierarchy = new Mat();
            Imgproc.findContours(cannyOutput, contours, hierarchy,
                    Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);

            Mat drawing = Mat.zeros(cannyOutput.size(), CvType.CV_8UC3);
            for (int i = 0; i < contours.size(); i++) {
                Scalar color = new Scalar(256, 256, 256);
                List<MatOfPoint> contours_ele = new ArrayList<MatOfPoint>();
                contours_ele.add(contours.get(i));
                if (Imgproc.contourArea(contours.get(i)) > 10) {
                    Imgproc.drawContours(src, contours_ele, -1, color, 2,
                            Imgproc.LINE_8, hierarchy, 0, new Point());
                }
            }

            Mat cannyOutput_after = new Mat();
            Imgproc.cvtColor(src, srcGray_after, Imgproc.COLOR_BGR2GRAY);
            Imgproc.blur(srcGray_after, srcGray_after, new Size(3, 3));
            Mat kernel = new Mat(new Size(3, 3), CvType.CV_8UC1, new Scalar(255));
            Imgproc.morphologyEx(srcGray_after, srcGray_after, Imgproc.MORPH_OPEN, kernel);
            Imgproc.morphologyEx(srcGray_after, srcGray_after, Imgproc.MORPH_CLOSE, kernel);
            Imgproc.Canny(srcGray_after, cannyOutput_after, threshold, threshold * 2);

            List<MatOfPoint> contours_dw = new ArrayList<MatOfPoint>();
            Mat hierarchy_dw = new Mat();
            Mat drawing_dw = Mat.zeros(cannyOutput.size(), CvType.CV_8UC3);
            Imgproc.findContours(cannyOutput_after, contours_dw, hierarchy_dw,
                    Imgproc.RETR_TREE, Imgproc.CHAIN_APPROX_SIMPLE);
            for (int i = 0; i < contours_dw.size(); i++) {
                Scalar color = new Scalar(rng.nextInt(256), rng.nextInt(256), rng.nextInt(256));
                if (Imgproc.contourArea(contours_dw.get(i)) > 100000) {
                    Imgproc.drawContours(src_orig_ref, contours_dw, i, color, 2,
                            Imgproc.LINE_8, hierarchy, 0, new Point());
                }
            }

            imgContoursLabel.setIcon(new ImageIcon(HighGui.toBufferedImage(src_orig_ref)));
            frame.repaint();
        }
    }

    public class hello {
        public static void main(String[] args) {
            System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
            Mat mat = Mat.eye(3, 3, CvType.CV_8UC1);
            System.out.println("mat = " + mat.dump());
            javax.swing.SwingUtilities.invokeLater(new Runnable() {
                @Override
                public void run() {
                    new FindContours(null);
                }
            });
        }
    }
0
Jul 10 '19 at 20:01


