I'm not sure whether this is the right place to post this question, but if anyone knows the answer, please reply.
What algorithms are there for determining which regions of an image are text and which are graphics (a picture or diagram)? In other words, how can such regions be separated?
Most OCR programs, such as OCRopus, support the layout analysis you need.
Mao, Rosenfeld and Kanungo (2003), "Document structure analysis algorithms: a literature survey", provides a fairly recent overview of layout analysis algorithms.
The first step is likely to be sharpening the contrast between the text and the picture regions. This can be done by taking the derivative (gradient) of the image, which highlights changes in color or intensity. High gradient values are more likely to correspond to text shapes.
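To make the derivative idea concrete, here is a minimal sketch in Python with NumPy. It computes the gradient magnitude of a grayscale image, averages it over square tiles, and flags tiles with a high average as text candidates. The function names, the 16-pixel tile size and the threshold of 20 are illustrative assumptions, not values taken from any particular OCR package, and would need tuning on real scans.

```python
import numpy as np

def gradient_magnitude(gray):
    """Per-pixel gradient magnitude of a 2-D grayscale array."""
    gray = np.asarray(gray, dtype=float)
    gy, gx = np.gradient(gray)  # derivatives along rows (y) and columns (x)
    return np.hypot(gx, gy)

def block_edge_density(gray, block=16):
    """Mean gradient magnitude for each non-overlapping block x block tile."""
    mag = gradient_magnitude(gray)
    h, w = mag.shape
    # Drop partial tiles at the right and bottom borders.
    h_trim, w_trim = h - h % block, w - w % block
    tiles = mag[:h_trim, :w_trim].reshape(
        h_trim // block, block, w_trim // block, block
    )
    return tiles.mean(axis=(1, 3))

def text_candidate_mask(gray, block=16, threshold=20.0):
    """Boolean grid: True where a tile has a high average gradient.

    The threshold is an assumed value for 0-255 images.
    """
    return block_edge_density(gray, block) > threshold
```

Called as `text_candidate_mask(gray)` on a 2-D array of pixel intensities, this returns one boolean per tile. Detailed photographs and line drawings also produce strong gradients, so this is only a first cue. A full layout analysis, as in the OCR tools and the survey mentioned above, would combine it with other features.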