One of the harder problems in computer vision is processing document scans. A typical pipeline involves several steps: noise removal, color analysis, binarization, text block identification, OCR, and then possibly some contextual analysis and correction.
Does anyone know how Google identifies text blocks before the OCR stage, or can anyone point me to literature on it? Any ideas are welcome.
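For reference, the binarization and text block identification steps mentioned above can be sketched with a classical baseline. This is not Google's method, which the question leaves open; it is a simplified, horizontal-only variant of run-length smearing followed by connected-component labelling, written in plain Python. The function names (`binarize`, `smear_horizontal`, `find_blocks`), the fixed threshold of 128, and the `max_gap` parameter are all assumptions made for illustration.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Return a 0/1 grid: 1 where a pixel is darker than `threshold` (ink)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def smear_horizontal(binary, max_gap=2):
    """Fill background runs of at most `max_gap` pixels between ink on a row.

    This merges neighbouring glyphs into solid blobs so that a word or
    line becomes one connected region.
    """
    out = [row[:] for row in binary]
    for y, row in enumerate(binary):
        last_ink = None
        for x, value in enumerate(row):
            if value:
                if last_ink is not None and 0 < x - last_ink - 1 <= max_gap:
                    for k in range(last_ink + 1, x):
                        out[y][k] = 1
                last_ink = x
    return out


def find_blocks(mask):
    """Label 4-connected regions and return boxes as (x0, y0, x1, y1), inclusive."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < width and 0 <= ny < height:
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


if __name__ == "__main__":
    W, K = 255, 0  # white paper, black ink
    ink_row = [W, K, K, W, K, K, W, W, W, K, K, W]
    page = [[W] * 12, ink_row, ink_row, [W] * 12]
    blocks = find_blocks(smear_horizontal(binarize(page), max_gap=2))
    print(blocks)  # [(1, 1, 5, 2), (9, 1, 10, 2)]
```

On the toy page, the one-pixel gap between the first two ink runs is bridged while the three-pixel gap is not, so two blocks are reported. Real scans would need adaptive thresholding, a vertical smearing pass, and filtering of tiny components, none of which is shown here.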