I have built an optical character recognition (OCR) system for Sinhala (a language spoken in Sri Lanka), and it works reasonably well. Now I need to post-process its output using dictionary data.
What would be the best approach to change misspelled words into correct words? Can anyone give suggestions?
My dictionary data is in Unicode files, and my OCR output is also a Unicode file. I am doing this in C++. I have already tried string matching algorithms without success. I would like to start with the approach most relevant to this problem. Can anybody help me?
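To make the question concrete, here is a minimal sketch of the kind of dictionary lookup I have in mind. It is only a baseline, not a tested solution. It assumes both files are UTF-8 encoded and well-formed, and it compares words by Unicode code point rather than by byte, since each Sinhala character takes three bytes in UTF-8. The names `decodeUtf8`, `editDistance`, `closestWord` and the `maxDistance` threshold are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Decode UTF-8 bytes into Unicode code points.
// Assumption: the input is well-formed; this sketch does no validation.
std::u32string decodeUtf8(const std::string& s) {
    std::u32string out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        char32_t cp;
        std::size_t len;
        if (c < 0x80)      { cp = c;        len = 1; }
        else if (c < 0xE0) { cp = c & 0x1F; len = 2; }
        else if (c < 0xF0) { cp = c & 0x0F; len = 3; }
        else               { cp = c & 0x07; len = 4; }
        // Fold in the continuation bytes (6 payload bits each).
        for (std::size_t k = 1; k < len && i + k < s.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Levenshtein distance over code points, using two rolling rows.
std::size_t editDistance(const std::u32string& a, const std::u32string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t sub = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({sub, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Return the dictionary entry nearest to `word`, or `word` unchanged
// when no entry is within `maxDistance` edits (hypothetical threshold).
std::string closestWord(const std::string& word,
                        const std::vector<std::string>& dictionary,
                        std::size_t maxDistance) {
    const std::u32string w = decodeUtf8(word);
    std::string best = word;
    std::size_t bestDist = maxDistance + 1;
    for (const std::string& entry : dictionary) {
        std::size_t d = editDistance(w, decodeUtf8(entry));
        if (d < bestDist) { bestDist = d; best = entry; }
    }
    return best;
}
```

This scans the whole dictionary for every word, which is slow for large word lists, and it treats all substitutions as equally likely. An OCR-specific version would presumably weight visually similar Sinhala glyphs as cheaper substitutions, which is part of what I am asking about.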
Thanks in advance.