Python has the NLTK library, which offers an extensive collection of texts and corpora, as well as many text mining and processing techniques. Is there a way to compare sentences by the meaning they convey and score them as a possible match? That is, to build an intelligent suggestion assistant?
For example, take a sentence like "giggling at bad jokes" and "I like to laugh myself silly at poor jokes". Both convey the same meaning, but the sentences do not remotely match as strings (the words are different, so Levenshtein distance will fail!).
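To show what I mean by Levenshtein distance failing, here is a minimal sketch in plain Python (no NLTK needed). The function name `levenshtein` is my own, and it works on any sequence, so it can compare the sentences word by word instead of character by character.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of tokens)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


first = "giggling at bad jokes".split()
second = "I like to laugh myself silly at poor jokes".split()

# Word-level distance between the two example sentences.
print(levenshtein(first, second), "edits for", len(second), "words")
```

Almost every word has to be inserted or substituted, so the distance says the sentences are nearly unrelated even though they mean the same thing.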
Now imagine that we have an API that provides functionality such as that found here. Based on it, we have a mechanism for finding out that the words "giggle" and "laugh" match in the meaning they convey. "Bad" will not match "poor", so we may need to add extra layers (for example, they match in the context of a word like "joke", since a bad joke is usually the same as a poor joke, although a bad person is not the same as a poor person!).
The main problem is discarding the parts that do not greatly change the meaning of the sentence. Thus, the algorithm should return the same degree of match between the first sentence and the following one: "I like to laugh myself silly at poor jokes, even though they are completely senseless, full of crap and serious chances of heart-attack!"
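To make the requirement concrete, here is a rough sketch of the behaviour I am after, not a real solution. All the tables (`STOPWORDS`, `LEMMAS`, `CONTEXT_SYNONYMS`) are tiny hand-written stand-ins that I made up for these examples; a real version would have to get them from a lexical resource. The score is deliberately one-directional: it only asks how much of the short sentence is found in the long one, so extra words in the long one are ignored.

```python
import re

# Hypothetical toy tables, written by hand for the examples in this question.
STOPWORDS = {"a", "i", "like", "to", "at", "myself", "of", "and",
             "they", "are", "even", "though"}
LEMMAS = {"giggling": "laugh", "giggle": "laugh", "jokes": "joke"}
# Adjectives that are interchangeable only in front of a particular noun.
CONTEXT_SYNONYMS = {("bad", "joke"): "weak", ("poor", "joke"): "weak"}


def concepts(sentence):
    """Reduce a sentence to a set of normalised content words."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower())
    tokens = [LEMMAS.get(w, w) for w in words if w not in STOPWORDS]
    result = set()
    for i, token in enumerate(tokens):
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        # Merge the adjective only when the next content word licenses it.
        result.add(CONTEXT_SYNONYMS.get((token, following), token))
    return result


def coverage(query, candidate):
    """Fraction of the query's concepts that also occur in the candidate."""
    wanted = concepts(query)
    if not wanted:
        return 0.0
    return len(wanted & concepts(candidate)) / len(wanted)


short = "giggling at bad jokes"
medium = "I like to laugh myself silly at poor jokes"
long_ = (medium + ", even though they are completely senseless, "
         "full of crap and serious chances of heart-attack!")

print(coverage(short, medium), coverage(short, long_))
print(coverage("a bad person", "a poor person"))
```

With these toy tables the short sentence scores the same against both longer ones, while "a bad person" and "a poor person" only share the word "person". What I do not know is how to get this behaviour without writing the tables by hand.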
So, with the tools available, has an algorithm for this already been devised? Or do I need to reinvent the wheel?
SexyBeast