Syntactic similarity / distance between two sentences / line / text using nltk

Question

I have the two texts below:

Text1: John loves an apple

Text2: Mike hates orange

The two texts above are syntactically similar, but they have different meanings.

I want to find:

1) The syntactic distance between the two texts

2) The semantic distance between the two texts

I'm new to NLP. Is there a way to do this with nltk?

+5

python scikit-learn machine-learning nlp nltk

Ganesh deshvini Aug 16 '16 at 13:46


2 answers

For semantics, you can try word2vec. You can simply average the similarities between the words of the two sentences, or you can come up with your own way to weight each word according to its syntactic role.

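    from gensim.models import Word2Vec

    # Load a previously trained model; the path is a placeholder.
    model = Word2Vec.load('path/to/your/model')
    # Cosine similarity between two word vectors (in recent gensim versions this lives on model.wv).
    model.wv.similarity('apple', 'orange')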
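One possible reading of "average the similarity of words" is to take the mean over every pair of words drawn from the two sentences. The sketch below assumes that reading. The names `average_word_similarity` and `toy_similarity` are made up for illustration, and `toy_similarity` is only a stand-in for a trained model so that the example runs without one.

```python
from itertools import product
from statistics import mean


def average_word_similarity(tokens_a, tokens_b, similarity):
    """Return the mean similarity over every (word_a, word_b) pair.

    `similarity` is any callable taking two words and returning a number,
    for example `model.wv.similarity` from a loaded gensim model.
    """
    pairs = list(product(tokens_a, tokens_b))
    if not pairs:
        raise ValueError("both sentences need at least one token")
    return mean(similarity(a, b) for a, b in pairs)


def toy_similarity(a, b):
    # Hypothetical stand-in for a trained model: identical words score 1.0,
    # everything else scores 0.5.
    return 1.0 if a == b else 0.5


score = average_word_similarity(
    ["john", "loves", "apple"],
    ["mike", "hates", "orange"],
    toy_similarity,
)
```

With a real model you would pass `model.wv.similarity` instead of `toy_similarity`. Words missing from the model's vocabulary raise a `KeyError`, so you would probably want to filter them out first.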

+3

Aaron Aug 16 '16 at 23:59


Masoud · Accepted Answer · 2016-08-16T14:25:03+0000

Yes, and not only with nltk. One way to measure syntactic distance is part-of-speech (POS) tagging, which maps each word in a sentence to a specific tag: https://en.wikipedia.org/wiki/Part-of-speech_tagging

For example, it maps your sentences roughly to:
Text1: noun verb determiner noun

Text2: noun verb noun

Then you can measure the distance between the two tag sequences, for example with an edit distance.
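As a rough sketch of that idea, the code below tags each sentence with `nltk.pos_tag` and compares the resulting tag sequences with a plain Levenshtein edit distance. The edit distance is one choice among many and is my assumption, not something the answer prescribes. The helper names `pos_tags`, `edit_distance` and `syntactic_similarity` are hypothetical. `pos_tags` needs nltk and its tagger model (downloaded through `nltk.download`), so the demonstration at the bottom uses hand-written coarse tags instead.

```python
def pos_tags(sentence):
    """Tag a whitespace-tokenized sentence and return only the tags."""
    # Imported here so the distance helpers below work without nltk installed.
    import nltk

    return [tag for _, tag in nltk.pos_tag(sentence.split())]


def edit_distance(seq_a, seq_b):
    """Levenshtein distance between two sequences of tags."""
    previous = list(range(len(seq_b) + 1))
    for i, item_a in enumerate(seq_a, start=1):
        current = [i]
        for j, item_b in enumerate(seq_b, start=1):
            substitution = previous[j - 1] + (0 if item_a == item_b else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def syntactic_similarity(tags_a, tags_b):
    """Scale the edit distance to [0, 1], where 1.0 means identical sequences."""
    longest = max(len(tags_a), len(tags_b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(tags_a, tags_b) / longest


# Hand-written coarse tags for the two example sentences (not tagger output).
tags_1 = ["NOUN", "VERB", "DET", "NOUN"]  # John loves an apple
tags_2 = ["NOUN", "VERB", "NOUN"]         # Mike hates orange

distance = edit_distance(tags_1, tags_2)
similarity = syntactic_similarity(tags_1, tags_2)
```

In practice you would replace the hand-written lists with `pos_tags("John loves an apple")` and `pos_tags("Mike hates orange")`. The tagger returns Penn Treebank tags such as `NNP` or `VBZ` rather than the coarse labels used here.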

For semantics, you need a semantic network of words (WordNet, which ships with nltk, is one option). Find the synonyms of each word in a sentence, then look at how much the synonym sets of the two sentences intersect.
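A minimal sketch of the synonym-intersection idea follows. It assumes WordNet as the semantic network and Jaccard overlap as the measure of intersection; both are illustrative choices, and the function names are hypothetical. `wordnet_synonyms` needs nltk with the WordNet corpus downloaded, so the demonstration uses a small hand-made synonym table.

```python
def wordnet_synonyms(word):
    """Collect the lemma names of every WordNet synset containing `word`."""
    # Imported here so synonym_overlap works without nltk installed.
    from nltk.corpus import wordnet

    return {name.lower() for synset in wordnet.synsets(word)
            for name in synset.lemma_names()}


def synonym_overlap(tokens_a, tokens_b, synonyms):
    """Jaccard overlap between the pooled synonym sets of two sentences.

    `synonyms` is any callable mapping a word to a set of synonyms.
    Each word is also counted as a synonym of itself.
    """
    pool_a = set().union(*({word} | set(synonyms(word)) for word in tokens_a))
    pool_b = set().union(*({word} | set(synonyms(word)) for word in tokens_b))
    union = pool_a | pool_b
    if not union:
        return 0.0
    return len(pool_a & pool_b) / len(union)


# Hand-made synonym table standing in for WordNet (illustrative only).
TOY_SYNONYMS = {
    "loves": {"adores", "likes"},
    "likes": {"loves", "enjoys"},
    "hates": {"detests"},
}


def toy_synonyms(word):
    return TOY_SYNONYMS.get(word, set())


close = synonym_overlap(["loves"], ["likes"], toy_synonyms)
far = synonym_overlap(["loves"], ["hates"], toy_synonyms)
```

To use WordNet, pass `wordnet_synonyms` in place of `toy_synonyms`. Pooling all synsets ignores word sense, so ambiguous words can inflate the overlap.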
