Find all WordNet words at a fixed edit distance from a given word

I am writing a spellchecker with nltk and WordNet, and I have a few misspelled words, say "believe". What I want to do is find all the words in WordNet that are at a Levenshtein edit distance of 1 or 2 from the given word. Does nltk provide any method for this? How can I do it?


Maybe I was unclear. The method edit_distance takes two arguments, as in edit_distance(word1, word2), and returns the Levenshtein distance between word1 and word2. What I want is the edit distance between the word I pass in and every other word in WordNet.

2 answers

nltk does provide an edit_distance method. See the nltk documentation.
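To make concrete what that distance measures, here is a minimal plain-Python sketch of the Levenshtein distance with unit costs for insertion, deletion and substitution, which is what nltk.edit_distance computes with its default arguments. The function name levenshtein and the misspelling "belive" are illustrative assumptions, not part of nltk or of the question.

```python
def levenshtein(a, b):
    """Levenshtein distance with unit-cost insert, delete and substitute."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or free match
            ))
        previous = current
    return previous[-1]


# Hypothetical misspelling: one inserted letter turns it into "believe".
print(levenshtein("belive", "believe"))  # 1
```

The function only compares two words at a time, so finding every WordNet word at a given distance still means looping over a word list.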


Well, I finally came up with a solution. First, dump the word part of every synset name to a file and remove the duplicates:

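from nltk.corpus import wordnet

# Write the word part of every synset name (e.g. "dog" from "dog.n.01"), one per line.
with open("wordnet_wordlist.txt", "w") as f:
    for syn in wordnet.all_synsets():
        # name is a method in NLTK 3 and later; drop the trailing ".pos.nn" part.
        f.write(syn.name().rsplit(".", 2)[0])
        f.write("\n")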

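# Many synsets share the same word, so keep each word only once.
with open("wordnet_wordlist.txt") as f:
    uniquelines = set(f.read().split("\n"))
uniquelines.discard("")  # empty entry produced by the trailing newline
with open("wordnet_wordlist_final.txt", "w") as f2:
    f2.write("".join(line + "\n" for line in sorted(uniquelines)))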

Now, after reading the word list back from wordnet_wordlist_final.txt, the words at a given distance can be found with nltk.edit_distance:

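import nltk

# Load the deduplicated word list, one word per line.
with open("wordnet_wordlist_final.txt") as wordnetobj:
    wordlist = [line.strip() for line in wordnetobj]

def edit(word, distance):
    """Return the words whose Levenshtein distance from word is exactly distance."""
    validlist = []
    for valid in wordlist:
        # The distance is never smaller than the length difference,
        # so candidates that differ too much in length can be skipped.
        if abs(len(valid) - len(word)) <= distance:
            if nltk.edit_distance(word, valid) == distance:
                validlist.append(valid)
    return validlist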
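Scanning the whole word list calls the distance function once per word. For small distances such as 1 or 2, a different technique (not the one used above) is to generate every string reachable from the misspelled word by single-character edits and intersect those strings with the vocabulary. The following is a rough sketch under stated assumptions: the names edits1 and candidates, the tiny vocabulary and the misspelling "belive" are all hypothetical, and the alphabet is limited to lowercase ASCII letters, whereas real WordNet entries can also contain underscores, digits and other characters.

```python
import string


def edits1(word, alphabet=string.ascii_lowercase):
    """All strings exactly one insert, delete or substitute away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {left + right[1:] for left, right in splits if right}
    substitutes = {left + c + right[1:]
                   for left, right in splits if right for c in alphabet}
    inserts = {left + c + right for left, right in splits for c in alphabet}
    # Substituting a letter by itself reproduces word, so remove it.
    return (deletes | substitutes | inserts) - {word}


def candidates(word, vocabulary, max_distance=2):
    """Map each vocabulary word within max_distance of word to its distance."""
    found = {}
    seen = {word}      # every string already reached at a smaller distance
    frontier = {word}  # strings at exactly the previous distance
    for distance in range(1, max_distance + 1):
        frontier = {e for w in frontier for e in edits1(w)} - seen
        seen |= frontier
        for match in frontier & vocabulary:
            found[match] = distance
    return found


# Hypothetical stand-in for the WordNet word list.
vocabulary = {"believe", "belief", "relieve", "beehive", "below"}
result = candidates("belive", vocabulary)
print(sorted(result.items()))
```

The number of generated strings grows quickly with the distance and the alphabet size, so this approach is only worth considering for distances of 1 or 2; for larger distances the linear scan with a length filter is the simpler choice.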