How to use the spaCy lemmatizer to get the base form of a word

I am new to spaCy and want to use its lemmatizer function, but I don't know how. I want to pass in a string of words and get back a string with each word in its base form, e.g. 'words' => 'word', 'did' => 'do'. Thanks.

+6
3 answers

The previous answer is convoluted and cannot be edited, so here is a more conventional one.

    # make sure you have downloaded the English model with "python -m spacy download en"
    import spacy
    nlp = spacy.load('en')
    doc = nlp(u"Apples and oranges are similar. Boots and hippos aren't.")
    for token in doc:
        # token.lemma is the integer ID of the lemma, token.lemma_ is its string form
        print(token, token.lemma, token.lemma_)

Output:

    Apples 6617 apples
    and 512 and
    oranges 7024 orange
    are 536 be
    similar 1447 similar
    . 453 .
    Boots 4622 boot
    and 512 and
    hippos 98365 hippo
    are 536 be
    n't 538 not
    . 453 .

This example comes from the official spaCy tour.
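
Since the question asks for a string in and a string out, the loop above can be wrapped in a small helper. This is only a sketch: the names `lemmatize_text` and `load_english_pipeline` are made up for illustration, and the model name 'en' is taken from the snippet above and may differ in your installation.

```python
def lemmatize_text(text, nlp):
    """Return `text` with every token replaced by its lemma string.

    `nlp` is any callable that turns a string into an iterable of tokens
    exposing a `lemma_` attribute, for example a loaded spaCy pipeline.
    """
    return " ".join(token.lemma_ for token in nlp(text))


def load_english_pipeline(model_name="en"):
    """Load a spaCy pipeline. The model name is an assumption; use the one you installed."""
    import spacy  # imported lazily so lemmatize_text can be used and tested without spaCy
    return spacy.load(model_name)


# Usage (requires spaCy and a downloaded English model):
#   nlp = load_english_pipeline()
#   print(lemmatize_text(u"words did", nlp))
```

Note that joining with a single space treats punctuation as separate tokens, so the result will not reproduce the original spacing exactly.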

+11

The code:

    # Note: spacy.en and LOCAL_DATA_DIR come from an early spaCy release,
    # and the print statement below is Python 2 syntax.
    import os
    from spacy.en import English, LOCAL_DATA_DIR

    data_dir = os.environ.get('SPACY_DATA', LOCAL_DATA_DIR)
    nlp = English(data_dir=data_dir)
    doc3 = nlp(u"this is spacy lemmatize testing. programming books are more better than others")
    for token in doc3:
        print token, token.lemma, token.lemma_

Output:

    this 496 this
    is 488 be
    spacy 173779 spacy
    lemmatize 1510965 lemmatize
    testing 2900 testing
    . 419 .
    programming 3408 programming
    books 1011 book
    are 488 be
    more 529 more
    better 615 better
    than 555 than
    others 871 others

Example: here

+9

If you want to use only the Lemmatizer, you can do it as follows.

    from spacy.lemmatizer import Lemmatizer
    from spacy.lang.en import LEMMA_INDEX, LEMMA_EXC, LEMMA_RULES

    lemmatizer = Lemmatizer(LEMMA_INDEX, LEMMA_EXC, LEMMA_RULES)
    lemmas = lemmatizer(u'ducks', u'NOUN')
    print(lemmas)

Output:

 ['duck'] 
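
The call returns a list of candidate lemmas rather than a single string. If you only need one base form per word, a small wrapper can pick the first candidate. This is a sketch under assumptions: `first_lemma` is a hypothetical helper, and it only assumes that the lemmatizer is callable as `lemmatizer(word, pos)` and returns a list, as in the snippet above.

```python
def first_lemma(lemmatizer, word, pos):
    """Return the first candidate lemma, or the word itself if none are returned.

    `lemmatizer` is assumed to be callable as lemmatizer(word, pos) and to
    return a list of strings, like the Lemmatizer object built above.
    """
    candidates = lemmatizer(word, pos)
    return candidates[0] if candidates else word


# Usage with the lemmatizer object from the answer:
#   print(first_lemma(lemmatizer, u'ducks', u'NOUN'))
```

Falling back to the input word keeps the helper safe to use on words the lemmatizer has no entry for.
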
0