WordNet: Iterating over synsets

For a project, I would like to measure the number of "person-oriented" words in a text. I plan to do this using WordNet. I have never used it, and I'm not quite sure how to approach this task. I want to use WordNet to count the number of words that belong to certain synsets, for example the synsets 'human' and 'person'.

I came up with the following (simple) piece of code:

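from nltk.corpus import wordnet as wn

word = 'girlfriend'
# Take the first sense listed for the word
word_synsets = wn.synsets(word)[0]

# Take the first hypernym path, from the root down to the synset itself
hypernyms = word_synsets.hypernym_paths()[0]

for element in hypernyms:
    print(element)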

Results in:

Synset('entity.n.01')
Synset('physical_entity.n.01')
Synset('causal_agent.n.01')
Synset('person.n.01')
Synset('friend.n.01')
Synset('girlfriend.n.01')

My first question is: how do I properly iterate over the hypernyms? The code above prints them just fine. However, when I use an 'if' statement, for example:

count_humancenteredness = 0
for element in hypernyms:
    if element == 'person':
        print('found person hypernym')
        count_humancenteredness += 1

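I get "AttributeError: 'str' object has no attribute '_name'". What method can I use to iterate over the hypernyms of my word and perform an action (for example, increase the human-centeredness count) when the word does belong to the 'person' or 'human' synset?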

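Secondly, is this an efficient approach? I assume that iterating over several texts, and over the hypernyms of every noun in them, will take quite some time. Perhaps there is a more efficient way to use WordNet for this task.

Thanks for your help!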


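hypernyms = word_synsets.hypernym_paths() returns a list of lists of Synsets.

Hence

if element == 'person':

tries to compare a Synset object against a string. Synset does not support that kind of comparison.

Try something like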

target_synsets = wn.synsets('person')
if element in target_synsets:
    ...

or

if u'person' in element.lemma_names():
    ...

instead.
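As a sketch of how the synset-membership check could be applied across all senses and all hypernym paths of a word (Python 3): the helper name is_person_related and its reader parameter are assumptions made for illustration, not part of NLTK. The reader is expected to behave like NLTK's wordnet corpus reader, that is, to offer synset() and synsets(), and the synsets it returns must offer hypernym_paths().

```python
def is_person_related(word, reader, target_name="person.n.01"):
    """Return True if any sense of `word` has the target synset on a hypernym path.

    `reader` is assumed to behave like nltk.corpus.wordnet
    (hypothetical parameter, passed in so the function is easy to test).
    """
    target = reader.synset(target_name)
    for synset in reader.synsets(word):
        # hypernym_paths() returns a list of paths; each path is a list of
        # Synsets running from the root down to the synset itself.
        for path in synset.hypernym_paths():
            if target in path:
                return True
    return False
```

With NLTK and the WordNet corpus installed, this could be called as is_person_related('girlfriend', wn) after from nltk.corpus import wordnet as wn. Checking every sense counts a word as person-related if any of its meanings is, which may or may not be what you want for ambiguous words.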

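Regarding efficiency: currently you do a hypernym lookup for every word in the input text. As you note, that is not necessarily efficient. However, if it is fast enough, stop here and do not optimize what is not broken.

To speed up the lookup, you can precompute a list of "person-related" words in advance, using the transitive closure over the hyponyms.

Something like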

p = wn.synset('person.n.01')  # assumed: p is the target synset
person_words = set(w for s in p.closure(lambda s: s.hyponyms()) for w in s.lemma_names())

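should do the trick. This returns a set of ~10,000 words, which is small enough to keep in memory.

A simple version of the word counter then looks something like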
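To illustrate what a transitive closure over hyponyms computes, here is a minimal sketch on a toy graph that runs without NLTK. The dictionary toy_hyponyms and the function transitive_closure are made up for this example; real WordNet data is far larger and is reached through Synset objects rather than strings.

```python
def transitive_closure(start, children):
    """Collect every node reachable from `start` by following `children` links."""
    seen = set()
    stack = list(children.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            # Follow the node's own children as well (the "transitive" part)
            stack.extend(children.get(node, ()))
    return seen


# Toy stand-in for the hyponym relation (hypothetical, heavily simplified)
toy_hyponyms = {
    "person": ["friend", "adult"],
    "friend": ["girlfriend"],
    "adult": ["woman"],
}

person_related = transitive_closure("person", toy_hyponyms)
print(sorted(person_related))  # ['adult', 'friend', 'girlfriend', 'woman']
```

The set is built once, so each later lookup is a constant-time membership test instead of a walk up the hypernym hierarchy.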

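from collections import Counter

# words: the tokenized input text; person_words: the precomputed set of person-related words
word_count = Counter()
for word in (w.lower() for w in words):
    # Lowercase first, so that capitalized tokens such as 'Friend' are matched too
    if word in person_words:
        word_count[word] += 1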

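Note that, depending on the text, the input words may need to be normalized (for example, lemmatized) before they are matched against WordNet.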
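As an end-to-end sketch of the counting step (Python 3), the function below tokenizes a raw string and reports how large a share of its tokens is person-related. The name person_word_share, the regex tokenizer, and the tiny example word set are assumptions made for illustration; in practice the set would come from WordNet and the tokenizer would likely be more careful.

```python
import re
from collections import Counter


def person_word_share(text, person_words):
    """Return (counts of person-related tokens, their share of all tokens)."""
    # Very rough tokenizer: lowercase the text and keep runs of letters
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t in person_words)
    share = sum(counts.values()) / len(tokens) if tokens else 0.0
    return counts, share


# Hypothetical, hand-picked word set standing in for the WordNet-derived one
example_words = {"girlfriend", "friend"}
counts, share = person_word_share("My girlfriend and her friend met a friend.", example_words)
print(counts["friend"], counts["girlfriend"], share)  # 2 1 0.375
```

Reporting a share rather than a raw count makes texts of different lengths comparable.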


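To get all hyponyms of a synset, you can use the following function (tested with NLTK 3.0.3, where dhke's closure approach did not work for me):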

def get_hyponyms(synset):
    hyponyms = set()
    for hyponym in synset.hyponyms():
        hyponyms |= set(get_hyponyms(hyponym))
    return hyponyms | set(synset.hyponyms())

Example:

from nltk.corpus import wordnet
food = wordnet.synset('food.n.01')
print(len(get_hyponyms(food))) # returns 1526