Missing Spanish wordnet from NLTK

I am trying to use Spanish Wordnet from Open Multilingual Wordnet in NLTK 3.0, but it seems that it has not been downloaded with the 'omw' package. For example, with code similar to the following:

from nltk.corpus import wordnet as wn

print [el.lemma_names('spa') for el in wn.synsets('bank')]

The following error message appears:

IOError: No such file or directory: u'***/nltk_data/corpora/omw/spa/wn-data-spa.tab'

According to the documentation, the "omw" package should include Spanish, but the Spanish data was not downloaded with it. Do you know why this could happen?

2 answers

Here's the full traceback you get when a language is missing from the Open Multilingual WordNet data in your nltk_data directory:

>>> from nltk.corpus import wordnet as wn
>>> wn.synsets('bank')[0].lemma_names('spa')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/reader/wordnet.py", line 418, in lemma_names
    self._wordnet_corpus_reader._load_lang_data(lang)
  File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/reader/wordnet.py", line 1070, in _load_lang_data
    f = self._omw_reader.open('{0:}/wn-data-{0:}.tab'.format(lang))
  File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/reader/api.py", line 198, in open
    stream = self._root.join(file).open(encoding)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 309, in join
    return FileSystemPathPointer(_path)
  File "/usr/local/lib/python2.7/dist-packages/nltk/compat.py", line 380, in _decorator
    return init_func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 287, in __init__
    raise IOError('No such file or directory: %r' % _path)
IOError: No such file or directory: u'/home/alvas/nltk_data/corpora/omw/spa/wn-data-spa.tab'

So, the first thing to check is whether it is installed automatically:

>>> import nltk
>>> nltk.download('omw')
[nltk_data] Downloading package omw to /home/alvas/nltk_data...
[nltk_data]   Package omw is already up-to-date!
True

Then look inside nltk_data and you will find that the "spa" folder is missing:

alvas@ubi:~/nltk_data/corpora/omw$ ls
als  arb  cmn  dan  eng  fas  fin  fra  fre  heb  ita  jpn  mcr  msa  nor  pol  por  README  tha
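The same check can be done from Python. This is a minimal sketch, assuming the file layout shown in the traceback (`<lang>/wn-data-<lang>.tab` under the omw directory); the helper names `omw_lang_file` and `missing_languages` are hypothetical and not part of NLTK.

```python
import os

def omw_lang_file(omw_root, lang):
    """Path where, per the traceback above, NLTK looks for a language's data."""
    return os.path.join(omw_root, lang, 'wn-data-{0}.tab'.format(lang))

def missing_languages(omw_root, langs):
    """Return the codes in `langs` whose .tab file is absent under omw_root."""
    return [lang for lang in langs
            if not os.path.isfile(omw_lang_file(omw_root, lang))]

if __name__ == '__main__':
    # Assumed default location of nltk_data; adjust if yours lives elsewhere.
    root = os.path.expanduser('~/nltk_data/corpora/omw')
    print(missing_languages(root, ['spa', 'fra', 'ita']))
```

If 'spa' shows up in the printed list, lemma_names('spa') will fail with the IOError above.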

So, as a short-term fix, download the Spanish data and unpack it manually:

$ wget http://compling.hss.ntu.edu.sg/omw/wns/spa.zip
$ mkdir ~/nltk_data/corpora/omw/spa
$ unzip -p spa.zip mcr/wn-data-spa.tab > ~/nltk_data/corpora/omw/spa/wn-data-spa.tab

Alternatively, you can simply copy the file from nltk_data/corpora/omw/mcr/wn-data-spa.tab.

[out]:

>>> from nltk.corpus import wordnet as wn
>>> wn.synsets('bank')[0].lemma_names('spa')
[u'margen', u'orilla', u'vera']

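Now lemma_names() should work for Spanish. If you need other languages from the Open Multilingual Wordnet, you can browse them here (http://compling.hss.ntu.edu.sg/omw/), then download them and put them in the corresponding nltk_data directory.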
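The copy-from-mcr route can also be scripted. This is a rough sketch, assuming the Spanish file already sits at mcr/wn-data-spa.tab inside the omw directory, as the path above suggests; `copy_lang_from_mcr` is a hypothetical helper, not an NLTK function.

```python
import os
import shutil

def copy_lang_from_mcr(omw_root, lang='spa'):
    """Copy mcr/wn-data-<lang>.tab to <lang>/wn-data-<lang>.tab under omw_root."""
    name = 'wn-data-{0}.tab'.format(lang)
    src = os.path.join(omw_root, 'mcr', name)
    dst_dir = os.path.join(omw_root, lang)
    # Create the language folder if it is not there yet.
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    dst = os.path.join(dst_dir, name)
    shutil.copyfile(src, dst)  # raises an IOError/OSError if src is missing
    return dst
```

Calling `copy_lang_from_mcr(os.path.expanduser('~/nltk_data/corpora/omw'))` would place the file where the traceback shows NLTK looking for it.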

The long-term solution would be to ask the NLTK and OMW developers to update the datasets used by the NLTK API.


(OMW) .
