Error loading Word2Vec model in gensim

I get an AttributeError when I load the gensim model available in the word2vec repository:

    from gensim import models
    w = models.Word2Vec()
    w.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
    print w["queen"]

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-3-8219e36ba1f6> in <module>()
    ----> 1 w["queen"]

    C:\Anaconda64\lib\site-packages\gensim\models\word2vec.pyc in __getitem__(self, word)
        761
        762         """
    --> 763         return self.syn0[self.vocab[word].index]
        764
        765

    AttributeError: 'Word2Vec' object has no attribute 'syn0'

Is this a known issue?

4 answers

Fixed issue with:

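    from gensim import models

    # load_word2vec_format returns the loaded model, so call it on the class
    # and keep the result instead of calling it on an empty Word2Vec() instance.
    w = models.Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
    print(w["queen"])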

To share the vector query code across different training algorithms (Word2Vec, FastText, WordRank, VarEmbed), the gensim authors moved the storage and querying of word vectors into a separate class, KeyedVectors.

As a result, two methods and several attributes of the Word2Vec class are deprecated.

Methods

  • load_word2vec_format
  • save_word2vec_format

Attributes

  • syn0norm
  • syn0
  • vocab
  • index2word

They have been ported to the KeyedVectors class.

After upgrading to a gensim release that includes this change, you may get exceptions about deprecated methods or missing attributes.

To get rid of these exceptions, use:

  • KeyedVectors.load_word2vec_format instead of Word2Vec.load_word2vec_format
  • word2vec_model.wv.save_word2vec_format instead of word2vec_model.save_word2vec_format
  • model.wv.syn0norm instead of model.syn0norm
  • model.wv.syn0 instead of model.syn0
  • model.wv.vocab instead of model.vocab
  • model.wv.index2word instead of model.index2word
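
To try the KeyedVectors entry point without downloading the large GoogleNews file, here is a minimal sketch that writes a tiny text-format vector file and loads it back. The toy words, values, the file name toy_vectors.txt and the helper load_tiny_vectors are made up for illustration; the only gensim call used is KeyedVectors.load_word2vec_format, and the demo is skipped if gensim is not installed.

```python
import os
import tempfile

try:
    from gensim.models import KeyedVectors
except ImportError:
    KeyedVectors = None  # gensim is not installed; the demo below is skipped


def load_tiny_vectors():
    """Write a two-word text-format file and load it through KeyedVectors."""
    if KeyedVectors is None:
        return None
    # Hypothetical toy data. The first line is the header:
    # "<vocab_size> <vector_size>", followed by one word per line.
    lines = ["2 3", "king 0.1 0.2 0.3", "queen 0.4 0.5 0.6"]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy_vectors.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("\n".join(lines) + "\n")
        # Called on the KeyedVectors class, not on a Word2Vec instance.
        vectors = KeyedVectors.load_word2vec_format(path, binary=False)
        return vectors["queen"]


queen = load_tiny_vectors()
if queen is not None:
    print(queen)
```

The same call with binary=True is what the answers here use for the .bin files.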

Since models.Word2Vec.load_word2vec_format is now deprecated, you need to use models.KeyedVectors.load_word2vec_format instead, as shown below.

    from gensim import models
    w = models.KeyedVectors.load_word2vec_format('model.bin', binary=True)
When I use the code below, it throws an error:

    from gensim import models
    w = models.KeyedVectors.load_word2vec_format('model.bin', binary=True)

The output is:

    /usr/local/lib/python3.6/dist-packages/smart_open/smart_open_lib.py:398: UserWarning: This function is deprecated, use smart_open.open instead. See the migration notes for details: https://github.com/RaRe-Technologies/smart_open/blob/master/README.rst#migrating-to-the-new-open-function
      'See the migration notes for details: %s' % _MIGRATION_NOTES_URL
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-22-894908fd0f6c> in <module>()
    ----> 1 model=gensim.models.KeyedVectors.load_word2vec_format('hi.bin',binary=True)

    2 frames
    /usr/local/lib/python3.6/dist-packages/gensim/utils.py in any2unicode(text, encoding, errors)
        357     if isinstance(text, unicode):
        358         return text
    --> 359     return unicode(text, encoding, errors=errors)
        360
        361

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
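
One possible cause, offered as a guess rather than a diagnosis: a word2vec-format file starts with an ASCII header line ("vocab_size vector_size"), while a pickle written with protocol 2 or later starts with the byte 0x80, which is exactly the byte the error complains about at position 0. The sketch below peeks at the first bytes of a file to tell the two apart. The helper guess_vector_file_kind and the temporary file names are hypothetical and not part of gensim; it is a rough heuristic only.

```python
import os
import pickle
import tempfile


def guess_vector_file_kind(path):
    """Heuristic guess from the leading bytes; not a gensim API."""
    with open(path, "rb") as fh:
        head = fh.read(64)
    # Pickle protocol 2+ streams begin with the PROTO opcode, byte 0x80.
    if head[:1] == b"\x80":
        return "pickle"
    # word2vec text and binary files begin with "<vocab_size> <vector_size>\n".
    header = head.split(b"\n", 1)[0].split()
    if len(header) == 2 and all(tok.isdigit() for tok in header):
        return "word2vec"
    return "unknown"


# Demo on two throwaway files with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    pickled = os.path.join(tmp, "pickled.bin")
    with open(pickled, "wb") as fh:
        pickle.dump({"queen": [0.4, 0.5, 0.6]}, fh, protocol=2)

    w2v = os.path.join(tmp, "w2v.bin")
    with open(w2v, "wb") as fh:
        # Header, then one word followed by three zeroed float32 values.
        fh.write(b"1 3\nqueen " + b"\x00" * 12 + b"\n")

    kinds = (guess_vector_file_kind(pickled), guess_vector_file_kind(w2v))

print(kinds)
```

If the check reports a pickle for your file, it was probably saved with gensim's own save() rather than save_word2vec_format, in which case KeyedVectors.load(path) or Word2Vec.load(path) may be the matching loader.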