DL4J is very slow querying the GoogleNews-vectors file

I tried to run the following example on DL4J (loading a file of pre-trained vectors):

File gModel = new File("./GoogleNews-vectors-negative300.bin.gz");
Word2Vec vec = WordVectorSerializer.loadGoogleModel(gModel, true);

BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
for (;;) {
    System.out.print("Word: ");
    String word = br.readLine();
    if ("EXIT".equals(word)) break;
    Collection<String> lst = vec.wordsNearest(word, 20);
    System.out.println(word + " -> " + lst);
}

But it is very slow: each nearest-words query takes about 10 minutes, although the results are correct.

There is enough memory (-Xms20g -Xmx20g).

When I run the same Word2Vec example from https://code.google.com/p/word2vec/

it returns the nearest words almost instantly.

DL4J uses ND4J, which claims to be twice as fast as Numpy: http://nd4j.org/benchmarking

Is there something wrong with my code?

UPDATE: my code is based on https://github.com/deeplearning4j/dl4j-0.4-examples.git (I did not touch any dependencies, just tried to read the pre-trained Google vectors file). Word2VecRawTextExample works just fine (but its data size is relatively small).

1 answer

To improve performance, I suggest you do the following:

  • Set the environment variable OMP_NUM_THREADS to the number of logical cores on your machine

  • Install the Intel Math Kernel Library (MKL) if you are on an Intel processor

  • Add the directory containing mkl_intel_thread.dll from the Intel MKL installation to your PATH
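The first and last steps above can be sketched as shell commands. The core count and the MKL install path are assumptions; adjust them for your machine:

```shell
# Set OMP_NUM_THREADS to your number of logical cores
# (8 is an assumption here; `nproc` reports the real count on Linux).
export OMP_NUM_THREADS=8

# On Windows, also put the MKL redist directory on PATH so
# mkl_intel_thread.dll is found at runtime (path below is illustrative):
#   set PATH=C:\IntelMKL\redist\intel64\mkl;%PATH%
```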
