Check or Disable Numpy / SciPy Concurrency

Question

I am running K-Means clustering from the sklearn package.

Although I set the parameter n_jobs = 1, as indicated in the sklearn documentation, and only one process is running, that process appears to consume all the processors on my machine. That is, in top, I see the python process using, say, 400% CPU on a 4-core machine.

To be clear, if I set n_jobs = 2, say, then two python processes run, but each uses 200% CPU, again consuming all 4 cores of the machine.

I believe that the problem may be parallelization at the NumPy / SciPy level.

Is there a way to test my guess? Is there a way to disable any parallelization in NumPy / SciPy, for example?

+4

python numpy scipy scikit-learn parallel-processing

Emiller Sep 05 '14 at 20:56


1 answer

Emiller · Accepted Answer · 2014-09-09T16:53:06+0000

Indeed, BLAS, or in my case OpenBLAS, was doing parallelization.
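For anyone who wants to test the same guess, here is a rough sketch of one possible check (not necessarily how I confirmed it). It assumes Linux, where /proc/<pid>/status has a "Threads:" line, and it assumes that a threaded BLAS starts its worker threads when NumPy is loaded or first used. The helper name native_thread_count is made up for this example.

```python
def native_thread_count(pid="self"):
    """Return the number of OS threads of a process, or None if /proc is unavailable."""
    try:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except OSError:
        # Not Linux, or /proc is not mounted: nothing to report.
        return None
    return None


before = native_thread_count()

try:
    import numpy as np  # importing NumPy also loads whatever BLAS it was built against

    a = np.ones((500, 500))
    np.dot(a, a)  # a matrix product is typically dispatched to BLAS
except ImportError:
    pass  # NumPy not installed; the two counts will simply match

after = native_thread_count()
print("OS threads before:", before, "after:", after)
```

If the second number is noticeably larger than the first (for example, close to the number of cores), a multithreaded BLAS is a plausible cause of the extra CPU usage.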

The solution was to set the environment variable OMP_NUM_THREADS to 1.
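As a minimal sketch of one way to apply this, the snippet below launches a child Python process with OMP_NUM_THREADS=1 in its environment. The helper run_single_threaded and the one-line child program are placeholders for this example; in practice the child would be the script that runs K-Means.

```python
import os
import subprocess
import sys


def run_single_threaded(args):
    """Run a command with OMP_NUM_THREADS=1 added to a copy of the current environment."""
    env = dict(os.environ, OMP_NUM_THREADS="1")
    return subprocess.run(args, env=env, capture_output=True, text=True, check=True)


# Placeholder child program: it only echoes the variable it was started with.
child_code = "import os; print(os.environ['OMP_NUM_THREADS'])"
result = run_single_threaded([sys.executable, "-c", child_code])
print(result.stdout.strip())
```

Setting the variable in the shell before starting python has the same effect. Setting it from inside an already running script is only likely to work if it happens before NumPy / SciPy are first imported, since I would expect the BLAS library to read it when it is loaded.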

Then everything is right with the world.
