Check or Disable NumPy / SciPy Concurrency

I am running K-Means clustering from the sklearn package.

Although I am setting the parameter n_jobs = 1 as indicated in the sklearn documentation, and although only one process is running, this process appears to consume all the processors on my machine. That is, in top, I see the python process using, say, 400% CPU on a 4-core machine.

To be clear, if I set n_jobs = 2, say, then I get two python processes, but each uses 200% of the CPU, again consuming all 4 cores of the machine.

I believe that the problem may be parallelization at the NumPy / SciPy level.

Is there a way to test my guess? Is there a way to disable any parallelization in NumPy / SciPy, for example?

1 answer

Indeed, BLAS, or in my case OpenBLAS, was doing parallelization.

The solution was to set the environment variable OMP_NUM_THREADS to 1.
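For anyone who wants to do this from inside the script rather than in the shell, here is a minimal sketch. It assumes the variable has to be set before NumPy / SciPy / sklearn are imported, since the BLAS library typically reads it when it is loaded. The helper name limit_blas_threads is made up for illustration, and OPENBLAS_NUM_THREADS and MKL_NUM_THREADS are extra variables I am adding for other BLAS builds; only OMP_NUM_THREADS was needed in my case.

```python
import os


def limit_blas_threads(n_threads: int = 1) -> None:
    """Hypothetical helper: cap the threads used by common BLAS backends.

    OMP_NUM_THREADS is the variable from the answer above. The other two
    are assumptions added to cover OpenBLAS- and MKL-specific settings.
    """
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n_threads)


# Call this at the very top of the script, before any numerical imports.
limit_blas_threads(1)

# Only now import the numerical stack, for example:
# from sklearn.cluster import KMeans
```

After this, top should show the process staying at around 100% with n_jobs = 1 if BLAS threading was the cause.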

Then everything is right with the world.
