I am running K-Means clustering from the sklearn package.
Although I set the parameter n_jobs = 1, as indicated in the sklearn documentation, and although only one process is running, that process appears to consume all the processors on my machine. That is, in top I see the python process using, say, 400% CPU on a 4-core computer.
To be clear, if I set n_jobs = 2, say, then two python processes run, but each uses 200% CPU, again consuming all 4 cores of the machine.
I believe that the problem may be parallelization at the NumPy / SciPy level.
Is there a way to test my guess? Is there a way to disable any parallelization in NumPy / SciPy, for example?
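For context, here is a minimal sketch of one check I could try. It assumes that NumPy / SciPy are linked against a BLAS or OpenMP backend (OpenBLAS, MKL, or similar) that honors the usual thread-count environment variables. The helper name limit_blas_threads is hypothetical, and whether my build respects these variables is exactly what I am unsure about.

```python
import os

# Environment variables commonly read by OpenMP, OpenBLAS, and MKL.
# Assumption: the NumPy/SciPy build on this machine uses one of these backends.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_blas_threads(n=1):
    """Hypothetical helper: cap the backend thread count via environment variables."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n)


# This must run BEFORE numpy, scipy, or sklearn are imported,
# because the backends typically read these variables once at load time.
limit_blas_threads(1)

# After this point: import numpy / sklearn, rerun K-Means, and watch top.
```

If CPU usage in top drops to about 100% per process after this, that would support the guess that the extra threads come from the NumPy / SciPy layer rather than from sklearn itself.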