Short answer
export OMP_NUM_THREADS=1 or dask-worker --nthreads 1
Explanation
The OMP_NUM_THREADS environment variable controls the number of threads that many libraries, including the BLAS library that powers numpy.dot, use in their computations, for example matrix multiplication.
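If you would rather set the variable from Python than from the shell, a minimal sketch might look like the following. It assumes your numpy is linked against a BLAS build that honors OMP_NUM_THREADS and reads it when the library is first loaded, so the assignment has to come before the numpy import. The array sizes are arbitrary.

```python
import os

# Assumption: the BLAS backend reads OMP_NUM_THREADS when it is loaded,
# so this must run before numpy is imported for the first time.
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np

a = np.ones((200, 200))
c = a.dot(a)  # with the limit in place, this should run on one BLAS thread
print(c.shape)
```

In a dask.distributed setup this would need to happen in each worker process, which is why exporting the variable in the shell before starting dask-worker is usually simpler.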
The conflict is that you have two parallel libraries calling each other: BLAS and dask.distributed. Each library is designed to use as many threads as the system has logical cores.
For example, if you had eight cores, dask.distributed might run your function f eight times at once on different threads. The numpy.dot call inside f would use eight threads per call, resulting in 64 threads running at once.
This is actually fine. You will take a performance hit, but everything can still run correctly. It will just be slower than if you used only eight threads at a time, either by limiting dask.distributed or by limiting BLAS.
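The arithmetic behind this can be sketched in a few lines. The helper name peak_threads and the eight-core figure are illustrative assumptions, not part of either library.

```python
def peak_threads(dask_threads, blas_threads):
    """Threads running at once when every Dask task calls into BLAS."""
    return dask_threads * blas_threads


cores = 8  # assumed number of logical cores

both_unlimited = peak_threads(cores, cores)  # neither library restricted
blas_limited = peak_threads(cores, 1)        # OMP_NUM_THREADS=1
dask_limited = peak_threads(1, cores)        # dask-worker --nthreads 1

print(both_unlimited, blas_limited, dask_limited)  # 64 8 8
```

Either restriction brings the total back to the number of cores. Which one is better depends on whether your workload is many small tasks or a few large matrix operations.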
Your system probably has OMP_THREAD_LIMIT set to some reasonable number, such as 16, in order to warn you when this happens.