Why is numpy / scipy faster without OpenBLAS?

I made two installations:

  • brew install numpy (and scipy) --with-openblas
  • Cloned the git repositories (for numpy and scipy) and built them myself

Then I cloned two handy scripts for testing these libraries in a multi-threaded environment:

git clone https://gist.github.com/3842524.git

Then for each installation I ran show_config:

python -c "import scipy as np; np.show_config()"

Everything looks fine for installation 1:

lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/opt/openblas/lib']
    language = f77
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/opt/openblas/lib']
    language = f77
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/opt/openblas/lib']
    language = f77
blas_mkl_info:
    NOT AVAILABLE

But for installation 2 things are not so bright:

lapack_opt_info:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    extra_compile_args = ['-msse3']
    define_macros = [('NO_ATLAS_INFO', 3)]
blas_opt_info:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers']
    define_macros = [('NO_ATLAS_INFO', 3)]

So it seems I failed to link OpenBLAS properly. That is fine for now, though; here are the results. All tests were performed on an iMac, Yosemite, i7-4790K, 4 cores, hyper-threaded.
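The runs below set OMP_NUM_THREADS by hand for each thread count. As a rough sketch, the same sweep could be scripted; `run_with_threads` is a hypothetical helper name, the code assumes Python 3.7 or later, and it assumes the test scripts from the gist are in the current directory.

```python
import os
import subprocess
import sys


def run_with_threads(cmd, thread_counts=(1, 2, 4, 8)):
    """Run cmd once per thread count and return {thread_count: stdout}."""
    results = {}
    for n in thread_counts:
        # Copy the current environment and override only OMP_NUM_THREADS.
        env = dict(os.environ, OMP_NUM_THREADS=str(n))
        proc = subprocess.run(cmd, env=env, capture_output=True,
                              text=True, check=True)
        results[n] = proc.stdout
    return results


# Example (assumes test_numpy.py exists in the working directory):
# for n, out in run_with_threads([sys.executable, "test_numpy.py"]).items():
#     print("OMP_NUM_THREADS=%d" % n)
#     print(out)
```

Each run starts a fresh interpreter, so the variable is set before the BLAS library is loaded.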

First installation with OpenBLAS:

Numpy:

OMP_NUM_THREADS=1 python test_numpy.py
FAST BLAS
version: 1.9.2
maxint: 9223372036854775807
dot: 0.126578998566 sec

OMP_NUM_THREADS=2 python test_numpy.py
FAST BLAS
version: 1.9.2
maxint: 9223372036854775807
dot: 0.0640147686005 sec

OMP_NUM_THREADS=4 python test_numpy.py
FAST BLAS
version: 1.9.2
maxint: 9223372036854775807
dot: 0.0360922336578 sec

OMP_NUM_THREADS=8 python test_numpy.py
FAST BLAS
version: 1.9.2
maxint: 9223372036854775807
dot: 0.0364527702332 sec

SciPy:

OMP_NUM_THREADS=1 python test_scipy.py
cholesky: 0.0276656150818 sec
svd: 0.732437372208 sec

OMP_NUM_THREADS=2 python test_scipy.py
cholesky: 0.0182101726532 sec
svd: 0.441690778732 sec

OMP_NUM_THREADS=4 python test_scipy.py
cholesky: 0.0130400180817 sec
svd: 0.316107988358 sec

OMP_NUM_THREADS=8 python test_scipy.py
cholesky: 0.012854385376 sec
svd: 0.315939807892 sec

Second installation without OpenBLAS:

Numpy:

OMP_NUM_THREADS=1 python test_numpy.py
slow blas
version: 1.10.0.dev0+3c5409e
maxint: 9223372036854775807
dot: 0.0371072292328 sec

OMP_NUM_THREADS=2 python test_numpy.py
slow blas
version: 1.10.0.dev0+3c5409e
maxint: 9223372036854775807
dot: 0.0215149879456 sec

OMP_NUM_THREADS=4 python test_numpy.py
slow blas
version: 1.10.0.dev0+3c5409e
maxint: 9223372036854775807
dot: 0.0146862030029 sec

OMP_NUM_THREADS=8 python test_numpy.py
slow blas
version: 1.10.0.dev0+3c5409e
maxint: 9223372036854775807
dot: 0.0141334056854 sec

SciPy:

OMP_NUM_THREADS=1 python test_scipy.py
cholesky: 0.0109382152557 sec
svd: 0.32529540062 sec

OMP_NUM_THREADS=2 python test_scipy.py
cholesky: 0.00988121032715 sec
svd: 0.331357002258 sec

OMP_NUM_THREADS=4 python test_scipy.py
cholesky: 0.00916676521301 sec
svd: 0.318637990952 sec

OMP_NUM_THREADS=8 python test_scipy.py
cholesky: 0.00931282043457 sec
svd: 0.324427986145 sec

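To my surprise, the second installation is faster than the first. With scipy, adding threads brings no improvement, yet even a single thread is as fast as (svd) or faster than (cholesky) 4 threads with OpenBLAS.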
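To put numbers on this comparison, the ratios can be computed from the dot timings reported above. This is only a sketch: the values are copied from the outputs, while the dictionary and function names are made up for illustration.

```python
# dot timings in seconds, copied from the outputs above (threads -> seconds).
first_install_dot = {   # installation 1, linked against OpenBLAS
    1: 0.126578998566, 2: 0.0640147686005,
    4: 0.0360922336578, 8: 0.0364527702332,
}
second_install_dot = {  # installation 2, linked against Accelerate per show_config
    1: 0.0371072292328, 2: 0.0215149879456,
    4: 0.0146862030029, 8: 0.0141334056854,
}


def speedup(timings, threads):
    """Speedup of a run with `threads` threads relative to the 1-thread run."""
    return timings[1] / timings[threads]


def relative_time(a, b, threads):
    """How many times slower installation a is than installation b."""
    return a[threads] / b[threads]


for n in (2, 4, 8):
    print("threads=%d  first: %.2fx  second: %.2fx  first/second: %.2f" % (
        n,
        speedup(first_install_dot, n),
        speedup(second_install_dot, n),
        relative_time(first_install_dot, second_install_dot, n),
    ))
```

At 4 threads the first installation scales by about 3.5x and the second by about 2.5x, but the second starts from a much lower single-thread time.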

Does anyone have an idea why that is?


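There are two obvious differences that could explain the discrepancy:

  • You are comparing two different versions of numpy. The one linked against OpenBLAS, which you installed with Homebrew, is 1.9.1, whereas the one you built from source is 1.10.0.dev0+3c5409e.

  • Although the newer version is not linked against OpenBLAS, it is linked against the Apple Accelerate Framework, a different optimized BLAS implementation.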


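The reason your test script still reports slow blas in the second case is an incompatibility with recent versions of numpy. The script checks whether numpy is linked against an optimized BLAS library by testing for the presence of numpy.core._dotblas: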

try:
    import numpy.core._dotblas
    print 'FAST BLAS'
except ImportError:
    print 'slow blas'

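In older versions of numpy this C module was compiled during installation only if an optimized BLAS library was found. However, _dotblas was removed altogether in development versions > 1.10.0 (as discussed in an earlier SO question), so the script will always report slow blas there.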

I wrote an updated version of the numpy test script that reports the BLAS linkage correctly for recent versions.
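As an illustration of a check that does not depend on _dotblas, the linkage can be read from the show_config() output instead. The sketch below is not the updated script mentioned above; it assumes Python 3 and the distutils-style output format shown in the question, and the names `detect_blas`, `numpy_config_text`, `LINK_KEYS` and `KNOWN_BACKENDS` are hypothetical.

```python
import contextlib
import io

# Only these keys describe what is actually linked (assumption based on the
# show_config output in the question); section names such as blas_mkl_info
# and macros such as NO_ATLAS_INFO are ignored on purpose.
LINK_KEYS = ("libraries", "extra_link_args")
KNOWN_BACKENDS = ("openblas", "accelerate", "mkl", "atlas")  # not exhaustive


def detect_blas(config_text):
    """Return the known BLAS backends named in link-related config lines."""
    found = []
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if not sep or key.strip() not in LINK_KEYS:
            continue
        value = value.lower()
        for name in KNOWN_BACKENDS:
            if name in value and name not in found:
                found.append(name)
    return found


def numpy_config_text():
    """Capture what numpy.show_config() prints (requires numpy)."""
    import numpy as np
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        np.show_config()
    return buf.getvalue()


# Example: print(detect_blas(numpy_config_text()) or "no known BLAS found")
```

Applied to the two show_config listings in the question, this reports openblas for the first installation and accelerate for the second.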
