To get more speed you need to bypass np.linalg.det. The reason is that det() is a Python function: it first does a lot of checking, then calls the Fortran routine, and finally does some array calculations to get the result.
Here is the code from numpy:
    def slogdet(a):
        a = asarray(a)
        _assertRank2(a)
        _assertSquareness(a)
        t, result_t = _commonType(a)
        a = _fastCopyAndTranspose(t, a)
        a = _to_native_byte_order(a)
        n = a.shape[0]
        if isComplexType(t):
            lapack_routine = lapack_lite.zgetrf
        else:
            lapack_routine = lapack_lite.dgetrf
        pivots = zeros((n,), fortran_int)
        results = lapack_routine(n, n, a, n, pivots, 0)
        info = results['info']
        if (info < 0):
            raise TypeError, "Illegal input to Fortran routine"
        elif (info > 0):
            return (t(0.0), _realType(t)(-Inf))
        sign = 1. - 2. * (add.reduce(pivots != arange(1, n + 1)) % 2)
        d = diagonal(a)
        absd = absolute(d)
        sign *= multiply.reduce(d / absd)
        log(absd, absd)
        logdet = add.reduce(absd, axis=-1)
        return sign, logdet

    def det(a):
        sign, logdet = slogdet(a)
        return sign * exp(logdet)
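As an aside, the arithmetic that follows the LAPACK call can be sketched in pure Python. This is only an illustration of the idea (sign from the number of row swaps, log-determinant from the diagonal of the LU factors), not NumPy's implementation. The helper names `lu_inplace`, `slogdet_from_lu` and `det_via_lu` are made up for this sketch, and the pivot list is assumed to follow the 1-based row-swap convention that the comparison against `arange(1, n + 1)` above relies on.

```python
import math

def lu_inplace(a):
    """LU-factor a square matrix (list of row lists) in place, with partial pivoting.

    Returns a list of 1-based pivot rows: pivots[k] == r means that row k+1
    was swapped with row r during step k (hypothetical helper for illustration).
    """
    n = len(a)
    pivots = []
    for k in range(n):
        # pick the row with the largest absolute value in column k
        r = max(range(k, n), key=lambda i: abs(a[i][k]))
        pivots.append(r + 1)
        if r != k:
            a[k], a[r] = a[r], a[k]
        if a[k][k] == 0.0:
            continue  # column is all zeros below the diagonal: singular matrix
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return pivots

def slogdet_from_lu(a, pivots):
    """Recover (sign, log|det|) from the LU factors and the pivot list."""
    n = len(a)
    # every pivot that differs from its own 1-based index is one row swap
    swaps = sum(1 for k in range(n) if pivots[k] != k + 1)
    sign = 1.0 - 2.0 * (swaps % 2)
    logdet = 0.0
    for k in range(n):
        d = a[k][k]
        if d == 0.0:
            return 0.0, float("-inf")
        sign *= d / abs(d)           # sign of each diagonal entry
        logdet += math.log(abs(d))   # sum of logs instead of a product
    return sign, logdet

def det_via_lu(matrix):
    a = [[float(x) for x in row] for row in matrix]  # work on a copy
    sign, logdet = slogdet_from_lu(a, lu_inplace(a))
    return sign * math.exp(logdet)
```

Only `slogdet_from_lu` corresponds to the part that can be vectorized over many matrices; the factorization itself is what the Fortran routine does.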
To speed up det(), you can omit the checks (it then becomes your responsibility to pass valid input), collect the Fortran results for all the small matrices in one array, and do the final calculations for all of them at once, without a Python loop.
Here is my result:
    import numpy as np
    # private NumPy module; dgetrf is not exposed by every NumPy version
    from numpy.linalg import lapack_lite

    N = 1000
    M = np.random.rand(N*10*10).reshape(N, 10, 10)

    def dets(a):
        # baseline: one np.linalg.det call per matrix
        length = a.shape[0]
        dm = np.zeros(length)
        for i in range(length):
            dm[i] = np.linalg.det(a[i])
        return dm

    def dets_fast(a):
        # a: float64 array of shape (m, n, n); overwritten in place by the LU factors
        m = a.shape[0]
        n = a.shape[1]
        lapack_routine = lapack_lite.dgetrf
        pivots = np.zeros((m, n), np.intc)
        flags = np.arange(1, n + 1).reshape(1, -1)
        # the only remaining Python loop: one LU factorization per matrix
        for i in range(m):
            tmp = a[i]
            lapack_routine(n, n, tmp, n, pivots[i], 0)
        # everything below is done for all matrices at once
        sign = 1. - 2. * (np.add.reduce(pivots != flags, axis=1) % 2)
        idx = np.arange(n)
        d = a[:, idx, idx]  # diagonals of all LU factors, shape (m, n)
        absd = np.absolute(d)
        sign *= np.multiply.reduce(d / absd, axis=1)
        np.log(absd, absd)
        logdet = np.add.reduce(absd, axis=-1)
        return sign * np.exp(logdet)

    print(np.allclose(dets(M), dets_fast(M.copy())))
and speed:
    timeit dets(M)
    10 loops, best of 3: 159 ms per loop

    timeit dets_fast(M)
    100 loops, best of 3: 10.7 ms per loop
So this gives roughly a 15x speedup, which is a good result without any compiled code.
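Note: I skipped the error check on the Fortran routine's result. Also, dets_fast() overwrites its input with the LU factors, which is why the comparison above passes M.copy().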
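A possible alternative, assuming a NumPy version in which np.linalg.det accepts stacked input of shape (..., M, M): the whole batch can then be passed in a single call, with no private modules involved. The sketch below only checks that the stacked call agrees with a per-matrix loop; it makes no timing claim. The names `dets_loop` and `dets_stacked`, and the use of `np.random.default_rng`, are choices made for this example.

```python
import numpy as np

# assumption: np.random.default_rng is available in the installed NumPy
rng = np.random.default_rng(0)
M = rng.random((1000, 10, 10))

# one call per matrix, as in dets() above
dets_loop = np.array([np.linalg.det(m) for m in M])

# one call for the whole stack; the result has shape (1000,)
dets_stacked = np.linalg.det(M)

print(np.allclose(dets_loop, dets_stacked))
```

Unlike dets_fast(), this leaves M unchanged and keeps NumPy's own input checks.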