Faster way to split a numpy array according to threshold

Suppose I have a numpy array:

X = np.arange(1000) 

and a threshold:

 thresh = 50 

I want to divide X into two sections, X_l and X_r, so that each element of X_l is less than or equal to thresh, and each element of X_r is greater than thresh. These two sections are then passed to a recursive function.

Using numpy, I create a boolean array and use it to split X:

    Z = X <= thresh
    X_l, X_r = X[Z == 1], X[Z == 0]
    recursive_call(X_l, X_r)

This is done many times; is there a way to make it faster? Is it possible to avoid copying the partitions on every call?

2 Answers

X[~Z] is faster than X[Z == 0]:

    In [13]: import numpy as np

    In [14]: X = np.random.random_integers(0, 1000, size=1000)

    In [15]: thresh = 50

    In [18]: Z = X <= thresh

    In [19]: %timeit X_l, X_r = X[Z == 1], X[Z == 0]
    10000 loops, best of 3: 23.9 us per loop

    In [20]: %timeit X_l, X_r = X[Z], X[~Z]
    100000 loops, best of 3: 16.4 us per loop
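A minimal, runnable sketch of this idea; the split_by_thresh helper name is mine, not from the question, and the in-place negation is an optional extra that also avoids the temporary mask that ~Z allocates:

    import numpy as np

    def split_by_thresh(X, thresh):
        Z = X <= thresh              # compute the boolean mask once
        left = X[Z]                  # elements <= thresh
        np.logical_not(Z, out=Z)     # negate the mask in place: no temporary
        right = X[Z]                 # elements > thresh
        return left, right

    X = np.arange(1000)
    X_l, X_r = split_by_thresh(X, 50)
    assert X_l.max() == 50 and X_r.min() == 51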

Did you profile to determine that this is really the bottleneck in your code? If your code spends only 1% of its time on this split operation, then no matter how much you optimize it, the overall impact will be at most 1%.

You may gain more by rethinking your algorithm or data structures than by optimizing this one operation. And if it really is a bottleneck, you might also be better off rewriting that piece of code in C or Cython...

When your numpy arrays are around size 1000, there is a chance that plain Python lists/sets/dicts are faster. The speed advantage of NumPy arrays sometimes does not show until the arrays are sufficiently large. You might want to rewrite your code in pure Python and compare the two versions with timeit.
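One way to set up that comparison with the timeit module (a sketch only; which version wins will depend on your machine and the array size):

    import timeit

    setup = "import numpy as np; X = np.arange(1000); L = X.tolist(); thresh = 50"

    numpy_split = "Z = X <= thresh; X_l, X_r = X[Z], X[~Z]"
    list_split = ("X_l = [x for x in L if x <= thresh]; "
                  "X_r = [x for x in L if x > thresh]")

    print("numpy:", timeit.timeit(numpy_split, setup, number=10000))
    print("lists:", timeit.timeit(list_split, setup, number=10000))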

Hm, let me rephrase that. It is not the array size itself that makes NumPy faster or slower. It's just that working with lots of small NumPy arrays is sometimes a sign that you are creating many of them, and creating a NumPy array is much slower than creating, say, a Python list:

    In [21]: %timeit np.array([])
    100000 loops, best of 3: 4.31 us per loop

    In [22]: %timeit []
    10000000 loops, best of 3: 29.5 ns per loop

    In [23]: 4310/29.5
    Out[23]: 146.10169491525424

In addition, when coding in pure Python you are likely to use dicts and sets, which have no direct NumPy equivalent. That can lead you to an alternative, faster algorithm.


Is your array always sorted? In your example you use arange, which is sorted, so there is no need for boolean indexing; you can simply slice the array at the appropriate place. This avoids "advanced indexing", so no copy of the array needs to be made.

    X = np.arange(0, 2*thresh)
    i = X.searchsorted(thresh, side='right')  # side='right' for `<=`
    X_l, X_r = X[:i], X[i:]
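A quick sanity check (assuming the snippet above has run) that the slices really are views into X rather than copies:

    # Basic slices share memory with the original array; fancy indexing copies.
    assert X_l.base is X and X_r.base is X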

This saves a lot of time for sorted arrays, but obviously won't work otherwise:

    thresh = 500
    X = np.arange(2*thresh)

    %%timeit
    i = X.searchsorted(thresh, side='right')
    X_l, X_r = X[:i], X[i:]
    100000 loops, best of 3: 5.16 µs per loop

    %%timeit
    Z = X <= thresh
    X_l, X_r = X[Z], X[~Z]
    100000 loops, best of 3: 12.1 µs per loop
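If the input is not already sorted, a possible variation (my sketch, not part of the answer above) is to pay for one sort up front, after which every level of the recursion works on slices, i.e. views, so no partition is ever copied. The split_recursively driver and its midpoint threshold are hypothetical stand-ins for the question's recursive_call:

    import numpy as np

    def split_recursively(X, depth=0, max_depth=3):
        # X is assumed sorted; both halves below are views, not copies.
        if depth == max_depth or len(X) < 2:
            return
        thresh = X[len(X) // 2]                   # placeholder threshold choice
        i = X.searchsorted(thresh, side='right')  # side='right' for `<=`
        X_l, X_r = X[:i], X[i:]
        if len(X_l) and len(X_r):                 # guard degenerate splits
            split_recursively(X_l, depth + 1, max_depth)
            split_recursively(X_r, depth + 1, max_depth)

    X = np.sort(np.random.randint(0, 1000, size=1000))  # sort once up front
    split_recursively(X)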
