Speed up NumPy's where function

I am trying to extract the indices of all values in a 1D array of numbers that exceed a certain threshold. The array has a length of 1e9.

My approach uses NumPy:

    import numpy as np
    idxs = np.where(data > threshold)

It takes about 20 minutes, which is unacceptable. How can I speed up this operation? Or are there any faster alternatives?

(To be specific: Mac OS X 10.6.7, 1.86 GHz Intel, 4 GB of RAM, with nothing else running.)

1 answer

Try a boolean mask. Note that it selects the matching values rather than their indices.

The syntax looks like this:

    b = a[a > threshold]

Unlike where, which returns indices, b contains the elements of a that satisfy the boolean condition in the index. Boolean indexing copies the selected elements into a new array; it does not create a view of a.
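A minimal sketch of how the two forms differ, on a small made-up array (the values and the threshold are illustrative assumptions, not the asker's data). np.where returns indices, while the boolean mask returns the matching values as a separate array:

```python
import numpy as np

# Small illustrative array; the values and threshold are made up.
a = np.array([0.1, 0.7, 0.3, 0.9, 0.5])
threshold = 0.4

# With a single argument, np.where returns a tuple of index arrays,
# one per dimension; [0] picks the indices for this 1D array.
idxs = np.where(a > threshold)[0]

# Boolean-mask indexing returns the selected values in a new array.
b = a[a > threshold]

# Indexing with the indices from np.where yields the same values as the mask.
same = np.array_equal(a[idxs], b)

# b does not share memory with a, so writing to b leaves a untouched.
b[0] = -1.0
a_unchanged = bool(a[1] == 0.7)

print(idxs, b, same, a_unchanged)
```

If the indices themselves are needed, as in the question, the np.where form (or indexing afterwards with idxs) is the one to use.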

Example:

    import time
    import numpy as np

    a = np.random.random_sample(int(1e9))
    t1 = time.time()
    b = a[a > 0.5]
    print(time.time() - t1, 'seconds')

On my machine, that prints 22.389815092086792 seconds.


Edit:

I tried the same with np.where, and it is just as fast. That makes me suspicious: are you removing these values from the array?
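One way to repeat this comparison is sketched below. The array size of 1e6 is an assumption chosen so the script fits comfortably in memory; it is far smaller than the 1e9 elements in the question, and the absolute timings will differ from machine to machine.

```python
import time
import numpy as np

# Assumed size: 1e6 elements, much smaller than the 1e9 in the question.
n = int(1e6)
a = np.random.random_sample(n)
threshold = 0.5

# Time index extraction with np.where.
t0 = time.perf_counter()
idxs = np.where(a > threshold)[0]
t_where = time.perf_counter() - t0

# Time value extraction with a boolean mask.
t0 = time.perf_counter()
b = a[a > threshold]
t_mask = time.perf_counter() - t0

print('np.where:', t_where, 'seconds')
print('mask:    ', t_mask, 'seconds')
```

Both variants build the same temporary boolean array from a > threshold, so a large gap between them would point to something outside these two lines.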
