How to find all elements in a two-dimensional NumPy array that match a specific list?

I have a two-dimensional NumPy array, for example:

array([[1, 1, 0, 2, 2], [1, 1, 0, 2, 0], [0, 0, 0, 0, 0], [3, 3, 0, 4, 4], [3, 3, 0, 4, 4]]) 

I would like to keep all the elements of this array that are in a specific list, for example (1, 3, 4), and set every other element to 0. The desired result in the example would be:

 array([[1, 1, 0, 0, 0], [1, 1, 0, 0, 0], [0, 0, 0, 0, 0], [3, 3, 0, 4, 4], [3, 3, 0, 4, 4]]) 

I know I can do the following (as recommended in "Numpy: find elements within range"):

 np.logical_or( np.logical_or(cc_labeled == 1, cc_labeled == 3), cc_labeled == 4 ) 

but this is only reasonably efficient for the small example. In practice, iterating with a for loop and numpy.logical_or turned out to be very slow, since the list of possible values runs into the thousands (and the NumPy array is about 1000 x 1000).

2 answers

You can use np.in1d -

 A*np.in1d(A,[1,3,4]).reshape(A.shape) 

Alternatively, np.where can be used -

 np.where(np.in1d(A,[1,3,4]).reshape(A.shape),A,0) 

You can also use np.searchsorted to find matches. Its optional side argument accepts 'left' or 'right': for a value that is present in the sorted search list the two settings return different insertion indices, while for an absent value they return the same index. So the equivalent of np.in1d(A,[1,3,4]) would be -

 M = np.searchsorted([1,3,4],A.ravel(),'left') != \
     np.searchsorted([1,3,4],A.ravel(),'right')

So the end result will be -

 out = A*M.reshape(A.shape) 

Note that if the input search list is not sorted, you need to pass its argsort indices as the optional sorter argument of np.searchsorted.
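The note about unsorted search lists can be sketched as follows. This is a minimal illustration, not part of the original answer: the unsorted list `lst` and the names `order`, `left`, `right` and `mask` are made up for the example, and `A` is the array from the question.

```python
import numpy as np

A = np.array([[1, 1, 0, 2, 2],
              [1, 1, 0, 2, 0],
              [0, 0, 0, 0, 0],
              [3, 3, 0, 4, 4],
              [3, 3, 0, 4, 4]])

# Hypothetical unsorted search list (same values as in the question).
lst = np.array([4, 1, 3])

# Indices that would sort lst; searchsorted uses them instead of a sorted copy.
order = np.argsort(lst)

flat = A.ravel()
left = np.searchsorted(lst, flat, side='left', sorter=order)
right = np.searchsorted(lst, flat, side='right', sorter=order)

# A value is in lst exactly when its left and right insertion points differ.
mask = (left != right).reshape(A.shape)
out = A * mask
```

Sorting the list once up front with np.sort would work just as well; the sorter argument only avoids creating the sorted copy.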

Sample run -

 In [321]: A
 Out[321]:
 array([[1, 1, 0, 2, 2],
        [1, 1, 0, 2, 0],
        [0, 0, 0, 0, 0],
        [3, 3, 0, 4, 4],
        [3, 3, 0, 4, 4]])

 In [322]: A*np.in1d(A,[1,3,4]).reshape(A.shape)
 Out[322]:
 array([[1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [3, 3, 0, 4, 4],
        [3, 3, 0, 4, 4]])

 In [323]: np.where(np.in1d(A,[1,3,4]).reshape(A.shape),A,0)
 Out[323]:
 array([[1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [3, 3, 0, 4, 4],
        [3, 3, 0, 4, 4]])

 In [324]: M = np.searchsorted([1,3,4],A.ravel(),'left') != \
      ...:     np.searchsorted([1,3,4],A.ravel(),'right')
      ...: A*M.reshape(A.shape)
      ...:
 Out[324]:
 array([[1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [3, 3, 0, 4, 4],
        [3, 3, 0, 4, 4]])

Runtime test and output verification -

 In [309]: # Inputs
      ...: A = np.random.randint(0,1000,(400,500))
      ...: lst = np.sort(np.random.randint(0,1000,(100))).tolist()
      ...:
      ...: def func1(A,lst):
      ...:     return A*np.in1d(A,lst).reshape(A.shape)
      ...:
      ...: def func2(A,lst):
      ...:     return np.where(np.in1d(A,lst).reshape(A.shape),A,0)
      ...:
      ...: def func3(A,lst):
      ...:     mask = np.searchsorted(lst,A.ravel(),'left') != \
      ...:            np.searchsorted(lst,A.ravel(),'right')
      ...:     return A*mask.reshape(A.shape)
      ...:

 In [310]: np.allclose(func1(A,lst),func2(A,lst))
 Out[310]: True

 In [311]: np.allclose(func1(A,lst),func3(A,lst))
 Out[311]: True

 In [312]: %timeit func1(A,lst)
 10 loops, best of 3: 30.9 ms per loop

 In [313]: %timeit func2(A,lst)
 10 loops, best of 3: 30.9 ms per loop

 In [314]: %timeit func3(A,lst)
 10 loops, best of 3: 28.6 ms per loop

Use np.in1d:

 np.in1d(arr, [1,3,4]).reshape(arr.shape) 

in1d, as the name implies, operates on the flattened array and returns a 1-D boolean mask, so you need to reshape the result to the original shape afterwards.
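On newer NumPy versions the reshape step can be avoided: np.isin (available since NumPy 1.13) returns a mask with the same shape as its input, and np.in1d is deprecated as of NumPy 2.0. The sketch below is not from the original answers; it assumes `arr` is the array from the question and uses np.where to zero out the non-matching elements.

```python
import numpy as np

arr = np.array([[1, 1, 0, 2, 2],
                [1, 1, 0, 2, 0],
                [0, 0, 0, 0, 0],
                [3, 3, 0, 4, 4],
                [3, 3, 0, 4, 4]])

# np.isin keeps the shape of arr, so no reshape is needed.
mask = np.isin(arr, [1, 3, 4])

# Keep matching elements and set everything else to 0.
out = np.where(mask, arr, 0)
```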
