A shorter version of this numpy array indexing

I have the following code in Python (for a NumPy array or scipy.sparse matrices), and it works:

X[a,:][:,b] 

But it does not look elegant. 'a' and 'b' are 1-D boolean masks.

'a' has the same length as X.shape[0] and 'b' has the same length as X.shape[1].

I tried X[a,b], but it does not work.

What I'm trying to do is select specific rows and columns at the same time. For example, select rows 0, 7, 8, then from this result select all rows of columns 2, 3, 4.

How would you make it shorter and more elegant?

1 answer

You can use np.ix_ for this kind of broadcasted indexing, for example:

 X[np.ix_(a,b)] 

It is not shorter than your original code, but it should be faster, because it avoids an intermediate array: the original code first creates X[a,:] with one indexing step and then indexes that result again with [:,b] to get the final output.
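To make the broadcasting concrete, here is a minimal sketch using a small made-up 6x5 array and hand-written masks (these values are illustrative assumptions, not the question's data). It shows that np.ix_ turns the two masks into a column of row positions and a row of column positions, which then broadcast against each other in a single indexing step:

```python
import numpy as np

# Illustrative data only: a 6x5 array and two hand-written boolean masks.
X = np.arange(30).reshape(6, 5)
a = np.array([True, False, True, False, False, True])  # length X.shape[0]
b = np.array([False, True, True, False, True])         # length X.shape[1]

# np.ix_ converts each mask to integer positions and reshapes them
# so that they broadcast into a (rows x cols) grid.
rows, cols = np.ix_(a, b)
print(rows.shape, cols.shape)  # (3, 1) (1, 3)

one_step = X[rows, cols]       # equivalent to X[np.ix_(a, b)]
two_step = X[a, :][:, b]       # original approach, builds X[a, :] first

print(np.array_equal(one_step, two_step))
```

Both expressions select the same 3x3 block; the difference is only that the one-step form never materializes X[a, :].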

In addition, this method works whether a and b are integer index arrays or boolean masks.
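As a rough illustration of that point, the sketch below builds integer index arrays from boolean masks with np.flatnonzero and checks that both forms give the same result. The array and masks are made up for the example. It also shows what plain X[a, b] does when both masks happen to have the same number of True entries: it pairs the positions element by element instead of forming a grid, which is presumably why it did not give the expected result in the question.

```python
import numpy as np

# Illustrative data only.
X = np.arange(30).reshape(6, 5)
a = np.array([True, False, True, False, False, True])
b = np.array([False, True, True, False, True])

# Integer positions equivalent to the boolean masks.
a_idx = np.flatnonzero(a)   # positions of True in a
b_idx = np.flatnonzero(b)   # positions of True in b

from_masks = X[np.ix_(a, b)]
from_ints = X[np.ix_(a_idx, b_idx)]
print(np.array_equal(from_masks, from_ints))

# Plain X[a, b] pairs the positions up: (a_idx[0], b_idx[0]), (a_idx[1], b_idx[1]), ...
# so it returns a 1-D array rather than a 2-D block.
paired = X[a, b]
print(paired.shape)
```

If the two masks have different numbers of True entries, X[a, b] cannot pair them up and raises an IndexError instead.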

Sample run

 In [141]: X = np.random.randint(0,99,(6,5))

 In [142]: m,n = X.shape

 In [143]: a = np.in1d(np.arange(m),np.random.randint(0,m,(m)))

 In [144]: b = np.in1d(np.arange(n),np.random.randint(0,n,(n)))

 In [145]: X[a,:][:,b]
 Out[145]:
 array([[17, 81, 64],
        [87, 16, 54],
        [98, 22, 11],
        [26, 54, 64]])

 In [146]: X[np.ix_(a,b)]
 Out[146]:
 array([[17, 81, 64],
        [87, 16, 54],
        [98, 22, 11],
        [26, 54, 64]])

Runtime test

 In [147]: X = np.random.randint(0,99,(600,500))

 In [148]: m,n = X.shape

 In [149]: a = np.in1d(np.arange(m),np.random.randint(0,m,(m)))

 In [150]: b = np.in1d(np.arange(n),np.random.randint(0,n,(n)))

 In [151]: %timeit X[a,:][:,b]
 1000 loops, best of 3: 1.74 ms per loop

 In [152]: %timeit X[np.ix_(a,b)]
 1000 loops, best of 3: 1.24 ms per loop