The most efficient way to pull given rows from a 2D array?

Question

I have a 2-D NumPy array with 100,000+ rows. I need to return a subset of these rows (and I need to do this many thousands of times, so efficiency is important).

An example layout is this:

import numpy as np
a = np.array([[1,5.5],
             [2,4.5],
             [3,9.0],
             [4,8.01]])
b = np.array([2,4])

So I want to return the rows of a whose first-column value appears in b:

c=[[2,4.5],
   [4,8.01]]

The difference, of course, is that there are many more rows in both a and b, so I would like to avoid a loop. I also played with creating a dictionary and using np.nonzero, but I'm still a little puzzled.

Thanks in advance for any ideas!

EDIT: Note that in this case the values in b are identifiers, not indices. Here is a revised example:

import numpy as np
a = np.array([[102,5.5],
             [204,4.5],
             [343,9.0],
             [40,8.01]])
b = np.array([102,343])

And I want to return:

c = [[102,5.5],
     [343,9.0]]

+5

python arrays numpy mask

mishaF, Mar 31 '11 at 19:41

2 answers

A concise way to do this:

c = a[(a[:,0] == b[:,None]).any(0)]
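
To make the one-liner above easier to follow, here is a step-by-step sketch using the revised example from the question. The intermediate names (matches, mask) are only illustrative. np.isin is not part of the original answer; it is shown as an assumed equivalent, available in NumPy 1.13 and later.

```python
import numpy as np

a = np.array([[102, 5.5],
              [204, 4.5],
              [343, 9.0],
              [40, 8.01]])
b = np.array([102, 343])

# b[:, None] has shape (len(b), 1). Comparing it with a[:, 0], which has
# shape (len(a),), broadcasts to a (len(b), len(a)) boolean table with
# one row per identifier in b.
matches = a[:, 0] == b[:, None]

# Keep a row of a if any identifier in b matched it.
mask = matches.any(axis=0)
c = a[mask]

# Assumed equivalent (not from the original answer): np.isin builds the
# same mask without the intermediate table being spelled out.
c_isin = a[np.isin(a[:, 0], b)]
```

Note that the intermediate table holds len(b) * len(a) booleans, so memory use grows with the product of the two sizes. The selected rows come back in the order they appear in a, not in the order of b.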

Edit: For larger b, sorting b and using a binary search should be faster:

b.sort()
c = a[b[np.searchsorted(b, a[:, 0]) - len(b)] == a[:,0]]
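
The indexing trick in the second line is compact, so here is a sketch that unpacks it using the revised example from the question. The intermediate names (b_sorted, pos, candidates, mask) are illustrative, and np.sort is used instead of b.sort() only so that the caller's b is left unchanged.

```python
import numpy as np

a = np.array([[102, 5.5],
              [204, 4.5],
              [343, 9.0],
              [40, 8.01]])
b = np.array([343, 102])  # deliberately unsorted

# np.sort returns a sorted copy; b.sort() would sort b in place instead.
b_sorted = np.sort(b)

# For each identifier in a, the position where it would be inserted into
# b_sorted. Values larger than everything in b get position len(b_sorted).
pos = np.searchsorted(b_sorted, a[:, 0])

# Subtracting len(b_sorted) gives indices in [-len, 0]. Negative indices
# address the same elements counted from the end, and the otherwise
# out-of-range position len becomes 0, so the lookup never fails.
candidates = b_sorted[pos - len(b_sorted)]

# A row matches only if the nearest candidate equals its identifier.
mask = candidates == a[:, 0]
c = a[mask]
```

Each lookup is a binary search, so the work is roughly len(a) * log(len(b)) comparisons rather than len(a) * len(b).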

+4

Sven Marnach, Mar 31 '11 at 20:14

JoshAdel · Accepted Answer · 2011-03-31T19:44:51+0000

EDIT: Given the revised question, try this instead:

ii = np.where((a[:,0] - b.reshape(-1,1)) == 0)[1]
c = a[ii,:]

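This uses broadcasting to subtract each element of b from the first column of a, and then looks for zeros, which indicate a match. It should work, but be careful when comparing floats, especially if b is not an array of ints.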
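
As a worked illustration of the np.where approach, the sketch below uses the revised example from the question with b deliberately reversed. It uses the direct comparison rather than the subtraction, and the name table is only illustrative.

```python
import numpy as np

a = np.array([[102, 5.5],
              [204, 4.5],
              [343, 9.0],
              [40, 8.01]])
b = np.array([343, 102])  # reversed relative to the order in a

# Shape (len(b), len(a)): entry [i, j] is True where a[j, 0] equals b[i].
table = a[:, 0] == b.reshape(-1, 1)

# np.where returns (row_indices, column_indices) in row-major order, so
# the column indices (positions in a) come out grouped by the order of b.
ii = np.where(table)[1]
c = a[ii, :]
```

One consequence is that the rows of c follow the order of b, whereas a boolean mask over a would return them in the order they appear in a.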

EDIT 2: A slightly simpler version that compares the values directly:

ii = np.where(a[:,0] == b.reshape(-1,1))[1]
c = a[ii,:]

EDIT 3: The fastest solution (roughly 10 times faster than Sven's):

c = a[np.searchsorted(a[:,0],b),:]

This assumes that a[:,0] is sorted and that every value in b appears in a[:,0].
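
In the revised example from the question the first column is not sorted, so here is a sketch of one way to keep the searchsorted approach without that assumption. The helper name rows_by_id is hypothetical and not from the original answer. It assumes that a is non-empty and that the identifiers in a[:,0] are unique, and it silently drops identifiers that do not occur in a.

```python
import numpy as np

def rows_by_id(a, ids):
    """Hypothetical helper: rows of a whose first column matches ids, in ids order."""
    keys = a[:, 0]
    # Permutation that sorts the identifier column, leaving a itself untouched.
    order = np.argsort(keys)
    # Binary search in the sorted view of keys; positions refer to sorted order.
    pos = np.searchsorted(keys, ids, sorter=order)
    # Identifiers larger than every key land at len(keys); clip so indexing is safe.
    pos = np.clip(pos, 0, len(keys) - 1)
    # Map sorted positions back to row indices in a.
    ii = order[pos]
    # Keep only exact matches, which discards identifiers missing from a.
    found = keys[ii] == ids
    return a[ii[found]]

a = np.array([[102, 5.5],
              [204, 4.5],
              [343, 9.0],
              [40, 8.01]])
b = np.array([102, 343])
c = rows_by_id(a, b)
```

Because the lookup is repeated many thousands of times against the same a, the argsort result could be computed once and reused rather than recomputed on every call.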
