Numpy table - advanced selection with multiple criteria

I have a table that looks something like this:

IDs    Timestamp     Values

124    300.6          1.23
124    350.1         -2.4
309    300.6          10.3
12     123.4          9.00
18     350.1          2.11
309    350.1          8.3

       ...

and I would like to select all rows belonging to a group of identifiers. I know I can do something like

table[table.IDs == 124]

to select all rows with a single ID, and I could do

table[(table.IDs == 124) | (table.IDs == 309)]

to get the rows for two IDs. But imagine that I have ~100,000 rows with more than 1000 unique identifiers (which are distinct from the row indices), and I want to select all rows that match a set of 10 identifiers. Intuitively, I would like to do this:

# id_list: a list of 10 IDs
table[ table.IDs in id_list ]

but Python rejects this syntax. The only way I can come up with is to do the following:

table[ (table.IDs == id_list[0]) |
       (table.IDs == id_list[1]) |
       (table.IDs == id_list[2]) |
       (table.IDs == id_list[3]) |
       (table.IDs == id_list[4]) |
       (table.IDs == id_list[5]) |
       (table.IDs == id_list[6]) |
       (table.IDs == id_list[7]) |
       (table.IDs == id_list[8]) |
       (table.IDs == id_list[9]) ]
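
For reference, the chain of comparisons above can also be built programmatically rather than typed out. The following is only a minimal sketch: it assumes `table` is a numpy recarray with an `IDs` field, rebuilds the sample rows with `np.rec.fromrecords` purely for illustration, and uses a two-element `id_list` to keep the output short.

```python
import numpy as np

# Hypothetical reconstruction of the sample table as a recarray.
table = np.rec.fromrecords(
    [(124, 300.6, 1.23), (124, 350.1, -2.4), (309, 300.6, 10.3),
     (12, 123.4, 9.00), (18, 350.1, 2.11), (309, 350.1, 8.3)],
    names="IDs,Timestamp,Values",
)
id_list = [124, 309]  # illustrative; the question uses 10 IDs

# One boolean mask per wanted ID, OR-ed together along axis 0.
mask = np.logical_or.reduce([table.IDs == i for i in id_list])
subset = table[mask]
print(subset.IDs)
```

This still performs one full comparison pass per ID, so it only shortens the code; it does not change the amount of work done.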

Is there a more concise way to do this, perhaps using .any()?


You can do it this way:

subset = table[np.array([i in id_list for i in table.IDs])]

With a recent enough version of numpy, you can use in1d to make this more compact:

subset = table[np.in1d(table.IDs, id_list)]

Note: this assumes table is a numpy recarray.
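
To make the two variants concrete, here is a small self-contained sketch. It assumes the sample rows from the question and builds `table` with `np.rec.fromrecords`, which is an assumption made for illustration only. It also uses `np.isin`, which the NumPy documentation recommends over `np.in1d` for new code; for 1-D input the two produce the same mask.

```python
import numpy as np

# Hypothetical reconstruction of the sample table as a recarray.
table = np.rec.fromrecords(
    [(124, 300.6, 1.23), (124, 350.1, -2.4), (309, 300.6, 10.3),
     (12, 123.4, 9.00), (18, 350.1, 2.11), (309, 350.1, 8.3)],
    names="IDs,Timestamp,Values",
)
id_list = [124, 309]  # illustrative subset of IDs

# Variant 1: Python-level membership test, one element at a time.
# Converting the list to a set makes each lookup constant-time.
wanted = set(id_list)
mask_loop = np.array([i in wanted for i in table.IDs])

# Variant 2: vectorised membership test done inside NumPy.
mask_isin = np.isin(table.IDs, id_list)

subset = table[mask_isin]
print(subset)
```

Both masks select the same rows; the vectorised form avoids the per-element Python loop, which matters more as the table grows.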


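Here is another solution that avoids a Python for loop, as in1d does. It relies on a temporary 2D array of ids.size by table.IDs.size, so use it only if you can afford that much memory. Here, ids is the numpy array version of id_list.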

result = table[~np.all(table.IDs[None]-ids[None].T, 0)]
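
A runnable sketch of this broadcasting approach, again assuming the sample rows from the question and a recarray built with `np.rec.fromrecords` for illustration. The last lines write the same mask as an equality test followed by `.any()`, which may be easier to read.

```python
import numpy as np

# Hypothetical reconstruction of the sample table as a recarray.
table = np.rec.fromrecords(
    [(124, 300.6, 1.23), (124, 350.1, -2.4), (309, 300.6, 10.3),
     (12, 123.4, 9.00), (18, 350.1, 2.11), (309, 350.1, 8.3)],
    names="IDs,Timestamp,Values",
)
ids = np.array([124, 309])  # numpy version of an illustrative id_list

# diff has shape (ids.size, table.IDs.size); a zero entry marks a match.
diff = table.IDs[None] - ids[None].T

# np.all(diff, 0) is True where a column has no zero, i.e. no ID matched,
# so negating it keeps the rows whose ID appears in ids.
result = table[~np.all(diff, 0)]

# Equivalent mask: compare every ID against every wanted ID, then any().
mask_any = (table.IDs[None, :] == ids[:, None]).any(axis=0)
print(result.IDs)
```

The temporary array grows as the product of the two sizes, so for 100,000 rows and 10 IDs it holds one million integers.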