How to filter Pandas DataFrame rows by checking whether a sub-level index value is in a list?

Question

I have a sample Pandas DataFrame df with a multi-level index:

 >>> df
                  STK_Name   ROIC   mg_r
 STK_ID RPT_Date
 002410 20111231       ???  0.401  0.956
 300204 20111231       ???  0.375  0.881
 300295 20111231      ????  2.370  0.867
 300288 20111231      ????  1.195  0.861
 600106 20111231      ????  1.214  0.857
 300113 20111231      ????  0.837  0.852

and stk_list defined as stk_list = ['600106','300204','300113']

I want to get the rows of df whose STK_ID index level value is in stk_list. The expected output is:

                  STK_Name   ROIC   mg_r
 STK_ID RPT_Date
 300204 20111231       ???  0.375  0.881
 600106 20111231      ????  1.214  0.857
 300113 20111231      ????  0.837  0.852

For this sample data, I can achieve the goal with:

 df = df.reset_index()
 df[df.STK_ID.isin(stk_list)]

But my real DataFrame already has "STK_ID" and "RPT_Date" columns, so reset_index() raises an error. In any case, I want to filter on the index directly rather than on columns.

Following this question: How to filter by sub-level index in Pandas

I tried df[df.index.map(lambda x: x[0].isin(stk_list))], but Pandas 0.8.1 raises AttributeError: 'unicode' object has no attribute 'isin'.

My question is: how do I filter the DataFrame rows by checking whether a sub-level index value is in a list, without using the reset_index() and set_index() methods?

+7

python pandas

bigbug Nov 18 '12 at 9:56

5 answers

What about using the level parameter in DataFrame.reindex?

 In [14]: df
 Out[14]:
             0         1
 a 0  0.007288 -0.840392
   1  0.652740  0.597250
 b 0 -1.197735  0.822150
   1 -0.242030 -0.655058
 In [15]: stk_list = ['a']
 In [16]: df.reindex(stk_list, level=0)
 Out[16]:
             0         1
 a 0  0.007288 -0.840392
   1  0.652740  0.597250

+11

Chang she Nov 19 '12 at 4:26

I'm very late to the party, but I find the most readable and intuitive way to do this is index.get_level_values(n).isin. (index.levels[n].isin looks similar but is not a substitute: levels[n] holds only the sorted unique values of the level, so its mask does not line up with the rows.)

It works as follows:

 >>> stk_list = [600106, 300204, 300113]
 >>> df[df.index.get_level_values(0).isin(stk_list)]
                  STK_Name   ROIC   mg_r
 STK_ID RPT_Date
 300204 20111231       ???  0.375  0.881
 600106 20111231      ????  1.214  0.857
 300113 20111231      ????  0.837  0.852

What I like about this approach is that the command reads like an English sentence.

P.S. In the OP, stk_list is a list of strings. A little list-comprehension-fu deals with that:

 df[df.index.get_level_values(0).isin([int(i) for i in stk_list])]
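A caveat worth checking before relying on index.levels for a row mask: levels[n] stores each distinct value once, in sorted order, whereas get_level_values(n) returns one value per row. The sketch below uses made-up data (the second reporting period for 300204 is hypothetical) to show the difference.

```python
import pandas as pd

# Hypothetical data: STK_ID 300204 appears in two reporting periods.
index = pd.MultiIndex.from_tuples(
    [(600106, 20111231), (300204, 20111231), (300204, 20121231), (300113, 20111231)],
    names=["STK_ID", "RPT_Date"],
)
df = pd.DataFrame({"ROIC": [1.214, 0.375, 0.410, 0.837]}, index=index)

unique_ids = df.index.levels[0]             # distinct values only, sorted
per_row_ids = df.index.get_level_values(0)  # one value per row, in row order

stk_list = [600106, 300204]
row_mask = per_row_ids.isin(stk_list)       # boolean array aligned with the rows
filtered = df[row_mask]
print(filtered)
```

Because unique_ids has three entries and the frame has four rows, a mask built from levels[0] cannot be used here at all; when the lengths happen to match, it is applied by position and may select the wrong rows.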

+7

London rob Jul 13 '15 at 13:00

For me, it only worked when I dropped the [0] from x, as follows:

 a[a.index.map(lambda x: x in b)]
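Dropping the [0] is presumably needed when the index has a single level, since each x is then a plain label rather than a tuple. A minimal sketch under that assumption (the frame a and list b below are made-up stand-ins), which also shows the vectorized Index.isin equivalent:

```python
import pandas as pd

# Hypothetical single-level index: each label is a plain string, not a tuple.
a = pd.DataFrame(
    {"ROIC": [0.401, 0.375, 1.214]},
    index=["002410", "300204", "600106"],
)
b = ["600106", "300204"]

# Element-wise membership test, equivalent to the lambda above.
mapped = a[[label in b for label in a.index]]

# Vectorized alternative using Index.isin.
vectorized = a[a.index.isin(b)]
print(vectorized)
```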

+1

tsando Apr 23 '16 at 18:00

Use get_level_values:

 df[df.index.get_level_values(level = 0).isin(stk_list)]
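For anyone who wants to try this end to end, here is a small self-contained sketch that rebuilds part of the question's sample data (the numeric columns only; the index values are kept as strings, as in the question). It also shows two variations that should behave the same way: selecting the level by name, and calling MultiIndex.isin with its level argument.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("002410", 20111231),
        ("300204", 20111231),
        ("300295", 20111231),
        ("300288", 20111231),
        ("600106", 20111231),
        ("300113", 20111231),
    ],
    names=["STK_ID", "RPT_Date"],
)
df = pd.DataFrame(
    {
        "ROIC": [0.401, 0.375, 2.370, 1.195, 1.214, 0.837],
        "mg_r": [0.956, 0.881, 0.867, 0.861, 0.857, 0.852],
    },
    index=index,
)
stk_list = ["600106", "300204", "300113"]

# Select the level by position, as in the answer above.
by_position = df[df.index.get_level_values(0).isin(stk_list)]

# Select the level by name instead of position.
by_name = df[df.index.get_level_values("STK_ID").isin(stk_list)]

# MultiIndex.isin accepts a level argument and returns a per-row boolean array.
by_index_isin = df[df.index.isin(stk_list, level="STK_ID")]
print(by_name)
```

The rows come back in their original order in the frame, not in the order of stk_list.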

0

Shoresh Jan 25 '17 at 20:58

Avaris · Accepted Answer · 2012-11-19T11:48:31+0000

You can try:

 df[df.index.map(lambda x: x[0] in stk_list)]

Example:

 In : stk_list
 Out: ['600106', '300204', '300113']
 In : df
 Out:
                  STK_Name   ROIC   mg_r
 STK_ID RPT_Date
 002410 20111231       ???  0.401  0.956
 300204 20111231       ???  0.375  0.881
 300295 20111231      ????  2.370  0.867
 300288 20111231      ????  1.195  0.861
 600106 20111231      ????  1.214  0.857
 300113 20111231      ????  0.837  0.852
 In : df[df.index.map(lambda x: x[0] in stk_list)]
 Out:
                  STK_Name   ROIC   mg_r
 STK_ID RPT_Date
 300204 20111231       ???  0.375  0.881
 600106 20111231      ????  1.214  0.857
 300113 20111231      ????  0.837  0.852
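The same idea can be written as an explicit boolean mask, which may be easier to debug. This is only a sketch: the frame is a cut-down stand-in for the question's data, and it deliberately carries a regular column that is also named STK_ID (the situation the question describes) to show that filtering on the index does not touch the columns. Converting stk_list to a set is an optional tweak for faster lookups on long lists.

```python
import numpy as np
import pandas as pd

ids = ["002410", "300204", "300295", "300288", "600106", "300113"]
index = pd.MultiIndex.from_arrays(
    [ids, [20111231] * len(ids)],
    names=["STK_ID", "RPT_Date"],
)
# Stand-in frame with a column that shares its name with an index level.
df = pd.DataFrame(
    {"STK_ID": ids, "ROIC": [0.401, 0.375, 2.370, 1.195, 1.214, 0.837]},
    index=index,
)

stk_list = ["600106", "300204", "300113"]
wanted = set(stk_list)  # set membership is O(1) per lookup

# Iterating a MultiIndex yields one tuple per row; key[0] is the STK_ID level.
mask = np.array([key[0] in wanted for key in df.index], dtype=bool)
result = df[mask]
print(result)
```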
