Pandas DataFrame filter with regex

I do not understand pandas DataFrame.filter.

Setup

 import pandas as pd

 df = pd.DataFrame(
     [
         ['Hello', 'World'],
         ['Just', 'Wanted'],
         ['To', 'Say'],
         ['I\'m', 'Tired']
     ]
 )

Problem

 df.filter([0], regex=r'(Hel|Just)', axis=0) 

I expected [0] to indicate the first column as the one to look at, and axis=0 to indicate that rows should be filtered. Instead, I get the following:

        0      1
 0  Hello  World

I expected

        0       1
 0  Hello   World
 1   Just  Wanted

Question

  • What would get me the result I was expecting?
2 answers

Per the docs,

Arguments are mutually exclusive, but this is not checked for

So it appears that the first optional argument, items=[0], takes precedence over the third optional argument, regex=r'(Hel|Just)'.

 In [194]: df.filter([0], regex=r'(Hel|Just)', axis=0)
 Out[194]:
        0      1
 0  Hello  World

is equivalent to

 In [201]: df.filter([0], axis=0)
 Out[201]:
        0      1
 0  Hello  World

which simply selects the rows whose index labels are in [0] along axis 0.
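The key point is that filter only ever looks at axis labels, not at cell values. The following is a minimal sketch of that behavior, using keyword arguments one at a time; the variable names are illustrative and not part of the original question. Note that more recent pandas releases may reject a call that passes both items and regex (as far as I know, with a TypeError about mutually exclusive arguments), so the arguments are kept separate here.

```python
import pandas as pd

df = pd.DataFrame(
    [["Hello", "World"], ["Just", "Wanted"], ["To", "Say"], ["I'm", "Tired"]]
)

# items selects by exact index label along the given axis.
by_items = df.filter(items=[0], axis=0)

# regex is matched against the string form of the labels ("0", "1", ...),
# never against the cell contents.
by_label_regex = df.filter(regex=r"^[01]$", axis=0)

# No row label contains "Hel" or "Just", so nothing is selected.
no_match = df.filter(regex=r"(Hel|Just)", axis=0)

print(by_items)
print(by_label_regex)
print(no_match)
```

This is why the regex in the question could never match: "Hello" and "Just" are values in column 0, while the row labels are the integers 0 through 3.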


To get the desired result, you can use str.contains to create a boolean mask, and use df.loc to select the rows:

 In [210]: df.loc[df.iloc[:,0].str.contains(r'(Hel|Just)')]
 Out[210]:
        0       1
 0  Hello   World
 1   Just  Wanted
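To make the two steps explicit, here is a small sketch that builds the mask first and then applies it. The names mask and result are illustrative. It also swaps in a non-capturing group, (?:...), which matches the same rows; with a capturing group pandas typically emits a UserWarning about match groups.

```python
import pandas as pd

df = pd.DataFrame(
    [["Hello", "World"], ["Just", "Wanted"], ["To", "Say"], ["I'm", "Tired"]]
)

# Boolean Series: True where the value in the first column matches the pattern.
# (?:...) is a non-capturing group, used here only to avoid the warning
# pandas raises for patterns that contain capturing groups.
mask = df.iloc[:, 0].str.contains(r"(?:Hel|Just)")

# Select the rows where the mask is True.
result = df.loc[mask]

print(mask)
print(result)
```

Because str.contains searches anywhere in the string, a value such as "Adjust" would also match; anchor the pattern with ^ if only prefixes should count.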

This should work:

df[df[0].str.contains('(Hel|Just)', regex=True)]
