Most concise way to select rows where any column contains a string in a Pandas DataFrame?

What is the most concise way to select all rows where any column contains a given string in a Pandas DataFrame?

For example, given the following DataFrame, what's the best way to select those rows where the value in any column contains a b?

 df = pd.DataFrame({'x': ['foo', 'foo', 'bar'],
                    'y': ['foo', 'foo', 'foo'],
                    'z': ['foo', 'baz', 'foo']})

I'm inexperienced with Pandas, and the best I've come up with so far is rather cumbersome: df[df.apply(lambda r: r.str.contains('b').any(), axis=1)]. Is there a simpler solution?

Critically, I want to check for a match in any column, not in one specific column. Other similar questions, as far as I can tell, concern only a single column or a given list of columns.

1 answer

No answer was ever posted to this question, but the question itself and its comments contain a solution that has worked very well for me, and I did not find that answer anywhere else I looked.

So I'm copying the solutions from the question and its comments here for anyone who might find them useful. I added case=False for a case-insensitive search.

Solution from @Reason:

"the best I've come up with so far is rather cumbersome"

This one worked for me:

 df[df.apply(lambda r: r.str.contains('b', case=False).any(), axis=1)] 
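Applied to the sample DataFrame from the question, this is what the per-row mask looks like in practice (a quick sketch):

```python
import pandas as pd

df = pd.DataFrame({'x': ['foo', 'foo', 'bar'],
                   'y': ['foo', 'foo', 'foo'],
                   'z': ['foo', 'baz', 'foo']})

# For each row, test every cell for a case-insensitive 'b'
# and keep the row if any cell matches.
mask = df.apply(lambda r: r.str.contains('b', case=False).any(), axis=1)
result = df[mask]
print(result)
# Rows 1 ('baz') and 2 ('bar') are kept; row 0 is all 'foo'.
```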

Solution from @rbinnun:

This one worked for me on a test data set, but on one real data set it raised the Unicode error shown below. In general, though, this is also a good solution:

df[df.apply(lambda row: row.astype(str).str.contains('b', case=False).any(), axis=1)]

The astype(str) call takes care of non-string columns, NaNs, etc.

 UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 5: ordinal not in range(128) 
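A minimal sketch of why the astype(str) variant helps: on a mixed-type frame, calling .str.contains directly on a numeric column fails, while converting each row to strings first does not. The column names and values here are illustrative, not from the original data set.

```python
import numpy as np
import pandas as pd

# Mixed-type frame: a numeric column and a NaN, which r.str.contains
# cannot handle directly.
df = pd.DataFrame({'x': ['bar', np.nan, 'foo'],
                   'y': [1, 2, 3]})

# astype(str) turns every cell into a string ('nan', '1', ...) first,
# so non-string columns and missing values no longer raise.
mask = df.apply(lambda row: row.astype(str).str.contains('b', case=False).any(),
                axis=1)
matched = df[mask]
print(matched)
# Only row 0 ('bar') matches.
```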
