Pandas: efficiently check whether one column's value is contained in another column, row by row

I am trying to get a boolean mask that says, for each row, whether the string in one column is contained in the string in another column of the same row:

    a     b
    boop  beep bop
    zorp  zorpfoo
    zip   foo zip fa

For each row, I want to check whether the value in column a appears inside the value in column b, so I would like to get:

 [False, True, True] 

I'm trying to use this approach now, but it's slow:

 df.apply(lambda row: row['a'] in row['b'], axis=1) 

Is there a .str method for this?

1 answer

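df.apply(..., axis=1) is very slow, so avoid it when a plain Python loop over the zipped columns will do. You can compare the two in IPython on 100,000 rows of random strings: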

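    from random import sample
    from string import ascii_lowercase  # `lowercase` exists only in Python 2
    from pandas import DataFrame

    # 100,000 rows of random strings: 2 letters in 'a', 5 letters in 'b'
    df = DataFrame({
        'a': [''.join(sample(ascii_lowercase, 2)) for _ in range(100000)],
        'b': [''.join(sample(ascii_lowercase, 5)) for _ in range(100000)],
    })

    # IPython: time a plain loop over the zipped columns...
    %time [x in y for x, y in zip(df['a'], df['b'])]
    # ...against the row-wise apply
    %time df.apply(lambda row: row['a'] in row['b'], axis=1)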
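Applied to the question's case, the same idea might look like the sketch below. The three sample rows are an assumption reconstructed from the question's example, and `mask` is just an illustrative name. It builds the boolean result with a list comprehension over the zipped columns instead of `df.apply`:

```python
import pandas as pd

# Sample rows assumed from the question's example table.
df = pd.DataFrame({
    "a": ["boop", "zorp", "zip"],
    "b": ["beep bop", "zorpfoo", "foo zip fa"],
})

# Walk both columns in lockstep and test substring containment per row.
# Wrapping the result in a Series keeps the DataFrame's index, so it can
# be used directly as a boolean mask.
mask = pd.Series(
    [a in b for a, b in zip(df["a"], df["b"])],
    index=df.index,
)

print(mask.tolist())  # [False, True, True]
print(df[mask])       # only the rows where 'a' occurs inside 'b'
```

As for the `.str` question: `Series.str.contains` takes a single pattern for the whole column rather than one value per row, so it does not cover this case directly.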