You can avoid the loop by joining your list of words into a single regular expression and using str.contains:
pat = '|'.join(thing)
for_new_df = df[df['COLUMN'].str.contains(pat)]
This should just work.
So the regex pattern becomes 'A1|B2|C3', and it matches any row whose string contains any of these words, at any position in the string.
Example:
In [65]: things = ['A1','B2','C3']
         pat = '|'.join(things)
         df = pd.DataFrame({'a':['Wow;Here;This=A1;10001;0', 'B2', 'asdasda', 'asdas']})
         df[df['a'].str.contains(pat)]

Out[65]:
                          a
0  Wow;Here;This=A1;10001;0
1                        B2
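One caveat with this approach: the joined string is interpreted as a regular expression, so a word containing a metacharacter such as + or . would not be matched literally. A minimal sketch of one way to guard against that, assuming you want literal matches and that the column may contain missing values (the word 'C+3' and the None entry are made up for illustration):

```python
import re
import pandas as pd

# Hypothetical word list; 'C+3' contains the regex metacharacter '+'.
things = ['A1', 'B2', 'C+3']

# Escape each word so it is matched literally, then join with '|' (regex OR).
pat = '|'.join(re.escape(t) for t in things)

# Illustrative data, including a missing value.
df = pd.DataFrame({'a': ['Wow;Here;This=A1;10001;0', 'C+3', None, 'asdas']})

# na=False treats missing values as "no match" so the mask is purely boolean.
mask = df['a'].str.contains(pat, na=False)
result = df[mask]
```

Without re.escape, the pattern 'C+3' would mean "one or more C followed by 3" and would not match the literal text 'C+3'.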
What went wrong in your original attempt:
if df[df['COLUMN'].str.contains(mp)]
this line:
df[df['COLUMN'].str.contains(mp)]
returns df filtered by the boolean mask that str.contains produces. The if statement cannot reduce a DataFrame (or any array of booleans) to a single truth value, hence the error. Think about it: what should it do if only one element is True, or all but one are True? It expects a scalar, not an array, as its condition.
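If you do need a condition, you have to say explicitly how the mask should be reduced to one value. A small sketch of the failure and some possible reductions, using made-up data and the same 'COLUMN' name (which reduction is right depends on what you meant to test):

```python
import pandas as pd

# Illustrative data, not from the original question.
df = pd.DataFrame({'COLUMN': ['This=A1', 'B2', 'asdasda']})
mask = df['COLUMN'].str.contains('A1|B2')

# Using the filtered DataFrame directly as a condition raises ValueError.
try:
    if df[mask]:
        pass
except ValueError as err:
    error_message = str(err)

# Explicit reductions give `if` the single boolean it needs.
any_match = mask.any()          # True if at least one row matches
all_match = mask.all()          # True only if every row matches
has_rows = not df[mask].empty   # True if the filtered frame has any rows
```

Here any_match and has_rows answer the same question in two ways, while all_match is the stricter test.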