I am using Python pandas to work with two data frames. The first data frame contains records from a customer database (first name, last name, email address, etc.). The second data frame contains a list of domain names, for example gmail.com, hotmail.com, etc.
I am trying to exclude records from the customer data frame when the email address contains a domain name from the second list. In other words, I need to remove a customer when the domain of their email address appears in the domain blacklist.
Here is some example data:
>>> import pandas as pd
>>> customer = pd.DataFrame({
...     'Email': ["bob@example.com", "jim@example.com", "joe@gmail.com"],
...     'First Name': ["Bob", "Jim", "Joe"]})
>>> blacklist = pd.DataFrame({'Domain': ["gmail.com", "outlook.com"]})
>>> customer
             Email First Name
0  bob@example.com        Bob
1  jim@example.com        Jim
2    joe@gmail.com        Joe
>>> blacklist
        Domain
0    gmail.com
1  outlook.com
My desired result:
>>> filtered_list = magic_happens_here(customer, blacklist)
>>> filtered_list
             Email First Name
0  bob@example.com        Bob
1  jim@example.com        Jim
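For reference, here is a minimal sketch of what the placeholder magic_happens_here might look like. The function name drop_blacklisted is hypothetical, and the sketch assumes that the domain is everything after the last "@" and that matching should be exact and case-insensitive (so a subdomain such as mail.gmail.com would not be caught).

```python
import pandas as pd

customer = pd.DataFrame({
    'Email': ["bob@example.com", "jim@example.com", "joe@gmail.com"],
    'First Name': ["Bob", "Jim", "Joe"],
})
blacklist = pd.DataFrame({'Domain': ["gmail.com", "outlook.com"]})


def drop_blacklisted(customers, blacklist):
    """Hypothetical helper: keep rows whose email domain is not blacklisted."""
    # Split each address once from the right and keep the part after the last "@".
    domains = customers['Email'].str.rsplit('@', n=1).str[-1].str.lower()
    blocked = blacklist['Domain'].str.lower()
    # isin() builds a boolean mask; "~" inverts it so blacklisted rows are dropped.
    return customers[~domains.isin(blocked)]


filtered_list = drop_blacklisted(customer, blacklist)
print(filtered_list)
```

Comparing the extracted domain with isin avoids substring matching, so an address such as someone@notgmail.com would not be removed by accident.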
What I have tried so far:
- Variations of df1[df1['email'].isin(~df2['email'])..., none of which worked.
- df.apply, for example: df1['Email'].apply(lambda x: x for i in ['gmail.com', 'outlook.com'] if i in x). This raises TypeError: 'generator' object is not callable.
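A likely reason for the TypeError: without extra parentheses, Python parses the argument to apply as a generator expression whose element is the lambda, so apply receives a generator rather than a function. The sketch below shows one way the apply attempt could be written so that it returns a boolean per row. The names domains and is_blocked are my own, and the check assumes an address is blacklisted when it ends with "@" followed by a listed domain.

```python
import pandas as pd

customer = pd.DataFrame({
    'Email': ["bob@example.com", "jim@example.com", "joe@gmail.com"],
    'First Name': ["Bob", "Jim", "Joe"],
})
domains = ["gmail.com", "outlook.com"]

# The lambda is the whole argument; the generator expression lives inside any().
# Each row yields True when the address ends with "@<blacklisted domain>".
is_blocked = customer['Email'].apply(
    lambda email: any(email.endswith('@' + d) for d in domains)
)

# Invert the mask to keep only the customers that are not blacklisted.
filtered_list = customer[~is_blocked]
print(filtered_list)
```

This runs a Python function once per row, so on a large customer table a vectorized comparison with isin would probably be faster.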