Pandas / Python Merge two data frames with duplicate rows

It seems like this should be easy to do with a merge or a concatenation, but I can't figure it out. I'm working in pandas.

I have two data frames that share some rows, and I want to combine them without duplicating any rows or columns. It should work like this:

df1:

A B 
a 1
b 2
c 3

df2:

A B 
b 2
c 3
d 4

df3 = df1 combined with df2

A B 
a 1
b 2
c 3
d 4

One approach I tried was to select the rows that are in one frame but not the other (an XOR) and then append them, but I can't figure out how to do the selection. Another idea is to append the frames and then delete the duplicate rows, but I don't know how to do that either.

2 answers

You want an outer merge:

In [103]:
df1.merge(df2, how='outer')

Out[103]:
   A  B
0  a  1
1  b  2
2  c  3
3  d  4

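With no on argument, merge joins on every column the two frames have in common, here both A and B, so a row that appears in both frames shows up only once in the result.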
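For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and also shows one way to get the XOR selection the question mentions. The indicator=True flag and the _merge column it produces are part of pandas' merge, but the variable names (merged, xor_rows) are just placeholders I picked for illustration.

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 2, 3]})
df2 = pd.DataFrame({"A": ["b", "c", "d"], "B": [2, 3, 4]})

# Outer merge on all shared columns (A and B); shared rows appear once.
df3 = df1.merge(df2, how="outer")
print(df3)

# Same merge, but with a _merge column recording where each row came from:
# "left_only", "right_only", or "both".
merged = df1.merge(df2, how="outer", indicator=True)

# Rows present in exactly one of the two frames (the XOR from the question).
xor_rows = merged.loc[merged["_merge"] != "both", ["A", "B"]]
print(xor_rows)
```

Note that the merge only collapses rows shared between the frames; if a single frame already contains repeated rows, those are not removed by this approach.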


Alternatively, concatenate the two frames and then drop the duplicate rows:

pd.concat([df1, df2]).drop_duplicates() 
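As a runnable sketch of the same idea, using the frames from the question: concat keeps each frame's original index labels, so after dropping duplicates the index would read 0, 1, 2, 2. The reset_index(drop=True) call at the end is my addition to renumber the rows; leave it off if you need the original labels.

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 2, 3]})
df2 = pd.DataFrame({"A": ["b", "c", "d"], "B": [2, 3, 4]})

# Stack the frames, keep the first occurrence of each fully identical row,
# then renumber the index from 0.
df3 = pd.concat([df1, df2]).drop_duplicates().reset_index(drop=True)
print(df3)
```

Unlike the outer merge, drop_duplicates also removes rows that are repeated within a single frame, which may or may not be what you want.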