Merge duplicates based on a column?

Question

Here is my situation:

In[9]: df
Out[9]: 
    fruit  val1  val2
0  Orange     1     1
1  orANGE     2     2
2   apple     3     3
3   APPLE     4     4
4   mango     5     5
5   appLE     6     6

In[10]: type(df)
Out[10]: pandas.core.frame.DataFrame

How can I merge the case-insensitive duplicates so that fruit is all lower case, val1 is the sum of each group's val1 values, and val2 is the sum of each group's val2 values?

Expected Result:

    fruit  val1  val2
0  orange     3     3
1   apple    13    13
2   mango     5     5

python pandas grouping

Computerfellow Jan 6 '14 at 18:26

1 answer

Justin · Accepted Answer · 2014-01-06T18:32:31+0000

In two stages:

df['fruit'] = df['fruit'].map(lambda x: x.lower())

res = df.groupby('fruit').sum()

res    
#         val1  val2
# fruit             
# apple     13    13
# mango      5     5
# orange     3     3

And to restore your structure:

res.reset_index()
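
For reference, here is a minimal end-to-end sketch of the two stages, rebuilding the sample frame from the question so it can be run as-is. Note that groupby sorts the group keys by default, which is why the output above is alphabetical; passing sort=False (my addition, not part of the original answer) keeps the groups in order of first appearance, which matches the expected result in the question. as_index=False is used here as an alternative to calling reset_index() afterwards.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "fruit": ["Orange", "orANGE", "apple", "APPLE", "mango", "appLE"],
    "val1": [1, 2, 3, 4, 5, 6],
    "val2": [1, 2, 3, 4, 5, 6],
})

# Stage 1: normalise the key column to lower case.
df["fruit"] = df["fruit"].map(lambda x: x.lower())

# Stage 2: sum val1 and val2 within each fruit.
# sort=False keeps first-appearance order (orange, apple, mango);
# as_index=False leaves 'fruit' as a regular column.
res = df.groupby("fruit", sort=False, as_index=False).sum()
print(res)
```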

As suggested in a comment, the lower-casing can be done more directly with the vectorized string accessor:

df['fruit'] = df['fruit'].str.lower()
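
If you would rather not overwrite the original fruit column, one possible variant (a sketch, not part of the original answer) is to pass the lower-cased Series directly to groupby as the key. The name key is only illustrative. The value columns are selected explicitly so that the original mixed-case fruit column is not included in the sum.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "fruit": ["Orange", "orANGE", "apple", "APPLE", "mango", "appLE"],
    "val1": [1, 2, 3, 4, 5, 6],
    "val2": [1, 2, 3, 4, 5, 6],
})

# Build the lower-cased key without modifying df.
key = df["fruit"].str.lower()

# Group by the derived key; the Series keeps the name 'fruit',
# so reset_index() restores a 'fruit' column holding the lower-cased values.
res = df.groupby(key, sort=False)[["val1", "val2"]].sum().reset_index()
print(res)
```

After this, df still holds the original mixed-case values, and res holds one row per fruit.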
