Pandas groupby and sum only one column

So, I have a dataframe, df, which looks like this:

      A   B             C
 1  foo  12    California
 2  foo  22    California
 3  bar   8  Rhode Island
 4  bar  32  Rhode Island
 5  baz  15          Ohio
 6  baz  26          Ohio

I want to group by column A and then sum column B, while keeping the value in column C. Something like this:

      A   B             C
 1  foo  34    California
 2  bar  40  Rhode Island
 3  baz  41          Ohio

The problem is that when I call df.groupby('A').sum(), column C gets dropped, returning

       B
 A
 bar  40
 baz  41
 foo  34

How can I get around this and keep column C when I group and sum?

2 answers

The straightforward way to do this is to include C in your groupby (the groupby function accepts a list of columns).

Try:

 df.groupby(['A','C'])['B'].sum() 

One more note: if you need to work with df after the aggregation, you can also pass as_index=False to get back a DataFrame with A and C as regular columns instead of as the index. This tripped me up when I first worked with pandas. Example:

 df.groupby(['A','C'], as_index=False)['B'].sum() 
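For anyone who wants to try this end to end, here is a minimal sketch using the sample data from the question. The sort=False argument and the final column reordering are my additions (assumptions, not part of the answer above); they are only there so the output matches the row and column order the question asked for.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
    "B": [12, 22, 8, 32, 15, 26],
    "C": ["California", "California", "Rhode Island",
          "Rhode Island", "Ohio", "Ohio"],
})

# Grouping on both A and C keeps C in the result.
# as_index=False returns A and C as ordinary columns.
# sort=False (an addition here) keeps groups in order of first appearance.
result = df.groupby(["A", "C"], as_index=False, sort=False)["B"].sum()

# Reorder columns to match the layout asked for in the question.
result = result[["A", "B", "C"]]
print(result)
```

Note that this only gives one row per value of A if C is constant within each A group, as it is in the sample data. If a single A maps to several different C values, you get one row per (A, C) pair.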

If you don't care which value of C you keep and just want the nth value in each group (n being a zero-based position), you can do this:

 df.groupby('A').agg({'B' : 'sum', 'C' : lambda x: x.iloc[n]}) 
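A runnable sketch of the same idea, again on the question's sample data. The value n = 0, the sort=False argument, and the reset_index() call are assumptions added for illustration; n has to be smaller than the size of the smallest group, otherwise iloc raises an IndexError.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
    "B": [12, 22, 8, 32, 15, 26],
    "C": ["California", "California", "Rhode Island",
          "Rhode Island", "Ohio", "Ohio"],
})

# Assumption for this example: take the first value of C in each group.
n = 0

# Sum B per group and pick the nth value of C per group.
result = (
    df.groupby("A", sort=False)
      .agg({"B": "sum", "C": lambda x: x.iloc[n]})
      .reset_index()  # turn the group key A back into a regular column
)
print(result)
```

With n = 0 on this data, passing the string 'first' for C instead of the lambda should give the same result.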
