Pandas groupby and sum only one column

Question

So, I have a dataframe, df1, which looks like this:

      A    B   C
 1    foo  12  California
 2    foo  22  California
 3    bar   8  Rhode Island
 4    bar  32  Rhode Island
 5    baz  15  Ohio
 6    baz  26  Ohio

I want to group by column A and sum column B, while keeping the value in column C. Something like this:

      A    B   C
 1    foo  34  California
 2    bar  40  Rhode Island
 3    baz  41  Ohio

The problem is that when I call df.groupby('A').sum(), column C is dropped, returning

      B
 A
 bar  40
 baz  41
 foo  34

How can I get around this and keep column C when I group and sum?

+5

python pandas

JSolomonCulp Aug 16 '16 at 9:51

2 answers

If you don't care what's in your C column and just want the nth value from each group (n being a zero-based position you choose), you can just do this:

 df.groupby('A').agg({'B' : 'sum', 'C' : lambda x: x.iloc[n]})
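As a minimal sketch of that approach on the question's sample data (the frame is rebuilt by hand here, and n = 0 is an assumed choice that takes the first C value in each group):

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
    "B": [12, 22, 8, 32, 15, 26],
    "C": ["California", "California",
          "Rhode Island", "Rhode Island",
          "Ohio", "Ohio"],
})

n = 0  # assumption: take the first C value of each group

# Sum B per group; for C, pick the value at position n within the group.
result = df.groupby("A").agg({"B": "sum", "C": lambda x: x.iloc[n]})
print(result)
```

Note that n has to be a valid position in every group, otherwise x.iloc[n] raises an IndexError. In this data each group has two rows that share the same C, so n = 0 and n = 1 give the same result.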

0

Kartik Aug 16 '16 at 10:02

Sevyns · Accepted Answer · 2016-08-16T21:58:06+0000

The simplest way to do this, when C has the same value for every row that shares an A, is to include C in your grouping keys (groupby accepts a list of columns).

Try:

 df.groupby(['A','C'])['B'].sum()

One more note: if you need to keep working with df after the aggregation, you can also pass as_index=False so that the result is a DataFrame with A and C as ordinary columns, rather than a Series indexed by them. This gave me problems when I first worked with Pandas. Example:

 df.groupby(['A','C'], as_index=False)['B'].sum()
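To make the difference concrete, here is a small sketch on the question's sample data. The frame is reconstructed by hand, and sort=False plus the final column reordering are additions beyond the one-liner above, included only so the output matches the layout asked for in the question.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
    "B": [12, 22, 8, 32, 15, 26],
    "C": ["California", "California",
          "Rhode Island", "Rhode Island",
          "Ohio", "Ohio"],
})

# Default: a Series whose index is the (A, C) pair.
as_series = df.groupby(["A", "C"])["B"].sum()

# as_index=False: a DataFrame with A and C kept as regular columns.
# sort=False keeps groups in order of first appearance (foo, bar, baz).
as_frame = df.groupby(["A", "C"], as_index=False, sort=False)["B"].sum()

# Reorder columns to A, B, C as in the question's desired output.
as_frame = as_frame[["A", "B", "C"]]
print(as_frame)
```

Grouping on both columns only reproduces the desired table because each A maps to a single C. If one A had several different C values, each (A, C) pair would get its own row.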
