Sum of several columns of a pandas DataFrame

Say I have the following DataFrame:

 In [2]: df = pd.DataFrame({'a': [1,2,3], 'b':[2,4,6], 'c':[1,1,1]})

 In [3]: df
 Out[3]:
    a  b  c
 0  1  2  1
 1  2  4  1
 2  3  6  1

I can sum columns a and b as follows:

 In [4]: sum(df['a']) + sum(df['b'])
 Out[4]: 18

However, this is not very convenient for a larger DataFrame, where you need to sum many columns together.

Is there an easier way to sum columns (similar to the one below)? What if I want to sum an entire DataFrame without specifying columns?

 In [4]: sum(df[['a', 'b']]) #that will not work!
 Out[4]: 18
 In [4]: sum(df) #that will not work!
 Out[4]: 21
1 answer

I think you can call sum twice: first DataFrame.sum returns a Series of per-column sums, and then Series.sum returns the total of that Series:

 print (df[['a','b']].sum())
 a     6
 b    12
 dtype: int64

 print (df[['a','b']].sum().sum())
 18

You can also sum across each row first with axis=1, and then sum the resulting Series:

 print (df[['a','b']].sum(axis=1))
 0    3
 1    6
 2    9
 dtype: int64

 print (df[['a','b']].sum(axis=1).sum())
 18
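As a quick check that the order of the two sums does not matter, here is a small sketch using the df from the question. The variable names col_totals and row_totals are mine, chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 4, 6], 'c': [1, 1, 1]})

# axis=0 (the default) collapses the rows: one total per column
col_totals = df[['a', 'b']].sum()

# axis=1 collapses the columns: one total per row
row_totals = df[['a', 'b']].sum(axis=1)

# Summing either intermediate Series gives the same grand total
print(col_totals.sum(), row_totals.sum())
```

Which one to prefer mostly depends on whether the per-column or the per-row totals are also useful on their own.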

Thanks to pirSquared for another solution: convert the selected columns to a NumPy array using values and then call sum:

 print (df[['a','b']].values.sum())
 18
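On pandas 0.24 or newer, the pandas documentation recommends to_numpy() over .values. A minimal sketch of the same idea, assuming the selected columns are all numeric (total_ab is just an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 4, 6], 'c': [1, 1, 1]})

# to_numpy() returns a 2-D NumPy array; ndarray.sum() with no axis
# argument adds up every element of that array
total_ab = df[['a', 'b']].to_numpy().sum()
print(total_ab)  # 18
```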

To sum the entire DataFrame without specifying columns, call sum twice on df itself:

 print (df.sum().sum())
 21
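If this comes up often, the pattern could be wrapped in a small function. This is only a sketch: frame_total and its columns parameter are hypothetical names, not part of pandas, and it assumes the columns being summed are numeric.

```python
import pandas as pd

def frame_total(frame, columns=None):
    """Sum every value in `columns`, or in the whole frame if columns is None.

    Hypothetical helper for illustration; not a pandas function.
    """
    subset = frame if columns is None else frame[list(columns)]
    # First sum() gives per-column totals, second sum() adds those together
    return subset.sum().sum()

df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 4, 6], 'c': [1, 1, 1]})

print(frame_total(df, ['a', 'b']))  # 18
print(frame_total(df))              # 21
```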