Sort a ddply summary in R by the sum of each group

Question

I have a data.frame like this:

    x <- data.frame(Category=factor(c("One", "One", "Four", "Two", "Two",
                                      "Three", "Two", "Four", "Three")),
                    City=factor(c("D","A","B","B","A","D","A","C","C")),
                    Frequency=c(10,1,5,2,14,8,20,3,5))

      Category City Frequency
    1      One    D        10
    2      One    A         1
    3     Four    B         5
    4      Two    B         2
    5      Two    A        14
    6    Three    D         8
    7      Two    A        20
    8     Four    C         3
    9    Three    C         5

I want to make a pivot table with sum(Frequency), and I used the ddply function as follows:

    ddply(x, .(Category, City), summarize, Total=sum(Frequency))

      Category City Total
    1     Four    B     5
    2     Four    C     3
    3      One    A     1
    4      One    D    10
    5    Three    C     5
    6    Three    D     8
    7      Two    A    34
    8      Two    B     2

But I need these results sorted by the total of each Category group. Something like this:

      Category City Frequency
    1      Two    A        34
    2      Two    B         2
    3    Three    D        14
    4    Three    C         5
    5      One    D        10
    6      One    A         1
    7     Four    B         5
    8     Four    C         3

I have looked around and tried sort, order, and arrange, but nothing seems to do what I need. How can I do this in R?

+5

r pivot-table plyr

Liliana pacheco Apr 08 '15 at 17:52


2 answers

This is a good question, and I could not think of a direct way to do this other than creating a group-size index and then sorting by it. Here is a possible data.table approach, which uses the setorder function to sort your data by reference:

    library(data.table)
    Res <- setDT(x)[, .(Total = sum(Frequency)), by = .(Category, City)]
    setorder(Res[, size := sum(Total), by = Category], -size, -Total, Category)[]
    #    Category City Total size
    # 1:      Two    A    34   36
    # 2:      Two    B     2   36
    # 3:    Three    D     8   13
    # 4:    Three    C     5   13
    # 5:      One    D    10   11
    # 6:      One    A     1   11
    # 7:     Four    B     5    8
    # 8:     Four    C     3    8

Or, if you are deep in the Hadleyverse, we can achieve a similar result using the newer dplyr package (as suggested by @akrun):

    library(dplyr)
    x %>%
      group_by(Category, City) %>%
      summarise(Total = sum(Frequency)) %>%
      mutate(size = sum(Total)) %>%
      ungroup %>%
      arrange(-size, -Total, Category)

+5

David Arenburg Apr 08 '15 at 18:06


Brodieg · Accepted Answer · 2015-04-08T18:45:31+0000

Here is a base R version, where DF is the result of your ddply call:

 with(DF, DF[order(-ave(Total, Category, FUN=sum), Category, -Total), ])

gives:

      Category City Total
    7      Two    A    34
    8      Two    B     2
    6    Three    D     8
    5    Three    C     5
    4      One    D    10
    3      One    A     1
    1     Four    B     5
    2     Four    C     3

The logic is basically the same as David's: calculate the sum of Total for each Category, use that number for all the rows in each Category (we do this with ave(..., FUN=sum)), and then sort by that, plus some tie-breakers to make sure everything comes out in the expected order.
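To make the two steps easier to follow, here is a minimal base R sketch that spells them out separately. It rebuilds the data from the question and, as an assumption made only to keep the example free of package dependencies, uses aggregate() in place of the original ddply() call; the names size and sorted are illustrative and do not appear in the answers above.

```r
# Rebuild the example data from the question.
x <- data.frame(
  Category  = factor(c("One", "One", "Four", "Two", "Two",
                       "Three", "Two", "Four", "Three")),
  City      = factor(c("D", "A", "B", "B", "A", "D", "A", "C", "C")),
  Frequency = c(10, 1, 5, 2, 14, 8, 20, 3, 5)
)

# Assumption: aggregate() stands in for ddply(x, .(Category, City), ...)
# so that the sketch runs without plyr installed.
DF <- aggregate(Frequency ~ Category + City, data = x, FUN = sum)
names(DF)[names(DF) == "Frequency"] <- "Total"

# Step 1: per-Category sum of Total, repeated on every row of that Category.
size <- ave(DF$Total, DF$Category, FUN = sum)

# Step 2: sort by the group sum (descending); Category and then
# Total (descending) act as tie-breakers.
sorted <- DF[order(-size, DF$Category, -DF$Total), ]
rownames(sorted) <- NULL
print(sorted)
```

With the question's data this should give the same row order as the output shown above (Two, Three, One, Four, each with its larger city total first). Sorting on Category as the second key keeps the rows of two categories together even if their sums happen to be equal.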
