data.table equivalent of ddply for multiple columns

I am a big fan of the data.table package, but I am having trouble converting some code that uses ddply from the plyr package into its data.table equivalent. The ddply code:

    library(plyr)

    dfx <- data.frame(
      group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
      sex   = sample(c("M", "F"), size = 29, replace = TRUE),
      age   = runif(n = 29, min = 18, max = 54),
      age2  = runif(n = 29, min = 18, max = 54)
    )

    # Sum every numeric column within each group/sex combination
    ddply(dfx, .(group, sex), numcolwise(sum))

What I want to do is sum over several columns without having to specify the column names manually. The manual equivalent in data.table:

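    library(data.table)

    dfx.dt = data.table(dfx)
    # Add one grouped-sum column per variable, by reference
    dfx.dt[ , sum.age := sum(age), by="group,sex"]
    dfx.dt[ , sum.age2 := sum(age2), by="group,sex"]
    # Keep only the first row of each group/sex combination
    dfx.dt[!duplicated(dfx.dt[ , {list(group, sex)}]), ]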

To be explicit, my question is: "Is there a data.table equivalent of the ddply code above?"

Any help is greatly appreciated, thanks.

r data.table plyr
1 answer

Yes, there is a way:

    dfx.dt[, lapply(.SD, sum), by = 'group,sex']

This is covered in section 2.1 of the data.table FAQ.
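For illustration, here is a minimal, self-contained sketch of how the one-liner above behaves on data shaped like the question's. The names `dt`, `sums`, `num_cols` and `sums_num` and the seed are assumptions made for this example only. Within each group, `.SD` is the subset of the data holding every column not named in `by`, so `lapply(.SD, sum)` sums each of them. The second call adds `.SDcols` to restrict `.SD` to numeric columns, which may be closer to what `numcolwise` does when non-numeric columns are present.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the illustration is reproducible

# Same shape as the question's dfx, built directly as a data.table
dt <- data.table(
  group = c(rep("A", 8), rep("B", 15), rep("C", 6)),
  sex   = sample(c("M", "F"), size = 29, replace = TRUE),
  age   = runif(n = 29, min = 18, max = 54),
  age2  = runif(n = 29, min = 18, max = 54)
)

# .SD contains all columns except the grouping columns, one subset per group
sums <- dt[, lapply(.SD, sum), by = .(group, sex)]

# Restrict .SD to numeric columns explicitly (an approximation of numcolwise)
num_cols <- names(dt)[sapply(dt, is.numeric)]
sums_num <- dt[, lapply(.SD, sum), by = .(group, sex), .SDcols = num_cols]

print(sums)
```

Unlike the manual `:=` approach in the question, this returns one row per group/sex combination directly, so no de-duplication step is needed.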
