I am a big fan of the data.table package, but I am having trouble converting some code that uses ddply from the plyr package to its data.table equivalent. The ddply code:
    library(plyr)

    dfx <- data.frame(
      group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
      sex   = sample(c("M", "F"), size = 29, replace = TRUE),
      age   = runif(n = 29, min = 18, max = 54),
      age2  = runif(n = 29, min = 18, max = 54)
    )

    # sum every numeric column within each group/sex combination
    ddply(dfx, .(group, sex), numcolwise(sum))
What I want to do is sum over several columns without having to manually specify column names. The manual equivalent in the data.table package:
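    library(data.table)

    dfx.dt = data.table(dfx)
    # add the per-group sums by reference, one column at a time
    dfx.dt[ , sum.age := sum(age), by="group,sex"]
    dfx.dt[ , sum.age2 := sum(age2), by="group,sex"]
    # keep one row per group/sex combination
    dfx.dt[!duplicated(dfx.dt[ , {list(group, sex)}]), ]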
To be explicit, my question is: "Is there a data.table equivalent of the ddply code above?"
Any help is greatly appreciated, thanks.
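One possible approach, offered as a minimal sketch rather than a definitive answer: data.table exposes each group's subset as `.SD`, and `.SDcols` restricts which columns it contains, so `lapply(.SD, sum)` can stand in for `numcolwise(sum)`. The names `num.cols` and `sums` and the fixed seed are assumptions added for illustration; they are not part of the original code.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the example is reproducible
dfx <- data.frame(
  group = c(rep("A", 8), rep("B", 15), rep("C", 6)),
  sex   = sample(c("M", "F"), size = 29, replace = TRUE),
  age   = runif(n = 29, min = 18, max = 54),
  age2  = runif(n = 29, min = 18, max = 54)
)
dfx.dt <- as.data.table(dfx)

# Select the numeric columns programmatically instead of naming them by hand
# (num.cols is a hypothetical helper name)
num.cols <- names(dfx.dt)[sapply(dfx.dt, is.numeric)]

# .SD is the per-group subset limited to .SDcols; lapply sums each column
sums <- dfx.dt[, lapply(.SD, sum), by = c("group", "sex"), .SDcols = num.cols]
print(sums)
```

Unlike the manual version, the summed columns keep their original names (`age`, `age2`) rather than `sum.age` and `sum.age2`, and the groups appear in order of first occurrence rather than sorted as ddply returns them. Passing `keyby` instead of `by` should give a sorted result.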
r data.table plyr
Jonathan