EDIT
I found this SO post, which describes the best way to insert missing rows into a data.table. I adjusted the fun_DT function accordingly. The code is cleaner now, but I do not see any speed improvement.
See my update on the other post. Arun's solution also works, but you have to insert the missing combinations manually. Since you have more than one identifier column here (ID, Month), I came up with a somewhat dirty solution: first I create ID2, then I create all combinations of ID2 and Category, then I fill in the data.table, and finally I reshape it.
I am sure this is not the best solution, but if this FR (feature request) gets built in, these steps could be performed automatically.
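The steps described above (create ID2, build all ID2/Category combinations, fill in the missing rows, reshape) might look roughly like the sketch below. This is only an illustration and not necessarily the fun_DT function referred to in this answer: the names `grid`, `full` and `wide`, the `"_"` separator used for ID2, and the choice to leave missing quantities as NA are all assumptions, and it assumes a data.table version that supports `on=` joins and `tstrsplit()`.

```r
library(data.table)

set.seed(1)
n <- 1000
DT <- data.table(
  ID       = sample(1:100, n, replace = TRUE),
  Month    = sample(1:12,  n, replace = TRUE),
  Category = sample(1:10,  n, replace = TRUE),
  Qty      = runif(n) * 500
)

# Aggregate first, as in fun_reshape.
agg <- DT[, list(Qty = sum(Qty)), by = c("ID", "Month", "Category")]

# Step 1: collapse the two identifier columns into one (ID2).
# The "_" separator is an assumption made for this sketch.
agg[, ID2 := paste(ID, Month, sep = "_")]

# Step 2: every combination of ID2 and Category (CJ returns it sorted).
grid <- CJ(ID2 = unique(agg$ID2), Category = unique(agg$Category))

# Step 3: join onto the grid; combinations absent from agg get Qty = NA.
full <- agg[grid, on = c("ID2", "Category")]

# ID and Month are NA on the filled-in rows, so recover them from ID2.
full[, c("ID", "Month") := lapply(tstrsplit(ID2, "_", fixed = TRUE), as.integer)]

# Step 4: each (ID, Month) group now has one row per Category, in sorted
# order, so the Qty values can be spread into columns directly.
wide <- full[, as.list(setNames(Qty, paste0("Qty.", Category))),
             by = c("ID", "Month")]
```

Because every group has the same number of rows after the fill-in step, the final reshape is a plain grouped `as.list()`. Replace the NA values with 0 after step 3 if zeros are preferred for missing combinations.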
The solutions are about equally fast, although it would be interesting to see how they scale (my machine is too slow, so I do not want to increase n any further ... the computer has crashed often enough already ;-)
    library(data.table)
    library(rbenchmark)

    fun_reshape <- function(n) {
      DT <- data.table(
        ID = sample(1:100, n, replace = TRUE),
        Month = sample(1:12, n, replace = TRUE),
        Category = sample(1:10, n, replace = TRUE),
        Qty = runif(n) * 500,
        key = c('ID', 'Month')
      )
      agg <- DT[, list(Qty = sum(Qty)), by = c("ID", "Month", "Category")]
      reshape(agg, v.names = "Qty", idvar = c("ID", "Month"),
              timevar = "Category", direction = "wide")
    }
Christoph_J