Convert a data.table to a nested list

I would like to convert:

    library(data.table)

    n <- 12
    DT <- data.table(
      level1 = rep(paste0("Manu", 1:2), each = n / 2),
      level2 = rep(paste0("Dept", 1:4), each = n / 4),
      level3 = rep(paste0("Store", 1:n))
    )

    > DT
        level1 level2  level3
     1:  Manu1  Dept1  Store1
     2:  Manu1  Dept1  Store2
     3:  Manu1  Dept1  Store3
     4:  Manu1  Dept2  Store4
     5:  Manu1  Dept2  Store5
     6:  Manu1  Dept2  Store6
     7:  Manu2  Dept3  Store7
     8:  Manu2  Dept3  Store8
     9:  Manu2  Dept3  Store9
    10:  Manu2  Dept4 Store10
    11:  Manu2  Dept4 Store11
    12:  Manu2  Dept4 Store12

Into this:

    goal <- list(
      Manu1 = list(
        Dept1 = paste0("Store", 1:(n / 4)),
        Dept2 = paste0("Store", (n / 4 + 1):(n / 2))
      ),
      Manu2 = list(
        Dept3 = paste0("Store", (n / 2 + 1):(3 * n / 4)),
        Dept4 = paste0("Store", (3 * n / 4 + 1):n)
      )
    )

    > goal
    $Manu1
    $Manu1$Dept1
    [1] "Store1" "Store2" "Store3"

    $Manu1$Dept2
    [1] "Store4" "Store5" "Store6"


    $Manu2
    $Manu2$Dept3
    [1] "Store7" "Store8" "Store9"

    $Manu2$Dept4
    [1] "Store10" "Store11" "Store12"

What is the data.table way to do this?

+7
r data.table
3 answers

Borrowing from @eddi's comment (which requires updating data.table to 1.9.8+):

    s = split(DT, by = c('level1', 'level2'), keep.by = FALSE, flatten = FALSE)
    rapply(relist(DT[['level3']], s), unname, how = "replace")

    $Manu1
    $Manu1$Dept1
    [1] "Store1" "Store2" "Store3"

    $Manu1$Dept2
    [1] "Store4" "Store5" "Store6"


    $Manu2
    $Manu2$Dept3
    [1] "Store7" "Store8" "Store9"

    $Manu2$Dept4
    [1] "Store10" "Store11" "Store12"

Computationally, this looks pretty wasteful (the tree structure is built three times), but at least it should extend to nesting deeper than two levels (thanks to split.data.table in 1.9.8+).
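To illustrate the "deeper than two levels" point, here is a minimal base-R sketch of a recursive alternative. It is not the split/relist approach above but a swapped-in technique, and `nest_cols` is a hypothetical helper name, not a data.table function. It assumes the last column holds the leaves and every earlier column is a grouping level, outermost first.

```r
library(data.table)

n <- 12
DT <- data.table(
  level1 = rep(paste0("Manu", 1:2), each = n / 2),
  level2 = rep(paste0("Dept", 1:4), each = n / 4),
  level3 = paste0("Store", 1:n)
)

# Hypothetical helper: `cols` is a list of equal-length vectors.
# The first vector defines the current nesting level; the recursion
# continues on the remaining vectors until only the leaf column is left.
nest_cols <- function(cols) {
  if (length(cols) == 1L) return(cols[[1L]])
  # Row indices per group. Note that split() orders groups by sorted key,
  # not by first appearance in the data.
  idx <- split(seq_along(cols[[1L]]), cols[[1L]])
  lapply(idx, function(i) nest_cols(lapply(cols[-1L], `[`, i)))
}

nested <- nest_cols(as.list(DT))
str(nested)
```

Because the helper only looks at column position, adding another grouping column to the left of `level3` should add one more list level without changing the function.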

+4

You could be more rigorous about scoping by using assign and friends with an explicit environment instead of the more global <<-, but here's a quick and dirty way to do this:

    l = list()
    DT[, {l[[level1]][[level2]] <<- c(level3); NULL}, by = .(level1, level2)]

    l
    #$Manu1
    #$Manu1$Dept1
    #[1] "Store1" "Store2" "Store3"
    #
    #$Manu1$Dept2
    #[1] "Store4" "Store5" "Store6"
    #
    #
    #$Manu2
    #$Manu2$Dept3
    #[1] "Store7" "Store8" "Store9"
    #
    #$Manu2$Dept4
    #[1] "Store10" "Store11" "Store12"
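One possible way to keep the accumulator out of the global environment, sketched here under assumptions, is to wrap the same idiom in a function so that `<<-` resolves to a local variable. `nest_by_levels` is a hypothetical name, and the column names `level1`, `level2`, `level3` are hard-coded as in the question; this is not necessarily the assign-based variant the answer has in mind.

```r
library(data.table)

n <- 12
DT <- data.table(
  level1 = rep(paste0("Manu", 1:2), each = n / 2),
  level2 = rep(paste0("Dept", 1:4), each = n / 4),
  level3 = paste0("Store", 1:n)
)

# Hypothetical wrapper: `l` lives in the function's frame, so the `<<-`
# inside j modifies that local list rather than a global variable.
nest_by_levels <- function(dt) {
  l <- list()
  dt[, {l[[level1]][[level2]] <<- level3; NULL}, by = .(level1, level2)]
  l
}

res <- nest_by_levels(DT)
str(res)
```

This relies on j being evaluated in an environment whose parent is the calling frame, which is why `<<-` reaches the function-local `l`.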
+4

You can do this with the dlply function from the plyr package:

    library(plyr)

    res <- dlply(DT, .(level1), function(dt) {
      dlply(dt, .(level2), function(dt) {
        return(unique(dt$level3))
      })
    })
+3
