data.table aggregation to a list column

I am trying to combine data from a data.table to create a new column that is a list of previous rows. This is easier to see with an example:

dt <- data.table(id = c(1,1,1,1,2,2,3,3,3), letter = c('a','a','b','c','a','c','b','b','a')) 

I would like to aggregate this so that the result is

    id  letter
 1:  1 a,a,b,c
 2:  2     a,c
 3:  3   b,b,a

Intuitively I tried

 dt[,j = list(list(letter)), by = id] 

but that does not work. Oddly enough, when I run it on a single id, for example:

 > dt[id == 1,j = list(list(letter)), by = id]
    id      V1
 1:  1 a,a,b,c

the result is fine ... I feel like I'm missing .SD somewhere or something like that ...

Can someone point me in the right direction?

Thanks!

r data.table
2 answers

Update: DT[, list(list(.)), by=.] sometimes gave incorrect results in R >= 3.1.0. This is now fixed in commit #1280 in the current version of data.table, v1.9.3. From NEWS:

  • DT[, list(list(.)), by=.] returns correct results in R >= 3.1.0. The bug was due to recent (welcome) changes in R v3.1.0, where list(.) does not result in a copy. Closes #481.

With this update, the I() workaround is no longer necessary. You can simply do DT[, list(list(.)), by=.], as before.
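
To check whether your installation has the fix, here is a minimal sketch using the data from the question. It assumes a data.table version that includes the fix described above. The column name letters_by_id is only an illustrative choice, not something from the question.

```r
library(data.table)

# Data from the question
dt <- data.table(id = c(1,1,1,1,2,2,3,3,3),
                 letter = c('a','a','b','c','a','c','b','b','a'))

# One row per id; each cell of the list column holds that group's letters.
# `letters_by_id` is a hypothetical column name chosen for this sketch.
res <- dt[, list(letters_by_id = list(letter)), by = id]

# On a fixed version every group keeps its own values. On an affected
# version these checks are expected to fail, because the groups end up
# sharing the same underlying vector.
stopifnot(identical(res$letters_by_id[[1]], c('a','a','b','c')))
stopifnot(identical(res$letters_by_id[[2]], c('a','c')))
stopifnot(identical(res$letters_by_id[[3]], c('b','b','a')))
```

If any stopifnot() call raises an error, the installed version is probably still affected and the I() workaround is worth trying.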


This seems to be the known bug #5585. In your case, I think you could just use

 dt[, paste(letter, collapse=","), by = id] 

to fix your problem.

As @ilir noted, if you really want a list column (rather than a single character string that only looks like one when printed), you can use the workaround suggested in the bug report:

 dt[, list(list(I(letter))), by = id] 
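
The two approaches print alike but store different things. The sketch below contrasts them on the question's data. It uses plain list(letter) rather than the I() workaround, so it assumes a data.table version where the bug is fixed. The names as_string, as_list, letters_chr and letters_lst are illustrative only.

```r
library(data.table)

# Data from the question
dt <- data.table(id = c(1,1,1,1,2,2,3,3,3),
                 letter = c('a','a','b','c','a','c','b','b','a'))

# paste(): one comma-separated string per id (a character column)
as_string <- dt[, list(letters_chr = paste(letter, collapse = ",")), by = id]

# list(): one character vector per id (a list column)
as_list <- dt[, list(letters_lst = list(letter)), by = id]

is.character(as_string$letters_chr)   # TRUE: plain strings
is.list(as_list$letters_lst)          # TRUE: each cell is a vector

# The list column keeps the individual elements, so per-group
# operations such as counting need no string splitting.
sapply(as_list$letters_lst, length)   # 4 2 3
```

The string form is convenient for display or export, while the list form is easier to work with if the grouped values need further processing.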

The syntax below works for me:

 dt[, list(lst=list(letter)), by=id] 

I am using R version 3.0.3, data.table_1.9.2.
