data.table: bypass setkey when deriving a key variable by a monotonic transformation

Is the "sorted" attribute part of the official data.table API?

I often derive a week / month / quarter / year variable from a date variable, which is, of course, a monotonic transformation. I then do some keyed operation using one of these derived variables.

I am wondering whether I can simply replace the name of my date variable with the name of the week / month / etc. variable in the sorted attribute and have everything keep working correctly. That is, is the following safe?

    library(data.table)
    library(lubridate)

    DT <- data.table(day = as.Date(c('2006-01-30', '2006-01-31', '2006-02-01', '2006-02-02')),
                     d = 1:4, key = 'day')
    DT[, month := floor_date(day, unit = 'month')]

    # is this safe?
    attr(DT, 'sorted') <- 'month'

I could not tell whether there are any other underlying data structures that reference the key and might cause problems with this technique.

1 answer

Yes, I use this trick all the time when I'm sure the data is sorted, but use setattr instead to avoid copying:

 setattr(DT, 'sorted', 'month') 

If you look at the source of setkeyv, you will see that this is essentially what it does: it sorts the data and then sets the "sorted" attribute.
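As a rough sketch of how this can be done a little more defensively, the snippet below checks that the derived column really is non-decreasing before relabelling the key. It uses base R's `format()` in place of `floor_date()` so that it runs without lubridate; that substitution and the `is.unsorted()` guard are my additions, not something data.table requires. The trick only holds if the rows are genuinely ordered by the new column (and, for a multi-column key, by all key columns in order).

```r
library(data.table)

DT <- data.table(
  day = as.Date(c("2006-01-30", "2006-01-31", "2006-02-01", "2006-02-02")),
  d   = 1:4,
  key = "day"
)

# Derive the first day of each month (base R stand-in for floor_date()).
DT[, month := as.Date(format(day, "%Y-%m-01"))]

# Guard (an added precaution): stop if the derived column is not non-decreasing.
stopifnot(!is.unsorted(DT$month))

# Relabel the key by reference, without copying or re-sorting the table.
setattr(DT, "sorted", "month")

key(DT)                               # "month"

# Keyed lookup on the derived column: returns the two February rows.
res <- DT[.(as.Date("2006-02-01"))]
```

If the guard fails, falling back to a plain `setkey(DT, month)` is the safe option, since it sorts the table before marking it as keyed.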
