I have the following problem: given a set of non-overlapping intervals in a data.table, report the gaps between the intervals.
I implemented this once in SQL, but I am struggling with data.table because it lacks a lead or lag function. For completeness, I have the SQL code here. I know that this functionality was implemented in data.table version 1.9.5, according to the changelog. Is this possible with data.table without doing many merges and without a lag or lead function?
Basically, I'm not completely against using merges (aka joins) as long as performance does not suffer. I think this has an easy implementation, but I can't figure out how to "get" the previous end time to use as the start time in my gap table.
For example:
    library(data.table)

    # The numbers represent seconds from 1970-01-01 01:00:01
    dat <- structure(
      list(
        ID = c(1L, 1L, 1L, 2L, 2L, 2L),
        stime = structure(c(as.POSIXct("2014-01-15 08:00:00"),
                            as.POSIXct("2014-01-15 11:00:00"),
                            as.POSIXct("2014-01-16 11:30:00"),
                            as.POSIXct("2014-01-15 09:30:00"),
                            as.POSIXct("2014-01-15 12:30:00"),
                            as.POSIXct("2014-01-15 13:30:00")),
                          class = c("POSIXct", "POSIXt"), tzone = ""),
        etime = structure(c(as.POSIXct("2014-01-15 10:30:00"),
                            as.POSIXct("2014-01-15 12:00:00"),
                            as.POSIXct("2014-01-16 13:00:00"),
                            as.POSIXct("2014-01-15 11:00:00"),
                            as.POSIXct("2014-01-15 12:45:00"),
                            as.POSIXct("2014-01-15 14:30:00")),
                          class = c("POSIXct", "POSIXt"), tzone = "")
      ),
      .Names = c("ID", "stime", "etime"),
      sorted = c("ID", "stime", "etime"),
      class = c("data.table", "data.frame"),
      row.names = c(NA, -6L)
    )
    dat <- data.table(dat)
This should result in:
    ID  stime                etime
    1   2014-01-15 10:30:00  2014-01-15 11:00:00
    1   2014-01-15 12:00:00  2014-01-16 11:30:00
    2   2014-01-15 11:00:00  2014-01-15 12:30:00
    2   2014-01-15 12:45:00  2014-01-15 13:30:00
Please note: gaps are reported even when they span more than one day.
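To make the goal concrete, here is a minimal sketch of one way the gaps might be computed without a lead/lag function and without a merge. It assumes the table is sorted by ID and stime and that the intervals really do not overlap. Within each ID, every gap starts at an interval's end (all etime values except the last) and ends at the next interval's start (all stime values except the first). The `tz = "UTC"` setting and the name `gaps` are assumptions made only to keep the example reproducible.

```r
library(data.table)

# Example data rebuilt in compact form; tz = "UTC" is an assumption for reproducibility
dat <- data.table(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L),
  stime = as.POSIXct(c("2014-01-15 08:00:00", "2014-01-15 11:00:00",
                       "2014-01-16 11:30:00", "2014-01-15 09:30:00",
                       "2014-01-15 12:30:00", "2014-01-15 13:30:00"), tz = "UTC"),
  etime = as.POSIXct(c("2014-01-15 10:30:00", "2014-01-15 12:00:00",
                       "2014-01-16 13:00:00", "2014-01-15 11:00:00",
                       "2014-01-15 12:45:00", "2014-01-15 14:30:00"), tz = "UTC")
)

# Sort by ID and start time so that consecutive rows are consecutive intervals
setkey(dat, ID, stime)

# Per ID: drop the last end time and the first start time, then pair them up.
# An ID with a single interval yields zero gap rows.
gaps <- dat[, .(stime = head(etime, -1L), etime = tail(stime, -1L)), by = ID]
print(gaps)
```

In data.table versions that provide `shift()`, the same pairing could presumably be written with `shift(stime, type = "lead")` inside the grouped `j` expression, followed by dropping the trailing `NA` row of each group. The `head()`/`tail()` form above avoids that dependency.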