Group rows based on gaps in a time series

I have a time series of GPS data that needs to be segmented into smaller parts based on gaps in the timestamps.

As an example, consider the following data frame. I want to add a segment number that labels each "chunk" of timestamps, effectively splitting the data every time there is a gap of at least 30 seconds in the time series.

The resulting data.frame will look something like this:

    timestamp segment
 1          1       1
 2          3       1
 3          5       1
 4         10       1
 5         42       2
 6         45       2
 7         92       3
 8        156       4
 9        160       4
 10       162       4
 11       163       4
 12       164       4
 13       200       5
 14       203       5

Is there an efficient way to do this? The data.frame is a grouped tbl_df (dplyr package) containing several separate time series, and it can be quite large.

2 answers

Your example data

 t <- c(1, 3, 5, 10, 42, 45, 92, 156, 160, 162, 163, 164, 200, 203) 

Segment Numbers

 s <- cumsum(c(TRUE,diff(t)>=30)) 
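To see why the one-liner works, it may help to look at the intermediate values. This is only a sketch of the same computation split into steps; the names `gaps` and `new_segment` are introduced here for illustration and are not part of the original answer.

```r
t <- c(1, 3, 5, 10, 42, 45, 92, 156, 160, 162, 163, 164, 200, 203)

# Time difference between each timestamp and the previous one
# (one element shorter than t)
gaps <- diff(t)

# TRUE wherever a row starts a new segment; the first row always does
new_segment <- c(TRUE, gaps >= 30)

# TRUE counts as 1 and FALSE as 0, so the running sum increases by one
# at every segment start
s <- cumsum(new_segment)
```

Because every step is vectorized, this should stay fast on large inputs, unlike a row-by-row loop.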

Output

 data.frame(timestamp=t,segment=s) 
    timestamp segment
 1          1       1
 2          3       1
 3          5       1
 4         10       1
 5         42       2
 6         45       2
 7         92       3
 8        156       4
 9        160       4
 10       162       4
 11       163       4
 12       164       4
 13       200       5
 14       203       5
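Since the question mentions a grouped tbl_df with several separate time series, the same expression can probably be applied per group inside `mutate()`, so that segment numbers restart at 1 for each series. The sketch below assumes a hypothetical `track_id` column identifying each series and made-up sample data; adjust the names to your own data.

```r
library(dplyr)

# Hypothetical sample data: two separate tracks, identified by an
# assumed `track_id` column
df <- data.frame(
  track_id  = rep(c("a", "b"), each = 4),
  timestamp = c(1, 3, 42, 45, 10, 50, 55, 120)
)

segmented <- df %>%
  group_by(track_id) %>%
  # diff() only makes sense on timestamps sorted within each track
  arrange(timestamp, .by_group = TRUE) %>%
  # start a new segment whenever the gap to the previous row is >= 30
  mutate(segment = cumsum(c(TRUE, diff(timestamp) >= 30))) %>%
  ungroup()
```

A group with a single row is handled too: `diff()` returns an empty vector, so that row simply gets segment 1.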

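If the name of your data.frame is "df":

 # Assumes df is sorted by timestamp
 df$segment <- 1
 for (i in seq_len(nrow(df))[-1]) {  # skips the loop safely if df has one row
   if (df$timestamp[i] < df$timestamp[i - 1] + 30) {
     df$segment[i] <- df$segment[i - 1]      # gap under 30 s: same segment
   } else {
     df$segment[i] <- df$segment[i - 1] + 1  # gap of 30 s or more: new segment
   }
 }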
