Use diff() only on consecutive days

I have the following data, and I would like to use the diff() function only on consecutive days. diff(data$ch, differences = 1, lag = 1) returns the differences between all consecutive ch values (23-12, 4-23, 78-4, 120-78, 94-120, ...). I would like diff() to return NA when the dates are not consecutive. The result I'm trying to get from the data below:

 11, -19, 74, NA, -26, NA, -34, 39, NA 

Is there anyone who knows how I can do this?

 Date       ch
 2013-01-01 12
 2013-01-02 23
 2013-01-03 4
 2013-01-04 78
 2013-01-10 120
 2013-01-11 94
 2013-02-26 36
 2013-02-27 2
 2013-02-28 41
 2003-03-05 22
3 answers

You can do this in base R without installing any external packages.

Assuming that the "Date" column has class Date, we take the diff of "Date". Depending on whether the difference between neighboring elements is greater than 1, we can create a grouping index ('indx') by taking the cumulative sum (cumsum) of the resulting logical vector.

  indx <- cumsum(c(TRUE,abs(diff(df1$Date))>1)) 
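To make the grouping step concrete, here is a minimal sketch of the intermediate values, assuming the dates from the question are parsed with as.Date. The names `dates`, `gaps` and `starts` are illustrative and do not appear in the original answer.

```r
# Dates copied from the question, including the final 2003 entry.
dates <- as.Date(c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04",
                   "2013-01-10", "2013-01-11", "2013-02-26", "2013-02-27",
                   "2013-02-28", "2003-03-05"))

# Gap in days between each date and the one before it.
gaps <- as.numeric(diff(dates))

# TRUE marks the start of a new run: the first row, plus every row
# whose gap to the previous row is larger than one day in either direction.
starts <- c(TRUE, abs(gaps) > 1)

# Summing the logical vector numbers the runs 1, 2, 3, ...
indx <- cumsum(starts)
print(indx)
# [1] 1 1 1 1 2 2 3 3 3 4
```

Each run of consecutive days ends up with its own group number, which is what lets a per-group diff stop at the gaps.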

In the second step, we can use ave with "indx" as the grouping vector and take the diff of ch within each group. The diff output is one element shorter than its input, so we append NA to make the lengths the same.

  ave(df1$ch, indx, FUN=function(x) c(diff(x),NA))
  #[1] 11 -19 74 NA -26 NA -34 39 NA NA

data

 df1 <- structure(list(Date = structure(c(15706, 15707, 15708, 15709, 15715, 15716, 15762, 15763, 15764, 12116), class = "Date"), ch = c(12L, 23L, 4L, 78L, 120L, 94L, 36L, 2L, 41L, 22L)), .Names = c("Date", "ch"), row.names = c(NA, -10L), class = "data.frame") 

The following simply "returns NA when dates are not consecutive", provided there are no more complicated cases that it fails to account for:

 replace(diff(df1$ch), abs(diff(df1$Date)) > 1, NA)
 #[1] 11 -19 74 NA -26 NA -34 39 NA
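One detail worth noting is that this one-liner returns one value fewer than there are rows. The sketch below, which rebuilds the question's data and uses the illustrative names `pairwise` and `ch_diff` (neither is in the original post), shows one possible way to pad the result so it can be stored as a column.

```r
# Data rebuilt from the question (illustrative reconstruction).
df1 <- data.frame(
  Date = as.Date(c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04",
                   "2013-01-10", "2013-01-11", "2013-02-26", "2013-02-27",
                   "2013-02-28", "2003-03-05")),
  ch = c(12, 23, 4, 78, 120, 94, 36, 2, 41, 22)
)

# One-liner from the answer: one value per pair of neighbouring rows.
pairwise <- replace(diff(df1$ch), abs(diff(df1$Date)) > 1, NA)
length(pairwise)  # 9, one fewer than nrow(df1)

# Appending NA gives one value per row, so it fits in the data frame.
df1$ch_diff <- c(pairwise, NA)
print(df1$ch_diff)
# [1]  11 -19  74  NA -26  NA -34  39  NA  NA
```

Padded this way, the values match the ten-element output of the ave-based answer above.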

Try this with the lubridate and dplyr packages.

If you do not have them, install them once with install.packages("dplyr"); install.packages("lubridate")

code

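 library(lubridate)
 library(dplyr)
 # Parse the Date column from character strings into dates
 data$Date <- ymd(data$Date)
 # Keep the difference only where the previous row is exactly one day earlier
 data2 <- data %>% mutate(diff=ifelse(Date==lag(Date)+days(1), ch-lag(ch), NA))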

Data

 data <- data.frame(Date=c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04", "2013-01-10", "2013-01-11", "2013-01-26", "2013-01-27", "2013-01-28", "2013-03-05"), ch=c(12, 23, 4, 78, 120, 94, 36, 2, 41, 22)) 
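For readers who want to check the alignment of this result without installing packages, here is a rough base R sketch of the same idea. It assumes that lag() shifts values down one row and fills the first position with NA; `prev_date` and `prev_ch` are illustrative names, not part of the original answer.

```r
# Same data as in this answer, with Date parsed up front.
data <- data.frame(
  Date = as.Date(c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04",
                   "2013-01-10", "2013-01-11", "2013-01-26", "2013-01-27",
                   "2013-01-28", "2013-03-05")),
  ch = c(12, 23, 4, 78, 120, 94, 36, 2, 41, 22)
)

n <- nrow(data)

# Previous row's date and value, with NA standing in for the first row.
prev_date <- c(as.Date(NA), data$Date[-n])
prev_ch <- c(NA, data$ch[-n])

# Keep the difference only where the previous date is exactly one day earlier.
data$diff <- ifelse(data$Date == prev_date + 1, data$ch - prev_ch, NA)
print(data$diff)
# [1]  NA  11 -19  74  NA -26  NA -34  39  NA
```

Note that here each difference sits on the later row of a pair, so the leading value is NA, whereas the two base R answers above place it on the earlier row.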