Use diff() only on consecutive days

Question

I have the following data, and I would like to use the diff() function only on consecutive days. diff(data$ch, differences = 1, lag = 1) returns the differences between all consecutive ch values (23-12, 4-23, 78-4, 120-78, 94-120, ...). I would like diff() to return NA when the dates are not consecutive. The result I'm trying to get from the data below:

 11, -19, 74, NA, -26, NA, -34, 39, NA

Is there anyone who knows how I can do this?

 Date       ch
 2013-01-01 12
 2013-01-02 23
 2013-01-03 4
 2013-01-04 78
 2013-01-10 120
 2013-01-11 94
 2013-02-26 36
 2013-02-27 2
 2013-02-28 41
 2003-03-05 22
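For anyone who wants to reproduce the problem, here is a minimal sketch that rebuilds the table above and shows what plain diff() does with it. The variable names `plain` and `gaps` are just illustrative, and the last date is kept as 2003-03-05 exactly as posted.

```r
# Rebuild the question's data (the last date is copied as posted: 2003-03-05)
data <- data.frame(
  Date = as.Date(c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04",
                   "2013-01-10", "2013-01-11", "2013-02-26", "2013-02-27",
                   "2013-02-28", "2003-03-05")),
  ch = c(12, 23, 4, 78, 120, 94, 36, 2, 41, 22)
)

# Plain diff() ignores the dates and differences every neighbouring pair
plain <- diff(data$ch, differences = 1, lag = 1)
print(plain)  # 11 -19 74 42 -26 -58 -34 39 -19

# Gap in days between neighbouring rows; anything other than 1 is a break
gaps <- as.numeric(diff(data$Date))
print(gaps)   # 1 1 1 6 1 46 1 1 -3648
```

The positions where `gaps` is not 1 are the ones that should become NA in the desired result.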

+5

date diff r

stem Aug 3 '15 at 13:36

3 answers

akrun · Answer 1 · 2015-08-03T13:39:48+0000

You can do this in base R without installing any external packages.

Assuming that the "Date" column has class Date, we take the diff of "Date" and, based on whether the difference between neighboring elements is greater than 1, create a grouping index ('indx') by taking the cumulative sum (cumsum) of the resulting logical vector.

  indx <- cumsum(c(TRUE,abs(diff(df1$Date))>1))

In the second step, we can use ave with "indx" as the grouping vector and take the diff of "ch" within each group. The diff output for a group is one element shorter than the group itself, so we append NA to make the lengths the same.

  ave(df1$ch, indx, FUN=function(x) c(diff(x),NA))
  #[1] 11 -19 74 NA -26 NA -34 39 NA NA

data

 df1 <- structure(list(Date = structure(c(15706, 15707, 15708, 15709, 15715, 15716, 15762, 15763, 15764, 12116), class = "Date"), ch = c(12L, 23L, 4L, 78L, 120L, 94L, 36L, 2L, 41L, 22L)), .Names = c("Date", "ch"), row.names = c(NA, -10L), class = "data.frame")
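To make the two steps easier to follow, here is a small self-contained sketch on a shortened version of the data, printing the intermediate grouping index. The wrapper name `diff_within_runs` is hypothetical and only there for illustration; the logic is the cumsum/ave approach described above.

```r
# Hypothetical helper (the name is illustrative, not part of the answer):
# differences within runs of consecutive dates, NA at the end of each run
diff_within_runs <- function(dates, values) {
  # TRUE at the first row and wherever the gap to the previous date exceeds 1 day
  starts_run <- c(TRUE, abs(as.numeric(diff(dates))) > 1)
  # Cumulative sum turns the run starts into a group id per row
  indx <- cumsum(starts_run)
  # diff() within each group, padded with NA so the length matches the group
  ave(values, indx, FUN = function(x) c(diff(x), NA))
}

dates <- as.Date(c("2013-01-01", "2013-01-02", "2013-01-03",
                   "2013-01-10", "2013-01-11"))
ch <- c(12, 23, 4, 120, 94)

indx <- cumsum(c(TRUE, abs(as.numeric(diff(dates))) > 1))
print(indx)  # 1 1 1 2 2

res <- diff_within_runs(dates, ch)
print(res)   # 11 -19 NA -26 NA
```

Note that this returns one value per row, so the result has the same length as the input and ends with an NA for the last row of every run.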

alexis_laz · Answer 2 · 2015-08-03T14:40:47+0000

The following simply "... returns NA when dates are not consecutive", provided there are no more complicated cases that it fails to account for:

 replace(diff(df1$ch), abs(diff(df1$Date)) > 1, NA)
 #[1] 11 -19 74 NA -26 NA -34 39 NA
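The answer does not say which complicated cases it has in mind. One possibility, and this is only an assumption, is duplicated or out-of-order dates: a gap of 0 or -1 days passes the `abs(...) > 1` check and is treated as consecutive. A stricter variant might blank out every gap that is not exactly one day, roughly like this (toy data, illustrative variable names):

```r
# Toy data with a duplicated date (assumed edge case, not from the question)
dates <- as.Date(c("2013-01-01", "2013-01-02", "2013-01-02", "2013-01-10"))
ch <- c(12, 23, 4, 78)

gap <- as.numeric(diff(dates))  # gaps in days: 1 0 8

# Check as written in the answer: only gaps larger than one day become NA
original <- replace(diff(ch), abs(gap) > 1, NA)
print(original)  # 11 -19 NA

# Stricter check: anything other than exactly one day forward becomes NA
strict <- replace(diff(ch), gap != 1, NA)
print(strict)    # 11 NA NA
```

Whether the stricter behaviour is wanted depends on the data; for sorted dates without duplicates both checks give the same result.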

dimitris_ps · Answer 3 · 2015-08-03T13:46:02+0000

Try this with the lubridate and dplyr packages.

If you do not have them installed, run this once: install.packages("dplyr");install.packages("lubridate")

code

 library(lubridate)
 library(dplyr)
 data$Date <- ymd(data$Date)
 data2 <- data %>%
   mutate(diff=ifelse(Date==lag(Date)+days(1), ch-lag(ch), NA))

Data

 data <- data.frame(Date=c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04", "2013-01-10", "2013-01-11", "2013-01-26", "2013-01-27", "2013-01-28", "2013-03-05"), ch=c(12, 23, 4, 78, 120, 94, 36, 2, 41, 22))
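As a side note, a sketch of the same idea using dplyr alone, assuming the dates are converted with base `as.Date()` instead of lubridate. It uses a shortened toy data set, and it shows that this approach puts each difference on the later row of the pair, so the NA padding comes first rather than last.

```r
library(dplyr)

# Shortened toy data; Date converted with base as.Date() (assumption: no lubridate)
data <- data.frame(
  Date = as.Date(c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-10")),
  ch   = c(12, 23, 4, 78)
)

data2 <- data %>%
  mutate(diff = ifelse(as.numeric(Date - dplyr::lag(Date)) == 1,  # exactly one day after the previous row?
                       ch - dplyr::lag(ch),                      # yes: difference to the previous value
                       NA))                                      # no, or first row: NA

print(data2$diff)  # NA 11 -19 NA
```

If the differences should sit on the earlier row instead, as in the result shown in the question, `dplyr::lead()` could be used in place of `dplyr::lag()` with the signs adjusted accordingly.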
