Calculate the number of days between two dates in R

I need to calculate the number of days elapsed between several dates in two ways, and then output the results in new columns: i) the number of days elapsed since the first date (e.g., RESULTS$FIRST) and ii) the number of days between successive dates (e.g., RESULTS$BETWEEN). Here is an example with the desired results. Thanks in advance.

    library(lubridate)

    DATA = data.frame(DATE = mdy(c("7/8/2013", "8/1/2013", "8/30/2013",
                                   "10/23/2013", "12/16/2013", "12/16/2015")))

    RESULTS = data.frame(DATE = mdy(c("7/8/2013", "8/1/2013", "8/30/2013",
                                      "10/23/2013", "12/16/2013", "12/16/2015")),
                         FIRST = c(0, 24, 53, 107, 161, 891),
                         BETWEEN = c(0, 24, 29, 54, 54, 730))
4 answers
    # Using the dplyr package
    library(dplyr)

    df1 %>%  # your data frame
      # units = "days" makes the unit explicit instead of relying on difftime's "auto"
      mutate(BETWEEN0 = as.numeric(difftime(DATE, lag(DATE, 1), units = "days")),
             BETWEEN = ifelse(is.na(BETWEEN0), 0, BETWEEN0),  # first row has no predecessor
             FIRST = cumsum(as.numeric(BETWEEN))) %>%
      select(-BETWEEN0)

            DATE BETWEEN FIRST
    1 2013-07-08       0     0
    2 2013-08-01      24    24
    3 2013-08-30      29    53
    4 2013-10-23      54   107
    5 2013-12-16      54   161
    6 2015-12-16     730   891

This will give you what you want:

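    d <- as.Date(DATA$DATE, format = "%m/%d/%Y")

    # Days elapsed since the first date (vectorized, so no loop is needed)
    first <- as.numeric(d - d[1])

    # Days between successive dates; prepend 0 for the first row
    between <- c(0, diff(d))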

This uses the as.Date() function from base R to convert a vector of date strings to Date values using the specified format. Since the dates are written as month/day/year, specify format="%m/%d/%Y" so they are interpreted correctly. If the column is already of class Date, as in the question's DATA (built with mdy()), the call returns it unchanged.
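As a small illustration of the format argument, here is a sketch that parses a few of the question's dates from plain character strings. The vector name x is only a placeholder for this example.

```r
# Character dates in month/day/year order (values taken from the question)
x <- c("7/8/2013", "8/1/2013", "12/16/2015")

# Parse with an explicit format so "7/8" is read as July 8, not August 7
d <- as.Date(x, format = "%m/%d/%Y")

class(d)   # "Date"
format(d)  # "2013-07-08" "2013-08-01" "2015-12-16"

# A format that does not match the strings does not raise an error;
# it silently produces NA, so it is worth checking the result
bad <- as.Date("7/8/2013", format = "%Y-%m-%d")
is.na(bad) # TRUE
```

Checking for NA after parsing is a cheap way to catch a wrong format string early.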

diff() computes lagged differences, i.e. each element minus the one before it. The first element has no predecessor, so the result is one element shorter than the input; prepend a 0 with c() to get one value per row.

Differences between Date objects are specified in days by default.
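The behaviour described here can be checked on the first three dates from the question. This is only a sketch; the names gaps, between and first are chosen for the example.

```r
d <- as.Date(c("2013-07-08", "2013-08-01", "2013-08-30"))

# diff() returns a difftime with one element fewer than its input
gaps <- diff(d)
length(gaps)  # 2
units(gaps)   # "days"

# Prepending 0 gives one value per date; c() with a leading numeric
# returns a plain numeric vector
between <- c(0, as.numeric(gaps))  # 0 24 29

# Subtracting the first date from every date gives the elapsed days
first <- as.numeric(d - d[1])      # 0 24 53
```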

Building the output data frame is then simple:

 RESULTS <- data.frame(DATE=DATA$DATE, FIRST=first, BETWEEN=between) 

For the first part:

    DATA = data.frame((c("7/8/2013", "8/1/2013", "8/30/2013",
                         "10/23/2013", "12/16/2013", "12/16/2015")))
    names(DATA)[1] = "V1"

    date = as.Date(DATA$V1, format = "%m/%d/%Y")
    print(date - date[1])

Result:

 [1] 0 24 53 107 161 891 

For the second part, just use a for loop that subtracts each date from the one after it.
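The answer does not show the loop, so the following is one possible sketch of it rather than the author's code. The names dates and between are chosen for the example.

```r
dates <- as.Date(c("7/8/2013", "8/1/2013", "8/30/2013",
                   "10/23/2013", "12/16/2013", "12/16/2015"),
                 format = "%m/%d/%Y")

# Preallocate the result; the first entry stays 0 because the
# first date has no predecessor
between <- numeric(length(dates))

# Start at the second element and subtract the previous date
for (i in seq_along(dates)[-1]) {
  between[i] <- as.numeric(dates[i] - dates[i - 1])
}

between  # 0 24 29 54 54 730
```

The same values come from c(0, diff(dates)), which avoids the explicit loop.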


You can simply add each column with simple difftime and lagged diff calculations.

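    # Days since the first date; 0 is prepended for the first row
    DATA$FIRST <- c(0, with(DATA, difftime(DATE[2:length(DATE)], DATE[1], units = "days")))

    # Days between successive dates; diff() must see the whole column,
    # otherwise the result is one element too short for the data frame
    DATA$BETWEEN <- c(0, with(DATA, diff(DATE)))

    identical(DATA, RESULTS)
    [1] TRUE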

Source: https://habr.com/ru/post/1214336/

