Combining two different data frames in R

I have two data frames. The first has three variables, “date”, “price” and “volume”, with 20 observations per day, 100 per month and 1200 per year (trading days only), and looks like this:

    Date        Price  Vol
    2008-09-01  20     0.2
    2008-09-01  30     0.5
    ...

So, for each month, I have a range of values for price and volume, from 10 to 40 and from 0.1 to 0.7 respectively.
The second data frame holds values interpolated from the first. It therefore no longer has a date, only small steps in the other variables:

    Price  Vol
    20     0.2
    21     0.21
    22     0.24
    30     0.5

So, while the first frame holds values at discrete points, the second is more or less continuous. Now my question is: how can I get R to merge the second data frame into the first, filling in the continuous price/volume values between two discrete observations, to get something like this:

    Date        Price  Vol
    2008-09-01  20     0.2
    2008-09-01  21     0.21
    2008-09-01  22     0.24
    ...
    2008-09-01  30     0.5

I just can't figure out how to do this. I always get NA values for dates that are no longer in ascending order.

Thanks so much for your support.
Dani

2 answers

I completely missed the point with my first post. This version does include the date. But I agree with Shane that unless a data frame is required by some downstream function, a time series is a good idea.

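    # Two observed (price, vol) points for a single date
    A <- data.frame(date = rep("2001-05-25", 2), price = c(20, 30), vol = c(0.2, 0.5))
    # Every intermediate price step between the observed minimum and maximum
    B <- data.frame(price = seq(min(A$price), max(A$price), by = 1))
    # Merging on price leaves NA in date and vol for the newly added rows
    C <- merge(A, B, all = TRUE)
    # Positions of the rows that came from A (the observed points)
    index <- which(!is.na(C$vol))
    for (i in seq(nrow(A))[-1]) {
      # Number of price steps between two consecutive observed points
      n <- A$price[i] - A$price[i - 1] + 1
      C$date[index[i - 1]:index[i]] <- rep(A$date[i - 1], n)
      C$vol[index[i - 1]:index[i]] <- seq(A$vol[i - 1], A$vol[i], length.out = n)
    }
    # Reorder the columns to date, price, vol
    ans <- C[, c(2, 1, 3)]
    ans

             date price  vol
    1  2001-05-25    20 0.20
    2  2001-05-25    21 0.23
    3  2001-05-25    22 0.26
    4  2001-05-25    23 0.29
    5  2001-05-25    24 0.32
    6  2001-05-25    25 0.35
    7  2001-05-25    26 0.38
    8  2001-05-25    27 0.41
    9  2001-05-25    28 0.44
    10 2001-05-25    29 0.47
    11 2001-05-25    30 0.50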
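For what it's worth, the loop could probably be replaced by base R's `approx()` applied per date. This is only a sketch under assumptions: the sample data and the helper name `interpolate_day` are made up for illustration, every date needs at least two distinct prices, and the interpolation is linear (as in the `seq()` call of the loop version), which may not match the interpolation used in the question.

```r
# Hypothetical input: two observed (price, vol) points per trading day
A <- data.frame(
  date  = as.Date(c("2001-05-25", "2001-05-25", "2001-05-28", "2001-05-28")),
  price = c(20, 30, 10, 14),
  vol   = c(0.2, 0.5, 0.1, 0.3)
)

# Hypothetical helper: linearly interpolate vol over a price grid for one day
interpolate_day <- function(day, step = 1) {
  grid <- seq(min(day$price), max(day$price), by = step)
  data.frame(
    date  = rep(day$date[1], length(grid)),
    price = grid,
    vol   = approx(day$price, day$vol, xout = grid)$y
  )
}

# Split by date, interpolate each piece, and stack the results
ans <- do.call(rbind, lapply(split(A, A$date), interpolate_day))
rownames(ans) <- NULL
```

Because each day is handled separately, the date column is filled in directly and never passes through NA.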

First, use a time series class (e.g. zoo or xts).

Your second, interpolated time series should still have a timestamp, even if it runs hourly or every minute, etc. Use merge to bring the two series together, then use na.locf to carry values forward from the lower-frequency series.

Here is an example:

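    library(zoo)

    # Low-frequency series: one value per day
    ts1 <- zoo(1:5, as.POSIXct(as.Date("2010-10-01") + 1:5))
    # High-frequency series: one value per hour over five days
    ts2 <- zoo(1:(5 * 24), as.POSIXct("2010-10-01 00:00:00") + (1:(5 * 24) * 3600))
    # Merge on the timestamps, then carry the last observed value forward
    na.locf(merge(ts1, ts2))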
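If linear interpolation is wanted instead of carrying the last value forward, zoo's `na.approx` can be swapped in for `na.locf`. The following is a small sketch with made-up data, not the asker's: the names `low` and `grid`, the hourly grid and the UTC timestamps are all assumptions chosen for illustration.

```r
library(zoo)

# Hypothetical low-frequency series: two observed values ten hours apart
low <- zoo(c(0.2, 0.5),
           as.POSIXct(c("2010-10-01 00:00:00", "2010-10-01 10:00:00"), tz = "UTC"))

# Zero-width zoo object that only contributes an hourly index
grid <- zoo(, seq(start(low), end(low), by = "hour"))

# Merging inserts NA at every hourly timestamp that has no observation
merged <- merge(low, grid)

filled_locf   <- na.locf(merged)    # repeat the last observed value
filled_linear <- na.approx(merged)  # interpolate linearly along the time index
```

Which of the two fits depends on whether the values between observations should be treated as constant or as changing gradually.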