Interpolation of time series data with specific output time

I have a database with time data. I want to interpolate data for a specific time step.

  Id Time                humid humtemp prtemp press   t
  1  2012-01-21 18:41:50 47.7  14.12   13.870 1005.70 -0.05277778
  1  2012-01-21 18:46:43 44.5  15.37   15.100 1005.20  0.02861111
  1  2012-01-21 18:51:35 43.2  15.88   15.576 1005.10  0.10972222
  1  2012-01-21 18:56:28 42.5  16.17   15.833 1004.90  0.19111111
  1  2012-01-21 19:01:21 42.2  16.31   15.986 1004.80  0.27250000
  1  2012-01-21 19:06:14 41.8  16.47   16.118 1004.60  0.35388889
  1  2012-01-21 19:11:07 41.6  16.51   16.177 1004.60  0.43527778

I want to obtain values at the following regular time steps by interpolation.

  Id Time                humid humtemp prtemp press t
  1  2012-01-21 18:45:00 ....  ...     .....  ....  ....
  1  2012-01-21 18:50:00 ....
  1  2012-01-21 18:55:00 ....
  1  2012-01-21 19:00:00 ....
  1  2012-01-21 19:05:00 ....
  1  2012-01-21 19:10:00 ....

I tried several methods but did not find a solution. For example, I created a zoo object:

  z <- zoo(MTS01m, order.by = MTS01m$Time)
  tstart2 <- asP("2012-01-21 18:45:00")
  Ts <- 1*60
  y <- merge(z, zoo(order.by = seq(tstart2, end(z), by = Ts)))
  xa <- na.approx(y)
  xs <- na.spline(y)

but an error occurs:

  Error in approx(x[!na], y[!na], xout, ...) :
    need at least two non-NA values to interpolate
  In addition: Warning message:
  In xy.coords(x, y) : NAs introduced by coercion

I created a secondary index t that starts where I want to have data, but I don't know how to use this index.

Do you have any suggestions?

3 answers

Try this (assuming your time index is POSIXct):

 library(zoo)
 st <- as.POSIXct("2012-01-21 18:45")
 g <- seq(st, end(z), by = "15 min") # grid
 na.approx(z, xout = g)

See ?na.approx.zoo for more details.
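To see what the linear interpolation is doing at a single grid point, here is a minimal base-R sketch. It uses only stats::approx and the first two humidity readings from the question. The variable names (times, humid, target, res) and the choice of tz = "UTC" are assumptions made for illustration, and this is not how zoo is implemented internally.

```r
# First two observation times and humidity values, copied from the question
times <- as.POSIXct(c("2012-01-21 18:41:50", "2012-01-21 18:46:43"), tz = "UTC")
humid <- c(47.7, 44.5)

# The grid point we want a value for
target <- as.POSIXct("2012-01-21 18:45:00", tz = "UTC")

# approx() works on plain numbers, so convert the times to seconds
res <- approx(x = as.numeric(times), y = humid, xout = as.numeric(target))

# Interpolated humidity at 18:45:00
print(res$y)
```

The target lies 190 seconds into a 293-second gap, so the result is 47.7 - 3.2 * 190 / 293, which is the humid value reported for 18:45:00 in this thread.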

Note: Since the question did not provide data in reproducible form, we do it here:

 Lines <- "Id date Time humid humtemp prtemp press t1
 1 2012-01-21 18:41:50 47.7 14.12 13.870 1005.70 -0.05277778
 1 2012-01-21 18:46:43 44.5 15.37 15.100 1005.20 0.02861111
 1 2012-01-21 18:51:35 43.2 15.88 15.576 1005.10 0.10972222
 1 2012-01-21 18:56:28 42.5 16.17 15.833 1004.90 0.19111111
 1 2012-01-21 19:01:21 42.2 16.31 15.986 1004.80 0.27250000
 1 2012-01-21 19:06:14 41.8 16.47 16.118 1004.60 0.35388889
 1 2012-01-21 19:11:07 41.6 16.51 16.177 1004.60 0.43527778"

 library(zoo)
 z <- read.zoo(text = Lines, header = TRUE, index = 2:3, tz = "")

 st <- as.POSIXct("2012-01-21 18:45")
 g <- seq(st, end(z), by = "15 min") # grid
 na.approx(z, xout = g)

giving:

                     Id    humid  humtemp   prtemp    press            t1
 2012-01-21 18:45:00  1 45.62491 14.93058 14.66761 1005.376 -1.501706e-09
 2012-01-21 19:00:00  1 42.28294 16.27130 15.94370 1004.828  2.500000e-01

You can see the process as follows:

  • Create a sequence based on data ranges.
  • Combine sequence and data.
  • Interpolate values: constant or linear method.

Creating a dataset:

 data1 <- read.table(text="1 2012-01-21 18:41:50 47.7 14.12 13.870 1005.70 -0.05277778
 1 2012-01-21 18:46:43 44.5 15.37 15.100 1005.20 0.02861111
 1 2012-01-21 18:51:35 43.2 15.88 15.576 1005.10 0.10972222
 1 2012-01-21 18:56:28 42.5 16.17 15.833 1004.90 0.19111111
 1 2012-01-21 19:01:21 42.2 16.31 15.986 1004.80 0.27250000
 1 2012-01-21 19:06:14 41.8 16.47 16.118 1004.60 0.35388889
 1 2012-01-21 19:11:07 41.6 16.51 16.177 1004.60 0.43527778",
 col.names=c("Id","date","Time","humid","humtemp","prtemp","press","t1"))
 # build a single date-time column from the date and Time columns
 data1$datetime <- strptime(paste(data1$date, data1$Time, sep=" "), "%Y-%m-%d %H:%M:%S")

Load the zoo library:

 library(zoo) 

Step 1:

 # sequence interval 5 seconds
 seq1 <- zoo(order.by=(as.POSIXlt( seq(min(data1$datetime), max(data1$datetime), by=5) )))

Step 2:

 mer1 <- merge(zoo(x=data1[4:7],order.by=data1$datetime), seq1) 

Step 3:

 # Constant interpolation
 dataC <- na.approx(mer1, method="constant")
 # Linear interpolation
 dataL <- na.approx(mer1)
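The difference between the two methods can be illustrated with stats::approx, to which na.approx is documented to forward extra arguments such as method. This is only a sketch on two toy points: the y values are the first two humidity readings from the question, while the x positions and the names const_val and lin_val are made up for illustration.

```r
# Two observations 10 time units apart (x is a toy scale, y from the question)
x <- c(0, 10)
y <- c(47.7, 44.5)

# Constant interpolation: with the default f = 0 the left-hand value is carried forward
const_val <- approx(x, y, xout = 5, method = "constant")$y

# Linear interpolation: the two neighbours are weighted by distance
lin_val <- approx(x, y, xout = 5, method = "linear")$y

print(c(constant = const_val, linear = lin_val))
```

Halfway between the points, the constant method still returns 47.7, while the linear method returns the midpoint 46.1. The same pattern shows up when comparing head(dataC) with head(dataL).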

Visualization

 head(dataC)
                     humid humtemp prtemp  press
 2012-01-21 18:41:50  47.7   14.12  13.87 1005.7
 2012-01-21 18:41:55  47.7   14.12  13.87 1005.7
 2012-01-21 18:42:00  47.7   14.12  13.87 1005.7
 2012-01-21 18:42:05  47.7   14.12  13.87 1005.7
 2012-01-21 18:42:10  47.7   14.12  13.87 1005.7
 2012-01-21 18:42:15  47.7   14.12  13.87 1005.7

 head(dataL)
                        humid  humtemp   prtemp    press
 2012-01-21 18:41:50 47.70000 14.12000 13.87000 1005.700
 2012-01-21 18:41:55 47.64539 14.14133 13.89099 1005.691
 2012-01-21 18:42:00 47.59078 14.16266 13.91198 1005.683
 2012-01-21 18:42:05 47.53618 14.18399 13.93297 1005.674
 2012-01-21 18:42:10 47.48157 14.20532 13.95396 1005.666
 2012-01-21 18:42:15 47.42696 14.22666 13.97495 1005.657

I could not find a function in the xts (or zoo) package that interpolates directly at the given dates.

So my idea is to insert NA rows into the original time series at the desired dates.

  ids <- as.POSIXct(align.time(index(dat.xts), 60*5)) # range dates
  # I create an xts with NA
  y <- xts(x = matrix(data = NA, nrow = dim(dat.xts)[1], ncol = dim(dat.xts)[2]),
           order.by = ids)
  rbind(y, dat.xts)

                      humid humtemp prtemp  press           t
  2012-01-21 18:41:50  47.7   14.12 13.870 1005.7 -0.05277778
  2012-01-21 18:45:00    NA      NA     NA     NA          NA
  2012-01-21 18:46:43  44.5   15.37 15.100 1005.2  0.02861111
  2012-01-21 18:50:00    NA      NA     NA     NA          NA
  2012-01-21 18:51:35  43.2   15.88 15.576 1005.1  0.10972222
  2012-01-21 18:55:00    NA      NA     NA     NA          NA

Now you can use na.approx or na.spline and keep only the rows at the new dates, like this:

 na.approx(rbind(y,dat.xts))[index(y)]
                     humid humtemp prtemp   press    t
 2012-01-21 18:45:00 45.62   14.93  14.67 1005.38 0.00
 2012-01-21 18:50:00 43.62   15.71  15.42 1005.13 0.08
 2012-01-21 18:55:00 42.71   16.08  15.76 1004.96 0.17
 2012-01-21 19:00:00 42.28   16.27  15.94 1004.83 0.25
 2012-01-21 19:05:00 41.90   16.43  16.08 1004.65 0.33
 2012-01-21 19:10:00 41.65   16.50  16.16 1004.60 0.42