Cut POSIXct at a specific time for daily funds

I am interested in calculating average values ​​for certain time periods in a time series data set.

For such time series:

dtm=as.POSIXct("2007-03-27 05:00", tz="GMT")+3600*(1:240) Count<-c(1:240) DF<-data.frame(dtm,Count) 

In the past, I had to calculate daily averages using

 DF$Day<-cut(DF$dtm,breaks="day") Day_Avg<-aggregate(DF$Count~Day,DF,mean) 

But now I'm trying to shorten the day at certain periods of time, and I'm not sure how to set my β€œbreaks”.

Unlike the average daily value from 0: 00: 24: 00, how, for example, can I get noon at noon?

Or more imagination, how could I set up noon at noon, excluding night times from 7pm to 6am (or vice versa, including only 6 AM-7PM daylight).

+4
source share
3 answers

xts is the ideal time series analysis package

 library(xts) originalTZ <- Sys.getenv("TZ") Sys.setenv(TZ = "GMT") data.xts <- as.xts(1:240, as.POSIXct("2007-03-27 05:00", tz = "GMT") + 3600 * (1:240)) head(data.xts) ## [,1] ## 2007-03-27 06:00:00 1 ## 2007-03-27 07:00:00 2 ## 2007-03-27 08:00:00 3 ## 2007-03-27 09:00:00 4 ## 2007-03-27 10:00:00 5 ## 2007-03-27 11:00:00 6 # You can filter data using ISO-style subsetting data.xts.filterd <- data.xts["T06:00/T19:00"] # You can use builtin functions to apply any function FUN on daily data. apply.daily(data.xts.filtered, mean) ## [,1] ## 2007-03-27 18:00:00 7.5 ## 2007-03-28 18:00:00 31.5 ## 2007-03-29 18:00:00 55.5 ## 2007-03-30 18:00:00 79.5 ## 2007-03-31 18:00:00 103.5 ## 2007-04-01 18:00:00 127.5 ## 2007-04-02 18:00:00 151.5 ## 2007-04-03 18:00:00 175.5 ## 2007-04-04 18:00:00 199.5 ## 2007-04-05 18:00:00 223.5 # OR # now let say you want to find noon to noon average. period.apply(data.xts, c(0, which(.indexhour(data.xts) == 11)), FUN = mean) ## [,1] ## 2007-03-27 11:00:00 3.5 ## 2007-03-28 11:00:00 18.5 ## 2007-03-29 11:00:00 42.5 ## 2007-03-30 11:00:00 66.5 ## 2007-03-31 11:00:00 90.5 ## 2007-04-01 11:00:00 114.5 ## 2007-04-02 11:00:00 138.5 ## 2007-04-03 11:00:00 162.5 ## 2007-04-04 11:00:00 186.5 ## 2007-04-05 11:00:00 210.5 # now if you want to exclude time from 7 PM to 6 AM data.xts.filtered <- data.xts[!data.xts %in% data.xts["T20:00/T05:00"]] head(data.xts.filtered, 20) ## [,1] ## 2007-03-27 06:00:00 1 ## 2007-03-27 07:00:00 2 ## 2007-03-27 08:00:00 3 ## 2007-03-27 09:00:00 4 ## 2007-03-27 10:00:00 5 ## 2007-03-27 11:00:00 6 ## 2007-03-27 12:00:00 7 ## 2007-03-27 13:00:00 8 ## 2007-03-27 14:00:00 9 ## 2007-03-27 15:00:00 10 ## 2007-03-27 16:00:00 11 ## 2007-03-27 17:00:00 12 ## 2007-03-27 18:00:00 13 ## 2007-03-27 19:00:00 14 ## 2007-03-28 06:00:00 25 ## 2007-03-28 07:00:00 26 ## 2007-03-28 08:00:00 27 ## 2007-03-28 09:00:00 28 ## 2007-03-28 10:00:00 29 ## 2007-03-28 11:00:00 30 period.apply(data.xts.filtered, c(0, which(.indexhour(data.xts.filtered) == 11)), FUN = mean) ## [,1] ## 2007-03-27 11:00:00 3.50000 ## 2007-03-28 11:00:00 17.78571 ## 2007-03-29 11:00:00 41.78571 ## 2007-03-30 11:00:00 65.78571 ## 2007-03-31 11:00:00 89.78571 ## 2007-04-01 11:00:00 113.78571 ## 2007-04-02 11:00:00 137.78571 ## 2007-04-03 11:00:00 161.78571 ## 2007-04-04 11:00:00 185.78571 ## 2007-04-05 11:00:00 209.78571 Sys.setenv(TZ = originalTZ) 
+5
source

Let me quickly repeat your code.

 dtm <- as.POSIXct("2007-03-27 05:00", tz="GMT")+3600*(1:240) Count <- c(1:240) DF<-data.frame(dtm,Count) DF$Day<-cut(DF$dtm,breaks="day") Day_Avg<-aggregate(DF$Count~Day,DF,mean) 

If you mix up every 12 hours in a function call, you can still use cut with "day" interruptions. I will save the day when it starts noon to noon, so I subtract 12 hours.

 # Get twelve hours in seconds timeOffset <- 60*60*12 # Subtract the offset to get the start day of the noon to noon DF$Noon_Start_Day <- cut((DF$dtm - timeOffset), breaks="day") # Get the mean NtN_Avg <- aggregate(DF$Count ~ Noon_Start_Day, DF, mean) 

One way to exclude specific hours is to convert dates to POSIXlt . Then you can access hour by the way.

 # Indicate which times are good (use whatever boolean test is needed here) goodTimes <- !(as.POSIXlt(DF$dtm)$hour >= 19) & !(as.POSIXlt(DF$dtm)$hour <= 6) new_NtN_Avg <- aggregate(Count ~ Noon_Start_Day, data=subset(DF, goodTimes), mean) 

I found some help on this in stackoverflow: r-calculate-means-for-subset-of-a-group

+5
source

The noon to floor problem can be easily solved numerically. The key is that the beginning of the day (GMT) has a value of time_t , which is always divided by 86400. This is indicated in POSIX. For example, see http://en.wikipedia.org/wiki/Unix_time

 cuts <- unique(as.numeric(DF$dtm) %/% (86400/2)) * (86400/2) # half-days cuts <- c(cuts, cuts[length(cuts)]+(86400/2)) # One more at the end cuts <- as.POSIXct(cuts, tz="GMT", origin="1970-01-01") # Familiar format DF$halfday <- cut(DF$dtm, cuts) # This is the cut you want. Halfday_Avg <- aggregate(Count~halfday, data=DF, FUN=mean) Halfday_Avg ## halfday Count ## 1 2007-03-27 00:00:00 3.5 ## 2 2007-03-27 12:00:00 12.5 ## 3 2007-03-28 00:00:00 24.5 ## 4 2007-03-28 12:00:00 36.5 ## 5 2007-03-29 00:00:00 48.5 ## 6 2007-03-29 12:00:00 60.5 ## 7 2007-03-30 00:00:00 72.5 ## 8 2007-03-30 12:00:00 84.5 ## 9 2007-03-31 00:00:00 96.5 ## 10 2007-03-31 12:00:00 108.5 ## 11 2007-04-01 00:00:00 120.5 ## 12 2007-04-01 12:00:00 132.5 ## 13 2007-04-02 00:00:00 144.5 ## 14 2007-04-02 12:00:00 156.5 ## 15 2007-04-03 00:00:00 168.5 ## 16 2007-04-03 12:00:00 180.5 ## 17 2007-04-04 00:00:00 192.5 ## 18 2007-04-04 12:00:00 204.5 ## 19 2007-04-05 00:00:00 216.5 ## 20 2007-04-05 12:00:00 228.5 ## 21 2007-04-06 00:00:00 237.5 

Now, to expand this, to solve the rest of the problem. Here is a time range of 6 AM-7PM.

 intraday <- as.numeric(DF$dtm) %% 86400 # Subset DF by the chosen range New_Avg <- aggregate(Count~halfday, data=DF[intraday >= 6*3600 & intraday <= 19*3600,], FUN=mean) New_Avg ## halfday Count ## 1 2007-03-27 00:00:00 3.5 ## 2 2007-03-27 12:00:00 10.5 ## 3 2007-03-28 00:00:00 27.5 ## 4 2007-03-28 12:00:00 34.5 ## 5 2007-03-29 00:00:00 51.5 ## 6 2007-03-29 12:00:00 58.5 ## 7 2007-03-30 00:00:00 75.5 ## 8 2007-03-30 12:00:00 82.5 ## 9 2007-03-31 00:00:00 99.5 ## 10 2007-03-31 12:00:00 106.5 ## 11 2007-04-01 00:00:00 123.5 ## 12 2007-04-01 12:00:00 130.5 ## 13 2007-04-02 00:00:00 147.5 ## 14 2007-04-02 12:00:00 154.5 ## 15 2007-04-03 00:00:00 171.5 ## 16 2007-04-03 12:00:00 178.5 ## 17 2007-04-04 00:00:00 195.5 ## 18 2007-04-04 12:00:00 202.5 ## 19 2007-04-05 00:00:00 219.5 ## 20 2007-04-05 12:00:00 226.5 
+1
source

All Articles