Starting daily time series in R

I have a daily time series of the number of visitors to a website. My series runs from 01/06/2014 to today, 14/10/2015, and I want to predict the number of visitors in the future. How can I read my series into R? I think:

 series <- ts(visitors, frequency=365, start=c(2014, 6)) 

If so, and after fitting my time series model with arimadata=auto.arima(), I want to predict the number of visitors over the next 60 days. How can I do this?

 h=..? forecast(arimadata,h=..), 

What value should h be? Thanks in advance for your help.

+8
r time-series
6 answers

The ts specification is wrong; if you are setting this up as daily observations, then you need to work out which day of the year 2014 June 1st is, and pass that in start:

    ## Create a daily Date object - helps my work on dates
    inds <- seq(as.Date("2014-06-01"), as.Date("2015-10-14"), by = "day")

    ## Create a time series object
    set.seed(25)
    myts <- ts(rnorm(length(inds)),     # random data
               start = c(2014, as.numeric(format(inds[1], "%j"))),
               frequency = 365)

Note that I specify start as c(2014, as.numeric(format(inds[1], "%j"))). That complex-looking bit works out which day of the year June 1st is:

    > as.numeric(format(inds[1], "%j"))
    [1] 152

Once you have this, you are effectively there:

    library(forecast)

    ## use auto.arima to choose ARIMA terms
    fit <- auto.arima(myts)
    ## forecast for next 60 time points
    fore <- forecast(fit, h = 60)
    ## plot it
    plot(fore)

[Figure: plot of the forecast]

This seems appropriate given the random data I provided ...

You will need to select the appropriate arguments for auto.arima() to match your data.

Note that the x-axis labels are in fractions of a year, so a label ending in .5 marks the middle of that year.
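Because the x-axis is in fractional years, it can help to attach calendar dates to the 60 forecast values. The sketch below is only an illustration: it uses base R arima() with a fixed AR(1) order (an assumption) as a stand-in for auto.arima(), and predict() instead of forecast(), so that it runs without any packages. The data frame name fc is made up for the example.

```r
## Same random daily series as above
inds <- seq(as.Date("2014-06-01"), as.Date("2015-10-14"), by = "day")
set.seed(25)
myts <- ts(rnorm(length(inds)),
           start = c(2014, as.numeric(format(inds[1], "%j"))),
           frequency = 365)

## Stand-in model: base R arima() with an assumed AR(1) order,
## used here instead of forecast::auto.arima()
fit <- arima(myts, order = c(1, 0, 0))
pred <- predict(fit, n.ahead = 60)

## Calendar dates for the 60 forecast steps: the days after the last observation
fc_dates <- seq(tail(inds, 1) + 1, by = "day", length.out = 60)
fc <- data.frame(date = fc_dates,
                 mean = as.numeric(pred$pred),
                 se   = as.numeric(pred$se))
print(head(fc, 3))
```

The same fc_dates vector could be paired with the point forecasts from a forecast object in the same way.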

Doing this via zoo

This might be easier to do via a zoo object, created using the zoo package:

    library(zoo)

    ## create the zoo object as before
    set.seed(25)
    myzoo <- zoo(rnorm(length(inds)), inds)

Please note: now you do not need to specify any start or frequency data; just use inds computed earlier from the daily Date object.

Proceed as before, this time on the zoo object:

    ## use auto.arima to choose ARIMA terms
    fit <- auto.arima(myzoo)
    ## forecast for next 60 time points
    fore <- forecast(fit, h = 60)

The plot, however, will cause a problem, as the x-axis is in days since the epoch (1970-01-01), so we need to suppress the automatic drawing of this axis and then draw our own. This is easy, since we have inds:

    ## plot it
    plot(fore, xaxt = "n")    # no x-axis
    Axis(inds, side = 1)

This only produces a couple of labelled ticks; if you want more control, tell R where you want the ticks and labels:

    ## plot it
    plot(fore, xaxt = "n")    # no x-axis
    Axis(inds, side = 1,
         at = seq(inds[1], tail(inds, 1) + 60, by = "3 months"),
         format = "%b %Y")

Here we place a tick every 3 months.
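To see which tick positions that seq() call actually produces, it can be evaluated on its own. This is just a small check in base R; the names h, ticks and tick_labels are made up for the illustration.

```r
inds <- seq(as.Date("2014-06-01"), as.Date("2015-10-14"), by = "day")
h <- 60   # forecast horizon in days

## One tick every 3 months, from the first observation to the end of the forecast
ticks <- seq(inds[1], tail(inds, 1) + h, by = "3 months")

## Labels as used in the Axis() call; month abbreviations depend on the locale
tick_labels <- format(ticks, "%b %Y")
print(data.frame(tick = ticks, label = tick_labels))
```

The ticks fall on the first day of every third month, starting at 2014-06-01 and stopping at the last such date that is still inside the forecast period.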

+14

Time series (ts) objects do not work well for creating daily time series. I suggest you use the zoo package.

    library(zoo)
    zoo(visitors, seq(from = as.Date("2014-06-01"), to = as.Date("2015-10-14"), by = 1))
+3

Below is a step-by-step guide for forecasting daily data with multiple seasonality in R. If the time series is relatively short, so that only weekly seasonality is present, the simplest approach is to set the frequency attribute to 7.

 y <- ts(x, frequency=7) 

Then any of the usual time series forecasting methods should produce reasonable forecasts. For example:

    library(forecast)
    fit <- ets(y)
    fc <- forecast(fit)
    plot(fc)
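To check that frequency = 7 really lets a method pick up the weekly pattern, here is a small sketch on simulated data. The weekday effects are made up, and stl() from base R is used as a stand-in for ets() only so that the example runs without the forecast package.

```r
set.seed(1)

## Made-up weekly pattern with a clear peak on day 6 of each week
weekday_effect <- c(0, 1, 2, 3, 4, 10, 5)
n_weeks <- 20
x <- rep(weekday_effect, times = n_weeks) + rnorm(7 * n_weeks)

## Weekly seasonality: one cycle is 7 observations
y <- ts(x, frequency = 7)

## Seasonal-trend decomposition; "periodic" forces the same pattern every week
dec <- stl(y, s.window = "periodic")
seas <- as.numeric(dec$time.series[, "seasonal"])

peak_day <- which.max(seas[1:7])   # position within the week of the largest effect
print(round(seas[1:7], 2))
```

The seasonal component repeats every 7 observations and peaks at the position where the simulated effect was largest, which is what a model fitted to y has to work with.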

When the time series is long enough to span more than a year, it may be necessary to allow for annual seasonality as well as weekly seasonality. In that case, a multiple seasonal model such as TBATS is required.

    y <- msts(x, seasonal.periods=c(7,365.25))
    fit <- tbats(y)
    fc <- forecast(fit)
    plot(fc)

This should capture the weekly pattern as well as the longer annual pattern. The period 365.25 is the average length of a year, allowing for leap years. In some countries, alternative or additional year lengths may be necessary.

Capturing seasonality associated with moving events such as Easter or the Chinese New Year is more difficult. Even with monthly data this can be tricky, as the festivals can fall in either March or April (for Easter) or in January or February (for the Chinese New Year). Ordinary seasonal models do not allow for this. The best way to deal with moving holiday effects is to use dummy variables. However, neither ETS nor TBATS models allow for covariates. One could use a state space model of the same form as TBATS but with multiple sources of error and covariates, but we do not have any R code for this.

Instead, we can use a regression model with ARIMA errors, where the regression terms include any dummy holiday effects as well as the longer annual seasonality. Unless there are many decades of data, it is usually reasonable to assume that the annual seasonal shape is unchanged from year to year, so Fourier terms can be used to model the annual seasonality. Suppose we use K = 5 Fourier terms to model annual seasonality, and that the holiday dummy variables are in the vector holiday, with 100 future values in holidayf. Then the following code will fit an appropriate model.

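    library(forecast)
    y <- ts(x, frequency=7)
    z <- fourier(ts(x, frequency=365.25), K=5)
    ## fourierf() is deprecated in recent versions of forecast;
    ## fourier(ts(x, frequency=365.25), K=5, h=100) is the newer form
    zf <- fourierf(ts(x, frequency=365.25), K=5, h=100)
    fit <- auto.arima(y, xreg=cbind(z,holiday), seasonal=FALSE)
    fc <- forecast(fit, xreg=cbind(zf,holidayf), h=100)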
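The vectors holiday and holidayf are not constructed in the text. One possible way to build them, assuming the daily date range from the question and an illustrative list of holidays (both are assumptions, not part of the original answer), is sketched below in base R.

```r
## Daily calendar for the observed series and for the 100-day forecast horizon
dates  <- seq(as.Date("2014-06-01"), as.Date("2015-10-14"), by = "day")
future <- seq(tail(dates, 1) + 1, by = "day", length.out = 100)

## Illustrative holiday list (an assumption): Christmas Day, New Year's Day
## and Easter Sunday 2015; replace with the holidays that matter for your data
holidays <- as.Date(c("2014-12-25", "2015-01-01", "2015-04-05",
                      "2015-12-25", "2016-01-01"))

## Dummy variables: 1 on a holiday, 0 otherwise
holiday  <- as.integer(dates  %in% holidays)   # aligned with the observed series
holidayf <- as.integer(future %in% holidays)   # aligned with the forecast period

print(c(in_sample = sum(holiday), out_of_sample = sum(holidayf)))
```

Both vectors have to line up row by row with the Fourier matrices z and zf before they are combined with cbind().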

The order K can be chosen by minimizing the AIC of the fitted model.
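As a rough illustration of choosing K by AIC, the sketch below builds the sine and cosine pairs by hand and compares candidate values of K on simulated data. The helper fourier_terms is hypothetical and only mimics the kind of columns a Fourier regressor matrix contains, and lm() is used instead of auto.arima() with xreg to keep the example in base R, so treat it as a simplification rather than the procedure described above.

```r
## Hypothetical helper: sin(2*pi*k*t/period) and cos(2*pi*k*t/period), k = 1..K
fourier_terms <- function(tt, period, K) {
  cols <- lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * tt / period), cos(2 * pi * k * tt / period))
  })
  X <- do.call(cbind, cols)
  colnames(X) <- paste0(rep(c("S", "C"), times = K), rep(seq_len(K), each = 2))
  X
}

set.seed(123)
n  <- 730
tt <- seq_len(n)

## Simulated daily series with two true annual harmonics plus noise
x <- 3 * sin(2 * pi * tt / 365.25) +
     2 * cos(2 * pi * 2 * tt / 365.25) +
     rnorm(n, sd = 0.5)

## Fit one regression per candidate K and record its AIC
candidates <- 1:6
aic <- sapply(candidates, function(K) AIC(lm(x ~ fourier_terms(tt, 365.25, K))))

best_K <- candidates[which.min(aic)]
print(data.frame(K = candidates, AIC = round(aic, 1)))
```

With the real model, the same loop would refit auto.arima() for each K and compare the AIC values of the fitted models.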

+2
 series <- ts(visitors, frequency=365, start=c(2014, 152)) 

152 corresponds to 01-06-2014: with frequency = 365, the series starts on the 152nd day of the year. To forecast 60 days ahead, use h = 60.

 forecast(arimadata , h=60) 
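The value 152 in start can be checked directly. The sketch below compares the start from the question, c(2014, 6), with the day-of-year version; the helper name daily_start is hypothetical and everything else is base R.

```r
## Hypothetical helper: ts() start vector for a daily series with frequency = 365
daily_start <- function(first_day) {
  first_day <- as.Date(first_day)
  c(as.numeric(format(first_day, "%Y")),   # year
    as.numeric(format(first_day, "%j")))   # day of the year
}

inds <- seq(as.Date("2014-06-01"), as.Date("2015-10-14"), by = "day")
start_vec <- daily_start(inds[1])

## start from the question: ts() reads this as the 6th day of 2014 (6 January)
wrong <- ts(seq_along(inds), start = c(2014, 6), frequency = 365)
## start from the day of the year: 1 June 2014
right <- ts(seq_along(inds), start = start_vec, frequency = 365)

first_wrong <- time(wrong)[1]   # 2014 + 5/365
first_right <- time(right)[1]   # 2014 + 151/365
print(c(wrong = first_wrong, right = first_right))
```

With frequency = 365, the second element of start counts days within the year, not months, which is why 6 places the series in early January.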
0

This is how I created the time series when I was given daily observations with a few observations missing. @gavin-simpson was of great help. Hope this saves someone some grief.

The initial data looked something like this:

    library(lubridate)
    set.seed(42)
    minday = as.Date("2001-01-01")
    maxday = as.Date("2005-12-31")
    dates <- seq(minday, maxday, "days")
    dates <- dates[sample(1:length(dates),length(dates)/4)] # create some holes
    df <- data.frame(date=sort(dates), val=sin(seq(from=0, to=2*pi, length=length(dates))))

To create a time series from this data, I created a "dummy" data frame with one row per date and merged it with the existing data frame:

 df <- merge(df, data.frame(date=seq(minday, maxday, "days")), all=T) 

This data frame can now be cast into a time series. Missing dates are NA.

    nts <- ts(df$val, frequency=365, start=c(year(minday), as.numeric(format(minday, "%j"))))
    plot(nts)

[Figure: holey sine wave]
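The merge step is easier to follow on a tiny example. The sketch below uses a made-up ten-day window with three days missing, and only base R; the names all_days, obs and full are invented for the illustration.

```r
## Full calendar for a short, made-up window
all_days <- seq(as.Date("2001-01-01"), as.Date("2001-01-10"), by = "day")

## Observations with three days missing (days 3, 4 and 8 dropped)
obs <- data.frame(date = all_days[-c(3, 4, 8)],
                  val  = c(5, 7, 6, 9, 8, 4, 3))

## Merge onto the full calendar; days without an observation get val = NA
full <- merge(data.frame(date = all_days), obs, by = "date", all.x = TRUE)

## One value per calendar day, so the series can go into ts()
nts <- ts(full$val, frequency = 365, start = c(2001, 1))
print(full)
```

After the merge there is exactly one row per calendar day, so the positions in the ts object line up with the dates even where values are missing.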

0

I have a similar problem: my data is stock data pulled from Yahoo Finance for the dates 2010-6-1 through 2018-10-30. Bearing in mind that there are weekends and holidays, how do I set it up?

0