Using ARMA with statsmodels

I'm a bit new here, but I'm trying to get an ARMA prediction tool working with statsmodels. I imported some stock data from Yahoo and got ARMA to give me fitted parameters. However, when I run the forecast code, all I get is a list of errors that I can't make sense of. I'm not sure what I'm doing wrong here:

    import pandas
    import statsmodels.tsa.api as tsa
    from pandas.io.data import DataReader

    start = pandas.datetime(2013,1,1)
    end = pandas.datetime.today()

    data = DataReader('GOOG','yahoo')
    arma =tsa.ARMA(data['Close'], order =(2,2))
    results= arma.fit()
    results.predict(start=start,end=end)

Errors:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    C:\Windows\system32\<ipython-input-84-25a9b6bc631d> in <module>()
         13 results= arma.fit()
         14 results.summary()
    ---> 15 results.predict(start=start,end=end)

    D:\Python27\lib\site-packages\statsmodels-0.5.0-py2.7.egg\statsmodels\base\wrapper.pyc in wrapper(self, *args, **kwargs)
         88     results = object.__getattribute__(self, '_results')
         89     data = results.model.data
    ---> 90     return data.wrap_output(func(results, *args, **kwargs), how)
         91
         92 argspec = inspect.getargspec(func)

    D:\Python27\lib\site-packages\statsmodels-0.5.0-py2.7.egg\statsmodels\tsa\arima_model.pyc in predict(self, start, end, exog, dynamic)
       1265
       1266         """
    -> 1267         return self.model.predict(self.params, start, end, exog, dynamic)
       1268
       1269     def forecast(self, steps=1, exog=None, alpha=.05):

    D:\Python27\lib\site-packages\statsmodels-0.5.0-py2.7.egg\statsmodels\tsa\arima_model.pyc in predict(self, params, start, end, exog, dynamic)
        497
        498         # will return an index of a date
    --> 499         start = self._get_predict_start(start, dynamic)
        500         end, out_of_sample = self._get_predict_end(end, dynamic)
        501         if out_of_sample and (exog is None and self.k_exog > 0):

    D:\Python27\lib\site-packages\statsmodels-0.5.0-py2.7.egg\statsmodels\tsa\arima_model.pyc in _get_predict_start(self, start, dynamic)
        404         #elif 'mle' not in method or dynamic: # should be on a date
        405         start = _validate(start, k_ar, k_diff, self.data.dates,
    --> 406                           method)
        407         start = super(ARMA, self)._get_predict_start(start)
        408         _check_arima_start(start, k_ar, k_diff, method, dynamic)

    D:\Python27\lib\site-packages\statsmodels-0.5.0-py2.7.egg\statsmodels\tsa\arima_model.pyc in _validate(start, k_ar, k_diff, dates, method)
        160     if isinstance(start, (basestring, datetime)):
        161         start_date = start
    --> 162         start = _index_date(start, dates)
        163         start -= k_diff
        164     if 'mle' not in method and start < k_ar - k_diff:

    D:\Python27\lib\site-packages\statsmodels-0.5.0-py2.7.egg\statsmodels\tsa\base\datetools.pyc in _index_date(date, dates)
         37         freq = _infer_freq(dates)
         38         # we can start prediction at the end of endog
    ---> 39         if _idx_from_dates(dates[-1], date, freq) == 1:
         40             return len(dates)
         41

    D:\Python27\lib\site-packages\statsmodels-0.5.0-py2.7.egg\statsmodels\tsa\base\datetools.pyc in _idx_from_dates(d1, d2, freq)
         70         from pandas import DatetimeIndex
         71         return len(DatetimeIndex(start=d1, end=d2,
    ---> 72                                  freq = _freq_to_pandas[freq])) - 1
         73     except ImportError, err:
         74         from pandas import DateRange

    D:\Python27\lib\site-packages\statsmodels-0.5.0-py2.7.egg\statsmodels\tsa\base\datetools.pyc in __getitem__(self, key)
         11         # being lazy, don't want to replace dictionary below
         12         def __getitem__(self, key):
    ---> 13             return get_offset(key)
         14     _freq_to_pandas = _freq_to_pandas_class()
         15 except ImportError, err:

    D:\Python27\lib\site-packages\pandas\tseries\frequencies.pyc in get_offset(name)
        484     """
        485     if name not in _dont_uppercase:
    --> 486         name = name.upper()
        487
        488     if name in _rule_aliases:

    AttributeError: 'NoneType' object has no attribute 'upper'
2 answers

This looks like a bug. I'll look into it.

https://github.com/statsmodels/statsmodels/issues/712

Edit: As a workaround, you can drop the DatetimeIndex and pass the model a plain numpy array instead of a pandas object. This makes predicting by date a bit more awkward, but using dates for prediction is already hard when the index has no frequency, so having only start and end dates is of little use anyway.

    import pandas
    import statsmodels.tsa.api as tsa
    from pandas.io.data import DataReader

    data = DataReader('GOOG','yahoo')
    dates = data.index

    # start at a date on the index
    start = dates.get_loc(pandas.datetools.parse("1-2-2013"))
    end = start + 30 # "steps"

    # NOTE THE .values
    arma =tsa.ARMA(data['Close'].values, order =(2,2))
    results= arma.fit()
    results.predict(start, end)

When I run my code, I get:

"ValueError: There is no frequency for these dates and date 2013-01-01 00:00:00 is not in dates index. Try giving a date that is in the dates index or use an integer"

Because trading dates are unevenly spaced (weekends and holidays are skipped), the model cannot infer a frequency for the index.
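To see why no frequency can be found, here is a minimal sketch using `pandas.infer_freq` on a made-up calendar. The dates and the removed "holiday" are assumptions chosen for illustration, not the actual GOOG index.

```python
import pandas as pd

# A regular Monday-to-Friday calendar: pandas can infer a business-day frequency.
regular = pd.bdate_range("2013-01-07", periods=10)
print(pd.infer_freq(regular))

# Hypothetical trading calendar: the same dates with one mid-week holiday removed.
holiday = pd.Timestamp("2013-01-09")
trading = regular[regular != holiday]

# The spacing no longer fits a single rule, so no frequency is inferred.
print(pd.infer_freq(trading))
```

A real exchange calendar has holidays like this every year, which is why a date-based `start` and `end` cannot be resolved against it.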

If you replace the dates with their integer locations in the index, you will get your forecasts. You can then simply reattach the original index to the results.

    prediction = results.predict(start=0, end=len(data) - 1)
    prediction.index = data.index
    print(prediction)

    2010-01-04    689.507451
    2010-01-05    627.085986
    2010-01-06    624.256331
    2010-01-07    608.133481
    ...
    2017-05-09    933.700555
    2017-05-10    931.290023
    2017-05-11    927.781427
    2017-05-12    929.661014
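If the date you want to start from is not itself on the index (1 January 2013 is a market holiday, for example), one possible way to turn it into an integer location is `DatetimeIndex.searchsorted`. This is only a sketch: the calendar below is synthetic and `position_on_or_after` is a hypothetical helper name, not part of pandas or statsmodels.

```python
import pandas as pd

# Synthetic trading calendar: business days with the 1 January holiday removed.
index = pd.bdate_range("2012-12-24", "2013-01-11")
index = index[index != pd.Timestamp("2013-01-01")]


def position_on_or_after(index, date):
    """Return the integer position of the first index entry >= date.

    If date lies after the last entry, this returns len(index).
    """
    return int(index.searchsorted(pd.Timestamp(date), side="left"))


start = position_on_or_after(index, "2013-01-01")
print(start, index[start])
```

The resulting integer can then be passed as `start` in place of the date.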

Alternatively, you may want to fit a model like this to daily returns rather than raw prices. Fitting it to raw prices will not capture momentum and mean reversion the way you probably expect. Your model is based on absolute price levels, not on price changes, momentum, moving averages, or the other factors you probably want to use. The predictions you generate will look pretty good because they only predict one step ahead, so the error never compounds. This confuses a lot of people. The errors will look small compared to the absolute stock price, but the model will not be very predictive.
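As a rough illustration of that last point, the sketch below uses a simulated random walk (an assumption, not the GOOG data) and a naive forecast that simply repeats the previous day's price. It also shows how daily returns can be computed with `pct_change`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical price series: a random walk in log prices with ~1% daily moves.
log_returns = rng.normal(loc=0.0, scale=0.01, size=500)
prices = pd.Series(900.0 * np.exp(np.cumsum(log_returns)))

# Daily simple returns: the series suggested as the modelling target.
returns = prices.pct_change().dropna()

# Naive one-step forecast: tomorrow's price equals today's price.
naive = prices.shift(1).dropna()
actual = prices.iloc[1:]
mape = float(((actual - naive).abs() / actual).mean())

print("naive forecast, mean absolute % error:", round(100 * mape, 2))
print("number of daily returns:", len(returns))
```

The naive forecast has no predictive content at all, yet its error is small relative to the price level, which is why one-step-ahead fits on raw prices tend to look better than they are.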

I suggest reading this walkthrough to get started:

http://www.johnwittenauer.net/a-simple-time-series-analysis-of-the-sp-500-index/
