For my assessment, I have a dataset, available at this link ( https://drive.google.com/drive/folders/0B2Iv8dfU4fTUMVFyYTEtWXlzYkk ), in the following format. The third column (Y) is the true value, which is what I want to predict (evaluate).
    time      X   Y
    0.000543  0   10
    0.000575  0   10
    0.041324  1   10
    0.041331  2   10
    0.041336  3   10
    0.04134   4   10
    ...
    9.987735  55  239
    9.987739  56  239
    9.987744  57  239
    9.987749  58  239
    9.987938  59  239
I want to start by fitting a rolling OLS regression with, for example, a window of 5 observations, and I tried it with the following script.
    #!/usr/bin/python -tt
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd

    df = pd.read_csv('estimated_pred.csv')

    # Rolling OLS of Y on X over a 5-row window (requires pandas < 0.20.0)
    model = pd.stats.ols.MovingOLS(y=df.Y, x=df[['X']],
                                   window_type='rolling',
                                   window=5, intercept=True)
    df['Y_hat'] = model.y_predict

    print(df['Y_hat'])
    print(model.summary)

    # Scatter plot of the raw data
    df.plot.scatter(x='X', y='Y', s=0.1)
    plt.show()
A summary of the regression analysis is shown below.
    -------------------------Summary of Regression Analysis-------------------------

    Formula: Y ~ <X> + <intercept>

    Number of Observations:         5
    Number of Degrees of Freedom:   2

    R-squared:           -inf
    Adj R-squared:       -inf

    Rmse:              0.0000

    F-stat (1, 3):        nan, p-value:        nan

    Degrees of Freedom: model 1, resid 3

    -----------------------Summary of Estimated Coefficients------------------------
          Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
    --------------------------------------------------------------------------------
                 X     0.0000     0.0000       1.97     0.1429     0.0000     0.0000
         intercept   239.0000     0.0000 14567091934632472.00     0.0000   239.0000   239.0000
    ---------------------------------End of Summary---------------------------------

I want to predict Y at t+1, i.e. to predict the next value of Y from the previous values, p(Y)t+1, and to report the mean squared error (MSE). For example, if we look at line 5, the value of X is 2 and the value of Y is 10. Let's say the predicted value p(Y)t+1 is 6; the squared error for that row is then (10-6)^2 = 16, and the MSE is the average of these squared errors over all predicted rows.

How can I do this using statsmodels or scikit-learn, given that pd.stats.ols.MovingOLS was removed in pandas version 0.20.0 and I cannot find a reference for a replacement?
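
To make the goal concrete, here is a minimal sketch of the computation I have in mind, written with plain NumPy rather than statsmodels or scikit-learn. It is only my assumption of how the rolling fit and the forecast should be defined, not a reproduction of what MovingOLS did: for each row t, fit Y = a + b*X on the previous 5 rows and predict Y at row t from its X. The function names `rolling_ols_forecast` and `mean_squared_error` are made up for illustration, and the six hard-coded rows stand in for the CSV.

```python
import numpy as np


def rolling_ols_forecast(x, y, window=5):
    """One-step-ahead forecasts from a rolling OLS fit of y on x.

    For each row t >= window, fit y = a + b*x on the previous `window`
    rows and predict y[t] from x[t]. The first `window` entries are NaN.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.full(len(y), np.nan)
    for t in range(window, len(y)):
        # Design matrix [1, x] built from the rows strictly before t.
        A = np.column_stack([np.ones(window), x[t - window:t]])
        # lstsq falls back to the minimum-norm solution when x is
        # constant inside the window (rank-deficient design).
        coef, *_ = np.linalg.lstsq(A, y[t - window:t], rcond=None)
        preds[t] = coef[0] + coef[1] * x[t]
    return preds


def mean_squared_error(y, preds):
    """MSE over the rows that have a forecast (NaN rows are skipped)."""
    y = np.asarray(y, dtype=float)
    mask = ~np.isnan(preds)
    return float(np.mean((y[mask] - preds[mask]) ** 2))


# First six rows of the sample shown above (stand-in for the CSV).
x = [0, 0, 1, 2, 3, 4]
y = [10, 10, 10, 10, 10, 10]

preds = rolling_ols_forecast(x, y, window=5)
mse = mean_squared_error(y, preds)
print(preds)
print(mse)
```

With real data the two columns would come from `df['X']` and `df['Y']` instead of the hard-coded lists. Each forecast here uses only rows before t, so the error is out-of-sample; whether that matches the intended definition of p(Y)t+1 is part of what I am asking.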