I strongly suspect you are taking the same online course as I am; the approach below gives the right answers. If the task at hand is not computationally heavy (and here it is not), we can bypass all the clever details of the step function and simply try every subset of predictors.
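For each subset, we can calculate the AIC as AIC = 2*nvars - 2*result.llf, where nvars counts the fitted coefficients (including the intercept) and result.llf is the log-likelihood of the fitted model.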
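For reference, here is the same formula written out, together with the closed form that the log-likelihood takes for an ordinary least squares fit with Gaussian errors. The symbols are my own notation, not from the course: $p$ is the number of fitted coefficients including the intercept, $n$ the number of observations, and $\mathrm{RSS}$ the residual sum of squares. As far as I know this is what `result.llf` evaluates to for `sm.OLS`, but treat that as an assumption and check it against your own fit.

```latex
\mathrm{AIC} = 2p - 2\ln\hat{L},
\qquad
\ln\hat{L} = -\frac{n}{2}\left(\ln(2\pi) + \ln\frac{\mathrm{RSS}}{n} + 1\right)
```

Some references also count the error variance as a parameter, which adds the same constant 2 to every model and so does not change which subset wins.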
And then we just find a subset with minimal AIC:
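    import itertools
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # train, target and predictorcols are assumed to be defined already
    AICs = {}
    for k in range(1, len(predictorcols) + 1):
        for variables in itertools.combinations(predictorcols, k):
            # copy so that adding a column does not modify train
            predictors = train[list(variables)].copy()
            predictors['Intercept'] = 1
            res = sm.OLS(target, predictors).fit()
            # k predictors plus the intercept
            AICs[variables] = 2 * (k + 1) - 2 * res.llf

    # subset of predictors with the smallest AIC
    pd.Series(AICs).idxmin()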
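If you want to try the idea without the course data, here is a minimal self-contained sketch that uses only NumPy. Everything in it is an assumption for illustration: the helper names `ols_aic` and `best_subset_by_aic`, the synthetic columns `x1` to `x4`, and the Gaussian log-likelihood computed by hand in place of `res.llf`.

```python
import itertools
import numpy as np


def ols_aic(X, y):
    """AIC of a Gaussian OLS fit. X must already contain the intercept column."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    # Gaussian log-likelihood evaluated at the least squares estimate
    llf = -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)
    return 2 * p - 2 * llf


def best_subset_by_aic(data, target, names):
    """Fit every non-empty subset of names; return (best_subset, all_aics)."""
    n = len(target)
    aics = {}
    for k in range(1, len(names) + 1):
        for subset in itertools.combinations(names, k):
            # intercept column first, then the chosen predictors
            X = np.column_stack([np.ones(n)] + [data[v] for v in subset])
            aics[subset] = ols_aic(X, target)
    return min(aics, key=aics.get), aics


# Synthetic data (hypothetical): y depends on x1 and x2 only.
rng = np.random.default_rng(0)
n = 200
data = {name: rng.normal(size=n) for name in ["x1", "x2", "x3", "x4"]}
y = 3.0 * data["x1"] - 2.0 * data["x2"] + rng.normal(scale=0.5, size=n)

best, aics = best_subset_by_aic(data, y, list(data))
print(best)
```

With p candidate predictors this fits 2^p - 1 models (15 here), so it is only practical for a small number of columns; beyond that a stepwise search becomes worth the trouble.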
Kostya