Variable selection for regression in Python

I have a basic linear regression problem with 80 numerical variables (no categorical variables). The training set has 1600 rows and the test set has 700.

I need a Python package that iterates through all combinations of columns and finds the best one according to a user-defined evaluation function or a criterion such as AIC. If no such package exists, what do people here use for variable selection? I know R has packages like this, but I don't want to deal with Rpy2.

I have no preference about which library I need to learn for the linear model: scikit-learn, numpy, pandas, statsmodels, or another.

1 answer

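I would suggest trying regularized regression (Lasso). Its L1 penalty shrinks coefficients and can set some of them to exactly zero, which in effect selects variables.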

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html

For comparison, with statsmodels an ordinary least squares fit looks like this:

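import statsmodels.api as sm

# OLS takes the data in its constructor (response first); fit() takes no data.
# add_constant adds an intercept column, which OLS does not include by default.
model = sm.OLS(train_Y, sm.add_constant(train_X))
results = model.fit()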

With Lasso in scikit-learn, the code is similar:

from sklearn import linear_model

model = linear_model.Lasso(alpha=1.0)  # alpha=1.0 is the default
results = model.fit(train_X, train_Y)  # fit() returns the fitted model itself
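
To see which variables a Lasso fit keeps, one can look at which entries of `coef_` are nonzero. This is a rough sketch on synthetic data (an assumption standing in for the real `train_X` and `train_Y`); the value `alpha=0.1` is arbitrary, and the standardization step is a common precaution rather than something the answer prescribes.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: only columns 0 and 3 matter.
rng = np.random.default_rng(0)
train_X = rng.normal(size=(200, 10))
train_Y = 3.0 * train_X[:, 0] - 2.0 * train_X[:, 3] + rng.normal(scale=0.5, size=200)

# Put all columns on a common scale so the penalty treats them equally.
X_scaled = StandardScaler().fit_transform(train_X)

model = Lasso(alpha=0.1).fit(X_scaled, train_Y)

# Indices of columns whose coefficient was not shrunk to exactly zero.
kept = np.flatnonzero(model.coef_)
print("kept columns:", kept.tolist())
```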

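You can try values of alpha between 0.0 and 1.0. Larger values shrink more coefficients to exactly zero, while alpha = 0 corresponds to ordinary least squares.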
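
Rather than fixing alpha by hand, scikit-learn can choose it from the data. The sketch below, again on synthetic data that stands in for the real training set, shows two options: `LassoCV`, which picks alpha by cross-validation, and `LassoLarsIC`, which picks it by minimizing an information criterion such as AIC. Which one suits the real problem better is a judgment call.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LassoLarsIC
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: only columns 0 and 3 matter.
rng = np.random.default_rng(0)
train_X = rng.normal(size=(200, 10))
train_Y = 3.0 * train_X[:, 0] - 2.0 * train_X[:, 3] + rng.normal(scale=0.5, size=200)
X_scaled = StandardScaler().fit_transform(train_X)

# Choose alpha by 5-fold cross-validation over an automatically built grid.
cv_model = LassoCV(cv=5, random_state=0).fit(X_scaled, train_Y)

# Choose alpha by minimizing AIC along the LARS path.
aic_model = LassoLarsIC(criterion="aic").fit(X_scaled, train_Y)

for name, fitted in [("cross-validation", cv_model), ("AIC", aic_model)]:
    kept = np.flatnonzero(fitted.coef_).tolist()
    print(name, "alpha:", fitted.alpha_, "kept columns:", kept)
```

Both fitted models expose the chosen value as `alpha_` and the coefficients as `coef_`, so the selected columns can be read off in the same way.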
