One standard error rule for cross validation in scikit-learn

I am fitting some models in scikit-learn with GridSearchCV, and I would like to use the "one standard error" rule to select the best model: that is, from the subset of models whose score is within one standard error of the best score, choose the most economical one. Is there a way to do this?

1 answer

You can compute the standard error of the mean validation score with SciPy:

from scipy.stats import sem 

Then access the grid_scores_ attribute of the fitted GridSearchCV object. Note that this attribute has changed in the main scikit-learn branch (it was later replaced by cv_results_ in version 0.18), so inspect its structure in an interactive shell for the version you are running.
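As a sketch of the whole procedure, assuming a scikit-learn version with cv_results_ (0.18 or later) and using LogisticRegression, where a smaller C means stronger regularization and hence a simpler model:

```python
import numpy as np
from scipy.stats import sem
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X, y)

# Collect per-fold test scores for each candidate: shape (n_candidates, n_splits)
n_splits = search.n_splits_
fold_scores = np.vstack(
    [search.cv_results_[f"split{i}_test_score"] for i in range(n_splits)]
).T

mean_scores = fold_scores.mean(axis=1)
std_errors = sem(fold_scores, axis=1)  # standard error of the mean per candidate

# One-standard-error rule: among candidates whose mean score is within
# one SE of the best mean score, pick the most economical (smallest C).
best = mean_scores.argmax()
threshold = mean_scores[best] - std_errors[best]
candidates = [i for i, m in enumerate(mean_scores) if m >= threshold]
most_economical = min(candidates, key=lambda i: search.cv_results_["params"][i]["C"])
print(search.cv_results_["params"][most_economical])
```

The "economical" criterion here (smallest C) is specific to this estimator; you would substitute whatever notion of simplicity applies to your model.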

As for choosing the most economical model: hyperparameters do not always have a natural "complexity" interpretation. Their meaning depends on the model class, and scikit-learn provides no high-level metadata describing their "strength", so you have to encode that interpretation yourself, case by case, for each model class.
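One hedged way to encode that case-by-case interpretation is a hand-written mapping from model class to a complexity key (the mapping and function names below are illustrative, not part of any API):

```python
# Assumed, illustrative mapping: smaller key value = more economical model.
complexity_key = {
    "LogisticRegression": lambda p: p["C"],          # smaller C = stronger regularization
    "Lasso": lambda p: -p["alpha"],                  # larger alpha = sparser model
    "DecisionTreeClassifier": lambda p: p["max_depth"],  # shallower tree = simpler
}

def most_economical(model_name, candidate_params):
    """Pick the simplest parameter set among candidates within one SE of the best."""
    return min(candidate_params, key=complexity_key[model_name])

print(most_economical("Lasso", [{"alpha": 0.1}, {"alpha": 1.0}]))
# For Lasso, the larger alpha (sparser model) wins.
```

Each entry encodes one model family's notion of simplicity; you would extend the dict for whatever estimators appear in your grid search.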
