One standard error rule for cross validation in scikit-learn

I am fitting some models in scikit-learn with GridSearchCV, and I would like to use the "one standard error" rule to select the best model: that is, from the subset of models whose score is within one standard error of the best score, choose the most economical one. Is there a way to do this?

1 answer

You can compute the standard error of the mean validation score with SciPy:

from scipy.stats import sem 

Then access the grid_scores_ attribute of the fitted GridSearchCV object. Note that this attribute has changed in the main scikit-learn branch (it was later replaced by cv_results_ in version 0.18), so inspect its structure in an interactive shell for the version you are running.
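As a sketch of the whole procedure, assuming a scikit-learn version with cv_results_ (0.18 or later) and using LogisticRegression, where a smaller C means stronger regularization and hence a simpler model:

```python
import numpy as np
from scipy.stats import sem
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X, y)

# Collect per-fold test scores for each candidate: shape (n_candidates, n_splits)
n_splits = search.n_splits_
fold_scores = np.vstack(
    [search.cv_results_[f"split{i}_test_score"] for i in range(n_splits)]
).T

mean_scores = fold_scores.mean(axis=1)
std_errors = sem(fold_scores, axis=1)  # standard error of the mean per candidate

# One-standard-error rule: among candidates whose mean score is within
# one SE of the best mean score, pick the most economical (smallest C).
best = mean_scores.argmax()
threshold = mean_scores[best] - std_errors[best]
candidates = [i for i, m in enumerate(mean_scores) if m >= threshold]
most_economical = min(candidates, key=lambda i: search.cv_results_["params"][i]["C"])
print(search.cv_results_["params"][most_economical])
```

The "economical" criterion here (smallest C) is specific to this estimator; you would substitute whatever notion of simplicity applies to your model.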

As for choosing the most economical model: hyperparameters do not always have a natural "complexity" interpretation. Their meaning depends on the model class, and scikit-learn provides no high-level metadata describing their "strength", so you have to encode that interpretation yourself, case by case, for each model class.
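One hedged way to encode that case-by-case interpretation is a hand-written mapping from model class to a complexity key (the mapping and function names below are illustrative, not part of any API):

```python
# Assumed, illustrative mapping: smaller key value = more economical model.
complexity_key = {
    "LogisticRegression": lambda p: p["C"],          # smaller C = stronger regularization
    "Lasso": lambda p: -p["alpha"],                  # larger alpha = sparser model
    "DecisionTreeClassifier": lambda p: p["max_depth"],  # shallower tree = simpler
}

def most_economical(model_name, candidate_params):
    """Pick the simplest parameter set among candidates within one SE of the best."""
    return min(candidate_params, key=complexity_key[model_name])

print(most_economical("Lasso", [{"alpha": 0.1}, {"alpha": 1.0}]))
# For Lasso, the larger alpha (sparser model) wins.
```

Each entry encodes one model family's notion of simplicity; you would extend the dict for whatever estimators appear in your grid search.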
