Random forests - probability estimates (+ scikit-learn specific)

I would like to understand how random forests compute probability estimates, both in general and specifically in the scikit-learn Python library (where the estimates are returned by predict_proba).

Thanks, Guy

+6
2 answers

The probabilities returned by the forest are the average of the probabilities returned by the individual trees in the ensemble (see the docs). The probabilities returned by a single tree are the normalized class histogram of the leaf into which the sample falls, i.e. the fraction of training samples of each class in that leaf.
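As a rough check of this description, here is a minimal sketch that compares the forest's predict_proba with the mean of the per-tree outputs, and one tree's output with the normalized class values stored in its leaves. The iris dataset, the number of trees, and all variable names are assumptions made for illustration. Reading tree_.value directly relies on a low-level attribute whose scaling has differed between scikit-learn versions, which is why it is normalized explicitly here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative data and settings (not from the original question).
X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

# Probabilities as returned by the forest.
forest_proba = forest.predict_proba(X)

# Average of the probabilities returned by each tree in the ensemble.
tree_avg = np.mean([t.predict_proba(X) for t in forest.estimators_], axis=0)

# For a single tree: find the leaf each sample falls into, then
# normalize the per-class values stored in that leaf so they sum to 1.
tree = forest.estimators_[0]
leaf_ids = tree.apply(X)
leaf_values = tree.tree_.value[leaf_ids, 0, :]
leaf_proba = leaf_values / leaf_values.sum(axis=1, keepdims=True)

print("forest == mean of trees:", np.allclose(forest_proba, tree_avg))
print("tree == normalized leaf histogram:", np.allclose(tree.predict_proba(X), leaf_proba))
```

Both comparisons should agree up to floating-point tolerance if the description above matches the installed scikit-learn version.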

+11

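In addition to what Andreas / Dougal said: when you train the RF, set compute_importances=True (needed only in older scikit-learn releases; newer ones compute the importances automatically and no longer accept this parameter). Then check classifier.feature_importances_ to see which features are used high up in the RF trees.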
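A minimal sketch of inspecting the importances, assuming a recent scikit-learn release where feature_importances_ is available after fitting without any extra flag. The iris dataset and the variable names are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative data and settings (not from the original answer).
data = load_iris()
classifier = RandomForestClassifier(n_estimators=50, random_state=0)
classifier.fit(data.data, data.target)

# Pair each feature name with its importance and sort, most important first.
ranked = sorted(
    zip(data.feature_names, classifier.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

The importances are normalized to sum to 1, so each value can be read as a relative share.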

+2
