Random forests - probability estimates (+ scikit-learn specific)

Question


I am interested in understanding how probability estimates are computed by random forests, both in general and specifically in the scikit-learn Python library (where the probability estimate is returned by predict_proba).

Thanks, Guy

+6

scikit-learn machine-learning

Guy adini Jan 7 '13 at 8:30


2 answers

In addition to what Andreas / Dougal said: when you train the RF, set compute_importances=True. Then check classifier.feature_importances_ to see which features tend to appear high up in the RF trees.
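A minimal sketch of that check, assuming a recent scikit-learn release in which feature_importances_ is available after fit without passing compute_importances (as far as I know, that flag only existed in older releases). The iris data set and the variable names are illustrative choices, not part of the original answer.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative data set; any labelled feature matrix would do.
data = load_iris()
X, y = data.data, data.target

# Assumption: recent scikit-learn, where importances are computed
# automatically and no compute_importances argument is needed.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

# One importance score per feature; the scores are normalized to sum to 1.
importances = clf.feature_importances_

# List the features from most to least important.
order = np.argsort(importances)[::-1]
for idx in order:
    print(f"{data.feature_names[idx]}: {importances[idx]:.3f}")
```

Features with a larger score contributed more to the impurity reduction across the trees, which usually goes together with being split on near the top.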

+2

smci Mar 19 '13 at 22:59


Andreas Mueller · Accepted Answer · 2013-01-07T10:00:01+0000

The probabilities returned by the forest are the mean of the probabilities returned by the individual trees in the ensemble ( docs ). The probabilities returned by a single tree are the normalized class histogram of the leaf into which the sample falls.
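Both statements can be checked numerically. The following is only a sketch: the iris data, the shallow max_depth (chosen so that leaves are not pure) and the variable names are my own assumptions; estimators_, apply, tree_.value and predict_proba are the scikit-learn attributes and methods it relies on.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Shallow trees so that leaves hold mixed classes and the
# probabilities are not all 0 or 1.
forest = RandomForestClassifier(n_estimators=25, max_depth=2, random_state=0)
forest.fit(X, y)

# 1) Forest level: average the per-tree probability estimates by hand.
per_tree = np.stack([tree.predict_proba(X) for tree in forest.estimators_])
manual_forest = per_tree.mean(axis=0)
forest_matches = np.allclose(manual_forest, forest.predict_proba(X))

# 2) Tree level: look up the leaf each sample lands in and normalize
#    the class histogram stored for that leaf.
first_tree = forest.estimators_[0]
leaf_ids = first_tree.apply(X)                      # leaf index per sample
leaf_hist = first_tree.tree_.value[leaf_ids, 0, :]  # class weights per leaf
leaf_proba = leaf_hist / leaf_hist.sum(axis=1, keepdims=True)
tree_matches = np.allclose(leaf_proba, first_tree.predict_proba(X))

print(forest_matches, tree_matches)
```

Normalizing explicitly keeps the check valid whether tree_.value holds raw class counts or already-normalized fractions, which differs between scikit-learn versions.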
