Inconclusive RandomForest documentation in scikit-learn

In the section on ensemble methods, the scikit-learn documentation (http://scikit-learn.org/stable/modules/ensemble.html#id6, section 1.9.2.3) reads:

(...) The best results are also usually achieved when setting max_depth = None in combination with min_samples_split = 1 (i.e., when fully developing the trees). Keep in mind that these values are usually not optimal. The best parameter values should always be cross-validated.

So what is the difference between best results and optimal ones? I thought that by "best results" the author meant better cross-validated prediction results.
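For concreteness, here is a minimal sketch (my own, not from the post; the toy dataset and the grid values are made up) of what "cross-validated" means for these two parameters:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy data purely for illustration.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Note: recent scikit-learn versions require min_samples_split >= 2,
# so 2 plays the role that 1 plays in the quoted documentation.
param_grid = {
    "max_depth": [None, 5, 10],       # None = fully developed trees
    "min_samples_split": [2, 5, 10],
}

search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)  # the cross-validated parameter values
```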

Also, note that bootstrap samples are used by default in random forests (bootstrap = True), while the default strategy for extra-trees is to use the whole dataset (bootstrap = False).

I understand this as follows: bootstrapping is used by default in scikit-learn's implementation, but the default strategy is not to use bootstrapping. If so, where does this default strategy come from, and why is it not the default in the implementation?
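For reference, the random forest default mentioned in the quote can be verified directly; a quick sketch:

```python
from sklearn.ensemble import RandomForestClassifier

# bootstrap=True is indeed the default for random forests.
print(RandomForestClassifier().get_params()["bootstrap"])  # True
```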

1 answer

I agree that the first quote contradicts itself. Perhaps it would be better:

Good results are also often achieved with fully developed trees (max_depth = None and min_samples_split = 1), but keep in mind that these values are not guaranteed to be optimal. The best parameter values should always be cross-validated.

The second quote contrasts the default value of the bootstrap parameter for random forests (RandomForestClassifier and RandomForestRegressor) with the default for extremely randomized trees, as implemented in ExtraTreesClassifier and ExtraTreesRegressor. The following might be more explicit:

Also, note that bootstrap samples are used by default in random forests (bootstrap = True), while for extra-trees the default strategy is to use the whole original dataset (bootstrap = False).
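To make the comparison concrete, a small sketch (mine, not from the answer) showing that both estimator families expose the same bootstrap parameter, just with different defaults, and that either default can be overridden:

```python
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

# The defaults differ between the two estimator families.
print(RandomForestClassifier().get_params()["bootstrap"])  # True
print(ExtraTreesClassifier().get_params()["bootstrap"])    # False

# Either default can be overridden explicitly.
rf = RandomForestClassifier(bootstrap=False)  # forest trained on the whole dataset
et = ExtraTreesClassifier(bootstrap=True)     # extra-trees with bootstrap samples
```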

Please do not hesitate to send a PR with a correction if you find these wordings clearer.

