Why is a Random Forest with a single tree so much better than a decision tree classifier?

I am studying machine learning with the scikit-learn library. I apply a decision tree classifier and a random forest classifier to my data using this code:

from sklearn import tree
from sklearn.ensemble import RandomForestClassifier


def decision_tree(train_X, train_Y, test_X, test_Y):
    clf = tree.DecisionTreeClassifier()
    clf.fit(train_X, train_Y)

    return clf.score(test_X, test_Y)


def random_forest(train_X, train_Y, test_X, test_Y):
    clf = RandomForestClassifier(n_estimators=1)
    clf.fit(train_X, train_Y)

    return clf.score(test_X, test_Y)

Why is the result so much better for the random forest classifier (over 100 runs, each with a random 2/3 of the data used for training and 1/3 for testing)?

100%|███████████████████████████████████████| 100/100 [00:01<00:00, 73.59it/s]
Algorithm: Decision Tree
  Min     : 0.3883495145631068
  Max     : 0.6476190476190476
  Mean    : 0.4861783113770316
  Median  : 0.48868030937802126
  Stdev   : 0.047158171852401135
  Variance: 0.0022238931724605985
100%|███████████████████████████████████████| 100/100 [00:01<00:00, 85.38it/s]
Algorithm: Random Forest
  Min     : 0.6846846846846847
  Max     : 0.8653846153846154
  Mean    : 0.7894823428836184
  Median  : 0.7906101571063208
  Stdev   : 0.03231671150915106
  Variance: 0.0010443698427656967
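The full experiment loop is roughly the following (a minimal sketch; `load_wine` stands in for my data, which is not shown here, and the progress bar from `tqdm` is omitted):

    import statistics

    from sklearn.datasets import load_wine
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier

    # Placeholder dataset; the original data is not part of the question.
    X, Y = load_wine(return_X_y=True)

    def benchmark(make_clf, runs=100):
        scores = []
        for _ in range(runs):
            # Fresh random 2/3 train, 1/3 test split on every run.
            train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=1/3)
            clf = make_clf()
            clf.fit(train_X, train_Y)
            scores.append(clf.score(test_X, test_Y))
        print("  Min     :", min(scores))
        print("  Max     :", max(scores))
        print("  Mean    :", statistics.mean(scores))
        print("  Median  :", statistics.median(scores))
        print("  Stdev   :", statistics.stdev(scores))
        print("  Variance:", statistics.variance(scores))
        return scores

    print("Algorithm: Decision Tree")
    dt_scores = benchmark(DecisionTreeClassifier)
    print("Algorithm: Random Forest")
    rf_scores = benchmark(lambda: RandomForestClassifier(n_estimators=1))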

A random forest with a single estimator: isn't that just a decision tree? Am I doing something wrong, or have I misunderstood the concept?

Thanks for your reply.

1 answer

No, you are not doing anything wrong; the subtle point is that a Random Forest (RF) is not simply a collection of decision trees grown on the same data.

Generally speaking, an RF restricted to a single tree is not equivalent to a Decision Tree (DT), for two reasons.

First, the features considered at each split: a DT considers all the features when searching for the best split, while an RF considers only a random subset of size max_features (see the documentation).
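You can see this directly from the default values of the two estimators (assuming a recent scikit-learn; the RF default was the equivalent "auto" before version 1.1):

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier

    # A DT considers all features by default (max_features=None) ...
    dt_default = DecisionTreeClassifier().max_features

    # ... while an RF subsamples features at every split ("sqrt" of
    # them for classification).
    rf_default = RandomForestClassifier().max_features

    print("DecisionTreeClassifier max_features:", dt_default)
    print("RandomForestClassifier max_features:", rf_default)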

Second, a DT is fitted on the whole training set, while each tree of an RF is fitted on a bootstrap sample of it; from the documentation:

    bootstrap : bool, optional (default=True)
        Whether bootstrap samples are used when building trees.


Bootstrap sampling means drawing n samples with replacement from the n training samples: some samples are drawn more than once, and others (about 37% on average) are never drawn at all, so each tree of the RF effectively sees only a subset of the training data. In other words, even with identical hyperparameters, the single tree inside your RF is trained on different data than your DT, which is why the two results differ. You can disable this behaviour by passing bootstrap=False to RandomForestClassifier(). Together with max_features, this accounts for the difference you observe, even though the forest contains only one tree...
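The effect of bootstrap sampling is easy to verify numerically; a sketch using plain numpy rather than scikit-learn internals:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000  # size of a hypothetical training set

    # A bootstrap sample: n draws *with replacement* from n indices.
    sample = rng.integers(0, n, size=n)
    unique_fraction = np.unique(sample).size / n

    # On average about 1 - 1/e ~ 63.2% of the samples are drawn at
    # least once; the remaining ~36.8% are never seen by that tree.
    print(f"unique fraction: {unique_fraction:.3f}")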

So, to make a single-tree RF behave like a DT, pass bootstrap=False and max_features=None to RandomForestClassifier(), i.e.:

clf = RandomForestClassifier(n_estimators=1, max_features=None, bootstrap=False)

With these settings, the two classifiers should give practically identical results.
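A quick check on a synthetic dataset (a sketch using make_classification, which is an assumption here; with continuous features and all features considered at each split, exact impurity ties are essentially impossible, so both models should grow the same greedy tree):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier

    X, Y = make_classification(n_samples=1000, n_features=20, random_state=0)
    train_X, test_X, train_Y, test_Y = train_test_split(X, Y, random_state=0)

    dt = DecisionTreeClassifier(random_state=0).fit(train_X, train_Y)

    # One tree, no bootstrap, all features at each split: a plain DT.
    rf = RandomForestClassifier(
        n_estimators=1, bootstrap=False, max_features=None, random_state=0
    ).fit(train_X, train_Y)

    dt_score = dt.score(test_X, test_Y)
    rf_score = rf.score(test_X, test_Y)
    print(f"DT: {dt_score:.4f}  RF(1 tree): {rf_score:.4f}")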
