This is explained in the documentation.
The problem of learning an optimal decision tree is known to be NP-complete under several aspects of optimality, even for simple concepts. Consequently, practical decision-tree learning algorithms are based on heuristics, such as the greedy algorithm, in which a locally optimal decision is made at each node. Such algorithms cannot guarantee returning the globally optimal decision tree. This can be mitigated by training multiple trees in an ensemble learner, where features and samples are randomly drawn with replacement.
So, basically, a sub-optimal greedy algorithm is repeated a number of times using random samples of features and rows (a similar approach is used in random forests). The random_state parameter lets you control these random choices.
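As a sketch of this idea (the specific dataset and hyperparameter values here are illustrative, not from the answer): scikit-learn's RandomForestClassifier repeats the greedy tree-growing procedure on bootstrap samples of the rows and random subsets of the features, with random_state fixing those random choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

clf = RandomForestClassifier(
    n_estimators=10,      # repeat the greedy tree-growing 10 times
    bootstrap=True,       # each tree sees rows sampled with replacement
    max_features="sqrt",  # each split considers a random subset of features
    random_state=0,       # fix all of the above random choices
)
clf.fit(X, y)
print(len(clf.estimators_))  # 10 independently grown trees
```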
The documentation says:
If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
So a randomized algorithm will be used in any case. Passing any value (whether a specific int, e.g. 0, or a RandomState instance) will not change that. The only rationale for passing an int value (0 or otherwise) is to make the outcome consistent across calls: if you call this with random_state=0 (or any other value), then each time you call it, you will get the same result.
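A minimal demonstration of that consistency (the dataset is an illustrative assumption): two fits with the same integer seed produce identical trees, so their predictions match exactly.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Same seed => same random choices => identical trees.
a = DecisionTreeClassifier(random_state=0).fit(X, y)
b = DecisionTreeClassifier(random_state=0).fit(X, y)
print((a.predict(X) == b.predict(X)).all())  # True
```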
Ami Tavory