My initial reaction to this question was that it didn't show much research effort, since "everyone" knows that random forests don't handle missing values in predictors. But after checking ?randomForest, I have to admit that the documentation could be much more explicit about this.
(Although the Breiman PDF linked in the documentation does say explicitly that missing values are simply not handled at all.)
The only obvious clue in the official documentation that I could see was that the default value for the na.action parameter is na.fail, which might be too cryptic for new users.
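To see what that default means in practice, here is a minimal sketch, assuming the randomForest package is installed. The iris.na data frame and the res variable are made up for illustration; the NA values are injected artificially.

```r
library(randomForest)

# Hypothetical example data: iris with a few predictor values blanked out.
iris.na <- iris
set.seed(1)
iris.na[sample(150, 5), "Sepal.Length"] <- NA

# With the default na.action = na.fail, the formula interface stops with an
# error instead of fitting. tryCatch() captures the message so the script
# keeps running.
res <- tryCatch(
  randomForest(Species ~ ., data = iris.na),
  error = function(e) conditionMessage(e)
)
print(res)
```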
In any case, if your predictors have missing values, you have (basically) two options:
- Use a different tool (rpart handles missing values well).
- Impute the missing values.
Not surprisingly, the randomForest package has a function for imputing them, rfImpute. The documentation at ?rfImpute walks through a basic example of its use.
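A rough sketch of that workflow, loosely following the example in ?rfImpute and assuming the randomForest package is installed. The names iris.na, iris.imputed and iris.rf are illustrative, and the missing values are introduced artificially.

```r
library(randomForest)

# Artificially blank out some values in each of the four predictors.
iris.na <- iris
set.seed(111)
for (i in 1:4) iris.na[sample(150, 10), i] <- NA

# rfImpute needs a complete response (Species here); it fills in the
# missing predictor values using random forest proximities.
set.seed(222)
iris.imputed <- rfImpute(Species ~ ., data = iris.na)

# The imputed data frame can then be passed to randomForest as usual.
set.seed(333)
iris.rf <- randomForest(Species ~ ., data = iris.imputed)
print(iris.rf)
```

Note that rfImpute only fills in predictors; rows with a missing response still have to be dealt with separately.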
If only a small number of cases have missing values, you can also try setting na.action = na.omit to simply remove these cases.
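For instance, something along these lines should work, assuming the randomForest package is installed. The data frame iris.na and the object fit are hypothetical names, with five values removed on purpose.

```r
library(randomForest)

# Hypothetical data: five rows of iris get a missing Sepal.Length.
iris.na <- iris
set.seed(1)
iris.na[sample(150, 5), "Sepal.Length"] <- NA

# How many complete rows remain after dropping the incomplete ones?
n_complete <- nrow(na.omit(iris.na))
print(n_complete)

# na.action = na.omit drops the incomplete rows before the forest is fitted.
set.seed(2)
fit <- randomForest(Species ~ ., data = iris.na, na.action = na.omit)
print(fit)
```

This only makes sense when the dropped rows are a small fraction of the data; otherwise imputation is usually the safer route.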
And of course, this answer is a bit of a guess that your problem really is just missing values.
joran, Dec 04 '11 at 2:10