When I test my trained model on new test data that has fewer factor levels than my training data, predict() returns the following:
Type of predictors in new data do not match that of the training data.
My training data has a factor variable with 7 levels, and my test data has the same variable with 6 levels (all 6 are present in the training data).
When I add an observation containing the "missing" 7th level, the model runs, so I'm not sure why this happens or what the logic behind it is.
I could understand randomForest choking if the test set had more or different factor levels, but why does it fail when the training set has more levels?
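
To make the situation concrete, here is a minimal base R sketch of the mismatch, assuming the check compares the level sets of each factor rather than the values actually present. The data frames `train` and `test` and the column `area` are hypothetical stand-ins for my real data, and randomForest itself is not called here. Re-declaring the test factor with the training levels is one way to get the same effect as adding a dummy observation with the 7th level.

```r
# Hypothetical data: `area` has 7 levels in training, only 6 appear in test.
train <- data.frame(area = factor(c("a", "b", "c", "d", "e", "f", "g")))
test  <- data.frame(area = factor(c("a", "b", "c", "d", "e", "f")))

# Same variable, but the two factors carry different level sets.
levels_match_before <- identical(levels(train$area), levels(test$area))

# Re-declare the test factor using the training levels.
# The observed values stay the same; only the level set is widened.
test$area <- factor(test$area, levels = levels(train$area))

levels_match_after <- identical(levels(train$area), levels(test$area))

print(levels_match_before)  # FALSE
print(levels_match_after)   # TRUE
```

If this assumption is right, the number of levels actually observed in the test set would not matter, only whether the factor is declared with the same levels as in training.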