R - mice - machine learning: reusing the imputation scheme from the train set on the test set

I am building a predictive model and use the mice package to impute the NAs in my training set. Since I need to reuse the same imputation scheme for my test set, how can I reapply it to my test data?

    # generate example data
    set.seed(333)
    mydata <- data.frame(a = as.logical(rbinom(100, 1, 0.5)),
                         b = as.logical(rbinom(100, 1, 0.2)),
                         c = as.logical(rbinom(100, 1, 0.8)),
                         y = as.logical(rbinom(100, 1, 0.6)))

    na_a <- as.logical(rbinom(100, 1, 0.3))
    na_b <- as.logical(rbinom(100, 1, 0.3))
    na_c <- as.logical(rbinom(100, 1, 0.3))
    mydata$a[na_a] <- NA
    mydata$b[na_b] <- NA
    mydata$c[na_c] <- NA

    # create train/test sets
    library(caret)
    inTrain <- createDataPartition(mydata$y, p = .8, list = FALSE)
    train <- mydata[ inTrain, ]
    test <- mydata[-inTrain, ]

    # impute NAs in train set
    library(mice)
    imp <- mice(train, method = "logreg")
    train_imp <- complete(imp)

    # apply imputation scheme to test set
    test_imp <- unknown_function(test, imp$unknown_data)
2 answers

Run the mice imputation on the combined data set and only then split it into train and test sets. Fit the machine learning classifier on the train set, and then evaluate it on the test set.


When you train a model, you cannot use the test data in any way. Therefore, you cannot run MICE on the complete data set before splitting. Only the train data should be used to impute the test data.
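
One possible way to follow this advice is sketched below. It assumes a recent version of mice in which `mice()` accepts an `ignore` argument: rows flagged `TRUE` do not contribute to fitting the imputation models, but they are still imputed. The split uses base R `sample()` instead of caret to keep the example self-contained, and the name `is_test` is introduced here only for illustration.

```r
library(mice)

# Example data in the same shape as in the question
set.seed(333)
n <- 100
mydata <- data.frame(
  a = as.logical(rbinom(n, 1, 0.5)),
  b = as.logical(rbinom(n, 1, 0.2)),
  c = as.logical(rbinom(n, 1, 0.8)),
  y = as.logical(rbinom(n, 1, 0.6))
)
for (v in c("a", "b", "c")) {
  mydata[[v]][as.logical(rbinom(n, 1, 0.3))] <- NA
}

# Flag 20% of the rows as test rows (hypothetical helper vector)
is_test <- seq_len(n) %in% sample(n, size = 0.2 * n)

# Assumption: the installed mice version supports `ignore`.
# Test rows are excluded from model fitting but still get imputed values.
imp <- mice(mydata, method = "logreg", ignore = is_test, printFlag = FALSE)

# Extract the first completed data set and split it back
completed <- complete(imp)
train_imp <- completed[!is_test, ]
test_imp  <- completed[is_test, ]
```

Note that in this sketch `y` is still used as a predictor when imputing the test rows. If the outcome is unknown at prediction time, it may be better to leave `y` out of the data passed to `mice()`.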
