An alternative that uses the median instead of the mean is the na.roughfix function from the randomForest package. As described in its documentation, it works on a data frame or a numeric matrix. For numeric variables, NAs are replaced with column medians. For factor variables, NAs are replaced with the most frequent level (breaking ties at random). If the object contains no NAs, it is returned unaltered.
Using the same examples as @Henrik,
    library(randomForest)
    x <- c(56, NA, 70, 96)
    na.roughfix(x)
    #[1] 56 70 70 96
or with a larger matrix:
    y <- matrix(1:50, nrow = 10)
    y[sample(1:length(y), 4, replace = FALSE)] <- NA
    y
    #       [,1] [,2] [,3] [,4] [,5]
    #  [1,]    1   11   21   31   41
    #  [2,]    2   12   22   32   42
    #  [3,]    3   NA   23   33   NA
    #  [4,]    4   14   24   34   44
    #  [5,]    5   15   25   35   45
    #  [6,]    6   16   NA   36   46
    #  [7,]    7   17   27   37   47
    #  [8,]    8   18   28   38   48
    #  [9,]    9   19   29   39   49
    # [10,]   10   20   NA   40   50
    na.roughfix(y)
    #       [,1] [,2] [,3] [,4] [,5]
    #  [1,]    1   11 21.0   31   41
    #  [2,]    2   12 22.0   32   42
    #  [3,]    3   16 23.0   33   46
    #  [4,]    4   14 24.0   34   44
    #  [5,]    5   15 25.0   35   45
    #  [6,]    6   16 24.5   36   46
    #  [7,]    7   17 27.0   37   47
    #  [8,]    8   18 28.0   38   48
    #  [9,]    9   19 29.0   39   49
    # [10,]   10   20 24.5   40   50
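The factor rule mentioned at the top is easiest to see on a data frame. The following is only a small sketch: the data frame `df` and its columns `age` and `group` are made up for illustration, and the example deliberately avoids ties among the factor levels.

```r
library(randomForest)

# Hypothetical data: one numeric column and one factor column, each with an NA
df <- data.frame(
  age   = c(20, NA, 40, 60),
  group = factor(c("a", "b", NA, "a"))
)

# Numeric NA -> column median (40); factor NA -> most frequent level ("a")
fixed <- na.roughfix(df)
fixed
#   age group
# 1  20     a
# 2  40     b
# 3  40     a
# 4  60     a
```

Note that na.roughfix only handles numeric and factor columns in a data frame, so character columns would need to be converted to factors first.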