Replace missing values with mean (Weka)

Weka has a filter called "ReplaceMissingValues" that replaces all missing values in a dataset with the mean of each attribute. I would like to replace the missing values of a specific attribute with the mean computed only from the instances that belong to a specific class. For example, in a binary dataset, I believe it is more correct to fill a missing value in a record of the positive class with the mean computed only from positive-class records. How can this be implemented? How can values be replaced only for records belonging to a particular class?

1 answer

If you replace the missing values of class A with the mean computed only from the training instances of class A, you bias your dataset. The class label is not known at prediction time, so the same imputation cannot be applied to new instances, and the filled-in values leak label information into the attribute. To avoid this bias (which will ultimately hurt your trained model), it is wise to use the default "ReplaceMissingValues" behaviour, that is, the mean and mode of all training instances rather than of one particular class.
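To make the difference concrete, here is a minimal sketch in plain Python (standard library only, not Weka's API) that contrasts the two strategies on a toy dataset. The function names `impute_global_mean` and `impute_class_mean`, the column names `x` and `cls`, and the sample rows are all hypothetical and exist only for illustration; missing values are represented as `None`.

```python
from statistics import mean

def impute_global_mean(rows, attr):
    """Fill missing values of `attr` with the mean over all rows (class ignored)."""
    fill = mean(r[attr] for r in rows if r[attr] is not None)
    # Build new dicts so the input rows are left untouched.
    return [dict(r, **{attr: fill}) if r[attr] is None else dict(r) for r in rows]

def impute_class_mean(rows, attr, label):
    """Fill missing values of `attr` with the mean of the rows sharing the same class."""
    observed_by_class = {}
    for r in rows:
        if r[attr] is not None:
            observed_by_class.setdefault(r[label], []).append(r[attr])
    class_means = {c: mean(values) for c, values in observed_by_class.items()}
    # Assumption: every class has at least one observed value for `attr`.
    return [
        dict(r, **{attr: class_means[r[label]]}) if r[attr] is None else dict(r)
        for r in rows
    ]

# Toy data (made up): one missing value per class.
rows = [
    {"x": 1.0, "cls": "pos"},
    {"x": 3.0, "cls": "pos"},
    {"x": None, "cls": "pos"},
    {"x": 10.0, "cls": "neg"},
    {"x": None, "cls": "neg"},
]

global_filled = impute_global_mean(rows, "x")
class_filled = impute_class_mean(rows, "x", "cls")
```

With these rows, the global strategy fills both gaps with the same value (the mean of 1, 3 and 10), while the class-conditional strategy fills the positive record with the positive-class mean and the negative record with the negative-class mean. Note that the class-conditional version needs the label of every record it fills, so it can only be run on labelled training data, which is exactly the concern raised above.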
