I am writing a Naive Bayes classifier to perform indoor localization from Wi-Fi signal strength. So far it works well, but I have some questions about missing features. Missing features come up often because I rely on Wi-Fi signals, and Wi-Fi access points are simply not available everywhere.
Question 1. Suppose I have two classes, Apple and Banana, and I want to classify a test instance T1, as shown below.

I fully understand how the Naive Bayes classifier works. Below is the formula from the Wikipedia article that I use in my classifier. I assume uniform prior probabilities P(C = c), so I omit the prior in my implementation.
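For reference, the standard MAP decision rule from that article (my transcription), where the P(C = c) factor can be dropped under uniform priors, is:

```latex
\operatorname{classify}(f_1, \dots, f_n)
  = \underset{c}{\operatorname{argmax}} \;
    P(C = c) \prod_{i=1}^{n} p(F_i = f_i \mid C = c)
```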

Now, when I calculate the right-hand side of the equation and loop over the features to compute each class probability, which set of features do I use? The test instance T1 has features 1, 3, and 4, but the two classes do not have all of these features. So when I run my loop to compute the probability product, I see several options for what to loop over:
- Loop over the union of all features from training, namely features 1, 2, 3, and 4. Since the test instance T1 does not have feature 2, use an artificial tiny probability for it.
- Loop over only the features of the test instance, namely 1, 3, and 4.
- Loop over the features available for each class. To compute the conditional probability for Apple I would use features 1, 2, and 3, and for Banana I would use 2, 3, and 4.
Which of the above should I use?
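To make the first option concrete, here is how I imagine it in runnable Java: a Bernoulli-style loop over the union of training features, where Laplace smoothing supplies the small probability for a feature a class never showed. The class name, count maps, and example counts here are my own sketch, not my real code:

```java
import java.util.*;

public class BernoulliNB {
    // Per class: in how many training instances each feature was present,
    // and how many training instances there were in total. The counts
    // filled in by main() are made-up examples, not real data.
    static Map<String, Map<Integer, Integer>> featureCounts = new HashMap<>();
    static Map<String, Integer> instanceCounts = new HashMap<>();

    static String classify(Set<Integer> testFeatures) {
        // Option 1: the union of every feature seen in training, e.g. {1, 2, 3, 4}.
        Set<Integer> vocabulary = new HashSet<>();
        featureCounts.values().forEach(m -> vocabulary.addAll(m.keySet()));

        String bestLabel = null;
        double bestLogProbSum = Double.NEGATIVE_INFINITY;
        for (String label : featureCounts.keySet()) {
            double logProbSum = 0.0;
            for (int feature : vocabulary) {
                int present = featureCounts.get(label).getOrDefault(feature, 0);
                int total = instanceCounts.get(label);
                // Laplace (add-one) smoothing keeps the estimate strictly
                // inside (0, 1), so the log stays finite even for a feature
                // the class never showed -- no ad-hoc tiny constant needed.
                double pPresent = (present + 1.0) / (total + 2.0);
                logProbSum += testFeatures.contains(feature)
                        ? Math.log(pPresent)        // feature present in the test instance
                        : Math.log(1.0 - pPresent); // feature absent (e.g. feature 2 in T1)
            }
            if (logProbSum > bestLogProbSum) {
                bestLogProbSum = logProbSum;
                bestLabel = label;
            }
        }
        return bestLabel;
    }

    public static void main(String[] args) {
        featureCounts.put("Apple", new HashMap<>(Map.of(1, 2, 2, 2, 3, 2)));
        featureCounts.put("Banana", new HashMap<>(Map.of(2, 1, 3, 1, 4, 1)));
        instanceCounts.put("Apple", 2);
        instanceCounts.put("Banana", 1);
        System.out.println(classify(Set.of(1, 3, 4))); // T1 has features 1, 3, 4
    }
}
```

In this sketch every class scores every feature in the vocabulary, so the three options collapse into one consistent loop.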
Question 2. Suppose I want to classify a test instance T2, where T2 has a feature not found in any of the classes. I use log probabilities to avoid underflow, but I am not sure about the details of the loop. I am doing something like this (in Java-like pseudocode):
Double bestLogProbability = -100000.0;
ClassLabel bestClassLabel = null;
for (ClassLabel classLabel : allClassLabels) {
    Double logProbabilitySum = 0.0;
    for (Feature feature : allFeatures) {
        Double logProbability = getLogProbability(classLabel, feature);
        if (logProbability != null) {
            logProbabilitySum += logProbability;
        }
    }
    if (bestLogProbability < logProbabilitySum) {
        bestLogProbability = logProbabilitySum;
        bestClassLabel = classLabel;
    }
}
The problem is that if none of the classes has any of the test instance's features (feature 5 in the example), then logProbabilitySum stays at 0.0, which produces a "best" log probability of 0.0, i.e. a linear probability of 1.0, which is clearly wrong. What is the best way to handle this?
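For illustration, here is one patch I could imagine: instead of skipping a null lookup, every unseen (class, feature) pair contributes a smoothed floor, so an unknown feature penalizes all classes equally rather than leaving the sum at 0.0. The SafeLoop class, the map layout, and the UNSEEN_LOG_PROB constant are placeholders for my real lookup code:

```java
import java.util.*;

public class SafeLoop {
    // Placeholder lookup: class -> (feature -> log probability), standing in
    // for whatever getLogProbability(classLabel, feature) consults.
    static Map<String, Map<Integer, Double>> logProbs = new HashMap<>();

    // Placeholder floor for a (class, feature) pair never seen together in
    // training; a real implementation would derive this from Laplace
    // smoothing rather than a fixed constant.
    static final double UNSEEN_LOG_PROB = Math.log(1e-6);

    static String classify(Set<Integer> testFeatures) {
        String bestLabel = null;
        double bestLogProbSum = Double.NEGATIVE_INFINITY; // safer start than -100000
        for (String label : logProbs.keySet()) {
            double logProbSum = 0.0;
            for (int feature : testFeatures) {
                // Never skip a feature: an unknown one (feature 5 in the
                // example) now contributes the floor to every class instead
                // of leaving logProbSum at 0.0.
                Double lp = logProbs.get(label).get(feature);
                logProbSum += (lp != null) ? lp : UNSEEN_LOG_PROB;
            }
            if (logProbSum > bestLogProbSum) { // compare the sum, not the last term
                bestLogProbSum = logProbSum;
                bestLabel = label;
            }
        }
        return bestLabel;
    }

    public static void main(String[] args) {
        logProbs.put("Apple", new HashMap<>(Map.of(1, Math.log(0.5))));
        logProbs.put("Banana", new HashMap<>());
        // T2 has features 1 and 5; feature 5 is unknown to both classes.
        System.out.println(classify(Set.of(1, 5))); // prints "Apple"
    }
}
```

Initializing the running best to Double.NEGATIVE_INFINITY also guarantees the first class always replaces the sentinel, whatever its score.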