It's hard to say without knowing more, but from a Bayesian point of view, you might be interested in the case of missing features. I will discuss it in general terms. For more details, see [Duda and Hart, 2nd ed., pp. 54-55].
For any classifier, the Bayes decision rule is to choose the class i that maximizes the posterior probability of class i given the observed data x, i.e., max_i P(i | x). The vector x contains features, such as the student's grades, age, etc.
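In code, that decision rule is just an argmax over the classes. A minimal sketch in Python; the class names and posterior values here are invented for illustration:

```python
# Bayes decision rule: pick the class i with the highest posterior P(i | x).
# These posterior values are hypothetical, just to show the argmax step.
posteriors = {"drops out": 0.2, "graduates": 0.8}  # P(i | x) for each class i

decision = max(posteriors, key=posteriors.get)
print(decision)  # -> "graduates"
```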
Not all students take the same classes, so the feature vector x may have empty elements, i.e., "missing features". In that case, you marginalize over the missing features, i.e., sum over all possible values of the missing features, and then make the decision using the observed features alone.
Example. Suppose a student takes biology but not chemistry:
$$
P(\text{student drops out} \mid \text{A+ in biology})
= \frac{P(\text{student drops out},\ \text{A+ in biology})}{P(\text{A+ in biology})}
= \frac{\sum_{g \in \{\text{A}, \text{B}, \ldots, \text{F}\}} P(\text{student drops out},\ \text{A+ in biology},\ g \text{ in chemistry})}{\sum_{g \in \{\text{A}, \text{B}, \ldots, \text{F}\}} P(\text{A+ in biology},\ g \text{ in chemistry})}
$$
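Here is a minimal sketch of that marginalization, assuming the joint probabilities P(outcome, biology grade, chemistry grade) are available as a lookup table. The table entries, outcome labels, and grade values are all invented for illustration:

```python
# Joint distribution P(outcome, biology grade, chemistry grade) as a dict.
# All probabilities below are made up for illustration.
joint = {
    ("drops out", "A+", "A"): 0.01, ("drops out", "A+", "B"): 0.02,
    ("drops out", "A+", "C"): 0.03, ("drops out", "A+", "D"): 0.02,
    ("drops out", "A+", "F"): 0.02,
    ("graduates", "A+", "A"): 0.10, ("graduates", "A+", "B"): 0.08,
    ("graduates", "A+", "C"): 0.05, ("graduates", "A+", "D"): 0.02,
    ("graduates", "A+", "F"): 0.01,
    # ... entries for other biology grades omitted; only rows with
    # "A+" in biology are queried below.
}

outcomes = ("drops out", "graduates")
chem_grades = ("A", "B", "C", "D", "F")

def posterior(outcome, bio_grade):
    """P(outcome | bio_grade) with the missing chemistry grade marginalized out."""
    # Numerator: sum over chemistry grades g of P(outcome, bio_grade, g)
    num = sum(joint.get((outcome, bio_grade, g), 0.0) for g in chem_grades)
    # Denominator: also sum over outcomes, giving P(bio_grade)
    den = sum(joint.get((o, bio_grade, g), 0.0)
              for o in outcomes for g in chem_grades)
    return num / den

print(posterior("drops out", "A+"))  # P(student drops out | A+ in biology)
```

With these made-up numbers, the numerator sums to 0.10 and the denominator to 0.36, so the posterior is about 0.28. The point is that the missing chemistry grade never has to be imputed; it is simply summed out of the joint.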