Are there classification algorithms that target data with a one-to-many (1:n) relationship?

Has there been any data-mining research on classifying one-to-many data?

For example, consider a similar problem: say I'm trying to predict which students will drop out of university based on their courses and personal information. There is obviously a one-to-many relationship between a student's personal information and the grades they earned in their courses.

Obvious approaches include:

  • Aggregation. The many records can be combined in some way, reducing the problem to a plain classification problem. When classifying students, for example, the average of their grades could be combined with their personal data. Although this solution is simple, essential information is often lost. For instance, what if most students who take organic chemistry and get below a C drop out, even when their overall average is above a B+?

  • Voting. Create several classifiers (often weak ones) and let them vote to determine the final class of the data. For example, two kinds of classifiers could be created: one for course records and one for personal data. Each course record would be passed to the course classifier, which, based on the grade and the name of the course, would predict from that record alone whether the student will drop out. The personal-data record would be classified by the personal-data classifier. All the course-record predictions would then vote together with the personal-data prediction. The vote could be conducted in different ways, but it would most likely take into account how accurate each classifier is and how confident it is in its prediction. This scheme clearly allows more complex classification patterns to be exploited than aggregation, but it adds complexity, and if the voting is done poorly, accuracy can easily suffer.
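For concreteness, here is a minimal sketch of the aggregation approach in plain Python. The student records, grade scale, and the `below_c_in_orgo` indicator are all invented for illustration; the indicator shows one way to preserve the organic-chemistry signal that a plain grade average destroys:

```python
# Aggregation sketch: collapse each student's many course records into one
# fixed-length feature row. All records and thresholds below are invented.
GRADE_POINTS = {"A": 4.0, "B+": 3.3, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

def aggregate(courses):
    """courses: list of (course_name, letter_grade) pairs for one student."""
    points = [GRADE_POINTS[g] for _, g in courses]
    avg = sum(points) / len(points)
    # Extra indicator feature so a below-C organic chemistry grade is not
    # averaged away -- addressing the information-loss caveat above.
    below_c_in_orgo = any(
        name == "organic chemistry" and GRADE_POINTS[g] < 2.0
        for name, g in courses
    )
    return {"gpa": round(avg, 2), "below_c_in_orgo": below_c_in_orgo}

student = [("organic chemistry", "D"), ("calculus", "A"), ("history", "A")]
print(aggregate(student))  # high GPA, but the indicator still flags the risk
```

The point is that aggregation need not be limited to averages: any per-student summary (counts, minima, targeted indicators) becomes one column in an ordinary classification problem.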

So I am looking for other possible solutions for classifying data with a one-to-many relationship.

4 answers

Why not consider each course as a separate feature of the same model?

    student['age'] = 23
    student['gender'] = 'male'
    ...
    student['grade_in_organic_chemistry'] = 'B+'
    student['grade_in_classical_physics'] = 'A-'

I guess I don't understand why you would want to "compose" or combine several classifiers when the grades can simply be separate features.

(Please excuse the lame pseudocode above; I'm just trying to demonstrate my point.)
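A hedged sketch of that idea using scikit-learn's `DictVectorizer` (my choice here, not the answerer's): each student is one dict, string-valued grades are one-hot encoded, and courses a student never took simply become zero columns, so students with different course sets still share one fixed-width feature matrix. The student records are invented:

```python
from sklearn.feature_extraction import DictVectorizer

# One dict per student; all values below are invented examples.
students = [
    {"age": 23, "gender": "male",
     "grade_in_organic_chemistry": "B+",
     "grade_in_classical_physics": "A-"},
    {"age": 21, "gender": "female",
     "grade_in_organic_chemistry": "C"},  # never took classical physics
]

# String values become one-hot features (e.g. gender=male); numeric values
# (age) pass through as-is; absent keys become zeros in that student's row.
vec = DictVectorizer(sparse=False)
X = vec.fit_transform(students)
print(X.shape)  # one row per student, one column per observed feature/value
```

The resulting matrix can be fed to any standard classifier, which is exactly the "grades as separate features" reduction this answer proposes.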


Although this is probably suboptimal compared to specialized methods, you could probably use an SVM with class-weight correction for the imbalanced class, as in the following example (using the Python library scikit-learn ):

http://scikit-learn.sourceforge.net/auto_examples/svm/plot_weighted_classes.html

In practice, I have had good results with this on fairly imbalanced classes.
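A minimal runnable sketch of that class-weight correction. The toy data below is synthetic (not the linked example's dataset), with a 90/10 class imbalance standing in for the rare "drops out" class:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
# Imbalanced toy data: 90 "stays" samples vs. 10 shifted "drops out" samples.
X = np.vstack([rng.randn(90, 2), rng.randn(10, 2) + 2.0])
y = np.array([0] * 90 + [1] * 10)

# class_weight='balanced' reweights errors inversely to class frequency,
# so the rare class is not swamped by the majority class.
clf = SVC(kernel="linear", class_weight="balanced")
clf.fit(X, y)
print(clf.score(X, y))
```

`class_weight` can also be an explicit dict such as `{0: 1, 1: 10}` when you want to set the penalty ratio yourself.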


It's hard to say without knowing more, but from a Bayesian point of view you might be interested in the case of missing features . I will discuss it in general terms; for details see [Duda and Hart, 2nd ed., pp. 54-55].

For any classifier, the Bayes decision rule is to choose the class i that maximizes the probability of class i given the observed data x, i.e. max P(i | x). The vector x contains features such as the student's grades, age, etc.

Not all students take the same courses, so the feature vector x may have empty elements, i.e. "missing features". In that case you should marginalize over the missing features, i.e. sum them out, and then decide based on the features you do have.

Example. Suppose a student takes biology, but not chemistry:

    P(student drops out | A+ in biology)
      = P(student drops out, A+ in biology) / P(A+ in biology)

          P(student drops out, A+ in biology, A in chemistry)
            + P(student drops out, A+ in biology, B in chemistry)
            + ... + P(student drops out, A+ in biology, F in chemistry)
      = -----------------------------------------------------------------
          P(A+ in biology, A in chemistry)
            + P(A+ in biology, B in chemistry)
            + ... + P(A+ in biology, F in chemistry)
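A small numeric sketch of that marginalization. The joint probabilities below are invented (they cover only students with an A+ in biology, which is why they sum to less than 1); the unobserved chemistry grade is summed out of both numerator and denominator:

```python
# Toy joint distribution P(drops_out, chem_grade, A+ in biology).
# All probabilities are invented for illustration.
joint = {
    # (drops_out, chem_grade): probability
    (True,  "A"): 0.01, (False, "A"): 0.10,
    (True,  "B"): 0.02, (False, "B"): 0.12,
    (True,  "C"): 0.05, (False, "C"): 0.08,
    (True,  "F"): 0.08, (False, "F"): 0.04,
}

# Marginalize the missing chemistry grade out of numerator and denominator.
numerator = sum(p for (drops, _), p in joint.items() if drops)
denominator = sum(joint.values())
posterior = numerator / denominator  # P(drops out | A+ in biology)
print(round(posterior, 4))  # ≈ 0.32
```

In a real classifier the joint would come from a fitted model (e.g. naive Bayes factorizes it per feature) rather than an explicit table, but the sum-it-out step is the same.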

I foresee two main paths ahead:

  • An "aggregation" solution, as you call it, in which various summaries of each student's situation are used as features: how many classes were taken, what percentage were introductory 101-level classes, quartiles of the class grades, etc.

  • Some kind of evidence accumulation, such as a naive Bayes model (as Steve already suggested) or a fuzzy-logic rule base. Such solutions handle varying amounts of input data naturally. I suppose the same could be achieved, given enough data, with one gigantic conventional model (a neural network, etc.) and a very large set of inputs (most of which would be set to a neutral "missing" value), but I doubt it would work as well as the other options.

Sorry, but I think the "voting committee of simple classifiers" would be weak in this particular case. That doesn't mean it wouldn't work, but it's not where I would start.

