Weka Machine Learning: How to Interpret Naive Bayes Classifier?

I use the Explorer to classify. My .arff data file has 10 features with numeric and binary values (only the instance identifier is nominal), and I have 16 instances. The class to be predicted is Yes/No. I used Naive Bayes, but I can't interpret the results. Does anyone know how to interpret the output of a Naive Bayes classification?

machine-learning classification weka
2 answers

Naive Bayes does not select any important features. As you mentioned, the result of training a Naive Bayes classifier is a mean and a variance for each feature. A new sample is classified as "Yes" or "No" depending on which class's trained means and variances its feature values fit best, i.e. which class assigns it the higher likelihood.

You can use other algorithms to find the most informative attributes. For example, you could use a decision tree classifier such as J48 in WEKA (an open-source implementation of the C4.5 decision tree algorithm). The first node in the resulting decision tree tells you which feature has the most discriminative power.
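To illustrate why the root node is informative, here is a small self-contained Python sketch (Weka itself is Java; this only mirrors the idea) of how a C4.5-style tree picks its root test: it chooses the attribute with the highest information gain. For simplicity this assumes binary 0/1 features; real C4.5 also searches split thresholds for numeric attributes:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(X, y):
    """Index of the binary feature with the highest information gain,
    i.e. the test a C4.5-style tree would place at its root."""
    def gain(f):
        g = entropy(y)
        for v in (0, 1):
            subset = [label for x, label in zip(X, y) if x[f] == v]
            if subset:
                g -= len(subset) / len(y) * entropy(subset)
        return g
    return max(range(len(X[0])), key=gain)
```

A feature that splits the classes cleanly removes the most entropy, so it ends up at the top of the tree.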

Even better (as Rushdy Shams stated in another answer): the Weka Explorer offers built-in options for finding the most useful attributes in a dataset. These options can be found on the Select attributes tab.


As Sicco said, NB cannot tell you which features are the most important. A decision tree is a good choice because its branching can sometimes point you to an important feature, but not always. To handle both simple and complex feature sets, you can use WEKA's Select attributes tab. There you will find search methods and attribute evaluators; depending on your task, you can choose the combination that suits you best. They will give you a ranking of the features, computed either from the training data or from k-fold cross-validation. Personally, I find that decision trees do not work well on every data set; in those cases, feature ranking is the standard way to select the best features. Most of the time I use the InfoGain evaluator with the Ranker search method. Once you see your attributes ranked from 1 to k, it is really easy to tell the necessary features from the unnecessary ones.
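Here is a rough Python sketch of what that InfoGain-plus-Ranker combination computes. Weka's own implementation is Java and discretizes numeric attributes first; this simplified version assumes discrete feature values and simply sorts attributes by information gain, best first:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(X, y, f):
    """Reduction in class entropy from knowing feature f."""
    g = entropy(y)
    for v in set(x[f] for x in X):
        subset = [label for x, label in zip(X, y) if x[f] == v]
        g -= len(subset) / len(y) * entropy(subset)
    return g

def rank_attributes(X, y):
    """Attribute indices sorted by information gain, best first --
    roughly the ranked list InfoGain + Ranker print in Weka."""
    scores = [(info_gain(X, y, f), f) for f in range(len(X[0]))]
    return [f for score, f in sorted(scores, reverse=True)]
```

In Weka you would read the same ranking off the Select attributes output: attributes at the top of the list carry the most information about the class, and those with near-zero gain can usually be dropped.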

