I have the following code snippet that uses a Naive Bayes classifier for a multi-class classification task. The function runs 10-fold cross-validation, collecting the accuracy of each fold so the average accuracy can be printed afterwards. What I want instead is a classification report with per-class precision and recall, rather than just a single averaged accuracy score.
from sklearn import cross_validation
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB

def multinomial_nb_with_cv(x_train, y_train):
    # Shuffle inside KFold so that features and labels stay aligned
    kf = cross_validation.KFold(len(x_train), n_folds=10, shuffle=True)
    acc = []
    for train_index, test_index in kf:
        y_true = y_train[test_index]
        clf = MultinomialNB().fit(x_train[train_index], y_train[train_index])
        y_pred = clf.predict(x_train[test_index])
        acc.append(accuracy_score(y_true, y_pred))
    # Average accuracy over the 10 folds
    print sum(acc) / len(acc)
If I am not doing cross validation, all I need to do is:
from sklearn.metrics import classification_report
from sklearn.naive_bayes import MultinomialNB

def multinomial_nb(x_train, y_train, x_test, y_test):
    clf = MultinomialNB().fit(x_train, y_train)
    y_pred = clf.predict(x_test)
    y_true = y_test
    print classification_report(y_true, y_pred)
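For context, this is roughly how I call it when I just hold out a test set (x and y here are placeholders for my full feature matrix and label array, and the exact split is not important):

from sklearn.cross_validation import train_test_split

# x, y are placeholders for my features and labels
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.2)
multinomial_nb(x_tr, y_tr, x_te, y_te)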
And it gives me this report:
             precision    recall  f1-score   support

          0       0.50      0.24      0.33       221
          1       0.00      0.00      0.00        18
          2       0.00      0.00      0.00        27
          3       0.00      0.00      0.00        28
          4       0.00      0.00      0.00        32
          5       0.04      0.02      0.02        57
          6       0.00      0.00      0.00        26
          7       0.00      0.00      0.00        25
          8       0.00      0.00      0.00        43
          9       0.00      0.00      0.00        99
         10       0.63      0.98      0.76       716

avg / total       0.44      0.59      0.48      1292
How can I get a similar report in the cross-validation case as well?
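What I have in mind, as a rough sketch, is pooling the out-of-fold predictions and passing them all to classification_report at the end, but I am not sure whether this aggregation is the right approach:

from sklearn import cross_validation
from sklearn.metrics import classification_report
from sklearn.naive_bayes import MultinomialNB

def multinomial_nb_with_cv_report(x_train, y_train):
    kf = cross_validation.KFold(len(x_train), n_folds=10, shuffle=True)
    all_true, all_pred = [], []
    for train_index, test_index in kf:
        clf = MultinomialNB().fit(x_train[train_index], y_train[train_index])
        # Collect the true labels and predictions of every fold
        all_true.extend(y_train[test_index])
        all_pred.extend(clf.predict(x_train[test_index]))
    # One report over the pooled out-of-fold predictions
    print classification_report(all_true, all_pred)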