With Naive Bayes there is nothing stopping you from adding extra features that are not based on the bag-of-words representation. Say you have a class-conditional probability p(document | class_1) = l_1 based on your word features. You have reason to believe that some binary features b_1 and b_2 will also help the classification (to make the example concrete, these could indicate whether the document contains a date and a time, respectively).
You estimate the probability p(b_1 = 1 | class_1) = (# documents in class 1 with b_1 = 1) / (# documents in class 1), and p(b_1 = 0 | class_1) = 1 - p(b_1 = 1 | class_1). You do the same for class 2, and for feature b_2 for both classes; the counting sketch below illustrates the idea.
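A minimal sketch of that estimation step, assuming a toy list of (class label, b_1 value) pairs; the data and variable names are purely illustrative, not from the question:

```python
# Toy training data: (class label, value of binary feature b_1) per document.
train = [
    ("class_1", 1), ("class_1", 1), ("class_1", 0),
    ("class_2", 0), ("class_2", 1), ("class_2", 0),
]

def p_b1_given(cls, data):
    """p(b_1 = 1 | cls) as a relative frequency over documents of that class."""
    values = [b for c, b in data if c == cls]
    # A Laplace-style correction (+1 / +2) would avoid zero probabilities,
    # but is left out to match the formula in the text.
    return sum(values) / len(values)

p_b1_1_c1 = p_b1_given("class_1", train)   # p(b_1 = 1 | class_1) = 2/3
p_b1_0_c1 = 1.0 - p_b1_1_c1                # p(b_1 = 0 | class_1)
p_b1_1_c2 = p_b1_given("class_2", train)   # p(b_1 = 1 | class_2) = 1/3
```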
Adding these features to the classification rule is then especially simple, since Naive Bayes assumes the features are independent. So:

p(class_1 | document) ∝ p(class_1) × l_1 × p(b_1 | class_1) × p(b_2 | class_1)
where l_1 means the same as before (the probability based on the bag-of-words features), and for each term p(b_i | class_1) you plug in either p(b_i = 1 | class_1) or p(b_i = 0 | class_1), depending on the value b_i actually takes. This extends to non-binary features in the same way, and you can keep adding features to your heart's content (although you should be aware that you are assuming independence between the features; at some point you may want to switch to a classifier that does not make this assumption).
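To make the rule concrete, here is a small log-space sketch of the scoring; the function name, priors, and likelihood numbers are assumptions made up for illustration, not values from the question:

```python
import math

def class_score(prior, bow_log_likelihood, p_features_given_class, feature_values):
    """Log-space version of the rule above:
    log p(class | doc) = const + log p(class) + log l + sum_i log p(b_i = v_i | class).
    """
    score = math.log(prior) + bow_log_likelihood
    for p_one, v in zip(p_features_given_class, feature_values):
        # Use p(b_i = 1 | class) if the feature fired, otherwise its complement.
        score += math.log(p_one if v == 1 else 1.0 - p_one)
    return score

# Illustrative numbers only: the bow likelihoods l_1, l_2 are tiny, and the
# document was observed with b_1 = 1 and b_2 = 0.
score_1 = class_score(0.5, math.log(1e-6), [0.7, 0.2], [1, 0])
score_2 = class_score(0.5, math.log(3e-6), [0.3, 0.5], [1, 0])
prediction = "class_1" if score_1 > score_2 else "class_2"
```

Working in log space is just a numerical convenience: multiplying many small probabilities underflows quickly, while adding their logarithms does not change which class wins.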