How to train a Naive Bayes classifier with a POS-tag sequence as a feature?

I have two classes of sentences, and each has a rather distinctive POS-tag sequence. How can I train a Naive Bayes classifier using the sequence of POS tags as a feature? Does Stanford CoreNLP or NLTK (Java or Python) provide any way to build a classifier with POS tags as features? I know that NLTK's NaiveBayesClassifier lets me train a classifier in Python, but it uses contains-a-word features; can it be extended to use the POS-tag sequence as a feature instead?

1 answer

If you know how to train and classify texts (or sentences, in your case) with the NLTK Naive Bayes classifier using words as features, you can easily extend that approach to classify texts by POS tags. The classifier does not care whether your feature strings are words or tags. So you can simply replace the words of your sentences with their POS tags, using, for example, NLTK's standard POS tagger:

    import nltk

    sent = ['So', 'they', 'have', 'internet', 'on', 'computers', 'now']
    tags = [t for w, t in nltk.pos_tag(sent)]
    print(tags)

['IN', 'PRP', 'VBP', 'JJ', 'IN', 'NNS', 'RB']

From here on, you can proceed exactly as with the contains-a-word approach, just with tags in place of words.
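To make that concrete, here is a minimal sketch of training NLTK's NaiveBayesClassifier on POS-tag features. The pos_features helper, the toy sentences, and the class labels are invented for illustration; only nltk.pos_tag and nltk.NaiveBayesClassifier.train come from the library itself:

    import nltk
    # Requires the NLTK POS tagger model, e.g.:
    # nltk.download('averaged_perceptron_tagger')

    def pos_features(sentence):
        """Build a feature dict from the POS-tag sequence of a tokenized
        sentence, mirroring the usual contains(word) features but with tags."""
        tags = [t for w, t in nltk.pos_tag(sentence)]
        features = {'contains-tag({})'.format(t): True for t in set(tags)}
        # Tag bigrams capture some of the local order of the sequence.
        for a, b in zip(tags, tags[1:]):
            features['tag-bigram({} {})'.format(a, b)] = True
        return features

    # Hypothetical toy training data: (tokenized sentence, class label) pairs.
    train_data = [
        (['So', 'they', 'have', 'internet', 'on', 'computers', 'now'], 'class1'),
        (['Buy', 'cheap', 'tickets', 'today'], 'class2'),
    ]

    train_set = [(pos_features(sent), label) for sent, label in train_data]
    classifier = nltk.NaiveBayesClassifier.train(train_set)

    # Classify a new tokenized sentence by its POS-tag features.
    print(classifier.classify(pos_features(['They', 'have', 'phones', 'now'])))

With real data you would of course want many more labeled sentences per class; the structure of the feature extraction stays the same.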

