Training a hidden Markov model without a tagged corpus

Question


For a linguistics course, we implemented part-of-speech (POS) tagging using a hidden Markov model, where the hidden variables were the parts of speech. We trained the system on some tagged data, then tested it and compared our results with the gold data.

Would it be possible to train the HMM without a labeled training set?

+4

artificial-intelligence machine-learning nlp linguistics markov-models

Claudiu Dec 16 '09 at 19:01


2 answers

It has been a couple of years since I did NLP, but I believe that without tagging, an HMM can help determine the symbol emission / state transition probabilities for n-grams (that is, what are the odds of "world" occurring after "hello"), but not parts of speech. To learn how the POS tags interrelate, it needs a tagged corpus.
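To make the "hello"/"world" point concrete, here is a minimal sketch of estimating word-to-word transition probabilities from untagged text. Note that this is a plain (visible) Markov chain over words, not a hidden one, and the toy corpus and the `bigram_probs` helper are made up for illustration:

```python
from collections import Counter, defaultdict


def bigram_probs(sentences):
    """Estimate P(next word | current word) by counting adjacent word pairs."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    # Turn raw counts into conditional probabilities per current word.
    return {
        current: {w: c / sum(nxt.values()) for w, c in nxt.items()}
        for current, nxt in counts.items()
    }


# Hypothetical untagged corpus; no POS labels are needed for this step.
corpus = ["hello world", "hello world", "hello there", "the world is big"]
probs = bigram_probs(corpus)
print(probs["hello"]["world"])
```

Nothing here says which words are nouns or verbs; it only captures which words tend to follow which.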

If I'm way off on this, let me know in the comments!

+1

Matt baker Dec 16 '09 at 19:28


bayer · Accepted Answer · 2009-12-18T00:46:15+0000

In theory, you can do this. In that case, you would use the Baum-Welch algorithm. It is very well described in Rabiner's HMM tutorial.

However, having applied HMMs to part-of-speech tagging, I can say the error you get with the standard form will not be satisfactory. It is a form of expectation maximization, which only converges to local maxima. Rule-based approaches beat HMMs hands down, iirc.
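For reference, here is a rough sketch of what this kind of unsupervised training looks like for a discrete HMM, written in plain Python with scaled forward-backward passes. It is an illustration under assumptions rather than a tuned implementation: the function names, the toy sentences, the choice of 3 hidden states, and the random initialization are all made up for the example.

```python
import math
import random


def normalize(row):
    """Scale a list of non-negative weights so it sums to 1."""
    total = sum(row)
    if total == 0.0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def forward_backward(obs, pi, A, B):
    """Scaled forward and backward variables for one observation sequence."""
    n, T = len(pi), len(obs)
    alpha, beta, scale = [], [None] * T, []
    for t in range(T):
        row = []
        for j in range(n):
            prior = pi[j] if t == 0 else sum(alpha[t - 1][i] * A[i][j] for i in range(n))
            row.append(prior * B[j][obs[t]])
        scale.append(sum(row))
        alpha.append([x / scale[t] for x in row])
    beta[T - 1] = [1.0] * n
    for t in range(T - 2, -1, -1):
        beta[t] = [
            sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n)) / scale[t + 1]
            for i in range(n)
        ]
    return alpha, beta, scale


def baum_welch(sequences, n_states, n_symbols, n_iter=30, seed=0):
    """EM re-estimation of (pi, A, B) from unlabeled integer-coded sequences."""
    rng = random.Random(seed)
    rand_dist = lambda k: normalize([rng.random() + 0.1 for _ in range(k)])
    pi = rand_dist(n_states)
    A = [rand_dist(n_states) for _ in range(n_states)]
    B = [rand_dist(n_symbols) for _ in range(n_states)]
    history = []  # log-likelihood of the data under the parameters at each iteration
    for _ in range(n_iter):
        pi_acc = [0.0] * n_states
        A_acc = [[0.0] * n_states for _ in range(n_states)]
        B_acc = [[0.0] * n_symbols for _ in range(n_states)]
        loglik = 0.0
        for obs in sequences:
            alpha, beta, scale = forward_backward(obs, pi, A, B)
            loglik += sum(math.log(c) for c in scale)
            for t in range(len(obs)):
                for i in range(n_states):
                    gamma = alpha[t][i] * beta[t][i]  # P(state i at time t | obs)
                    if t == 0:
                        pi_acc[i] += gamma
                    B_acc[i][obs[t]] += gamma
                    if t + 1 < len(obs):
                        for j in range(n_states):
                            # Expected count of the transition i -> j at time t.
                            A_acc[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                            * beta[t + 1][j] / scale[t + 1])
        history.append(loglik)
        pi = normalize(pi_acc)
        A = [normalize(row) for row in A_acc]
        B = [normalize(row) for row in B_acc]
    return pi, A, B, history


# Hypothetical untagged toy corpus, with words mapped to integer ids.
sentences = ["the dog barks", "the cat sleeps", "a dog sleeps", "a cat barks"]
vocab = sorted({w for s in sentences for w in s.split()})
word_id = {w: k for k, w in enumerate(vocab)}
sequences = [[word_id[w] for w in s.split()] for s in sentences]

pi, A, B, history = baum_welch(sequences, n_states=3, n_symbols=len(vocab))
print(history[0], history[-1])
```

The log-likelihood in `history` should never decrease from one iteration to the next, but where it ends up depends on the starting point: changing `seed` can lead to a different local maximum. The learned states are also just anonymous clusters, so nothing guarantees that they line up with linguistic parts of speech.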

I believe NLTK, the Natural Language Toolkit for Python, has an HMM implementation for this exact purpose.
