In theory, you can do this. In this case, you will use the Baum-Welsh algorithm. This is very well described in the Rabiner HMM Tutorial .
However, by applying the HMM to part of the speech, the error you get with the standard form will not satisfy. This is a form of maximizing expectations that only converges to local maxima. Rule-based approaches knock out HMMs hands down, iirc.
I believe the natural language NLTK toolkit for python has an HMM implementation for this specific purpose.
source share