1. and 2. are correct. Be careful that the initial transition and emission matrices are not completely uniform; they must be slightly randomized for training to work. With perfectly uniform matrices all hidden states look identical, so the EM updates cannot tell them apart.
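As a rough illustration of "slightly randomized", here is a minimal sketch in plain MATLAB (no toolbox required). The perturbation size `noise` and the variable names are assumptions chosen for illustration only.

```matlab
N = 2;        % number of hidden states
M = 3;        % number of observation symbols
noise = 0.1;  % hypothetical perturbation size, not taken from the answer

% Start from uniform rows, add small positive noise, renormalize each row
transmat = ones(N, N) / N + noise * rand(N, N);
transmat = transmat ./ repmat(sum(transmat, 2), 1, N);
obsmat   = ones(N, M) / M + noise * rand(N, M);
obsmat   = obsmat ./ repmat(sum(obsmat, 2), 1, M);
```

Each row still sums to 1, but the rows are no longer identical, which is what breaks the symmetry between states.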
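3. The "Hello" sequences can be laid out in two ways.
Suppose the observation sequence for Hello is [1,0,1,1,0,0]. Concatenating 3 'Hello' sequences gives: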
data = [1,0,1,1,0,0,1,0,1,1,0,0,1,0,1,1,0,0]
Alternatively, keeping them as separate rows gives:
data = [1,0,1,1,0,0; 1,0,1,1,0,0; 1,0,1,1,0,0]
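The two layouts differ only in shape. Below is a minimal sketch in plain MATLAB of turning the concatenated vector into one row per sequence; the variable name `flat` is illustrative. It also shifts the 0/1 symbols to 1/2, on the assumption that the toolbox indexes the observation matrix by symbol value and therefore expects symbols in 1..M.

```matlab
% Concatenated layout: three 'Hello' sequences of length 6, back to back
flat = [1,0,1,1,0,0, 1,0,1,1,0,0, 1,0,1,1,0,0];
seq_len = 6;

% reshape fills column by column, so build seq_len-by-n and transpose
data = reshape(flat, seq_len, []).';   % one sequence per row (3-by-6)

% Assumption: symbols must be 1..M to index columns of the observation matrix
data = data + 1;                       % 0 -> 1, 1 -> 2
```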
In MATLAB, you can use the HMM toolbox by Murphy. With it, an HMM can be trained on multiple sequences like this:
M = 3;  % number of observation symbols
N = 2;  % number of hidden states
% "true" parameters
prior0 = normalise(rand(N, 1));
transmat0 = mk_stochastic(rand(N, N));
obsmat0 = mk_stochastic(rand(N, M));
% training data: a 5*6 matrix, e.g. 5 different 'Hello' sequences of length 6
number_of_seq = 5;
seq_len = 6;
data = dhmm_sample(prior0, transmat0, obsmat0, number_of_seq, seq_len);
% initial guess of parameters
prior1 = normalise(rand(N, 1));
transmat1 = mk_stochastic(rand(N, N));
obsmat1 = mk_stochastic(rand(N, M));
% improve guess of parameters using EM
[LL, prior2, transmat2, obsmat2] = dhmm_em(data, prior1, transmat1, obsmat1, 'max_iter', 5);
LL  % log-likelihood after each EM iteration
4. Once the HMM is trained, you can use it to compute the log-likelihood of the observations:
% use model to compute log[P(Obs|model)]
loglik = dhmm_logprob(data, prior2, transmat2, obsmat2)
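For intuition about what that log-likelihood is, here is a minimal sketch of the scaled forward algorithm in plain MATLAB, assuming symbols coded 1..M and row-stochastic matrices. The toy parameter values are made up, and this is not the toolbox's own code.

```matlab
% Toy model (made-up numbers): N = 2 hidden states, M = 3 symbols
prior    = [0.6; 0.4];                  % N-by-1 initial state distribution
transmat = [0.7 0.3; 0.4 0.6];          % N-by-N, rows sum to 1
obsmat   = [0.5 0.4 0.1; 0.1 0.3 0.6];  % N-by-M, rows sum to 1
obs      = [1 3 2];                     % one sequence, symbols in 1..M

% Scaled forward pass: alpha is rescaled to sum to 1 after every step
alpha  = prior .* obsmat(:, obs(1));
loglik = log(sum(alpha));
alpha  = alpha / sum(alpha);
for t = 2:numel(obs)
    alpha  = (transmat.' * alpha) .* obsmat(:, obs(t));
    loglik = loglik + log(sum(alpha));  % accumulate log of the scaling factor
    alpha  = alpha / sum(alpha);
end
loglik                                  % log P(obs | model)
```

Rescaling at each step keeps the computation from underflowing on long sequences, which is why the result is accumulated in log space.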