How to Compose FST (Transducer)

Question

How to Compose FST (Transducer)

Consider the following FSTs:

T1 0 1 a : b 0 2 b : b 2 3 b : b 0 0 a : a 1 3 b : a T2 0 1 b : a 1 2 b : a 1 1 a : d 1 2 a : c

How to perform a composition operation on these two FSTs (i.e. T1 o T2) I saw some algorithms, but could not understand much. If someone could explain it in a simple way, that would be a big help.

Please note that this is NOT homework. An example is taken from the lecture slides where the solution is given, but I could not figure out how to get to it.

+7

finite-automata nlp finite-state-machine

Tasbeer Apr 15 '10 at 22:40

source share

2 answers

If you are more prone to graphical explanations, the next set of slides in practice provides incremental graphical examples of the composition algorithm, and also includes a discussion of epsilon transitions in component transformers. Epsilon transitions complicate the composition process, and the algorithm described in the outis answer cannot generate the correct result in this case, depending on the half ring used.

See slides 10 ~ 35 for some graphic examples:

http://www.gavo.tu-tokyo.ac.jp/~novakj/wfst-algorithms.pdf

+2

si28719e Jan 30 '12 at 7:50

source share

outis · Accepted Answer · 2010-04-15T23:46:14+0000

Since you did not specify an input format, I assume that 0 is the initial state, any integers that appear in the second column, but not the first, are accepting states (3 for T1 and 2 for T2) and each row is an element of the transition relation, giving the previous state, next state, input letter and output letter.

Any operation on an FST should create a new FST, so we need states, an input alphabet, an output alphabet, initial states, final states, and transition relation (the FST specifications A, B, and W are given below in this order). Assume our FST:

  A = (Q, Σ, Γ, Q ₀ , Q _F , α)
 B = (P, Γ, Δ, P ₀ , P _F , β)

and we want to find

  W = (R, Σ, Δ, R ₀ , R _F , ω) = A ∘ B

Note that we do not need to define alphabets W; composition definition does this.

Imagine that you use A and B in series, and the output tape is served as input cassette B. The combined FST state is simply the combined states of A and B. In other words, the composition states are in the cross product of the states of the individual FSTs.

  R = Q × P

In your example, the W states are pairs of integers:

  R = {(0,0), (0,1), ... (3, 2)}

although we could renumber them and get (for example):

  R = {00, 01, 02, 10, 11, 12, 20, 21, 22, 30, 31, 32}

Similarly, the initial and receiving states of a compiled FST are cross-products of those contained in the components of the FST. In particular, R accepts the string iff A and B both accept the string.

  R ₀ = Q ₀ × P ₀
 R _F = Q _F × P _F

In the example, R ₀ = {00} and R _F = {32}.

It remains only to determine the transition relation ω. To do this, combine each transition rule for A with each transition rule for B that may apply. That is, combine each transition rule A (q _i , σ) → (q _j , γ) with each rule B that has “γ” as the input character.

  ω = {((q _i , p _h ), σ) → ((q _j , p _k ), δ): (q _i , σ) → (q _j , γ) ∈ α, 
                                      (p _h , γ) → (p _k , δ) ∈ β}

In this example, this means combining (for example) 0 1 a : b T1 with 0 1 b : a and 1 2 b : a from T2 to get:

 00 11 a: a
 01 12 a: a

Similarly, you would combine 0 2 b : b T1 with the same 0 1 b : a and 1 2 b : a from T2, 0 0 a : a from T1 with 1 1 a : d and 1 2 a : c from T2 & c.

Note that you may have unreachable states (which never appear as the “next” state) and transitions that will never occur (from unreachable states). As an optimization step, you can remove these states and transitions. However, leaving them will not affect the correct design; it's just an optimization.

How to Compose FST (Transducer)

More articles: