This worksheet shows how to use the ShortestDistance
and Reverse
operations, together
with log weights, to obtain the equivalent of the forward-backward algorithm for Hidden Markov Models.
from pylab import *
import openfst
from openfst import StdVectorFst as FST
from openfst import LogVectorFst as LFST
from fstutils import *
figsize(12,4)
Let's look at the effects of different weights in some simple example.
Here, we consider an FST in which there are two identical paths with identical weights. How do they get combined, for example under minimization?
lf = LFST()
lf.AddString("cat",0.5,0.5)
lf.AddString("cat",1.0)
show_fst(lf)
dfst = LFST()
openfst.Determinize(lf,dfst)
openfst.Minimize(dfst)
show_fst(dfst)
As we see here, the two paths get combined by probabilities. That is, each path can lead to the goal, and the total probability of reaching the goal is the sum of the probabilities for each path (in the log domain).
log(exp(-1.0)+exp(-1.0))
-0.30685281944005466
With the regular (log,+) semiring, the costs for each path get added along the path, but the total cost is just the minimum cost of reaching the goal.
fst = FST()
fst.AddString("cat",0.5,0.5)
fst.AddString("cat",1.0)
show_fst(lf)
dfst = FST()
openfst.Determinize(fst,dfst)
openfst.Minimize(dfst)
show_fst(dfst)
Tropical Semiring:
openfst.Weight()
<openfst.Weight; proxy of <Swig Object of type 'ArcTpl< TropicalWeight >::Weight *' at 0x33dd150> >
openfst.LogWeight()
<openfst.LogWeight; proxy of <Swig Object of type 'ArcTpl< LogWeight >::Weight *' at 0x33dd420> >
openfst.LogWeight.Zero().Value()
inf
openfst.LogWeight.One().Value()
0.0
Built-In Weights
Note that you can get n-best paths by defining a new weight class that...
Log
or Tropical
weightFor getting the equivalent of the Viterbi algorithm, that is the shortest path, we need to use the semiring (min,+).
fst = FST()
fst.AddString("cat",1.0)
fst.AddString("cot",1.9)
fst.AddString("car",1.7)
fst.AddString("dat",1.3)
dfst = FST()
openfst.Determinize(fst,dfst)
openfst.Minimize(dfst)
fst = dfst
show_fst(fst)
out = FST()
openfst.ShortestPath(fst,out,1)
show_fst(out)
For getting the equivalent of the forward-backward algorithm, we need to use log probabilities as weights.
fst = LFST()
fst.AddString("cat",1.0)
fst.AddString("cot",1.9)
fst.AddString("car",1.7)
fst.AddString("dat",1.3)
dfst = LFST()
openfst.Determinize(fst,dfst)
openfst.Minimize(dfst)
fst = dfst
show_fst(fst)
forward = openfst.vector_logweight()
openfst.ShortestDistance(fst,forward)
forward = array([v.Value() for v in forward])
print forward
[ 0. 0.35644531 1.29980469 0.59667969 0.86252433 0.02775925]
rfst = LFST()
openfst.Reverse(fst,rfst)
show_fst(rfst)
reverse = openfst.vector_logweight()
openfst.ShortestDistance(rfst,reverse)
reverse = array([v.Value() for v in reverse])
print reverse
[ 0. 0.02775919 0.00017603 0. 0.00019905 0. 0. ]
total = forward+reverse[1:][::-1]
print [exp(-c) for c in total]
[1.0, 0.70016076046423459, 0.27253077452703905, 0.55063689076056477, 0.42202093463650986, 0.94599457146783861]
show_fst(fst)
Given these algorithms, what would Baum Welch look like?