`pyugm` models

This notebook shows how to specify a simple discrete undirected probabilistic graphical model and perform common operations like marginalisation and calibration.

The aim of the package is to provide ways to quickly specify and test out undirected probabilistic graphical models.
At the moment it is too slow to tackle even medium-sized problems (like those in vision),
but I plan to move the main inference routines to `Cython` soon, which should make the package more generally usable.

I hope to incorporate some of the nice features of my two favourite machine learning packages, `sklearn` and `PyMC`.

`sklearn` provides a uniform interface to different models and many of the common preprocessing steps. `pyugm` models should therefore have a similar interface and the necessary helpers to easily apply a model to actual data. A drawback of `sklearn`, IMHO, is that almost none of its models are Bayesian (even for the Bayesian ridge regression model it is difficult to get the posterior or the predictive distribution). `PyMC`, on the other hand, is fully Bayesian and a wonderful tool for many problems. I find, however, that it is sometimes difficult to specify models with many types of observed variables - it seems to me to be more aimed at models whose output variables are of a single type.

In [2]:

```
import numpy as np
from pyugm.factor import DiscreteFactor
# Specify the potential table
factor_data = np.array([[1, 2], [2, 1]])
# The variable names ("1" and "2") and cardinalities (2 and 2).
variables_names_and_cardinalities = [(1, 2), (2, 2)]
# Construct the factor
factor = DiscreteFactor(variables_names_and_cardinalities, data=factor_data)
print(factor)
```

In [3]:

```
factor.data # The potential table
```

Out[3]:

In [4]:

```
# Marginalise out all the variables that are not named "1" (i.e. marginalise out variable "2")
marg = factor.marginalize([1])
print(marg)
print(marg.data)
```
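As a sanity check, marginalising variable "2" out of the 2x2 table above is just a sum over that variable's axis. A quick plain-numpy sketch (not pyugm code):

```python
import numpy as np

# The potential table from the factor above: axis 0 indexes variable 1, axis 1 indexes variable 2.
factor_data = np.array([[1, 2], [2, 1]])

# Marginalising out variable 2 sums over its axis (axis 1),
# leaving the unnormalised marginal over variable 1.
marginal_over_1 = factor_data.sum(axis=1)
print(marginal_over_1)  # [3 3]
```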

`Belief`s are mutable `Factor`s. They contain the current belief over the variables in the factor.

In [5]:

```
from pyugm.factor import DiscreteBelief
# Create a belief that is based on a factor
belief = DiscreteBelief(factor)
# Reduce the original factor by observing variable "1" taking on the value 0. [TODO: implement efficient factor reduction]
# Evidence is set by a dictionary where the key is a variable name and the value its observed value.
belief.set_evidence({1: 0})
print(belief)
print(belief.data)
```
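Conceptually, observing variable "1" taking the value 0 zeroes out every entry of the table that disagrees with the evidence (pyugm's internals may eventually do this more efficiently, as the TODO above notes). A hand-rolled numpy illustration of the same reduction:

```python
import numpy as np

# The potential table: axis 0 indexes variable 1, axis 1 indexes variable 2.
table = np.array([[1.0, 2.0], [2.0, 1.0]])

# Observing variable 1 = 0: keep only the slice consistent with the
# evidence and zero the rest of the table.
evidence_value = 0
reduced = np.zeros_like(table)
reduced[evidence_value, :] = table[evidence_value, :]
print(reduced)  # [[1. 2.] [0. 0.]]
```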

Models are collections of factors. The model automatically builds a cluster graph by greedily adding the factor that has the largest separator set with a factor already in the graph. With this scheme you will often end up with a tree.
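The greedy construction can be sketched as follows: repeatedly attach the unplaced factor whose variable set shares the largest separator with any factor already in the graph. This is an illustrative stand-alone sketch, not pyugm's actual implementation:

```python
def build_cluster_graph(factor_scopes):
    """Greedily connect factors by largest shared separator set.

    factor_scopes: list of sets of variable names, one per factor.
    Returns a list of edges (placed_index, new_index, separator).
    """
    placed = [0]  # seed the graph with the first factor
    remaining = list(range(1, len(factor_scopes)))
    edges = []
    while remaining:
        # Pick the (remaining, placed) pair with the largest shared variable set.
        r, p, sep = max(((r, p, factor_scopes[r] & factor_scopes[p])
                         for r in remaining for p in placed),
                        key=lambda t: len(t[2]))
        edges.append((p, r, sep))
        placed.append(r)
        remaining.remove(r)
    return edges

# The three factors of the model below share variables 2 and 'variable3',
# so the greedy scheme recovers the chain factor1 -- factor2 -- factor3.
scopes = [{1, 2}, {2, 'variable3'}, {'variable3', 4}]
print(build_cluster_graph(scopes))
```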

In [6]:

```
from pyugm.model import Model
factor1 = DiscreteFactor([(1, 2), (2, 2)], data=np.array([[1, 2], [2, 1]]))
factor2 = DiscreteFactor([(2, 2), ('variable3', 3)], # Variable names can also be strings
data=np.array([[0, 0.2, 0.3], [0.1, 0.5, 0.3]])) # Cardinalities of 2 and 3 mean the factor table must be 2x3
# [TODO: cardinalities can be inferred from data shape when provided]
factor3 = DiscreteFactor([('variable3', 3), (4, 2)], data=np.array([[0, 1], [1, 2], [0.5, 0]]))
model = Model([factor1, factor2, factor3])
```

In [7]:

```
model.edges # returns a set of tuples
```

Out[7]:

The graph:

factor1 -- factor2 -- factor3

has been built.

Models contain immutable `factor`s, while `Inference` objects contain `belief`s. `Inference` objects contain the `calibrate` method to calibrate the `belief`s.
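Under the hood, calibration on a tree amounts to sum-product message passing. A minimal numpy sketch on the three-factor chain used below (not pyugm code), checked against the brute-force joint:

```python
import numpy as np

# The three factor tables from the model below (chain: f1 -- f2 -- f3).
f1 = np.array([[1, 2], [2, 1]])                  # variables (1, 2)
f2 = np.array([[0, 0.2, 0.3], [0.1, 0.5, 0.3]])  # variables (2, variable3)
f3 = np.array([[0, 1], [1, 2], [0.5, 0.1]])      # variables (variable3, 4)

# Sum-product messages passed inward along the chain towards variable 1.
m3_to_2 = f3.sum(axis=1)   # marginalise out variable 4
m2_to_1 = f2 @ m3_to_2     # marginalise out variable3
belief_1 = f1 @ m2_to_1    # unnormalised belief over variable 1

marginal_1 = belief_1 / belief_1.sum()
print(marginal_1)

# Brute-force check: build the full joint and marginalise directly.
joint = np.einsum('ab,bc,cd->abcd', f1, f2, f3)
assert np.allclose(joint.sum(axis=(1, 2, 3)) / joint.sum(), marginal_1)
```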

In [158]:

```
from pyugm.model import Model
from pyugm.infer_message import LoopyBeliefUpdateInference
```

In [159]:

```
factor1 = DiscreteFactor([(1, 2), (2, 2)], data=np.array([[1, 2], [2, 1]]))
factor2 = DiscreteFactor([(2, 2), ('variable3', 3)], data=np.array([[0, 0.2, 0.3], [0.1, 0.5, 0.3]]))
factor3 = DiscreteFactor([('variable3', 3), (4, 2)], data=np.array([[0, 1], [1, 2], [0.5, 0.1]]))
model = Model([factor1, factor2, factor3])
inferrer = LoopyBeliefUpdateInference(model)
```

In [160]:

```
inferrer.calibrate()
```

Out[160]:

In [161]:

```
# Calibrated marginals
print(inferrer.get_marginals(1)[0], inferrer.get_marginals(1)[0].data)
print(inferrer.get_marginals(2)[0], inferrer.get_marginals(2)[0].data)
```

In [162]:

```
# Natural logarithm of the normalizing factor
print(inferrer.partition_approximation())
```
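On a model this small the partition function can also be computed exactly by brute force, summing the product of the factors over all assignments. A numpy check (factor tables as in the cell defining the model):

```python
import numpy as np

f1 = np.array([[1, 2], [2, 1]])
f2 = np.array([[0, 0.2, 0.3], [0.1, 0.5, 0.3]])
f3 = np.array([[0, 1], [1, 2], [0.5, 0.1]])

# The partition function Z sums the product of all factors over every
# joint assignment of (variable 1, variable 2, variable3, variable 4).
joint = np.einsum('ab,bc,cd->abcd', f1, f2, f3)
log_z = np.log(joint.sum())
print(log_z)
```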

What if `variable3 = 1`?

In [163]:

```
inferrer.calibrate(evidence={'variable3': 1})
```

Out[163]:

In [164]:

```
# Calibrated marginals
print(inferrer.get_marginals(1)[0], inferrer.get_marginals(1)[0].data)
print(inferrer.get_marginals(2)[0], inferrer.get_marginals(2)[0].data)
```

In [165]:

```
# Natural logarithm of the normalizing factor
print(inferrer.partition_approximation())
```
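The same brute-force check works with evidence: clamp `variable3` to its observed value before summing (again plain numpy, using the tables from the model cell):

```python
import numpy as np

f1 = np.array([[1, 2], [2, 1]])
f2 = np.array([[0, 0.2, 0.3], [0.1, 0.5, 0.3]])
f3 = np.array([[0, 1], [1, 2], [0.5, 0.1]])

joint = np.einsum('ab,bc,cd->abcd', f1, f2, f3)

# Clamp variable3 (the third axis) to the observed value 1, then sum the
# remaining entries to get the evidence-conditioned partition function.
clamped = joint[:, :, 1, :]
log_z_given_evidence = np.log(clamped.sum())
print(log_z_given_evidence)
```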

Although many improvements are still necessary, I hope this gives a glimpse of what I'm aiming at. I'll discuss parameter learning and different update orderings in another notebook.