- Programming with random variables
- User specifies a generative model for the observed data
- "Tell the story" of how the data were created

- The language runtime/software library automatically performs inference

- Maximum likelihood inference
- Maxumum a posteriori inference
**Full posterior inference**

A certain rare disease affects one in 10,000 people. A test for the disease gives the correct result 99.9% of the time.

What is the probability that a given test is positive?

In [8]:

```
import pymc3 as pm
with pm.Model() as pretest_disease_model:
# This is our prior belief about whether or not the subject has the disease,
# given its prevalence of the disease in the general population
has_disease = pm.Bernoulli('has_disease', 1e-4)
# This is the probability of a positive test given the presence or lack
# of the disease
p_positive = pm.Deterministic('p_positive',
T.switch(T.eq(has_disease, 1),
# If the subject has the disease,
# this is the probability that the
# test is positive
0.999,
# If the subject does not have the
# disease, this is the probability
# that the test is negative
1 - 0.999))
# This is the observed test result
test_positive = pm.Bernoulli('test_positive', p_positive)
```

In [9]:

```
samples = 40000
```

In [10]:

```
with pretest_disease_model:
step = pm.Metropolis()
pretest_disease_trace = pm.sample(samples, step, random_seed=SEED)
```

[-----------------100%-----------------] 40000 of 40000 complete in 6.1 sec

In [11]:

```
pretest_disease_trace['test_positive']
```

Out[11]:

array([0, 0, 0, ..., 0, 0, 0])

In [12]:

```
pretest_disease_trace['test_positive'].size
```

Out[12]:

40000

The probability that a given test is positive is

In [13]:

```
pretest_disease_trace['test_positive'].mean()
```

Out[13]:

0.001575

$$
\begin{align*}
P(\textrm{Test } +)
= &\ P(\textrm{Test } +\ |\ \textrm{Disease}) P(\textrm{Disease}) \\
& + P(\textrm{Test } +\ |\ \textrm{No disease}) P(\textrm{No disease})
\end{align*}
$$

In [14]:

```
0.999 * 1e-4 + (1 - 0.999) * (1 - 1e-4)
```

Out[14]:

0.0010998000000000008

If a subject tests positive, what is the probability they have the disease?

In [15]:

```
with pm.Model() as disease_model:
# This is our prior belief about whether or not the subject has the disease,
# given its prevalence of the disease in the general population
has_disease = pm.Bernoulli('has_disease', 1e-4)
# This is the probability of a positive test given the presence or lack
# of the disease
p_positive = pm.Deterministic('p_positive',
T.switch(T.eq(has_disease, 1),
# If the subject has the disease,
# this is the probability that the
# test is positive
0.999,
# If the subject does not have the
# disease, this is the probability
# that the test is negative
1 - 0.999))
# This is the observed positive test
test_positive = pm.Bernoulli('test_positive', p_positive, observed=1)
```

In [16]:

```
with disease_model:
step = pm.Metropolis()
disease_trace = pm.sample(samples, step, random_seed=SEED)
```

[-----------------100%-----------------] 40000 of 40000 complete in 3.8 sec

The probability that someone who tests positive for the disease has it is

In [17]:

```
disease_trace['has_disease'].mean()
```

Out[17]:

0.096775

$$
\begin{align*}
P(\textrm{Disease}\ |\ \textrm{Test }+)
& = \frac{P(\textrm{Test }+\ |\ \textrm{Disease}) P(\textrm{Disease})}{P(\textrm{Test }+)}
\end{align*}
$$

In [18]:

```
0.999 * 1e-4 / (0.999 * 1e-4 + (1 - 0.999) * (1 - 1e-4))
```

Out[18]:

0.09083469721767587

If we select the first door and Monty opens the third to reveal a goat, should we change doors?

In [19]:

```
with pm.Model() as monty_model:
# We have no idea where the prize is at the beginning
prize = pm.DiscreteUniform('prize', 0, 2)
# The probability that Monty opens each door
p_open = pm.Deterministic('p_open',
T.switch(T.eq(prize, 0),
# If the prize is behind the first door,
# he chooses to open one of the others
# at random
np.array([0., 0.5, 0.5]),
T.switch(T.eq(prize, 1),
# If it is behind the second door,
# he must open the third door
np.array([0., 0., 1.]),
# If it is behind the third door,
# he must open the second door
np.array([0., 1., 0.]))))
# Monty opened the third door, revealing a goat
opened = pm.Categorical('opened', p_open, observed=2)
```

In [20]:

```
samples = 10000
```

In [21]:

```
with monty_model:
step = pm.Metropolis()
monty_trace = pm.sample(samples, step, random_seed=SEED)
monty_trace_df = pm.trace_to_dataframe(monty_trace)
```

[-----------------100%-----------------] 10000 of 10000 complete in 1.4 sec

In [22]:

```
monty_trace_df.head()
```

Out[22]:

p_open__0 | p_open__1 | p_open__2 | prize | |
---|---|---|---|---|

0 | 0.0 | 0.0 | 1.0 | 1 |

1 | 0.0 | 0.5 | 0.5 | 0 |

2 | 0.0 | 0.0 | 1.0 | 1 |

3 | 0.0 | 0.0 | 1.0 | 1 |

4 | 0.0 | 0.0 | 1.0 | 1 |

In [23]:

```
(monty_trace_df.groupby('prize')
.size()
.div(monty_trace_df.shape[0]))
```

Out[23]:

prize 0 0.3196 1 0.6804 dtype: float64

Built on top of `theano`

- Dynamic generation of
`C`

code for models - Automatic differentiation allows for gradient-based inference
- Implements advanced Markov chain Monte carlo algorithms, such at the No U-Turn Sampler
- Implements automatic differentiation variational inference

- Interoperable with both
`pandas`

and`patsy`

- Provides clean support for multilevel models

Suppose variant A sees 510 visitors and 40 purchases, but variant B sees 505 visitors and 50 conversions.

In [24]:

```
with pm.Model() as ab_model:
# A priori, we have no idea what the purchase rate for variant A will be
a_rate = pm.Uniform('a_rate')
# We observed 40 purchases from 510 visitors who saw variant A
a_purchases = pm.Binomial('a_purchases', 510, a_rate, observed=40)
# A priori, we have no idea what the purchase rate for variant B will be
b_rate = pm.Uniform('b_rate')
# We observed 40 purchases from 510 visitors who saw variant A
b_purchases = pm.Binomial('b_purchases', 505, b_rate, observed=45)
```

In [25]:

```
samples = 10000
```

In [26]:

```
with ab_model:
step = pm.NUTS()
ab_trace = pm.sample(samples, step, random_seed=SEED)
```

[-----------------100%-----------------] 10000 of 10000 complete in 3.2 sec

In [28]:

```
fig
```

Out[28]:

The probability that variant B is better than variant A is

In [29]:

```
(ab_trace['a_rate'] < ab_trace['b_rate']).mean()
```

Out[29]:

0.7218

In [32]:

```
fig
```

Out[32]:

In [36]:

```
fig
```

Out[36]:

In [37]:

```
with pm.Model() as ols_model:
# alpha is the slope and beta is the intercept
alpha = pm.Uniform('alpha', -100, 100)
beta = pm.Uniform('beta', -100, 100)
# Ordinary least squares assumes that the noise is normally distributed
y_obs = pm.Normal('y_obs', alpha + beta * x, 1., observed=y)
```

In [38]:

```
samples = 10000
```

In [39]:

```
with ols_model:
step = pm.Metropolis()
ols_trace = pm.sample(samples, step, random_seed=SEED)
```

[-----------------100%-----------------] 10000 of 10000 complete in 2.2 sec

In [40]:

```
post_y = ols_trace['alpha'] + np.outer(plot_X[:, 1], ols_trace['beta'])
```

In [42]:

```
fig
```

Out[42]:

In [44]:

```
fig
```

Out[44]:

In [45]:

```
with pm.Model() as robust_model:
# alpha is the slope and beta is the intercept
alpha = pm.Uniform('alpha', -100, 100)
beta = pm.Uniform('beta', -100, 100)
# t-distributed residuals are less sensitive to outliers
y_obs = pm.StudentT('y_obs', nu=3., mu=alpha + beta * x, observed=y)
```

In [46]:

```
with robust_model:
step = pm.Metropolis()
robust_trace = pm.sample(samples, step, random_seed=SEED)
```

[-----------------100%-----------------] 10000 of 10000 complete in 2.3 sec

In [47]:

```
post_y_robust = robust_trace['alpha'] + np.outer(plot_X[:, 1], robust_trace['beta'])
```

In [49]:

```
fig
```

Out[49]:

In [52]:

```
df.head()
```

Out[52]:

party | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

rep | |||||||||||||||||

0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | NaN | 1 | 1 | 1 | 0 | 1 |

1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | NaN |

2 | 1 | NaN | 1 | 1 | NaN | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 |

3 | 1 | 0 | 1 | 1 | 0 | NaN | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 |

4 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | NaN | 1 | 1 | 1 | 1 |

In [53]:

```
parties
```

Out[53]:

Index(['republican', 'democrat'], dtype='object')

In [56]:

```
grid.fig
```

Out[56]:

$p_{i, j}$ is the probability that representative $i$ votes for bill $j$.

$$ \begin{align*} \log \left(\frac{p_{i, j}}{1 - p_{i, j}}\right) & = \gamma_j \left(\alpha_i - \beta_j\right) \\ & = \textrm{ability of bill } j \textrm{ to discriminate} \times (\textrm{conservativity of represetative } i\ - \textrm{conservativity of bill } j) \end{align*} $$In [57]:

```
with pm.Model() as ideal_point_model:
# The conservativity of representative i
alpha = pm.Normal('alpha', 0, 1., shape=N_REPS)
# The conservativity of bill j
mu_beta = pm.Normal('mu_beta', 0, 1e-3)
sigma_beta = pm.Uniform('sigma_beta', 0, 1e2)
beta = pm.Normal('beta', mu_beta, sd=sigma_beta, shape=N_BILLS)
# The ability of bill j to discriminate
mu_gamma = pm.Normal('mu_gamma', 0, 1e-3)
sigma_gamma = pm.Uniform('sigma_gamma', 0, 1e2)
gamma = pm.Normal('gamma', mu_gamma, sd=sigma_gamma, shape=N_BILLS)
# The probability that representative i votes for bill j
p = pm.Deterministic('p', T.nnet.sigmoid(gamma[vote_df.bill] * (alpha[vote_df.rep] - beta[vote_df.bill])))
# The observed votes
vote = pm.Bernoulli('vote', p, observed=vote_df.vote)
```

In [58]:

```
samples = 20000
```

In [59]:

```
with ideal_point_model:
step = pm.Metropolis()
ideal_point_trace = pm.sample(samples, step, random_seed=SEED)
```

[-----------------100%-----------------] 20000 of 20000 complete in 107.5 sec

In [60]:

```
rep_alpha = ideal_point_trace['alpha'].mean(axis=0)
```

In [62]:

```
fig
```

Out[62]:

In [63]:

```
pm.forestplot(ideal_point_trace, varnames=['gamma']);
```

Source: http://mqscores.berkeley.edu/

Probabilistic Programming and Bayesian Methods for Hackers (Open source, uses PyMC2, PyMC3 port in progress)

Think Bayes (Available freely online, calculations in Python)