%matplotlib inline
import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-darkgrid')
print('Running on PyMC3 v{}'.format(pm.__version__))
Running on PyMC3 v3.4.1
Consider the following AR(1) process, initialized in the infinite past: $$ y_t = \theta y_{t-1} + \epsilon_t, $$ where $\epsilon_t \overset{iid}{\sim} {\cal N}(0,1)$. Suppose you'd like to learn about $\theta$ from a sample of observations $Y^T = \{ y_0, y_1,\ldots, y_T \}$.
First, let's generate our sample.
np.random.seed(seed=1848)
T = 100
y = np.zeros((T,))
# Simulate an AR(1) with theta = 0.95 and unit-variance Gaussian shocks
for i in range(1, T):
    y[i] = 0.95 * y[i-1] + np.random.normal()
plt.plot(y);
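As a quick sanity check (an addition to this notebook, using only the simulated series y above), the least-squares estimate of $\theta$ should land near the true value of 0.95:
# Lag-1 regression coefficient, i.e. the conditional MLE of theta
theta_ols = np.dot(y[:-1], y[1:]) / np.dot(y[:-1], y[:-1])
print('OLS estimate of theta: {:5.3f}'.format(theta_ols))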
Consider the following prior for $\theta$: $\theta \sim {\cal N}(0,\tau^2)$. Conditioning on $y_0$, the likelihood is Gaussian in $\theta$, so combining it with the Gaussian prior and completing the square shows that the posterior distribution of $\theta$ is of the form
$$ \theta | Y^T \sim {\cal N}( \tilde{\theta}_T, \tilde{V}_T), $$
where
$$ \begin{eqnarray} \tilde{\theta}_T &=& \left( \sum_{t=1}^T y_{t-1}^2 + \tau^{-2} \right)^{-1} \sum_{t=1}^T y_{t}y_{t-1} \\ \tilde{V}_T &=& \left( \sum_{t=1}^T y_{t-1}^2 + \tau^{-2} \right)^{-1} \end{eqnarray} $$
tau = 1.0
with pm.Model() as ar1:
    beta = pm.Normal('beta', mu=0, sd=tau)       # prior: theta ~ N(0, tau^2)
    data = pm.AR('y', beta, sd=1.0, observed=y)  # AR(1) likelihood, unit innovation sd
    trace = pm.sample(1000, cores=4)
pm.traceplot(trace);
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta]
Sampling 4 chains: 100%|██████████| 6000/6000 [00:01<00:00, 3904.61draws/s]
The acceptance probability does not match the target. It is 0.8859800580282127, but should be close to 0.8. Try to increase the number of tuning steps.
# Exact conjugate posterior moments for theta, from the formulas above
mup = ((y[:-1]**2).sum() + tau**-2)**-1 * np.dot(y[:-1], y[1:])
Vp = ((y[:-1]**2).sum() + tau**-2)**-1
print('Mean: {:5.3f} (exact = {:5.3f})'.format(trace['beta'].mean(), mup))
print('Std: {:5.3f} (exact = {:5.3f})'.format(trace['beta'].std(), np.sqrt(Vp)))
Mean: 0.924 (exact = 0.925)
Std: 0.040 (exact = 0.040)
We can also compare the full sampled posterior with its analytic counterpart by overlaying a kernel density estimate of the draws on the exact normal density.
import pandas as pd
from scipy.stats import norm

ax = pd.Series(trace['beta']).plot(kind='kde')
xgrid = np.linspace(0.4, 1.2, 1000)
fgrid = norm(loc=mup, scale=np.sqrt(Vp)).pdf(xgrid)
ax.plot(xgrid, fgrid);
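For a sharper check (a small addition, reusing mup, Vp, and the trace from above), we can compare a few posterior quantiles against the exact normal posterior:
for q in (5, 50, 95):
    sample_q = np.percentile(trace['beta'], q)                 # quantile of the MCMC draws
    exact_q = norm(loc=mup, scale=np.sqrt(Vp)).ppf(q / 100.0)  # exact posterior quantile
    print('{:2d}%: sample = {:5.3f}, exact = {:5.3f}'.format(q, sample_q, exact_q))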
We can instead estimate an AR(2) model using PyMC3:
$$
y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \epsilon_t.
$$
The AR distribution infers the order of the process from the size of the rho argument passed to AR.
with pm.Model() as ar2:
    beta = pm.Normal('beta', mu=0, sd=tau, shape=2)  # beta[0] multiplies y[t-1], beta[1] multiplies y[t-2]
    data = pm.AR('y', beta, sd=1.0, observed=y)
    trace = pm.sample(1000, cores=4)
pm.traceplot(trace);
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta]
Sampling 4 chains: 100%|██████████| 6000/6000 [00:02<00:00, 2092.25draws/s]
The number of effective samples is smaller than 25% for some parameters.
You can also pass the set of AR parameters as a list.
with pm.Model() as ar2:
    beta = pm.Normal('beta', mu=0, sd=tau)
    beta2 = pm.Uniform('beta2')  # the list can mix parameters with different priors
    data = pm.AR('y', [beta, beta2], sd=1.0, observed=y)
    trace = pm.sample(1000, tune=1000, cores=4)
pm.traceplot(trace);
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta2, beta]
Sampling 4 chains: 100%|██████████| 8000/8000 [00:05<00:00, 1464.13draws/s]
The number of effective samples is smaller than 25% for some parameters.
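Finally, to report the AR(2) coefficients numerically rather than only graphically (a small addition to the notebook, applied to the trace above), pm.summary tabulates posterior means, standard deviations, and credible intervals:
# Posterior summary table for beta and beta2
print(pm.summary(trace))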