# Changepoint problem - Gibbs Sampler

This is based on Example 5.1 (pages 143-145) and Figures 5.1 (page 144) and 5.2 (page 146) of Gamerman and Lopes (2006), which in turn is based on an example from Carlin, Gelfand and Smith (1992).

# Poisson data with change point and conditionally conjugate Gamma priors

Essentially we consider count data where

$$Y_i \sim Poisson(\lambda_i)$$

subject to $\lambda_i = \lambda_b$ if year $<$ changepoint, and $\lambda_i = \lambda_a$ otherwise. We assume conditionally conjugate priors $\lambda_b \sim Gamma(\alpha, \beta)$ and $\lambda_a \sim Gamma(\gamma, \delta)$, together with a discrete uniform prior on the changepoint, i.e. $Pr[Change = Year]$ is the same for every year. The posterior density is given by:

$$f(\lambda_b, \lambda_a, m) \propto f(y_1, \ldots, y_n \mid \lambda_b, \lambda_a, m)\, p(\lambda_b, \lambda_a, m)$$

Writing $m$ for the changepoint and setting the two sufficient statistics $s_b = \sum_{i=1}^{m} y_i$ and $s_a = \sum_{i=m+1}^{n} y_i$, we have the posterior as:

$$f(\lambda_b, \lambda_a, m) \propto \lambda_b^{\alpha + s_b - 1} e^{-(\beta + m)\lambda_b}\, \lambda_a^{\gamma + s_a - 1} e^{-(\delta + n - m)\lambda_a}$$

This has three simple conditional densities, which are specified later. In total there are 111 observations of $y_i$, for the years 1851 to 1961.

In [3]:
from numpy import array
%load_ext rmagic

First, load the data and plot it. Note that `years` holds the actual years, and `idx` is a zero-based counter over them.

In [11]:
disasters = ( 4, 5, 4, 0, 1, 4, 3, 4, 0, 6, 3, 3, 4, 0, 2, 6, 3, 3, 5, 4,
              5, 3, 1, 4, 4, 1, 5, 5, 3, 4, 2, 5, 2, 2, 3, 4, 2, 1, 3, 2,
              2, 1, 1, 1, 1, 3, 0, 0, 1, 0, 1, 1, 0, 0, 3, 1, 0, 3, 2, 2,
              0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 2, 1, 0, 0, 0, 1, 1, 0, 2,
              3, 3, 1, 1, 2, 1, 1, 1, 1, 2, 4, 2, 0, 0, 1, 4, 0, 0, 0, 1,
              0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1)
years = range(1851, 1962)
idx = range(111)
len(disasters)

Out[11]:
111

In [12]:
%Rpush years disasters
%R plot(years, disasters)

Next, obtain cumulative totals. This is a slightly longwinded way of doing it, but we had an ulterior motive for wanting to use as much linear algebra as possible.

In [13]:
import numpy as np
idxl = np.tril_indices(111)
idxu = np.triu_indices(111)
## Create blank matrices of zeros
a = np.zeros(shape=(111, 111))
b = np.zeros(shape=(111, 111))
## Fill the lower (a) and upper (b) triangles with ones
a[idxl] = 1
b[idxu] = 1
## Now get cumulative sums as matrix-vector products
cumsumafter = np.dot(b, disasters)
cumsumbefore = np.dot(a, disasters)
## Now append a zero at start/end as appropriate
cumsumafter = np.append(cumsumafter, 0)
cumsumbefore = np.append(0, cumsumbefore)

A very simple plot of the cumulative sums already suggests a change point.
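The same cumulative totals can also be obtained directly with `np.cumsum`; a minimal sketch on toy counts (the five values here are illustrative, standing in for the disasters series):

```python
import numpy as np

## Toy counts standing in for the disasters series
y = np.array([4, 5, 4, 0, 1])

## Cumulative total up to and including each index, with a leading zero,
## matching cumsumbefore above
cumsumbefore = np.append(0, np.cumsum(y))

## Cumulative total from each index onwards, with a trailing zero,
## matching cumsumafter above
cumsumafter = np.append(np.cumsum(y[::-1])[::-1], 0)

print(cumsumbefore)  # [ 0  4  9 13 13 14]
print(cumsumafter)   # [14 10  5  1  1  0]
```

Reversing, cumulating, and reversing again gives the "sum from here onwards" totals without building any triangular matrices.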
In [16]:
%Rpush years cumsumbefore
%R plot(years, cumsumbefore[-1])

In [18]:
from numpy import random, exp
## Values for the prior hyperparameters
alphaprior = 0.001
betaprior = 0.001
gammaprior = 0.001
deltaprior = 0.001

## The last full conditional

We do the last thing first and define a function to calculate the messiest full conditional, namely:

$$f(m \mid \lambda_b, \lambda_a, y) = \frac{\lambda_b^{\alpha + s_b - 1} e^{-(\beta + m)\lambda_b}\, \lambda_a^{\gamma + s_a - 1} e^{-(\delta + n - m)\lambda_a}}{\sum_{k=1}^{n} \lambda_b^{\alpha + s_b(k) - 1} e^{-(\beta + k)\lambda_b}\, \lambda_a^{\gamma + s_a(k) - 1} e^{-(\delta + n - k)\lambda_a}}$$

where $s_b(k)$ and $s_a(k)$ denote the sufficient statistics computed as if the changepoint were at $k$.

• This ought perhaps to be calculated on the log scale (computational overflow looks like a risk here)
• It is computed below by a loop, which is inefficient
• But first, a small function to calculate the numerator (and at least make sure this works OK)

Having computed these values, we can sample a changepoint value with `random.choice`.

In [19]:
def mprob(lb, la, ap, bp, gp, dp, m, sb, sa):
    ## Unnormalised conditional probability that the changepoint is at index m
    return (lb**(ap + sb - 1)) * exp(-(bp + m) * lb) * \
           (la**(gp + sa - 1)) * exp(-(dp + (111 - m)) * la)

## Trial values for the two rates, just to check the function works
lambdabefore = 3.0
lambdaafter = 1.0
blah = np.zeros(111)
for i in range(111):
    blah[i] = mprob(lambdabefore, lambdaafter, alphaprior, betaprior,
                    gammaprior, deltaprior, i, cumsumbefore[i], cumsumafter[i])
condalpha = blah / sum(blah)
m = random.choice(range(111), size=1, replace=False, p=condalpha)
print(lambdabefore)
print(lambdaafter)

The easier full conditionals to simulate from are the two Poisson means:

$$f(\lambda_b \mid m, y) = Gamma(\alpha + s_b, \beta + m)$$

and

$$f(\lambda_a \mid m, y) = Gamma(\gamma + s_a, \delta + (n - m))$$

This requires only the sufficient statistics $s_b$, $s_a$ and the current value of the changepoint.
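Each of these updates is then a single Gamma draw; a minimal sketch with invented current-state values (note that `numpy.random.gamma` takes shape and *scale*, hence the reciprocal of the rate):

```python
import numpy as np

## Invented current state of the sampler, for illustration only
m = 40           # current changepoint index
s_b = 120.0      # sum of counts up to the changepoint
s_a = 65.0       # sum of counts after the changepoint
n = 111
alpha = beta = gamma = delta = 0.001

## Draw from f(lambda_b | m, y) = Gamma(alpha + s_b, beta + m)
lambda_b = np.random.gamma(alpha + s_b, 1.0 / (beta + m))

## Draw from f(lambda_a | m, y) = Gamma(gamma + s_a, delta + (n - m))
lambda_a = np.random.gamma(gamma + s_a, 1.0 / (delta + (n - m)))
```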

The Gibbs sampler proceeds in three stages:

1. Simulate $\lambda_b$ given the current sufficient statistics and parameter values,
2. Simulate $\lambda_a$ given the current (updated) sufficient statistics and parameter values,
3. Simulate the changepoint given the current (updated) sufficient statistics and parameter values; then repeat.
In [24]:
outlambdabefore = np.zeros(5000)
outlambdaafter = np.zeros(5000)
## The changepoint is an index, so store it as an integer
outm = np.zeros(5000, dtype=int)

## Initial values
outlambdabefore[0] = 12
outlambdaafter[0] = 3
outm[0] = 60

for iter in range(1, 5000):
    ## Gamma full conditionals for the two rates
    outlambdabefore[iter] = random.gamma(alphaprior + cumsumbefore[outm[iter-1]],
                                         1.0/(betaprior + outm[iter-1]))
    outlambdaafter[iter] = random.gamma(gammaprior + cumsumafter[outm[iter-1]],
                                        1.0/(deltaprior + 111 - outm[iter-1]))

    ## Discrete full conditional for the changepoint
    blah = np.zeros(111)
    for i in range(111):
        blah[i] = mprob(outlambdabefore[iter], outlambdaafter[iter], alphaprior,
                        betaprior, gammaprior, deltaprior, i,
                        cumsumbefore[i], cumsumafter[i])

    condalpha = blah / sum(blah)
    outm[iter] = random.choice(range(111), size=1, replace=False, p=condalpha)


In [32]:
%Rpush outlambdabefore outlambdaafter outm
%R plot(c(1:5000), outlambdabefore, type = "l")
%R par(mfrow = c(2,2)); hist(outlambdabefore[c(1000:5000)], xlim = c(0,5)); hist(outlambdaafter[c(1000:5000)], xlim =c(0,5))
%R plot(xtabs(~outm[c(1000:5000)]))
%R acf(outlambdabefore[c(1000:5000)]) ; acf(outlambdaafter[c(1000:5000)])


• Note that we throw away the first 1000 values as burn-in when plotting the posterior simulations
• Note also that the changepoint is defined in terms of the index rather than the year: add 1851 to these values (the posterior mode should be around 1890)
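That index-to-year conversion can be sketched as follows, with a toy `outm` array standing in for the sampler output:

```python
import numpy as np

## Toy post-burn-in draws standing in for outm[1000:]
outm = np.random.randint(35, 45, size=4000)

## Index 0 corresponds to 1851, so shift to calendar years
years_post = outm + 1851

## Posterior mode of the changepoint year
values, counts = np.unique(years_post, return_counts=True)
mode_year = values[np.argmax(counts)]
```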

# Conclusions and to-do

This was a quick sketch of a Gibbs sampler for a rather famous changepoint problem. The first thing to note is that analytic values are available, so the answers can be checked.

As a programming exercise there is still a little work to do: the posterior for the changepoint is currently calculated with a for loop, which could be vectorised, and a few other efficiency tweaks could be made. But on the whole this appears to work.
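For reference, the loop over candidate changepoints could be vectorised and moved to the log scale in one step; a sketch under the same model, with toy data and the hypothetical helper name `changepoint_conditional`:

```python
import numpy as np

n = 111
alphaprior = betaprior = gammaprior = deltaprior = 0.001

## Toy data and cumulative sums (replace with the disasters series)
y = np.random.poisson(2.0, size=n)
cumsumbefore = np.append(0, np.cumsum(y))[:n]
cumsumafter = np.append(np.cumsum(y[::-1])[::-1], 0)[:n]

def changepoint_conditional(lb, la):
    ## Evaluate all candidate changepoints at once, on the log scale
    m = np.arange(n)
    logp = ((alphaprior + cumsumbefore[m] - 1) * np.log(lb) - (betaprior + m) * lb
            + (gammaprior + cumsumafter[m] - 1) * np.log(la) - (deltaprior + n - m) * la)
    ## Subtract the maximum before exponentiating, to avoid overflow
    p = np.exp(logp - logp.max())
    return p / p.sum()

condalpha = changepoint_conditional(3.0, 1.0)
```

This replaces the inner `for i in range(111)` loop with a single array expression and sidesteps the overflow risk flagged earlier.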