Suppose $Y \sim N(\mu_Y,V_Y)$.
The mean and variance of $Y$ given truncation selection are:
$$ E(Y|Y>t) = \mu_Y + V_Y^{1/2}i \qquad Var(Y|Y>t) = V_Y[1 - i(i-s)] $$
where $$ i = \frac{f(s)}{p}, $$ $f(\cdot)$ is the standard normal density function, $$ s = \frac{t - \mu_Y}{V_Y^{1/2}}, \qquad p = \Pr(Y > t) $$
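These moments can be cross-checked numerically against the `truncated` constructor in the Distributions package, which implements the truncated normal directly (a sketch using the illustrative values $\mu_Y = 10$, $V_Y^{1/2} = 10$, $t = 15$ from the example further below):

```julia
using Distributions

μ, σ, t = 10.0, 10.0, 15.0
s = (t - μ)/σ
i = pdf(Normal(), s)/(1 - cdf(Normal(), s))

# Moments from the formulas above
meanFormula = μ + σ*i
varFormula  = σ^2*(1 - i*(i - s))

# Moments from the built-in truncated normal
td = truncated(Normal(μ, σ), t, Inf)
println((meanFormula, mean(td)))   # both ≈ 21.41
println((varFormula,  var(td)))    # both ≈ 26.85
```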
Start with mean and variance for a standard normal variable given truncation selection.
Let $Z \sim N(0,1)$.
The density function of $Z$ is: $$ f(z) = \sqrt{\frac{1}{2\pi}}e^{-\frac{1}{2}z^2} $$
The density function of $Z$ given truncation selection is $$ f(z|Z>s) = \frac{f(z)}{p}, \quad z > s $$
From the definition of the mean: \begin{equation*} \begin{split} E(Z|Z>s) &= \frac{1}{p} \int_s^{\infty} z f(z)dz\\ &= \frac{1}{p} [-f(z) ] _s^{\infty} \\ &= \frac{f(s)}{p} \\ &= i \end{split} \end{equation*}
because the first derivative of $f(z)$ with respect to $z$ is: \begin{equation*} \begin{split} \frac{d}{dz} f(z) &= \sqrt{\frac{1}{2\pi} } e^{-\frac{1}{2}z^2} (-z)\\ &= -zf(z) \end{split} \end{equation*}
Now, to compute the variance of $Z$ given selection, consider the following identity: \begin{equation*} \begin{split} \frac{d}{dz} z f(z) &= f(z) + z \frac{d}{dz} f(z) \\ &= f(z) - z^2 f(z) \end{split} \end{equation*}
Integrating both sides from $s$ to $\infty$ gives \begin{equation*} \big[zf(z)\big]_s^\infty = \int_s^\infty f(z) dz - \int_s^\infty z^2 f(z)dz \end{equation*} Because $zf(z) \to 0$ as $z \to \infty$, the left side equals $-sf(s)$, and rearranging gives: \begin{equation*} \begin{split} \int_s^\infty z^2 f(z)dz &= \int_s^\infty f(z) dz + sf(s) \\ \frac{1}{p} \int_s^\infty z^2 f(z)dz &= \frac{1}{p} \int_s^\infty f(z) dz + \frac{f(s)}{p}s\\ E(Z^2|Z>s) &= 1 + is \end{split} \end{equation*}
So, \begin{equation} \begin{split} Var(Z|Z>s) &= E(Z^2|Z>s) - [E(Z|Z>s)]^2\\ &= 1 + is - i^2 \\ &= 1 - i(i-s) \end{split} \end{equation}
Results for $Y$ follow from the fact that $$ \mu_Y + V_Y^{1/2}Z \sim N(\mu_Y,V_Y) $$
So, let $$ Y = \mu_Y + V_Y^{1/2}Z, $$ Then, the condition $$ Y>t $$ is equivalent to \begin{equation*} \begin{split} \mu_Y + V_Y^{1/2}Z &> t \\ V_Y^{1/2}Z &> t - \mu_Y \\ Z &> \frac{t - \mu_Y}{V_Y^{1/2}}\\ Z &> s \end{split} \end{equation*}
Then, \begin{equation*} \begin{split} E(Y|Y>t) &= E(\mu_Y + V_Y^{1/2}Z |Z>s) \\ &= \mu_Y + V_Y^{1/2}i, \end{split} \end{equation*} and \begin{equation*} \begin{split} Var(Y|Y>t) &= Var(\mu_Y + V_Y^{1/2}Z |Z>s) \\ &= V_Y[1 - i(i-s)] \end{split} \end{equation*}
using Distributions, Printf

μ = 10
σ = 10
t = 15
s = (t - μ)/σ
d = Normal(0.0, 1.0)
i = pdf(d, s)/(1 - cdf(d, s))      # selection intensity
meanTruncatedNormal = μ + σ*i
variTruncatedNormal = σ^2*(1 - i*(i - s))
@printf "mean = %8.2f \n" meanTruncatedNormal
@printf "variance = %8.2f \n" variTruncatedNormal
mean =    21.41 
variance =    26.85 
using Statistics

μ = 10
σ = 10
z = rand(Normal(μ, σ), 100000);
mcMean = mean(z[z .> t])           # Monte Carlo estimates, same t = 15 as above
mcVar  = var(z[z .> t])
@printf "MC mean = %8.2f \n" mcMean
@printf "MC variance = %8.2f \n" mcVar
MC mean =    21.38 
MC variance =    26.79 
Let $\mathbf{Y} \sim N(\mathbf{\mu},\mathbf{V})$ with
$ \mathbf{\mu} = \begin{bmatrix} 10\\ 20 \end{bmatrix}, $ $ \mathbf{V} = \begin{bmatrix} 100 & 50\\ 50 & 200 \end{bmatrix} $
μ = [10.0;20.0]
V = [100.0 50.0
50.0 200.0]
d = MvNormal(μ,V)
XY = rand(d,10000)'
10000x2 Array{Float64,2}:
  11.9285    19.1144
  19.2509    36.145
   4.45011   23.9638
  20.9049    36.2345
  14.7587    29.454
   7.85305   29.5753
  13.3294    25.8031
  17.942     21.6667
   2.00279   20.1188
  18.2081    29.531
  22.5693     9.33006
  -6.57064    1.51851
   8.77239   28.1197
   ⋮
  18.5498    51.7603
  33.5773    27.9552
  18.3895    26.4949
  24.3136    46.4041
   6.92297    6.27365
  18.2313    22.6488
  -8.52665   28.2996
  20.986     16.3011
  -0.377664  24.9399
  24.0369     7.16565
  12.1059    21.147
   7.17966   23.7316
sel = XY[:,1].>10
10000-element BitArray{1}:
  true
  true
 false
  true
  true
 false
  true
  true
 false
  true
  true
 false
 false
  ⋮
  true
  true
  true
  true
 false
  true
 false
  true
 false
  true
  true
 false
selY = XY[sel,2]
5007-element Array{Float64,1}:
 19.1144
 36.145
 36.2345
 29.454
 25.8031
 21.6667
 29.531
  9.33006
 25.2757
 52.031
 34.1789
 58.6182
 10.67
  ⋮
 23.9181
 36.876
 25.7362
 17.0781
 51.7603
 27.9552
 26.4949
 46.4041
 22.6488
 16.3011
  7.16565
 21.147
mean(selY[selY.>30])
39.060176445384066
var(selY[selY.>30])
53.619263788780245
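The simulated `selY` can also be checked against the truncation formulas derived earlier: writing $Y_2 = \mu_2 + b(Y_1 - \mu_1) + e$ with $b = v_{12}/v_{11}$ and $e$ independent of $Y_1$, selecting on $Y_1 > 10$ gives $E(Y_2|Y_1>10) = \mu_2 + b\,[E(Y_1|Y_1>10) - \mu_1]$ and $Var(Y_2|Y_1>10) = v_{22} - b^2 v_{11} + b^2\,Var(Y_1|Y_1>10)$, where the moments of $Y_1$ under truncation come from the single-variable results. A sketch of this check (these analytical values are for the first round of selection only, before the additional $Y_2 > 30$ cut):

```julia
using Distributions, Statistics

μ = [10.0; 20.0]
V = [100.0 50.0; 50.0 200.0]
XY = rand(MvNormal(μ, V), 100_000)'    # larger sample than above, for a tighter check
selY = XY[XY[:,1] .> 10.0, 2]          # Y₂ values after selecting on Y₁ > 10

# Analytical moments of Y₁ under truncation (here s = 0)
s = (10.0 - μ[1])/sqrt(V[1,1])
i = pdf(Normal(), s)/(1 - cdf(Normal(), s))
meanY1 = μ[1] + sqrt(V[1,1])*i
varY1  = V[1,1]*(1 - i*(i - s))

# Propagate through the regression of Y₂ on Y₁
b = V[1,2]/V[1,1]
meanY2 = μ[2] + b*(meanY1 - μ[1])          # ≈ 23.99
varY2  = V[2,2] - b^2*V[1,1] + b^2*varY1   # ≈ 184.08

println((meanY2, mean(selY)))
println((varY2,  var(selY)))
```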
Often no closed form for $f(\mathbf{\theta}|\mathbf{y})$
Further, even if computing $f(\theta|{\mathbf y})$ is feasible, obtaining $f(\theta_{i}|{\mathbf y})$ would require integrating over many dimensions
Thus, in many situations, inferences are made using the empirical posterior constructed by drawing samples from $f({\mathbf \theta}|{\mathbf y})$
Gibbs sampler is widely used for drawing samples from posteriors
Want to draw samples from $f(x_{1},x_{2},\ldots,x_{n})$
Even though it may be possible to compute $f(x_{1},x_{2},\ldots,x_{n})$, it is difficult to draw samples directly from $f(x_{1},x_{2},\ldots,x_{n})$
Gibbs:
Get a valid starting point $\mathbf{x}^{0}$
Draw sample $\mathbf{x}^{t}$ as: $$\begin{matrix}x_{1}^{t} & \text{from} & f(x_{1}|x_{2}^{t-1},x_{3}^{t-1},\ldots,x_{n}^{t-1})\\ x_{2}^{t} & \text{from} & f(x_{2}|x_{1}^{t},x_{3}^{t-1},\ldots,x_{n}^{t-1})\\ x_{3}^{t} & \text{from} & f(x_{3}|x_{1}^{t},x_{2}^{t},\ldots,x_{n}^{t-1})\\ \vdots & & \vdots\\ x_{n}^{t} & \text{from} & f(x_{n}|x_{1}^{t},x_{2}^{t},\ldots,x_{n-1}^{t}) \end{matrix}$$
The sequence ${\mathbf x}^{1},{\mathbf x}^{2},{\mathbf x}^{3},\ldots$ is a Markov chain with stationary distribution $f(x_{1},x_{2},\ldots,x_{n})$
Can show that samples obtained from a Markov chain can be used to draw inferences from $f(x_{1},x_{2},\ldots,x_{n})$ provided the chain is:
Irreducible: can move from any state $i$ to any other state $j$
Positive recurrent: return time to any state has finite expectation
Markov Chains, J. R. Norris (1997)
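These conditions can be illustrated with a toy two-state chain (an invented example, not from the text): a transition matrix with all entries positive is irreducible and positive recurrent, and repeated transitions drive any starting distribution toward the stationary distribution $\pi$ satisfying $\pi P = \pi$.

```julia
# Two-state transition matrix; rows sum to 1, all entries positive
P = [0.9 0.1;
     0.2 0.8]

# Solving πP = π with π summing to 1 gives π = [2/3, 1/3]
πs = [2/3 1/3]
println(πs*P ≈ πs)        # true: π is invariant under P

# Any starting distribution converges to π under repeated transitions
p0 = [1.0 0.0]
println(p0*P^100)         # ≈ [0.6667 0.3333]
```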
Let $f(\mathbf{x})$ be a bivariate normal density with means $$ \mu' = \begin{bmatrix} 1 & 2 \end{bmatrix} $$ and covariance matrix $$ \mathbf{V} = \begin{bmatrix} 1 & 0.5\\ 0.5& 2.0 \end{bmatrix} $$
Suppose we do not know how to draw samples from $f(\mathbf{x})$, but know how to draw samples from $f(x_i|x_j)$, which is univariate normal with mean: $$ \mu_{i.j} = \mu_i + \frac{v_{ij}}{v_{jj}}(x_j - \mu_j) $$ and variance $$ v_{i.j} = v_{ii} - \frac{v^2_{ij}}{v_{jj}} $$
using Distributions, Printf

nSamples = 20000
m = [1.0, 2.0]
v = [1.0 0.5; 0.5 2.0]
y = fill(0.0, 2)
save = fill(0.0, 2, nSamples)
runSum = fill(0.0, 2)
# conditional standard deviations are constant across iterations
s12 = sqrt(v[1,1] - v[1,2]^2/v[2,2])
s21 = sqrt(v[2,2] - v[1,2]^2/v[1,1])
for iter in 1:nSamples
    # sample x₁ from its full conditional, given the current x₂
    m12 = m[1] + v[1,2]/v[2,2]*(y[2] - m[2])
    y[1] = rand(Normal(m12, s12))
    # sample x₂ from its full conditional, given the new x₁
    m21 = m[2] + v[1,2]/v[1,1]*(y[1] - m[1])
    y[2] = rand(Normal(m21, s21))
    runSum += y
    save[:,iter] = y
    runMean = runSum/iter
    if iter % 1000 == 0
        @printf "%10d %8.2f %8.2f \n" iter runMean[1] runMean[2]
    end
end
      1000     1.00     1.96 
      2000     1.02     2.01 
      3000     1.01     2.00 
      4000     1.02     2.02 
      5000     1.01     2.01 
      6000     1.01     2.01 
      7000     1.01     2.02 
      8000     1.00     2.01 
      9000     1.00     2.00 
     10000     1.00     2.00 
     11000     1.00     2.01 
     12000     1.00     2.00 
     13000     1.01     2.00 
     14000     1.01     2.00 
     15000     1.01     2.00 
     16000     1.00     1.99 
     17000     1.00     1.99 
     18000     1.00     1.99 
     19000     1.00     1.99 
     20000     1.00     2.00 
cov(save')
2x2 Array{Float64,2}:
 1.00471   0.499934
 0.499934  1.97734
Sometimes it may not be possible to draw samples directly from the full-conditional distribution $f(x_i|\mathbf{x}_{-i})$
Convergence of the Gibbs sampler may be too slow
Metropolis-Hastings (MH) for sampling from $f(\mathbf{x})$:
a candidate sample, $y$, is drawn from a proposal distribution $q(y|x^{t-1})$ and accepted with probability $$ \alpha = \min\left(1, \frac{f(y)\,q(x^{t-1}|y)}{f(x^{t-1})\,q(y|x^{t-1})}\right) $$
$$ x^t = \begin{cases} y & \text{with probability}\, \alpha \\ x^{t-1} & \text{with probability}\, 1 - \alpha \\ \end{cases} $$
With a symmetric proposal, $q(x^{t-1}|y) = q(y|x^{t-1})$, so $\alpha$ reduces to $\min\left(1, f(y)/f(x^{t-1})\right)$; this is the case for the random-walk proposal used below.
nSamples = 10000
m = [1.0, 2.0]
v = [1.0 0.5; 0.5 2.0]
vi = inv(v)
y = fill(0.0, 2)
runSum = fill(0.0, 2)
# random-walk proposal: uniform steps scaled by the marginal standard deviations
delta = 1.0
d1 = Uniform(-delta*sqrt(v[1,1]), delta*sqrt(v[1,1]))
d2 = Uniform(-delta*sqrt(v[2,2]), delta*sqrt(v[2,2]))
z = y - m
denOld = exp(-0.5*z'*vi*z)        # unnormalized target density at the current state
ynew = fill(0.0, 2)
for iter in 1:nSamples
    ynew[1] = y[1] + rand(d1)
    ynew[2] = y[2] + rand(d2)
    denNew = exp(-0.5*(ynew - m)'*vi*(ynew - m))
    alpha = denNew/denOld         # symmetric proposal: ratio of target densities
    if rand() < alpha
        y = copy(ynew)
        denOld = denNew
    end
    runSum += y
    runMean = runSum/iter
    if iter % 1000 == 0
        @printf "%10d %8.2f %8.2f \n" iter runMean[1] runMean[2]
    end
end
      1000     0.93     2.14 
      2000     0.96     2.07 
      3000     0.94     2.06 
      4000     0.96     2.07 
      5000     0.94     2.04 
      6000     0.94     2.02 
      7000     0.93     2.01 
      8000     0.95     2.01 
      9000     0.97     2.00 
     10000     0.97     1.98 
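A diagnostic the example above does not report is the acceptance rate, the usual guide for tuning the step size `delta`: very small steps are almost always accepted but explore the target slowly, while very large steps are mostly rejected. A sketch (an added illustration wrapping the same random-walk sampler in a function; `mhAcceptRate` is not from the original):

```julia
using Distributions

# Fraction of accepted moves for a random-walk MH sampler on the bivariate
# normal target used above, with uniform steps of half-width delta times
# the marginal standard deviations
function mhAcceptRate(delta; nSamples = 10_000)
    m  = [1.0, 2.0]
    v  = [1.0 0.5; 0.5 2.0]
    vi = inv(v)
    step1 = Uniform(-delta*sqrt(v[1,1]), delta*sqrt(v[1,1]))
    step2 = Uniform(-delta*sqrt(v[2,2]), delta*sqrt(v[2,2]))
    y = [0.0, 0.0]
    denOld = exp(-0.5*(y - m)'*vi*(y - m))
    nAccept = 0
    for _ in 1:nSamples
        ynew = [y[1] + rand(step1), y[2] + rand(step2)]
        denNew = exp(-0.5*(ynew - m)'*vi*(ynew - m))
        if rand() < denNew/denOld     # symmetric proposal
            y, denOld = ynew, denNew
            nAccept += 1
        end
    end
    nAccept/nSamples
end

println(mhAcceptRate(0.1))   # close to 1: tiny, slow-mixing steps
println(mhAcceptRate(1.0))   # moderate
println(mhAcceptRate(5.0))   # mostly rejected
```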