Exercise

  1. Use Julia to simulate a vector of 1000 values for $\beta_{1}$ from a normal distribution with mean zero and variance 3. Plot a histogram of these values.

  2. Use $\beta_{0}=1$, $\beta_{1}=2$ and $\sigma_{e}^{2}=5$, to generate a vector of observations, $\mathbf{y},$ that follows a simple linear regression model.

  3. Use the Gibbs sampler to draw 10,000 samples for $\beta_{1}$ from its posterior distribution.

    1. Compute the mean and variance of the sampled values.

    2. Draw a histogram of the sampled values. Compare with prior.

Simulation of Data

In [1]:
using Distributions
using StatsBase
In [11]:
n = 10 #number of observations
k = 1    #number of covariates

x = sample([0,1,2],(n,k))
X = [ones(n) x]

betaTrue = [1,2]
y = X*betaTrue+ randn(n)*sqrt(5);

Calculations in Julia:

In [12]:
# loop for Gibbs sampler
niter = 10000 # number of samples
b     = [0.0, 0.0]
meanB = [0.0, 0.0]
a=Float64[]


varBeta = 3
varRes  = 5
λ       =varRes/varBeta

for iter = 1:niter
    
    # sampling intercept
    w = y - X[:,2] * b[2]
    x = X[:,1]
    xpxi = 1/(x'x)[1,1]
    bHat = (xpxi*x'w)[1,1] 
    b[1] = rand(Normal(bHat, sqrt(xpxi*varRes))) # using residual var = 1 
    
    # sampling slope
    w = y - X[:,1]*b[1]
    x = X[:,2]
    xpxi = 1/((x'x)[1,1]+λ)
    bHat = (xpxi*x'w)[1,1]
    b[2] = rand(Normal(bHat, sqrt(xpxi*varRes))) # using residual var = 1 
    meanB = meanB + b   
    push!(a,b[2])


    if ((iter%1000) == 0)
        @printf("Intercept = %6.3f \n", meanB[1]/iter)
        @printf("Slope     = %6.3f \n", meanB[2]/iter)
    end
end
Intercept =  1.625 
Slope     =  1.285 
Intercept =  1.622 
Slope     =  1.287 
Intercept =  1.653 
Slope     =  1.266 
Intercept =  1.617 
Slope     =  1.287 
Intercept =  1.613 
Slope     =  1.296 
Intercept =  1.620 
Slope     =  1.292 
Intercept =  1.636 
Slope     =  1.283 
Intercept =  1.637 
Slope     =  1.282 
Intercept =  1.633 
Slope     =  1.283 
Intercept =  1.638 
Slope     =  1.279 
In [6]:
using Gadfly
In [7]:
plot(x=randn(1000)*sqrt(3), Geom.histogram, 
Guide.title("Prior distribution of β1"),
Guide.ylabel("Frequency"),
Guide.xlabel("β1"))
Out[7]:
β1 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 -20.0 -19.5 -19.0 -18.5 -18.0 -17.5 -17.0 -16.5 -16.0 -15.5 -15.0 -14.5 -14.0 -13.5 -13.0 -12.5 -12.0 -11.5 -11.0 -10.5 -10.0 -9.5 -9.0 -8.5 -8.0 -7.5 -7.0 -6.5 -6.0 -5.5 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 13.0 13.5 14.0 14.5 15.0 15.5 16.0 16.5 17.0 17.5 18.0 18.5 19.0 19.5 20.0 20.5 21.0 21.5 22.0 22.5 23.0 23.5 24.0 24.5 25.0 -20 0 20 40 -20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 40 45 50 55 -25 -24 -23 -22 -21 -20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 -25 0 25 50 -26 -24 -22 -20 -18 -16 -14 -12 -10 -8 -6 -4 -2 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 Frequency Prior distribution of β1
In [13]:
mean(a)
Out[13]:
1.278652924515963
In [14]:
var(a)
Out[14]:
0.6829740692336969
In [15]:
plot(x=a, Geom.histogram, 
Guide.title("Posterior distribution of β1 with n=10"),
Guide.ylabel("Frequency"),
Guide.xlabel("β1"))
Out[15]:
β1 -12.5 -10.0 -7.5 -5.0 -2.5 0.0 2.5 5.0 7.5 10.0 12.5 15.0 -10.0 -9.5 -9.0 -8.5 -8.0 -7.5 -7.0 -6.5 -6.0 -5.5 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 -10 0 10 20 -10.0 -9.5 -9.0 -8.5 -8.0 -7.5 -7.0 -6.5 -6.0 -5.5 -5.0 -4.5 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 -300 -250 -200 -150 -100 -50 0 50 100 150 200 250 300 350 400 450 500 550 -250 -240 -230 -220 -210 -200 -190 -180 -170 -160 -150 -140 -130 -120 -110 -100 -90 -80 -70 -60 -50 -40 -30 -20 -10 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240 250 260 270 280 290 300 310 320 330 340 350 360 370 380 390 400 410 420 430 440 450 460 470 480 490 500 -250 0 250 500 -260 -240 -220 -200 -180 -160 -140 -120 -100 -80 -60 -40 -20 0 20 40 60 80 100 120 140 160 180 200 220 240 260 280 300 320 340 360 380 400 420 440 460 480 500 Frequency Posterior distribution of β1 with n=10
In [17]:
n = 1000 #number of observations
k = 1    #number of covariates

x = sample([0,1,2],(n,k))
X = [ones(n) x]

betaTrue = [1,2]
y = X*betaTrue+ randn(n)*sqrt(5);
# loop for Gibbs sampler
niter = 10000 # number of samples
b     = [0.0, 0.0]
meanB = [0.0, 0.0]
a=Float64[]


varBeta = 3
varRes  = 5
λ       =varRes/varBeta

for iter = 1:niter
    
    # sampling intercept
    w = y - X[:,2] * b[2]
    x = X[:,1]
    xpxi = 1/(x'x)[1,1]
    bHat = (xpxi*x'w)[1,1] 
    b[1] = rand(Normal(bHat, sqrt(xpxi*varRes))) # using residual var = 1 
    
    # sampling slope
    w = y - X[:,1]*b[1]
    x = X[:,2]
    xpxi = 1/((x'x)[1,1]+λ)
    bHat = (xpxi*x'w)[1,1]
    b[2] = rand(Normal(bHat, sqrt(xpxi*varRes))) # using residual var = 1 
    meanB = meanB + b   
    push!(a,b[2])


    if ((iter%1000) == 0)
        @printf("Intercept = %6.3f \n", meanB[1]/iter)
        @printf("Slope     = %6.3f \n", meanB[2]/iter)
    end
end
Intercept =  0.904 
Slope     =  2.043 
Intercept =  0.893 
Slope     =  2.051 
Intercept =  0.894 
Slope     =  2.050 
Intercept =  0.894 
Slope     =  2.051 
Intercept =  0.892 
Slope     =  2.052 
Intercept =  0.891 
Slope     =  2.053 
Intercept =  0.891 
Slope     =  2.053 
Intercept =  0.893 
Slope     =  2.052 
Intercept =  0.894 
Slope     =  2.051 
Intercept =  0.894 
Slope     =  2.051 
In [18]:
mean(a)
Out[18]:
2.0506345429317
In [19]:
var(a)
Out[19]:
0.00798380824289942
In [21]:
plot(x=a, Geom.histogram, 
Guide.title("Posterior distribution of β1 with n=1000"),
Guide.ylabel("Frequency"),
Guide.xlabel("β1"))
Out[21]:
β1 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 -2.5 -2.4 -2.3 -2.2 -2.1 -2.0 -1.9 -1.8 -1.7 -1.6 -1.5 -1.4 -1.3 -1.2 -1.1 -1.0 -0.9 -0.8 -0.7 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.0 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 5.0 -2.5 0.0 2.5 5.0 -2.6 -2.4 -2.2 -2.0 -1.8 -1.6 -1.4 -1.2 -1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8 4.0 4.2 4.4 4.6 4.8 5.0 -600 -500 -400 -300 -200 -100 0 100 200 300 400 500 600 700 800 900 1000 1100 -500 -480 -460 -440 -420 -400 -380 -360 -340 -320 -300 -280 -260 -240 -220 -200 -180 -160 -140 -120 -100 -80 -60 -40 -20 0 20 40 60 80 100 120 140 160 180 200 220 240 260 280 300 320 340 360 380 400 420 440 460 480 500 520 540 560 580 600 620 640 660 680 700 720 740 760 780 800 820 840 860 880 900 920 940 960 980 1000 -500 0 500 1000 -500 -450 -400 -350 -300 -250 -200 -150 -100 -50 0 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000 Frequency Posterior distribution of β1 with n=1000