Consider the multiple regression model
$$y_{i}=\beta_{0}+\sum_{j}x_{ij}\beta_{j}+e_{i}\tag{2}$$which extends model (1) to include multiple covariates $x_{ij}$. In matrix notation, this model can be written as
$$ \mathbf{y}=\mathbf{X}\boldsymbol{\beta}+\mathbf{e}, $$where $\boldsymbol{\beta}'=[\beta_{0},\beta_{1},\beta_{2},\ldots,\beta_{k}]$ and the matrix $\mathbf{X}$ contains the corresponding covariates.
Here we consider a model with a flat prior for $\beta_{0}$ and iid normal priors for the slopes:
$$ \beta_{j}\sim N(0,\sigma_{\beta}^{2})\:\text{for}\, j=1,2,\ldots,k, $$where $\sigma_{\beta}^{2}$ is assumed to be known. The residuals are assumed iid normal with zero mean and variance $\sigma_{e}^{2}$, which itself is assigned a scaled inverted chi-square prior. Then, the joint posterior for $\boldsymbol{\theta}=(\boldsymbol{\beta},\sigma_{e}^{2})$ is
$$\begin{aligned} f(\boldsymbol{\theta}|\mathbf{y}) & \propto f(\mathbf{y}|\boldsymbol{\theta})f(\boldsymbol{\theta})\\ & \propto \left(\sigma_{e}^{2}\right)^{-n/2}\exp\left\{ -\frac{(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})'(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})}{2\sigma_{e}^{2}}\right\} \\ & \times \left(\sigma_{\beta}^{2}\right)^{-k/2}\exp\left\{ -\frac{\sum_{j=1}^{k}\beta_{j}^{2}}{2\sigma_{\beta}^{2}}\right\} \\ & \times (\sigma_{e}^{2})^{-(2+\nu_{e})/2}\exp\left\{ -\frac{\nu_{e}S_{e}^{2}}{2\sigma_{e}^{2}}\right\} . \end{aligned}$$The posterior distribution for $\boldsymbol{\beta}$ can be written as
$$\begin{aligned} f(\boldsymbol{\beta}|\mathbf{y},\sigma_{\beta}^{2},\sigma_{e}^{2}) & =\frac{f(\mathbf{y}|\boldsymbol{\beta},\sigma_{\beta}^{2},\sigma_{e}^{2})f(\boldsymbol{\beta}|\sigma_{\beta}^{2})f(\sigma_{e}^{2})}{f(\mathbf{y},\sigma_{\beta}^{2},\sigma_{e}^{2})}\\ & \propto f(\mathbf{y}|\boldsymbol{\beta},\sigma_{\beta}^{2},\sigma_{e}^{2})f(\boldsymbol{\beta}|\sigma_{\beta}^{2})f(\sigma_{e}^{2})\\ & \propto f(\mathbf{y}|\boldsymbol{\beta},\sigma_{\beta}^{2},\sigma_{e}^{2})f(\boldsymbol{\beta}|\sigma_{\beta}^{2})\\ & \propto \left(\sigma_{e}^{2}\right)^{-n/2}\exp\left\{ -\frac{(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})'(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})}{2\sigma_{e}^{2}}\right\} \\ & \times \left(\sigma_{\beta}^{2}\right)^{-k/2}\exp\left\{ -\frac{\sum_{j=1}^{k}\beta_{j}^{2}}{2\sigma_{\beta}^{2}}\right\} \\ & \propto \exp\left\{ -\frac{(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})'(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})+\frac{\sigma_{e}^{2}}{\sigma_{\beta}^{2}}\sum_{j=1}^{k}\beta_{j}^{2}}{2\sigma_{e}^{2}}\right\} \\ & \propto \exp\left\{ -\frac{\mathbf{y}'\mathbf{y}-2\mathbf{y}'\mathbf{X}\boldsymbol{\beta}+\boldsymbol{\beta}'(\mathbf{X}'\mathbf{X}+\mathbf{D}\frac{\sigma_{e}^{2}}{\sigma_{\beta}^{2}})\boldsymbol{\beta}}{2\sigma_{e}^{2}}\right\} \\ & \propto \exp\left\{ -\frac{\mathbf{y}'\mathbf{y}+(\boldsymbol{\beta}-\hat{\boldsymbol{\beta}})'(\mathbf{X}'\mathbf{X}+\mathbf{D}\frac{\sigma_{e}^{2}}{\sigma_{\beta}^{2}})(\boldsymbol{\beta}-\hat{\boldsymbol{\beta}})-\hat{\boldsymbol{\beta}}'(\mathbf{X}'\mathbf{X}+\mathbf{D}\frac{\sigma_{e}^{2}}{\sigma_{\beta}^{2}})\hat{\boldsymbol{\beta}}}{2\sigma_{e}^{2}}\right\} \\ & \propto \exp\left\{ -\frac{(\boldsymbol{\beta}-\hat{\boldsymbol{\beta}})'(\mathbf{X}'\mathbf{X}+\mathbf{D}\frac{\sigma_{e}^{2}}{\sigma_{\beta}^{2}})(\boldsymbol{\beta}-\hat{\boldsymbol{\beta}})}{2\sigma_{e}^{2}}\right\} , \end{aligned}$$for
$$(\mathbf{X}'\mathbf{X}+\mathbf{D}\frac{\sigma_{e}^{2}}{\sigma_{\beta}^{2}})\hat{\boldsymbol{\beta}}=\mathbf{X}'\mathbf{y},\tag{3}$$where $\mathbf{D}$ is a diagonal matrix with a zero in the first diagonal position (corresponding to the intercept, which has a flat prior) and ones in the remaining positions. Thus, the full-conditional posterior for $\boldsymbol{\beta}$ is a normal distribution with mean $\hat{\boldsymbol{\beta}}$, the solution of (3), and variance $(\mathbf{X}'\mathbf{X}+\mathbf{D}\frac{\sigma_{e}^{2}}{\sigma_{\beta}^{2}})^{-1}\sigma_{e}^{2}.$
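As a concrete illustration, the coefficient matrix of (3) can be built and solved numerically. The sketch below uses simulated data; the dimensions and variance values (`n`, `k`, `var_e`, `var_b`) are illustrative choices, not values from the text.

```python
import numpy as np

# Hypothetical dimensions and variances; any positive values work.
rng = np.random.default_rng(0)
n, k = 10, 15
var_e, var_b = 1.0, 0.5                      # sigma_e^2, sigma_beta^2 (known)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = rng.normal(scale=np.sqrt(var_b), size=k + 1)
y = X @ beta_true + rng.normal(scale=np.sqrt(var_e), size=n)

# D: zero in the intercept position, ones elsewhere, so the
# flat-prior intercept is not shrunk.
D = np.eye(k + 1)
D[0, 0] = 0.0

# Solve (X'X + D*var_e/var_b) beta_hat = X'y, i.e. equation (3).
C = X.T @ X + D * (var_e / var_b)
beta_hat = np.linalg.solve(C, X.T @ y)
```

Even with $n<k$, the shrinkage term $\mathbf{D}\sigma_{e}^{2}/\sigma_{\beta}^{2}$ makes the coefficient matrix full rank, so the system has a unique solution.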
The full conditionals for $\beta_{0}$ and $\sigma_{e}^{2}$ are identical to those in simple linear regression.
The full-conditional for $\beta_{j}$ is obtained by dropping from the joint posterior all terms and factors that do not involve $\beta_{j}$:
$$\begin{aligned} f(\beta_{j}|\text{ELSE}) & \propto \exp\left\{ -\frac{(\mathbf{w}_{j}-\mathbf{x}_{j}\beta_{j})'(\mathbf{w}_{j}-\mathbf{x}_{j}\beta_{j})}{2\sigma_{e}^{2}}\right\} \\ & \times \exp\left\{ -\frac{\beta_{j}^{2}}{2\sigma_{\beta}^{2}}\right\} \\ & \propto \exp\left\{ -\frac{\mathbf{w}_{j}'\mathbf{w}_{j}-2\mathbf{w}_{j}'\mathbf{x}_{j}\beta_{j}+\beta_{j}^{2}(\mathbf{x}_{j}'\mathbf{x}_{j}+\sigma_{e}^{2}/\sigma_{\beta}^{2})}{2\sigma_{e}^{2}}\right\} \\ & \propto \exp\left\{ -\frac{\mathbf{w}_{j}'\mathbf{w}_{j}+(\beta_{j}-\hat{\beta}_{j})^{2}(\mathbf{x}_{j}'\mathbf{x}_{j}+\sigma_{e}^{2}/\sigma_{\beta}^{2})-\hat{\beta}_{j}^{2}(\mathbf{x}_{j}'\mathbf{x}_{j}+\sigma_{e}^{2}/\sigma_{\beta}^{2})}{2\sigma_{e}^{2}}\right\} \\ & \propto \exp\left\{ -\frac{(\beta_{j}-\hat{\beta}_{j})^{2}}{\frac{2\sigma_{e}^{2}}{(\mathbf{x}_{j}'\mathbf{x}_{j}+\sigma_{e}^{2}/\sigma_{\beta}^{2})}}\right\} , \end{aligned}$$where $$ \hat{\beta}_{j}=\frac{\mathbf{x}_{j}'\mathbf{w}_{j}}{(\mathbf{x}_{j}'\mathbf{x}_{j}+\sigma_{e}^{2}/\sigma_{\beta}^{2})}, $$ and $\mathbf{w}_{j}=\mathbf{y}- \mathbf{1}\beta_0 - \sum_{l\neq j}\mathbf{x}_{l}\beta_{l}.$ So, the full-conditional posterior for $\beta_{j}$ is a normal distribution with mean $\hat{\beta}_{j}$ and variance $\frac{\sigma_{e}^{2}}{(\mathbf{x}_{j}'\mathbf{x}_{j}+\sigma_{e}^{2}/\sigma_{\beta}^{2})}.$
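The full-conditional mean and variance above translate directly into a single-site Gibbs update. The helper below is a minimal sketch (the function name and arguments are our own); it returns the mean and variance of $f(\beta_{j}|\text{ELSE})$, from which a Gibbs draw is one normal sample.

```python
import numpy as np

def beta_j_full_conditional(j, y, X, beta, var_e, var_b):
    """Mean and variance of f(beta_j | ELSE) for a single-site Gibbs update."""
    xj = X[:, j]
    w = y - X @ beta + xj * beta[j]   # w_j: y adjusted for all other effects
    c = xj @ xj + var_e / var_b       # x_j'x_j + sigma_e^2 / sigma_beta^2
    return xj @ w / c, var_e / c      # (mean, variance)

# One Gibbs draw for beta_j is then a single normal sample:
rng = np.random.default_rng(0)
y = np.array([2.0, 0.0])
X = np.eye(2)
beta = np.zeros(2)
m, v = beta_j_full_conditional(0, y, X, beta, 1.0, 1.0)
beta[0] = rng.normal(m, np.sqrt(v))
```

Computing $\mathbf{w}_{j}$ as $\mathbf{y}-\mathbf{X}\boldsymbol{\beta}+\mathbf{x}_{j}\beta_{j}$ avoids rebuilding the sum over $l\neq j$ at every step.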
Generate a data set with 10 observations from model (2) with $k=15$ covariates.
Set up and solve the mixed model equations given by (3).
Sample the elements of $\boldsymbol{\beta}$ using Gibbs.
Compute the posterior mean of $\boldsymbol{\beta}$ from the samples and compare with the mixed model solutions.
Compute the posterior covariance matrix from the sampled values. Compare results with the inverse of the mixed-model coefficient matrix multiplied by $\sigma_{e}^{2}$.
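The steps above can be sketched end to end as follows. This is one possible solution under illustrative choices for `n`, `k`, `var_e`, and `var_b` (the variances are treated as known, as in this section), using a single-site Gibbs sampler.

```python
import numpy as np

# Illustrative simulated data; n, k and the variances are arbitrary choices.
rng = np.random.default_rng(1)
n, k = 10, 15
var_e, var_b = 1.0, 0.5                      # treated as known here
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = rng.normal(scale=np.sqrt(var_b), size=k + 1)
y = X @ beta_true + rng.normal(scale=np.sqrt(var_e), size=n)

# Mixed model equations (3): coefficient matrix and solution
D = np.eye(k + 1)
D[0, 0] = 0.0                                # flat prior on the intercept
C = X.T @ X + D * (var_e / var_b)
beta_mme = np.linalg.solve(C, X.T @ y)

# Single-site Gibbs sampler for beta with known variances
n_iter, burn_in = 12000, 2000
beta = np.zeros(k + 1)
diag_xx = np.einsum('ij,ij->j', X, X)        # x_j'x_j for each column
samples = np.empty((n_iter, k + 1))
for it in range(n_iter):
    for j in range(k + 1):
        # x_j'w_j = x_j'(y - X beta) + x_j'x_j * beta_j
        rhs = X[:, j] @ (y - X @ beta) + diag_xx[j] * beta[j]
        c = diag_xx[j] + (0.0 if j == 0 else var_e / var_b)
        beta[j] = rng.normal(rhs / c, np.sqrt(var_e / c))
    samples[it] = beta

post_mean = samples[burn_in:].mean(axis=0)   # compare with beta_mme
post_cov = np.cov(samples[burn_in:].T)       # compare with inv(C) * var_e
```

With known variances the posterior of $\boldsymbol{\beta}$ is exactly normal, so the posterior mean should match the MME solution and the sample covariance should match $(\mathbf{X}'\mathbf{X}+\mathbf{D}\sigma_{e}^{2}/\sigma_{\beta}^{2})^{-1}\sigma_{e}^{2}$ up to Monte Carlo error.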
In the previous section, we assumed that $\sigma_{\beta}^{2}$ in the prior of the slopes was known. Here, we will consider this variance to be unknown with a scaled inverted chi-square prior with scale parameter $S_{\beta}^{2}$ and degrees of freedom $\nu_{\beta}.$ The joint posterior for this model is
$$\begin{aligned} f(\boldsymbol{\theta}|\mathbf{y}) & \propto f(\mathbf{y}|\boldsymbol{\theta})f(\boldsymbol{\theta})\\ & \propto \left(\sigma_{e}^{2}\right)^{-n/2}\exp\left\{ -\frac{(\mathbf{y}-\mathbf{\mathbf{X}}\boldsymbol{\beta})'(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})}{2\sigma_{e}^{2}}\right\} \\ & \times \left(\sigma_{\beta}^{2}\right)^{-k/2}\exp\left\{ -\frac{\sum_{j=1}^{k}\beta_{j}^{2}}{2\sigma_{\beta}^{2}}\right\} \\ & \times \left(\sigma_{\beta}^{2}\right)^{-(2+\nu_{\beta})/2}\exp\left\{ -\frac{\nu_{\beta}S_{\beta}^{2}}{2\sigma_{\beta}^{2}}\right\} \\ & \times (\sigma_{e}^{2})^{-(2+\nu_{e})/2}\exp\left\{ -\frac{\nu_{e}S_{e}^{2}}{2\sigma_{e}^{2}}\right\} .\\\end{aligned}$$Then, the full-conditional posterior for $\sigma_{\beta}^{2}$ is
$$\begin{aligned} f(\sigma_{\beta}^{2}|\mathbf{y},\boldsymbol{\beta},\sigma_{e}^{2}) & \propto \left(\sigma_{\beta}^{2}\right)^{-k/2}\exp\left\{ -\frac{\sum_{j=1}^{k}\beta_{j}^{2}}{2\sigma_{\beta}^{2}}\right\} \\ & \times \left(\sigma_{\beta}^{2}\right)^{-(2+\nu_{\beta})/2}\exp\left\{ -\frac{\nu_{\beta}S_{\beta}^{2}}{2\sigma_{\beta}^{2}}\right\} \\ & \propto \left(\sigma_{\beta}^{2}\right)^{-(2+k+\nu_{\beta})/2}\exp\left\{ -\frac{\sum_{j=1}^{k}\beta_{j}^{2}+\nu_{\beta}S_{\beta}^{2}}{2\sigma_{\beta}^{2}}\right\} , \end{aligned}$$which can be recognized as a scaled inverted chi-square distribution with $\tilde{\nu}_{\beta}=k+\nu_{\beta}$ degrees of freedom and scale parameter $\tilde{S}_{\beta}^{2}=(\sum_{j=1}^{k}\beta_{j}^{2}+\nu_{\beta}S_{\beta}^{2})/\tilde{\nu}_{\beta}.$ A sample from this posterior can be obtained as $\frac{\sum_{j=1}^{k}\beta_{j}^{2}+\nu_{\beta}S_{\beta}^{2}}{\chi_{\tilde{\nu}_{\beta}}^{2}}$.
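As the closing sentence indicates, a draw from this full conditional only requires one chi-square deviate. A minimal sketch (the function name and the numeric values are hypothetical):

```python
import numpy as np

def sample_var_beta(beta_slopes, S2_b, nu_b, rng):
    """One draw of sigma_beta^2 from its scaled inverted chi-square
    full conditional: (sum beta_j^2 + nu_b*S2_b) / chi2(k + nu_b)."""
    nu_tilde = len(beta_slopes) + nu_b
    scale_sum = np.sum(np.square(beta_slopes)) + nu_b * S2_b
    return scale_sum / rng.chisquare(nu_tilde)

# Hypothetical values: 15 slopes all equal to 1, S2_b = 1, nu_b = 4.
rng = np.random.default_rng(2)
draws = np.array([sample_var_beta(np.ones(15), 1.0, 4, rng)
                  for _ in range(100000)])
# For these values E[sigma_beta^2] = (15 + 4) / (19 - 2) = 19/17
```

The sanity check in the comment uses the mean of a scaled inverted chi-square, $\tilde{\nu}\tilde{S}^{2}/(\tilde{\nu}-2)$, valid for $\tilde{\nu}>2$.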
Extend the sampler used in the previous section to treat $\sigma_{\beta}^{2}$ as an unknown. Plot the posterior distribution for $\sigma_{\beta}^{2}$.
Here we consider a model where the prior for the slope corresponding to covariate $j$ is normal with mean 0 and variance $\sigma_{j}^{2}$, where $\sigma_{j}^{2}$ has scaled inverted chi-square prior with scale parameter $S_{\beta}^{2}$ and degrees of freedom $\nu_{\beta}.$ The joint posterior for this model is
$$\begin{aligned} f(\boldsymbol{\theta}|\mathbf{y}) & \propto f(\mathbf{y}|\boldsymbol{\theta})f(\boldsymbol{\theta})\\ & \propto \left(\sigma_{e}^{2}\right)^{-n/2}\exp\left\{ -\frac{(\mathbf{y}-\mathbf{\mathbf{X}}\boldsymbol{\beta})'(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})}{2\sigma_{e}^{2}}\right\} \\ & \times \prod_{j=1}^{k}\left(\sigma_{j}^{2}\right)^{-1/2}\exp\left\{ -\frac{\beta_{j}^{2}}{2\sigma_{j}^{2}}\right\} \\ & \times \prod_{j=1}^{k}\left(\sigma_{j}^{2}\right)^{-(2+\nu_{\beta})/2}\exp\left\{ -\frac{\nu_{\beta}S_{\beta}^{2}}{2\sigma_{j}^{2}}\right\} \\ & \times (\sigma_{e}^{2})^{-(2+\nu_{e})/2}\exp\left\{ -\frac{\nu_{e}S_{e}^{2}}{2\sigma_{e}^{2}}\right\} .\\\end{aligned}$$It can be shown that:
The full-conditional posterior for $\beta_{j}$ is normal with mean $$\hat{\beta}_{j}=\frac{\mathbf{x}_{j}'\mathbf{w}_{j}}{(\mathbf{x}_{j}'\mathbf{x}_{j}+\sigma_{e}^{2}/\sigma_{j}^{2})},$$ and variance $\frac{\sigma_{e}^{2}}{(\mathbf{x}_{j}'\mathbf{x}_{j}+\sigma_{e}^{2}/\sigma_{j}^{2})}$.
The full-conditional posterior for $\sigma_{j}^{2}$ is a scaled inverted chi-square distribution with $\tilde{\nu}_{\beta}=1+\nu_{\beta}$ degrees of freedom and scale parameter $\tilde{S}_{\beta}^{2}=(\beta_{j}^{2}+\nu_{\beta}S_{\beta}^{2})/\tilde{\nu}_{\beta}.$ A sample from this posterior can be obtained as $\frac{\beta_{j}^{2}+\nu_{\beta}S_{\beta}^{2}}{\chi_{\tilde{\nu}_{\beta}}^{2}}$.
Marginally, the prior for $\beta_{j}$ is a scaled $t$ distribution with $\nu_{\beta}$ degrees of freedom, mean 0 and scale parameter $S_{\beta}^{2}.$
Derive the full-conditional posterior for $\beta_{j}.$
Derive the full-conditional posterior for $\sigma_{j}^{2}.$
Use a Gibbs sampler to compute the posterior mean of $\boldsymbol{\beta}$.
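One full pass of the sampler for this model might look as follows. This is a sketch under the full conditionals stated above; the function name, the in-place residual update, and the convention that column 0 of `X` is the flat-prior intercept are our own choices.

```python
import numpy as np

def gibbs_iteration(y, X, beta, var_j, var_e, S2_b, nu_b, rng):
    """One pass over the effects: sample each beta_j from its normal full
    conditional, then its own variance sigma_j^2 from a scaled inverted
    chi-square. Column 0 of X is the intercept (flat prior, no shrinkage)."""
    resid = y - X @ beta
    for j in range(X.shape[1]):
        xj = X[:, j]
        resid += xj * beta[j]                      # resid is now w_j
        shrink = 0.0 if j == 0 else var_e / var_j[j]
        c = xj @ xj + shrink
        beta[j] = rng.normal(xj @ resid / c, np.sqrt(var_e / c))
        resid -= xj * beta[j]
        if j > 0:                                  # sigma_j^2 for slopes only
            var_j[j] = (beta[j] ** 2 + nu_b * S2_b) / rng.chisquare(1 + nu_b)
    return beta, var_j
```

Keeping a running residual and adding back $\mathbf{x}_{j}\beta_{j}$ before each update gives $\mathbf{w}_{j}$ without recomputing the full sum each time.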
As before, a flat prior is used for the intercept, now denoted $\mu$. The prior for slope $j$ is a mixture:
$$ \beta_{j}=\begin{cases} 0 & \text{with probability}\;\pi\\ \sim N(0,\sigma_{\beta}^{2}) & \text{with probability}\;(1-\pi) \end{cases}, $$where $\sigma_{\beta}^{2}$ has a scaled inverted chi-square prior with scale parameter $S_{\beta}^{2}$ and degrees of freedom $\nu_{\beta}.$ In order to use the Gibbs sampler, it is convenient to write $\beta_{j}$ as
$$\beta_{j}=\delta_{j}\gamma_{j},$$where $\delta_{j}$ is a Bernoulli variable with probability $1-\pi$ of being 1:
$$\delta_{j}=\begin{cases} 0 & \text{with probability}\;\pi\\ 1 & \text{with probability}\;(1-\pi) \end{cases},$$and $\gamma_{j}$ is normally distributed with mean zero and variance $\sigma_{\beta}^{2}.$ Then, the model for the phenotypic values can be written as $$y_{i}=\mu+\sum_{j=1}^{k}X_{ij}\gamma_{j}\delta_{j}+e_{i}.$$
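The decomposition $\beta_{j}=\delta_{j}\gamma_{j}$ is easy to check by simulation: marginally, $\operatorname{Var}(\beta_{j})=(1-\pi)\sigma_{\beta}^{2}$. A small sketch with arbitrary illustrative values of $\pi$ and $\sigma_{\beta}^{2}$:

```python
import numpy as np

# Illustrative values for pi and var_b (sigma_beta^2); any choices work.
rng = np.random.default_rng(3)
pi, var_b = 0.9, 2.0
m = 500000
delta = rng.random(m) < (1 - pi)                  # Bernoulli(1 - pi)
gamma = rng.normal(scale=np.sqrt(var_b), size=m)  # N(0, var_b)
beta = delta * gamma                              # beta_j = delta_j * gamma_j
# Marginally, Var(beta_j) = (1 - pi) * var_b = 0.2 here
```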
The joint posterior for all the parameters is proportional to
$$\begin{aligned} f(\boldsymbol{\theta}|\mathbf{y}) & \propto f(\mathbf{y}|\boldsymbol{\theta})f(\boldsymbol{\theta})\\ & \propto \left(\sigma_{e}^{2}\right)^{-n/2}\exp\left\{ - \frac{(\mathbf{y}-\mathbf{1}\mu-\sum\mathbf{X}_{j}\gamma_{j}\delta_{j})'(\mathbf{y}-\mathbf{1}\mu-\sum\mathbf{X}_{j}\gamma_{j}\delta_{j})}{2\sigma_{e}^{2}}\right\} \\ & \times \prod_{j=1}^{k}\left(\sigma_{\beta}^{2}\right)^{-1/2}\exp\left\{ -\frac{\gamma_{j}^{2}}{2\sigma_{\beta}^{2}}\right\} \\ & \times \prod_{j=1}^{k}\pi^{(1-\delta_{j})}(1-\pi)^{\delta_{j}}\\ & \times (\sigma_{\beta}^{2})^{-(\nu_{\beta}+2)/2}\exp\left\{ -\frac{\nu_{\beta}S_{\beta}^{2}}{2\sigma_{\beta}^{2}}\right\} \\ & \times (\sigma_{e}^{2})^{-(2+\nu_{e})/2}\exp\left\{ -\frac{\nu_{e}S_{e}^{2}}{2\sigma_{e}^{2}}\right\} , \end{aligned}$$where $\boldsymbol{\theta}$ denotes all the unknowns.
The full-conditional for $\mu$ is a normal distribution with mean $\hat{\mu}$ and variance $\frac{\sigma_{e}^{2}}{n}$, where $\hat{\mu}$ is the least-squares estimate of $\mu$ in the model $$\mathbf{y}-\sum_{j=1}^{k}\mathbf{X}_{j}\gamma_{j}\delta_{j}=\mathbf{1}\mu+\mathbf{e},$$ and $\frac{\sigma_{e}^{2}}{n}$ is the variance of this estimator ($n$ is the number of observations).
The full-conditional for $\gamma_{j}$ is obtained by dropping from the joint posterior all terms that do not involve $\gamma_{j}$ and working with the adjusted data $$\mathbf{w}_{j}=\mathbf{y}-\mathbf{1}\mu-\sum_{l\neq j}\mathbf{X}_{l}\gamma_{l}\delta_{l}.$$So, the full-conditional for $\gamma_{j}$ is a normal distribution with mean
$$\hat{\gamma}_{j}=\frac{\mathbf{X}_{j}'\mathbf{w}_{j}\delta_{j}}{(\mathbf{X}_{j}'\mathbf{X}_{j}\delta_{j}+\sigma_{e}^{2}/\sigma_{\beta}^{2})}$$and variance $\frac{\sigma_{e}^{2}}{(\mathbf{X}_{j}'\mathbf{X}_{j}\delta_{j}+\sigma_{e}^{2}/\sigma_{\beta}^{2})}$. Note that when $\delta_{j}=0$ this reduces to the prior: mean 0 and variance $\sigma_{\beta}^{2}$.
The full-conditional for $\delta_{j}$ is Bernoulli, with $$\Pr(\delta_{j}=1|\text{ELSE})=\frac{h(1)}{h(0)+h(1)},$$where $$h(\delta_{j})=\pi^{(1-\delta_{j})}(1-\pi)^{\delta_{j}}\exp\left\{ -\frac{ (\mathbf{w}_{j}-\mathbf{X}_{j}\gamma_{j}\delta_{j})' (\mathbf{w}_{j}-\mathbf{X}_{j}\gamma_{j}\delta_{j}) }{2\sigma_{e}^{2}}\right\} .$$
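In practice $h(0)$ and $h(1)$ are best compared on the log scale, since the exponentials can underflow. A minimal sketch (the function name and arguments are our own):

```python
import numpy as np

def sample_delta_j(w, xj, gamma_j, pi, var_e, rng):
    """Draw delta_j from its Bernoulli full conditional.
    h(0) and h(1) are compared on the log scale for numerical stability."""
    log_h0 = np.log(pi) - (w @ w) / (2.0 * var_e)
    r1 = w - xj * gamma_j                       # residual if delta_j = 1
    log_h1 = np.log(1.0 - pi) - (r1 @ r1) / (2.0 * var_e)
    p1 = 1.0 / (1.0 + np.exp(log_h0 - log_h1))  # = h(1) / (h(0) + h(1))
    return int(rng.random() < p1), p1
```

As a sanity check, when $\gamma_{j}=0$ the two exponential factors cancel and the draw reverts to the prior probability $1-\pi$.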
The full-conditional for $\sigma_{\beta}^{2}$ is proportional to a scaled inverted chi-square distribution with $\tilde{\nu}_{\beta}=\nu_{\beta}+k$ degrees of freedom and scale parameter $\tilde{S}_{\beta}^{2}=(\sum_{j=1}^{k}\gamma_{j}^{2}+\nu_{\beta}S_{\beta}^{2})/\tilde{\nu}_{\beta}$.
The full-conditional for $\pi$ is proportional to a Beta distribution with parameters $a=k-\sum_{j=1}^{k}\delta_{j}+1$ and $b=\sum_{j=1}^{k}\delta_{j}+1.$
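Drawing $\pi$ from this Beta full conditional is a one-liner; the sketch below uses a hypothetical indicator vector to illustrate (names are our own):

```python
import numpy as np

def sample_pi(delta, rng):
    """Draw pi from Beta(k - sum(delta) + 1, sum(delta) + 1)."""
    k, s = len(delta), int(np.sum(delta))
    return rng.beta(k - s + 1, s + 1)

# Hypothetical indicator configuration: 1 of 10 effects in the model.
rng = np.random.default_rng(4)
delta = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])
draws = np.array([sample_pi(delta, rng) for _ in range(100000)])
# Here a = 10, b = 2, so the conditional mean is a/(a + b) = 10/12
```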
The full-conditional for $\sigma_{e}^{2}$ is proportional to a scaled inverted chi-square density with $\tilde{\nu}_{e}=n+\nu_{e}$ degrees of freedom and scale parameter $\tilde{S}_{e}^{2}=\frac{(\mathbf{y}-\mathbf{1}\mu-\sum_{j=1}^{k}\mathbf{X}_{j}\gamma_{j}\delta_{j})'(\mathbf{y}-\mathbf{1}\mu-\sum_{j=1}^{k}\mathbf{X}_{j}\gamma_{j}\delta_{j})+\nu_{e}S_{e}^{2}}{\tilde{\nu}_{e}}.$
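A draw of $\sigma_{e}^{2}$ mirrors the draw of $\sigma_{\beta}^{2}$: compute the residual sum of squares, add $\nu_{e}S_{e}^{2}$, and divide by a chi-square deviate. A minimal sketch (the function name and arguments are our own):

```python
import numpy as np

def sample_var_e(y, mu, X, gamma, delta, S2_e, nu_e, rng):
    """Draw sigma_e^2 from its scaled inverted chi-square full conditional."""
    resid = y - mu - X @ (gamma * delta)
    nu_tilde = len(y) + nu_e
    return (resid @ resid + nu_e * S2_e) / rng.chisquare(nu_tilde)
```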