From Wikipedia:

The Engleâ€“Granger approach as described above suffers from a number of weaknesses. Namely it is restricted to only a single equation with one variable designated as the dependent variable, explained by another variable that is assumed to be weakly exogeneous for the parameters of interest. It also relies on pretesting the time series to find out whether variables are I(0) or I(1). These weaknesses can be addressed through the use of Johansen's procedure. Its advantages include that pretesting is not necessary, there can be numerous cointegrating relationships, all variables are treated as endogenous and tests relating to the long-run parameters are possible.

VECM models add error correction terms to the vector autoregression (VAR) model. Using simple matrix algebra, the model can be written as follows:

$$\,\left[ {\begin{array}{*{20}{c}} {\Delta {y_{1,t}}}\\ {\Delta {y_{2t}}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{\gamma _{01}}}\\ {{\gamma _{02}}} \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} {{\alpha _1}}\\ {{\alpha _2}} \end{array}} \right]\left[ {{\beta _0}\,\,\,{\beta _1}\,\,\,{\beta _2}} \right]\left[ {\begin{array}{*{20}{c}} 1\\ {{y_{1,t - 1}}}\\ {{y_{2,t - 1}}} \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} {{\gamma _{11}}}&{{\gamma _{21}}}\\ {{\gamma _{12}}}&{{\gamma _{22}}} \end{array}} \right]\,\,\left[ {\begin{array}{*{20}{c}} {\Delta {y_{1,t - 1}}}\\ {\Delta {y_{2,t - 1}}} \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} {{\gamma _{31}}}&{{\gamma _{41}}}\\ {{\gamma _{32}}}&{{\gamma _{42}}} \end{array}} \right]\,\,\left[ {\begin{array}{*{20}{c}} {\Delta {y_{1,t - 2}}}\\ {\Delta {y_{2,t - 2}}} \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} {{\nu _{1,t}}}\\ {{\nu _{2,t}}} \end{array}} \right]$$Following this excellent blog post I will choose $y_1$ and $y_2$ to be the following time series:

- SPY (the S&P 500 exchange traded fund)
- SHY (iShares 1-3 year Treasury Bond) prices.

If both series are cointegrated, this information must included in the model. This is done introducing, as mentioned above, error correction terms:

$${\alpha _i}({\beta _0}\, + {\beta _1}{y_{1,t - 1}}\, + {\beta _2}{y_{2,t - 1}}),\,\,\,\,\,i = 1,2$$where

$$({\beta _0}\, + {\beta _1}{y_{1,t - 1}}\, + {\beta _2}{y_{2,t - 1}}),\,\,\,\,\,i = 1,2$$corresponds to the long-run. If in the long-run the last equation is zero we have the folowing relation:

$$\,{y_{2,t - 1}} = - ({\beta _0}/{\beta _2})\, - ({\beta _1}/{\beta _2}){y_{1,t - 1}}$$Hence, the $\beta$s we obtain after fitting the VECM inform us about the equilibrium relationship between the time series. When the two series deviate from equilibrium, the $\alpha$s "push them back".

In [3]:

```
# install.packages('ggplot2')
# install.packages('xts')
# install.packages('quantmod')
# install.packages('broom')
# install.packages('tseries')
# install.packages("kableExtra")
# install.packages("knitr")
# install.packages("vars")
# install.packages("urca")
```

To load the data we will use the library `quantmod`

which contains the function `getSymbols`

. From the documents

getSymbols is a wrapper to load data from various sources, local or remote.

In our case we will load data from Yahoo Finance.

In [49]:

```
rm(list=ls())
library(tseries)
library(dynlm)
library(vars)
library(nlWaldTest)
library(lmtest)
library(broom)
library(car)
library(sandwich)
library(knitr)
library(forecast)
```

In [53]:

```
library(quantmod)
setSymbolLookup(SHY='yahoo',SPY='yahoo')
getSymbols(c('SHY','SPY'))
```

Defining `y1`

and `y2`

as the adjusted prices and joining them:

In [54]:

```
y1 <- SPY$SPY.Adjusted
y2 <- SHY$SHY.Adjusted
time_series <- cbind(y1, y2)
print('Our dataframe is:')
colnames(time_series) <- c('SHY', 'SPY')
head(time_series)
```

We can check for stationary of the series individually:

In [58]:

```
adf.test(time_series$SHY)
adf.test(time_series$SPY)
```

In [57]:

```
print('p-values:')
adf.test(time_series$SHY)$p.value
adf.test(time_series$SPY)$p.value
```

In [46]:

```
par(mfrow = c(2,1))
plot.ts(time_series$SHY,
type='l')
plot.ts(time_series$SPY,
type='l')
```

The most well known cointegration test is the Johansen test which estimates the VECM parameters and determines whether the determinant of

$$\alpha \beta = \left[ {\begin{array}{*{20}{c}} {{\alpha _1}}\\ {{\alpha _2}} \end{array}} \right]\left[ {{\beta _0}\,\,\,{\beta _1}\,\,\,{\beta _2}} \right]$$is zero or not. If $\alpha \beta\neq 0$, the series are cointegrated.

In [31]:

```
library(urca)
```

In [32]:

```
johansentest <- ca.jo(time_series, type = "trace", ecdet = "const", K = 3)
summary(johansentest)
```

The lines $r=0$ and $r\le 1$ are the results of the test. More specifically:

- line $r=0$: these are the results of the hypothesis test with null hypothesis $r=0$. More concretely, this test checks if the matrix has zero rank. In the present case the hypothesis is rejected since the test variable is well above the $1\%$ value;
- line $r\le 1$: these are the results of the hypothesis test $r\le 1$. Now since the test value is below the $1\%$ value value we fail to reject the null hypothesis. Hence we conclude that the rank of $\alpha \beta$ is 1 and therefore the two series are cointegrated and we can use the VECM model.

Note that if hypotheses were reject we would have $r=2$ corresponding to two stationary series.

In [37]:

```
t <- cajorls(johansentest, r = 1)
t
```

The $\beta$ coefficients are given above. The $\gamma$ coefficients are above the $beta$s in the output.