Let us try to estimate the impact of the Deepwater Horizon oil spill on BP plc (formerly British Petroleum) stock prices using Bayesian structural time series Sengul (2018) (see section BSTS Model for a slightly more formal explanation).

In general, structural time series (STS) models (either frequentist or Bayesian) can be written as a system of equations. The simplest possible example of STS model is the *local level model*, given by:

where ${\xi _t} \sim {\cal N}(0,\sigma _\xi ^2)$ and ${\varepsilon _t} \sim {\cal N}(0,\sigma _\varepsilon ^2)$.
Here, the first equation is called *state equation* and ${\mu _t}$ is an unobserved variable. Though ${\mu _t}$ is not observed, the second equation, called *observation equation*, which depends on ${\mu _t}$, contains the variable ${y_t}$ which is based on observed data (roughly speaking, ${\mu _t}$ can be interpreted as a time-dependent version of the intercept of simple linear regressions).

The general form of the BSTS is more convoluted so let us consider the following case,

$$\begin{array}{l} {y_t} = {\mu _t} + {\tau _t} + {\beta ^T}{{\vec x}_t} + {\varepsilon _t}\\ {\mu _t} = {\mu _{t - 1}} + {\delta _{t - 1}} + {\eta _t},\,\,\,\,\,\,\,\,\,\\ {\delta _t} = {\delta _{t - 1}} + {\omega _t},\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\\ {\tau _t} = - \sum\nolimits_{s = 1}^{S - 1} {{\tau _{t - s}} + {\gamma _t}} \end{array}$$where the errors have similar properties as in the local level model. The variables have the following meanings:

- ${\tau _t}$ is the seasonal component
- ${\delta _t}$ is a random walk with trend
- ${\mu _t}$ is the trend component
- ${{\vec x}_t}$ is a vector of covariates.

The last variable, namely, ${{\vec x}_t}$ is of crucial importance in this discussion. It is a vector time series that can be used to predict ${y_t}$ and the coefficients $\beta$ are estimated the spike-and-slab Bayesian method of feature selection. To use this system of equations to predict BP stock prices after the oil spill we need to identify seasonal effects, trends and choose ${{\vec x}_t}$. An important condition that must be obeyed is that ${{\vec x}_t}$ must (ideally) be highly correlated with ${y_t}$, but cannot have been impacted by the same factors that impacted $y_t$.

Let us apply the `CausalImpact`

package without going into much detail now. First let us build

Some of the packages we need to install are:

`devtools`

to be able to download packages from Github`magrittr`

for code readability and maintainability`dplyr`

for data manipulation`rga`

to obtain data from the Google Analytics (GA) APIs

We need also to provided the `id`

corresponding to our view from GA.

In [107]:

```
# # install.packages("devtools")
# # install.packages("magrittr")
# # install.packages("dplyr")
# # install.packages("curl")
# # install_github("skardhamar/rga")
# # install.packages(c("bitops", "jsonlite", "httr"),repos='http://cran.us.r-project.org')
# install.packages("tidyverse")
# install.packages("tidyr")
```

In [17]:

```
library(devtools)
library(magrittr)
library("dplyr")
library(lubridate)
library(curl)
library(tidyverse)
library(tidyr)
```

Reading the `csv`

file containing the BP stock prices:

In [187]:

```
stocks <- read.csv('stocks.csv')
head(stocks)
```

The oil spil date is indicated by the dashed vertical line.

In [169]:

```
par(mfrow = c(2,1))
plot(stocks$nasdaq,col="red",type="l",xlim=c(200, 500), ylim=c(20, 26))
abline(v=326, lwd=1, lty=2)
plot(stocks$bp,col="blue",type="l", xlim=c(200, 500))
abline(v=326, lwd=1, lty=2)
```

The `CausalImpact`

package is based on BSTMs. BSTMs are used to build synthetic controls i.e. scenarios that would have occurred without the intervention. This technique allows one to estimate the causal effect and subsequent time evolution due to the intervention.

We give the package the following two inputs:

- a response time series and
- a set of control time series

`CausalImpact`

then builds a BSTM which is used to predict the counterfactual. Note that one must be careful to choose a control series unaffected by the intervention.

In [176]:

```
library(CausalImpact)
stocks <- stocks[,c('bp', 'nasdaq')]
pre.period <- c(1, 325)
post.period <- c(326, 504)
```

In [179]:

```
impact <- CausalImpact(data = stocks, pre.period = pre.period, post.period = post.period)
summary(impact)
```

In [186]:

```
par(mfrow = c(1, 2))
plot(impact);
```

The difference between the data and the prediction post-intervention is the causal effect of the intervention. The three panels show respectively:

- The observed and predicted response on the original scale
- The difference between the two
- The individual causal effects added up