- Reduce systemic risk by taking opposite positions in two assets that share a "common trend."¶

In [18]:
%matplotlib inline
import matplotlib.pyplot as plt

import zipline
from zipline.api import (
history,
symbol,
record,
order_target_percent
)

import pytz
import numpy as np
import pandas as pd
import statsmodels.api as sm


A Simple Model¶

• Assume two similar assets $A \& B$ have risk factor exposures that differ by a scalar and satisfy the necessary conditions for cointegration.

$Spread_t = \log(price^{A}_t) - \gamma\log(price^{B}_t) = \epsilon{_t}$¶

Where $\epsilon{_t}$ is a normally distributed series with a mean of 0. After a single time step,

$r^{Spread}_t = r^{A}_t - \gamma r^{B}_t = \Delta\epsilon _t$¶

So predicting the spread return at time $t$ is the same as predicting the direction of change in a stationary series of mean 0.

Selecting Pairs¶

• #### Most importantly (opinion), both companies should have similar market risk exposures
• Better than 90% confidence on cointegration tests
• Consistent historical evidence of a "common trend"
• Must make economic sense

Case Study: EOG ~ PXD¶

EOG Resources (EOG) and Pioneer Natural Resources Company (PXD) are two Texas based independent producers of oil and natural gas.

Due to their similar market risk profiles (oil price sensitivity), PXD and EOG trend with similar trajectories.

In [2]:
# Historical Data from Yahoo
stocks=('EOG', 'PXD', 'SPY'),
start=pd.Timestamp('2011-01-01', tz='utc'),
end=pd.Timestamp.utcnow(),
indexes={}
)
prices = data.minor_xs('price')
log_returns = np.log(prices).diff()
compound_returns = (1 + log_returns).cumprod()
compound_returns.plot(figsize=(14, 6))

EOG
PXD
SPY

Out[2]:
<matplotlib.axes.AxesSubplot at 0x110eeecd0>

Test for cointegration¶

Intuition is not enough, so it's good to test for cointegration. The most commonly used test is the Dicky-Fuller cointegration test.

In [3]:
def adf_test(y, X, intercept=True, window=None):
'''
Dicky-Fuller cointegration test
'''
model = pd.ols(y=y, x=X, intercept=intercept, window=window)

spread = prices.EOG - model.beta['x'] * prices.PXD

print 'Test statistic: ', t_stat
print 'Critical Values:', critical_values


Test statistic:  -11.3864409753
Critical Values: {'5%': -2.8644316780522177, '1%': -3.4368995989062348, '10%': -2.5683096665272789}

Out[3]:
<matplotlib.axes.AxesSubplot at 0x113072a10>
In [10]:
# Trading Algorithm
import numpy as np
import statsmodels.api as sm

def initialize(context):
context.x = symbol('EOG')
context.y = symbol('PXD')
context.nobs = 20
context.tick = 0

def handle_data(context, data):
# Allow the history frame to accumulate data
context.tick += 1
if context.tick < context.nobs:
return

prices = np.log(history(context.nobs, '1d', 'price'))
y = prices[context.y]
X = prices[context.x]

hedge_ratio = model.params[context.x]

spread = y - hedge_ratio * X

record(
hedge_ratio=hedge_ratio,
zscore=zscore
)
# Get direction of the trades, and place orders
# Re-hedge to a dollar-neutral position daily in the direction of anticipated reversion.
# More traditionally an entry is made when the magnitude of the z-score is large.
side = np.copysign(0.5, zscore)
order_target_percent(context.y, -side)
order_target_percent(context.x,  side * model.params[context.x])


In [11]:
algo = TradingAlgorithm(
initialize=initialize,
handle_data=handle_data
)

perf = algo.run(data)
perf.index = perf.index.tz_localize(pytz.utc)

[2015-01-27 04:37:14.000959] INFO: Performance: Simulated 1022 trading days out of 1022.
[2015-01-27 04:37:14.001827] INFO: Performance: first open: 2011-01-03 14:31:00+00:00
[2015-01-27 04:37:14.002566] INFO: Performance: last close: 2015-01-26 21:00:00+00:00

In [12]:
perf.spread.plot(figsize=(14,7))

Out[12]:
<matplotlib.axes.AxesSubplot at 0x115064390>
In [13]:
compound_pair_returns = (1 + perf.returns).cumprod()
results = compound_returns.to_dict()
results['EOG ~ PXD'] = compound_pair_returns
results = pd.DataFrame(results).ffill()
results[['PXD', 'EOG', 'EOG ~ PXD']].plot(figsize=(14,7))
results[['SPY', 'EOG ~ PXD']].plot(figsize=(14,7))

Out[13]:
<matplotlib.axes.AxesSubplot at 0x102990310>
In [14]:
perf[['long_exposure', 'short_exposure', 'net_leverage']].plot(figsize=(14,7))

Out[14]:
<matplotlib.axes.AxesSubplot at 0x119ba6990>

Risk Metrics¶

In [17]:
risk_report = pd.Panel({
period: pd.DataFrame(algo.risk_report[period])
for period in algo.risk_report
})
print risk_report.minor_axis
risk_report.minor_xs('sharpe')

Index([u'algo_volatility', u'algorithm_period_return', u'alpha', u'benchmark_period_return', u'benchmark_volatility', u'beta', u'excess_return', u'information', u'max_drawdown', u'period_label', u'sharpe', u'sortino', u'trading_days', u'treasury_period_return'], dtype='object')

Out[17]:
one_month six_month three_month twelve_month
0 0 0.399064 0.1751205 1.891121
1 -0.2303534 0.06056268 0.4051947 2.081803
2 0.6720155 1.415842 1.463952 2.175454
3 0.4959671 1.185801 0.3694682 1.768673
4 1.17063 1.540741 -0.2464215 1.233226
5 -1.118294 1.678988 0.6653211 1.02713
6 -0.5931036 1.99515 1.174621 1.862708
7 2.033294 2.716927 2.285227 2.53893
8 -0.08176557 1.479004 1.767422 2.25566
9 1.268055 1.181162 1.543545 2.183615
10 1.42794 0.08121069 1.242962 1.824025
11 -0.444197 -0.4190961 0.1417493 1.130034
12 0.9053201 0.4090032 0.04302299 1.734644
13 -0.08242315 0.5739339 -0.9432394 1.176115
14 -0.4266107 1.508455 -0.6499409 1.495482
15 -1.108201 1.763178 0.5388966 1.851867
16 0.6875907 2.659631 2.269167 2.180863
17 1.635488 2.132488 3.292805 1.773212
18 1.369677 2.053531 1.999118 1.333311
19 2.325417 1.059135 1.363741 1.103552
20 -0.2270935 0.5245296 -0.08113302 1.096163
21 0.395192 0.7399767 0.8561812 1.047849
22 -0.4168844 0.2779619 0.04189832 1.139825
23 1.6534 0.3620893 0.8247335 1.425947
24 -1.046189 0.008985382 0.1675946 0.9222007
25 0.7705466 0.4913982 0.349238 1.10515
26 0.3631654 0.9482038 -0.2755495 1.136325
27 -0.9101462 0.7065436 -0.1099011 0.9122471
28 -0.05254148 1.155817 0.3470793 1.331254
29 0.3586327 1.499308 1.460123 1.507445
30 0.6268109 1.255211 1.179911 1.642905
31 1.367809 0.9888387 1.187544 1.493357
32 -0.7228139 0.5936103 0.5557084 0.5932982
33 0.5704613 0.5453151 0.5665684 0.8791801
34 0.5775891 0.6057297 0.02198625 0.8407119
35 -0.5182931 0.481271 0.2531073 0.4785539
36 -0.1662931 0.938908 0.164244 0.7670049
37 0.9211245 1.052431 0.7901755 0.5383184
38 -0.7683404 0.203883 0.4149857 NaN
39 0.7212907 0.6930835 1.123166 NaN
40 0.2959122 0.57153 0.6666565 NaN
41 1.000188 0.2394808 -0.2446549 NaN
42 -0.2450061 0.2787647 -0.4324159 NaN
43 -1.004149 -6.329155e-05 0.1321612 NaN
44 0.807929 NaN 0.3495203 NaN
45 0.7595933 NaN 0.4474621 NaN
46 0.0335916 NaN -0.0445592 NaN
47 0.6829041 NaN NaN NaN
48 -1.275649 NaN NaN NaN

R-style formulas with statsmodels¶

In [16]:
import statsmodels.formula.api as smf

model = smf.ols(formula="EOG ~ PXD", data=np.log(prices)).fit()

model.summary()

Out[16]:
Dep. Variable: R-squared: EOG 0.941 OLS 0.941 Least Squares 1.631e+04 Mon, 26 Jan 2015 0.00 20:37:21 1238.5 1022 -2473. 1020 -2463. 1
coef std err t P>|t| [95.0% Conf. Int.] 0.0261 0.033 0.799 0.424 -0.038 0.090 0.8583 0.007 127.723 0.000 0.845 0.872
 Omnibus: Durbin-Watson: 71.465 0.034 0 94.314 0.599 3.31e-21 3.883 73.2