
LaTeX macros (hidden cell) $ \newcommand{\Q}{\mathcal{Q}} \newcommand{\ECov}{\boldsymbol{\Sigma}} \newcommand{\EMean}{\boldsymbol{\mu}} \newcommand{\EAlpha}{\boldsymbol{\alpha}} \newcommand{\EBeta}{\boldsymbol{\beta}} $

Imports and configuration

In [58]:

import sys
import os
import re
import datetime as dt

import numpy as np
import pandas as pd
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

from mosek.fusion import *

from notebook.services.config import ConfigManager

from portfolio_tools import data_download, DataReader, compute_inputs

In [59]:

# Version checks
print(sys.version)
print('matplotlib: {}'.format(matplotlib.__version__))

# Jupyter configuration
c = ConfigManager()
c.update('notebook', {"CodeCell": {"cm_config": {"autoCloseBrackets": False}}})  

# Numpy options
np.set_printoptions(precision=5, linewidth=120, suppress=True)

# Pandas options
pd.set_option('display.max_rows', None)

# Matplotlib options
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['figure.dpi'] = 200

3.6.9 (default, Jan 26 2021, 15:33:00) 
[GCC 8.4.0]
matplotlib: 3.3.4

Prepare input data

Here we load the raw data used to compute the optimization inputs: the vector $\EMean$ of expected returns and the covariance matrix $\ECov$. The data consists of daily prices of $8$ stocks from the US market.

Download data

In [60]:

# Data downloading:
# If the user has an API key for alphavantage.co, this part of the code will download the data.
# The code can be modified to download from other sources. To run the examples
# and reproduce the results in the cookbook, the files must have the following format and content:
# - File name pattern: "daily_adjusted_[TICKER].csv", where TICKER is the symbol of a stock.
# - The file contains at least the columns "timestamp", "adjusted_close", and "volume".
# - The data is daily price/volume, covering at least the period from 2016-03-18 until 2021-03-18.
# - Files are for the stocks PM, LMT, MCD, MMM, AAPL, MSFT, TXN, CSCO.
list_stocks = ["PM", "LMT", "MCD", "MMM", "AAPL", "MSFT", "TXN", "CSCO"]
list_factors = []
alphaToken = None
 
list_tickers = list_stocks + list_factors
if alphaToken is not None:
    data_download(list_tickers, alphaToken)  

Read data

We load the daily stock prices from the downloaded CSV files. The prices are adjusted for splits and dividends. We then select the investment period from the data.

In [61]:

investment_start = "2016-03-18"
investment_end = "2021-03-18"

In [62]:

# The files are in "stock_data" folder, named as "daily_adjusted_[TICKER].csv"
dr = DataReader(folder_path="stock_data", symbol_list=list_tickers)
dr.read_data()
df_prices, _ = dr.get_period(start_date=investment_start, end_date=investment_end)

Found data files: 
stock_data/daily_adjusted_AAPL.csv
stock_data/daily_adjusted_PM.csv
stock_data/daily_adjusted_CSCO.csv
stock_data/daily_adjusted_TXN.csv
stock_data/daily_adjusted_MMM.csv
stock_data/daily_adjusted_IWM.csv
stock_data/daily_adjusted_MCD.csv
stock_data/daily_adjusted_SPY.csv
stock_data/daily_adjusted_MSFT.csv
stock_data/daily_adjusted_LMT.csv

Using data files: 
stock_data/daily_adjusted_PM.csv
stock_data/daily_adjusted_LMT.csv
stock_data/daily_adjusted_MCD.csv
stock_data/daily_adjusted_MMM.csv
stock_data/daily_adjusted_AAPL.csv
stock_data/daily_adjusted_MSFT.csv
stock_data/daily_adjusted_TXN.csv
stock_data/daily_adjusted_CSCO.csv
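
The `DataReader` class comes from the accompanying `portfolio_tools` module and its code is not shown in this notebook. As a rough sketch of the kind of processing involved, the snippet below reads one file in the documented format and selects a date range. The helper name `read_adjusted_close` and the in-memory sample data are illustrative assumptions, not part of `portfolio_tools`.

```python
import io
import pandas as pd

def read_adjusted_close(csv_source, start_date, end_date):
    """Hypothetical helper: adjusted close prices within [start_date, end_date]."""
    df = pd.read_csv(csv_source, usecols=["timestamp", "adjusted_close"],
                     parse_dates=["timestamp"])
    # Sort chronologically, since a file may list the most recent date first.
    series = df.set_index("timestamp")["adjusted_close"].sort_index()
    return series.loc[start_date:end_date]

# Made-up sample standing in for a "daily_adjusted_[TICKER].csv" file.
sample_csv = io.StringIO(
    "timestamp,adjusted_close,volume\n"
    "2016-03-21,101.5,1200\n"
    "2016-03-18,100.0,1000\n"
    "2016-03-17,99.0,900\n"
)
prices = read_adjusted_close(sample_csv, "2016-03-18", "2021-03-18")
print(prices)
```

The row dated 2016-03-17 falls before the start date and is dropped, while the remaining rows come back in chronological order.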

Run the optimization

Define the optimization model

Below we implement the optimization model using the MOSEK Fusion API. We wrap it in a function so that we can call it later.

In [63]:

def RiskBudgeting(N, G, b, z, a):
    
    with Model('Risk budgeting') as M:
        # Settings
        M.setLogHandler(sys.stdout)
        
        # Portfolio weights
        x = M.variable("x", N, Domain.unbounded())
        
        # Orthant specifier constraint
        M.constraint("orthant", Expr.mulElm(z, x), Domain.greaterThan(0.0))
        
        # Auxiliary variables
        t = M.variable("t", N, Domain.unbounded())
        s = M.variable("s", 1, Domain.unbounded())
    
        # Objective function: 1/2 * x'Sx - a * b'log(z*x) becomes s - a * b't
        M.objective(ObjectiveSense.Minimize, Expr.sub(s, Expr.mul(a, Expr.dot(b, t))))
    
        # Bound on risk term
        M.constraint(Expr.vstack(s, 1, Expr.mul(G.T, x)), Domain.inRotatedQCone())
    
        # Bound on log term t <= log(z*x) becomes (z*x, 1, t) in K_exp
        M.constraint(Expr.hstack(Expr.mulElm(z, x), Expr.constTerm(N, 1.0), t), Domain.inPExpCone())
    
        # Create DataFrame to store the results.
        # Note: the security names come from the global df_prices defined above.
        columns = ["obj", "risk", "xsum", "bsum"] + df_prices.columns.tolist()
        df_result = pd.DataFrame(columns=columns) 
    
        # Solve optimization
        M.solve()
        # Check if the solution is an optimal point
        solsta = M.getPrimalSolutionStatus()
        if (solsta != SolutionStatus.Optimal):
            # See https://docs.mosek.com/latest/pythonfusion/accessing-solution.html about handling solution statuses.
            raise Exception("Unexpected solution status!")
    
        # Save results
        xv = x.level()
                   
        # Risk contributions x_i * (Sx)_i; at the optimum they should equal a * b_i
        risk_budgets = xv * np.dot(G @ G.T, xv)
       
        # Renormalize to gross exposure = 1
        xv = xv / np.abs(xv).sum()
        
        # Compute portfolio metrics
        Gx = np.dot(G.T, xv)
        portfolio_risk = np.sqrt(np.dot(Gx, Gx))
               
        row = pd.Series([M.primalObjValue(), portfolio_risk, np.sum(z * xv), np.sum(risk_budgets)] + list(xv), index=columns)
        df_result = pd.concat([df_result, pd.DataFrame([row])], ignore_index=True)
        row = pd.Series([None] * 4 + list(risk_budgets), index=columns)
        df_result = pd.concat([df_result, pd.DataFrame([row])], ignore_index=True)

        return df_result
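
For reference, the model built by `RiskBudgeting` can be written out explicitly. The formulas below are read off the code and its comments, with $\ECov=GG^\mathsf{T}$; the cone symbols $\Q_r$ and $K_\mathrm{exp}$ are a notational assumption. The function minimizes the convex objective

$$ \begin{array}{ll} \text{minimize} & \frac{1}{2}x^\mathsf{T}\ECov x - a\sum_{i=1}^N b_i\log(z_i x_i) \\ \text{subject to} & z_i x_i \geq 0,\quad i=1,\dots,N, \end{array} $$

and, after introducing the auxiliary variables $s$ and $t$, passes it to the solver in conic form:

$$ \begin{array}{lll} \text{minimize} & s - a\,b^\mathsf{T}t & \\ \text{subject to} & (s, 1, G^\mathsf{T}x) \in \Q_r^{N+2}, & \\ & (z_i x_i, 1, t_i) \in K_\mathrm{exp}, & i=1,\dots,N, \\ & z_i x_i \geq 0, & i=1,\dots,N. \end{array} $$

The rotated quadratic cone constraint means $2s\cdot 1 \geq \|G^\mathsf{T}x\|_2^2 = x^\mathsf{T}\ECov x$, so $s$ is an upper bound on the risk term. The exponential cone constraint means $z_i x_i \geq \exp(t_i)$, that is, $t_i \leq \log(z_i x_i)$.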

Compute optimization input variables

Here we use the loaded daily price data to compute the corresponding yearly mean return and covariance matrix. Only the covariance matrix enters the risk budgeting model, so the mean return is discarded.

In [64]:

# Number of securities
N = df_prices.shape[1]

# Get optimization parameters
_, S = compute_inputs(df_prices)

# Risk budget
b = np.ones(N) / N

# Orthant selector
z = np.ones(N)

# Scaling factor of the log term; at the optimum the risk contributions x_i * (Sx)_i sum to a
a = 1
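
The estimator inside `compute_inputs` is part of `portfolio_tools` and is not shown in this notebook. As a simplified sketch of how a yearly covariance matrix can be obtained from daily prices, the snippet below assumes daily log returns and 252 trading days per year. Both choices, the helper name `annualized_covariance`, and the synthetic prices are assumptions for illustration, and the actual function may work differently.

```python
import numpy as np
import pandas as pd

def annualized_covariance(df_prices, periods_per_year=252):
    """Hypothetical helper: sample covariance of daily log returns, scaled to one year."""
    log_returns = np.log(df_prices / df_prices.shift(1)).dropna()
    return log_returns.cov().to_numpy() * periods_per_year

# Synthetic prices for two made-up tickers "A" and "B".
rng = np.random.default_rng(0)
daily = rng.normal(0.0, 0.01, size=(500, 2))
toy_prices = pd.DataFrame(100.0 * np.exp(np.cumsum(daily, axis=0)), columns=["A", "B"])

S_toy = annualized_covariance(toy_prices)
print(S_toy)
```

The result is a symmetric positive definite matrix, which is what the Cholesky factorization in the next step requires.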

Next we compute a matrix $G$ such that $\ECov=GG^\mathsf{T}$. This matrix is the input of the conic form of the optimization problem. Here we use the Cholesky factorization.

In [65]:

G = np.linalg.cholesky(S)  
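
As a quick sanity check of this step, the sketch below uses a made-up $2\times 2$ covariance matrix (not the data of this notebook) to confirm that the Cholesky factor reproduces the matrix and that $x^\mathsf{T}\ECov x = \|G^\mathsf{T}x\|_2^2$, which is the identity the conic model relies on.

```python
import numpy as np

# Made-up covariance matrix and portfolio, for illustration only.
S_toy = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
x_toy = np.array([0.6, 0.4])

# np.linalg.cholesky returns the lower triangular factor G with S = G G^T.
G_toy = np.linalg.cholesky(S_toy)
assert np.allclose(G_toy @ G_toy.T, S_toy)

# The quadratic form equals the squared norm of G^T x.
quad_form = x_toy @ S_toy @ x_toy
norm_sq = np.sum((G_toy.T @ x_toy) ** 2)
print(quad_form, norm_sq)
```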

Call the optimizer function

In [66]:

df_result = RiskBudgeting(N, G, b, z, a)

Problem
  Name                   : Risk budgeting  
  Objective sense        : min             
  Type                   : CONIC (conic optimization problem)
  Constraints            : 42              
  Cones                  : 9               
  Scalar variables       : 52              
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator started.
Freed constraints in eliminator : 0
Eliminator terminated.
Eliminator - tries                  : 1                 time                   : 0.00            
Lin. dep.  - tries                  : 1                 time                   : 0.00            
Lin. dep.  - number                 : 0               
Presolve terminated. Time: 0.00    
Problem
  Name                   : Risk budgeting  
  Objective sense        : min             
  Type                   : CONIC (conic optimization problem)
  Constraints            : 42              
  Cones                  : 9               
  Scalar variables       : 52              
  Matrix variables       : 0               
  Integer variables      : 0               

Optimizer  - threads                : 20              
Optimizer  - solved problem         : the primal      
Optimizer  - Constraints            : 8
Optimizer  - Cones                  : 9
Optimizer  - Scalar variables       : 34                conic                  : 34              
Optimizer  - Semi-definite variables: 0                 scalarized             : 0               
Factor     - setup time             : 0.00              dense det. time        : 0.00            
Factor     - ML order time          : 0.00              GP order time          : 0.00            
Factor     - nonzeros before factor : 36                after factor           : 36              
Factor     - dense dim.             : 0                 flops                  : 7.74e+02        
ITE PFEAS    DFEAS    GFEAS    PRSTATUS   POBJ              DOBJ              MU       TIME  
0   1.4e+00  1.3e+00  9.7e+00  0.00e+00   1.534945180e+00   -7.147922794e+00  1.0e+00  0.01  
1   2.4e-01  2.3e-01  3.9e-01  5.11e-01   2.860275193e+00   1.099946215e+00   1.8e-01  0.02  
2   3.1e-02  3.0e-02  1.3e-02  1.33e+00   1.294944866e+00   1.109238006e+00   2.3e-02  0.02  
3   1.6e-03  1.5e-03  1.6e-04  1.15e+00   1.052570526e+00   1.043915847e+00   1.2e-03  0.02  
4   1.8e-04  1.7e-04  6.1e-06  1.01e+00   1.043789450e+00   1.042801879e+00   1.3e-04  0.02  
5   2.6e-05  2.5e-05  3.4e-07  1.00e+00   1.042814807e+00   1.042672665e+00   1.9e-05  0.02  
6   5.1e-06  4.8e-06  3.1e-08  1.00e+00   1.042685135e+00   1.042657303e+00   3.8e-06  0.02  
7   1.4e-06  1.3e-06  4.5e-09  1.00e+00   1.042662348e+00   1.042655043e+00   9.9e-07  0.02  
8   3.3e-07  3.1e-07  6.6e-10  1.00e+00   1.042656394e+00   1.042654634e+00   2.4e-07  0.02  
9   3.6e-08  3.4e-08  3.0e-11  1.00e+00   1.042654983e+00   1.042654798e+00   2.6e-08  0.02  
10  4.4e-09  4.2e-09  1.4e-12  1.00e+00   1.042654892e+00   1.042654869e+00   3.3e-09  0.02  
11  1.5e-08  1.9e-09  5.2e-14  1.00e+00   1.042654881e+00   1.042654879e+00   3.6e-10  0.02  
Optimizer terminated. Time: 0.03    


Interior-point solution summary
  Problem status  : PRIMAL_AND_DUAL_FEASIBLE
  Solution status : OPTIMAL
  Primal.  obj: 1.0426548813e+00    nrm: 1e+00    Viol.  con: 2e-09    var: 0e+00    cones: 5e-09  
  Dual.    obj: 1.0426548788e+00    nrm: 1e+00    Viol.  con: 0e+00    var: 1e-09    cones: 0e+00

In [67]:

df_result

Out[67]:

	obj	risk	xsum	bsum	PM	LMT	MCD	MMM	AAPL	MSFT	TXN	CSCO
0	1.042655	0.212983	1.0	1.000007	0.128198	0.134368	0.147567	0.137782	0.090373	0.114948	0.114974	0.131791
1	NaN	NaN	NaN	NaN	0.125001	0.125001	0.125001	0.125001	0.125001	0.125001	0.125001	0.125001
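
In the table, row 0 holds the normalized portfolio weights and row 1 holds the risk contributions $x_i(\ECov x)_i$, which all come out close to $b_i = 0.125$. This matches the optimality condition $x_i(\ECov x)_i = a\,b_i$ of the objective. The sketch below reproduces that condition on a made-up $3\times 3$ covariance matrix in the long-only case ($z_i = 1$). It uses a plain damped Newton iteration in NumPy instead of MOSEK, so it illustrates the condition only and does not replace the conic model above. The function name `risk_budgeting_newton` is hypothetical.

```python
import numpy as np

def risk_budgeting_newton(S, b, a=1.0, iters=100, tol=1e-10):
    """Minimize 0.5*x'Sx - a*sum(b*log(x)) over x > 0 with a damped Newton method."""
    f = lambda v: 0.5 * v @ S @ v - a * b @ np.log(v)
    x = np.sqrt(a * b / np.diag(S))      # exact solution when S is diagonal
    for _ in range(iters):
        grad = S @ x - a * b / x
        if np.linalg.norm(grad) < tol:
            break
        hess = S + np.diag(a * b / x**2)
        step = np.linalg.solve(hess, -grad)
        # Halve the step until x stays positive and the objective does not increase.
        alpha = 1.0
        while alpha > 1e-10 and (np.any(x + alpha * step <= 0)
                                 or f(x + alpha * step) > f(x)):
            alpha *= 0.5
        if alpha <= 1e-10:
            break                        # no further progress within rounding error
        x = x + alpha * step
    return x

# Made-up covariance matrix and equal risk budgets, for illustration only.
S_toy = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
b_toy = np.ones(3) / 3

x_toy = risk_budgeting_newton(S_toy, b_toy)
contrib = x_toy * (S_toy @ x_toy)        # risk contributions, expected to match a * b
weights = x_toy / np.abs(x_toy).sum()    # renormalize to gross exposure 1
print(contrib, weights)
```

As in `RiskBudgeting`, the risk contributions are computed before the weights are renormalized, which is why they sum to $a$ and not to the portfolio variance of the normalized weights.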

Visualize the results

Plot the portfolio components.

In [68]:

ax = df_result.iloc[0, 4:].T.plot.bar(xlabel="securities", ylabel="x", grid=True, rot=0)

Plot the risk budgets.

In [69]:

ax = df_result.iloc[1, 4:].T.plot.bar(xlabel="securities", ylabel="risk budget", grid=True, rot=0)
