Table of Contents

Using Numba for Performance Gains - A Finance Example

This post looks at how Numba can be used to speed up a finance example with minimal effort. The example looks at an application of Monte Carlo simulation for calculating a simplified version of Credit Valuation Adjustment on a portfolio. The first half explains the finance and setup and the second half show how we can speed it up and benefit from that speedup.

Credit Valuation Adjustment is a measure of the credit risk a bank has between its counterparties. Specifically it is the adjustment to the fair value price of derivative instruments to account for counterparty credit risk. This is an interesting problem to look at as a bank can have millions of credit instruments (referenced as deals in the code) outstanding with thousands of counterparties. There are many types of deals, each type having its own complexity in calculating price. Similarly, each counterparty has its own risks, default risk being the main concern with CVA. Monte Carlo is applied to estimate the future values of these instruments along with the probability of default from the counterparty.

The formula for CVA is as such,

$$CVA = (1-R)\sum_{i=0}^{n}q_i p_i$$

where $q$ is the default probability, $R$ the recovery rate and $p$ is the price of the instrument. To simplify the calculation, we can ignore the recovery rate and assume the default probability of the counterparty is limited to a hazard rate such that:

$$q_i = \int_{0}^{i} q(\lambda, t)dt$$

Only two types of deals are looked at, foreign exchange (forex) futures and swaps. The swaps are based on the forex price. Assuming a fixed interest rate, the forex and swaps are easy to price. In the first cell below, we generate a random sample of deals, assigning values and hazard rates based on the weights below.


Generating our simulated bank portfolio

We generate deals for each type of instrument and convert it to a numpy array. Note, the type and currency have been enumerated. This is done since numba does not support strings as are commonly used in code like this.

Inputs

In this section we define our inputs. Most of these inputs are used to generate our deals. These include the number of deals, the ratio of counterparty risk, maturity, and value. We also define some of the inputs for the Monte Carlo, like the number of simulations and periods per year. These are defined here since we are pregenerating the normals at the moment.

In [1]:
import math
import numpy as np
import numpy.random as rand


# Setup Configuration
sims = 1000             # number of simulations
n_forex = 1000          # number of forex deals
n_swaps = 1000          # number of swap deals
ppy = 12                # periods per yer
min_maturity = 0.25     # minumum maturity (years)
max_maturity = 10       # maximum maturity (years)
min_value = 1000        # minimum dollar value of product
max_value = 100000000   # maximum dollar value of product
min_fixed = 0.01        # minimum fixed interest rate
max_fixed = 0.15        # maximum fixed interest rate
periods = max_maturity * ppy # total number of periods

ratings = {0: 0.5,  # 'A' counterparty ratings
           1: 0.2,  # 'B'
           2: 0.1,  # 'C'
           3: 0.1,  # 'D'
           4: 0.1}  # 'E'
hazard = {0: 0.1,    # hazard rates
          1: 0.2,
          2: 0.3,
          3: 0.4,
          4: 0.5}

# percent swap deals in USD
pct_swap_cur = {0: 0.6,  # 'USD'
                1: 0.4}  # 'EUR'

Note, instead of using letter ratings for the keys, we have enumerated using integers. Numba does not currently support string operations or dictionaries. This is one important consideration when working with Numba.

Generating our portfolio

First lets make some functions to randomly assign a rating and a currency.

In [2]:
# make functions to sample from settings
get_rating = lambda: rand.choice(list(ratings.keys()), 1, p=list(ratings.values()))[0]
get_swap_cur = lambda: rand.choice(list(pct_swap_cur.keys()), 1, p=list(pct_swap_cur.values()))[0]

Then we create the deals using the inputs we defined above.

In [3]:
# function to generate a forex deal
def forex(i):
    r = get_rating()
    return (0, rand.randint(min_value, max_value), rand.randint(min_maturity*ppy, max_maturity*ppy)/ppy,
            rand.uniform(min_fixed, max_fixed), r, hazard[r], -1)

# function to generate a swap deal
def swap(i):
    r = get_rating()
    return (1, rand.randint(min_value, max_value), rand.randint(min_maturity*ppy, max_maturity*ppy)/ppy,
            rand.uniform(min_fixed, max_fixed), r, hazard[r], get_swap_cur())

# vectorize to get functions to work with fromfunction
vforex = np.vectorize(forex)
vswap = np.vectorize(swap)

# generate forex and swaps
forexs = np.asarray(np.fromfunction(vforex, (n_forex,))).T
swaps = np.asarray(np.fromfunction(vswap, (n_swaps,))).T

# combine into one deal array
deals = np.concatenate([forexs, swaps]) 

# print first row to show an example of one deal
deals[0]  # [type, value, maturity, fixed, rating, hazard, currency]
Out[3]:
array([  0.00000000e+00,   8.91008600e+07,   1.00000000e+00,
         1.38570132e-02,   4.00000000e+00,   5.00000000e-01,
        -1.00000000e+00])

Writing the CVA function for Numba

Generate standard normals and forex curves

Here we pregenerate our standard normals and forex curves used for each time step in the simulation.

In [4]:
# Generate standard normals
znorms = rand.randn(sims, periods)  # (simulations, max # of periods)

# Generate forex curves
def gen_curve(periods, xbar, alpha, vol, dt):
    curve = []
    sqrtdt = math.sqrt(dt)
    for i in range(periods):
        Z = rand.randn()
        cpast = 1 if i==0 else curve[i-1]
        curve.append(abs(alpha*xbar*cpast + vol*sqrtdt*(xbar+Z)))
    return curve

# Generate forex curve for USD and EUR
forex_curves = np.asarray(list(zip(gen_curve(periods, 0.1, 0.1, 0.3, 1/ppy), 
                                   gen_curve(periods, 0.1, 0.1, 0.3, 1/ppy))))
forex_curves[:5]
Out[4]:
array([[ 0.08367271,  0.11661154],
       [ 0.05881636,  0.10628529],
       [ 0.08903457,  0.15040312],
       [ 0.06683134,  0.12186166],
       [ 0.1404406 ,  0.04686952]])

CVA function

This function will take the generated deals and curves and apply Monte Carlo to estimate CVA. It is pretty straight forward in that it loops through each deal, aggregating the individual CVA for each. Here, the returned value is a percentage of CVA relative to the total value of the deals. This provides a better way to look at than an arbitrary large number. With this value, we can say x% of our credit portfolio will likely be lost due to counterparty default.

In this case, I use Numba's guvectorize to compile the function. The CVA function's arguments are the list of deals, parameters for the MC, and a scalar to store the result. guvectorize requires that the result be returned by modifying an input since it does not allow the function to return anything.

In [5]:
from numbapro import guvectorize, float64


def calc_CVA(deals, args, res):
    fxprice, vol, int_rate, recov_rate, sims = args  # parse parameters
    
    # initialize variables
    CVA = 0.0
    tot_value = 0.0
    dt = 1/ppy
    sqrtdt = math.sqrt(dt)
    
    # calc CVA for each deal
    for k in range(len(deals)):
        deal_type, value, maturity, fixed, rating, hazard, currency = deals[k]
        tot_value += value
        
        # for each monte carlo simulation
        for i in range(int(sims)):
            xold = fxprice
            xnew = 0
            cva = 0.0
            
            # for each period
            exp_h_0 = 1  # j-1 exponential for q, pull out to optimize
            for j in range(1, int(maturity*ppy)-1):  # skip t=0
                xnew = xold * (1 + vol * sqrtdt * znorms[i, j])  # fxprice at t=j
                exp_h_1 = math.exp(-hazard*j*dt)
                q = exp_h_0 - exp_h_1 # prob of default
                exp_h_0 = exp_h_1
                if deal_type == 0:  # forex
                    v = value * xnew * math.exp(-int_rate * j * dt)
                elif deal_type == 1:  # swap
                    v = value * (fixed - forex_curves[j, int(currency)]) * dt
                    v = v if currency == 0 else v/xnew # convert to USD if EUR swap
                else:
                    raise ValueError("Unknown deal type!")

                cva += v*q
            CVA += cva/sims
    res[0] = (1-recov_rate)*CVA/tot_value
Vendor:  Continuum Analytics, Inc.
Package: numbapro
Message: trial mode expires in 29 days

Adding numba

Now we call the numba decorator to compile the cva function. Note, I call the decorator explicitly instead of using @guvectorize so I can preserve the pure python version to compare to the numba compiled versions. Also, note that I set nopython=True in the cpu_calc since the cpu target will fallback to python mode if there are compilation errors. The parallel and gpu target do not have this issue since they require compilation to work.

In [6]:
cpu_calc = guvectorize([(float64[:,:], float64[:], float64[:])], '(m,n),(k)->()', target='cpu', nopython=True)(calc_CVA)

How is adding one line of code going to help? Let's find out...


Performance

Below we see examples of this being ran with regular Python and numba targeting the cpu. In the 1000 simulation / 2000 deals scenario Python takes about 4 minutes compute in Python and about 2 seconds in numba.

That's a 200X speedup!!!

That is a remarkable difference. I didn't really have to change much either in order to get the function to compile with numba. The main challenge I encountered was setting the data types of the inputs correctly, namely enumerating string values.

When I expand this to a 2,000,000 deal scenario, numba computes the CVA in about 4 minutes on my Macbook Air. There is still the opportunity to target the gpu and see how this would perform on a CUDA-capable machine.

In [7]:
args = np.asarray([1.0, 0.3, 0.01, 0.9, sims])  # starting fxprice, vol, int_rate, simulations
res = np.zeros(1)
In [8]:
%time calc_CVA(deals, args, res)
res[0]
CPU times: user 4min 9s, sys: 494 ms, total: 4min 9s
Wall time: 4min 10s
Out[8]:
0.024708592505027108
In [10]:
%time cpu_calc(deals, args, res)
res[0]
CPU times: user 1.83 s, sys: 15.2 ms, total: 1.85 s
Wall time: 1.85 s
Out[10]:
0.024708592505027108

Analysis

What does the extra performance buy us?

Since we can now calculate CVA 150 times faster, we can more easily compute CVA under different scenarios. For example, a risk officer wants to get an idea of CVA for a range of volatilities. Or a trader wants to rebalance the type of counterparty risk. All of these can be determined on the fly (relatively).

As an example, what happens with different levels of the interest rate? As shown below, there is a clear behavior, with a slight convexity in the level of rates.

In [11]:
import pandas as pd
from bokeh.charts import Line, Scatter, Bar, show, output_notebook
output_notebook()

int_cva = []
ints = np.arange(0, 0.10, 0.01)
for i in ints:
    args = np.asarray([1.0, 0.3, i, 0.9, sims])  # starting fxprice, vol, int_rate, recov_rate, simulations
    res = np.zeros(1)
    cpu_calc(deals, args, res)
    int_cva.append(res[0])
    
data = pd.DataFrame({'CVA': int_cva, 'int_rate': ints})
line = Line(data, x='int_rate', y='CVA', title='CVA vs Interest Rate')
show(line)