Let μ and σ be the mean and standard deviation of the returns of an asset. Then ζ=μ/σ is the "Signal to Noise Ratio" (SNR). Typically the SNR is estimated with the Sharpe Ratio, defined as ˆζ=ˆμ/ˆσ, where ˆμ and ˆσ are the vanilla sample estimates. Can we gain efficiency in the case where the returns have significant skew and kurtosis?
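As a quick numeric aside (the return series below is simulated, purely for illustration), the vanilla estimate ˆζ is just the sample mean over the sample standard deviation:

```python
import numpy as np

# simulate a year of hypothetical daily returns (illustrative values only)
rng = np.random.default_rng(0)
x = rng.normal(loc=0.001, scale=0.01, size=252)

muhat = x.mean()          # vanilla sample mean
sighat = x.std(ddof=1)    # vanilla sample standard deviation
sharpe = muhat / sighat   # the Sharpe Ratio, estimating the SNR
print(sharpe)
```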
Here we consider an estimator of the form v=a0+(a1+(1+a2)ˆμ+a3ˆμ²)/ˆσ+a4(ˆμ/ˆσ)².
Below, following Johnson, I will use the Cornish Fisher expansions of ˆμ and ˆσ to approximate v as a function of the first few cumulants of the distribution and some normal variates. I will then compute the mean square error, E[(v−ζ)²], and take its derivative with respect to each ai. Unfortunately, we will find that the first order conditions are solved by ai=0, which is to say that the vanilla Sharpe has the lowest MSE of estimators of this kind. Our adventure will take us far, but we will return home empty-handed.
We proceed.
# load what we need from sympy
from __future__ import division
from sympy import *
from sympy import Order
from sympy.assumptions.assume import global_assumptions
from sympy.stats import P, E, variance, Normal
init_printing()
nfactor = 4
# define some symbols.
a0, a1, a2, a3, a4 = symbols('a_0 a_1 a_2 a_3 a_4',real=True)
n, sigma = symbols(r'n \sigma',real=True,positive=True)
zeta, mu3, mu4 = symbols(r'\zeta \mu_3 \mu_4',real=True)
mu = zeta * sigma
We now express ˆμ and ˆσ² via the Cornish Fisher expansion. This is an expression of the distribution of a random variable in terms of its cumulants and a normal variate. The expansion is ordered such that, when applied to the mean of n independent draws of a distribution, the terms are clustered by powers of n. The Cornish Fisher expansion also involves the Hermite polynomials. The expansions of ˆμ and ˆσ² are not independent. We follow Johnson in expressing the correlation of the two normals and truncating:
# probabilist's hermite polynomials
def Hen(x,n):
    # use S(n) so the exponent stays an exact Rational, not a float
    return (2**(-S(n)/2) * hermite(n,x/sqrt(2)))
# this comes out of the wikipedia page:
h1 = lambda x : Hen(x,2) / 6
h2 = lambda x : Hen(x,3) / 24
h11 = lambda x : - (2 * Hen(x,3) + Hen(x,1)) / 36
# mu3 and mu4 are the 3rd and 4th centered moments of x
# gamma1 is the standardized skew, gamma2 the excess kurtosis,
# each scaled for a mean of n independent draws:
gamma1 = (mu3 / (sigma**3)) / sqrt(n)
gamma2 = ((mu4 / (sigma**4)) - 3) / n
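As a sanity check on the Hen helper (an aside, not part of the derivation), the probabilist's polynomials should come out as He2(x) = x² − 1 and He3(x) = x³ − 3x; here is a standalone version using exact rational exponents:

```python
from sympy import symbols, sqrt, hermite, expand, S

x = symbols('x')

def Hen(x, n):
    # probabilist's Hermite polynomial from the physicists' hermite()
    return 2**(-S(n)/2) * hermite(n, x / sqrt(2))

he2 = expand(Hen(x, 2))
he3 = expand(Hen(x, 3))
print(he2, he3)
```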
# grab two normal variates with correlation rho
# which happens to take value:
# rho = mu3 / sqrt(sigma**2 * (mu4 - sigma**4))
z1 = Normal('z_1',0,1)
z3 = Normal('z_3',0,1)
rho = symbols('\\rho',real=True)
z2 = rho * z1 + sqrt(1-rho**2)*z3
# this is out of Johnson, but we call it mu hat instead of x bar:
muhat = mu + (sigma/sqrt(n)) * (z1 + gamma1 * h1(z1) + gamma2 * h2(z1) + gamma1**2 * h11(z1))
muhat
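As an aside, we can confirm that the pair (z1, z2) constructed above really is standard normal with correlation ρ; a standalone check:

```python
from sympy import symbols, sqrt, simplify
from sympy.stats import E, Normal, variance

rho = symbols('rho', real=True)
z1 = Normal('z_1', 0, 1)
z3 = Normal('z_3', 0, 1)
# mix an independent normal in to get correlation rho with z1
z2 = rho * z1 + sqrt(1 - rho**2) * z3

# E[z1 z2] = rho E[z1^2] + sqrt(1-rho^2) E[z1] E[z3] = rho
print(simplify(E(z1 * z2)), simplify(variance(z2)))
```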
addo = sqrt((mu4 - sigma**4) / (n * sigma**4)) * z2
# this is s^2 in Johnson:
sighat2 = (sigma**2) * (1 + addo)
# use Taylor's theorem to express sighat^-1:
# 1/sighat = (1/sigma) * (1 + addo)**(-1/2) ~= (1/sigma) * (1 - addo/2)
invs = (sigma**(-1)) * (1 - addo / 2)
invs
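The Taylor step can be verified directly in sympy: expanding (1 + t)^(−1/2) around t = 0 gives 1 − t/2 plus higher-order terms, which is the approximation behind invs:

```python
from sympy import symbols, sqrt, series

t = symbols('t')
# the expansion behind 1/sighat = (1/sigma) * (1 + addo)**(-1/2)
s = series(1 / sqrt(1 + t), t, 0, 2)
print(s)
```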
# the new statistic; it is v = part1 + part2 + part3
part1 = a0
part2 = (a1 + (1+a2)*muhat + a3 * muhat**2) * invs
part3 = a4 * (muhat*invs)**2
v = part1 + part2 + part3
v
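Note that setting every ai to zero collapses v to the vanilla Sharpe ˆμ/ˆσ. A standalone check, with m and i standing in for ˆμ and 1/ˆσ:

```python
from sympy import symbols

a0, a1, a2, a3, a4, m, i = symbols('a_0 a_1 a_2 a_3 a_4 m i')
# m stands in for muhat, i for 1/sighat
v = a0 + (a1 + (1 + a2) * m + a3 * m**2) * i + a4 * (m * i)**2
v_vanilla = v.subs({a0: 0, a1: 0, a2: 0, a3: 0, a4: 0})
print(v_vanilla)
```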
That's a bit hairy. Here I truncate the statistic to the leading powers of n. Truncation was hard for me to figure out in sympy, so I took limits instead. (I like how 'oo' is infinity in sympy.)
#show nothing
v_0 = limit(v,n,oo)
v_05 = v_0 + (limit(sqrt(n) * (v - v_0),n,oo) / sqrt(n))
v_05
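The limit trick generalizes: take the n→∞ limit to get the leading term, then rescale the remainder by the next power of n and take the limit again. On a toy expression (not from the derivation) it looks like this:

```python
from sympy import symbols, sqrt, limit, oo

n, a, b = symbols('n a b', positive=True)
f = 1 + a / sqrt(n) + b / n

f_0 = limit(f, n, oo)                                     # leading term
f_05 = f_0 + limit(sqrt(n) * (f - f_0), n, oo) / sqrt(n)  # restore next term
print(f_0, f_05)
```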
Now we define the error as v−ζ and compute the approximate bias and variance of the error. We sum the variance and squared bias to get mean square error.
staterr = v_05 - zeta
# mean squared error of the statistic v, is
# MSE = E((newstat - zeta)**2)
# this is too slow, though, so evaluate them separately instead:
bias = E(staterr)
simplify(bias)
# variance of the error:
varerr = variance(staterr)
MSE = (bias**2) + varerr
collect(MSE,n)
That's really involved, and finding the derivative will be ugly. Instead we truncate, dropping the terms of order n⁻¹ and keeping only the part constant in n. Looking above, you will see that removing the terms in n⁻¹ leaves some quantity squared. That is what we will minimize. The way forward is fairly clear from here.
# truncate!
MSE_0 = limit(collect(MSE,n),n,oo)
MSE_1 = MSE_0 + (limit(n * (MSE - MSE_0),n,oo)/n)
MSE_0
Now we take the derivative of the Mean Square Error with respect to each ai. In each case we get an expression linear in all the ai. The first order conditions, which correspond to minimizing the MSE, are satisfied at ai=0.
# a_0
simplify(diff(MSE_0,a0))
# a_1
simplify(diff(MSE_0,a1))
# a_2
simplify(diff(MSE_0,a2))
# a_3
simplify(diff(MSE_0,a3))
# a_4
simplify(diff(MSE_0,a4))
To recap, the minimal MSE occurs for a0=a1=a2=a3=a4=0. We must try another approach.
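As an empirical coda (a sketch, not part of the symbolic argument), a small Monte Carlo on a skewed return stream points the same way. The shifted-gamma return model and all parameter values below are made up for illustration; inflating the vanilla Sharpe by a factor (1+a2) with a2>0 only increases the empirical MSE:

```python
import numpy as np

rng = np.random.default_rng(42)
# hypothetical skewed daily returns: a shifted gamma distribution
shape, scale, shift = 4.0, 0.005, -0.018
mu = shape * scale + shift         # true mean
sigma = np.sqrt(shape) * scale     # true standard deviation
zeta = mu / sigma                  # true SNR

n, trials = 252, 2000
x = rng.gamma(shape, scale, size=(trials, n)) + shift
sharpe = x.mean(axis=1) / x.std(axis=1, ddof=1)

def mse(a2):
    # estimator v = (1 + a2) * sharpe; a2 = 0 is the vanilla Sharpe
    v = (1.0 + a2) * sharpe
    return np.mean((v - zeta) ** 2)

print(mse(0.0), mse(0.1), mse(0.5))
```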