Let μ and σ be the mean and standard deviation of the returns of an asset. Then ζ=μ/σ is the "Signal to Noise Ratio" (SNR). Typically the SNR is estimated with the Sharpe Ratio, defined as ˆζ=ˆμ/ˆσ, where ˆμ and ˆσ are the vanilla sample estimates. Can we gain efficiency in the case where the returns have significant skew and kurtosis?
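As a quick numeric aside (the return series below is simulated, purely for illustration), the vanilla estimate ˆζ is just the sample mean over the sample standard deviation:

```python
import numpy as np

# simulate a year of hypothetical daily returns (illustrative values only)
rng = np.random.default_rng(0)
x = rng.normal(loc=0.001, scale=0.01, size=252)

muhat = x.mean()          # vanilla sample mean
sighat = x.std(ddof=1)    # vanilla sample standard deviation
sharpe = muhat / sighat   # the Sharpe Ratio, estimating the SNR
print(sharpe)
```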
Here we consider an estimator of the form v=a0+(a1+(1+a2)ˆμ+a3ˆμ²)/ˆσ+a4(ˆμ/ˆσ)².
Below, following Johnson, I will use the Cornish Fisher expansions of ˆμ and ˆσ to approximate v as a function of the first few cumulants of the distribution and some normal variates. I will then compute the mean square error, E[(v−ζ)²], and take its derivative with respect to each ai. Unfortunately, we will find that the first order conditions are solved by ai=0, which is to say that the vanilla Sharpe has the lowest MSE of estimators of this kind. Our adventure will take us far, but we will return home empty-handed.
We proceed.
# load what we need from sympy
from __future__ import division
from sympy import *
from sympy import Order
from sympy.assumptions.assume import global_assumptions
from sympy.stats import P, E, variance, Normal
init_printing()
nfactor = 4
# define some symbols.
a0, a1, a2, a3, a4 = symbols('a_0 a_1 a_2 a_3 a_4',real=True)
n, sigma = symbols(r'n \sigma',real=True,positive=True)
zeta, mu3, mu4 = symbols(r'\zeta \mu_3 \mu_4',real=True)
mu = zeta * sigma
We now express ˆμ and ˆσ² via the Cornish Fisher expansion. This is an expression of the distribution of a random variable in terms of its cumulants and a normal variate. The expansion is ordered such that, when applied to the mean of n independent draws of a distribution, the terms are clustered by powers of n. The Cornish Fisher expansion also involves the Hermite polynomials. The expansions of ˆμ and ˆσ² are not independent. We follow Johnson in expressing the correlation of the two normals and truncating:
# probabilist's hermite polynomials
def Hen(x,n):
    # use S(n) so the exponent stays an exact Rational, not a float
    return (2**(-S(n)/2) * hermite(n,x/sqrt(2)))
# this comes out of the wikipedia page:
h1 = lambda x : Hen(x,2) / 6
h2 = lambda x : Hen(x,3) / 24
h11 = lambda x : - (2 * Hen(x,3) + Hen(x,1)) / 36
# mu3 and mu4 are the 3rd and 4th centered moments of x
# gamma1 is the standardized skew, gamma2 the excess kurtosis,
# each scaled for a mean of n independent draws:
gamma1 = (mu3 / (sigma**3)) / sqrt(n)
gamma2 = ((mu4 / (sigma**4)) - 3) / n
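As a sanity check on the Hen helper (an aside, not part of the derivation), the probabilist's polynomials should come out as He2(x) = x² − 1 and He3(x) = x³ − 3x; here is a standalone version using exact rational exponents:

```python
from sympy import symbols, sqrt, hermite, expand, S

x = symbols('x')

def Hen(x, n):
    # probabilist's Hermite polynomial from the physicists' hermite()
    return 2**(-S(n)/2) * hermite(n, x / sqrt(2))

he2 = expand(Hen(x, 2))
he3 = expand(Hen(x, 3))
print(he2, he3)
```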
# grab two normal variates with correlation rho
# which happens to take value:
# rho = mu3 / sqrt(sigma**2 * (mu4 - sigma**4))
z1 = Normal('z_1',0,1)
z3 = Normal('z_3',0,1)
rho = symbols('\\rho',real=True)
z2 = rho * z1 + sqrt(1-rho**2)*z3
# this is out of Johnson, but we call it mu hat instead of x bar:
muhat = mu + (sigma/sqrt(n)) * (z1 + gamma1 * h1(z1) + gamma2 * h2(z1) + gamma1**2 * h11(z1))
muhat
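As an aside, we can confirm that the pair (z1, z2) constructed above really is standard normal with correlation ρ; a standalone check:

```python
from sympy import symbols, sqrt, simplify
from sympy.stats import E, Normal, variance

rho = symbols('rho', real=True)
z1 = Normal('z_1', 0, 1)
z3 = Normal('z_3', 0, 1)
# mix an independent normal in to get correlation rho with z1
z2 = rho * z1 + sqrt(1 - rho**2) * z3

# E[z1 z2] = rho E[z1^2] + sqrt(1-rho^2) E[z1] E[z3] = rho
print(simplify(E(z1 * z2)), simplify(variance(z2)))
```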
addo = sqrt((mu4 - sigma**4) / (n * sigma**4)) * z2
# this is s^2 in Johnson:
sighat2 = (sigma**2) * (1 + addo)
# use Taylor's theorem to express sighat^-1:
# 1/sighat = (1/sigma) * (1 + addo)**(-1/2) ~= (1/sigma) * (1 - addo/2)
invs = (sigma**(-1)) * (1 - addo / 2)
invs
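The Taylor step can be verified directly in sympy: expanding (1 + t)^(−1/2) around t = 0 gives 1 − t/2 plus higher-order terms, which is the approximation behind invs:

```python
from sympy import symbols, sqrt, series

t = symbols('t')
# the expansion behind 1/sighat = (1/sigma) * (1 + addo)**(-1/2)
s = series(1 / sqrt(1 + t), t, 0, 2)
print(s)
```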
# the new statistic; it is v = part1 + part2 + part3
part1 = a0
part2 = (a1 + (1+a2)*muhat + a3 * muhat**2) * invs
part3 = a4 * (muhat*invs)**2
v = part1 + part2 + part3
v
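Note that setting every ai to zero collapses v to the vanilla Sharpe ˆμ/ˆσ. A standalone check, with m and i standing in for ˆμ and 1/ˆσ:

```python
from sympy import symbols

a0, a1, a2, a3, a4, m, i = symbols('a_0 a_1 a_2 a_3 a_4 m i')
# m stands in for muhat, i for 1/sighat
v = a0 + (a1 + (1 + a2) * m + a3 * m**2) * i + a4 * (m * i)**2
v_vanilla = v.subs({a0: 0, a1: 0, a2: 0, a3: 0, a4: 0})
print(v_vanilla)
```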
That's a bit hairy. Here I truncate the statistic to the leading powers of n. Truncation was hard for me to figure out in sympy, so I took limits instead. (I like how 'oo' is infinity in sympy.)
#show nothing
v_0 = limit(v,n,oo)
v_05 = v_0 + (limit(sqrt(n) * (v - v_0),n,oo) / sqrt(n))
v_05
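The limit trick generalizes: take the n→∞ limit to get the leading term, then rescale the remainder by the next power of n and take the limit again. On a toy expression (not from the derivation) it looks like this:

```python
from sympy import symbols, sqrt, limit, oo

n, a, b = symbols('n a b', positive=True)
f = 1 + a / sqrt(n) + b / n

f_0 = limit(f, n, oo)                                     # leading term
f_05 = f_0 + limit(sqrt(n) * (f - f_0), n, oo) / sqrt(n)  # restore next term
print(f_0, f_05)
```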
Now we define the error as v−ζ and compute the approximate bias and variance of the error. We sum the variance and squared bias to get mean square error.
staterr = v_05 - zeta
# mean squared error of the statistic v, is
# MSE = E((newstat - zeta)**2)
# this is too slow, though, so evaluate them separately instead:
bias = E(staterr)
simplify(bias)
# variance of the error:
varerr = variance(staterr)
MSE = (bias**2) + varerr
collect(MSE,n)
That's really involved, and finding the derivative will be ugly. Instead we truncate, dropping the terms of order n⁻¹ and keeping only the part constant in n. Looking above, you will see that removing the terms in n⁻¹ leaves some quantity squared. That is what we will minimize. The way forward is fairly clear from here.
# truncate!
MSE_0 = limit(collect(MSE,n),n,oo)
MSE_1 = MSE_0 + (limit(n * (MSE - MSE_0),n,oo)/n)
MSE_0
Now we take the derivative of the Mean Square Error with respect to each ai. In each case we get an expression linear in all the ai. The first order conditions, which correspond to minimizing the MSE, are satisfied at ai=0.
# a_0
simplify(diff(MSE_0,a0))
# a_1
simplify(diff(MSE_0,a1))
# a_2
simplify(diff(MSE_0,a2))
# a_3
simplify(diff(MSE_0,a3))
# a_4
simplify(diff(MSE_0,a4))
To recap, the minimal MSE occurs for a0=a1=a2=a3=a4=0. We must try another approach.
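As an empirical coda (a sketch, not part of the symbolic argument), a small Monte Carlo on a skewed return stream points the same way. The shifted-gamma return model and all parameter values below are made up for illustration; inflating the vanilla Sharpe by a factor (1+a2) with a2>0 only increases the empirical MSE:

```python
import numpy as np

rng = np.random.default_rng(42)
# hypothetical skewed daily returns: a shifted gamma distribution
shape, scale, shift = 4.0, 0.005, -0.018
mu = shape * scale + shift         # true mean
sigma = np.sqrt(shape) * scale     # true standard deviation
zeta = mu / sigma                  # true SNR

n, trials = 252, 2000
x = rng.gamma(shape, scale, size=(trials, n)) + shift
sharpe = x.mean(axis=1) / x.std(axis=1, ddof=1)

def mse(a2):
    # estimator v = (1 + a2) * sharpe; a2 = 0 is the vanilla Sharpe
    v = (1.0 + a2) * sharpe
    return np.mean((v - zeta) ** 2)

print(mse(0.0), mse(0.1), mse(0.5))
```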