In [1]:
import pandas as pd
import numpy as np
import seaborn
%matplotlib inline

from IPython.display import Image
from IPython.core.display import HTML 
In [2]:
file = 'ExperimentFullResults_noPercentage.csv'
df = pd.read_csv(file)

Experiment results analysis.

Clean data and descriptive statistics.

From the 1250 runs, 260 were erroneous, mainly because the portfolio value went below zero (and this creates a stackoverflow error when estimating the final statistics, I think this error is fixed in a recent patch).

In [3]:
df[df.TotalTrades == 0].shape[0]
Out[3]:
260
In [4]:
df = df[df.TotalTrades != 0]
In [5]:
df.describe()
Out[5]:
TotalTrades AverageWin AverageLoss CompoundingAnnualReturn Drawdown Expectancy NetProfit SharpeRatio LossRate WinRate ... AnnualStandardDeviation AnnualVariance InformationRatio TrackingError TreynorRatio TotalFees MaxExposure Leverage InitialCash PairsToTrade
count 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000 ... 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000 990.000000
mean 1001.993939 0.026291 -0.026413 -0.030644 0.512565 0.016145 -0.127610 0.135249 0.496717 0.503283 ... 0.319380 0.647293 -0.216307 0.354697 -0.399678 3743.065495 0.415455 10.200000 333535.353535 3.009091
std 501.122852 0.051129 0.050953 0.084435 0.319709 0.128365 0.279853 0.125491 0.029475 0.029475 ... 0.738854 14.030692 0.290198 0.726263 4.056452 8763.064810 0.198053 10.712645 377734.219805 1.428058
min 7.000000 -0.049700 -0.695000 -0.946200 0.002000 -0.631000 -0.981950 -0.215000 0.160000 0.420000 ... 0.001000 0.000000 -0.596000 0.113000 -48.368000 0.000000 0.200000 1.000000 10000.000000 1.000000
25% 684.000000 0.004200 -0.028000 -0.033930 0.198250 -0.015000 -0.221470 0.063000 0.490000 0.490000 ... 0.063250 0.004000 -0.493000 0.126000 0.072000 0.000000 0.300000 2.000000 50000.000000 2.000000
50% 1026.000000 0.012600 -0.012600 -0.002065 0.524500 0.003000 -0.014850 0.099000 0.500000 0.500000 ... 0.195000 0.038000 -0.210000 0.219000 0.131000 0.000000 0.400000 5.000000 100000.000000 3.000000
75% 1376.000000 0.026700 -0.004200 0.002730 0.804000 0.013000 0.019985 0.175000 0.510000 0.510000 ... 0.399000 0.159000 -0.024000 0.407000 0.247000 2000.190000 0.500000 10.000000 500000.000000 4.000000
max 1724.000000 0.528200 -0.000300 0.354120 1.000000 2.008000 0.480640 1.336000 0.580000 0.840000 ... 21.003000 441.106000 1.361000 20.974000 10.025000 55883.380000 0.800000 50.000000 1000000.000000 5.000000

8 rows × 23 columns

Sharpe ratio distribution

I'll analyze the effect of the algorithm parameters making focus just in the Sharpe ratio (all the data is available here, so you can make your own analysis if you want to).

Sharpe ratio density grouped by Maximum exposure

In [6]:
Image(url="http://i.imgur.com/zcN9XSU.png")
Out[6]:

Sharpe ratio density grouped by Leverage

In [7]:
Image(url='http://i.imgur.com/eidGdcp.png')
Out[7]:

Sharpe ratio density grouped by PairsToTrade

In [8]:
Image(url='http://i.imgur.com/i37YEZa.png')
Out[8]:

Sharpe ratio density grouped by InitialCash

In [9]:
Image(url='http://i.imgur.com/FefKZJS.png')
Out[9]:

Sharpe ratio density grouped by Broker

Finally the main question!

In [10]:
Image(url='http://i.imgur.com/lEPT2kU.png')
Out[10]:

At first sight, this plot shows a better performance of FXCM relative to Oanda.

Now, let's try to figure out why FXCM has that second smaller peak.

Sharpe ratio vs. Annual SD by broker, by PairsToTrade

In [11]:
Image(url='http://i.imgur.com/bcO00pa.png')
Out[11]:

I used Annual SD in this projection because it helps us to separate the FXCM observations from the density second peak. The points shape are defined by PairstoTrade, and is clear that the FXCM observations to the right are mostly by runs where only the pairs with the biggest and the lowest excess returns were traded.

Sharpe ratio box plot by Broker

In [12]:
Image(url='http://i.imgur.com/idX7TqI.png')
Out[12]:

This plot shows:

  • The mean (the dark blue vertical line)
  • Border values for the standard deviation of the mean. The blue highlighted area is the entire standard deviation of the mean.
  • The median (yellow vertical line). The thin blue line represents the area between the first (25%) and the third (75%) quantile, while the thin dotted line represents the entire range of values (from the lowest to the highest value in the data set for the selected parameter).

Here we can have a measure of the difference in Sharpe ratio because of the Broker selection. We already know that FXCM has, in average, a better performance. Here is clear that the difference is small if the means are compared; if we compare the medians, the difference is even smaller.

Finally, lets make some simple statistical tests in order to know if those small differences are significance.

In [13]:
from scipy.stats import ttest_ind, ttest_rel
fxcm = df[df['Broker']=='fxcm']
oanda = df[df['Broker']=='oanda']
In [14]:
statistics = df.columns[:17]
for statistic in statistics:
    test = ttest_ind(fxcm.get(statistic), oanda.get(statistic), equal_var=False)[1]
    print(statistic)
    print("\t=> Is difference significative at 95%? {1}".format(statistic, test<0.05))
    print("\t=> Is difference significative at 99%? {1}".format(statistic, test<0.01))
TotalTrades
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False
AverageWin
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False
AverageLoss
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False
CompoundingAnnualReturn
	=> Is difference significative at 95%? True
	=> Is difference significative at 99%? True
Drawdown
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False
Expectancy
	=> Is difference significative at 95%? True
	=> Is difference significative at 99%? True
NetProfit
	=> Is difference significative at 95%? True
	=> Is difference significative at 99%? True
SharpeRatio
	=> Is difference significative at 95%? True
	=> Is difference significative at 99%? True
LossRate
	=> Is difference significative at 95%? True
	=> Is difference significative at 99%? True
WinRate
	=> Is difference significative at 95%? True
	=> Is difference significative at 99%? True
ProfitLossRatio
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False
Alpha
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False
Beta
	=> Is difference significative at 95%? True
	=> Is difference significative at 99%? True
AnnualStandardDeviation
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False
AnnualVariance
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False
InformationRatio
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False
TrackingError
	=> Is difference significative at 95%? False
	=> Is difference significative at 99%? False

Well, seems FXCM has a significantly better performance than OANDA.

Being statistically correct one can say: this experiment failed to show that FXCM and OANDA have the same impact in trading.

Extras

Take care! High exposure + high leverage = huge drawdowns

In [15]:
Image(url='http://i.imgur.com/sSDuZhu.png')
Out[15]:
In [16]:
Image(url='http://i.imgur.com/Qpq2LDL.png')
Out[16]:

The first plot is the Drawdown box plot grouped by MaxExposure, the second is Drawdown gruped by Leverage.

So, FXCM is cheaper, uh? Let's see how it scales.

FXCM fees grouped by InitialCash.

In [17]:
Image(url='http://i.imgur.com/FlARWUo.png')
Out[17]:
In [ ]: