*Key ideas:* False discovery rates, simulation study, multiple testing

This notebook is a small simulation study assessing the sampling properties of different approaches to handling multiple testing, using the routines provided in the statsmodels `stats.multitest` module. All assessments are done for the case of a one-sample Z-test for the expected value of a population, where the data are an independent and identically distributed sample from that population.
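Before setting up the simulation, here is a minimal sketch of the workflow on toy data: compute one-sample Z-test p-values, then adjust them with `multipletests` (the sample size of 40 and the Benjamini/Hochberg method are illustrative choices, not part of the study below).

```python
import numpy as np
from scipy.stats.distributions import norm
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)

# 10 one-sample Z-tests, all generated under the null hypothesis
data = rng.normal(size=(10, 40))
z = np.sqrt(40) * data.mean(1) / data.std(1)
pvals = 2 * norm.cdf(-np.abs(z))

# multipletests returns (reject flags, adjusted p-values, ...);
# here we adjust with the Benjamini/Hochberg FDR procedure
reject, adj = multipletests(pvals, alpha=0.1, method="fdr_bh")[:2]
```

The adjusted p-values `adj` are what get compared against the calling threshold; they are never smaller than the raw p-values.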

We start by importing a few modules:

In [1]:

```
import numpy as np
from statsmodels.stats.multitest import multipletests
from scipy.stats.distributions import norm
import pandas as pd
```

In [2]:

```
class parameters:
    # Number of simulation replications
    nrep = 1000
    # Sample size
    n = 40
    # Effect size (mean under the alternative hypothesis)
    effect_size = 1
    # Cluster size
    clust_size = 5
    # Intraclass correlation for clusters
    icc = 0.
    # The threshold for calling a positive result
    threshold = 0.1
    # The multitest method to evaluate
    method = "fdr_by"
```

Next we define a function called "simulate", in which all of the simulation takes place. This function takes an instance of the "parameters" class as an argument. The results of this function are:

- Expected number of calls
- FDR (False Discovery Rate) -- this is what the FDR methods are supposed to control
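As a concrete illustration of how the empirical FDR is computed within a single replication, consider these hypothetical toy values (not output from the simulation):

```python
import numpy as np

# Hypothetical truth labels (1 = true alternative, 0 = true null)
# and adjusted p-values for five tests
ix = np.array([1, 1, 0, 0, 0])
apv = np.array([0.02, 0.30, 0.05, 0.60, 0.90])
threshold = 0.1

called = apv < threshold          # tests called positive
n_calls = np.sum(called)          # two tests are called
efdr = np.mean(ix[called] == 0)   # fraction of calls that are true nulls
```

Here one of the two calls is a true null, so the empirical FDR for this replication is 0.5.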

In [3]:

```
def simulate(p):
    efdr = []
    for i in range(p.nrep):
        data = np.random.normal(size=(200, p.n))
        # Introduce some positive dependence
        data = np.kron(data, np.ones((p.clust_size, 1)))
        data = np.sqrt(p.icc)*data + np.sqrt(1 - p.icc)*np.random.normal(size=data.shape)
        # 20 tests will follow the alternative hypothesis.  Place these
        # tests at random positions so they are distributed among the clusters.
        ii = np.random.permutation(data.shape[0])
        i1 = ii[:20]
        i0 = ii[20:]
        data[i1, :] += p.effect_size
        # ix=1 is a true alternative, ix=0 is a true null
        ix = np.zeros(data.shape[0])
        ix[i1] = 1
        # Carry out the one-sample Z-tests
        zscores = np.sqrt(p.n) * data.mean(1) / data.std(1)
        pvalues = 2*norm.cdf(-np.abs(zscores))
        # Get the adjusted p-values
        apv = multipletests(pvalues, method=p.method)[1]
        # Number of calls, empirical FDR
        efdr.append([np.sum(apv < p.threshold), np.mean(ix[apv < p.threshold] == 0)])
    return np.asarray(efdr)
```

In [4]:

```
efdr = simulate(parameters())
rslt = pd.Series(efdr.mean(0), index=["Mean #calls", "Observed FDR"])
print(rslt)
```

In [5]:

```
p = parameters()
p.icc = 0.9
p.effect_size = 1
efdr = simulate(p)
rslt = pd.Series(efdr.mean(0), index=["Mean #calls", "Observed FDR"])
print(rslt)
```
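The `method` parameter can be set to any procedure supported by `multipletests`. To see how the procedures differ in stringency, here is a minimal self-contained sketch on a hypothetical vector of raw p-values (the values are illustrative, not taken from the simulation):

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p-values: a few strong signals among weaker ones
raw_pvals = np.array([0.001, 0.008, 0.02, 0.04, 0.2, 0.5])

ncalls = {}
for method in ["bonferroni", "fdr_bh", "fdr_by"]:
    reject = multipletests(raw_pvals, alpha=0.1, method=method)[0]
    ncalls[method] = int(reject.sum())
```

On this example, Benjamini/Hochberg ("fdr_bh") makes the most calls, while Benjamini/Yekutieli ("fdr_by"), which is valid under arbitrary dependence, is more conservative.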