Key ideas: False discovery rates, simulation study, multiple testing
This notebook is a small simulation study to assess the sampling properties of different approaches to handling multiple testing, using the routines provided in the statsmodels
stats.multitest module. All assessments are done for the case of a one-sample Z-test for the expected value of a population, where the data are an independent and identically distributed sample from the population.
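As a reminder of the basic test, the one-sample Z-statistic and its two-sided p-value can be computed directly. The sketch below is illustrative only; the sample size and the assumed population mean of 0.5 are arbitrary choices, not values used in the simulation study.

```python
import numpy as np
from scipy.stats.distributions import norm

# Illustrative data: 40 iid draws from a population with mean 0.5
rng = np.random.RandomState(0)
x = rng.normal(loc=0.5, size=40)

# One-sample Z-statistic for H0: E[X] = 0, using the sample SD as a
# plug-in estimate of the population SD
z = np.sqrt(len(x)) * x.mean() / x.std()

# Two-sided p-value from the standard normal distribution
p = 2 * norm.cdf(-np.abs(z))
```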
We start by importing a few modules:
import numpy as np
from statsmodels.stats.multitest import multipletests
from scipy.stats.distributions import norm
import pandas as pd
Next we define a class that holds several parameters that control how the test data are generated. Each parameter is given a default value.
class parameters:

    # Number of simulation replications
    nrep = 1000

    # Sample size
    n = 40

    # Effect size (mean under the alternative hypothesis)
    effect_size = 1

    # Cluster size
    clust_size = 5

    # Intraclass correlation for clusters
    icc = 0.

    # The threshold for calling a positive result
    threshold = 0.1

    # The multitest method to evaluate
    method = "fdr_by"
Next we define a function called "simulate", where all the simulation takes place. This function takes an instance of the "parameters" class as an argument, and returns an array with one row per replication, containing the number of positive calls and the proportion of those calls that are false (the empirical FDR):
def simulate(p):

    efdr = []
    for i in range(p.nrep):

        data = np.random.normal(size=(200, p.n))

        # Introduce some positive dependence
        data = np.kron(data, np.ones((p.clust_size, 1)))
        data = np.sqrt(p.icc)*data + np.sqrt(1 - p.icc)*np.random.normal(size=data.shape)

        # 20 tests will follow the alternative hypothesis.  Place these
        # tests at random positions so they are distributed among the clusters.
        ii = np.random.permutation(data.shape[0])
        i1 = ii[:20]
        data[i1, :] += p.effect_size

        # ix=1 is a true alternative, ix=0 is a true null
        ix = np.zeros(data.shape[0])
        ix[i1] = 1

        # Carry out the one-sample Z-tests
        zscores = np.sqrt(p.n) * data.mean(1) / data.std(1)
        pvalues = 2*norm.cdf(-np.abs(zscores))

        # Get the adjusted p-values
        apv = multipletests(pvalues, method=p.method)[1]

        # Number of calls, empirical FDR
        calls = apv < p.threshold
        efdr.append([np.sum(calls), np.mean(ix[calls] == 0)])

    return np.asarray(efdr)
First we run the simulation with independent tests (all population structure parameters are set at their default values). The results show that the approach is slightly conservative (the observed FDR is lower than the nominal FDR, which defaults to 0.1).
efdr = simulate(parameters())
rslt = pd.Series(efdr.mean(0), index=["Mean #calls", "Observed FDR"])
print(rslt)
Mean #calls     20.884000
Observed FDR     0.045815
dtype: float64
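The conservativeness reflects how the adjustment works: each raw p-value is inflated before being compared to the threshold. The toy p-values below are made up purely for illustration, and are not drawn from the simulation above.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# A small illustrative set of raw p-values (not from the simulation)
pvalues = np.array([0.001, 0.008, 0.039, 0.041, 0.27, 0.6])

# The second element of the returned tuple holds the adjusted p-values
adj = multipletests(pvalues, alpha=0.1, method="fdr_by")[1]

# Each adjusted p-value is at least as large as the raw one, and the
# adjusted values are capped at 1
print(adj)
```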
Now we introduce some positive dependence among the tests. The FDR approaches are derived for data-generating models with no dependence between tests, but the methods often perform well even when fairly strong dependence is present.
p = parameters()
p.icc = 0.9
p.effect_size = 1
efdr = simulate(p)
rslt = pd.Series(efdr.mean(0), index=["Mean #calls", "Observed FDR"])
print(rslt)
Mean #calls     20.861000
Observed FDR     0.041134
dtype: float64
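The "fdr_by" method evaluated here (Benjamini-Yekutieli) is designed to control the FDR under arbitrary dependence among the tests, at the cost of being more conservative than the basic Benjamini-Hochberg procedure ("fdr_bh"). The sketch below contrasts the two adjustments on a made-up set of p-values; the values are illustrative only.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Illustrative raw p-values (not from the simulation)
pvalues = np.array([0.001, 0.008, 0.039, 0.041, 0.27, 0.6])

adj_bh = multipletests(pvalues, method="fdr_bh")[1]
adj_by = multipletests(pvalues, method="fdr_by")[1]

# BY scales the BH adjustment upward, so it is uniformly more
# conservative (every BY-adjusted p-value is at least the BH one)
print(adj_bh)
print(adj_by)
```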