KQ3: Pritts Meta-analysis Update

In Key Question 3, we address the risk of sarcoma dissemination following morcellation of fibroids. We identified a recently published analysis by Elizabeth Pritts and colleagues (2015) estimating the prevalence of occult leiomyosarcoma (LMS) at the time of treatment for presumed benign tumors (fibroids). We updated their search, using similar eligibility criteria to identify papers published since 2014, and extracted from each paper the number of women treated for uterine fibroids and the number of leiomyosarcomas subsequently identified. We then combined these data with the data from the 134 publications that Pritts et al. included in their analysis, giving LMS rates from a total of 148 sources.
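
As a quick orientation to the combined dataset, the following sketch (illustrative only, and assuming the kq3_data frame loaded in the next cell, with its LMS, Population, and InPritts columns) tallies the crude pooled LMS rate separately for the studies carried over from Pritts et al. and those added by the updated search:

# Illustrative sketch: assumes kq3_data has been loaded as in the cell below.
# InPritts == 1 marks studies from the original Pritts et al. analysis;
# InPritts == 0 marks studies added by the updated search.
def crude_rate_per_10000(df):
    # crude pooled rate: total LMS cases over total women treated
    return 10000 * df.LMS.sum() / df.Population.sum()

for flag, group in kq3_data.groupby('InPritts'):
    label = 'Pritts et al.' if flag == 1 else 'Updated search'
    print(label, len(group), 'studies,',
          round(crude_rate_per_10000(group), 1), 'per 10,000')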

In [2]:
%matplotlib inline
import numpy as np
import pandas as pd
import pymc3 as pm
import seaborn as sns
import matplotlib.pyplot as plt
import pdb
sns.set()

rseeds = 20090425, 19771114
In [3]:
kq3_data = pd.read_excel('data/UF CER KQ3 Data for Analysis.xlsx', sheetname='5.UFKQ3Data', 
                         na_values=['NR', 'NA'],
                        index_col=0)[['Author', 'Year', 'Design', 'Procedure', 'Indication',
                                     'Age, Mean', 'Age, SD', 'Age, Median', 'Age, Min', 'Age, Max', 'Age, Other',
                                     'LMS','Population','Tumors','InPritts']].dropna(thresh=7)
kq3_data.head()
Out[3]:
| Line | Author | Year | Design | Procedure | Indication | Age, Mean | Age, SD | Age, Median | Age, Min | Age, Max | Age, Other | LMS | Population | Tumors | InPritts |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Adelusola KA, Ogunniyi SO | 2001 | Retrospective | NE | NE | NaN | NaN | NaN | 19.0 | 89 | NaN | 0 | 177 | 0 | 1 |
| 2 | Ahmed AA, Stachurski J, Abdel Aziz E et al | 2002 | Prospective | NE | NE | NaN | NaN | NaN | 29.0 | 65 | NaN | 0 | 10 | 0 | 1 |
| 3 | Angle HS, Cohen SM, Hidlebaugh D | 1995 | Retrospective | NE | NE | 41.0 | NaN | NaN | 24.0 | 81 | NaN | 0 | 41 | 0 | 1 |
| 4 | Banaczek Z, Sikora K, Lewandowska-Andruszuk I | 2004 | Retrospective | NE | NE | 44.5 | NaN | NaN | 29.0 | 73 | NaN | 0 | 309 | 0 | 1 |
| 5 | Barbieri R, Dilena M, Chumas J, Rein MS, Fried... | 1993 | RCT | NE | NE | 33.7 | NaN | NaN | NaN | NaN | 36.3 | 0 | 20 | 0 | 1 |

Missing values

In [4]:
kq3_data.isnull().sum()
Out[4]:
Author           0
Year             0
Design           0
Procedure        0
Indication       3
Age, Mean       70
Age, SD        147
Age, Median    150
Age, Min        64
Age, Max        65
Age, Other     127
LMS              0
Population       0
Tumors           0
InPritts         0
dtype: int64
In [5]:
kq3_data['age_max'] = kq3_data['Age, Max'].replace('50+', 50)
kq3_data = kq3_data.rename(columns={'Age, Min': 'age_min',
                                   'Age, Mean': 'age_mean',
                                   'Age, SD': 'age_sd',
                                   'Age, Median': 'age_med'})
In [6]:
kq3_data.describe()
Out[6]:
| | Year | age_mean | age_sd | age_med | age_min | LMS | Population | Tumors | InPritts | age_max |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 161.000000 | 91.000000 | 14.000000 | 11.000000 | 97.00000 | 161.000000 | 161.000000 | 161.000000 | 161.000000 | 96.000000 |
| mean | 2005.149068 | 42.117912 | 7.315000 | 41.545455 | 26.91134 | 2.652174 | 996.900621 | 2.652174 | 0.832298 | 61.466667 |
| std | 7.865440 | 6.413083 | 2.977962 | 5.791954 | 6.20752 | 16.441662 | 3927.579174 | 16.441662 | 0.374767 | 13.430732 |
| min | 1984.000000 | 28.200000 | 1.800000 | 35.000000 | 18.00000 | 0.000000 | 5.000000 | 0.000000 | 0.000000 | 34.000000 |
| 25% | 1999.000000 | 37.320000 | 6.000000 | 37.450000 | 22.00000 | 0.000000 | 40.000000 | 0.000000 | 1.000000 | 51.000000 |
| 50% | 2006.000000 | 42.500000 | 6.600000 | 40.000000 | 26.00000 | 0.000000 | 92.000000 | 0.000000 | 1.000000 | 60.000000 |
| 75% | 2012.000000 | 46.000000 | 7.812500 | 45.800000 | 31.00000 | 0.000000 | 368.000000 | 0.000000 | 1.000000 | 70.250000 |
| max | 2016.000000 | 61.200000 | 12.500000 | 52.900000 | 44.00000 | 172.000000 | 34603.000000 | 172.000000 | 1.000000 | 96.000000 |
In [7]:
kq3_data[[c for c in kq3_data.columns if c.startswith('age')]].hist();

Breakdown of studies by design.

In [8]:
kq3_data['Design'] = kq3_data.Design.str.strip()
In [9]:
kq3_data.Design.value_counts()
Out[9]:
Retrospective                     88
Prospective                       38
RCT                               27
Retrospective Cohort               4
Pop based cohort                   1
Prospective cohort pilot study     1
Cohort                             1
Observational study                1
Name: Design, dtype: int64

I will restrict the analysis to the top three study designs.

In [10]:
design_subset = ['Retrospective', 'Prospective', 'RCT']
In [11]:
kq3_data = kq3_data[kq3_data.Design.isin(design_subset)]
In [12]:
kq3_data = pd.concat([kq3_data, pd.get_dummies(kq3_data.Design)[['Prospective', 'RCT']]], axis=1)
In [13]:
kq3_data['design_ind'] = kq3_data.Design.replace({'Retrospective':0, 'Prospective':1, 'RCT':2})
In [14]:
kq3_data.loc[kq3_data.Design=='Prospective', ['Population','Tumors']]
Out[14]:
Line  Population  Tumors
2 10 0
6 91 0
7 75 0
9 24 0
10 25 0
13 80 0
14 136 0
15 11 0
18 25 0
21 9 0
27 213 0
28 71 0
31 17 0
44 47 0
46 89 0
50 75 0
52 28 0
54 25 0
55 45 0
56 368 0
57 8 0
63 167 0
64 486 0
69 80 0
73 69 0
80 331 0
85 30 0
89 38 0
91 8 0
108 37 0
110 505 2
115 235 0
119 92 1
120 9 0
122 51 0
123 18 0
125 6 0
127 5 0

Following Pritts et al. (2015), I fit a binomial random-effects model in which the study-level event probabilities, on the logit scale, are normally distributed with mean $\mu$ and standard deviation $\sigma$. This distribution captures how the probabilities vary across studies, with the degree of variation governed by $\sigma$.

$$\theta_i \sim N(\mu, \sigma^2)$$

The logit-scale effect $\theta_i$ for study $i$ is then inverse-logit transformed to give the event probability $\pi_i$ in a binomial model for the number of observed tumors $t_i$ out of $n_i$ women treated:

$$\log\left[\frac{\pi_i}{1-\pi_i}\right] = \theta_i$$

$$t_i \sim \text{Bin}(n_i, \pi_i)$$
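
For intuition about the scale, the short snippet below (illustrative only, using scipy's expit as a numerical inverse-logit) shows how a logit-scale mean translates into a rate per 10,000 procedures; this is the same transformation used to define the p_*_10000 quantities in the model code.

from scipy.special import expit  # numerical inverse-logit

# A logit-scale mean of about -7 corresponds to roughly 9 events per
# 10,000 procedures; -6 and -8 bracket it at roughly 25 and 3.
for mu in (-6, -7, -8):
    print(mu, round(expit(mu) * 10000, 1))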

In [15]:
import theano.tensor as tt
from numpy.ma import masked_values

model_data = kq3_data

k = model_data.shape[0]                                     # number of studies
tumors = model_data.Tumors.values.astype(int)               # LMS cases per study
n = model_data.Population.values.astype(int)                # women treated per study
X = model_data[['Prospective', 'RCT']].values.astype(int)   # design dummies (Retrospective is the reference)
design = model_data.design_ind.values.astype(int)
# Center and scale the maximum age; missing values get the sentinel 0.5,
# which masked_values() flags for imputation in the models below
age_max_norm = ((model_data.age_max - 60)/10).fillna(0.5).values

poly_terms = 3

invlogit = pm.math.invlogit
In [16]:
with pm.Model() as pritts_update:
    
    # Impute missing max ages
    age_max_missing = masked_values(age_max_norm, value=0.5)
    ν = pm.HalfCauchy('ν', 5, testval=1)
    μ_age = pm.Normal('μ_age', 0, 5, testval=0)
    age_max = pm.StudentT('age_max', ν, mu=μ_age, observed=age_max_missing)
    
    # Study random effect
    μ = pm.Normal('μ', 0, sd=100, testval=-3)
    σ = pm.Uniform('σ', 0, 1000, testval=10)
    θ = pm.Normal('θ', μ, sd=σ, shape=k)
    # Design effects
    β = pm.Normal('β', 0, sd=10, shape=2, testval=np.zeros(2))
    # Age effect
    α = pm.Normal('α', 0, sd=10, testval=0)
    
    # Study-specific probabilities
    π = pm.Deterministic('π', invlogit(θ + tt.dot(X,β) + α*age_max))
    
    # Expected probabilities by design
    p_retro_10000 = pm.Deterministic('p_retro_10000', invlogit(μ)*10000)
    p_prosp_10000 = pm.Deterministic('p_prosp_10000', invlogit(μ + β[0])*10000)
    p_rct_10000 = pm.Deterministic('p_rct_10000', invlogit(μ + β[1])*10000)
    
    obs = pm.Binomial('obs', n=n, p=π, observed=tumors)
In [17]:
with pritts_update:
    advi_fit = pm.variational.advi(n=40000)
    trace = pm.sample(3000, step=pm.NUTS(scaling=advi_fit.means), start=advi_fit.means, random_seed=[20140425, 19700903][0])
Average ELBO = -308.35: 100%|██████████| 40000/40000 [00:10<00:00, 3764.53it/s]
Finished [100%]: Average ELBO = -305.43
100%|██████████| 3000/3000 [01:42<00:00, 29.32it/s]
In [49]:
with pm.Model() as pritts_update_noage:
    
    # Study random effect
    μ = pm.Normal('μ', 0, sd=100, testval=-3)
    σ = pm.Uniform('σ', 0, 1000, testval=10)
    θ = pm.Normal('θ', μ, sd=σ, shape=k)
    # Design effects
    β = pm.Normal('β', 0, sd=10, shape=2, testval=np.zeros(2))
    
    # Study-specific probabilities
    π = pm.Deterministic('π', invlogit(θ + tt.dot(X,β)))
    
    # Expected probabilities by design
    p_retro_10000 = pm.Deterministic('p_retro_10000', invlogit(μ)*10000)
    p_prosp_10000 = pm.Deterministic('p_prosp_10000', invlogit(μ + β[0])*10000)
    p_rct_10000 = pm.Deterministic('p_rct_10000', invlogit(μ + β[1])*10000)
    
    obs = pm.Binomial('obs', n=n, p=π, observed=tumors)
Applied interval-transform to σ and added transformed σ_interval_ to model.
/Users/fonnescj/Repositories/pymc3/pymc3/distributions/continuous.py:59: UserWarning: The variable specified for sd has non-positive support for Normal, likely making it unsuitable for this parameter.
  warnings.warn(msg)
In [50]:
with pritts_update_noage:
    trace = pm.sample(2000, random_seed=rseeds[0])#, random_seed=[20140425, 19700903])
Assigned NUTS to μ
Assigned NUTS to σ_interval_
Assigned NUTS to θ
Assigned NUTS to β
100%|██████████| 2000/2000 [00:30<00:00, 66.39it/s]
In [13]:
with pm.Model() as pritts_uncond:
    
    # Impute missing max ages
    age_max_missing = masked_values(age_max_norm, value=0.5)
    ν = pm.HalfCauchy('ν', 5, testval=1)
    μ_age = pm.Normal('μ_age', 0, 5, testval=0)
    age_max = pm.StudentT('age_max', ν, mu=μ_age, observed=age_max_missing)
    
    # Study random effect
    μ = pm.Normal('μ', 0, sd=100, testval=-3)
    σ = pm.Uniform('σ', 0, 1000, testval=10)
    θ = pm.Normal('θ', μ, sd=σ, shape=k)
    # Design effects
    β = pm.Normal('β', 0, sd=10, shape=2, testval=np.zeros(2))
    α = pm.Normal('α', 0, sd=10, testval=0)
    
    # Study-specific probabilities
    π = pm.Deterministic('π', invlogit(θ + α*age_max))
    
    p_10000 = pm.Deterministic('p_10000', invlogit(μ)*10000)
    
    obs = pm.Binomial('obs', n=n, p=π, observed=tumors)
Applied log-transform to ν and added transformed ν_log to model.
Applied interval-transform to σ and added transformed σ_interval to model.
In [21]:
with pritts_uncond:
    trace_uncond = pm.sample(2000, random_seed=rseeds[0])#, random_seed=[20140425, 19700903], njobs=2)
#     gp_fit = pm.variational.advi(n=10000)
Assigned NUTS to ν_log_
Assigned NUTS to μ_age
Assigned NUTS to age_max_missing
Assigned NUTS to μ
Assigned NUTS to σ_interval_
Assigned NUTS to θ
Assigned NUTS to β
Assigned NUTS to α
 [-----------------100%-----------------] 2000 of 2000 complete in 174.7 sec

The following plots show the posterior distributions of the expected (population) tumor rate per 10,000 surgeries for each study design (p_retro_10000, p_prosp_10000, p_rct_10000), the mean of the study-level effects on the logit scale ($\mu$), and the standard deviation of those effects on the logit scale ($\sigma$).

In [18]:
pm.traceplot(trace[-1000:], varnames=['p_retro_10000', 'p_prosp_10000', 'p_rct_10000', 
                                    'μ', 'σ']);

Summary statistics for each of the above parameters.

In [19]:
pm.summary(trace[-1000:], varnames=['p_retro_10000', 'p_prosp_10000', 'p_rct_10000', 'μ', 'σ'], roundto=4)
p_retro_10000:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  8.8856           2.3711           0.1747           [4.7012, 13.6712]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  4.7732         7.3025         8.7350         10.2718        13.8700


p_prosp_10000:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  5.5972           3.9193           0.1593           [0.4361, 13.6899]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  0.8226         2.8367         4.6300         7.4328         15.5504


p_rct_10000:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  0.3536           1.1564           0.0554           [0.0000, 1.8605]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  0.0000         0.0005         0.0068         0.1107         3.3614


μ:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  -7.0610          0.2723           0.0205           [-7.6544, -6.5873]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  -7.6468        -7.2214        -7.0421        -6.8799        -6.5792


σ:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  0.8534           0.2522           0.0226           [0.3870, 1.3387]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  0.4058         0.6632         0.8425         1.0187         1.3874

By comparison, estimates from Pritts' supplement:

[Figure: estimates from the Pritts et al. supplement]

In [20]:
p_vars = ['p_retro_10000', 'p_prosp_10000', 'p_rct_10000']
intervals = [pm.stats.hpd(trace[var][-1000:]) for var in p_vars]
In [21]:
meds = [np.median(trace[var][-1000:]) for var in p_vars]
upper = [i[1] - m for m,i in zip(meds, intervals)]
lower = [m - i[0] for m,i in zip(meds, intervals)]

plt.errorbar(np.arange(3)-0.1, meds, yerr=[lower, upper], fmt='s', label='Updated MA')
plt.errorbar(np.arange(2)+0.1, [5.7, 1.2], yerr=[[5.7-1.7, 1.2], [11.3 - 5.7,7.5 - 1.2]], 
             color='r', fmt='D', label='Pritts MA', elinewidth=0.5)
plt.xticks(range(3), ['Retrospective', 'Prospective', 'RCT'])
plt.ylabel('Rate per 10,000 surgeries')
plt.xlim(-0.5, 2.25)
plt.legend();
In [42]:
pm.forestplot(trace[1000:], varnames=['age_max_missing'], ylabels=[''])
Out[42]:
<matplotlib.gridspec.GridSpec at 0x1219b4048>

Pooled estimates

In [24]:
pm.traceplot(trace_uncond[1000:], varnames=['p_10000', 
                                     'α', 'μ', 'σ']);
In [25]:
pm.summary(trace_uncond[1000:], varnames=['p_10000', 
                                     'α', 'μ', 'σ']);
p_10000:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  6.965            1.839            0.131            [3.196, 10.338]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  3.496          5.677          6.944          8.132          10.769


α:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  0.149            0.184            0.011            [-0.242, 0.496]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  -0.199         0.020          0.147          0.269          0.548


μ:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  -7.306           0.282            0.021            [-7.893, -6.811]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  -7.958         -7.473         -7.272         -7.114         -6.833


σ:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  0.832            0.197            0.017            [0.474, 1.202]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  0.494          0.692          0.820          0.941          1.284

Goodness of fit

To check how well the model fits the data, I conducted posterior predictive checks. These simulate data from the model for each data point (i.e. study) in the meta-analysis and compare the distribution of the simulated values (here, 500 replicates) with the observed value itself. If the percentile of the observed datum within its simulated distribution is very extreme (either large or small), that is evidence that the model does not adequately fit the data. The distribution of percentiles in the histogram below shows no evidence of lack of fit.
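
As a toy illustration of the percentile comparison (not part of the analysis), percentileofscore places an observed value within its simulated distribution; values near 0 or 100 would indicate a poorly fitting study:

from scipy.stats import percentileofscore

simulated = [0, 0, 1, 0, 2, 0, 1, 0, 0, 3]  # hypothetical replicates for one study
print(percentileofscore(simulated, 1))  # observed value of 1: not extreme
print(percentileofscore(simulated, 8))  # 100: larger than every replicate, i.e. extreme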

In [22]:
ppc = pm.sample_ppc(trace, model=pritts_update, samples=500)
100%|██████████| 500/500 [00:07<00:00, 62.93it/s]
In [23]:
from scipy.stats import percentileofscore
        
p = [percentileofscore(s, o).round(2) for s,o in zip(ppc['obs'].T, tumors)]
        
plt.hist(p)
plt.xlabel('percentile')
plt.ylabel('frequency');