In key question 3, we address the risk of sarcoma dissemination following morcellation of fibroids. We identified a recently published analysis conducted by Elizabeth Pritts and her colleagues (2015) to estimate the prevalence of occult leiomyosarcoma at time of treatment for presumed benign tumors (fibroids). We updated their search and used similar eligibility criteria to identify papers published since 2014. We extracted from these papers the number of women who were treated for uterine fibroids and the cases of leiomyosarcoma subsequently identified. We have combined these data with the data from the 134 publications that Pritts et al included in their analysis for a total of LMS rates from 148 sources.
%matplotlib inline
import numpy as np
import pandas as pd
import pymc3 as pm
import seaborn as sns
import matplotlib.pyplot as plt
import pdb
sns.set()
kq3_data = pd.read_excel('data/UF CER KQ3 Data for Analysis.xlsx', sheetname='5.UFKQ3Data',
na_values=['NR', 'NA'],
index_col=0)[['Author', 'Year', 'Design', 'Procedure', 'Indication',
'Age, Mean', 'Age, SD', 'Age, Median', 'Age, Min', 'Age, Max', 'Age, Other',
'LMS','Population','Tumors','InPritts']].dropna(thresh=7)
kq3_data.head()
Author | Year | Design | Procedure | Indication | Age, Mean | Age, SD | Age, Median | Age, Min | Age, Max | Age, Other | LMS | Population | Tumors | InPritts | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Line | |||||||||||||||
1.0 | Adelusola KA, Ogunniyi SO | 2001 | Retrospective | NE | NE | NaN | NaN | NaN | 19.0 | 89 | NaN | 0.0 | 177.0 | 0.0 | 1.0 |
2.0 | Ahmed AA, Stachurski J, Abdel Aziz E et al | 2002 | Prospective | NE | NE | NaN | NaN | NaN | 29.0 | 65 | NaN | 0.0 | 10.0 | 0.0 | 1.0 |
3.0 | Angle HS, Cohen SM, Hidlebaugh D | 1995 | Retrospective | NE | NE | 41.0 | NaN | NaN | 24.0 | 81 | NaN | 0.0 | 41.0 | 0.0 | 1.0 |
4.0 | Banaczek Z, Sikora K, Lewandowska-Andruszuk I | 2004 | Retrospective | NE | NE | 44.5 | NaN | NaN | 29.0 | 73 | NaN | 0.0 | 309.0 | 0.0 | 1.0 |
5.0 | Barbieri R, Dilena M, Chumas J, Rein MS, Fried... | 1993 | RCT | NE | NE | 33.7 | NaN | NaN | NaN | NaN | 36.3 | 0.0 | 20.0 | 0.0 | 1.0 |
Missing values
kq3_data.isnull().sum()
Author 0 Year 0 Design 0 Procedure 0 Indication 2 Age, Mean 67 Age, SD 138 Age, Median 142 Age, Min 58 Age, Max 59 Age, Other 114 LMS 0 Population 0 Tumors 0 InPritts 0 dtype: int64
kq3_data['age_max'] = kq3_data['Age, Max'].replace('50+', 50)
kq3_data = kq3_data.rename(columns={'Age, Min': 'age_min',
'Age, Mean': 'age_mean',
'Age, SD': 'age_sd',
'Age, Median': 'age_med'})
kq3_data[kq3_data.age_max.isnull()]
Author | Year | Design | Procedure | Indication | age_mean | age_sd | age_med | age_min | Age, Max | Age, Other | LMS | Population | Tumors | InPritts | age_max | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Line | ||||||||||||||||
5.0 | Barbieri R, Dilena M, Chumas J, Rein MS, Fried... | 1993 | RCT | NE | NE | 33.70 | NaN | NaN | NaN | NaN | 36.3 | 0.0 | 20.0 | 0.0 | 1.0 | NaN |
6.0 | Begum S, Khan S | 2004 | Prospective | NE | NE | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | 91.0 | 0.0 | 1.0 | NaN |
8.0 | Betjes HE, Hanstede M, Emanuel M, Stewart EA | 2009 | Retrospective | NE | NE | 44.30 | NaN | NaN | NaN | NaN | 44.6, 44 | 0.0 | 539.0 | 0.0 | 1.0 | NaN |
10.0 | Bronz L, Suter T, Rusca T | 1997 | Prospective | NE | NE | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | 25.0 | 0.0 | 1.0 | NaN |
11.0 | Bushaqer NJ, Dayoub N | 2014 | Retrospective | NE | NE | 36.00 | NaN | NaN | NaN | NaN | NaN | 0.0 | 137.0 | 0.0 | 1.0 | NaN |
12.0 | Butt JL, Jeffery ST, Van der Spuy ZM | 2012 | Retrospective | NE | NE | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | 106.0 | 0.0 | 1.0 | NaN |
16.0 | Colgan TJ, Pendergast S, LeBlanc M | 1993 | Retrospective | NE | NE | 36.90 | NaN | NaN | NaN | NaN | 41.4 | 0.0 | 77.0 | 0.0 | 1.0 | NaN |
19.0 | De Falco M, Staibano S, Mascolo M, Mignona C, ... | 2009 | RCT | NE | NE | 37.30 | NaN | NaN | NaN | NaN | 37.5 | 0.0 | 62.0 | 0.0 | 1.0 | NaN |
22.0 | DiLieto A, De Falco M, Mansueto G, De Rosa G, ... | 2005 | RCT | NE | NE | 36.80 | NaN | NaN | NaN | NaN | 37.2 | 0.0 | 70.0 | 0.0 | 1.0 | NaN |
32.0 | Gavai M, Hupuczi P, Papp Z | 2006 | Retrospective | NE | NE | 33.00 | NaN | NaN | NaN | NaN | NaN | 0.0 | 504.0 | 0.0 | 1.0 | NaN |
44.0 | Hoffman M, DeCesare S, Kalter C | 1994 | Prospective | NE | NE | 41.90 | NaN | NaN | NaN | NaN | 42.7 | 0.0 | 47.0 | 0.0 | 1.0 | NaN |
45.0 | Huang JQ, Lathi RB, Lemyre M, Rodriguez HE, Ne... | 2010 | Retrospective | NE | NE | 41.00 | NaN | NaN | NaN | NaN | 45 | 0.0 | 131.0 | 0.0 | 1.0 | NaN |
46.0 | Jansen FW, de Kroon CD, van Dongen H, Grooters... | 2006 | Prospective | NE | NE | 43.80 | NaN | NaN | NaN | NaN | 67.4 | 0.0 | 89.0 | 0.0 | 1.0 | NaN |
48.0 | Johns D, Diamond MP | 1994 | Retrospective | NE | NE | 39.20 | NaN | NaN | NaN | NaN | NaN | 0.0 | 55.0 | 0.0 | 1.0 | NaN |
49.0 | Kafy S, Huang JYJ, Al-Sunaidi M, Wiener D, Tul... | 2006 | Retrospective | NE | NE | 59.60 | NaN | NaN | NaN | NaN | 46.8, 47.0 | 0.0 | 934.0 | 0.0 | 1.0 | NaN |
50.0 | Kalogiannidis I, Prapas N, Xiromeritis P et al | 2010 | Prospective | NE | NE | 34.80 | 4.50 | NaN | NaN | NaN | 37.7 ± 4.8 | 0.0 | 75.0 | 0.0 | 1.0 | NaN |
51.0 | Kamikabeya TS, Etchebehere RM, Nomelini RS, Mu... | 2010 | Retrospective | NE | NE | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 1364.0 | 1.0 | 1.0 | NaN |
52.0 | Kiltz RJ, Rutgers J, Phillips J, Murugesapilla... | 1994 | Prospective | NE | NE | 31.00 | 1.80 | NaN | NaN | NaN | 28 ± 1.3, 30 ± 2.1 | 0.0 | 28.0 | 0.0 | 1.0 | NaN |
60.0 | Levens ED, Wesley R, Prekumar A, Blocker W, Ni... | 2009 | RCT | NE | NE | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | 18.0 | 0.0 | 1.0 | NaN |
61.0 | Lim SS, Sockalingam JK, Tan PC | 2008 | RCT | NE | NE | 46.50 | NaN | NaN | NaN | NaN | NaN | 0.0 | 66.0 | 0.0 | 1.0 | NaN |
62.0 | Litta P, Fantinato S, Calonaci F, Cosmi E, Fil... | 2010 | RCT | NE | NE | 37.34 | NaN | NaN | NaN | NaN | 38.11 | 0.0 | 160.0 | 0.0 | 1.0 | NaN |
66.0 | MacKenzie IZ, Naish C, Rees M, Manek S | 2004 | Retrospective | NE | NE | 47.50 | NaN | NaN | NaN | NaN | NaN | 0.0 | 118.0 | 0.0 | 1.0 | NaN |
68.0 | Mansour FW, Kives S, Urbach DR, Lefebvre G | 2012 | Retrospective | NE | NE | 34.70 | NaN | NaN | NaN | NaN | 35.3 | 0.0 | 59.0 | 0.0 | 1.0 | NaN |
73.0 | Milad MP, Morrison K, Sokol A, Miller D, Kirkp... | 2001 | Prospective | NE | NE | 43.90 | NaN | NaN | NaN | NaN | 47.3 | 0.0 | 69.0 | 0.0 | 1.0 | NaN |
80.0 | Obed JY, Bako B, Usman J, Moruppa JY, Kadas S | 2011 | Prospective | NE | NE | 30.10 | NaN | NaN | NaN | NaN | NaN | 0.0 | 331.0 | 0.0 | 1.0 | NaN |
84.0 | Palomba S, Orio F Jr, Russo T, Falbo A, Tolino... | 2005 | RCT | NE | NE | 53.40 | NaN | NaN | NaN | NaN | 52.2 | 0.0 | 40.0 | 0.0 | 1.0 | NaN |
85.0 | Palomba S, Zupi E, Falbo A, Russo T, Marconi D... | 2010 | Prospective | NE | NE | 30.20 | NaN | NaN | NaN | NaN | 28.9 | 0.0 | 30.0 | 0.0 | 1.0 | NaN |
92.0 | Radosa MP, Owsianowski Z, Mothes A, Weisheit J... | 2014 | Retrospective | NE | NE | 37.90 | NaN | NaN | NaN | NaN | NaN | 0.0 | 221.0 | 0.0 | 1.0 | NaN |
93.0 | Rein MS, Friedman AJ, Stuart JM, MacLaughlin DT | 1990 | RCT | NE | NE | NaN | NaN | NaN | NaN | NaN | "premenopausal" | 0.0 | 20.0 | 0.0 | 1.0 | NaN |
94.0 | Reiter RC, Wagner PL, Gambone JC | 1992 | Retrospective | NE | NE | 41.50 | NaN | NaN | NaN | NaN | 42.9 | 0.0 | 104.0 | 0.0 | 1.0 | NaN |
95.0 | Rosenblatt P, Makai G, DiSciullo A | 2010 | Retrospective | NE | NE | 50.20 | NaN | NaN | NaN | NaN | NaN | 0.0 | 24.0 | 0.0 | 1.0 | NaN |
96.0 | Rovio PH, Helin R, Heinonen PK | 2009 | Retrospective | NE | NE | 44.70 | NaN | NaN | NaN | NaN | NaN | 0.0 | 53.0 | 0.0 | 1.0 | NaN |
97.0 | Rutgers JL, Spong CY, Sinow R, Heiner J | 1995 | RCT | NE | NE | 38.00 | NaN | NaN | NaN | NaN | 39 | 0.0 | 22.0 | 0.0 | 1.0 | NaN |
99.0 | Samaila M, Adesiyun AG, Agunbiade OS, Mohammed... | 2009 | Retrospective | NE | NE | 44.60 | NaN | NaN | NaN | NaN | NaN | 0.0 | 196.0 | 0.0 | 1.0 | NaN |
100.0 | Sayyah-Melli M, Tehrani-Gadim S, Dastranj-Tabr... | 2009 | RCT | NE | NE | 39.67 | NaN | NaN | NaN | NaN | 36.87 | 0.0 | 23.0 | 0.0 | 1.0 | NaN |
101.0 | Schutz K, Possover M, Merker A, Michels W, Sch... | 2002 | RCT | NE | NE | NaN | NaN | 47.5 | NaN | NaN | 48, 49 | 0.0 | 48.0 | 0.0 | 1.0 | NaN |
104.0 | Seracchioli R, Venturoli S, Vianello F, Govoni... | 2002 | RCT | NE | NE | 46.30 | NaN | NaN | NaN | NaN | 47.4 | 0.0 | 122.0 | 0.0 | 1.0 | NaN |
105.0 | Shen C, Wu M, Kung F, Huang FJ, Hsieh CH, Lan ... | 2003 | Retrospective | NE | NE | 45.50 | NaN | NaN | NaN | NaN | NaN | 0.0 | 1521.0 | 0.0 | 1.0 | NaN |
108.0 | Silva BAC, Falcone T, Bradley L, Goldberg J, M... | 2000 | Prospective | NE | NE | 37.00 | NaN | NaN | NaN | NaN | NaN | 0.0 | 37.0 | 0.0 | 1.0 | NaN |
109.0 | Silva BAC, Falcone T, Bradley L, Goldberg J, M... | 2000 | Retrospective | NE | NE | 37.00 | NaN | NaN | NaN | NaN | NaN | 0.0 | 37.0 | 0.0 | 1.0 | NaN |
110.0 | Sinha R, Hegde A, Mahajan C, Dubey N, Sundaram M | 2008 | Prospective | NE | NE | 34.44 | NaN | NaN | NaN | NaN | 33.98 | 2.0 | 505.0 | 2.0 | 1.0 | NaN |
113.0 | Tan J, Sun Y, Zhong B, Dai H, Wang D | 2009 | RCT | NE | NE | 36.30 | NaN | NaN | NaN | NaN | 35.8 | 0.0 | 80.0 | 0.0 | 1.0 | NaN |
118.0 | Van Dongen H, Emanuel MH, Wolterbeek R, Trimbo... | 2008 | RCT | NE | NE | 48.20 | NaN | NaN | NaN | NaN | 49 | 0.0 | 22.0 | 0.0 | 1.0 | NaN |
121.0 | Walid MS, Heaton RL | 2010 | Retrospective | NE | NE | NaN | NaN | NaN | NaN | NaN | "all reproductive age" | 0.0 | 41.0 | 0.0 | 1.0 | NaN |
123.0 | Wang CJ, Soong YK, Lee CL | 2007 | Prospective | NE | NE | NaN | NaN | 38.5 | NaN | NaN | NaN | 0.0 | 18.0 | 0.0 | 1.0 | NaN |
126.0 | Williams A, Critchley H, Osei J, Ingamells S, ... | 2007 | RCT | NE | NE | NaN | NaN | NaN | NaN | NaN | "premenopausal" | 0.0 | 33.0 | 0.0 | 1.0 | NaN |
127.0 | Williams CD, Marshburn PB | 1998 | Prospective | NE | NE | 38.50 | NaN | NaN | NaN | NaN | NaN | 0.0 | 5.0 | 0.0 | 1.0 | NaN |
130.0 | Ylikorkala O, Tiitinen A, Hulkko S, Kivinen S,... | 1995 | RCT | NE | NE | 43.00 | NaN | NaN | NaN | NaN | 44.5 | 0.0 | 101.0 | 0.0 | 1.0 | NaN |
132.0 | Yoon HJ, Kyung MS, Jung US, Choi JS | 2007 | Retrospective | NE | NE | 34.90 | NaN | NaN | NaN | NaN | NaN | 0.0 | 51.0 | 0.0 | 1.0 | NaN |
133.0 | Zhu L, Lang JH, Liu CY, SHI HH, Zun ZJ, Fan R | 2009 | RCT | NE | NE | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | 101.0 | 0.0 | 1.0 | NaN |
134.0 | Zullo F, Palomba S, Corea D, Pellicano M, Russ... | 2004 | RCT | NE | NE | 28.20 | NaN | NaN | NaN | NaN | 27.1 | 0.0 | 60.0 | 0.0 | 1.0 | NaN |
136.0 | Balgobin S, P. A. Maldonado, K. Chin, J. I. Sc... | 2016 | Retrospective | hysterectomy | AUB, leiomyoma, pain, prolapse, stress urinary... | 46.00 | 11.30 | NaN | NaN | NaN | NaN | 0.0 | 435.0 | 0.0 | 0.0 | NaN |
138.0 | Raine-Bennett T, L. Y. Tucker, E. Zaritsky, R.... | 2016 | Pop based cohort | hysterectomy | leiomyoma | NaN | NaN | NaN | 18.0 | NaN | NaN | 172.0 | 34603.0 | 172.0 | 0.0 | NaN |
139.0 | Zhao WC, F. F. Bi, D. Li and Q. Yang | 2015 | Retrospective | hysterectomy; myomectomy | uterine fibroids | 48.20 | 7.64 | NaN | NaN | NaN | NaN | 13.0 | 10248.0 | 13.0 | 0.0 | NaN |
142.0 | Tan-Kim J, K. A. Hartzell, C. S. Reinsch, C. H... | 2015 | Retrospective | hysterectomy | NaN | 46.00 | 6.00 | NaN | NaN | NaN | NaN | 3.0 | 941.0 | 3.0 | 0.0 | NaN |
144.0 | Cormio G, Loizzi, Ceci O, Leone L, Selvaggi L ... | 2015 | Retrospective | myomectomy | bleeding due to uterine fibroid | NaN | NaN | NaN | NaN | NaN | NaN | 3.0 | 588.0 | 3.0 | 0.0 | NaN |
145.0 | Zhang J, J. Zhang, Y. Dai, L. Zhu, J. Lang and... | 2015 | Retrospective | myomectomy | menstrual disturbances, pelvic pain, myoma det... | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 4248.0 | 1.0 | 0.0 | NaN |
147.0 | Bojahr B, De Wilde RL, and Tchartchian G | 2015 | Retrospective | hysterectomy | symptomatic uterine myomas | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 | 10731.0 | 2.0 | 0.0 | NaN |
148.0 | Lieng M, E. Berner and B. Busund | 2015 | Retrospective | hysterectomy, myomectomy | uterine fibroids | 61.20 | 12.30 | NaN | NaN | NaN | NaN | 6.0 | 4771.0 | 6.0 | 0.0 | NaN |
kq3_data.describe()
age_mean | age_sd | age_med | age_min | LMS | Population | Tumors | InPritts | age_max | |
---|---|---|---|---|---|---|---|---|---|
count | 81.000000 | 10.000000 | 6.000000 | 90.000000 | 148.000000 | 148.000000 | 148.000000 | 148.000000 | 89.000000 |
mean | 41.744815 | 7.071000 | 39.833333 | 26.860000 | 1.750000 | 790.472973 | 1.750000 | 0.905405 | 61.638202 |
std | 6.269069 | 3.044779 | 4.910261 | 6.393638 | 14.256577 | 3299.459447 | 14.256577 | 0.293648 | 13.414371 |
min | 28.200000 | 1.800000 | 35.000000 | 18.000000 | 0.000000 | 5.000000 | 0.000000 | 0.000000 | 34.000000 |
25% | 37.200000 | 6.025000 | 36.475000 | 21.250000 | 0.000000 | 40.000000 | 0.000000 | 1.000000 | 51.000000 |
50% | 41.900000 | 6.600000 | 38.200000 | 25.000000 | 0.000000 | 90.500000 | 0.000000 | 1.000000 | 60.000000 |
75% | 45.900000 | 7.812500 | 42.700000 | 31.000000 | 0.000000 | 297.750000 | 0.000000 | 1.000000 | 71.000000 |
max | 61.200000 | 12.300000 | 47.500000 | 44.000000 | 172.000000 | 34603.000000 | 172.000000 | 1.000000 | 96.000000 |
kq3_data[[c for c in kq3_data.columns if c.startswith('age')]].hist();
Breakdown on studies by design.
kq3_data.Design.value_counts()
Retrospective 83 Prospective 38 RCT 26 Pop based cohort 1 Name: Design, dtype: int64
I will drop the cohort study, since it is the only such representative.
kq3_data = kq3_data[kq3_data.Design!='Pop based cohort']
kq3_data = pd.concat([kq3_data, pd.get_dummies(kq3_data.Design)[['Prospective', 'RCT']]], axis=1)
Following Pritts et al. (2015), I fit a binomial random effects model, such that event probabilities on the logit scale are normally distributed with mean $\mu$ and standard deviation $\sigma$. This distribution describes how the probabilities vary across studies, with the degree of variation described by $\sigma$.
$$\theta_i \sim N(\mu, \sigma^2)$$the expected value for study $i$ is then inverse-logit transformed, and used as the event probability $\pi_i$ in a binomial model describing the number of observed tumors $t$:
$$\log\left[\frac{\pi_i}{1-\pi_i}\right] = \theta_i$$$$t_i \sim \text{Bin}(n_i, \pi_i)$$import theano.tensor as tt
from numpy.ma import masked_values
model_data = kq3_data
k = model_data.shape[0]
tumors = model_data.Tumors.values.astype(int)
n = model_data.Population.values.astype(int)
X = model_data[['Prospective', 'RCT']].values.astype(int)
age_max_norm = ((model_data.age_max - 60)/10).fillna(0.5).values
poly_terms = 3
def invlogit(x):
return tt.exp(x) / (1 + tt.exp(x))
with pm.Model() as pritts_update:
# Impute missing max ages
age_max_missing = masked_values(age_max_norm, value=0.5)
ν = pm.HalfCauchy('ν', 5, testval=1)
μ_age = pm.Normal('μ_age', 0, 5, testval=0)
age_max = pm.StudentT('age_max', ν, mu=μ_age, observed=age_max_missing)
# Study random effect
μ = pm.Normal('μ', 0, sd=100, testval=-3)
σ = pm.Uniform('σ', 0, 1000, testval=10)
θ = pm.Normal('θ', μ, sd=σ, shape=k)
# Design effects
β = pm.Normal('β', 0, sd=10, shape=2, testval=np.zeros(2))
# Polynomial age effect with Lasso
α = pm.Normal('α', 0, sd=10, testval=0)
# Study-specific probabilities
π = pm.Deterministic('π', invlogit(θ + tt.dot(X,β) + α*age_max))
# Expected probabilities by design
p_retro_1000 = pm.Deterministic('p_retro_1000', invlogit(μ)*1000)
p_prosp_1000 = pm.Deterministic('p_prosp_1000', invlogit(μ + β[0])*1000)
p_rct_1000 = pm.Deterministic('p_rct_1000', invlogit(μ + β[1])*1000)
obs = pm.Binomial('obs', n=n, p=π, observed=tumors)
Applied log-transform to ν and added transformed ν_log to model. Applied interval-transform to σ and added transformed σ_interval to model.
with pritts_update:
step = pm.NUTS()
trace = pm.sample(2000, step=step, njobs=2, random_seed=[20140425, 19700903])
[-----------------100%-----------------] 2001 of 2000 complete in 689.1 sec
The following plots are the distribution of samples from the posterior distributions for the expected (population) probability of tumor (p_update
), the inverse-logit expected probability ($\mu$) and the standard deviation of the probabilities on the inverse-logit scale ($\sigma$).
pm.traceplot(trace[1000:], varnames=['p_retro_1000', 'p_prosp_1000', 'p_rct_1000', 'α', 'μ', 'σ']);
Summary statistics from each of the above parameters.
pm.summary(trace[1000:], varnames=['p_retro_1000', 'p_prosp_1000', 'p_rct_1000', 'μ', 'σ'], roundto=4)
p_retro_1000: Mean SD MC Error 95% HPD interval ------------------------------------------------------------------- 0.7946 0.1931 0.0113 [0.4376, 1.1639] Posterior quantiles: 2.5 25 50 75 97.5 |--------------|==============|==============|--------------| 0.4451 0.6585 0.7891 0.9241 1.1779 p_prosp_1000: Mean SD MC Error 95% HPD interval ------------------------------------------------------------------- 0.5657 0.4015 0.0174 [0.0169, 1.3539] Posterior quantiles: 2.5 25 50 75 97.5 |--------------|==============|==============|--------------| 0.0746 0.2845 0.4725 0.7486 1.5540 p_rct_1000: Mean SD MC Error 95% HPD interval ------------------------------------------------------------------- 0.0352 0.1493 0.0087 [0.0000, 0.1788] Posterior quantiles: 2.5 25 50 75 97.5 |--------------|==============|==============|--------------| 0.0000 0.0000 0.0001 0.0061 0.3406 μ: Mean SD MC Error 95% HPD interval ------------------------------------------------------------------- -7.1682 0.2568 0.0156 [-7.6702, -6.7266] Posterior quantiles: 2.5 25 50 75 97.5 |--------------|==============|==============|--------------| -7.7168 -7.3248 -7.1438 -6.9858 -6.7428 σ: Mean SD MC Error 95% HPD interval ------------------------------------------------------------------- 0.8604 0.2499 0.0216 [0.3881, 1.3700] Posterior quantiles: 2.5 25 50 75 97.5 |--------------|==============|==============|--------------| 0.4307 0.6958 0.8348 1.0013 1.4240
/Users/fonnescj/anaconda3/lib/python3.5/site-packages/numpy/core/fromnumeric.py:225: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future return reshape(newshape, order=order)
By comparison, estimates from Pritts' supplement:
p_vars = ['p_retro_1000', 'p_prosp_1000', 'p_rct_1000']
intervals = [pm.stats.hpd(trace[var]) for var in p_vars]
meds = [np.median(trace[var]) for var in p_vars]
upper = [i[1] - m for m,i in zip(meds, intervals)]
lower = [m - i[0] for m,i in zip(meds, intervals)]
plt.errorbar(np.arange(3)-0.1, meds, yerr=[lower, upper], fmt='o', label='Updated MA')
plt.errorbar(np.arange(2)+0.1, [0.57, 0.12], yerr=[[0.57-0.17, 0.12], [1.13 - 0.57,0.75 - 0.12]], color='r', fmt='o', label='Pritts MA')
plt.xticks(range(3), ['Retrospective', 'Prospective', 'RCT'])
plt.ylabel('Rate per 1000 surgeries')
plt.xlim(-0.5, 2.25)
plt.legend();
pm.forestplot(trace[1000:], varnames=['age_max_missing'], ylabels=[''])
<matplotlib.gridspec.GridSpec at 0x124f61630>
To check how well the model fits the data, I conducted posteior predictive checks. This simulates data from the model for each data point (i.e. study) in the meta-analysis, and compares the distribution of these simulated values (here, 500 replicates) with the value of the data itself. If the percentile of the datum is very extreme (either large or small), then it is evidence that the model does not adequately fit the data. The distribution of percentiles in the bottom histogram shows that there is no evidence of lack of fit.
ppc = pm.sample_ppc(trace, model=pritts_update, samples=500)
from scipy.stats import percentileofscore
p = [percentileofscore(s, o).round(2) for s,o in zip(ppc['obs'].T, tumors)]
plt.hist(p)
plt.xlabel('percentile')
plt.ylabel('frequency');