%matplotlib inline
import pymc3 as pm
import numpy as np
import seaborn as sns
First, let's run an analysis of 100 binomial samples, with zero positive outcomes:
n1 = 100
x1 = 0
with pm.Model() as first_dataset:
    θ = pm.Beta('θ', 1, 1)
    x = pm.Binomial('x', n=n1, p=θ, observed=x1)
    trace1 = pm.sample(2000)
Applied logodds-transform to θ and added transformed θ_logodds_ to model.
Assigned NUTS to θ_logodds_
100%|██████████| 2000/2000 [00:00<00:00, 2721.90it/s]
The posterior mean is 0.012, with a 95% credible interval of width 0.031:
pm.summary(trace1)
θ:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  0.012            0.025            0.001            [0.000, 0.031]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  0.000          0.003          0.008          0.014          0.037
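Because the Beta(1, 1) prior is conjugate to the binomial likelihood, this posterior is also available in closed form: Beta(1 + 0, 1 + 100) = Beta(1, 101). As a sanity check on the sampler, here is a quick sketch using SciPy (not imported above):

```python
import numpy as np
from scipy import stats

# Conjugate update: Beta(1, 1) prior with 0 successes in 100 trials
# gives a Beta(1 + 0, 1 + 100 - 0) = Beta(1, 101) posterior.
posterior = stats.beta(1, 101)

print(posterior.mean())               # analytic posterior mean, 1/102 ≈ 0.0098
print(posterior.ppf([0.025, 0.975]))  # central 95% interval, ≈ [0.0002, 0.036]
```

The analytic quantiles line up well with the sampled posterior quantiles above; the small discrepancies are Monte Carlo error.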
Now, let's add another 100 samples, but this time with 10 positive outcomes:
n2 = 100
x2 = 10
with pm.Model() as combined_dataset:
    θ = pm.Beta('θ', 1, 1)
    x = pm.Binomial('x', n=n1+n2, p=θ, observed=x1+x2)
    trace2 = pm.sample(2000)
Applied logodds-transform to θ and added transformed θ_logodds_ to model.
Assigned NUTS to θ_logodds_
100%|██████████| 2000/2000 [00:00<00:00, 3415.85it/s]
pm.summary(trace2)
θ:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  0.056            0.030            0.002            [0.025, 0.087]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  0.027          0.042          0.053          0.064          0.091
Notice that the credible interval is twice as wide, even though the sample size has doubled! With zero positive outcomes, the posterior was squeezed against θ = 0, where the binomial variance θ(1 − θ)/n is tiny; the 10 new positives pull the estimate up to around 0.05, where the likelihood is much less informative per observation.
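The same conjugate algebra lets us verify the widening outside the sampler. Under the Beta(1, 1) prior, the two posteriors are Beta(1, 101) and Beta(11, 191), and the central 95% interval of the second is markedly wider than that of the first (again a sketch with SciPy, which is not part of the code above):

```python
import numpy as np
from scipy import stats

# Analytic posteriors under the Beta(1, 1) prior:
post1 = stats.beta(1 + 0, 1 + 100)    # 0 positives out of 100
post2 = stats.beta(1 + 10, 1 + 190)   # 10 positives out of 200 (combined)

width1 = np.diff(post1.ppf([0.025, 0.975]))[0]
width2 = np.diff(post2.ppf([0.025, 0.975]))[0]

# The second interval is roughly 1.7–2x the width of the first,
# despite being based on twice as much data.
print(width1, width2, width2 / width1)
```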