In [2]:
%matplotlib inline
import pymc3 as pm
import numpy as np
import seaborn as sns

First, let's run an analysis of 100 binomial trials, with zero positive outcomes:

In [3]:
n1 = 100
x1 = 0
In [6]:
with pm.Model() as first_dataset:
    θ = pm.Beta('θ', 1, 1)
    x = pm.Binomial('x', n=n1, p=θ, observed=x1)
    
    trace1 = pm.sample(2000)
Applied logodds-transform to θ and added transformed θ_logodds_ to model.
Assigned NUTS to θ_logodds_
100%|██████████| 2000/2000 [00:00<00:00, 2721.90it/s]

The posterior mean for θ is 0.012, and the 95% HPD interval [0.000, 0.031] has width 0.031:

In [7]:
pm.summary(trace1)
θ:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  0.012            0.025            0.001            [0.000, 0.031]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  0.000          0.003          0.008          0.014          0.037
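Since the Beta(1, 1) prior is conjugate to the binomial likelihood, the posterior is available in closed form as Beta(1 + x, 1 + n − x), which gives us a quick analytic sanity check on the MCMC summary above. A minimal sketch (using `scipy.stats`, which isn't imported in the notebook cells above):

```python
from scipy import stats

n1, x1 = 100, 0

# Conjugate update: Beta(1 + x1, 1 + n1 - x1) = Beta(1, 101)
posterior = stats.beta(1 + x1, 1 + n1 - x1)

# Analytic mean 1/102 ≈ 0.0098, close to the sampled estimate of 0.012
print(posterior.mean())

# Central 95% interval, comparable to the posterior quantiles above
print(posterior.ppf([0.025, 0.975]))
```

Note that `ppf` gives an equal-tailed interval, which will differ slightly from the HPD interval that `pm.summary` reports, especially for a posterior piled up against the boundary at zero.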

Now, let's add another 100 samples, but this time with 10 positive outcomes:

In [8]:
n2 = 100
x2 = 10

with pm.Model() as combined_dataset:
    θ = pm.Beta('θ', 1, 1)
    x = pm.Binomial('x', n=n1+n2, p=θ, observed=x1+x2)
    
    trace2 = pm.sample(2000)
Applied logodds-transform to θ and added transformed θ_logodds_ to model.
Assigned NUTS to θ_logodds_
100%|██████████| 2000/2000 [00:00<00:00, 3415.85it/s]
In [9]:
pm.summary(trace2)
θ:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  0.056            0.030            0.002            [0.025, 0.087]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  0.027          0.042          0.053          0.064          0.091

Notice that the credible interval is twice as wide (0.062 versus 0.031), even though the sample size has doubled! This happens because the posterior mean moved away from zero, where the binomial variance p(1−p)/n is smallest: observing some positive outcomes increases our uncertainty about θ by more than the additional samples reduce it.
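The widening can be confirmed analytically using the same conjugate Beta posteriors for the two datasets. A sketch comparing central 95% interval widths (equal-tailed rather than HPD, so the numbers differ slightly from the summaries above):

```python
import numpy as np
from scipy import stats

# Conjugate posteriors for the two datasets:
post1 = stats.beta(1 + 0, 1 + 100 - 0)    # zero positives out of 100
post2 = stats.beta(1 + 10, 1 + 200 - 10)  # ten positives out of 200

# Widths of the central 95% intervals
width1 = np.diff(post1.ppf([0.025, 0.975]))[0]
width2 = np.diff(post2.ppf([0.025, 0.975]))[0]

# The second interval is roughly twice as wide as the first
print(width1, width2, width2 / width1)
```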