Copyright 2020 Allen Downey

Suppose you have run 174 trials and 173 were successful. You want to report an estimate of the probability of success and a confidence interval for the estimate.

According to our friends at Wikipedia there are several ways to compute it, based on different assumptions and requirements.

In my opinion, the clear best option is the Jeffreys interval.

The Jeffreys interval has a Bayesian derivation, but it has good frequentist properties. In particular, it has coverage properties that are similar to those of the Wilson interval, but it is one of the few intervals with the advantage of being equal-tailed (e.g., for a 95% confidence interval, the probabilities of the interval lying above or below the true value are both close to 2.5%). In contrast, the Wilson interval has a systematic bias such that it is centred too close to p = 0.5.

The Jeffreys interval is the Bayesian credible interval obtained when using the non-informative Jeffreys prior for the binomial proportion p. The Jeffreys prior for this problem is a Beta distribution with parameters (1/2, 1/2), it is a conjugate prior. After observing x successes in n trials, the posterior distribution for p is a Beta distribution with parameters (x + 1/2, n – x + 1/2).

When x ≠0 and x ≠ n, the Jeffreys interval is taken to be the 100(1 – α)% equal-tailed posterior probability interval, i.e., the α / 2 and 1 – α / 2 quantiles of a Beta distribution with parameters (x + 1/2, n – x + 1/2). These quantiles need to be computed numerically, although this is reasonably simple with modern statistical software.

In order to avoid the coverage probability tending to zero when p → 0 or 1, when x = 0 the upper limit is calculated as before but the lower limit is set to 0, and when x = n the lower limit is calculated as before but the upper limit is set to 1.

In my opinion, that sentence is an unnecessary hack.

Here's how to compute a Jeffrey's interval for the example.

In [1]:

```
trials = 174
successes = 173
failures = trials-successes
failures
```

Out[1]:

Here's a beta distribution that represents the posterior distribution for the proportion, assuming a Jeffrey's prior.

In [2]:

```
from scipy.stats import beta
dist = beta(successes+1/2, failures+1/2)
```

I think the best point estimate is the posterior mean:

In [3]:

```
estimate = dist.mean() * 100
estimate
```

Out[3]:

Here's the confidence interval.

In [4]:

```
p = 0.95
a = (1-p)
a, 1-a
```

Out[4]:

In [5]:

```
ci = dist.ppf([a/2, 1-a/2]) * 100
ci
```

Out[5]:

So you could report an estimate of 99.1% with 95% CI (97.3, 99.9).

If anyone asks how you computed that, tell them it's a Jeffreys interval. If they ask why, send them this paper.

In [ ]:

```
```