The Beta distribution is characterized as Beta(α,β) where α>0, β>0.
The Beta distribution PDF is given by:
$$ f(x) = c\,x^{\alpha-1}(1-x)^{\beta-1}, \quad 0 < x < 1 $$

We will put aside the normalization constant $c$ for now (wait until Lecture 25!).
Beta(α,β) distribution
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import (MultipleLocator, FormatStrFormatter,
                               AutoMinorLocator)
from scipy.stats import beta

%matplotlib inline

plt.xkcd()
_, ax = plt.subplots(figsize=(12, 8))
x = np.linspace(0, 1.0, 500)

# some Beta parameters
alphas = [0.5, 5, 1, 1, 2, 3]
betas = [0.5, 3, 2, 1, 1, 5]
params = list(zip(alphas, betas))

# qualitative color scheme
colors = ['#66c2a5', '#fc8d62', '#8da0cb', '#e78ac3', '#a6d854', '#ffd92f']

for i, (a, b) in enumerate(params):
    y = beta.pdf(x, a, b)
    ax.plot(x, y, color=colors[i], lw=3.2,
            label=r'$\alpha$={}, $\beta$={}'.format(a, b))

# legend styling
legend = ax.legend()
for label in legend.get_texts():
    label.set_fontsize('large')
for label in legend.get_lines():
    label.set_linewidth(1.5)

# y-axis
ax.set_ylim([0.0, 5.0])
ax.set_ylabel(r'$f(x)$')

# x-axis
ax.set_xlim([0, 1.0])
ax.set_xlabel(r'$x$')

# x-axis tick formatting
majorLocator = MultipleLocator(.2)
majorFormatter = FormatStrFormatter('%0.1f')
minorLocator = MultipleLocator(.1)
ax.xaxis.set_major_locator(majorLocator)
ax.xaxis.set_major_formatter(majorFormatter)
ax.xaxis.set_minor_locator(minorLocator)

ax.grid(color='grey', linestyle='-', linewidth=0.3)

plt.suptitle(r'Beta Distribution $f(x) = \frac{x^{\alpha-1} \, (1-x)^{\beta-1}}{B(\alpha,\beta)}$')
plt.show()
Recall Laplace's Rule of Succession, which dealt with the problem of the sun rising: there, we modeled whether the sun rises on any given day $X_k$, given a consecutive string of days $X_1, X_2, \dots, X_{k-1}$, as i.i.d. $\text{Bern}(p)$, where $p$ is the probability that the sun rises on a given day.
We made an assumption that p∼Unif(0,1).
Beta(1,1) is the same as Unif(0,1), and so we will show how to generalize using the Beta distribution.
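As a quick sanity check (a sketch using `scipy.stats`, not part of the lecture itself), the Beta(1,1) PDF is identically 1 on (0,1), exactly the Unif(0,1) PDF:

```python
import numpy as np
from scipy.stats import beta, uniform

# Beta(1,1) has PDF c * x^0 * (1-x)^0 = c = 1 on (0,1),
# which is precisely the Unif(0,1) PDF.
x = np.linspace(0.01, 0.99, 50)
assert np.allclose(beta.pdf(x, 1, 1), uniform.pdf(x))
```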
Suppose $X \mid p \sim \text{Bin}(n, p)$. We get to observe $X$, but we do not know the true value of $p$.
In such a case, we can assume that the prior p∼Beta(α,β). After observing n further trials, where perhaps k are successes and n−k are failures, we can use this information to update our beliefs on the nature of p using Bayes Theorem.
So what we want is the posterior $p \mid X$, since we will get to observe more values of $X$ and want to update our understanding of $p$.
$$
\begin{aligned}
f(p \mid X = k) &= \frac{P(X = k \mid p)\, f(p)}{P(X = k)} \\
&= \frac{\binom{n}{k} p^{k} (1-p)^{n-k} \cdot c\, p^{\alpha-1} (1-p)^{\beta-1}}{P(X = k)} \\
&\propto p^{\alpha+k-1} (1-p)^{\beta+n-k-1} \quad \text{since } \binom{n}{k},\, c \text{ and } P(X = k) \text{ do not depend on } p \\
\Rightarrow \quad p \mid X &\sim \text{Beta}(\alpha + X,\, \beta + n - X)
\end{aligned}
$$

_Conjugate_ refers to the fact that the prior and posterior come from the same family of distributions. We started off with $\text{Beta}(\alpha, \beta)$, and after $n$ additional observations of $X$ we end up with $\text{Beta}(\alpha + X, \beta + n - X)$.
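The conjugacy result can be checked numerically (a sketch with made-up prior and data values): compute the posterior by normalizing prior × likelihood on a grid, and compare it to the closed-form $\text{Beta}(\alpha + k, \beta + n - k)$ density.

```python
import numpy as np
from scipy.stats import beta, binom
from scipy.integrate import trapezoid

# illustrative values (not from the lecture): prior p ~ Beta(2, 3),
# then we observe k = 7 successes in n = 10 trials
alpha_prior, beta_prior = 2, 3
n, k = 10, 7

p = np.linspace(0.001, 0.999, 999)
unnormalized = binom.pmf(k, n, p) * beta.pdf(p, alpha_prior, beta_prior)
posterior = unnormalized / trapezoid(unnormalized, p)  # normalize numerically

# matches the conjugate update: Beta(alpha + k, beta + n - k) = Beta(9, 6)
assert np.allclose(posterior,
                   beta.pdf(p, alpha_prior + k, beta_prior + n - k),
                   atol=1e-2)
```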
The $\text{Beta}(\alpha, \beta)$ distribution has PDF:

$$ f(x) = c\,x^{\alpha-1}(1-x)^{\beta-1} $$

Let's try to find the normalizing constant $c$ for the case where $\alpha > 0$, $\beta > 0$, and $\alpha, \beta \in \mathbb{Z}$.
In order to do that, we need to evaluate

$$ \int_0^1 x^k (1-x)^{n-k} \, dx \quad \longrightarrow \quad \int_0^1 \binom{n}{k} x^k (1-x)^{n-k} \, dx $$

Story 1: We have $n+1$ white billiard balls, and we paint one of them pink. Now we throw them down on the number line from 0 to 1.

Story 2: We throw our $n+1$ white billiard balls down on the number line from 0 to 1, and then randomly select one of them to paint pink.
Note that the image above could have resulted from either of the stories, so both stories are actually equivalent.
At this point, we know exactly how to evaluate the above integral without using any calculus.

Let $X = $ # balls to the left of the pink ball, so $X$ ranges from 0 to $n$. If we condition on $p$ (the position of the pink billiard ball), we can consider this to be a binomial distribution problem, where "success" means landing to the left of the pink ball.
$$
\begin{aligned}
P(X = k) &= \int_0^1 P(X = k \mid p)\, f(p)\, dp \quad &\text{conditioning on } p \\
&= \int_0^1 \binom{n}{k} p^k (1-p)^{n-k} \, dp \quad &\text{since } p \sim \text{Unif}(0,1) \text{, so } f(p) = 1 \\
&= \frac{1}{n+1} \quad &\text{since from Story 2, } X \text{ is equally likely to be any value in } \{0, 1, \dots, n\}
\end{aligned}
$$

And so now we have the normalizing constant when $\alpha, \beta$ are positive integers.
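We can confirm the billiard-ball result numerically (a sketch using `scipy.integrate.quad`; the choice $n = 8$ is arbitrary): the integral equals $\frac{1}{n+1}$ for every $k$, not just some particular value.

```python
from math import comb
from scipy.integrate import quad

# check integral_0^1 C(n,k) p^k (1-p)^(n-k) dp = 1/(n+1) for all k in {0, ..., n}
n = 8
for k in range(n + 1):
    val, _ = quad(lambda p, k=k: comb(n, k) * p**k * (1 - p)**(n - k), 0, 1)
    assert abs(val - 1 / (n + 1)) < 1e-9
```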
View Lecture 23: Beta distribution | Statistics 110 on YouTube.