import matplotlib.pyplot as plt
import mpl_toolkits.mplot3d.art3d as art3d
import numpy as np
In this chapter, we will deal with joint distributions, one of the most important topics of the whole course: joint distributions are used to formulate all kinds of probability models.
The joint probability mass function of two discrete random variables is defined as
$$P_{XY}(x,y)=P(X=x,\ Y=y)$$

It is convenient to define ranges for $X$ and $Y$, $R_X=\{x_1,x_2,\dots\}$ and $R_Y=\{y_1,y_2,\dots\}$; a subset of their Cartesian product,

$$R_{XY}\subset R_X\times R_Y=\{(x_i,y_j)\mid x_i\in R_X,\ y_j\in R_Y\}$$

is the range of the joint distribution.
The most fundamental property of any probability distribution is that the probabilities sum to one:

$$\sum_{(x_i,y_j)\in R_{XY}}P_{XY}(x_i,y_j)=1$$

Let's consider a probability mass function table.

|     | Y=0 | Y=1 | Y=2 |
|-----|-----|-----|-----|
| X=0 | 1/6 | 1/4 | 1/8 |
| X=1 | 1/8 | 1/6 | 1/6 |
from fractions import Fraction as frac
# marginal PMF of Y: sum each column of the joint PMF table
pY_0 = frac(1,6) + frac(1,8)
pY_1 = frac(1,4) + frac(1,6)
pY_2 = frac(1,8) + frac(1,6)
# marginal PMF of X: sum each row
pX_0 = frac(1,6) + frac(1,4) + frac(1,8)
pX_1 = frac(1,8) + frac(1,6) + frac(1,6)
print('Marginal PMF of pY are {0}, {1}, {2}.'.format(pY_0,pY_1,pY_2))
print('Marginal PMF of pX are {0}, {1}.'.format(pX_0,pX_1))
Marginal PMF of pY are 7/24, 5/12, 7/24. Marginal PMF of pX are 13/24, 11/24.
The reason we call them marginal is that they are written in the margins of the table.
|          | Y=0  | Y=1  | Y=2  | $P_X(x)$ |
|----------|------|------|------|----------|
| X=0      | 1/6  | 1/4  | 1/8  | 13/24    |
| X=1      | 1/8  | 1/6  | 1/6  | 11/24    |
| $P_Y(y)$ | 7/24 | 5/12 | 7/24 |          |

If $X$ and $Y$ were independent, every conditional probability would equal the corresponding marginal probability. For instance,

$$P(X=0\mid Y=1)=\frac{1/4}{1/4+1/6}=\frac{3}{5}\qquad P_X(X=0)=\frac{13}{24}$$

They are not equal, which means $X$ and $Y$ are not independent.
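We can double-check this with exact Fraction arithmetic; all values below are read directly from the table above.

from fractions import Fraction as frac

# conditional probability P(X=0 | Y=1) = P(X=0, Y=1) / P(Y=1)
p_cond = frac(1,4) / (frac(1,4) + frac(1,6))
p_marginal = frac(1,6) + frac(1,4) + frac(1,8)   # P_X(X=0)

print(p_cond, p_marginal)     # 3/5 vs 13/24
print(p_cond == p_marginal)   # False -> X and Y are not independent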
The relationship between the marginal PMF and the conditional PMF is
$$P(X\mid Y)=\frac{P(X,Y)}{P_Y(Y)}$$

i.e.

$$\text{Conditional PMF}=\frac{\text{Joint PMF}}{\text{Marginal PMF}}$$

The joint CDF of two random variables $X$ and $Y$ is defined as

$$F_{XY}(x,y)=P(X\le x,\ Y\le y)$$

where $0\le F_{XY}(x,y)\le 1$.
For instance, the value of the joint CDF $F_{XY}(1,2)=P(X\le 1,\ Y\le 2)$ is the probability of the shaded region in the plot below.
x = np.linspace(-6, 1)          # x-coordinates along the region's upper edge
y = 2*np.ones(len(x))           # the upper edge sits at y = 2
fig, ax = plt.subplots(figsize = (8, 8))
ax.plot([1, -5], [2, 2], color = 'b')   # horizontal boundary y = 2
ax.scatter(1, 2, s = 80, zorder = 3, color = 'red')
ax.plot([1, 1], [2, -5], color = 'b')   # vertical boundary x = 1
ax.axis([-5, 6, -5, 6])
ax.scatter(np.random.uniform(low = -5, high = 6, size = 50),
           np.random.uniform(low = -5, high = 6, size = 50))
ax.fill_between(x, y, -5, color = 'red', alpha = .2)  # shade the region X <= 1, Y <= 2
ax.text(1, 2.1, '$(1, 2)$', size = 15)
ax.grid()
The marginal CDFs $F_X(x)$ and $F_Y(y)$ are given by

$$F_X(x)=P(X\le x,\ Y\le\infty)\qquad F_Y(y)=P(X\le\infty,\ Y\le y)$$

If $A$ is a random event, the conditional PMF of $X$ given $A$ is defined as

$$P_{X\mid A}(X=x_i)=\frac{P(X=x_i,\ A)}{P(A)}$$

Consider the PMF below.
|      | X=−2 | X=−1 | X=0  | X=1  | X=2  |
|------|------|------|------|------|------|
| Y=2  | 0    | 0    | 1/13 | 0    | 0    |
| Y=1  | 0    | 1/13 | 1/13 | 1/13 | 0    |
| Y=0  | 1/13 | 1/13 | 1/13 | 1/13 | 1/13 |
| Y=−1 | 0    | 1/13 | 1/13 | 1/13 | 0    |
| Y=−2 | 0    | 0    | 1/13 | 0    | 0    |

Mathematically, its support is defined as $G=\{(x,y)\mid x,y\in\mathbb{Z},\ |x|+|y|\le 2\}$.
# marginal PMF of Y: sum across each row of the table
pY_2 = frac(1,13)
pY_1 = frac(1,13)*3
pY_0 = frac(1,13)*5
pY_m1 = frac(1,13)*3
pY_m2 = frac(1,13)
# marginal PMF of X: sum down each column
pX_2 = frac(1,13)
pX_1 = frac(1,13)*3
pX_0 = frac(1,13)*5
pX_m1 = frac(1,13)*3
pX_m2 = frac(1,13)
print('Marginal PMF of pY are {0}, {1}, {2}, {3}, {4}.'.format(pY_2,pY_1,pY_0,pY_m1,pY_m2))
print('Marginal PMF of pX are {0}, {1}, {2}, {3}, {4}.'.format(pX_2,pX_1,pX_0,pX_m1,pX_m2))
Marginal PMF of pY are 1/13, 3/13, 5/13, 3/13, 1/13. Marginal PMF of pX are 1/13, 3/13, 5/13, 3/13, 1/13.
We add the marginals to the table:

|          | X=−2 | X=−1 | X=0  | X=1  | X=2  | $P_Y(y)$ |
|----------|------|------|------|------|------|----------|
| Y=2      | 0    | 0    | 1/13 | 0    | 0    | 1/13     |
| Y=1      | 0    | 1/13 | 1/13 | 1/13 | 0    | 3/13     |
| Y=0      | 1/13 | 1/13 | 1/13 | 1/13 | 1/13 | 5/13     |
| Y=−1     | 0    | 1/13 | 1/13 | 1/13 | 0    | 3/13     |
| Y=−2     | 0    | 0    | 1/13 | 0    | 0    | 1/13     |
| $P_X(x)$ | 1/13 | 3/13 | 5/13 | 3/13 | 1/13 |          |

It shows that, given $Y=1$, $X$ is uniformly distributed over $\{-1,0,1\}$.
Are $X$ and $Y$ independent? No; for instance, $P(X=0\mid Y=1)=\frac{1/13}{3/13}=\frac{1}{3}$, while $P_X(X=0)=\frac{5}{13}$.
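As a cross-check, here is a small sketch that stores the same table as a NumPy array, recomputes the marginals as axis sums, and tests independence by comparing the joint PMF against the outer product of the marginals.

import numpy as np

# rows: Y = 2, 1, 0, -1, -2; columns: X = -2, -1, 0, 1, 2
P = np.array([[0, 0, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [1, 1, 1, 1, 1],
              [0, 1, 1, 1, 0],
              [0, 0, 1, 0, 0]]) / 13

pY = P.sum(axis=1)   # marginal PMF of Y (sum over X)
pX = P.sum(axis=0)   # marginal PMF of X (sum over Y)
print(pY, pX)

# X and Y are independent iff P_XY(x, y) = P_X(x) * P_Y(y) for every cell
print(np.allclose(P, np.outer(pY, pX)))   # False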
If the random event $A$ is replaced by a discrete random variable $Y$, the conditional PMFs are defined as

$$P_{X\mid Y}(x_i\mid y_j)=\frac{P_{XY}(x_i,y_j)}{P_Y(y_j)}\qquad P_{Y\mid X}(y_j\mid x_i)=\frac{P_{XY}(x_i,y_j)}{P_X(x_i)}$$

where $x_i$ and $y_j$ are realizations of $X$ and $Y$.
The expectation can be conditional on a random event or on a realization of a random variable:

$$E[X\mid A]=\sum_{x_i\in R_X}x_iP_{X\mid A}(x_i\mid A)\qquad E[X\mid Y=y_j]=\sum_{x_i\in R_X}x_iP_{X\mid Y}(x_i\mid Y=y_j)$$

Using the PMF example from the last section, let's answer the question: what is $E[X\mid -1<Y<2]$?
To calculate the conditional expectation, we must use the conditional probabilities as weights.

First, calculate the conditional PMF. Since $Y$ is integer-valued, the event $-1<Y<2$ means $Y\in\{0,1\}$, so $P(-1<Y<2)=\frac{5}{13}+\frac{3}{13}=\frac{8}{13}$, and the joint probabilities $P(X=x_i,\ -1<Y<2)$ are the $Y=0$ and $Y=1$ rows of the table summed column by column. Then

$$E[X\mid -1<Y<2]=-2\cdot\frac{1/13}{8/13}-1\cdot\frac{2/13}{8/13}+0\cdot\frac{2/13}{8/13}+1\cdot\frac{2/13}{8/13}+2\cdot\frac{1/13}{8/13}=0$$
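The same computation in exact arithmetic, reusing the values above:

from fractions import Fraction as frac

xs = [-2, -1, 0, 1, 2]
# joint probabilities P(X = x, -1 < Y < 2): rows Y = 0 and Y = 1 summed
joint = [frac(1,13), frac(2,13), frac(2,13), frac(2,13), frac(1,13)]
pA = sum(joint)                                   # P(-1 < Y < 2) = 8/13
E = sum(x * p / pA for x, p in zip(xs, joint))    # conditional expectation
print(pA, E)                                      # 8/13, 0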
If you look again at the conditional expectation expression

$$E[X\mid Y=y_j]=\sum_{x_i\in R_X}x_iP_{X\mid Y}(x_i\mid Y=y_j)$$

you will find that it is actually a function of $Y$; call this random variable $Z=E[X\mid Y]$.
Consider the joint PMF below.

|          | X=0 | X=1 | $P_Y(y)$ |
|----------|-----|-----|----------|
| Y=0      | 1/5 | 2/5 | 3/5      |
| Y=1      | 2/5 | 0   | 2/5      |
| $P_X(x)$ | 3/5 | 2/5 |          |

Remember that $Z$ is a function of $Y$. To calculate the conditional expectation, we again use conditional probabilities as weights:
$$E[X\mid Y=0]=0\cdot\frac{1/5}{1/5+2/5}+1\cdot\frac{2/5}{1/5+2/5}=\frac{2}{3}\qquad E[X\mid Y=1]=0$$

Because $E[X\mid Y]$ is itself a random variable, it must have an expectation as well:

$$E[Z]=E[E[X\mid Y]]=P_Y(Y=0)E[X\mid Y=0]+P_Y(Y=1)E[X\mid Y=1]=\frac{3}{5}\cdot\frac{2}{3}+\frac{2}{5}\cdot 0=\frac{2}{5}$$

In fact, $E[Z]=E[E[X\mid Y]]=E[X]$ always holds; this is the law of iterated expectation. Here it checks out: $E[X]=0\cdot\frac{3}{5}+1\cdot\frac{2}{5}=\frac{2}{5}$.
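A quick numerical check of the law of iterated expectation on this table:

from fractions import Fraction as frac

# joint PMF from the table: rows Y = 0, 1; columns X = 0, 1
P = [[frac(1,5), frac(2,5)],
     [frac(2,5), frac(0)]]

pY = [sum(row) for row in P]                       # marginals of Y: 3/5, 2/5
E_X_given_Y = [sum(x * p for x, p in zip([0, 1], row)) / py
               for row, py in zip(P, pY)]          # E[X|Y=0] = 2/3, E[X|Y=1] = 0
E_Z = sum(py * e for py, e in zip(pY, E_X_given_Y))
E_X = sum(x * (P[0][j] + P[1][j]) for j, x in enumerate([0, 1]))
print(E_Z, E_X)                                    # both 2/5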
If $X$ and $Y$ are independent, the rules of conditional expectation are fairly straightforward, because conditioning on $Y$ does not provide any extra information: $E[X\mid Y]=E[X]$, and more generally $E[g(X)\mid Y]=E[g(X)]$.
The joint PDF of $X$ and $Y$ is a non-negative function $f_{XY}(x,y)$, mapping $\mathbb{R}^2$ to $\mathbb{R}$, defined by

$$P((X,Y)\in A)=\iint_A f_{XY}(x,y)\,dx\,dy$$

which equals $1$ when $A$ is the entire plane.
However, we are particularly interested in the case where $A$ is a rectangle,

$$P(a\le X\le b,\ c\le Y\le d)=\int_c^d\int_a^b f_{XY}(x,y)\,dx\,dy$$

and, for a small rectangle with side length $\delta$,

$$P(a\le X\le a+\delta,\ c\le Y\le c+\delta)\approx f_{XY}(a,c)\,\delta^2$$

Let's consider an example other than the normal distribution.
$$f_{XY}(x,y)=\begin{cases}x+cy^2 & 0\le x\le 1,\ 0\le y\le 1\\ 0 & \text{otherwise}\end{cases}$$

Use the property $\iint f_{XY}(x,y)\,dx\,dy=1$ to find $c$:

$$\begin{aligned}
\int_0^1\int_0^1(x+cy^2)\,dx\,dy&=1\\
\int_0^1\left[\frac{x^2}{2}+cxy^2\right]_0^1 dy&=1\\
\int_0^1\left[\frac{1}{2}+cy^2\right]dy&=1\\
\left[\frac{y}{2}+\frac{cy^3}{3}\right]_0^1&=1\\
\frac{1}{2}+\frac{c}{3}&=1\\
c&=\frac{3}{2}
\end{aligned}$$

Plug in $c$ and perform a double integration, for instance to find $P(0\le X\le\frac{1}{2},\ 0\le Y\le\frac{1}{2})$:

$$\int_0^{1/2}\int_0^{1/2}\left(x+\frac{3}{2}y^2\right)dx\,dy=\int_0^{1/2}\left[\frac{x^2}{2}+\frac{3}{2}y^2x\right]_0^{1/2}dy=\int_0^{1/2}\left[\frac{1}{8}+\frac{3}{4}y^2\right]dy=\left[\frac{y}{8}+\frac{y^3}{4}\right]_0^{1/2}=\frac{3}{32}$$

The joint distribution is depicted below; the volume between the curved surface and the $xy$-plane is $1$.
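Both results can be verified numerically. The sketch below assumes SciPy is available and uses scipy.integrate.dblquad, whose integrand takes its arguments in (y, x) order:

from scipy import integrate

f = lambda y, x: x + 1.5*y**2   # joint PDF with c = 3/2

# total probability over the unit square should be 1
total, _ = integrate.dblquad(f, 0, 1, lambda x: 0, lambda x: 1)
# P(0 <= X <= 1/2, 0 <= Y <= 1/2) should be 3/32
p, _ = integrate.dblquad(f, 0, 0.5, lambda x: 0, lambda x: 0.5)
print(total, p, 3/32)           # 1.0, 0.09375, 0.09375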
x, y = np.linspace(0, 1), np.linspace(0, 1)
X, Y = np.meshgrid(x, y)
Z = X + 3/2*Y**2   # joint PDF f_XY(x, y) = x + (3/2)y^2 on the unit square
fig = plt.figure(figsize = (8, 8))
ax = fig.add_subplot(projection = '3d')   # fig.gca(projection='3d') is deprecated
ax.plot_surface(X, Y, Z, cmap = 'coolwarm')
ax.contourf(X, Y, Z, zdir = 'z', offset = 0, cmap = 'coolwarm')  # contours projected onto the xy-plane
plt.show()
The marginal PDFs of $X$ and $Y$ are

$$f_X(x)=\int_{-\infty}^{\infty}f_{XY}(x,y)\,dy\ \text{ for all }x\qquad f_Y(y)=\int_{-\infty}^{\infty}f_{XY}(x,y)\,dx\ \text{ for all }y$$

Let's use the same example as in the last section to find $f_X(x)$ and $f_Y(y)$:

$$f_X(x)=\int_0^1\left(x+\frac{3}{2}y^2\right)dy=x+\frac{1}{2}\qquad f_Y(y)=\int_0^1\left(x+\frac{3}{2}y^2\right)dx=\frac{3}{2}y^2+\frac{1}{2}$$
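A symbolic check of these marginals, assuming SymPy is available:

import sympy as sp

x, y = sp.symbols('x y', nonnegative=True)
f = x + sp.Rational(3, 2)*y**2   # the joint PDF from above

fX = sp.integrate(f, (y, 0, 1))  # x + 1/2
fY = sp.integrate(f, (x, 0, 1))  # 3*y**2/2 + 1/2
print(fX, fY)
print(sp.integrate(fX, (x, 0, 1)), sp.integrate(fY, (y, 0, 1)))  # both 1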
The joint CDF and joint PDF are related as follows:

$$F_{XY}(x,y)=\int_{-\infty}^{y}\int_{-\infty}^{x}f_{XY}(u,v)\,du\,dv\qquad f_{XY}(x,y)=\frac{\partial^2}{\partial x\,\partial y}F_{XY}(x,y)$$

For the same PDF as above, let's find the CDF.
$$f_{XY}(x,y)=\begin{cases}x+\frac{3}{2}y^2 & 0\le x\le 1,\ 0\le y\le 1\\ 0 & \text{otherwise}\end{cases}$$

For $0\le x\le 1$ and $0\le y\le 1$,

$$F_{XY}(x,y)=\int_0^y\int_0^x\left(u+\frac{3}{2}v^2\right)du\,dv=\int_0^y\left(\frac{x^2}{2}+\frac{3}{2}v^2x\right)dv=\frac{x^2y}{2}+\frac{xy^3}{2}$$
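We can confirm that the mixed partial derivative of this CDF recovers the PDF (again assuming SymPy):

import sympy as sp

x, y = sp.symbols('x y')
F = x**2*y/2 + x*y**3/2      # CDF on the unit square, derived above

f = sp.diff(F, x, y)         # mixed partial derivative d^2 F / dx dy
print(sp.simplify(f))        # x + 3*y**2/2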
Now consider the conditional PDF of $X$ given that $X\in A$. For $x\in A$,

$$P(x\le X\le x+\delta\mid X\in A)\approx f_{X\mid X\in A}(x)\cdot\delta=\frac{P(x\le X\le x+\delta,\ X\in A)}{P(A)}=\frac{P(x\le X\le x+\delta)}{P(A)}\approx\frac{f_X(x)\,\delta}{P(A)}$$

We have shown that

$$f_{X\mid X\in A}(x)=\frac{f_X(x)}{P(A)},\quad x\in A$$

You can imagine $P(A)$ as a scaling factor that normalizes the conditional PDF so that it integrates to $1$.
For two jointly continuous random variables $X$ and $Y$, we can define the following conditional concepts:

$$f_{X\mid Y}(x\mid y)=\frac{f_{XY}(x,y)}{f_Y(y)}\qquad F_{X\mid Y}(x\mid y)=\int_{-\infty}^{x}f_{X\mid Y}(u\mid y)\,du\qquad E[X\mid Y=y]=\int_{-\infty}^{\infty}x\,f_{X\mid Y}(x\mid y)\,dx$$
The intuition behind the first expression, the conditional PDF, is

$$P(x\le X\le x+\delta\mid y\le Y\le y+\epsilon)\approx\frac{f_{XY}(x,y)\,\delta\epsilon}{f_Y(y)\,\epsilon}=f_{X\mid Y}(x\mid y)\,\delta$$

The conditional PDF must satisfy the basic rules of probability as well:
$$\int_{-\infty}^{\infty}f_{X\mid Y}(x\mid y)\,dx=1$$

because

$$\frac{\int_{-\infty}^{\infty}f_{XY}(x,y)\,dx}{f_Y(y)}=\frac{f_Y(y)}{f_Y(y)}=1$$
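For the running example, this is easy to verify symbolically: the conditional PDF is the joint PDF divided by $f_Y(y)$, and it integrates to $1$ over $x$ (SymPy assumed available):

import sympy as sp

x, y = sp.symbols('x y')
f_joint = x + sp.Rational(3, 2)*y**2
f_Y = sp.Rational(1, 2) + sp.Rational(3, 2)*y**2      # marginal of Y from before

f_cond = f_joint / f_Y                                # f_{X|Y}(x|y)
print(sp.simplify(sp.integrate(f_cond, (x, 0, 1))))   # 1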
Rearranging the conditional PDF expression, we obtain the multiplication rule

$$f_{XY}(x,y)=f_{X\mid Y}(x\mid y)\,f_Y(y)$$

If continuous variables $X$ and $Y$ are independent, then knowing either of them provides no information about the other. That is,
$$f_{X\mid Y}(x\mid y)=f_X(x)\quad\text{or}\quad f_{Y\mid X}(y\mid x)=f_Y(y)$$

Thus the multiplication rule for independent variables is

$$f_{XY}(x,y)=f_X(x)\,f_Y(y)$$

Other rules derived from this are

$$E[XY]=E[X]E[Y]\qquad \operatorname{Var}(X+Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)\qquad E[g(X)h(Y)]=E[g(X)]E[h(Y)]$$
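A Monte Carlo sketch of these rules, using two independent standard uniform variables as an assumed example (the functions g and h below are arbitrary illustrative choices); each printed pair agrees up to sampling noise:

import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=1_000_000)
Y = rng.uniform(size=1_000_000)   # drawn independently of X

print(np.mean(X*Y), np.mean(X)*np.mean(Y))                 # both ~ 0.25
print(np.var(X + Y), np.var(X) + np.var(Y))                # both ~ 1/6
print(np.mean(X**2 * Y**3), np.mean(X**2)*np.mean(Y**3))   # g(X)=X^2, h(Y)=Y^3; both ~ 1/12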