import numpy as np
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
If $X$ is a discrete random variable with a countable range $R_X = \{x_1,x_2,x_3,\ldots\}$, the probability mass function (PMF) of $X$ is
$$ P_X(x_k)=P(X=x_k),\ \text{for } k=1,2,3,\ldots $$which is the function that maps each possible value to its corresponding probability.
For instance, here is the discrete uniform distribution of a fair die:
pmf = 1/6*np.ones(6)
diceN = np.arange(1,7)
plt.stem(diceN, pmf)
plt.axis([0, 7, 0, .3])
plt.title('PMF of A Fair Die')
plt.show()
The cumulative distribution function (CDF) of random variable $X$ is defined as $$ F_{X}(x)=P(X \leq x), \text { for all } x \in \mathbb{R} $$
The CDF of the die example is
pmf = 1/6*np.ones(6)
cdf = np.cumsum(pmf)
plt.stem(diceN, cdf)
plt.axis([0, 7, 0, 1.1])
plt.title('CDF of A Fair Die')
plt.show()
One particularly useful formula for calculating the probability of an interval is
$$ P(a<X\leq b) = F_X(b)-F_X(a) $$For instance, what is the probability $P(2<X\leq 5)$ for the fair die?
Using the formula,
$$ P(2<X\leq5) = F_X(5) - F_X(2) = 5/6 - 2/6 = 1/2 $$The probability density function (PDF) is the analogous concept for continuous distributions. For instance, the temperature of a room follows a continuous distribution.
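We can verify this numerically; a minimal sketch reusing the fair-die `pmf` and `cdf` arrays defined above:

```python
import numpy as np

# CDF of a fair die: F_X(k) = k/6 for k = 1, ..., 6
pmf = 1/6 * np.ones(6)
cdf = np.cumsum(pmf)

# P(2 < X <= 5) = F_X(5) - F_X(2); cdf is 0-indexed, so F_X(k) = cdf[k - 1]
prob = cdf[4] - cdf[1]
print(prob)  # ≈ 0.5
```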
We denote the PDF by $f_X(x)$; however, any single realization of $X$ has probability $0$. Using the room-temperature example, the probability that the room is exactly $20 ^\circ \text{C}$ is theoretically $0$: $P(X = 20)= 0$. Therefore, to obtain a positive probability, we must consider an interval $(x,\ x+\Delta]$ and take a limit:
$$ f_{X}(x)=\lim _{\Delta \rightarrow 0^{+}} \frac{P(x<X \leq x+\Delta)}{\Delta} $$Furthermore, using the CDF formula and the definition of the derivative, the relation between the CDF and PDF is
$$ f_{X}(x)=\lim _{\Delta \rightarrow 0} \frac{F_{X}(x+\Delta)-F_{X}(x)}{\Delta}=\frac{d F_{X}(x)}{d x}=F_{X}^{\prime}(x) $$We will see examples in the next chapter.
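This relation can be checked numerically. A quick sketch using the exponential distribution with rate $1$ (chosen here only as an example, since its CDF $F_X(x)=1-e^{-x}$ and PDF $f_X(x)=e^{-x}$ have closed forms):

```python
import numpy as np

# Exponential(1): F_X(x) = 1 - exp(-x), f_X(x) = exp(-x)
F = lambda x: 1 - np.exp(-x)
f = lambda x: np.exp(-x)

x, delta = 1.0, 1e-6
# The difference quotient of the CDF approximates the PDF
approx = (F(x + delta) - F(x)) / delta
print(approx, f(x))  # both ≈ 0.3679
```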
Independent random variables are similar to independent events; recall that independent events have the property
$$ P(A\cap B) = P(A)P(B) $$Now consider two random variables $X$ and $Y$; they are independent as long as
$$ p(X=x,Y=y)=p(X=x)p(Y=y) $$In general, independent variables have the property:
$$ p(X_1=x_1,X_2=x_2, \ldots, X_n = x_n)=p(X_1=x_1)p(X_2=x_2)\cdots p(X_n=x_n)=\prod_{i=1}^np(X_i=x_i) $$The expected values of discrete and continuous random variables are
$$ \text{Discrete:}\qquad E(X)=\sum_{i=1}^k x_ip_X(x_i)=\sum_{i=1}^k x_i P(X = x_i) $$$$\text{Continuous:}\qquad E(X) = \int_{-\infty}^{\infty}xf_X(x)dx$$Both express the same idea: weight each possible value by its probability, then sum (or integrate).
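As a sketch, both forms can be evaluated numerically: the discrete sum for the fair die, and a simple Riemann-sum approximation of the integral for a uniform distribution on $[0, 1]$ (whose PDF is $f_X(x) = 1$, chosen here just for illustration):

```python
import numpy as np

# Discrete: E(X) = sum of x_i * p_X(x_i) for a fair die
x = np.arange(1, 7)
pmf = 1/6 * np.ones(6)
EX = np.sum(x * pmf)
print(EX)  # ≈ 3.5

# Continuous: E(X) = integral of x * f_X(x) dx; uniform on [0, 1] gives 1/2
xs = np.linspace(0, 1, 100001)
dx = xs[1] - xs[0]
EX_cont = np.sum(xs * 1.0) * dx   # Riemann sum with f_X(x) = 1
print(EX_cont)  # ≈ 0.5
```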
The variances of discrete and continuous random variables are defined similarly, with $\mu = E(X)$:
$$ \text{Discrete:}\qquad \operatorname{Var}(X)=\sum_{i=1}^k (x_i-\mu)^2 p_X(x_i) $$$$\text{Continuous:}\qquad \operatorname{Var}(X) = \int_{-\infty}^{\infty}(x-\mu)^2 f_X(x)dx $$And a common shortcut for manual calculation of variance is
$$ \operatorname{Var}(X) = E(X^2) - [E(X)]^2 $$
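For instance, the variance of a fair die can be computed with the shortcut $\operatorname{Var}(X) = E(X^2) - [E(X)]^2$; a minimal sketch:

```python
import numpy as np

# Variance of a fair die via Var(X) = E(X^2) - E(X)^2
x = np.arange(1, 7)
pmf = 1/6 * np.ones(6)
EX = np.sum(x * pmf)         # E(X)   = 3.5
EX2 = np.sum(x**2 * pmf)     # E(X^2) = 91/6
var = EX2 - EX**2
print(var)  # 35/12 ≈ 2.9167
```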