import numpy as np
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
If $X$ is a discrete random variable with a countable range $R_X = \{x_1,x_2,x_3,\ldots\}$, the probability mass function (PMF) of $X$ is
$$ P_X(x_k)=P(X=x_k),\ \text{for } k=1,2,3,\ldots $$which is the function that maps each possible value to its corresponding probability.
For instance, here is the discrete uniform distribution of a fair die:
pmf = 1/6*np.ones(6)
diceN = np.arange(1,7)
plt.stem(diceN, pmf)
plt.axis([0, 7, 0, .3])
plt.title('PMF of A Fair Die')
plt.show()
The cumulative distribution function (CDF) of random variable $X$ is defined as $$ F_{X}(x)=P(X \leq x), \text { for all } x \in \mathbb{R} $$
The CDF of the die example is
pmf = 1/6*np.ones(6)
cdf = np.cumsum(pmf)
plt.stem(diceN, cdf)
plt.axis([0, 7, 0, 1.1])
plt.title('CDF of A Fair Die')
plt.show()
One particularly useful formula for calculating the probability of an interval is
$$ P(a<X\leq b) = F_X(b)-F_X(a) $$For instance, what is the probability $P(2<X\leq 5)$ for the fair die?
Using the formula,
$$ P(2<X\leq5) = F_X(5) - F_X(2) = 5/6 - 2/6 = 1/2 $$The probability density function (PDF) is the analogous concept for continuous distributions. For instance, the temperature of a room follows a continuous distribution.
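We can verify this numerically; a minimal sketch reusing the fair-die `pmf` and `cdf` arrays defined above:

```python
import numpy as np

# CDF of a fair die: F_X(k) = k/6 for k = 1, ..., 6
pmf = 1/6 * np.ones(6)
cdf = np.cumsum(pmf)

# P(2 < X <= 5) = F_X(5) - F_X(2); cdf is 0-indexed, so F_X(k) = cdf[k - 1]
prob = cdf[4] - cdf[1]
print(prob)  # ≈ 0.5
```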
We denote the PDF by $f_X(x)$; however, any single realization of $X$ has probability $0$. Using the room-temperature example, the probability that the room is exactly $20 ^\circ \text{C}$ is theoretically $0$: $P(X = 20)= 0$. Therefore, to obtain a positive probability, we must consider an interval $(x,\ x+\Delta]$ and take a limit:
$$ f_{X}(x)=\lim _{\Delta \rightarrow 0^{+}} \frac{P(x<X \leq x+\Delta)}{\Delta} $$Furthermore, using the CDF formula and the definition of the derivative, the relation between the CDF and PDF is
$$ f_{X}(x)=\lim _{\Delta \rightarrow 0} \frac{F_{X}(x+\Delta)-F_{X}(x)}{\Delta}=\frac{d F_{X}(x)}{d x}=F_{X}^{\prime}(x) $$We will see examples in the next chapter.
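This relation can be checked numerically. A quick sketch using the exponential distribution with rate $1$ (chosen here only as an example, since its CDF $F_X(x)=1-e^{-x}$ and PDF $f_X(x)=e^{-x}$ have closed forms):

```python
import numpy as np

# Exponential(1): F_X(x) = 1 - exp(-x), f_X(x) = exp(-x)
F = lambda x: 1 - np.exp(-x)
f = lambda x: np.exp(-x)

x, delta = 1.0, 1e-6
# The difference quotient of the CDF approximates the PDF
approx = (F(x + delta) - F(x)) / delta
print(approx, f(x))  # both ≈ 0.3679
```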
Independent random variables are similar to independent events; recall that independent events have the property
$$ P(A\cap B) = P(A)P(B) $$Now consider two random variables $X$ and $Y$; they are independent as long as
$$ p(X=x,Y=y)=p(X=x)p(Y=y) $$In general, independent variables have the property:
$$ p(X_1=x_1,X_2=x_2, \ldots, X_n = x_n)=p(X_1=x_1)p(X_2=x_2)\cdots p(X_n=x_n)=\prod_{i=1}^np(X_i=x_i) $$The expected values of discrete and continuous random variables are
$$ \text{Discrete:}\qquad E(X)=\sum_{i=1}^k x_ip_X(x_i)=\sum_{i=1}^k x_i P(X = x_i) $$$$\text{Continuous:}\qquad E(X) = \int_{-\infty}^{\infty}xf_X(x)dx$$Both express the same idea: weight each possible value by its probability, then sum (or integrate).
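As a sketch, both forms can be evaluated numerically: the discrete sum for the fair die, and a simple Riemann-sum approximation of the integral for a uniform distribution on $[0, 1]$ (whose PDF is $f_X(x) = 1$, chosen here just for illustration):

```python
import numpy as np

# Discrete: E(X) = sum of x_i * p_X(x_i) for a fair die
x = np.arange(1, 7)
pmf = 1/6 * np.ones(6)
EX = np.sum(x * pmf)
print(EX)  # ≈ 3.5

# Continuous: E(X) = integral of x * f_X(x) dx; uniform on [0, 1] gives 1/2
xs = np.linspace(0, 1, 100001)
dx = xs[1] - xs[0]
EX_cont = np.sum(xs * 1.0) * dx   # Riemann sum with f_X(x) = 1
print(EX_cont)  # ≈ 0.5
```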
The variances of discrete and continuous random variables are defined similarly, with $\mu = E(X)$:
$$ \text{Discrete:}\qquad \operatorname{Var}(X)=\sum_{i=1}^k (x_i-\mu)^2 p_X(x_i) $$$$\text{Continuous:}\qquad \operatorname{Var}(X) = \int_{-\infty}^{\infty}(x-\mu)^2 f_X(x)dx $$And a common shortcut for manual calculation of variance is
$$ \operatorname{Var}(X) = E(X^2) - [E(X)]^2 $$
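For instance, the variance of a fair die can be computed with the shortcut $\operatorname{Var}(X) = E(X^2) - [E(X)]^2$; a minimal sketch:

```python
import numpy as np

# Variance of a fair die via Var(X) = E(X^2) - E(X)^2
x = np.arange(1, 7)
pmf = 1/6 * np.ones(6)
EX = np.sum(x * pmf)         # E(X)   = 3.5
EX2 = np.sum(x**2 * pmf)     # E(X^2) = 91/6
var = EX2 - EX**2
print(var)  # 35/12 ≈ 2.9167
```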