# ipython magic
%matplotlib notebook
%load_ext autoreload
%autoreload 2
# plot configuration
import matplotlib
import matplotlib.pyplot as plt
plt.style.use("ggplot")
# import seaborn as sns # sets another style
matplotlib.rcParams['lines.linewidth'] = 3
fig_width, fig_height = (7.0,5.0)
matplotlib.rcParams['figure.figsize'] = (fig_width, fig_height)
# font = {'family' : 'sans-serif',
# 'weight' : 'normal',
# 'size' : 18.0}
# matplotlib.rc('font', **font) # pass in the font dict as kwargs
# Imports and definitions
from sympy import *
from sympy.stats import Normal, density
init_printing(use_latex='mathjax')
from IPython.display import display;
x1, x2, y = symbols("x1,x2,y")
Inspired by Rachel Fewster's presentation
The mean, expected value, or expectation of a random variable $X$ is written $E(X)$ or sometimes $\mu_X$. When it is clear from the context which random variable it pertains to, the subscript of the expectation may be dropped, i.e. $E(X)=\mu$. The meaning of $E(X)$ is that if we observe $N$ random values of $X$, the mean of these $N$ values will be approximately equal to $E(X)$ for large values of $N$. Formally, the expectation is defined as:

$$ E(X) = \int_{-\infty}^{\infty} x \, p(x) \, dx \qquad (1) $$

where $p(x)$ is the probability density function (pdf) of the random variable $X$.
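To illustrate this numerically, here is a small sketch (using numpy rather than sympy, with an arbitrarily chosen mean, standard deviation and seed) showing that the sample mean of $N$ draws approaches $E(X)$:

import numpy as np
rng = np.random.default_rng(0)                    # seed chosen only for reproducibility
N = 100000
samples = rng.normal(loc=1.0, scale=2.0, size=N)  # draws from X ~ N(mu=1, sigma=2)
print(samples.mean())                             # approximately E(X) = 1 for large N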
With sympy.stats we may easily compute the analytical expectation of a normal distribution. Random variables may be declared by means of ready-made functions like Normal, Exponential, etc., and the corresponding probability density function may be declared with the density function. The expectation $E(X) = \mu$ may then be computed analytically following the definition in (1):
# Expectation of a normally distributed variable
mu1 = symbols("mu1", positive=True)
V1 = symbols("sigma1", positive=True)
X1 = Normal("X", mu1, V1)
D1 = density(X1)(x1)
E1=Integral(x1*D1,(x1,-oo,oo))
display(Eq(E1,E1.doit())) # use doit to evaluate an unevaluated integral
In our snippet above the random variable $X_1$ is normally distributed with mean mu1 and standard deviation V1 (rendered $\mu_1$ and $\sigma_1$ respectively, with MathJax).
We observe that when we carry out the integration E1 = Integral(x1*D1, (x1, -oo, oo)), the expectation $\mu_1$ is returned, as it should be.
The variance $V(X)$ is introduced to quantify "how much" the samples of the random variable $X$ vary around the mean $E(X)$:

$$ V(X) = E\left( (X - \mu)^2 \right) = \int_{-\infty}^{\infty} (x - \mu)^2 \, p(x) \, dx \qquad (2) $$
The variance of our random variable $X_1$ in the code snippet above may also be computed from the definition in (2):
# Compute the variance analytically
V = Integral((x1-E1.doit())**2*D1, (x1,-oo,oo))
display(Eq(V,V.doit()))
A common way of denoting the variance of $X$ is also $V(X)=\sigma_X^2$. For later use we observe that from (2) we may deduce the commonly used relations:

$$ V(X) = E(X^2) - E(X)^2 \qquad (3) $$

and thus:

$$ E(X^2) = V(X) + E(X)^2 $$
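As a quick symbolic check of this relation, here is a self-contained sketch for a generic normal variable (the symbol names mu and sigma are chosen here only for illustration):

# Verify V(X) = E(X^2) - E(X)^2 symbolically for a normal variable
from sympy import symbols, Integral, simplify, oo
from sympy.stats import Normal, density
x = symbols("x")
mu, sigma = symbols("mu sigma", positive=True)
p = density(Normal("X", mu, sigma))(x)
EX = Integral(x*p, (x, -oo, oo)).doit()             # E(X) = mu
EX2 = Integral(x**2*p, (x, -oo, oo)).doit()         # E(X^2) = mu**2 + sigma**2
VX = Integral((x - EX)**2*p, (x, -oo, oo)).doit()   # V(X) = sigma**2
print(simplify(VX - (EX2 - EX**2)))                 # 0, confirming the relation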
Joint probability density functions:
Suppose that $X_1, X_2, \ldots$ are random variables, possibly dependent on each other; then their joint probability density function (jpdf) expresses the probability that each variable falls within the range specified for that particular variable. In the case of two random variables, the joint probability distribution is normally called the bivariate distribution.
Below we will consider the jpdfs for the input parameters of a computational model, which we assume to be independent random variables. We will further assume that the jpdf may simply be represented as the product of the marginal pdfs:

$$ p(x_1, x_2, \ldots, x_k) = \prod_{i=1}^{k} p_i(x_i) $$
Below we compute the expectation of the product of two independent, normally distributed variables:
# Expectation of the product of two normally distributed parameters
mu2 = symbols("mu2", positive=True)
V2 = symbols("sigma2", positive=True)
X2 = Normal("X", mu2, V2)
D2=density(X2)(x2)
jpdf=D1*D2
E2=Integral(x1*x2*jpdf,(x2,-oo,oo),(x1,-oo,oo))
display(Eq(E2,E2.doit())) # use doit to evaluate an unevaluated integral
and we note that the expectation of the product of the two independent variables is simply the product of the two mean values, $\mu_1 \mu_2$.
Conditional pdfs:
Suppose that $X$ and $Y$ are two random variables, possibly dependent on each other, and that we fix $X$ at the value $x$. This situation gives rise to a conditional pdf of $Y$ for a given $X=x$:

$$ p(y \mid x) = \frac{p(x, y)}{p_X(x)} $$

where $p(x,y)$ is the joint pdf and $p_X(x)$ is the marginal pdf of $X$. Whenever the joint pdf may be represented as the product of the marginal pdfs of the random variables, the conditional pdf reduces to the marginal pdf of the variable which is not held constant. We then realize that for independent variables the conditional pdf of $Y$ for a given $X=x$ may be simplified to:

$$ p(y \mid x) = \frac{p_X(x) \, p_Y(y)}{p_X(x)} = p_Y(y) $$
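As a quick check with sympy (reusing D1, D2 and jpdf from the cells above), we may verify that for our two independent variables the conditional pdf indeed reduces to the marginal:

# Conditional pdf of X2 given X1 = x1: p(x2|x1) = jpdf/D1
cond = simplify(jpdf/D1)
display(cond)                    # renders the marginal density of X2
display(simplify(cond - D2))     # 0, confirming p(x2|x1) = p(x2) for independent variables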
Conditional expectation:
Once the conditional pdfs have been introduced, the conditional expectation $E(Y\; | \,X=x)$ has a natural formulation following from the definition in (1):

$$ E(Y \mid X = x) = \int_{-\infty}^{\infty} y \, p(y \mid x) \, dy $$
Conditional expectation as a random variable:
The ordinary, unconditional expectation $E(X)$ of a random variable $X$ is just a number. The conditional expectation $E(Y \;|\, X=x)$, however, is a number depending on $x$; if we let $x$ be an outcome of the random variable $X$, then $E(Y \;|\, X)$ is itself a random variable.
# Conditional expectation of X2 given X1 = x1
# the conditional pdf p(x2|x1) = jpdf/p1(x1) reduces to the marginal D2 for independent variables
cond_pdf = simplify(jpdf/D1)
E_given_x1 = Integral(x2*cond_pdf, (x2, -oo, oo))
display(E_given_x1.doit())  # for independent variables this is simply mu2, independent of x1
from sympy.plotting import plot
mu1_value=1
V1_value=2
mu2_value=2
V2_value=1
Ex=E_given_x1.subs([(V1,V1_value),(mu2,mu2_value),(mu1,mu1_value)]).doit()
_=plot(Ex,(x1,-10,10))
# Compute the conditional variance analytically, using the conditional pdf
V_given_x1 = Integral((x2 - E_given_x1.doit())**2*cond_pdf, (x2, -oo, oo))
display(simplify(V_given_x1.doit()))
Vx=V_given_x1.subs([(mu1,mu1_value),(V1,V1_value),(mu2,mu2_value),(V2,V2_value)]).doit()
display(Vx)
_=plot(Vx,(x1,-10,10))
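Since $X_1$ and $X_2$ are independent, both the conditional expectation and the conditional variance are constants ($\mu_2$ and $\sigma_2^2$ respectively), so the plots above are simply horizontal lines independent of $x_1$.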
For two integrable random variables $X$ and $Y$, Adam's and Eve's laws are formulated below.
Adam's Law of total expectation:
Adam's law of total expectation may be stated as (follow the hyperlink for more details and proof):

$$ E\left( E(Y \mid X) \right) = E(Y) $$

The law is also known as the law of iterated expectations, the tower rule, or the smoothing theorem. To make it clearer over which variables the expectation operators are applied, we may present Adam's law as:

$$ E_X\left( E_{Y \mid X}(Y \mid X) \right) = E(Y) $$
A short proof:

$$ E\left( E(Y \mid X) \right) = \int_{-\infty}^{\infty} E(Y \mid X = x) \, p_X(x) \, dx = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \, p(y \mid x) \, p_X(x) \, dy \, dx = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \, p(x, y) \, dy \, dx = E(Y) $$
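We may also verify Adam's law symbolically for the bivariate example above, taking $Y = X_1 X_2$ (reusing D1, jpdf, x1 and x2 from the earlier cells; the expected result is $\mu_1 \mu_2$, i.e. the value of E2 computed above):

# Verify Adam's law E(E(Y|X1)) = E(Y) for Y = X1*X2
cond_pdf = simplify(jpdf/D1)                               # p(x2|x1), equals D2 by independence
E_Y_given_x1 = Integral(x1*x2*cond_pdf, (x2, -oo, oo))     # inner expectation E(Y|X1=x1)
E_total = Integral(E_Y_given_x1.doit()*D1, (x1, -oo, oo))  # outer expectation over X1
display(Eq(E_total, E_total.doit()))                       # equals mu1*mu2 = E(Y)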
We consider a mathematical/computational model $f$ which produces an output $y$ as a function of $k$ uncorrelated parameters $x_1, x_2, \ldots, x_k$:

$$ y = f(x_1, x_2, \ldots, x_k) $$

Given that the parameters $x_1, x_2, \ldots, x_k$ of our computational model $f(x_1, x_2, \ldots, x_k)$ are $k$ independent random parameters with marginal pdfs $p_i(x_i)$, a joint probability density function may be constructed:

$$ p(x_1, x_2, \ldots, x_k) = \prod_{i=1}^{k} p_i(x_i) $$
As the parameters of our model are random parameters, the predicted values $y$ will be random too, with the following statistical moments:
Expected value $E(y)=\mu$:

$$ E(y) = \int \cdots \int f(x_1, \ldots, x_k) \, p(x_1, \ldots, x_k) \, dx_1 \cdots dx_k $$
Variance $V(y)=\sigma_y^2$:

$$ V(y) = \int \cdots \int \left( f(x_1, \ldots, x_k) - E(y) \right)^2 p(x_1, \ldots, x_k) \, dx_1 \cdots dx_k = E(y^2) - E(y)^2 $$
where the latter equivalence may be deduced following the same lines of deduction as for (3).
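As an illustration, we may compute these moments with sympy for the bivariate jpdf defined above and an arbitrarily chosen model $y = f(x_1, x_2) = x_1 x_2$:

# Moments of the model output y = f(x1, x2) = x1*x2 (an illustrative choice of f)
f_model = x1*x2
Ey = Integral(f_model*jpdf, (x2, -oo, oo), (x1, -oo, oo)).doit()      # E(y) = mu1*mu2
Ey2 = Integral(f_model**2*jpdf, (x2, -oo, oo), (x1, -oo, oo)).doit()  # E(y^2)
Vy = simplify(Ey2 - Ey**2)                                            # V(y) = E(y^2) - E(y)^2
display(Ey, Vy)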
The Law of Total Variance (also known as Eve's Law) can be presented as:

$$ V(Y) = E\left( V(Y \mid X) \right) + V\left( E(Y \mid X) \right) $$
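For the illustrative model $y = x_1 x_2$, Eve's law may be checked symbolically with sympy (reusing D1, jpdf, x1 and x2 from above; the sum of the two terms should equal the variance Vy computed in the previous snippet):

# Verify Eve's law V(Y) = E(V(Y|X1)) + V(E(Y|X1)) for Y = x1*x2
cond_pdf = simplify(jpdf/D1)                                           # p(x2|x1) = D2
E_Y_x1 = Integral(x1*x2*cond_pdf, (x2, -oo, oo)).doit()                # E(Y|X1=x1) = x1*mu2
V_Y_x1 = Integral((x1*x2 - E_Y_x1)**2*cond_pdf, (x2, -oo, oo)).doit()  # V(Y|X1=x1) = x1**2*sigma2**2
E_of_V = Integral(V_Y_x1*D1, (x1, -oo, oo)).doit()                     # E(V(Y|X1))
mean_E = Integral(E_Y_x1*D1, (x1, -oo, oo)).doit()                     # E(E(Y|X1)) = mu1*mu2
V_of_E = Integral((E_Y_x1 - mean_E)**2*D1, (x1, -oo, oo)).doit()       # V(E(Y|X1))
display(simplify(E_of_V + V_of_E))                                     # equals V(y) = Vy from above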