#!/usr/bin/env python
# coding: utf-8

# # Probability Inequalities and Identities
#
# This follows Lugosi (2006) rather closely; it is also a repeat of the results in the chapter on
# [Mathematical Preliminaries](mathPrelim.ipynb), which we largely omitted.
#
# #### The tail-integral formula for expectation
# If $X$ is a nonnegative real-valued random variable,
# $$
# {\mathbb E} X = \int_0^\infty {\mathbb P} \{X \ge x\} dx.
# $$
#
# #### Jensen's Inequality
# If $\phi$ is a convex function from ${\mathcal X}$ to $\Re$, then $\phi({\mathbb E} X) \le {\mathbb E} \phi(X)$.
#
# #### Markov's, Chebychev's, and related inequalities
# From the tail-integral formula, for any $t > 0$,
#
# $$
# {\mathbb E} X \ge \int_0^t {\mathbb P} \{X \ge x\} dx \ge t {\mathbb P} \{X \ge t \},
# $$
#
# which yields _Markov's Inequality_:
#
# $$
# {\mathbb P} \{ X \ge t \} \le \frac{{\mathbb E} X}{t}.
# $$
#
# Moreover, for any nonnegative, strictly increasing function $f$ and nonnegative $X$,
# we have the _Generalized Markov Inequality_:
#
# $$
# {\mathbb P} \{ X \ge t \} = {\mathbb P} \{ f(X) \ge f(t) \} \le \frac{{\mathbb E} f(X)}{f(t)}.
# $$
#
# In particular, for any real-valued $X$ and real $q > 0$,
# $| X - {\mathbb E} X |$ is a nonnegative
# random variable and $f(x) = x^q$ is nonnegative and strictly increasing on $[0, \infty)$, so
#
# $$
# {\mathbb P} \{| X - {\mathbb E} X | \ge t \} = {\mathbb P} \{ | X - {\mathbb E} X |^q \ge t^q \} \le
# \frac{{\mathbb E} | X - {\mathbb E} X |^q}{t^q}.
# $$
#
# _Chebychev's inequality_ is the special case $q = 2$:
#
# $$
# {\mathbb P} \{ | X - {\mathbb E} X |^2 \ge t^2 \} \le \frac{{\mathbb E} | X - {\mathbb E} X |^2}{t^2}
# = \frac{{\mbox{Var}} X}{t^2}.
# $$
#
# #### The Chernoff Bound
# Apply the Generalized Markov Inequality with $f(x) = e^{sx}$ for positive $s$
# to obtain the _Chernoff Bound_:
# $$
# {\mathbb P} \{ X \ge t \} = {\mathbb P} \{ e^{sX} \ge e^{st} \} \le \frac{{\mathbb E} e^{sX}}{e^{st}}
# $$
# for all $s > 0$.
# For a particular $X$, one can optimize the bound over $s$.
#
# #### Hoeffding's Inequality
# Let $\{ X_j \}_{j=1}^n$ be independent, and define
# $S_n \equiv \sum_{j=1}^n X_j$.
# Applying the Chernoff Bound and using independence yields
# $$
# {\mathbb P} \{ S_n - {\mathbb E} S_n \ge t \} \le e^{-st} {\mathbb E} e^{s(S_n - {\mathbb E} S_n)} =
# e^{-st} \prod_{j=1}^n {\mathbb E} e^{s(X_j - {\mathbb E} X_j)}.
# $$
# Hoeffding bounds the moment generating function of a bounded, zero-mean random variable
# $X$: if ${\mathbb E} X = 0$ and $a \le X \le b$ with probability 1, then
# $$
# {\mathbb E} e^{sX} \le e^{s^2 (b-a)^2/8},
# $$
# from which follows _Hoeffding's tail bound_.
#
# If $\{X_j\}_{j=1}^n$ are independent and ${\mathbb P} \{a_j \le X_j \le b_j\} = 1$,
# then
# $$
# {\mathbb P} \{ S_n - {\mathbb E} S_n \ge t \} \le e^{-2t^2/\sum_{j=1}^n (b_j - a_j)^2}
# $$
# and
# $$
# {\mathbb P} \{ S_n - {\mathbb E} S_n \le -t \} \le e^{-2t^2/\sum_{j=1}^n (b_j - a_j)^2}.
# $$
#
# #### Hoeffding's Other Inequality
# Suppose $f$ is a convex, real function and ${\mathcal X}$ is a finite set.
# Let $\{X_j \}_{j=1}^n$ be a simple random sample from ${\mathcal X}$ (drawn without replacement) and
# let $\{Y_j \}_{j=1}^n$ be an iid uniform random sample (drawn with replacement) from ${\mathcal X}$.
# Then
# $$
# {\mathbb E} f \left ( \sum_{j=1}^n X_j \right ) \le {\mathbb E} f \left ( \sum_{j=1}^n Y_j \right ).
# $$
#
# #### Bernstein's Inequality
# Suppose $\{X_j \}_{j=1}^n$ are independent with ${\mathbb E} X_j = 0$ for all $j$,
# ${\mathbb P} \{ | X_j | \le c\} = 1$,
# $\sigma_j^2 = {\mathbb E} X_j^2$, and $\sigma^2 = \frac{1}{n} \sum_{j=1}^n \sigma_j^2$.
# Then for any $\epsilon > 0$,
# $$
# {\mathbb P} \{ S_n/n > \epsilon \} \le e^{-n \epsilon^2 / (2(\sigma^2 + c\epsilon/3))}.
# $$
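# #### Numerical illustrations
# The next few cells are small simulation sketches, not part of Lugosi's development:
# each compares an empirical quantity with the corresponding identity or bound above.
# They assume `numpy` is available; seeds, sample sizes, and distributions are
# illustrative choices.
#
# First, the tail-integral formula, checked by Monte Carlo for an exponential
# random variable with rate 1 (so ${\mathbb E} X = 1$): the sample mean should be close
# to a Riemann-sum approximation of $\int_0^\infty {\mathbb P} \{X \ge x\} dx$.

# In[ ]:


import numpy as np

rng = np.random.default_rng(12345)            # seeded for reproducibility
x = rng.exponential(scale=1.0, size=100_000)  # nonnegative X with E X = 1

# Empirical survival function P{X >= t} on a grid, truncated at t = 20
# (the exponential tail beyond 20 is negligible at this sample size).
grid = np.linspace(0.0, 20.0, 2001)
xs = np.sort(x)
survival = 1.0 - np.searchsorted(xs, grid, side='left') / x.size

tail_integral = np.sum(survival[:-1] * np.diff(grid))  # left Riemann sum

print(f"sample mean:                {x.mean():.4f}")
print(f"integral of empirical tail: {tail_integral:.4f}")  # both near 1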
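# A similar sketch for Jensen's inequality with the convex function $\phi(x) = x^2$:
# for the same exponential distribution, $\phi({\mathbb E} X) = 1$ while
# ${\mathbb E} \phi(X) = {\mathbb E} X^2 = 2$.

# In[ ]:


import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)  # E X = 1, E X^2 = 2

phi = np.square                               # phi(x) = x^2 is convex
print(f"phi(E X) = {phi(x.mean()):.3f}")      # about 1
print(f"E phi(X) = {phi(x).mean():.3f}")      # about 2; Jensen says this is >= phi(E X)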
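# Markov's and Chebychev's inequalities for the same distribution. The exponential
# tail decays like $e^{-t}$, so both bounds hold but are loose; the Chebychev step
# below uses ${\mathbb P} \{X \ge t\} \le {\mathbb P} \{|X - {\mathbb E} X| \ge t - {\mathbb E} X\}$ for $t > {\mathbb E} X$.

# In[ ]:


import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=100_000)  # E X = 1, Var X = 1

for t in (2.0, 4.0, 8.0):
    empirical = np.mean(x >= t)
    markov = x.mean() / t                     # Markov: E X / t
    cheby = x.var() / (t - x.mean()) ** 2     # Chebychev: Var X / (t - E X)^2
    print(f"t = {t}: empirical {empirical:.4f}, "
          f"Markov {markov:.4f}, Chebychev {min(cheby, 1.0):.4f}")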
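# A worked example of optimizing the Chernoff Bound over $s$: for $X \sim N(0,1)$,
# ${\mathbb E} e^{sX} = e^{s^2/2}$, so the bound is $e^{s^2/2 - st}$, minimized over
# $s > 0$ at $s = t$, giving ${\mathbb P} \{X \ge t\} \le e^{-t^2/2}$. The sketch below
# compares that optimized bound with empirical Gaussian tail frequencies.

# In[ ]:


import numpy as np

rng = np.random.default_rng(2)
z = rng.standard_normal(1_000_000)

for t in (1.0, 2.0, 3.0):
    empirical = np.mean(z >= t)
    chernoff = np.exp(-t ** 2 / 2)  # optimized Chernoff bound for N(0,1)
    print(f"t = {t}: empirical {empirical:.2e}, Chernoff {chernoff:.2e}")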
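# A sketch of Hoeffding's tail bound for sums of iid uniform variables on $[0, 1]$,
# so $a_j = 0$, $b_j = 1$, and $\sum_{j=1}^n (b_j - a_j)^2 = n$.

# In[ ]:


import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 50_000
s = rng.random(size=(reps, n)).sum(axis=1)  # reps replications of S_n
es = n * 0.5                                # E S_n for uniform [0, 1] summands

for t in (5.0, 10.0, 15.0):
    empirical = np.mean(s - es >= t)
    hoeffding = np.exp(-2 * t ** 2 / n)     # bound with sum (b_j - a_j)^2 = n
    print(f"t = {t}: empirical {empirical:.2e}, Hoeffding {hoeffding:.2e}")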
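# A sketch of Hoeffding's other inequality on a small hypothetical population
# ${\mathcal X} = \{0, 1, \ldots, 9\}$ with $f(s) = s^2$: the mean of $f$ of the
# sample sum should be larger when sampling with replacement than without.

# In[ ]:


import numpy as np

rng = np.random.default_rng(4)
pop = np.arange(10, dtype=float)  # the finite population {0, ..., 9}
n, reps = 5, 100_000

def f(s):
    return s ** 2                 # a convex function of the sample sum

# With replacement: n iid uniform draws per replication.
with_repl = pop[rng.integers(0, pop.size, size=(reps, n))].sum(axis=1)

# Without replacement: argsorting a row of iid uniforms yields a uniform
# random permutation of the population; keep the first n positions.
perm = np.argsort(rng.random((reps, pop.size)), axis=1)[:, :n]
without = pop[perm].sum(axis=1)

print(f"E f(sum), simple random sample: {f(without).mean():.1f}")
print(f"E f(sum), iid with replacement: {f(with_repl).mean():.1f}")  # the larger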
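# Finally, a comparison of Bernstein's and Hoeffding's bounds for centered
# Bernoulli variables, $X_j = B_j - p$ with $B_j \sim \mbox{Bernoulli}(p)$, using
# illustrative values $p = 0.1$ and $n = 1000$. Here ${\mathbb E} X_j = 0$,
# $|X_j| \le c = \max(p, 1-p)$, $\sigma^2 = p(1-p)$, and $b_j - a_j = 1$, so
# Hoeffding gives ${\mathbb P} \{S_n/n > \epsilon\} \le e^{-2n\epsilon^2}$.
# When $\sigma^2$ is small, Bernstein's bound is tighter.

# In[ ]:


import numpy as np

n = 1000
p = 0.1
c = max(p, 1 - p)     # |X_j| <= c with probability 1
sigma2 = p * (1 - p)  # sigma^2 = Var X_j, the same for every j

for eps in (0.01, 0.03, 0.05):
    hoeffding = np.exp(-2 * n * eps ** 2)
    bernstein = np.exp(-n * eps ** 2 / (2 * (sigma2 + c * eps / 3)))
    print(f"eps = {eps}: Hoeffding {hoeffding:.2e}, Bernstein {bernstein:.2e}")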
#
# Previous: [Random Variables, Expectation, & Random Vectors](rv.ipynb)
# Next: [Simulation](simulate.ipynb)

# In[1]:


get_ipython().run_line_magic('run', 'talktools.py')