- Jupyter: install from http://continuum.io/downloads, plus a Python 2 kernel for Jupyter; see https://ipython.org/install.html

- Stark, P.B., 1997–2015. *SticiGui: Statistical Tools for Internet and Classroom Instruction with a Graphical User Interface*.
- Stark, P.B., 1990–2010. Lecture notes for Nonparametrics, Statistics 240.

**These notes are in draft form, with large gaps.**
I'm happy to hear about any errors, and I hope eventually to fill in some of the missing pieces.

- Jupyter notebook
- Cells, markdown, MathJax

- Less Python than you need

- What's the difference between Probability and Statistics?

Counting and combinatorics

- Sets: unions, intersections, partitions
- De Morgan's Laws
- The Inclusion-Exclusion principle
- The Fundamental Rule of Counting
- Combinations
- Permutations
- Strategies for counting
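
These rules are easy to check numerically. A minimal sketch using only Python's standard library (`math.comb`, `math.perm`, `math.factorial`); the derangement count is an application of the Inclusion-Exclusion principle:

```python
from math import comb, perm, factorial

# Combinations: ways to choose k of n items, order ignored
assert comb(5, 2) == 10

# Permutations: ways to arrange k of n items, order matters
assert perm(5, 2) == 20

def derangements(n):
    """Permutations of n items with no fixed point, by inclusion-exclusion:
    D(n) = n! * sum_{k=0}^{n} (-1)^k / k!."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))

print(derangements(4))  # the 9 derangements of 4 items
```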

Axiomatic Probability

- Outcome space and events, events as sets
- Kolmogorov's axioms (finite and countable)
- Analogies between probability and area or mass
- Consequences of the axioms
- Probabilities of unions and intersections
- Bounds on probabilities
- Bonferroni's inequality
- The inclusion-exclusion rule for probabilities
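
For a finite outcome space with equally likely outcomes, these consequences can be verified by exhaustive enumeration. A sketch for two fair dice, with exact arithmetic via `fractions.Fraction`; the events `A` and `B` are arbitrary illustrations:

```python
from fractions import Fraction
from itertools import product

# Outcome space: ordered pairs of fair six-sided die rolls, all equally likely
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a subset of omega) under equal likelihood."""
    return Fraction(len(event), len(omega))

A = {o for o in omega if o[0] == 6}          # first die shows 6
B = {o for o in omega if o[0] + o[1] >= 10}  # sum is at least 10

# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Bonferroni / subadditivity bound: P(A or B) <= P(A) + P(B)
assert prob(A | B) <= prob(A) + prob(B)
```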

- Conditional probability
- The Multiplication Rule
- Independence
- Bayes Rule

## Lecture 2: Probability, continued

- Theories of probability
- Equally likely outcomes
- Frequency Theory
- Subjective Theory
- Shortcomings of the theories
- Rates versus probabilities
- Measurement error
- Where does probability come from in physical problems?
- Making sense of geophysical probabilities
- Earthquake probabilities
- Probability of magnetic reversals
- Probability that Earth is more than 5B years old

- Random variables.
- Probability distributions of real-valued random variables
- Cumulative distribution functions
- Discrete random variables
- Probability mass functions
- The uniform distribution on a finite set
- Bernoulli random variables
- Random variables derived from the Bernoulli
- Binomial random variables
- Geometric
- Negative binomial

- Hypergeometric random variables
- Poisson random variables: countably infinite outcome spaces
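
All of the named discrete distributions above are available in `scipy.stats`. A quick sketch evaluating a few probability mass functions (the parameter values are arbitrary):

```python
from scipy import stats

# Binomial(n=4, p=1/2): P(X = 2) = C(4,2) / 2^4 = 6/16
p_binom = stats.binom.pmf(2, n=4, p=0.5)

# Hypergeometric: 2 "good" items in a sample of 5, drawn without replacement
# from 20 items of which 7 are good
p_hyper = stats.hypergeom.pmf(2, M=20, n=7, N=5)

# Poisson(mu=1): P(X = 0) = exp(-1)
p_pois = stats.poisson.pmf(0, mu=1)

print(p_binom, p_hyper, p_pois)
```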

- Random variables, continued
- Continuous and "mixed" random variables
- Probability densities
- The uniform distribution on an interval
- The Gaussian distribution

- The CDF of discrete, continuous, and mixed distributions
- Distribution of measurement errors
- The box model for random error
- Systematic and stochastic error

- Independence of random variables
- Events derived from random variables
- Definitions of independence
- Independence and "informativeness"
- Examples of independent and dependent random variables
- IID random variables
- Exchangeability of random variables

- Marginal distributions
- Point processes
- Poisson processes
- Homogeneous and inhomogeneous Poisson processes
- Spatially heterogeneous, temporally homogeneous Poisson processes as a model for seismicity
- The conditional distribution of Poisson processes given N
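
The conditional-uniform property suggests a direct way to simulate a homogeneous Poisson process on [0, T]: draw N from Poisson(rate × T), then scatter N points uniformly on the interval. A sketch (the rate, horizon, and helper name are illustrative):

```python
import numpy as np

def simulate_poisson_process(rate, T, rng):
    """Homogeneous Poisson process on [0, T] via the conditional-uniform
    property: given N = n events, the times are i.i.d. Uniform[0, T]."""
    n = rng.poisson(rate * T)                    # total number of events
    return np.sort(rng.uniform(0.0, T, size=n))  # ordered event times

rng = np.random.default_rng(12345)
times = simulate_poisson_process(rate=2.0, T=10.0, rng=rng)
print(len(times), times[:5])
```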

- Marked point processes
- Inter-arrival times and inter-arrival distributions
- Branching processes
- ETAS

- Expectation
- The Law of Large Numbers
- The Expected Value
- Expected value of a discrete univariate distribution
- Special cases: Bernoulli, Binomial, Geometric, Hypergeometric, Poisson

- Expected value of a continuous univariate distribution
- Special cases: uniform, exponential, normal

- Expected value of a multivariate distribution

- Standard Error and Variance.
- Discrete examples
- Continuous examples
- The square-root law
- Standardization and Studentization
- The Central Limit Theorem
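
The square-root law shows up directly in simulation: the SD of the mean of n i.i.d. draws is sigma/sqrt(n). A sketch with Uniform[0, 1] draws (the sample size and replication count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = np.sqrt(1 / 12)   # SD of a single Uniform[0, 1] draw
n = 100

# 10,000 replications of the mean of n i.i.d. Uniform[0, 1] draws
means = rng.uniform(size=(10_000, n)).mean(axis=1)

# Square-root law: the SD of the sample mean is sigma / sqrt(n)
print(means.std(), sigma / np.sqrt(n))
```

By the Central Limit Theorem, a histogram of `means` would also look approximately normal.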

- The tail-sum formula for the expected value
- Conditional expectation
- The conditional expectation is a random variable
- The expectation of the conditional expectation is the unconditional expectation

- Useful probability inequalities
- Markov's Inequality
- Chebychev's Inequality
- Hoeffding's Inequality
- Jensen's inequality
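
These bounds are easy to compare with an exact tail probability. For X ~ Exponential(1), E[X] = Var(X) = 1 and P(X ≥ a) = exp(−a), so one can check that Markov's and Chebychev's bounds hold, and see how loose they can be:

```python
import math

# X ~ Exponential(1): E[X] = 1, Var(X) = 1, and P(X >= a) = exp(-a)
mean, var = 1.0, 1.0
for a in (2.0, 3.0, 5.0):
    exact = math.exp(-a)             # exact tail probability
    markov = mean / a                # Markov: P(X >= a) <= E[X] / a
    cheb = var / (a - mean) ** 2     # Chebychev: P(|X - 1| >= a - 1) <= Var / (a-1)^2
    assert exact <= markov and exact <= cheb
    print(a, exact, markov, cheb)
```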

- Simulation
- Pseudo-random number generation
- Importance of the PRNG: period, DIEHARD tests

- Assumptions
- Uncertainties
- Sampling distributions

- Hypothesis tests
- Null and alternative hypotheses, "omnibus" hypotheses
- Type I and Type II errors
- Significance level and power
- Approximate, exact, and conservative tests
- Families of tests
- P-values
- Estimating P-values by simulation
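
A common conservative way to estimate a P-value by simulation counts the observed statistic among the simulated ones: p-hat = (1 + #{simulated T ≥ observed T}) / (1 + B). A sketch for a one-sided test that a coin is fair; the data (60 heads in 100 tosses) and B are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2024)
n_heads, n_tosses = 60, 100    # hypothetical data: 60 heads in 100 tosses
B = 10_000                     # number of simulated datasets

# Simulate the test statistic (number of heads) under the null p = 1/2
sims = rng.binomial(n_tosses, 0.5, size=B)

# Conservative simulation estimate of the one-sided P-value
p_hat = (1 + np.sum(sims >= n_heads)) / (1 + B)
print(p_hat)
```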

- Test statistics
- Selecting a test statistic
- The null distribution of a test statistic
- One-sided and two-sided tests

- Null hypotheses involving actual, hypothetical, and counterfactual randomness
- Multiplicity
- Per-comparison error rate (PCER)
- Familywise error rate (FWER)
- The False Discovery Rate (FDR)

- Tests, continued
- Parametric and nonparametric tests
- The Kolmogorov-Smirnov test and the MDKW inequality
- Example: Testing for uniformity
- Conditional test for Poisson behavior

- Permutation and randomization tests
- Invariances of distributions
- Exchangeability
- The permutation distribution of test statistics
- Approximating permutation distributions by simulation
- The two-sample problem
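
A two-sample permutation test can be approximated by repeatedly re-partitioning the pooled data at random. A minimal sketch using the absolute difference in sample means as the test statistic; the data and helper name are made-up illustrations:

```python
import numpy as np

def perm_test(x, y, reps=10_000, seed=0):
    """Approximate permutation P-value for the two-sample problem,
    using |mean(x) - mean(y)| as the test statistic."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    observed = abs(np.mean(x) - np.mean(y))
    hits = 0
    for _ in range(reps):
        rng.shuffle(pooled)                     # random re-partition of the pool
        diff = abs(pooled[:len(x)].mean() - pooled[len(x):].mean())
        hits += diff >= observed
    return (hits + 1) / (reps + 1)              # conservative estimate

x = np.array([1.2, 0.9, 1.5, 1.1, 1.3])
y = np.array([0.8, 0.7, 1.0, 0.6, 0.9])
print(perm_test(x, y))
```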

- Testing when there are nuisance parameters

- Confidence sets
- Definition
- Interpretation
- Duality between hypothesis tests and confidence sets
- Tests and confidence sets for Binomial p
- Pivoting
- Confidence sets for a normal mean
- known variance
- unknown variance; Student's t distribution
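
A sketch of the unknown-variance case: the interval is the sample mean plus or minus a Student's t quantile times the estimated standard error (the data and helper name are illustrative):

```python
import numpy as np
from scipy import stats

def t_interval(data, conf=0.95):
    """Two-sided confidence interval for a normal mean, variance unknown."""
    x = np.asarray(data, dtype=float)
    n = len(x)
    se = x.std(ddof=1) / np.sqrt(n)                   # estimated SE of the mean
    tcrit = stats.t.ppf(1 - (1 - conf) / 2, df=n - 1)  # Student's t quantile
    return x.mean() - tcrit * se, x.mean() + tcrit * se

data = [9.8, 10.2, 10.1, 9.9, 10.3, 10.0, 9.7, 10.4]
lo, hi = t_interval(data)
print(lo, hi)
```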

- Approximate confidence intervals using the normal approximation
- Empirical coverage
- Failures

- Nonparametric confidence bounds for the mean of a nonnegative population
- Multiplicity
- Simultaneous coverage
- Selective coverage

- Summarizing data.
- Types of data: categorical, ordinal, quantitative
- Univariate data.
- Measures of location and spread: mean, median, mode, quantiles, inter-quartile range, range, standard deviation, RMS
- Markov's and Chebychev's inequalities for quantitative lists
- Ranks and ordinal categorical data
- Frequency tables and histograms
- Bar charts

- Multivariate data
- Scatterplots
- Measures of association: Pearson and Spearman correlation coefficients
- Linear regression
- The Least Squares principle
- The Projection Theorem
- The Normal Equations
- Numerical solution of the normal equations
- Numerical linear algebra is not the same as abstract linear algebra
- Condition number
- Do not invert matrices to solve linear systems: use backsubstitution or factorization
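
To illustrate the last point: use a factorization-based least-squares routine (e.g. `numpy.linalg.lstsq`, which works via an SVD) instead of forming the inverse of X'X explicitly. A sketch on simulated data; the two answers agree here, but the inversion route is numerically fragile when the design is ill-conditioned:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])               # design matrix with intercept
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=n)  # simulated responses

# Preferred: factorization-based least squares (SVD under the hood)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Algebraically equivalent, numerically fragile: explicit inversion of X'X
beta_bad = np.linalg.inv(X.T @ X) @ X.T @ y

print(beta)
```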

- Errors in regression: RMS error of linear regression
- Least Absolute Value regression

- Principal components and approximation by subspaces: another application of the Projection Theorem
- Clustering
- Distance functions
- Hierarchical methods, tree-based methods
- Centroid methods: K-means
- Density-based clustering: kernel methods, DBSCAN
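
As a concrete instance of a centroid method, here is a minimal from-scratch sketch of K-means (Lloyd's algorithm) on made-up, well-separated data; in practice one would use a library implementation:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Lloyd's algorithm: assign each point to the nearest centroid,
    then move each centroid to the mean of its cluster."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # squared Euclidean distances, points x centers
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # update centers; keep a center unchanged if its cluster is empty
        centers = np.array([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels, centers

rng = np.random.default_rng(42)
# two well-separated toy clusters in the plane
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(5, 0.3, (30, 2))])
labels, centers = kmeans(X, k=2)
print(centers)
```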

Counting and combinatorics

- Sets: unions, intersections, partitions
- De Morgan's Laws
- The Inclusion-Exclusion principle.
- The Fundamental Rule of Counting
- Combinations. Application (using the Inclusion-Exclusion Principle): counting derangements
- Permutations
- Strategies for complex counting problems

Theories of probability

- Equally likely outcomes
- Frequency Theory
- Subjective Theory
- Shortcomings of the theories

Axiomatic Probability

- Outcome space and events, events as sets
- Kolmogorov's axioms (finite and countable)
- Analogies between probability and area or mass
- Consequences of the axioms
- Probabilities of unions and intersections
- Bounds on probabilities
- Bonferroni's inequality
- The inclusion-exclusion rule for probabilities

- Conditional probability
- The Multiplication Rule
- Independence
- Bayes Rule

Random variables.

- Probability distributions
- Cumulative distribution functions for real-valued random variables
- Discrete random variables
- Probability mass functions
- The uniform distribution on a finite set
- Bernoulli random variables
- Random variables derived from the Bernoulli
- Binomial random variables
- Geometric
- Negative binomial

- Poisson random variables: countably infinite outcome spaces
- Hypergeometric random variables
- Examples of other discrete random variables

- Continuous and "mixed" random variables
- Probability densities
- The uniform distribution on an interval
- The exponential distribution and double-exponential distributions
- The Gaussian distribution
- The CDF of discrete, continuous, and mixed distributions

- Survival functions and hazard functions
- Counting processes

- Joint distributions of collections of random variables, random vectors
- The multivariate uniform distribution
- The multivariate normal distribution
- Independence of random variables
- Events derived from random variables
- Definitions of independence

- Marginal distributions
- Conditional distributions
- The "memoryless property" of the exponential distribution

- The Central Limit Theorem

- Stochastic processes
- Point processes
- Intensity functions and conditional intensity functions
- Poisson processes
- Homogeneous and inhomogeneous Poisson processes
- The conditional distribution of Poisson processes given N

- Marked point processes
- Inter-arrival times and inter-arrival distributions
- The conditional distribution of a Poisson process

- Random walks
- Markov chains
- Brownian motion

Expectation

- The Law of Large Numbers
- The Expected Value
- Expected value of a discrete univariate distribution
- Special cases: Bernoulli, Binomial, Geometric, Hypergeometric, Poisson

- Expected value of a continuous univariate distribution
- Special cases: uniform, exponential, normal

- (Aside: measurability, Lebesgue integration, and the CDF as a measure)
- Expected value of a multivariate distribution

- Expected values of functions of a random variable
- Change-of-variables formulas for probability mass functions and densities

- Standard Error and Variance.
- Discrete examples
- Continuous examples
- The square-root law

- The tail-sum formula for the expected value
- Conditional expectation
- The expectation of the conditional expectation is the unconditional expectation

- Useful probability inequalities
- Markov's Inequality
- Chebychev's Inequality
- Hoeffding's Inequality

Empirical distributions

- The ECDF for univariate distributions
- The Kolmogorov-Smirnov statistic and the Massart-Dvoretzky-Kiefer-Wolfowitz (MDKW) inequality
- Inference: inverting the MDKW inequality
- Q-Q plots
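
Inverting the MDKW inequality gives a distribution-free confidence band for the CDF: with probability at least 1 − α, sup over t of |F-hat(t) − F(t)| is at most sqrt(ln(2/α) / (2n)). A sketch (the sample, α, and helper names are illustrative):

```python
import numpy as np

def ecdf(data):
    """Return a function that evaluates the empirical CDF of the data."""
    x = np.sort(np.asarray(data))
    return lambda t: np.searchsorted(x, t, side="right") / len(x)

def mdkw_eps(n, alpha=0.05):
    """Half-width of the level (1 - alpha) MDKW confidence band for the CDF."""
    return np.sqrt(np.log(2 / alpha) / (2 * n))

rng = np.random.default_rng(7)
sample = rng.normal(size=400)        # illustrative data
F_hat = ecdf(sample)
eps = mdkw_eps(len(sample))

# With probability >= 95%, the true CDF lies between F_hat(t) - eps and
# F_hat(t) + eps simultaneously for all t
print(F_hat(0.0), eps)
```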

Random sampling.

- Types of samples
- Samples of convenience
- Quota sampling
- Systematic sampling
- The importance of random sampling: stirring the soup.
- Systematic random sampling
- Random sampling with replacement
- Simple random sampling
- Stratified random sampling.
- Cluster sampling
- Multistage sampling
- Weighted random samples
- Sampling with probability proportional to size

- Sampling frames
- Nonresponse and missing data
- Sampling bias

Simulation

- Pseudo-random number generators
- Why the PRNG matters
- Uniformity, period, independence
- Assessing PRNGs. DIEHARD and other tests
- Linear congruential PRNGs, including the Wichmann-Hill. Group-induced patterns
- Statistically "adequate" PRNGs, including the Mersenne Twister
- Cryptographic quality PRNGs, including cryptographic hashes

- Generating pseudorandom permutations
- Taking pseudorandom samples
- Simulating sampling distributions
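
With a trustworthy PRNG, pseudorandom permutations and samples are library calls on a seeded generator. A sketch using numpy's `Generator` (PCG64 by default):

```python
import numpy as np

rng = np.random.default_rng(20170101)   # seeded for reproducibility
population = np.arange(100)

# a pseudorandom permutation of the whole population
perm = rng.permutation(population)

# a simple random sample of size 10 (without replacement)
srs = rng.choice(population, size=10, replace=False)

# a random sample with replacement, e.g. for the bootstrap
boot = rng.choice(population, size=10, replace=True)
print(srs)
```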

Estimating parameters using random samples

- Sampling distributions
- The Central Limit Theorem
- Measures of accuracy: mean squared error, median absolute deviation, etc.
- Maximum likelihood
- Loss functions, Risk, and decision theory
- Minimax estimates
- Bayes estimates
- The Bootstrap
- Shrinkage and regularization

Inference

- Hypothesis tests
- Null and alternative hypotheses, "omnibus" hypotheses
- Type I and Type II errors
- Significance level and Power
- Approximate, exact, and conservative tests
- Families of tests
- P-values
- Estimating P-values by simulation

- Test statistics
- Selecting a test statistic
- The null distribution of a test statistic
- One-sided and two-sided tests

- Null hypotheses involving actual, hypothetical, and counterfactual randomness
- Multiplicity
- Per-comparison error rate
- Familywise error rate
- The False Discovery Rate

- Approaches to testing
- Parametric and nonparametric tests
- Likelihood ratio tests
- Permutation and randomization tests
- Invariances of distributions
- Exchangeability
- Other symmetries
- The permutation distribution of test statistics
- Approximating permutation distributions by simulation

- Confidence sets
- Duality between hypothesis tests and confidence sets
- Conditional tests, conditional and unconditional significance levels

Tests of particular hypotheses

- The Neyman model of a randomized experiment.
- Strong and weak null hypotheses
- Testing the strong null hypothesis
- The distribution of a test statistic under the strong null

- "Interference"
- Blocking and other designs
- Ensuring that the null hypothesis matches the experiment

- Tests for Binomial p
- The Sign test
- The sign test for the median; tests for other quantiles
- The sign test for a difference in medians
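
The sign test reduces a hypothesis about the median to a hypothesis about Binomial p = 1/2: under the null, each observation exceeds the hypothesized median with probability 1/2. A sketch using the exact binomial upper tail (the data and helper name are illustrative):

```python
from scipy import stats

def sign_test(data, median0=0.0):
    """One-sided sign test P-value for H0: median = median0 against
    'median > median0', via the exact Binomial(n, 1/2) upper tail.
    Observations equal to median0 are dropped, as is conventional."""
    above = sum(x > median0 for x in data)
    n = sum(x != median0 for x in data)
    return stats.binom.sf(above - 1, n, 0.5)   # P(Binomial(n, 1/2) >= above)

data = [0.3, 1.1, -0.2, 0.8, 0.5, 0.9, 1.4, -0.1, 0.6, 0.7]
print(sign_test(data))   # 8 of 10 observations above 0
```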

- Tests based on the normal approximation
- The Z statistic and the Z test
- The t statistic and the t test
- 2-sample problems, paired and unpaired tests
- Tests based on ranks
- The Wilcoxon test
- The Wilcoxon signed rank test

- Tests using actual values

- Tests of association
- The hypothesis of exchangeability
- The Spearman test
- The permutation distribution of the Pearson correlation

- Tests of randomness and independence
- The runs test

- Tests of symmetry
- Tests of exchangeability
- Tests of spherical symmetry

- The two-sample problem
- Selecting the test statistic: what's the alternative?
- Mean, sum, Student t
- Smirnov statistic
- Other choices

- The permutation distribution of the test statistic
- The two-sample problem for complex data
- Test statistics

- The k-sample problem

- Selecting the test statistic: what's the alternative?
- Stratified permutation tests
- Fisher's Exact Test
- Tests of homogeneity and ANOVA
- The F statistic
- The permutation distribution of the F statistic
- Other statistics
- Ordered alternatives

- Tests based on the distribution function: The Kolmogorov-Smirnov Test
- The universality of the null distribution for continuous variables
- Using the K-S test to test for Poisson behavior
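
The conditional-uniform property makes the connection: given N, the event times of a homogeneous Poisson process on [0, T] are i.i.d. Uniform[0, T], so the K-S test of the rescaled times against the uniform CDF tests for Poisson behavior. A sketch on simulated data (the rate and horizon are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Simulated homogeneous Poisson process on [0, T]: given N, the event
# times are i.i.d. Uniform[0, T]
T = 100.0
n = rng.poisson(2.0 * T)                       # rate 2 per unit time
times = np.sort(rng.uniform(0.0, T, size=n))   # ordered event times

# K-S test of the rescaled times against Uniform[0, 1]
stat, p = stats.kstest(times / T, "uniform")
print(stat, p)
```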

- Sequential tests and Wald's SPRT
- Random walks and Gambler's ruin
- Wald's Theorem

Confidence intervals for particular parameters

- Confidence intervals for a shift in the Neyman model
- Confidence intervals for Binomial p
- Application: confidence bounds for P-values estimated by simulation
- Application: intervals for quantiles by inverting binomial tests

- Confidence intervals for a Normal mean using the Z and t distributions
- Confidence intervals for the mean
- Nonparametric confidence bounds for a population mean
- The need for a priori bounds
- Nonnegative random variables
- Bounded random variables

- Confidence sets for multivariate parameters

Density estimation

- Histogram estimates
- Kernel estimates
- Confidence bounds for monotone and shape-restricted densities
- Lower confidence bounds on the number of modes

Function estimation

- Splines and penalized splines
- Polynomial splines
- Periodic splines
- Smoothing splines as least-squares
- B-splines
- L1 splines

- Constraints
- Balls and ellipsoids
- Smoothness and norms
- Lipschitz conditions
- Sobolev conditions

- Cones
- Nonnegativity
- Shape restrictions
- Monotonicity
- Convexity

- Star-shaped constraints
- Sparsity and minimum L1 methods

Experiments versus observational studies

- Controls and the Method of Comparison
- Randomization
- Blinding

Experimental design

- Blocking
- Orthogonal designs
- Latin hypercube design

```
# Version information
%load_ext version_information
%version_information scipy, numpy, pandas, matplotlib
```
