Items 1 through 4 deal with the Big Picture™ questions: What is randomness? How do we think about uncertainty?
Items 5 through 7 are for computing expected values (mean, variance & standard deviation).
Items 8 through 10 are important for understanding long-run behavior.
Some topics to study from here on out:
You've seen this before:
$$Y = \beta_0 + \beta_1 X + \epsilon$$

So how would we solve for $\beta_1$?
We can start by treating Cov as an operator!
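$$
\begin{aligned}
\operatorname{Cov}(Y, X) &= \operatorname{Cov}(\beta_0 + \beta_1 X + \epsilon,\ X) \\
&= \operatorname{Cov}(\beta_0, X) + \operatorname{Cov}(\beta_1 X, X) + \operatorname{Cov}(\epsilon, X) \\
\\
\text{now } \operatorname{Cov}(\beta_0, X) &= 0 && \text{since Cov of a constant with anything is 0} \\
\\
\text{and } \operatorname{Cov}(\beta_1 X, X) &= \beta_1 \operatorname{Cov}(X, X) \\
&= \beta_1 \operatorname{Var}(X) && \text{by definition of Var} \\
\\
\text{and since } E(\epsilon) &= E(E(\epsilon \mid X)) = E(0) = 0 \\
\\
\text{and further } E(\epsilon X) &= E(E(\epsilon X \mid X)) && \text{by Adam's Law} \\
&= E(X \, E(\epsilon \mid X)) && \text{since } X \text{ is known, we can pull it out} \\
&= E(0) = 0 \\
\\
\text{so } \operatorname{Cov}(\epsilon, X) &= E(\epsilon X) - E(\epsilon) E(X) = 0 - 0 = 0 \\
\\
\Rightarrow \beta_1 &= \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)} && \text{(population version)}
\end{aligned}
$$

import numpy as np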
X = np.array([95, 85, 80, 70, 60])
Y = np.array([85, 95, 70, 65, 70])
# numpy.cov(X, Y) returns the matrix
# [ Cov(X,X), Cov(X,Y)]
# [ Cov(X,Y), Cov(Y,Y)]
covM = np.cov(X,Y)
beta_1 = covM[0,1]/covM[0,0]
beta_1
0.64383561643835618
For comparison's sake, we also obtain $\beta_1$ via sklearn.linear_model.LinearRegression:
from sklearn import linear_model
regr = linear_model.LinearRegression()
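# scikit-learn expects a 2-D array of shape (n_samples, n_features);
# reshape is used instead of np.matrix, which recent versions reject
regr.fit(X.reshape(-1, 1), Y).coef_[0]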
0.64383561643835607
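As a sanity check on the population formula, the following sketch simulates data from a linear model with a known slope and compares Cov(X,Y)/Var(X) against it. The true coefficients, the sample size, the distribution of X, and the seed are all arbitrary choices made for illustration; the noise is drawn independently of X so that E(ϵ|X)=0 holds by construction.

```python
import numpy as np

rng = np.random.default_rng(110)  # arbitrary seed, for reproducibility only

# Hypothetical true parameters, chosen only for this illustration
beta_0, beta_1 = 2.0, 0.5
n = 100_000

X = rng.normal(loc=10.0, scale=3.0, size=n)
eps = rng.normal(loc=0.0, scale=1.0, size=n)  # independent of X, so E(eps | X) = 0
Y = beta_0 + beta_1 * X + eps

# 2x2 matrix: [[Var(X), Cov(X,Y)], [Cov(X,Y), Var(Y)]]
cov_matrix = np.cov(X, Y)
beta_1_hat = cov_matrix[0, 1] / cov_matrix[0, 0]

# Cross-check against an ordinary least-squares line fit
# (np.polyfit returns coefficients from highest degree down)
slope_ls, intercept_ls = np.polyfit(X, Y, deg=1)

print(beta_1_hat, slope_ls)
```

With a large sample the ratio should land close to the chosen slope of 0.5, and it should match the least-squares slope up to floating-point error, since the two are algebraically the same quantity.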
Here's the set-up:
It is important to understand the difference between $X_j$ and $Y_j$:
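Let $t_y$ be the true population total $\sum_{i=1}^{N} Y_i$. How can we use random sampling of this finite population to find an estimate $\hat{t}_y$?

The claim is that $\sum_{j=1}^{n} \frac{X_j}{P_{Z_j}}$ is the unbiased estimator for $t_y$ that we are looking for.

$$
\begin{aligned}
\hat{t}_y &= \sum_{j=1}^{n} \frac{X_j}{P_{Z_j}} \\
&= \sum_{j=1}^{N} I_j \frac{Y_j}{P_j} && \text{where } I_j = 1 \text{ if person } j \text{ is included in the sample} \\
\\
E(\hat{t}_y) &= E\left( \sum_{j=1}^{N} I_j \frac{Y_j}{P_j} \right) && \text{take the expected value of } \hat{t}_y \\
&= \sum_{j=1}^{N} P_j \frac{Y_j}{P_j} && \text{by linearity, since } E(I_j) = P_j \\
&= \sum_{j=1}^{N} Y_j = t_y
\end{aligned}
$$

This is known as the Horvitz-Thompson estimator, or alternatively inverse probability weighting.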
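To see the unbiasedness argument in action, here is a minimal sketch that checks it exactly on a tiny population. It assumes one particular sampling design that these notes do not specify: each person j is included independently with probability P_j. The values in `Y` and `P` are made up for illustration. Because the population is so small, the expectation can be computed by enumerating every possible sample rather than by simulation.

```python
from itertools import product

# Hypothetical population values Y_j and inclusion probabilities P_j
Y = [12.0, 7.0, 30.0, 5.0]
P = [0.5, 0.2, 0.9, 0.4]


def horvitz_thompson(included, Y, P):
    """Sum of Y_j / P_j over the people included in the sample."""
    return sum(y / p for i, y, p in zip(included, Y, P) if i)


# Assumed design: person j is included independently with probability P_j.
# Enumerate all 2^N inclusion patterns (I_1, ..., I_N) with their probabilities.
expected_value = 0.0
for included in product([0, 1], repeat=len(Y)):
    prob = 1.0
    for i, p in zip(included, P):
        prob *= p if i else (1 - p)
    expected_value += prob * horvitz_thompson(included, Y, P)

# The two numbers agree up to floating-point rounding
print(expected_value, sum(Y))
```

Individual samples can give estimates far from the total (the empty sample gives 0, for instance); only the average over all samples is guaranteed to match.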
But is an unbiased estimator good?
Statistics is not easy, and it requires a lot of effort to keep your eyes open and question whether or not a tentative method is really going to yield a proper answer. Here is an anecdote illustrating how blindly applying a Horvitz-Thompson estimate can end in disaster.
The circus owner is planning to ship his 50 adult elephants and so he needs a rough estimate of the total weight of the elephants. As weighing an elephant is a cumbersome process, the owner wants to estimate the total weight by weighing just one elephant. Which elephant should he weigh?
So the owner looks back on his records and discovers a list of the elephants' weights taken 3 years ago. He finds that 3 years ago Sambo the middle-sized elephant was the average (in weight) elephant in his herd. He checks with the elephant trainer who reassures him (the owner) that Sambo may still be considered to be the average elephant in the herd. Therefore, the owner plans to weigh Sambo and take 50y (where y is the present weight of Sambo) as an estimate of the total weight of the 50 elephants.
But the circus statistician is horrified when he learns of the owner's proposed sampling plan. "How can you get an unbiased estimate of Y this way?", protests the statistician.
So, together they work out a compromise sampling plan. With the help of a table of random numbers they devise a plan that allots a selection probability of 99/100 to Sambo and equal selection probabilities of 1/4900 to each of the other 49 elephants. Naturally, Sambo is selected and the owner is happy.
"How are you going to estimate Y?", asks the statistician.
"Why? The estimate ought to be 50y of course," says the owner.
"Oh! No! That cannot possibly be right," says the statistician, "I recently read an article in the Annals of Mathematical Statistics where it is proved that the Horvitz-Thompson estimator is the unique hyperadmissible estimator in the class of all generalized polynomial unbiased estimators."
"What is the Horvitz-Thompson estimate in this case?" asks the owner, duly impressed.
"Since the selection probability for Sambo in our plan was 99/100," says the statistician, "the proper estimate of Y is 100y/99 and not 50y."
"And, how would you have estimated Y", inquires the incredulous owner, "if our sampling plan made us select, say, the big elephant Jumbo?"
"According to what I understand of the Horvitz-Thompson estimation method," says the unhappy statistician, "the proper estimate of Y would then have been 4900y, where y is Jumbo's weight".
That is how the statistician lost his circus job (and perhaps became a teacher of statistics).
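The arithmetic behind the punchline can be made concrete. The sketch below uses invented weights, since the anecdote gives none: Sambo at 4,000 kg, Jumbo at 6,000 kg, and the other 48 elephants at 3,950 kg each. Only the selection probabilities (99/100 and 1/4900) come from the story. Because a single elephant is drawn, each selection probability is also that elephant's inclusion probability, so the Horvitz-Thompson estimate is simply y/p for whichever elephant is picked.

```python
from fractions import Fraction

# Hypothetical weights in kg; the anecdote does not give any numbers.
weights = {"Sambo": 4000, "Jumbo": 6000}
weights.update({f"elephant_{k}": 3950 for k in range(48)})

# Selection probabilities from the compromise plan (a single elephant is drawn)
probs = {name: Fraction(1, 4900) for name in weights}
probs["Sambo"] = Fraction(99, 100)
assert sum(probs.values()) == 1

total = sum(weights.values())

# Horvitz-Thompson estimate if a given elephant is the one selected: y / p
ht = {name: weights[name] / probs[name] for name in weights}

# Exact mean and variance of the estimator over the sampling plan
mean_ht = sum(probs[name] * ht[name] for name in weights)
var_ht = sum(probs[name] * (ht[name] - total) ** 2 for name in weights)

print("true total:       ", total)
print("owner's 50y:      ", 50 * weights["Sambo"])
print("HT if Sambo drawn:", float(ht["Sambo"]))
print("HT if Jumbo drawn:", float(ht["Jumbo"]))
print("E[HT], sd[HT]:    ", mean_ht, float(var_ht) ** 0.5)
```

With these made-up numbers the estimator is exactly unbiased, yet 99% of the time it returns about 4,040 kg and otherwise a figure in the tens of millions. Neither is anywhere near the true total, while the owner's biased 50y is off by well under one percent. Unbiasedness alone says nothing about how far a single estimate can be from the truth.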
View Lecture 34: A Look Ahead | Statistics 110 on YouTube.