Answer to the Stack Overflow question: http://stackoverflow.com/q/22239691/2839786
Assume the least-squares line of best fit for a set of points is given by:
$y = a + b x$
where:
$b = \Large{\frac{\sum x_i y_i - n \bar x \bar y}{\sum (x_i - \bar x)^2}}$
and
$a = \bar y - b \bar x$
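As a side note, the denominator of $b$ can be expanded into a form without differences, which is often more convenient to compute directly. A short derivation, using only $\sum x_i = n \bar x$:

$$\sum (x_i - \bar x)^2 = \sum x_i^2 - 2 \bar x \sum x_i + n \bar x^2 = \sum x_i^2 - 2 n \bar x^2 + n \bar x^2 = \sum x_i^2 - n \bar x^2$$

The numerator follows the same pattern:

$$\sum (x_i - \bar x)(y_i - \bar y) = \sum x_i y_i - \bar y \sum x_i - \bar x \sum y_i + n \bar x \bar y = \sum x_i y_i - n \bar x \bar y$$

So $b$ can equivalently be written as $\frac{\sum x_i y_i - n \bar x \bar y}{\sum x_i^2 - n \bar x^2}$.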
Here is some toy code that computes $a$ and $b$:
# sample points
X = [0, 5, 10, 15, 20]
Y = [0, 7, 10, 13, 20]
# solve for a and b
def best_fit(X, Y):
    n = len(X)  # or len(Y)
    xbar = sum(X) / n
    ybar = sum(Y) / n

    # numerator: sum(x_i * y_i) - n * xbar * ybar
    numer = sum(xi * yi for xi, yi in zip(X, Y)) - n * xbar * ybar
    # denominator: sum(x_i**2) - n * xbar**2, which equals sum((x_i - xbar)**2)
    denom = sum(xi**2 for xi in X) - n * xbar**2

    b = numer / denom    # slope
    a = ybar - b * xbar  # intercept

    print('best fit line:\ny = {:.2f} + {:.2f}x'.format(a, b))
    return a, b
# solution
a, b = best_fit(X, Y)
# plot points and fit line
%matplotlib inline
import matplotlib.pyplot as plt
plt.scatter(X, Y)
yfit = [a + b * xi for xi in X]
plt.plot(X, yfit)
plt.grid()
plt.show()
best fit line:
y = 0.80 + 0.92x
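One way to sanity-check a fit like this is to look at the residuals $r_i = y_i - (a + b x_i)$. For a least-squares line they should sum to zero and also satisfy $\sum x_i r_i = 0$, up to floating-point error. The sketch below is only an illustration: `fit_line` and `residual_checks` are hypothetical helper names (not part of the answer above), and the fit is restated so the snippet runs on its own.

```python
# Hypothetical helpers for illustration; fit_line restates the formulas above
# without printing, so that this snippet is self-contained.
def fit_line(xs, ys):
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    numer = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar
    denom = sum(x * x for x in xs) - n * xbar ** 2
    b = numer / denom    # slope
    a = ybar - b * xbar  # intercept
    return a, b


def residual_checks(xs, ys, a, b):
    # Residuals of the fitted line at each sample point.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    # Both sums should be (numerically) zero for a least-squares fit.
    return sum(residuals), sum(x * r for x, r in zip(xs, residuals))


X = [0, 5, 10, 15, 20]
Y = [0, 7, 10, 13, 20]

a, b = fit_line(X, Y)
sum_r, sum_xr = residual_checks(X, Y, a, b)

# Compare against a small tolerance rather than exact zero.
print(abs(sum_r) < 1e-9, abs(sum_xr) < 1e-9)
```

If either check fails for your own data, the usual suspects are mismatched list lengths or integer division (on Python 2, convert the inputs to floats first).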