This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.

9.3. Fitting a function to data with nonlinear least squares

  1. Let's import the usual libraries.
In [ ]:
import numpy as np
import scipy.optimize as opt
import matplotlib.pyplot as plt
%matplotlib inline
np.random.seed(3)
  2. We define a logistic function with four parameters.

$$f_{a,b,c,d}(x) = \frac{a}{1 + \exp\left(-c (x-d)\right)} + b$$

In [ ]:
def f(x, a, b, c, d):
    return a/(1. + np.exp(-c * (x-d))) + b
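In this parameterization, b is the lower asymptote, a + b the upper asymptote, c the steepness, and d the midpoint of the curve. A quick self-contained check of these roles (the function is redefined here so the snippet runs on its own):

```python
import numpy as np

def f(x, a, b, c, d):
    return a / (1. + np.exp(-c * (x - d))) + b

# With a=2, b=1, c=1, d=0:
print(f(-50., 2., 1., 1., 0.))  # far left: close to b = 1
print(f(50., 2., 1., 1., 0.))   # far right: close to a + b = 3
print(f(0., 2., 1., 1., 0.))    # at the midpoint d: b + a/2 = 2
```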
  3. Let's define four random parameters.
In [ ]:
a, c = np.random.exponential(size=2)
b, d = np.random.randn(2)
  4. Now, we generate random data points by using the sigmoid function and adding a bit of noise.
In [ ]:
n = 100
x = np.linspace(-10., 10., n)
y_model = f(x, a, b, c, d)
y = y_model + a * .2 * np.random.randn(n)
  5. Here is a plot of the data points, with the particular sigmoid used for their generation.
In [ ]:
plt.figure(figsize=(6,4));
plt.plot(x, y_model, '--k');
plt.plot(x, y, 'o');
  6. We now assume that we only have access to the data points. These points could have been obtained during an experiment. By looking at the data, the points appear to approximately follow a sigmoid, so we may want to try to fit such a curve to the points. That's what curve fitting is about. SciPy's function curve_fit allows us to fit a curve defined by an arbitrary Python function to the data.
In [ ]:
(a_, b_, c_, d_), _ = opt.curve_fit(f, x, y, (a, b, c, d))
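The second return value of curve_fit, discarded as `_` above, is the estimated covariance matrix of the fitted parameters; the square roots of its diagonal give approximate one-sigma standard errors on the estimates. A minimal self-contained sketch, reusing the setup from this recipe:

```python
import numpy as np
import scipy.optimize as opt

def f(x, a, b, c, d):
    return a / (1. + np.exp(-c * (x - d))) + b

np.random.seed(3)
a, c = np.random.exponential(size=2)
b, d = np.random.randn(2)
n = 100
x = np.linspace(-10., 10., n)
y = f(x, a, b, c, d) + a * .2 * np.random.randn(n)

# popt: fitted parameters; pcov: their estimated covariance matrix.
popt, pcov = opt.curve_fit(f, x, y, (a, b, c, d))
perr = np.sqrt(np.diag(pcov))  # approximate standard errors
```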
  7. Now, let's take a look at the fitted sigmoid curve.
In [ ]:
y_fit = f(x, a_, b_, c_, d_)
In [ ]:
plt.figure(figsize=(6,4));
plt.plot(x, y_model, '--k');
plt.plot(x, y, 'o');
plt.plot(x, y_fit, '-');

The fitted sigmoid appears to be quite close to the original sigmoid used for data generation.
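This visual impression can be quantified, for example with the root-mean-square deviation between the fitted curve and the true generating curve. A self-contained sketch, reusing the names from this recipe:

```python
import numpy as np
import scipy.optimize as opt

def f(x, a, b, c, d):
    return a / (1. + np.exp(-c * (x - d))) + b

np.random.seed(3)
a, c = np.random.exponential(size=2)
b, d = np.random.randn(2)
n = 100
x = np.linspace(-10., 10., n)
y_model = f(x, a, b, c, d)
y = y_model + a * .2 * np.random.randn(n)

(a_, b_, c_, d_), _ = opt.curve_fit(f, x, y, (a, b, c, d))
y_fit = f(x, a_, b_, c_, d_)

# RMS deviation of the fitted curve from the true generating curve;
# this should be well below the noise level in the data.
rmse = np.sqrt(np.mean((y_fit - y_model) ** 2))
```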

You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).

IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).