- First, let's make all the necessary imports.

In [ ]:

```
import numpy as np
import scipy.stats as st
import sklearn.linear_model as lm
import matplotlib.pyplot as plt
%matplotlib inline
```

- We now define the deterministic function underlying our generative model.

In [ ]:

```
f = lambda x: np.exp(3 * x)
```

- We generate the values along the curve on $[0, 2]$.

In [ ]:

```
x_tr = np.linspace(0., 2, 200)
y_tr = f(x_tr)
```

- We now generate our data points: we sample the function at a few positions within $[0, 1]$ and add Gaussian noise.

In [ ]:

```
x = np.array([0, .1, .2, .5, .8, .9, 1])
y = f(x) + np.random.randn(len(x))
```

- Let's plot our data points on $[0, 1]$.

In [ ]:

```
plt.figure(figsize=(6,3));
plt.plot(x_tr[:100], y_tr[:100], '--k');
plt.plot(x, y, 'ok', ms=10);
```

- Now, we use scikit-learn to fit a linear model to the data. There are three steps. First, we create the model (an instance of the `LinearRegression` class). Then, we fit the model to our data. Finally, we predict values from our trained model.

In [ ]:

```
# We create the model.
lr = lm.LinearRegression()
# We train the model on our training dataset.
lr.fit(x[:, np.newaxis], y);
# Now, we predict points with our trained model.
y_lr = lr.predict(x_tr[:, np.newaxis])
```

- Note that we need to convert `x` and `x_tr` to column vectors, as it is a general convention in scikit-learn that observations are rows, while features are columns. Here, we have seven observations with one feature.
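
To make this convention concrete, here is a minimal sketch (using the same seven points as above) of what `np.newaxis` does to the array shapes:

```python
import numpy as np

x = np.array([0, .1, .2, .5, .8, .9, 1])
print(x.shape)        # (7,): a 1D array
X = x[:, np.newaxis]  # equivalent to x.reshape(-1, 1)
print(X.shape)        # (7, 1): 7 observations (rows), 1 feature (column)
```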

- We now plot the result of the trained linear model. We obtain a regression line, in green here.

In [ ]:

```
plt.figure(figsize=(6,3));
plt.plot(x_tr, y_tr, '--k');
plt.plot(x_tr, y_lr, 'g');
plt.plot(x, y, 'ok', ms=10);
plt.xlim(0, 1);
plt.ylim(y.min()-1, y.max()+1);
plt.title("Linear regression");
```

- The linear fit is not well adapted here, since the data points were generated according to a nonlinear model (an exponential curve). Therefore, we are now going to fit a nonlinear model; more precisely, we will fit a polynomial function to our data points. We can still use linear regression for this, by precomputing the powers of our data points. This is done by generating a Vandermonde matrix with the `np.vander` function. We will explain this trick in more detail in *How it works...*.
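
To see what this precomputation produces, here is a small illustration of `np.vander` on three toy values (not our actual data points): each row holds the decreasing powers of one value.

```python
import numpy as np

# Degree 2 features: columns are x**2, x**1, x**0.
V = np.vander(np.array([1., 2., 3.]), 3)
print(V)
# [[1. 1. 1.]
#  [4. 2. 1.]
#  [9. 3. 1.]]
```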

In [ ]:

```
lrp = lm.LinearRegression()
plt.figure(figsize=(6,3));
plt.plot(x_tr, y_tr, '--k');
for deg, s in zip([2, 5], ['-', '.']):
    lrp.fit(np.vander(x, deg + 1), y);
    y_lrp = lrp.predict(np.vander(x_tr, deg + 1))
    plt.plot(x_tr, y_lrp, s, label='degree ' + str(deg));
    plt.legend(loc=2);
    plt.xlim(0, 1.4);
    plt.ylim(-10, 40);
    # Print the model's coefficients.
    print(' '.join(['%.2f' % c for c in lrp.coef_]))
plt.plot(x, y, 'ok', ms=10);
plt.title("Linear regression");
```

- The degree 5 polynomial fits the data points more closely than the degree 2 polynomial, but it is less robust away from them: this is **overfitting**. By using a model that is too complex, we obtain a better fit on the training dataset, but a less robust model outside this set.
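
The first half of this effect can be checked numerically: since the degree 5 feature space contains the degree 2 one, the higher-degree fit can never have a larger training error. A minimal sketch (with a fixed random seed, so the noisy points differ from those above):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

f = lambda x: np.exp(3 * x)
rng = np.random.RandomState(1)
x = np.array([0, .1, .2, .5, .8, .9, 1])
y = f(x) + rng.randn(len(x))

# Mean squared training error for each polynomial degree.
errors = {}
for deg in (2, 5):
    lr = LinearRegression().fit(np.vander(x, deg + 1), y)
    resid = lr.predict(np.vander(x, deg + 1)) - y
    errors[deg] = np.mean(resid ** 2)
print(errors)  # the degree 5 training error is the smaller one
```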

- We will now use a different learning model called **ridge regression**. It works like linear regression, except that it prevents the polynomial's coefficients from exploding (which is what happened in the overfitting example above). By adding a **regularization term** to the **loss function**, ridge regression imposes some structure on the underlying model. We will see more details in the next section.
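
Concretely, ridge regression minimizes $\|y - Xw\|^2 + \alpha \|w\|^2$ instead of the plain squared error, where $\alpha > 0$ controls the strength of the penalty. A minimal sketch of the shrinkage effect, using scikit-learn's `Ridge` class with an arbitrary penalty `alpha=0.1` on freshly generated noisy points:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

f = lambda x: np.exp(3 * x)
rng = np.random.RandomState(0)
x = np.array([0, .1, .2, .5, .8, .9, 1])
y = f(x) + rng.randn(len(x))

X = np.vander(x, 6)  # degree 5 polynomial features
w_lr = LinearRegression().fit(X, y).coef_
w_ridge = Ridge(alpha=0.1).fit(X, y).coef_

# The L2 penalty shrinks the coefficient vector toward zero.
print(np.linalg.norm(w_lr), np.linalg.norm(w_ridge))
```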

- We could use scikit-learn's `Ridge` class. However, scikit-learn also includes a model called `RidgeCV`, which performs a search for the regularization parameter with cross-validation. In practice, this means that we don't have to tweak the parameter by hand: scikit-learn does it for us. Since the models of scikit-learn always follow the `fit`-`predict` API, all we have to do is replace `lm.LinearRegression` with `lm.RidgeCV` in the code above. We will give more details in the next section.

In [ ]:

```
ridge = lm.RidgeCV()
plt.figure(figsize=(6,3));
plt.plot(x_tr, y_tr, '--k');
for deg, s in zip([2, 5], ['-', '.']):
    ridge.fit(np.vander(x, deg + 1), y);
    y_ridge = ridge.predict(np.vander(x_tr, deg + 1))
    plt.plot(x_tr, y_ridge, s, label='degree ' + str(deg));
    plt.legend(loc=2);
    plt.xlim(0, 1.5);
    plt.ylim(-5, 80);
    # Print the model's coefficients.
    print(' '.join(['%.2f' % c for c in ridge.coef_]))
plt.plot(x, y, 'ok', ms=10);
plt.title("Ridge regression");
```

- This time, the degree 5 polynomial seems more precise than the simpler degree 2 polynomial (which now causes **underfitting**). Ridge regression reduces the overfitting issue here.

You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).

IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).