Lab 1: Linear Regression

In this practice session, you will train a linear regression model using the gradient descent method. After the learning phase, your model should predict house prices in the Ile-de-France region given their areas (in m²) and their numbers of rooms.

We will also enhance the performance of the learning algorithm using implementation techniques such as vectorization and feature normalization.

Import libraries and load data

Import the numpy library, which supports matrix operations, and the matplotlib library for plotting data.

In [ ]:
 

Question 1

The house.csv file contains $d=3$ columns that represent the area, the number of rooms and the price of $n=600$ houses (one per row).

  • Open this file with a text editor to get a better understanding of the data.
  • Load the data and check its size.

Hint: You could use the loadtxt function from the numpy library.
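
As a minimal sketch of the hint, `np.loadtxt` can parse a delimited text source into a 2-D array. The three rows below are made-up values, and the comma delimiter is an assumption about how house.csv is formatted; with the real file you would pass its path instead of the `StringIO` object.

```python
import io
import numpy as np

# Made-up stand-in for house.csv: area (m²), number of rooms, price.
# The comma delimiter is an assumption about the real file's format.
fake_csv = io.StringIO("50,2,200000\n80,3,310000\n120,5,450000\n")

data = np.loadtxt(fake_csv, delimiter=",")
print(data.shape)  # (number of rows, number of columns)
```

On the real file, the shape should match the $n$ and $d$ given above.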

In [ ]:
 

Question 2

  • Extract the house area and price columns into the lists $X$ and $y$, respectively.
  • Draw a scatter plot of prices against areas.
In [ ]:
 

Question 3

The cost function used to train this linear model is the mean squared error (MSE), defined by $$L(w) = MSE(X \cdot w, y) = \frac{1}{2n} \sum_{i=1}^{n}{( x_i \cdot w - y_i)^2}$$ First of all, transform $X$ and $y$ into two numpy arrays of shape $(n,1)$.
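
One possible way to obtain column arrays is `reshape(-1, 1)`. The sketch below uses hypothetical values rather than the real data.

```python
import numpy as np

# Hypothetical areas and prices, standing in for the columns of house.csv.
X = np.array([50.0, 80.0, 120.0])
y = np.array([200000.0, 310000.0, 450000.0])

# reshape(-1, 1) keeps the n values and arranges them as a single column.
X = X.reshape(-1, 1)
y = y.reshape(-1, 1)
print(X.shape, y.shape)
```

Keeping both arrays in shape $(n,1)$ matters because subtracting an $(n,)$ array from an $(n,1)$ array broadcasts to an $(n,n)$ matrix, which silently gives a wrong cost.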

In [ ]:
 

Question 4

Implement the mean squared error cost function. Then implement its gradient: $$\nabla L(w) = \partial_{w} MSE(X \cdot w, y) = \frac{1}{n} \sum_{i=1}^{n}{(x_i \cdot w - y_i)~x_i}$$
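
A vectorized sketch of both formulas is shown below, assuming $X$ has shape $(n,d)$ and $w$ has shape $(d,1)$. The function names `mse` and `mse_gradient` and the toy data are illustrative choices, not part of the lab material.

```python
import numpy as np

def mse(w, X, y):
    """L(w) = 1/(2n) * sum_i (x_i . w - y_i)^2."""
    n = X.shape[0]
    residual = X @ w - y
    return float(np.sum(residual ** 2) / (2 * n))

def mse_gradient(w, X, y):
    """grad L(w) = 1/n * sum_i (x_i . w - y_i) x_i, written as a matrix product."""
    n = X.shape[0]
    return X.T @ (X @ w - y) / n

# Toy data where y = 2 * x exactly, so the minimum is at w = 2.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([[2.0], [4.0], [6.0]])
w = np.array([[1.0]])

print(mse(w, X, y))           # 14/6: residuals are -1, -2, -3
print(mse_gradient(w, X, y))  # -14/3, as a (1, 1) array
```

The matrix product `X.T @ (X @ w - y)` computes the sum over $i$ in one call, which is the vectorization mentioned in the introduction.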

In [ ]:
 

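Note: You can check your result against mean_squared_error (for example the one in sklearn.metrics), keeping in mind that the definition above includes an extra factor $\frac{1}{2}$.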

Question 5

The update equation of the gradient descent algorithm is given by: $$w^{(t+1)}=w^{(t)}-\alpha \nabla L(w^{(t)}) $$ where $\alpha$ is the step size, also called the learning rate.

  • Compute three steps of the gradient descent with different values of $\alpha$.
    • Hint: To select a good value of $\alpha$, start with small values and increase them progressively, then keep the largest value of $\alpha$ for which the algorithm does not diverge. A rule of thumb is to increase $\alpha$ by a factor of about 3 each time, for example $\alpha=0.0001,~0.0003,~0.001,~0.003,~0.01,~0.03,~0.1 \dots$
  • For each $\alpha$, plot $w \to L(w)$, your initial point $(w_0, L(w_0))$ and the three points associated with the $w_1$, $w_2$ and $w_3$ you have just computed.
  • Which $\alpha$ is the best, in your opinion?
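
The effect of $\alpha$ can be sketched on toy data without any plotting. The data, the helper name `three_steps` and the values of $\alpha$ below are illustrative assumptions; the real areas are much larger than these toy inputs, so suitable learning rates for the lab data will differ.

```python
import numpy as np

# Toy data where y = 2 * x exactly, so the minimum is at w = 2.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([[2.0], [4.0], [6.0]])
n = X.shape[0]

def loss(w):
    return float(np.sum((X @ w - y) ** 2) / (2 * n))

def grad(w):
    return X.T @ (X @ w - y) / n

def three_steps(alpha, w0=0.0):
    """Return [L(w_0), L(w_1), L(w_2), L(w_3)] for a given learning rate."""
    w = np.array([[w0]])
    losses = [loss(w)]
    for _ in range(3):
        w = w - alpha * grad(w)  # update rule w <- w - alpha * grad L(w)
        losses.append(loss(w))
    return losses

for alpha in (0.01, 0.1, 1.0):
    print(alpha, three_steps(alpha))
```

On this toy problem the loss shrinks slowly for the smallest value, faster for the middle one, and grows at every step for the largest one, which is the divergence the hint warns about.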
In [ ]:
 

Question 6

Implement the gradient descent algorithm, taking as input the gradient function, the learning rate, one or several stopping criteria and an initial parameter $w_0$.
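
One possible shape for such a function is sketched below. The two stopping criteria (a maximum number of iterations and a threshold on the gradient norm), the parameter names and the default values are assumptions made for illustration; other criteria, such as the change in the loss, would work as well.

```python
import numpy as np

def gradient_descent(grad, w0, alpha, max_iter=1000, tol=1e-8):
    """Iterate w <- w - alpha * grad(w) until a stopping criterion is met.

    Stops after max_iter updates, or earlier once the gradient norm
    falls below tol.
    """
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iter):
        g = grad(w)
        if np.linalg.norm(g) < tol:
            break
        w = w - alpha * g
    return w

# Toy usage: y = 2 * x exactly, so the minimizer is w = 2.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([[2.0], [4.0], [6.0]])
mse_grad = lambda w: X.T @ (X @ w - y) / X.shape[0]

w_star = gradient_descent(mse_grad, np.zeros((1, 1)), alpha=0.1)
print(w_star)
```

Passing the gradient as a function keeps the optimizer independent of the cost, so the same code can be reused once the bias term and extra features are added.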

In [ ]:
 

Question 7

Compute the optimal parameter $w^*$. On the same figure, plot the actual prices and the prices predicted by the linear model against the areas.

In [ ]:
 

Question 8

Introduce the bias term.
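
A common way to do this, sketched here under the assumption that $X$ is an $(n,1)$ array, is to prepend a column of ones so that the bias becomes one more component of $w$. The values below are hypothetical.

```python
import numpy as np

# Hypothetical areas in an (n, 1) array.
X = np.array([[50.0], [80.0], [120.0]])

# Prepend a column of ones: each row becomes [1, area].
X_b = np.hstack([np.ones((X.shape[0], 1)), X])

# The parameter now has two components: [bias, slope].
w = np.array([[10.0], [2.0]])
print(X_b @ w)  # each prediction is 10 + 2 * area
```

With this layout the cost and gradient formulas stay unchanged; only the shapes of $X$ and $w$ grow by one column and one row.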

In [ ]:
 

Question 9

Same as question 7. Pay attention to what should happen to $w_0$ and $\alpha$.

In [ ]:
 

Question 10

Add the number of rooms to $X$ and normalize it. Then repeat question 9 and comment on the results.
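
The lab does not specify which normalization to use. The sketch below assumes standardization (subtract the column mean, divide by the column standard deviation), which is one common choice, and uses hypothetical values.

```python
import numpy as np

# Hypothetical feature matrix: one row per house, columns [area, rooms].
features = np.array([[50.0, 2.0],
                     [80.0, 3.0],
                     [120.0, 5.0]])

# Column-wise statistics (axis=0 runs over the houses).
mu = features.mean(axis=0)
sigma = features.std(axis=0)

# Broadcasting applies each column's mean and standard deviation to that column.
features_norm = (features - mu) / sigma
print(features_norm.mean(axis=0), features_norm.std(axis=0))
```

The same `mu` and `sigma` should be kept and applied to any new house before predicting its price, since the learned $w$ only makes sense on the normalized scale.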

In [ ]:
 

Question 11

You could also try adding other feature columns to the matrix $X$, such as $area^2$ or $area^{0.5}$, and observe the effect on the model and the error. Is the model still linear?
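
Building such columns can be sketched as follows, with hypothetical areas. The name `X_poly` is an illustrative choice.

```python
import numpy as np

# Hypothetical areas in an (n, 1) array.
area = np.array([[49.0], [81.0], [121.0]])

# Stack the original column with its square and its square root.
X_poly = np.hstack([area, area ** 2, np.sqrt(area)])
print(X_poly.shape)
```

Because these columns have very different scales, normalizing them before running gradient descent is likely to matter even more than with the raw features.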

In [ ]:
 

Question 12

Create a LinearRegression class object.
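
A minimal sketch of such a class is given below. The `fit` and `predict` method names mirror a common convention and are an assumption, as are the constructor parameters and their defaults; the sketch adds the bias column internally and leaves out normalization for brevity.

```python
import numpy as np

class LinearRegression:
    """Linear model with a bias term, trained by batch gradient descent."""

    def __init__(self, alpha=0.1, n_iter=5000):
        self.alpha = alpha    # learning rate
        self.n_iter = n_iter  # fixed number of gradient steps
        self.w = None         # parameters, set by fit()

    @staticmethod
    def _add_bias(X):
        # Prepend a column of ones so that w[0] plays the role of the bias.
        return np.hstack([np.ones((X.shape[0], 1)), X])

    def fit(self, X, y):
        X_b = self._add_bias(X)
        n, d = X_b.shape
        self.w = np.zeros((d, 1))
        for _ in range(self.n_iter):
            grad = X_b.T @ (X_b @ self.w - y) / n
            self.w = self.w - self.alpha * grad
        return self

    def predict(self, X):
        return self._add_bias(X) @ self.w

# Toy usage: y = 1 + 2 * x exactly.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 1.0 + 2.0 * X
model = LinearRegression().fit(X, y)
print(model.w.ravel(), model.predict(np.array([[4.0]])))
```

Storing the learning rate and iteration count on the object keeps `fit` and `predict` free of extra arguments, which makes a side-by-side comparison with another implementation easier.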

In [ ]:
 

Question 13

Compare your results with $\verb!sklearn!$'s LinearRegression.

Question 14

How much would a $330~m^2$ flat with $5$ rooms cost?

In [ ]: