Exercise 05

Neural networks

4.1 Little Red Riding Hood Network

Train a neural network to solve the Little Red Riding Hood problem in scikit-learn and Keras. Try the neural network with different inputs and report the results.


4.2 Boston House Price Prediction

In the following questions we will work with the Boston dataset. This dataset measures the influence of socioeconomic factors on the price of properties in the city of Boston. It has 506 instances, each characterized by 13 features:

  • CRIM - per capita crime rate by town
  • ZN - proportion of residential land zoned for lots over 25,000 sq.ft.
  • INDUS - proportion of non-retail business acres per town.
  • CHAS - Charles River dummy variable (1 if tract bounds river; 0 otherwise)
  • NOX - nitric oxides concentration (parts per 10 million)
  • RM - average number of rooms per dwelling
  • AGE - proportion of owner-occupied units built prior to 1940
  • DIS - weighted distances to five Boston employment centres
  • RAD - index of accessibility to radial highways
  • TAX - full-value property-tax rate per 10,000 USD
  • PTRATIO - pupil-teacher ratio by town
  • B - $1000(Bk - 0.63)^2$ where $Bk$ is the proportion of blacks by town
  • LSTAT - % lower status of the population

Output variable:

  • MEDV - Median value of owner-occupied homes in 1000's USD

Note: In this exercise we are going to predict the price of each property, which is represented by the MEDV variable. We always aim to predict MEDV, regardless of which explanatory variables we use. In some cases we will use a subset of the 13 variables listed above, and in others we will use all 13, but in no case will we change the dependent variable $y$.

  1. Load the dataset using from sklearn.datasets import load_boston. Note that load_boston was removed in scikit-learn 1.2, so this import requires an earlier version.
  2. Create a DataFrame from the .data attribute of the object returned by the Scikit-learn loading function.
  3. Set the columns of the DataFrame so that they match the .feature_names attribute of the same object.
  4. Add a new column to the DataFrame that holds the value to predict, that is, the .target attribute of the same object. The name of this column must be MEDV.
  5. Use the Pandas .describe() method to obtain summary statistics for each column.
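
The five steps above can be sketched as follows. Because load_boston is not available in recent Scikit-learn releases, this sketch does not call it. Instead it assumes a hypothetical stand-in object (toy) that exposes the same .data, .feature_names and .target attributes, filled with made-up values for two of the 13 features. The helper name bunch_to_frame is also an assumption made for illustration.

```python
from types import SimpleNamespace

import numpy as np
import pandas as pd


def bunch_to_frame(bunch, target_name="MEDV"):
    """Build a DataFrame from an object exposing .data, .feature_names and .target."""
    # Steps 2 and 3: one column per feature, named after .feature_names
    df = pd.DataFrame(bunch.data, columns=list(bunch.feature_names))
    # Step 4: append the value to predict as a new column
    df[target_name] = bunch.target
    return df


# Hypothetical stand-in for the loaded dataset (made-up values, not real data)
toy = SimpleNamespace(
    data=np.array([[0.1, 6.5], [0.2, 5.9], [0.3, 7.1]]),
    feature_names=np.array(["CRIM", "RM"]),
    target=np.array([24.0, 21.6, 34.7]),
)

df = bunch_to_frame(toy)

# Step 5: summary statistics per column
print(df.describe())
```

With the real dataset, the object returned by the loading function would take the place of toy.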

4.3 Feature analysis

Using the DataFrame generated in the previous section:

  • Filter the dataset to just these features:
    • Explanatory: 'LSTAT', 'INDUS', 'NOX', 'RM', 'AGE'
    • Dependent: 'MEDV'.
  • Generate a scatter matrix among the features mentioned above using Pandas (scatter_matrix) or Seaborn (pairplot).
    • Do you find any relationship between the features?
  • Generate the correlation matrix between these variables using numpy.corrcoef. Also include MEDV.
    • Which features are most strongly correlated?
    • BONUS: Visualize this matrix as a heat map using Pandas, Matplotlib or Seaborn.
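
As a rough illustration of the correlation step, the sketch below applies numpy.corrcoef to a small synthetic DataFrame. The data are randomly generated under an assumed relationship and are not the Boston values; only the column names are borrowed from the exercise. The point to note is rowvar=False, which tells numpy.corrcoef that each column, not each row, is a variable.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption: not the real Boston values)
rng = np.random.default_rng(0)
n = 200
rm = rng.normal(6.3, 0.7, n)
lstat = rng.uniform(2, 35, n)
medv = 9.0 * rm - 0.6 * lstat + rng.normal(0, 2.0, n)

df = pd.DataFrame({"LSTAT": lstat, "RM": rm, "MEDV": medv})
cols = ["LSTAT", "RM", "MEDV"]

# rowvar=False: columns are variables, rows are observations
corr = np.corrcoef(df[cols].to_numpy(), rowvar=False)

# Wrap the result in a labeled DataFrame so it is easier to read
corr_df = pd.DataFrame(corr, index=cols, columns=cols)
print(corr_df.round(2))
```

The labeled matrix corr_df can be passed directly to a heat map function for the bonus item.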

4.4 Modeling linear and nonlinear relationships

  • Generate two new subsets by selecting these features:
    • $D_1$: $X = \textit{'RM'}$, $y = \textit{'MEDV'}$
    • $D_2$: $X = \textit{'LSTAT'}$, $y = \textit{'MEDV'}$
  • For each subset, generate a training partition and a test partition using a ratio of $ 70 \% - 30 \% $
  • Train a linear regression model on both subsets of data:
    • Report the mean square error on the test set
    • Print the values of $ w $ and $ w_0 $ of the regression equation
    • Generate a graph that shows the line obtained by the regression model together with the training data and the test data
  • How does the model perform on $ D_1 $ and $ D_2 $? Why?
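
A minimal sketch of the split, fit and report steps for a single explanatory variable is given below. It uses synthetic data with an assumed linear relationship in place of $D_1$ or $D_2$, and the random_state values are arbitrary choices made for reproducibility. In Scikit-learn, $w$ corresponds to coef_ and $w_0$ to intercept_.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic single-feature data (assumption: stands in for one of the subsets)
rng = np.random.default_rng(0)
X = rng.uniform(4.0, 9.0, size=(200, 1))
y = 9.0 * X[:, 0] - 34.0 + rng.normal(0, 1.0, 200)

# 70 % training, 30 % test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

# Mean squared error on the held-out test partition
mse = mean_squared_error(y_test, model.predict(X_test))

print("w  =", model.coef_[0])
print("w0 =", model.intercept_)
print("test MSE =", mse)
```

The same code applies to either subset once X holds the chosen feature as a two-dimensional array with one column.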

4.5 Training a regression model

  • Generate a 70-30 partitioning of the data using all the features. (Do not include the dependent variable MEDV)
  • Train a linear regression model with the objective of predicting the output variable MEDV.
    • Report the mean square error on the test set
  • Train a regression model using MLPRegressor in order to predict the output variable MEDV.
    • Report the mean square error on the test set
  • Scale the data so that each feature has zero mean and unit variance (only $ X $). You can use the following piece of code:
from sklearn.preprocessing import StandardScaler

sc_x = StandardScaler()
# Fit on the training partition only, so test-set statistics do not leak into the scaler
sc_x.fit(X_train)
X_train_s = sc_x.transform(X_train)
X_test_s = sc_x.transform(X_test)

See the Scikit-learn documentation for more information about StandardScaler.
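
To make the scaling step concrete, the sketch below checks that StandardScaler computes $z = (x - \mu) / \sigma$ per feature, where $\mu$ and $\sigma$ are the mean and the population standard deviation of the data the scaler was fitted on. The data are synthetic, and fitting on the training partition only is one common convention assumed here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic features on very different scales (assumption: illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(loc=[300.0, 0.5], scale=[150.0, 0.1], size=(100, 2))
X_train, X_test = train_test_split(X, test_size=0.3, random_state=0)

sc_x = StandardScaler().fit(X_train)
X_train_s = sc_x.transform(X_train)
X_test_s = sc_x.transform(X_test)

# Manual version of the same transformation
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)  # population std (ddof=0), as used by StandardScaler
manual = (X_test - mu) / sigma

print(np.allclose(manual, X_test_s))
print(X_train_s.mean(axis=0), X_train_s.std(axis=0))
```

The test partition is transformed with the training statistics, so its scaled mean and variance are only approximately zero and one.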

  • Train the following models:
    1. Train a linear regression model using the scaled data.
      • Report the mean square error on the test set
    2. Train a regression model using a MultiLayer Perceptron with two hidden layers (128 neurons in the first and 512 in the second) on the scaled data.
      • Report the mean square error on the test set
    3. Which model has better performance? Why?
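
One possible way to set up the comparison is sketched below, assuming that the two layers are hidden layers, which maps to hidden_layer_sizes=(128, 512) in MLPRegressor. The data are synthetic with an assumed nonlinear target, and max_iter and random_state are arbitrary choices. The resulting numbers say nothing about the Boston dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic data: five features on different scales, nonlinear target (assumption)
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5)) * np.array([1.0, 10.0, 100.0, 0.1, 5.0])
y = 0.5 * X[:, 0] ** 2 + 0.1 * X[:, 1] + rng.normal(0, 0.1, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Scale the features using statistics from the training partition
sc_x = StandardScaler().fit(X_train)
X_train_s = sc_x.transform(X_train)
X_test_s = sc_x.transform(X_test)

linear = LinearRegression().fit(X_train_s, y_train)

# Two hidden layers: 128 neurons in the first, 512 in the second
mlp = MLPRegressor(hidden_layer_sizes=(128, 512), max_iter=500, random_state=0)
mlp.fit(X_train_s, y_train)

mse_linear = mean_squared_error(y_test, linear.predict(X_test_s))
mse_mlp = mean_squared_error(y_test, mlp.predict(X_test_s))

print("linear test MSE:", mse_linear)
print("MLP test MSE:   ", mse_mlp)
```

Scikit-learn may emit a ConvergenceWarning if the optimizer has not converged within max_iter iterations; raising max_iter is one way to address that.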