In this assignment we implement and improve the bag-of-vectors classifier discussed in the class notes. This is a first step towards using neural network models.
import numpy as np
from a1 import data, featurize
from a1.dense_classifiers import SoftmaxClassifier
from a1.bag_vectors import BagOfVectors
# %load_ext autoreload
# %autoreload 2
train_data, test_data = data.polarity(verbose=True)
def x_list(data):
return [ example['x'].split(' ') for example in data]
def y_array(data):
return np.array([0 if example['y']=='neg' else 1 for example in data])
train_sents = x_list(train_data)
bag_vectors = BagOfVectors(train_sents, max_vocab=10000, dim=100)
X_train = bag_vectors.data_matrix(x_list(train_data))
X_test = bag_vectors.data_matrix(x_list(test_data))
y_train = y_array(train_data)
y_test = y_array(test_data)
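The BagOfVectors class is provided in a1.bag_vectors, so its internals are not shown here. As a rough mental model (an assumption, not the a1 implementation), each sentence can be represented as the average of fixed random vectors for its words; a minimal sketch:

```python
import numpy as np

# Illustrative sketch only: the real BagOfVectors lives in a1.bag_vectors,
# and its vocab handling may differ. Here each word gets a fixed random
# vector, and a sentence is the mean of its words' vectors.
rng = np.random.default_rng(0)
vocab = {"good": 0, "bad": 1, "movie": 2}
E = rng.normal(size=(len(vocab), 4))  # one random vector per word, dim=4

def sentence_vector(tokens):
    ids = [vocab[t] for t in tokens if t in vocab]
    if not ids:                          # no known words: zero vector
        return np.zeros(E.shape[1])
    return E[ids].mean(axis=0)           # average the word vectors

X = np.stack([sentence_vector(s)
              for s in [["good", "movie"], ["bad", "movie"]]])
```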
Implement softmax_loss_vectorized, DenseLinearClassifier.train, and DenseLinearClassifier.predict in dense_classifiers.py.
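The exact signature expected in dense_classifiers.py is defined by the assignment skeleton; as a sketch under that caveat, a fully vectorized softmax loss with gradient typically looks like this:

```python
import numpy as np

def softmax_loss_vectorized(W, X, y, reg=0.0):
    """Sketch of a vectorized softmax loss; the signature in
    dense_classifiers.py may differ."""
    scores = X @ W                                  # (N, C) class scores
    scores -= scores.max(axis=1, keepdims=True)     # for numerical stability
    exp = np.exp(scores)
    probs = exp / exp.sum(axis=1, keepdims=True)    # row-wise softmax
    N = X.shape[0]
    # Mean cross-entropy of the correct classes, plus L2 regularization.
    loss = -np.log(probs[np.arange(N), y]).mean() + reg * np.sum(W * W)
    dscores = probs.copy()
    dscores[np.arange(N), y] -= 1                   # softmax gradient
    dW = X.T @ dscores / N + 2 * reg * W
    return loss, dW
```

Subtracting the row max before exponentiating leaves the softmax unchanged but avoids overflow for large scores.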
Train and test using the above bag-of-vectors data.
Implement train, predict, and loss in neural_net.py.
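The layer sizes and parameter layout in neural_net.py are set by the skeleton; purely as an illustration (the names and shapes here are assumptions), the forward pass and loss of a one-hidden-layer net might look like:

```python
import numpy as np

def two_layer_loss(params, X, y):
    # Sketch of a one-hidden-layer network with ReLU; the actual interface
    # in neural_net.py may differ.
    W1, b1, W2, b2 = params
    h = np.maximum(0, X @ W1 + b1)                  # hidden layer, ReLU
    scores = h @ W2 + b2                            # (N, C) class scores
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    N = X.shape[0]
    return -np.log(probs[np.arange(N), y]).mean()   # mean cross-entropy
```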
One problem is that our random word vectors are underfitting. One way to improve this is to backpropagate into these random word vectors and modify them. In this part, make new classifiers, analogous to the linear classifier and the neural network classifier, that also update the word vectors via backpropagation.
Hint: these models should also have a BagOfVectors
as an instance state in addition to the regular model parameters.
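Since a sentence vector here is the mean of its word vectors, the gradient with respect to each word's embedding is the upstream gradient scaled by 1/length, scatter-added back into the embedding matrix. A minimal sketch of that backward step (names are illustrative, not the a1 API):

```python
import numpy as np

# Forward: average the word vectors of one sentence.
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 3))          # embedding matrix (vocab=5, dim=3)
ids = np.array([0, 2, 2])            # token ids of one sentence
v = E[ids].mean(axis=0)              # averaged sentence vector

# Backward: distribute the upstream gradient dL/dv evenly over the words.
dv = np.ones(3)                      # pretend upstream gradient dL/dv
dE = np.zeros_like(E)
np.add.at(dE, ids, dv / len(ids))    # scatter-add handles repeated tokens
```

np.add.at is used instead of `dE[ids] += ...` because fancy-index assignment would not accumulate contributions from a token that appears more than once in the sentence.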
Describe some mistakes made by your model, and inspect/visualize the weights. Comment on any issues with the model based on your concrete observations.