__author__ = "Christopher Potts"
__version__ = "CS224u, Stanford, Spring 2018 term"
The goal of this in-class bake-off is to achieve the highest average F1 score on the SST development set, with the binary class function.
The only restriction: you cannot make any use of the subtree labels.
Here, "average F1 score" means the f1-score in the avg / total row of the classification report.

Submission URL: https://docs.google.com/forms/d/1R41Zxxils7lOPzuThMdv2p1TKmFEy8c0DyUg-YkzTa0/edit
You don't have to use the experimental framework defined below (based on sst). However, if you don't use sst.experiment as below, then make sure you're training only on train, evaluating on dev, and that you report with

from sklearn.metrics import classification_report

classification_report(y_dev, predictions)

where

y_dev = [y for tree, y in sst.dev_reader(class_func=sst.binary_class_func)]
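If you do write your own harness, the full protocol looks roughly like the sketch below. Note that phi and model are hypothetical stand-ins for your own feature function and classifier (assumed to follow the scikit-learn fit/predict convention), and the sketch glosses over feature vectorization, which sst.experiment otherwise handles for you:

from sklearn.metrics import classification_report

import sst

train = list(sst.train_reader(class_func=sst.binary_class_func))
dev = list(sst.dev_reader(class_func=sst.binary_class_func))

# Featurize with your own `phi`; train on train only:
X_train = [phi(tree) for tree, y in train]
y_train = [y for tree, y in train]
model.fit(X_train, y_train)

# Evaluate on dev only:
X_dev = [phi(tree) for tree, y in dev]
y_dev = [y for tree, y in dev]
predictions = model.predict(X_dev)
print(classification_report(y_dev, predictions))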
See the first notebook in this unit for set-up instructions.
from collections import Counter
from rnn_classifier import RNNClassifier
from sklearn.linear_model import LogisticRegression
import sst
import tensorflow as tf
from tf_rnn_classifier import TfRNNClassifier
from tree_nn import TreeNN
def unigrams_phi(tree):
"""The basis for a unigrams feature function.
Parameters
----------
    tree : nltk.Tree
The tree to represent.
Returns
-------
    Counter
A map from strings to their counts in `tree`. (Counter maps a
list to a dict of counts of the elements in that list.)
"""
return Counter(tree.leaves())
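For concreteness, here's what this featurizer returns on a toy SST-style tree (the words and node labels are invented for illustration):

from nltk.tree import Tree

toy = Tree.fromstring("(4 (2 NLU) (4 (2 is) (4 fun)))")
unigrams_phi(toy)  # Counter({'NLU': 1, 'is': 1, 'fun': 1})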
def fit_maxent_classifier(X, y):
mod = LogisticRegression(fit_intercept=True)
mod.fit(X, y)
return mod
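Since the model-fitting step is yours to design, one easy improvement to try is cross-validating the regularization settings rather than accepting scikit-learn's defaults. A sketch (the function name and grid values here are assumptions, not part of the bake-off code):

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

def fit_maxent_with_crossvalidation(X, y):
    # liblinear supports both l1 and l2 penalties:
    basemod = LogisticRegression(fit_intercept=True, solver='liblinear')
    param_grid = {
        'C': [0.4, 0.6, 0.8, 1.0, 2.0],
        'penalty': ['l1', 'l2']}
    crossvalidator = GridSearchCV(
        basemod, param_grid, cv=5, scoring='f1_macro')
    crossvalidator.fit(X, y)
    return crossvalidator.best_estimator_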
_ = sst.experiment(
unigrams_phi, # Free to write your own!
fit_maxent_classifier, # Free to write your own!
train_reader=sst.train_reader, # Fixed by the competition.
assess_reader=sst.dev_reader, # Fixed.
class_func=sst.binary_class_func) # Fixed.
Accuracy: 0.772
             precision    recall  f1-score   support

   negative      0.783     0.741     0.761       428
   positive      0.762     0.802     0.782       444

avg / total      0.772     0.772     0.772       872
By the way, with some informal hyperparameter search on a GPU machine, I found this model
tf_rnn_glove = TfRNNClassifier(
sst_glove_vocab,
embedding=glove_embedding, ## 100d version
hidden_dim=300,
max_length=52,
hidden_activation=tf.nn.relu,
cell_class=tf.nn.rnn_cell.LSTMCell,
train_embedding=True,
max_iter=5000,
batch_size=1028,
eta=0.001)
which finished with almost identical performance to the above:
             precision    recall  f1-score   support

   negative       0.78      0.75      0.76       428
   positive       0.77      0.80      0.78       444

avg / total       0.77      0.77      0.77       872
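The sst_glove_vocab and glove_embedding inputs above are not constructed in this notebook. Here is one plausible reconstruction, assuming the course's utils.glove2dict and utils.randvec helpers and a local copy of the 100d GloVe file (the path is a placeholder):

import os
import numpy as np
import utils

glove_lookup = utils.glove2dict(
    os.path.join('vsmdata', 'glove.6B', 'glove.6B.100d.txt'))

# Keep only the GloVe words that occur in the SST training data:
train_words = [tree.leaves() for tree, y
               in sst.train_reader(class_func=sst.binary_class_func)]
sst_train_vocab = sst.get_vocab(train_words)
sst_glove_vocab = sorted(set(glove_lookup) & set(sst_train_vocab))

glove_embedding = np.array([glove_lookup[w] for w in sst_glove_vocab])

# Add $UNK with a random representation:
sst_glove_vocab.append("$UNK")
glove_embedding = np.vstack(
    (glove_embedding, utils.randvec(glove_embedding.shape[1])))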
def rnn_phi(tree):
return tree.leaves()
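Continuing the toy tree from the unigrams example, this just returns the flat list of leaves, which is what the sequence models consume:

rnn_phi(toy)  # ['NLU', 'is', 'fun']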
def fit_tf_rnn_classifier(X, y):
vocab = sst.get_vocab(X, n_words=3000)
mod = TfRNNClassifier(
vocab,
eta=0.05,
batch_size=2048,
embed_dim=50,
hidden_dim=50,
max_length=52,
max_iter=500,
cell_class=tf.nn.rnn_cell.LSTMCell,
hidden_activation=tf.nn.tanh,
train_embedding=True)
mod.fit(X, y)
return mod
_ = sst.experiment(
rnn_phi,
fit_tf_rnn_classifier,
vectorize=False, # For deep learning, use `vectorize=False`.
assess_reader=sst.dev_reader)
Iteration 500: loss: 2.5404394865036012
Accuracy: 0.615
             precision    recall  f1-score   support

   negative      0.571     0.869     0.689       428
   positive      0.745     0.369     0.494       444

avg / total      0.660     0.615     0.590       872
def tree_phi(tree):
return tree
def fit_tree_nn_classifier(X, y):
vocab = sst.get_vocab(X, n_words=3000)
mod = TreeNN(
vocab,
embed_dim=100,
max_iter=100)
mod.fit(X, y)
return mod
_ = sst.experiment(
    tree_phi,  # TreeNN consumes the trees themselves, not just their leaves.
fit_tree_nn_classifier,
vectorize=False, # For deep learning, use `vectorize=False`.
assess_reader=sst.dev_reader)
Finished epoch 100 of 100; error is 0.8351342778738807
Accuracy: 0.510
             precision    recall  f1-score   support

   negative      0.501     0.498     0.499       428
   positive      0.519     0.523     0.521       444

avg / total      0.510     0.510     0.510       872