Course set-up

__author__ = "Christopher Potts"
__version__ = "CS224u, Stanford, Spring 2018 term"

This notebook covers the steps you'll need to take to get set up for CS224u.

Anaconda

We recommend installing the free Anaconda Python distribution, which includes IPython, NumPy, SciPy, matplotlib, scikit-learn, NLTK, and many other useful packages. This is not required, but it's an easy way to get all these packages installed. Unless you're very comfortable with Python package management and like installing things, this is the option for you!

Please be sure that you download the Python 3 version, which currently installs Python 3.6. Although our code is largely compatible with Python 2, we're not supporting Python 2.

Once you have Anaconda installed, it makes sense to create a virtual environment for the course. In a terminal, run

conda create -n nlu python=3.6 anaconda

to create an environment called nlu.

Then, to enter the environment, run

source activate nlu

To leave it, you can just close the window, or run

source deactivate

The conda documentation has more detailed instructions on managing virtual environments with Anaconda.
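Once the environment is active, a quick check from a Python interpreter can confirm that you are running Python 3 inside it. The following is a minimal sketch, not part of the course code: describe_environment is a hypothetical helper name, and it assumes that conda exposes the active environment's name in the CONDA_DEFAULT_ENV environment variable.

```python
import os
import sys


def describe_environment(environ=os.environ, version_info=sys.version_info):
    """Report the Python version and the active conda environment, if any.

    Assumption: conda sets CONDA_DEFAULT_ENV when an environment is activated.
    """
    return {
        "python_version": "{}.{}".format(version_info[0], version_info[1]),
        "is_python3": version_info[0] == 3,
        # None means no conda environment appears to be active.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
    }


print(describe_environment())
```

If conda_env is not nlu (or whatever name you chose), the environment is probably not active in that terminal.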

The course Github repository

The core materials for the course are on Github:

https://github.com/cgpotts/cs224u

We'll be working in this repository a lot, and it will receive updates throughout the quarter, as we add new materials and correct bugs.

If you're new to git and Github, we recommend using Github's Desktop Apps. Then you just have to clone our repository and sync your local copy with the official one when there are updates.

Additional installations

Be sure to do these additional installations from inside your virtual environment for the course!

Installing the package requirements

Just run

pip install -r requirements.txt

from inside the course directory to install the core additional packages.

People who aren't using Anaconda should edit requirements.txt so that it installs all the prerequisites that come with Anaconda. For Anaconda users, there's no need to edit it or even open it.
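If you're not using Anaconda, it may help to see which of the packages mentioned earlier are already importable before you edit requirements.txt. This is a rough sketch under stated assumptions: find_missing and ANACONDA_MODULES are hypothetical names, and the list covers only the packages named at the top of this notebook, not the actual contents of requirements.txt.

```python
import importlib.util

# Import names (which can differ from pip names, e.g. sklearn for scikit-learn)
# for the packages this notebook lists as shipping with Anaconda.
ANACONDA_MODULES = ["IPython", "numpy", "scipy", "matplotlib", "sklearn", "nltk"]


def find_missing(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


missing = find_missing(ANACONDA_MODULES)
print("Missing modules:", missing if missing else "none")
```

Anything reported as missing is a candidate for adding to requirements.txt under its pip name.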

TensorFlow

The installation instructions for the TensorFlow library depend on your computing environment. They are posted at

https://www.tensorflow.org/install/

NLTK data

Anaconda comes with NLTK but not with its data distribution. To install that, open a Python interpreter and run import nltk; nltk.download(). If you decide to download the data to a different directory than the default, then you'll have to set NLTK_DATA in your shell profile. (If that doesn't make sense to you, then we recommend choosing the default download directory!)
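If you did choose a non-default directory, the sketch below shows one way to check what NLTK_DATA currently points to. It uses only the standard library; nltk_data_dirs is a hypothetical helper, and it assumes NLTK_DATA may hold several directories separated by the platform's path separator, like PATH.

```python
import os


def nltk_data_dirs(environ=os.environ):
    """Return the directories listed in NLTK_DATA (empty list if unset)."""
    raw = environ.get("NLTK_DATA", "")
    return [d for d in raw.split(os.pathsep) if d]


for directory in nltk_data_dirs():
    status = "exists" if os.path.isdir(directory) else "does not exist"
    print(directory, status)

# If NLTK is installed, show every location it will actually search.
try:
    import nltk
    print(nltk.data.path)
except ImportError:
    print("nltk is not installed in this environment")
```

An empty result from nltk_data_dirs just means the variable is unset, which is fine if you used the default download directory.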

Jupyter notebooks

The majority of the materials for this course are Jupyter notebooks, which allow you to work in a browser, mixing code and description. It's a powerful form of literate programming, and increasingly a standard for open science.

To start a notebook server, navigate to the directory where you want to work and run

jupyter notebook --port 5656

The port specification is optional.

This should launch a browser that takes you to a view of the directory you're in. You can then open notebooks for working and create new notebooks.

A major advantage of working with Anaconda is that you can switch virtual environments from inside a notebook, via the Kernel menu. If this isn't an option for you, then run this command while inside your virtual environment:

python -m ipykernel install --user --name nlu --display-name "nlu"

(If you named your environment something other than nlu, then change the --name and --display-name values.)
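To confirm that a notebook is really running on the kernel for your environment, you can inspect the interpreter from a notebook cell. This is only a heuristic sketch: env_name_from_prefix is a hypothetical helper, and it assumes conda's usual layout in which a named environment lives in a directory ending in envs/nlu, so the last component of sys.prefix is the environment name.

```python
import os
import sys


def env_name_from_prefix(prefix):
    """Return the last path component of an interpreter prefix.

    Assumption: for a named conda environment, this is the environment name.
    """
    return os.path.basename(os.path.normpath(prefix))


print("Interpreter:", sys.executable)
print("Prefix name:", env_name_from_prefix(sys.prefix))
```

If the prefix name isn't nlu, the notebook is likely using a different kernel; try selecting nlu from the Kernel menu.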

Additional discussion of Jupyter and kernels.

SippyCup

Our semantic parsing library is SippyCup. Clone this repository for local use. We'll help you get set up to use it as part of the semantic parsing unit.