Experiments in language analysis of Hegel's "Phenomenology of Spirit"

I have long dreamt of visualizing complex works of literature in the style of the XKCD movie narrative charts (which have inspired scientific projects). I previously ran some basic experiments on the Illuminatus Trilogy using NLTK, and now that we have been reading the Phenomenology of Spirit, it is an excellent candidate for illumination through analysis.

Environment and initialization

I intend to use Jupyter (IPython Notebook) and NLTK. Installing a functioning Python environment is always messy; the recommendation is to install as little as possible as root and to place all your dependencies in a virtualenv (or Docker container).

apt-get install python3-virtualenv w3m wget
mkdir -p venv
virtualenv --python python3 venv
source venv/bin/activate
pip install -r requirements.txt

I won't add any possibly copyrighted source material here. Fetch the texts into the de/ and en/ directories with:

./initialize.sh
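
Once the fetch script has run, it can be useful to confirm that the de/ and en/ directories actually contain files before starting the analysis. The following is only a minimal sketch using the standard library; the helper name list_corpus_files is hypothetical, and it assumes nothing about the file names or formats that initialize.sh produces.

```python
from pathlib import Path


def list_corpus_files(base_dir=".", languages=("de", "en")):
    """Map each language directory to the sorted names of the files inside it.

    Hypothetical helper: the directory names de/ and en/ come from the text
    above, everything else here is an assumption for illustration.
    """
    found = {}
    for lang in languages:
        lang_dir = Path(base_dir) / lang
        if lang_dir.is_dir():
            found[lang] = sorted(p.name for p in lang_dir.iterdir() if p.is_file())
        else:
            # A missing directory is reported as an empty list, not an error.
            found[lang] = []
    return found


if __name__ == "__main__":
    for lang, names in list_corpus_files().items():
        print(lang, len(names), "file(s)")
```

An empty list for either language suggests the fetch step did not complete and should be rerun before importing anything into NLTK.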
In [4]:
import nltk