Table of Contents | Chapter 1.01 - Help | Chapter 2 - Classification


Chapter 1.02 - Getting Started

Machine learning is well supported on all major operating systems, thanks to analytical (...) such as Anaconda. Anaconda is an open data science platform that pre-packages many of the components required for data science (and everything needed for this book!) and can be installed on Windows, OS X and Linux.

In fact, this entire book is written using components from Anaconda, including Python and R and their machine learning packages, as well as the super helpful Jupyter Notebook, which allows for structured and repeatable machine learning.

Installation

To install Anaconda, follow the instructions for your operating system below:

  • Download and install Anaconda 4.3.0 64-bit from here: Windows, OS X, Linux
  • Update scikit-learn to version 0.18.1: conda install -c anaconda scikit-learn=0.18.1
  • Install Keras (and TensorFlow): pip install keras==2.0.0
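  • Install Spark and launch Jupyter (these commands use Homebrew, so they apply to OS X):
# Update RubyGems, then install Spark with Homebrew
sudo gem update --system
brew install apache-spark

# Point the shell at the Homebrew Spark install (adjust the version to match yours)
export SPARK_HOME="/usr/local/Cellar/apache-spark/2.1.0/libexec"
export PATH="$PATH:$SPARK_HOME/bin"
# Run Spark locally with two worker threads
export PYSPARK_SUBMIT_ARGS="--master local[2]"
# Make PySpark and its bundled py4j importable from Python; the py4j file name
# must match the one in $SPARK_HOME/python/lib (0.10.4 for Spark 2.1.0)
export PYTHONPATH="$SPARK_HOME/python/:$PYTHONPATH"
export PYTHONPATH="$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH"
# Have the pyspark launcher start a Jupyter notebook as the driver
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

pyspark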
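
Once the steps above have finished, a quick sanity check can confirm that the pinned packages and the Spark settings are in place. The following is a minimal sketch, not part of the original instructions: the helper names (installed_version, check_versions, missing_spark_variables) are hypothetical, and it assumes each package exposes a __version__ attribute, which scikit-learn and Keras do.

```python
import importlib
import os

# Versions pinned in the installation steps (import name -> version)
EXPECTED_VERSIONS = {"sklearn": "0.18.1", "keras": "2.0.0"}

# Variables exported before launching pyspark
SPARK_VARIABLES = (
    "SPARK_HOME",
    "PYSPARK_DRIVER_PYTHON",
    "PYSPARK_DRIVER_PYTHON_OPTS",
)


def installed_version(module_name):
    """Return a module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def check_versions(expected=EXPECTED_VERSIONS):
    """Map each module name to (installed version, matches expected?)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        report[name] = (found, found == wanted)
    return report


def missing_spark_variables(env=None):
    """List the Spark-related variables that are not set in env."""
    env = os.environ if env is None else env
    return [name for name in SPARK_VARIABLES if name not in env]
```

Calling check_versions() in a notebook cell returns a dictionary keyed by module name; any entry whose second value is False points at a package that is missing or installed at a different version. missing_spark_variables() returns an empty list when all three exports are visible to the notebook process.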

Running Jupyter

If you've downloaded a copy of this book as Jupyter notebooks, you can start Jupyter by running jupyter notebook from a terminal. This will open a new browser window with the Jupyter file browser, and you can select the directory containing your notebooks from there.
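
If the notebooks live in a known folder, Jupyter can also be started directly in that folder instead of navigating to it in the file browser. A minimal sketch, assuming the jupyter command is on the PATH and using its --notebook-dir option; the helper names and any path passed in are hypothetical:

```python
import subprocess
from pathlib import Path


def notebook_command(notebook_dir):
    """Build the command that starts Jupyter in the given directory."""
    path = Path(notebook_dir).expanduser()
    return ["jupyter", "notebook", "--notebook-dir={}".format(path)]


def launch_notebook(notebook_dir):
    """Start the notebook server; blocks until the server is stopped."""
    return subprocess.call(notebook_command(notebook_dir))
```

For example, launch_notebook("~/ml-book") would start the server with a hypothetical ~/ml-book folder as the root of the file browser.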

