The course will mostly use wrappers and pre-written code, so use the usual Python methods to check what's going on under the hood.
```python
# View the docstring with ?, the source code with ??,
# or view info in Jupyter with Shift+Tab
import numpy as np

np.arange(5)   # array([0, 1, 2, 3, 4])

?np.arange     # docstring
??np.arange    # source code
```
## Running a notebook on AWS
1. Open a terminal window and `ssh` into the AWS server (the IP can be found in the EC2 console: https://us-west-2.console.aws.amazon.com/ec2/v2/home?region=us-west-2#Instances:sort=instanceId)
2. Navigate to the nbs (notebooks) directory
3. `wget` the notebook from https://github.com/fastai/courses/tree/master/deeplearning1/nbs (remember to get the RAW notebook)
4. (Optional) Use `tmux` to split the terminal window. This lets you run Jupyter in one pane while leaving another pane free to execute commands, crunch data, etc.
(I now have my own Ubuntu and Windows servers working)
Useful `tmux` commands:

- `ctrl+b ?`: List commands
- `ctrl+b "`: Split terminal horizontally
- `ctrl+b %`: Split terminal vertically
- `ctrl+b` + arrow keys: Switch between split panes
- `ctrl+b d`: Detach from this tmux session (the session will still be running)
- `tmux attach`: Rejoin the tmux session

5. Run `jupyter notebook`
6. In the browser, go to `server_ip:8888` (password: `dl_course`)
That's it!
Folder structure is very important when training/running models. As well as keeping things organized, folder structure matters to Keras: it expects each class of object to be in its own folder.
Sample structure
dogscats
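For example, the dogs-vs-cats data might be laid out like this (the exact subfolder names are my assumption; the key point is one folder per class, with a small `sample/` copy for quick experiments):

```
dogscats/
├── train/
│   ├── cats/
│   └── dogs/
├── valid/
│   ├── cats/
│   └── dogs/
└── sample/
    ├── train/
    │   ├── cats/
    │   └── dogs/
    └── valid/
        ├── cats/
        └── dogs/
```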
http://www.image-net.org/ contains images you can use to train models, but beware: these images tend to contain only the object in question; they have been sanitized. When you use a pre-trained model, you inherit all of the biases of the data it was trained on.
VGG16
: A simple neural net model that can be used to, e.g., classify images. This is built on...
Keras
: A neural net API written in Python. It can be used to produce TensorFlow or Theano code; update `keras.json` to switch the backend between the two. You can also use the file to switch between CPU and GPU processing.
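For reference, a `keras.json` along these lines selects the Theano backend (these are the Keras-1-era field names; exact keys vary by version, so treat this as a sketch):

```json
{
    "backend": "theano",
    "image_dim_ordering": "th",
    "floatx": "float32",
    "epsilon": 1e-07
}
```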
Theano
: Converts Python code into compiled GPU code (CUDA).
TensorFlow
: Like a Google version of Theano. Can run on multiple GPUs.
CUDA
: The "bottom" layer. NVIDIA's GPU-programming platform, used heavily for machine learning.
CUDNN
: The CUDA Deep Neural Network library; a part of CUDA.
NOTE: V2 of the course now uses PyTorch, Facebook's open-source machine learning library for Python.
Thinking of a neural network as a function: this kind of function can approximate the solution to any given problem to arbitrarily close accuracy, as long as you add enough parameters. (This is proven: the universal approximation theorem.)
This links to the assertion that deep learning is (or should be) applicable to almost any problem.
http://setosa.io/ev/image-kernels/
The demo multiplies each pixel value in a 3x3 neighbourhood by the corresponding kernel value, and then adds the products together.
Low output values become black and high values become white, and so you manage to detect the edges.
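A minimal sketch of that operation in NumPy (the kernel and "image" values are invented for illustration):

```python
import numpy as np

# A 3x3 edge-detecting kernel (values chosen for illustration)
kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]])

# A tiny grayscale "image": dark left half, bright right half
image = np.array([[0, 0, 9, 9],
                  [0, 0, 9, 9],
                  [0, 0, 9, 9],
                  [0, 0, 9, 9]], dtype=float)

def convolve(img, k):
    """Slide the 3x3 kernel over the image; multiply and sum at each position."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i+3, j:j+3] * k)
    return out

# Large-magnitude responses appear where the dark/bright edge is
print(convolve(image, kernel))
```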
Takes some input value and transforms it into another value in a non-linear way.
E.g., the sigmoid function, or the rectified linear unit (ReLU).
Using our linear and non-linear functions allows us to create complex shapes.
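A sketch of what stacking linear and non-linear functions looks like (weights here are random, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # The non-linearity: negative values become 0
    return np.maximum(0, x)

# Two linear layers (matrix multiplies) with a non-linearity in between
W1 = rng.normal(size=(4, 3))   # first layer: 3 inputs -> 4 hidden units
W2 = rng.normal(size=(2, 4))   # second layer: 4 hidden -> 2 outputs

x = np.array([1.0, -2.0, 0.5])
out = W2 @ relu(W1 @ x)        # linear -> non-linear -> linear
print(out.shape)               # (2,)
```

Without the non-linearity in the middle, the two matrix multiplies would collapse into a single linear function; the non-linearity is what lets the stack model complex shapes.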
Learning rate
: The step size in gradient descent.
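A toy gradient-descent loop to show the role of the step size (the function and values are made up for illustration):

```python
# Minimize f(x) = (x - 3)^2 with gradient descent
def grad(x):
    return 2 * (x - 3)     # derivative of (x - 3)^2

lr = 0.1                   # learning rate (step size)
x = 0.0
for _ in range(100):
    x -= lr * grad(x)      # step downhill, scaled by the learning rate

print(round(x, 4))         # converges near the minimum at x = 3
```

Too small a learning rate converges slowly; too large a one overshoots and can diverge.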
Images are held in 3D arrays.
The first two dimensions represent the X and Y coordinates, and the third dimension represents the RGB values.
Pre-trained model
: A model created by somebody else to solve a different problem.
When classifying between our two classes, cat and dog, the model outputs a score between 0 and 1:

- 0 = cat
- 1 = dog
- 0.5 = not sure
These numbers can then be used to view, for example, the most incorrect dog (scored close to 0, but was labelled dog) or the most dog-like dog (scored close to 1 and is a dog).
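A sketch of how those rankings might be pulled out with NumPy (the array names and values are invented for illustration):

```python
import numpy as np

preds  = np.array([0.95, 0.10, 0.80, 0.40, 0.99])  # model's P(dog) per image
labels = np.array([1,    1,    0,    0,    1])     # 1 = dog, 0 = cat

dog_idx = np.where(labels == 1)[0]                 # images labelled dog

# Most incorrect dog: labelled dog but scored closest to 0
most_incorrect_dog = dog_idx[np.argmin(preds[dog_idx])]

# Most dog-like dog: labelled dog and scored closest to 1
most_doglike_dog = dog_idx[np.argmax(preds[dog_idx])]

print(most_incorrect_dog, most_doglike_dog)        # 1 4
```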
The probabilities are returned on a log scale. On a log scale, each tick mark is the previous tick mark MULTIPLIED by a certain number. It's useful for compressing a large data range into a smaller one.
When you run too many epochs (one epoch = one pass through all of the training data, processed as a series of mini-batches), you run the risk of overfitting.
This means the network has become very good at identifying the training data, but it cannot generalize enough to identify the validation data.
One way to correct for overfitting is to create more training data.
One way to do this is to create transforms of the data. For example, rotate, zoom, and scale your training images.
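A sketch of simple transforms with NumPy (a real pipeline would use a library's augmentation tools; this just shows the idea):

```python
import numpy as np

image = np.arange(12).reshape(3, 4)   # stand-in for a training image

flipped = np.fliplr(image)            # mirror left-right
rotated = np.rot90(image)             # rotate 90 degrees

# Each transform yields a "new" training example with the same label
print(flipped.shape, rotated.shape)   # (3, 4) (4, 3)
```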
A confusion matrix shows you, for each actual class, how many examples were predicted as each class, i.e. which predictions were correct and where the mistakes went.
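A tiny sketch of building one by hand (the labels and predictions are invented):

```python
import numpy as np

actual    = np.array([0, 0, 1, 1, 1, 0])   # 0 = cat, 1 = dog
predicted = np.array([0, 1, 1, 1, 0, 0])

cm = np.zeros((2, 2), dtype=int)
for a, p in zip(actual, predicted):
    cm[a, p] += 1    # rows = actual class, columns = predicted class

print(cm)
# [[2 1]
#  [1 2]]
```

The diagonal holds the correct predictions; off-diagonal entries show which classes get confused with which.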
When we talk about fitting the model, we mean finding the best parameters for a layer in order to achieve the desired outputs using gradient descent.
When we fit the model, we get back:
Accuracy = ratio of correct predictions
Loss/cost function = the price paid for inaccuracy of predictions
E.g., if your image label is 1 and your model gives it a prediction of 0.9, the loss should be small, and vice versa.
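A sketch of that intuition using cross-entropy (log) loss, one common choice; the numbers are illustrative:

```python
import math

def log_loss(label, pred):
    # Cross-entropy loss for a single binary prediction
    return -(label * math.log(pred) + (1 - label) * math.log(1 - pred))

good = log_loss(1, 0.9)   # confident and correct -> small loss
bad  = log_loss(1, 0.1)   # confident and wrong   -> large loss
print(round(good, 3), round(bad, 3))   # 0.105 2.303
```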