In this notebook we will cover the following topics: loading the predefined networks ("applications") that ship with Keras, training one of them (InceptionV3) from scratch on CIFAR-10, and retraining a network starting from pretrained ImageNet weights.
Although we have made our own network from scratch to learn about deep learning components, in practice you will often want to use a standard network that has been found by deep learning researchers to be successful.
Keras comes with several popular networks already defined, and can even load them with weights pretrained on standard datasets. Keras calls these premade networks "applications". Many popular networks are included, like:

- VGG16 and VGG19
- ResNet50
- InceptionV3 and Xception
- MobileNet
- DenseNet
- NASNet
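Any of these can be built with a single constructor call. As a quick standalone sketch (the network chosen here is just an example and is not used later in this notebook), loading one with pretrained weights looks like this:
import keras
# Build a ResNet50 classifier with ImageNet weights; the weights file is
# downloaded and cached automatically the first time it is used.
resnet = keras.applications.ResNet50(weights='imagenet')
resnet.summary()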
Let's try out the InceptionV3 network, which is a popular image recognition network.
import numpy as np
np.warnings.filterwarnings('ignore') # Hide np.floating warning
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D, Input, Lambda
# Prevent TensorFlow from grabbing all the GPU memory
import tensorflow as tf
gpu_devices = tf.config.experimental.list_physical_devices('GPU')
for device in gpu_devices:
tf.config.experimental.set_memory_growth(device, True)
import holoviews as hv
hv.extension('bokeh')
Same data preparation as before.
from keras.datasets import cifar10
import keras.utils
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Save an unmodified copy of y_test for later, flattened to one column
y_test_true = y_test[:,0].copy()
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
# The data only has numeric categories so we also have the string labels below
cifar10_labels = np.array(['airplane', 'automobile', 'bird', 'cat', 'deer',
'dog', 'frog', 'horse', 'ship', 'truck'])
When we load a network, we have a number of options we can set. Some of the more important ones are:

- `input_shape`: Pretrained networks assume a particular image input size. If your data is not this shape, Keras will allow you to set it here, but some models have limitations. InceptionV3 cannot go below 75 by 75.
- `weights`: What weights to load with the model, either `None` for random weights or `'imagenet'`, which loads weights from training on the ImageNet dataset. We will try the pretrained weights in a later section.
- `include_top`: Whether to include the dense layers at the end of the network. If you are loading pre-trained weights, you will likely need to replace the top layer with your own.
- `classes`: Number of classes to output. Needed if `include_top` is True and `weights` is None.

For our first attempt, we'll start training the model from scratch. Because we are dealing with such small images, we'll need to build a custom first layer to rescale the images up by a factor of 3. We can do this using a `Lambda` layer, which lets us call backend (TensorFlow in this case) tensor manipulation functions. Keras provides a `resize_images` function which will scale up the images.
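If you want to sanity-check the upscaling before wiring it into a model, a quick standalone call (assuming the imports and data from the cells above) shows the shape change:
from keras import backend as K
# Nearest-neighbour upscaling of a few training images: (4, 32, 32, 3) -> (4, 96, 96, 3)
sample = tf.constant(x_train[:4])
resized = K.resize_images(sample, 3, 3, 'channels_last', interpolation='nearest')
print(resized.shape)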
We also use the "functional" API of Keras here, where we connect one layer to the next by treating each layer like a function and passing the preceding layer to it.
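As a tiny standalone sketch of that style (separate from the model we build next), each layer is called on the tensor produced by the previous one, and the endpoints are handed to `Model`:
from keras.layers import Input, Dense
from keras.models import Model
# A throwaway two-layer network, just to show the functional pattern
inputs = Input(shape=(32,))
hidden = Dense(16, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(hidden)
tiny_model = Model(inputs=inputs, outputs=outputs)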
from keras import backend as K
from keras.layers import Input, Lambda, GlobalAveragePooling2D
from keras.models import Model
# Rescale input from 32x32 to 96x96
input_layer = Input(shape=(32,32,3), dtype=np.float32)
resize_layer = Lambda(lambda x: K.resize_images(x, 3, 3, 'channels_last', interpolation='nearest'))(input_layer)
# Load InceptionV3 with random initial weights
inception = keras.applications.InceptionV3(
input_shape=(96,96,3), # must be at least 75x75
weights=None, # random weights
include_top=True,
classes=num_classes,
)(resize_layer)
model = Model(inputs=[input_layer], outputs=[inception])
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
Let's see how many parameters the model has:
model.summary()
The InceptionV3 model is significantly deeper than our toy models before. Let's see how Keras handles it.
history = model.fit(x_train, y_train,
batch_size=256,
epochs=5,
verbose=1,
validation_data=(x_test, y_test))
train_acc = hv.Curve((history.epoch, history.history['accuracy']), 'epoch', 'accuracy', label='training')
val_acc = hv.Curve((history.epoch, history.history['val_accuracy']), 'epoch', 'accuracy', label='validation')
layout = (train_acc * val_acc).redim(accuracy=dict(range=(0.0, 1.1)))
layout.opts(
hv.opts.Curve(width=400, height=300, line_width=3),
hv.opts.Overlay(legend_position='bottom_right')
)
This model seems to be training reasonably well, even with completely random starting weights. Let's try seeding the model with pretrained weights as a starting point.
Given the relatively small size of our training dataset, it can be hard to retrain a complex predefined model entirely from scratch. Let's try to retrain a model starting from the ImageNet weights:
from keras import backend as K
from keras.layers import Input, Lambda, GlobalAveragePooling2D
from keras.models import Model
# Rescale input from 32x32 to 96x96
input_layer = Input(shape=(32,32,3), dtype=np.float32)
resize_layer = Lambda(lambda x: K.resize_images(x, 3, 3, 'channels_last', interpolation='nearest'))(input_layer)
# Load InceptionV3 with imagenet weights, but removing the top dense layers
inception = keras.applications.InceptionV3(
input_shape=(96,96,3), # our scaled up dimension >= 75
weights='imagenet', # start from the ImageNet pretrained weights
include_top=False, # we are going to replace the top of the network with our own layers
)
#inception.trainable = False # uncomment this to freeze the loaded weights
# Add our own top layers to produce 10 categories, but also adding dropout to control overfitting
prediction = Flatten()(inception(resize_layer))
prediction = Dropout(0.25)(prediction)
prediction = Dense(num_classes, activation='softmax')(prediction)
model2 = Model(inputs=[input_layer], outputs=[prediction])
model2.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
model2.summary()
history2 = model2.fit(x_train, y_train,
batch_size=256,
epochs=5,
verbose=1,
validation_data=(x_test, y_test))
train_acc = hv.Curve((history2.epoch, history2.history['accuracy']), 'epoch', 'accuracy', label='training')
val_acc = hv.Curve((history2.epoch, history2.history['val_accuracy']), 'epoch', 'accuracy', label='validation')
layout = (train_acc * val_acc).redim(accuracy=dict(range=(0.0, 1.1)))
layout.opts(
hv.opts.Curve(width=400, height=300, line_width=3),
hv.opts.Overlay(legend_position='bottom_right')
)
Things to try:

- If you freeze the loaded InceptionV3 weights (uncomment `inception.trainable = False` in the cell above and rerun it), does the training still work?
- Try resizing the input images with `interpolation='bilinear'` instead of `'nearest'`.

If you screw everything up, you can use File / Revert to Checkpoint to go back to the first version of the notebook and restart the Jupyter kernel with Kernel / Restart.
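As a starting point for the first suggestion, a minimal sketch of the frozen-weights variant (reusing `input_layer`, `resize_layer`, and `inception` from the cells above; the name `model3` is just an example) might look like this:
# Freeze the pretrained Inception weights so only the new top layers train
inception.trainable = False
frozen = Flatten()(inception(resize_layer))
frozen = Dropout(0.25)(frozen)
frozen = Dense(num_classes, activation='softmax')(frozen)
model3 = Model(inputs=[input_layer], outputs=[frozen])
model3.compile(loss=keras.losses.categorical_crossentropy,
               optimizer=keras.optimizers.Adadelta(),
               metrics=['accuracy'])
model3.summary()  # the InceptionV3 parameters should now show up as non-trainable
For the second suggestion, change `interpolation='nearest'` to `interpolation='bilinear'` in the `Lambda` layer and rebuild the model.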