Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

Author: Sebastian Raschka
GitHub Repository: https://github.com/rasbt/deeplearning-models

In [1]:

%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p tensorflow

Sebastian Raschka 

CPython 3.6.0
IPython 6.0.0

tensorflow 1.1.0

Model Zoo -- Saving and Loading Trained Models

from TensorFlow Checkpoint Files and NumPy NPZ Archives

This notebook demonstrates different strategies for exporting and importing trained TensorFlow models, based on a simple multilayer perceptron with two hidden layers. These include

Using regular TensorFlow meta and checkpoint files
Loading variables from NumPy archive (.npz) files

Note that the graph definition is set up so that it constructs a "rigid," non-trainable TensorFlow classifier if .npz files are provided. This is on purpose, since it may come in handy in certain use cases, but the code can easily be modified to make the model trainable when NumPy .npz files are provided, for example by wrapping the tf.constant calls in fc_layer in a tf.Variable constructor like so:

...
if weight_params is not None:
    weights = tf.Variable(tf.constant(weight_params, name='weights',
                                      dtype=tf.float32))
...

instead of

...
if weight_params is not None:
    weights = tf.constant(weight_params, name='weights',
                          dtype=tf.float32)
...

Define Multilayer Perceptron Graph

The following code cell defines wrapper functions for our convenience; they save us some retyping later when we set up the TensorFlow multilayer perceptron graphs for the trainable and non-trainable models.

In [2]:

import tensorflow as tf


##########################
### WRAPPER FUNCTIONS
##########################


def fc_layer(input_tensor, n_output_units, name,
             activation_fn=None, seed=None,
             weight_params=None, bias_params=None):

    with tf.variable_scope(name):

        if weight_params is not None:
            weights = tf.constant(weight_params, name='weights',
                                  dtype=tf.float32)
        else:
            weights = tf.Variable(
                tf.truncated_normal(
                    shape=[input_tensor.get_shape().as_list()[-1],
                           n_output_units],
                    mean=0.0,
                    stddev=0.1,
                    dtype=tf.float32,
                    seed=seed),
                name='weights')

        if bias_params is not None:
            biases = tf.constant(bias_params, name='biases', 
                                 dtype=tf.float32)

        else:
            biases = tf.Variable(tf.zeros(shape=[n_output_units]),
                                 name='biases', 
                                 dtype=tf.float32)

        act = tf.matmul(input_tensor, weights) + biases

        if activation_fn is not None:
            act = activation_fn(act)

    return act


def mlp_graph(n_input=784, n_classes=10, n_hidden_1=128, n_hidden_2=256,
              learning_rate=0.1,
              fixed_params=None):
    
    # fixed_params to allow loading weights & biases
    # from NumPy npz archives and defining a fixed, non-trainable
    # TensorFlow classifier
    if not fixed_params:
        var_names = ['fc1/weights:0', 'fc1/biases:0',
                     'fc2/weights:0', 'fc2/biases:0',
                     'logits/weights:0', 'logits/biases:0',]
        
        fixed_params = {v: None for v in var_names}
        found_params = False
    else:
        found_params = True
    
    # Input data
    tf_x = tf.placeholder(tf.float32, [None, n_input], name='features')
    tf_y = tf.placeholder(tf.int32, [None], name='targets')
    tf_y_onehot = tf.one_hot(tf_y, depth=n_classes, name='onehot_targets')

    # Multilayer perceptron
    fc1 = fc_layer(input_tensor=tf_x, 
                   n_output_units=n_hidden_1, 
                   name='fc1',
                   weight_params=fixed_params['fc1/weights:0'], 
                   bias_params=fixed_params['fc1/biases:0'],
                   activation_fn=tf.nn.relu)

    fc2 = fc_layer(input_tensor=fc1, 
                   n_output_units=n_hidden_2, 
                   name='fc2',
                   weight_params=fixed_params['fc2/weights:0'], 
                   bias_params=fixed_params['fc2/biases:0'],
                   activation_fn=tf.nn.relu)
    
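    # Note: ReLU is applied to the output layer as well, so the
    # values named "logits" are non-negative; the results recorded
    # in this notebook were obtained with this setting
    logits = fc_layer(input_tensor=fc2, 
                      n_output_units=n_classes, 
                      name='logits',
                      weight_params=fixed_params['logits/weights:0'], 
                      bias_params=fixed_params['logits/biases:0'],
                      activation_fn=tf.nn.relu)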
    
    # Loss and optimizer
    ### Only necessary if no existing params are found
    ### and a trainable graph has to be initialized
    if not found_params:
        loss = tf.nn.softmax_cross_entropy_with_logits(
            logits=logits, labels=tf_y_onehot)
        cost = tf.reduce_mean(loss, name='cost')
        optimizer = tf.train.GradientDescentOptimizer(
            learning_rate=learning_rate)
        train = optimizer.minimize(cost, name='train')

    # Prediction
    probabilities = tf.nn.softmax(logits, name='probabilities')
    labels = tf.cast(tf.argmax(logits, 1), tf.int32, name='labels')
    
    correct_prediction = tf.equal(labels, 
                                  tf_y, name='correct_predictions')
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32),
                              name='accuracy')
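
As a rough illustration of what fc_layer and the prediction ops in mlp_graph compute, the following NumPy-only sketch mirrors the forward pass outside of TensorFlow. The helper names fc_layer_np, relu, and softmax, as well as the random parameters and inputs, are assumptions made for this example; they are not part of the graph code above.

```python
import numpy as np


def fc_layer_np(x, weights, biases, activation_fn=None):
    """NumPy counterpart of fc_layer: x @ weights + biases, then activation."""
    act = x @ weights + biases
    if activation_fn is not None:
        act = activation_fn(act)
    return act


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    # Subtract the row-wise maximum for numerical stability
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


rng = np.random.RandomState(123)
n_input, n_hidden_1, n_hidden_2, n_classes = 784, 128, 256, 10

# Random stand-ins for trained parameters (illustration only),
# stored under the same names that mlp_graph looks up
params = {
    'fc1/weights:0': rng.normal(0.0, 0.1, size=(n_input, n_hidden_1)),
    'fc1/biases:0': np.zeros(n_hidden_1),
    'fc2/weights:0': rng.normal(0.0, 0.1, size=(n_hidden_1, n_hidden_2)),
    'fc2/biases:0': np.zeros(n_hidden_2),
    'logits/weights:0': rng.normal(0.0, 0.1, size=(n_hidden_2, n_classes)),
    'logits/biases:0': np.zeros(n_classes),
}

x = rng.rand(5, n_input)                 # 5 made-up input rows
y = rng.randint(0, n_classes, size=5)    # 5 made-up class labels

fc1 = fc_layer_np(x, params['fc1/weights:0'], params['fc1/biases:0'], relu)
fc2 = fc_layer_np(fc1, params['fc2/weights:0'], params['fc2/biases:0'], relu)
# mlp_graph passes tf.nn.relu for the output layer too, so we do the same
logits = fc_layer_np(fc2, params['logits/weights:0'],
                     params['logits/biases:0'], relu)

probabilities = softmax(logits)
labels = np.argmax(logits, axis=1)
accuracy = np.mean(labels == y)

print(fc1.shape, fc2.shape, logits.shape)  # (5, 128) (5, 256) (5, 10)
```

Since the parameters here are random, the accuracy value itself is meaningless; the point is only the sequence of matrix products, activations, and the argmax-based prediction.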

Train and Save Multilayer Perceptron

In [3]:

from tensorflow.examples.tutorials.mnist import input_data

##########################
### SETTINGS
##########################

# Hyperparameters
learning_rate = 0.1
training_epochs = 10
batch_size = 64

##########################
### GRAPH DEFINITION
##########################

g = tf.Graph()
with g.as_default():
    mlp_graph()

##########################
### DATASET
##########################

mnist = input_data.read_data_sets("./", one_hot=False)

##########################
### TRAINING & EVALUATION
##########################

with tf.Session(graph=g) as sess:
    sess.run(tf.global_variables_initializer())
    saver0 = tf.train.Saver()
    
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = mnist.train.num_examples // batch_size

        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            _, c = sess.run(['train', 'cost:0'], feed_dict={'features:0': batch_x,
                                                            'targets:0': batch_y})
            avg_cost += c
        
        train_acc = sess.run('accuracy:0', feed_dict={'features:0': mnist.train.images,
                                                      'targets:0': mnist.train.labels})
        valid_acc = sess.run('accuracy:0', feed_dict={'features:0': mnist.validation.images,
                                                      'targets:0': mnist.validation.labels})  
        
        print("Epoch: %03d | AvgCost: %.3f" % (epoch + 1, avg_cost / (i + 1)), end="")
        print(" | Train/Valid ACC: %.3f/%.3f" % (train_acc, valid_acc))
        
    test_acc = sess.run('accuracy:0', feed_dict={'features:0': mnist.test.images,
                                                 'targets:0': mnist.test.labels})
    print('Test ACC: %.3f' % test_acc)
    
    ##########################
    ### SAVE TRAINED MODEL
    ##########################
    saver0.save(sess, save_path='./mlp')

Extracting ./train-images-idx3-ubyte.gz
Extracting ./train-labels-idx1-ubyte.gz
Extracting ./t10k-images-idx3-ubyte.gz
Extracting ./t10k-labels-idx1-ubyte.gz
Epoch: 001 | AvgCost: 0.366 | Train/Valid ACC: 0.944/0.948
Epoch: 002 | AvgCost: 0.163 | Train/Valid ACC: 0.965/0.963
Epoch: 003 | AvgCost: 0.118 | Train/Valid ACC: 0.972/0.969
Epoch: 004 | AvgCost: 0.093 | Train/Valid ACC: 0.979/0.974
Epoch: 005 | AvgCost: 0.076 | Train/Valid ACC: 0.984/0.977
Epoch: 006 | AvgCost: 0.062 | Train/Valid ACC: 0.986/0.974
Epoch: 007 | AvgCost: 0.052 | Train/Valid ACC: 0.990/0.977
Epoch: 008 | AvgCost: 0.044 | Train/Valid ACC: 0.988/0.975
Epoch: 009 | AvgCost: 0.037 | Train/Valid ACC: 0.991/0.978
Epoch: 010 | AvgCost: 0.032 | Train/Valid ACC: 0.994/0.979
Test ACC: 0.976

Reload Model from Meta and Checkpoint Files

You can restart the notebook, and the following code cells should execute without any additional code dependencies.

In [4]:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("./", one_hot=False)

with tf.Session() as sess:
    
    saver1 = tf.train.import_meta_graph('./mlp.meta')
    saver1.restore(sess, save_path='./mlp')
    
    test_acc = sess.run('accuracy:0', feed_dict={'features:0': mnist.test.images,
                                                 'targets:0': mnist.test.labels})
    print('Test ACC: %.3f' % test_acc)

Extracting ./train-images-idx3-ubyte.gz
Extracting ./train-labels-idx1-ubyte.gz
Extracting ./t10k-images-idx3-ubyte.gz
Extracting ./t10k-labels-idx1-ubyte.gz
INFO:tensorflow:Restoring parameters from ./mlp
Test ACC: 0.976

Working with NumPy Archive Files and Creating Non-Trainable Graphs

Export Model Parameters to NumPy NPZ files

In [5]:

import tensorflow as tf
import numpy as np

tf.reset_default_graph()
with tf.Session() as sess:

    saver1 = tf.train.import_meta_graph('./mlp.meta')
    saver1.restore(sess, save_path='./mlp')
    
    var_names = [v.name for v in 
                 tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)]
    
    params = {}
    print('Found variables:')
    for v in var_names:
        print(v)
        
        ary = sess.run(v)
        params[v] = ary
        
    np.savez('mlp', **params)

INFO:tensorflow:Restoring parameters from ./mlp
Found variables:
fc1/weights:0
fc1/biases:0
fc2/weights:0
fc2/biases:0
logits/weights:0
logits/biases:0
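
To see how such an archive behaves when it is read back, here is a small NumPy-only sketch of the save/load round trip. It assumes dummy zero arrays with the layer shapes used in this notebook and writes to a temporary directory under the hypothetical file name mlp_demo, so it does not touch the mlp.npz file created above.

```python
import os
import tempfile

import numpy as np

# Dummy arrays with the shapes used in this notebook (values are placeholders)
params = {
    'fc1/weights:0': np.zeros((784, 128), dtype=np.float32),
    'fc1/biases:0': np.zeros(128, dtype=np.float32),
    'fc2/weights:0': np.zeros((128, 256), dtype=np.float32),
    'fc2/biases:0': np.zeros(256, dtype=np.float32),
    'logits/weights:0': np.zeros((256, 10), dtype=np.float32),
    'logits/biases:0': np.zeros(10, dtype=np.float32),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'mlp_demo')
    np.savez(path, **params)   # np.savez appends the .npz extension

    # The loaded archive is indexed by the same variable names
    with np.load(path + '.npz') as archive:
        loaded_shapes = {name: archive[name].shape for name in archive.files}

for name in sorted(loaded_shapes):
    print(name, loaded_shapes[name])
```

The object returned by np.load supports dictionary-style lookups by variable name, which is why it can be passed directly as the fixed_params argument of mlp_graph.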

Load NumPy .npz files into the `mlp_graph`

Note that the graph definition was set up so that it constructs a "rigid," non-trainable TensorFlow classifier if .npz files are provided. This is on purpose, since it may come in handy in certain use cases, but the code can easily be modified to make the model trainable when NumPy .npz files are provided (e.g., by wrapping the tf.constant calls in fc_layer in a tf.Variable constructor).

Note: Provided that you have defined the fc_layer and mlp_graph wrapper functions from the section "Define Multilayer Perceptron Graph," the following code cell has no other code dependencies.

In [6]:

import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

###########################
### LOAD DATA AND PARAMS
###########################

mnist = input_data.read_data_sets("./", one_hot=False)
param_dict = np.load('mlp.npz')

##########################
### GRAPH DEFINITION
##########################


g = tf.Graph()
with g.as_default():
    
    # here: constructs a non-trainable graph
    # due to the provided fixed_params argument
    mlp_graph(fixed_params=param_dict)

with tf.Session(graph=g) as sess:
    
    test_acc = sess.run('accuracy:0', feed_dict={'features:0': mnist.test.images,
                                                 'targets:0': mnist.test.labels})
    print('Test ACC: %.3f' % test_acc)

Extracting ./train-images-idx3-ubyte.gz
Extracting ./train-labels-idx1-ubyte.gz
Extracting ./t10k-images-idx3-ubyte.gz
Extracting ./t10k-labels-idx1-ubyte.gz
Test ACC: 0.976
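
Because mlp_graph looks up six fixed variable names in fixed_params, a missing or misshapen entry only surfaces as an error partway through graph construction. One possible safeguard is to check the parameters beforehand. The sketch below is an assumption-laden example: expected_shapes and check_fixed_params are hypothetical helpers, not part of the notebook, and the expected shapes are derived from the mlp_graph default layer sizes.

```python
import numpy as np


def expected_shapes(n_input=784, n_hidden_1=128, n_hidden_2=256,
                    n_classes=10):
    """Parameter shapes implied by the mlp_graph layer sizes."""
    return {
        'fc1/weights:0': (n_input, n_hidden_1),
        'fc1/biases:0': (n_hidden_1,),
        'fc2/weights:0': (n_hidden_1, n_hidden_2),
        'fc2/biases:0': (n_hidden_2,),
        'logits/weights:0': (n_hidden_2, n_classes),
        'logits/biases:0': (n_classes,),
    }


def check_fixed_params(fixed_params, **layer_sizes):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    for name, shape in expected_shapes(**layer_sizes).items():
        if name not in fixed_params:
            problems.append('missing: %s' % name)
        elif np.shape(fixed_params[name]) != shape:
            problems.append('bad shape for %s: %s, expected %s'
                            % (name, np.shape(fixed_params[name]), shape))
    return problems


# A complete parameter dict built from placeholder arrays
good = {name: np.zeros(shape, dtype=np.float32)
        for name, shape in expected_shapes().items()}

# A deliberately broken copy: one entry removed, one with a wrong shape
bad = dict(good)
del bad['fc2/biases:0']
bad['logits/weights:0'] = np.zeros((256, 5), dtype=np.float32)

print(check_fixed_params(good))
for problem in check_fixed_params(bad):
    print(problem)
```

A call such as check_fixed_params(param_dict) could be placed right before mlp_graph(fixed_params=param_dict) in the cell above.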