Copyright 2016 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This notebook is similar in functionality to this Python script, and is used with this README. It shows how to use TensorFlow's high-level APIs, in tf.estimator, to easily build a classifier with multiple hidden layers.
First, do some imports and set some variables:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import os
import time
import numpy
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# comment out for less info during the training runs.
tf.logging.set_verbosity(tf.logging.INFO)
DATA_DIR = "/tmp/MNIST_data"
# read in data, downloading first as necessary
DATA_SETS = input_data.read_data_sets(DATA_DIR)
# define a utility function for generating a new directory in which to save
# model information, so multiple training runs don't stomp on each other.
def get_new_path(name=""):
    base = "/tmp/tfmodels/mnist_estimators"
    logpath = os.path.join(base, name + "_" + str(int(time.time())))
    print("Logging to {}".format(logpath))
    return logpath
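The timestamp suffix is what keeps training runs apart. As a minimal sketch, the same path construction can be inspected without TensorFlow; the helper name `make_run_path` and its explicit `now` argument are illustrative and not part of the notebook. Note that `int(time.time())` has one-second resolution, so two runs started within the same second would share a directory.

```python
import os
import time


def make_run_path(base, name, now=None):
    """Build a per-run directory path from a name and a whole-second timestamp.

    Hypothetical helper: mirrors get_new_path(), but takes the base directory
    and the timestamp as arguments so the behaviour is easy to inspect.
    """
    stamp = int(time.time() if now is None else now)
    return os.path.join(base, "{}_{}".format(name, stamp))


base = "/tmp/tfmodels/mnist_estimators"
first = make_run_path(base, "linear", now=1500000000.2)
later = make_run_path(base, "linear", now=1500000001.7)
same_second = make_run_path(base, "linear", now=1500000000.9)

print(first)                 # a path ending in linear_1500000000
print(first != later)        # True: different seconds give different directories
print(first == same_second)  # True: runs within the same second collide
```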
Next, create an input function that uses tf.train.shuffle_batch to take care of the batching and shuffling of the input data.
BATCH_SIZE = 40
# call with generate_input_fn(DATA_SETS.train) or generate_input_fn(DATA_SETS.test)
def generate_input_fn(dataset, batch_size=BATCH_SIZE):
    def _input_fn():
        X = tf.constant(dataset.images)
        Y = tf.constant(dataset.labels.astype(numpy.int64))
        image_batch, label_batch = tf.train.shuffle_batch(
            [X, Y],
            batch_size=batch_size,
            capacity=8*batch_size,
            min_after_dequeue=4*batch_size,
            enqueue_many=True
        )
        return {'pixels': image_batch}, label_batch
    return _input_fn
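To build intuition for the `capacity` and `min_after_dequeue` arguments, here is a simplified pure-Python model of a shuffling buffer. It is a sketch under stated assumptions, not TensorFlow's implementation: the function `shuffle_batches` is hypothetical, it makes a single pass over the data, and it runs in one thread, whereas the real queue is filled by background threads.

```python
import random


def shuffle_batches(examples, batch_size, capacity, min_after_dequeue, seed=0):
    """Yield shuffled batches drawn from a bounded buffer (simplified model)."""
    if capacity < min_after_dequeue + batch_size:
        raise ValueError("capacity must be at least min_after_dequeue + batch_size")
    rng = random.Random(seed)
    source = iter(examples)
    buffer = []
    exhausted = False
    while True:
        # Refill the buffer up to its capacity while input remains.
        while not exhausted and len(buffer) < capacity:
            try:
                buffer.append(next(source))
            except StopIteration:
                exhausted = True
        # While input remains, leave min_after_dequeue elements behind so
        # later draws are still mixed with older elements.
        reserve = 0 if exhausted else min_after_dequeue
        if len(buffer) - reserve < batch_size:
            return  # input is used up and less than one full batch is left
        batch = []
        for _ in range(batch_size):
            # Remove a uniformly random element by swapping it to the end.
            i = rng.randrange(len(buffer))
            buffer[i], buffer[-1] = buffer[-1], buffer[i]
            batch.append(buffer.pop())
        yield batch


# Same ratios as the notebook: capacity = 8 * batch, reserve = 4 * batch.
batches = list(shuffle_batches(range(100), batch_size=10,
                               capacity=80, min_after_dequeue=40))
print(len(batches))                                              # 10
print(sorted(x for b in batches for x in b) == list(range(100)))  # True
```

A larger `min_after_dequeue` mixes elements more thoroughly at the cost of a longer warm-up before the first batch is available.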
We'll first define a function that creates a LinearClassifier and runs its train() method, which trains the model. Note that we didn't need to explicitly define a model graph or a training loop ourselves.
Once we've trained the model, we run the evaluate() method, which uses the trained model. To do this, it loads the most recent model checkpoint available. The checkpoint(s) are generated during the training process.
def define_and_run_linear_classifier(num_steps, logdir, batch_size=BATCH_SIZE):
    """Run a linear classifier."""
    # tf.estimator classifiers expect columns built with tf.feature_column
    # rather than tf.contrib.layers.
    feature_columns = [tf.feature_column.numeric_column(
        "pixels", shape=784)]
    classifier = tf.estimator.LinearClassifier(
        feature_columns=feature_columns,
        n_classes=10,
        model_dir=logdir
    )
    classifier.train(input_fn=generate_input_fn(DATA_SETS.train,
                                                batch_size=batch_size),
                     steps=num_steps)
    print("Finished training.")
    # Evaluate accuracy.
    accuracy_score = classifier.evaluate(input_fn=generate_input_fn(
        DATA_SETS.test, batch_size), steps=100)['accuracy']
    print('Linear Classifier Accuracy: {0:f}'.format(accuracy_score))
Next, add a function that defines a DNNClassifier and runs its train() method, which trains the model. Again, note that we didn't need to explicitly define a model graph or a training loop ourselves.
Then, after we've trained the model, we run the classifier's evaluate() method, which uses the trained model.
def define_and_run_dnn_classifier(num_steps, logdir, lr=.1, batch_size=40):
    """Run a DNN classifier."""
    # tf.estimator classifiers expect columns built with tf.feature_column
    # rather than tf.contrib.layers.
    feature_columns = [tf.feature_column.numeric_column(
        "pixels", shape=784)]
    classifier = tf.estimator.DNNClassifier(
        feature_columns=feature_columns,
        n_classes=10,
        hidden_units=[128, 32],
        optimizer=tf.train.ProximalAdagradOptimizer(learning_rate=lr),
        model_dir=logdir
    )
    # After you've done a training run with optimizer learning rate 0.1,
    # change it to 0.5 and run the training again. Use TensorBoard to take
    # a look at the difference. You can see both runs by pointing it to the
    # parent model directory, which by default is:
    #
    #   tensorboard --logdir=/tmp/tfmodels/mnist_estimators
    classifier.train(input_fn=generate_input_fn(DATA_SETS.train,
                                                batch_size=batch_size),
                     steps=num_steps)
    print("Finished running the deep training via the train() method")
    accuracy_score = classifier.evaluate(input_fn=generate_input_fn(
        DATA_SETS.test, batch_size=batch_size), steps=100)['accuracy']
    print('DNN Classifier Accuracy: {0:f}'.format(accuracy_score))
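To get a feel for how much larger the deep model is than the linear one, the sketch below counts weights and biases for a stack of fully connected layers. It assumes plain dense layers with one bias per output unit, 784 input pixels, and 10 output logits; the helper `dense_param_count` is hypothetical, and the real estimators also keep extra variables (for example optimizer state) that are not counted here.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per output
    return total


# Linear model: 784 inputs straight to 10 class logits.
linear_params = dense_param_count([784, 10])
# DNN with hidden_units=[128, 32], as in the function above.
dnn_params = dense_param_count([784, 128, 32, 10])

print(linear_params)  # 7850
print(dnn_params)     # 104938
```

Almost all of the DNN's parameters sit in the first hidden layer, since it connects every one of the 784 pixels to 128 units.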
Now we can call the functions that define and train our classifiers. (It takes a moment to set up the input data queue before the training starts.)
Let's start with the LinearClassifier, which won't be very accurate.
print("Running Linear classifier ...")
define_and_run_linear_classifier(num_steps=400,
                                 logdir=get_new_path("linear"),
                                 batch_size=40)
# With 400 steps and a batch size of 40, we see accuracy of approx 87%
Now, let's run the DNN Classifier. First, let's try it with a .1 learning rate.
print("Running DNN classifier with .1 learning rate...")
# The function prints its results and returns nothing, so there is no
# return value to keep.
define_and_run_dnn_classifier(2000,
                              get_new_path("deep01"),
                              lr=.1)
# With 2000 steps and a batch size of 40, we see accuracy of approx 95%
Now, let's run it with a .5 learning rate.
print("Running DNN classifier with .5 learning rate...")
# The function prints its results and returns nothing, so there is no
# return value to keep.
define_and_run_dnn_classifier(2000,
                              get_new_path("deep05"),
                              lr=.5)
# With 2000 steps and a batch size of 40, we see accuracy of approx 91%, though sometimes it does not converge at all.
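When reading these accuracy figures, it helps to know how much data each run actually touches. The arithmetic below is a rough sketch that assumes the standard split returned by read_data_sets (55,000 training and 10,000 test examples) and the batch size of 40 used above; the constant and function names are illustrative.

```python
TRAIN_EXAMPLES = 55000  # assumed size of DATA_SETS.train
TEST_EXAMPLES = 10000   # assumed size of DATA_SETS.test
BATCH = 40              # batch size used in the runs above


def examples_seen(steps, batch_size):
    """Number of examples consumed by a given number of steps."""
    return steps * batch_size


for label, steps in [("linear", 400), ("dnn", 2000)]:
    seen = examples_seen(steps, BATCH)
    epochs = seen / float(TRAIN_EXAMPLES)
    print("{}: {} examples, about {:.2f} epochs".format(label, seen, epochs))

eval_seen = examples_seen(100, BATCH)
eval_fraction = eval_seen / float(TEST_EXAMPLES)
print("evaluate: {} examples, {:.0%} of the test set".format(
    eval_seen, eval_fraction))
```

Under these assumptions the linear run sees less than a third of the training set, the DNN runs see roughly one and a half passes, and each evaluate() call with steps=100 scores about 40% of the test set, so reported accuracies will vary a little from run to run.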
To compare your results, start up TensorBoard as follows in a new terminal window. (If you get a 'not found' error, make sure you've activated your virtual environment in that new window):
$ tensorboard --logdir=/tmp/tfmodels/mnist_estimators
Or run the following (select Kernel --> Interrupt from the menu when you're done):
!tensorboard --logdir=/tmp/tfmodels/mnist_estimators