In this lesson you are going to save and restore your models for additional training.
TensorFlow makes this really easy.
# import libraries
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn import preprocessing
import sys
import datetime
import matplotlib.pyplot as plt
plt.style.use('ggplot') # use this plot style
%matplotlib inline
print('Python version ' + sys.version)
print('Tensorflow version ' + tf.VERSION)
print('Pandas version ' + pd.__version__)
print('Numpy version ' + np.__version__)
Python version 3.5.1 |Anaconda custom (64-bit)| (default, Feb 16 2016, 09:49:46) [MSC v.1900 64 bit (AMD64)]
Tensorflow version 0.12.0-rc0
Pandas version 0.19.0
Numpy version 1.11.0
y = a * x^8 + b * x^6
TIP: Recommended percentages are 70% training, 15% validation, and 15% test.
Since my model was performing terribly, I attempted to pre-process my data to see if this would help me out.
def normalize(array):
    u = array.mean()
    s = array.std()
    norm = (array - u) / s
    return u, s, norm
def min_max(array, min=0, max=1):
    X_std = (array - array.min(axis=0)) / (array.max(axis=0) - array.min(axis=0))
    X_scaled = X_std * (max - min) + min
    return X_scaled
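Because *normalize* returns the mean and standard deviation alongside the scaled array, values in normalized units can be mapped back to the original scale. A minimal sketch, assuming NumPy arrays; *denormalize* is a hypothetical helper and is not part of the original notebook:

```python
import numpy as np

def normalize(array):
    """Standardize to zero mean and unit variance; also return the statistics."""
    u = array.mean()
    s = array.std()
    return u, s, (array - u) / s

def denormalize(norm, u, s):
    """Hypothetical helper: undo normalize() using the saved mean and std."""
    return norm * s + u

values = np.array([1.0, 2.0, 3.0, 4.0])
u, s, scaled = normalize(values)
restored = denormalize(scaled, u, s)   # back in the original units
```

The same idea would apply to predictions: keep *u_train_y* and *s_train_y* around if you ever need the model output in unscaled units.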
# Let's generate 10000 random samples
pool = np.random.rand(10000,1).astype(np.float32)
# Shuffle the samples
np.random.shuffle(pool)
# sample size: 150 points (15% of 1000, which is only 1.5% of this 10000-point pool)
sample = int(1000 * 0.15)
# first 150 points for testing
test_x = pool[0:sample]
# next 150 points for validation
valid_x = pool[sample:sample*2]
# remaining 9700 points for training
train_x = pool[sample*2:]
print('Testing data points: ' + str(test_x.shape))
print('Validation data points: ' + str(valid_x.shape))
print('Training data points: ' + str(train_x.shape))
# Let's compute the output using 2 for a and 1 for b
test_y = 2.0 * test_x**8 + 1.0 * test_x**6
valid_y = 2.0 * valid_x**8 + 1.0 * valid_x**6
train_y = 2.0 * train_x**8 + 1.0 * train_x**6
# scale x and y (I chose to scale only y since x was already close enough to min=0, max=1)
#test_x = min_max(test_x)
test_y = min_max(test_y)
#valid_x = min_max(valid_x)
valid_y = min_max(valid_y)
#train_x = min_max(train_x)
train_y = min_max(train_y)
# Normalize x and y (I chose to normalize only y and left x in its original [0, 1] range)
#u_test_x, s_test_x, test_x = normalize(test_x)
u_test_y, s_test_y, test_y = normalize(test_y)
#u_valid_x, s_valid_x, valid_x = normalize(valid_x)
u_valid_y, s_valid_y, valid_y = normalize(valid_y)
#u_train_x, s_train_x, train_x = normalize(train_x)
u_train_y, s_train_y, train_y = normalize(train_y)
Testing data points: (150, 1)
Validation data points: (150, 1)
Training data points: (9700, 1)
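The cell above takes 150 points each for testing and validation and leaves the remaining 9,700 for training. If a 70/15/15 split is the goal, the slice sizes can be derived from the pool itself. A sketch under that assumption; *split_pool* and its parameters are illustrative names, not from the notebook:

```python
import numpy as np

def split_pool(pool, test_frac=0.15, valid_frac=0.15):
    """Hypothetical helper: slice an already-shuffled pool by fractions of its length."""
    n = pool.shape[0]
    n_test = int(round(n * test_frac))
    n_valid = int(round(n * valid_frac))
    test = pool[:n_test]
    valid = pool[n_test:n_test + n_valid]
    train = pool[n_test + n_valid:]   # whatever is left goes to training
    return train, valid, test

pool = np.random.rand(10000, 1).astype(np.float32)
np.random.shuffle(pool)
train_x, valid_x, test_x = split_pool(pool)
print(train_x.shape, valid_x.shape, test_x.shape)
```

Tying the sizes to *pool.shape[0]* means the percentages stay correct when you change the number of samples.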
df = pd.DataFrame({'train_x':train_x[:,0],
                   'train_y':train_y[:,0]})
df_valid = pd.DataFrame({'valid_x':valid_x[:,0],
                         'valid_y':valid_y[:,0]})
df_test = pd.DataFrame({'test_x':test_x[:,0],
                        'test_y':test_y[:,0]})
df.head()
| | train_x | train_y |
|---|---|---|
| 0 | 0.380171 | -0.546063 |
| 1 | 0.287790 | -0.550913 |
| 2 | 0.005886 | -0.551907 |
| 3 | 0.057236 | -0.551907 |
| 4 | 0.112282 | -0.551904 |
df.describe()
| | train_x | train_y |
|---|---|---|
| count | 9700.000000 | 9.700000e+03 |
| mean | 0.501061 | -1.408390e-08 |
| std | 0.288875 | 1.000058e+00 |
| min | 0.000004 | -5.519073e-01 |
| 25% | 0.250166 | -5.514931e-01 |
| 50% | 0.499987 | -5.167179e-01 |
| 75% | 0.752269 | 2.827150e-02 |
| max | 0.999983 | 3.952552e+00 |
df.plot.scatter(x='train_x', y='train_y', figsize=(15,5));
df_valid.plot.scatter(x='valid_x', y='valid_y', figsize=(15,5));
df_test.plot.scatter(x='test_x', y='test_y', figsize=(15,5));
Make a function that will help you create layers easily.
def add_layer(inputs, in_size, out_size, activation_function=None):
    # weights: shape [size of input layer, size of output layer]
    Weights = tf.Variable(tf.truncated_normal([in_size, out_size], mean=0.1, stddev=0.1))
    # biases: shape [size of output layer]
    biases = tf.Variable(tf.truncated_normal([out_size], mean=0.1, stddev=0.1))
    # shape of pred = [size of your batches, size of output layer]
    pred = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = pred
    else:
        outputs = activation_function(pred)
    return outputs
Start to use W (for weight) and b (for bias) when setting up your variables. Aside from adding your ReLU activation function, it is a good idea to use TensorFlow's *matrix multiplication function (matmul)*, as *add_layer* does above.
The ? in the shape output just means that dimension can be any size; here it is the number of samples in the batch.
# larger batch sizes help you get to the local minimum faster at a cost of more cpu power
# The strategy is to use batch_size when you cannot fit the entire dataset into memory
# In practice, small to moderate mini-batches (10-500) are generally used
batch_size = 100
# you can adjust the number of neurons in the hidden layers here
hidden_size = 10
# placeholders
# shape=[how many samples do you have, how many input neurons]
x = tf.placeholder(tf.float32, shape=[None, 1], name="01_x")
y = tf.placeholder(tf.float32, shape=[None, 1], name="01_y")
print("shape of x and y:")
print(x.get_shape(),y.get_shape())
shape of x and y: (?, 1) (?, 1)
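The ? stands in for the batch dimension, which is only fixed when data is fed in. The same layer arithmetic can be sketched in NumPy (a swapped-in illustration, not the TensorFlow graph; *dense_relu* is a hypothetical name) to show that one set of weights handles any number of rows:

```python
import numpy as np

def dense_relu(inputs, weights, biases):
    """One fully connected layer followed by ReLU: max(0, inputs @ W + b)."""
    return np.maximum(0.0, np.matmul(inputs, weights) + biases)

rng = np.random.RandomState(0)
weights = rng.normal(0.1, 0.1, size=(1, 10))   # [in_size, out_size]
biases = rng.normal(0.1, 0.1, size=(10,))      # [out_size]

small_batch = rng.rand(5, 1)      # 5 samples, 1 feature each
large_batch = rng.rand(100, 1)    # 100 samples, 1 feature each

small_out = dense_relu(small_batch, weights, biases)
large_out = dense_relu(large_batch, weights, biases)
print(small_out.shape, large_out.shape)  # (5, 10) (100, 10)
```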
We will be feeding in the fraction of neurons to keep at every training step.
# drop out
keep_prob = tf.placeholder(tf.float32)
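*tf.nn.dropout* keeps each activation with probability *keep_prob* and scales the kept values by 1 / *keep_prob* so the expected sum is unchanged. A NumPy sketch of that behaviour, for illustration only (the *dropout* function here is a stand-in, not the TensorFlow implementation):

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Zero each element with probability 1 - keep_prob; rescale the survivors."""
    mask = rng.rand(*activations.shape) < keep_prob
    return np.where(mask, activations / keep_prob, 0.0)

rng = np.random.RandomState(0)
h = np.ones((1000, 10))

dropped = dropout(h, keep_prob=0.5, rng=rng)     # roughly half the entries become 0
untouched = dropout(h, keep_prob=1.0, rng=rng)   # keep_prob=1.0 disables dropout
```

This is why the training feed in this tutorial uses a *keep_prob* of 0.98 while the validation and test feeds use 1.0.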
Note that the output of one layer becomes the input of the next layer.
# create your hidden layers!
h1 = add_layer(x, 1, hidden_size, tf.nn.relu)
# here is where we shoot down some of the neurons
h1_drop = tf.nn.dropout(h1, keep_prob)
# add a second layer
h2 = add_layer(h1_drop, hidden_size, hidden_size, tf.nn.relu)
h2_drop = tf.nn.dropout(h2, keep_prob)
# add a third layer
h3 = add_layer(h2_drop, hidden_size, hidden_size, tf.nn.relu)
h3_drop = tf.nn.dropout(h3, keep_prob)
# add a fourth layer
h4 = add_layer(h3_drop, hidden_size, hidden_size, tf.nn.relu)
h4_drop = tf.nn.dropout(h4, keep_prob)
print("shape of hidden layers:")
print(h1_drop.get_shape(), h2_drop.get_shape(), h3_drop.get_shape(), h4_drop.get_shape())
shape of hidden layers: (?, 10) (?, 10) (?, 10) (?, 10)
# Output Layers
pred = add_layer(h4_drop, hidden_size, 1)
print("shape of output layer:")
print(pred.get_shape())
shape of output layer: (?, 1)
# minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(pred - y))
# pick optimizer
optimizer = tf.train.GradientDescentOptimizer(0.001)
train = optimizer.minimize(loss)
# Create variable to save and restore all of your variables
saver = tf.train.Saver()
# path
save_path = r"C:/Users/david/notebooks/tensorflow/tmp/model_" + datetime.datetime.now().strftime('%Y-%m-%d')
Set up the following variables to calculate the accuracy rate of your model. You will do that shortly.
# check accuracy of model
correct_prediction = tf.equal(tf.round(pred), tf.round(y))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
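Because this is a regression problem, the accuracy defined here counts a prediction as correct when it rounds to the same integer as the target. A small NumPy sketch of the same calculation, with made-up numbers (*rounded_accuracy* is a hypothetical name):

```python
import numpy as np

def rounded_accuracy(pred, target):
    """Fraction of predictions that round to the same integer as the target."""
    return float(np.mean(np.round(pred) == np.round(target)))

pred = np.array([[0.2], [0.6], [1.4], [2.6]])
target = np.array([[0.1], [0.4], [1.0], [3.0]])
acc = rounded_accuracy(pred, target)
print(acc)  # 0.75
```

Note that a prediction of 0.6 against a target of 0.4 counts as a miss even though the error is small, so it is worth reading this number together with the loss.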
Code borrowed from this great Tensorflow Jupyter Notebook.
# Best validation accuracy seen so far.
best_valid_acc = 0.0
# Iteration-number for last improvement to validation accuracy.
last_improvement = 0
# Stop optimization if no improvement found in this many iterations.
require_improvement = 15000
The code below aims to save the best model you have seen during training. If an improvement is found, you save that model to disk. Also note that you can restore any saved model before you begin training, so you can continue training a previously saved model. Just make sure you have a saved model before running the restore code below. Note that saving your model constantly may *slow down* your training time.
Where is my model getting saved?
It should be inside the *tmp* folder.
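The bookkeeping described above (save on improvement, stop after too many iterations without one) does not depend on TensorFlow. A plain-Python sketch with a made-up list of validation accuracies; *run_with_early_stopping* and *save_fn* are hypothetical names standing in for the training loop and *saver.save*:

```python
def run_with_early_stopping(valid_accs, check_every, require_improvement, save_fn):
    """Walk through validation accuracies measured every `check_every` steps."""
    best_valid_acc = 0.0
    last_improvement = 0
    step = 0
    for i, valid_acc in enumerate(valid_accs):
        step = i * check_every
        if valid_acc > best_valid_acc:
            best_valid_acc = valid_acc
            last_improvement = step
            save_fn(step)  # stand-in for saver.save(sess, save_path)
        if step - last_improvement > require_improvement:
            break  # no improvement for too long, stop training
    return best_valid_acc, last_improvement, step

saved_steps = []
accs = [0.10, 0.30, 0.30, 0.50, 0.40, 0.40, 0.40, 0.40]
best, last, stopped = run_with_early_stopping(
    accs, check_every=1000, require_improvement=2000, save_fn=saved_steps.append)
print(best, last, stopped, saved_steps)  # 0.5 3000 6000 [0, 1000, 3000]
```

Only three saves happen in this run, which is the point of saving on improvement rather than at every check.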
# initialize the variables
init = tf.global_variables_initializer()
# hold step and error values
t = []
# Run your graph
with tf.Session() as sess:
    # initialize variables
    sess.run(init)
    # restore model; this must run after init, otherwise init overwrites the restored values
    # (comment this line out on the very first run, before any model has been saved)
    saver.restore(sess, save_path=save_path)
    # Fit the function.
    for step in range(60000):
        # pull batches at random
        i = np.random.permutation(train_x.shape[0])[:batch_size]
        # get your data
        train_data = {x:train_x[i,:], y:train_y[i,:], keep_prob: 0.98}
        valid_data = {x:valid_x, y:valid_y, keep_prob: 1.0}
        test_data = {x:test_x, y:test_y, keep_prob: 1.0}
        # training in progress... (the train op itself returns nothing useful)
        train_loss, _ = sess.run([loss, train], feed_dict=train_data)
        # print every n iterations
        if step%1000==0:
            # capture the step and error for analysis
            valid_loss = sess.run(loss, feed_dict=valid_data)
            t.append((step, train_loss, valid_loss))
            # get snapshot of current training and validation accuracy
            train_acc = accuracy.eval(train_data)
            valid_acc = accuracy.eval(valid_data)
            # If validation accuracy is an improvement over best-known.
            if valid_acc > best_valid_acc:
                # Update the best-known validation accuracy.
                best_valid_acc = valid_acc
                # Set the iteration for the last improvement to current.
                last_improvement = step
                # Save model to disk
                saver.save(sess, save_path=save_path)
                # Flag whenever an improvement is found
                improved_str = '*'
            else:
                # An empty string to be printed below.
                # Shows that no improvement was found.
                improved_str = ''
            print("Training loss at step %d: %f %s" % (step, train_loss, improved_str))
            print("Validation %f" % (valid_loss))
            # If no improvement found in the required number of iterations.
            if step - last_improvement > require_improvement:
                print("No improvement found in a while, stopping optimization.")
                # Break out from the for-loop.
                break
    # here is where you see how good of a Data Scientist you are
    print("Accuracy on the Training Set:", accuracy.eval(train_data) )
    print("Accuracy on the Validation Set:", accuracy.eval(valid_data) )
    print("Accuracy on the Test Set:", accuracy.eval(test_data) )
    # capture predictions on test data
    test_results = sess.run(pred, feed_dict={x:test_x, keep_prob: 1.0})
    df_final = pd.DataFrame({'test_x':test_x[:,0],
                             'pred':test_results[:,0]})
# capture training and validation loss
df_loss = pd.DataFrame(t, columns=['step', 'train_loss', 'valid_loss'])
Training loss at step 0: 1.147479 *
Validation 1.388356
Training loss at step 1000: 1.027446 *
Validation 0.995131
Training loss at step 2000: 0.988181
Validation 0.986231
Training loss at step 3000: 1.258126
Validation 0.971168
Training loss at step 4000: 0.655966
Validation 0.941765
Training loss at step 5000: 0.805845
Validation 0.874819
Training loss at step 6000: 0.430810
Validation 0.708996
Training loss at step 7000: 0.569400 *
Validation 0.432954
Training loss at step 8000: 0.253947 *
Validation 0.230539
Training loss at step 9000: 0.142885 *
Validation 0.118798
Training loss at step 10000: 0.123665
Validation 0.067341
Training loss at step 11000: 0.055928
Validation 0.042721
Training loss at step 12000: 0.081819
Validation 0.031563
Training loss at step 13000: 0.049281
Validation 0.023350
Training loss at step 14000: 0.060906
Validation 0.018348
Training loss at step 15000: 0.053416
Validation 0.015116
Training loss at step 16000: 0.017379 *
Validation 0.010970
Training loss at step 17000: 0.016900 *
Validation 0.008417
Training loss at step 18000: 0.027532 *
Validation 0.007178
Training loss at step 19000: 0.009496 *
Validation 0.006012
Training loss at step 20000: 0.017412 *
Validation 0.004963
Training loss at step 21000: 0.013048
Validation 0.004350
Training loss at step 22000: 0.052921
Validation 0.004060
Training loss at step 23000: 0.014262 *
Validation 0.003633
Training loss at step 24000: 0.047953
Validation 0.003212
Training loss at step 25000: 0.012783
Validation 0.003220
Training loss at step 26000: 0.007954
Validation 0.002730
Training loss at step 27000: 0.044838
Validation 0.003043
Training loss at step 28000: 0.051435
Validation 0.002801
Training loss at step 29000: 0.027301
Validation 0.002369
Training loss at step 30000: 0.030987
Validation 0.002710
Training loss at step 31000: 0.009278
Validation 0.002117
Training loss at step 32000: 0.018473 *
Validation 0.001894
Training loss at step 33000: 0.033495
Validation 0.001661
Training loss at step 34000: 0.032616
Validation 0.001833
Training loss at step 35000: 0.015506
Validation 0.002224
Training loss at step 36000: 0.023276
Validation 0.001924
Training loss at step 37000: 0.019154
Validation 0.001557
Training loss at step 38000: 0.016489
Validation 0.002008
Training loss at step 39000: 0.027124
Validation 0.001634
Training loss at step 40000: 0.009575
Validation 0.001565
Training loss at step 41000: 0.010542
Validation 0.001354
Training loss at step 42000: 0.012189
Validation 0.001209
Training loss at step 43000: 0.011183
Validation 0.001617
Training loss at step 44000: 0.011050
Validation 0.001304
Training loss at step 45000: 0.018225
Validation 0.001442
Training loss at step 46000: 0.027513
Validation 0.002010
Training loss at step 47000: 0.014332
Validation 0.001789
Training loss at step 48000: 0.010220
Validation 0.001428
No improvement found in a while, stopping optimization.
Accuracy on the Training Set: 0.91
Accuracy on the Validation Set: 0.953333
Accuracy on the Test Set: 0.906667
fig, axes = plt.subplots(nrows=1, ncols=1, figsize=(15, 5))
# Chart 1 - Shows the line we are trying to model
df.plot.scatter(x='train_x', y='train_y', ax=axes, color='red')
# Chart 2 - Shows the line our trained model came up with
df_final.plot.scatter(x='test_x', y='pred', ax=axes, alpha=0.3)
# add a little sugar
axes.set_title('target vs pred', fontsize=20)
axes.set_ylabel('y', fontsize=15)
axes.set_xlabel('x', fontsize=15)
axes.legend(["target", "pred"], loc='best');
If the *valid_loss* is increasing while your *train_loss* is decreasing, then you have a problem: the model is overfitting. Since you have implemented early stopping, training halts before this issue gets out of control.
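One way to spot that pattern in the recorded losses is to compare the most recent values against earlier ones. A rough sketch, assuming a list of (step, train_loss, valid_loss) tuples like *t* above; *looks_overfit* and its *window* parameter are hypothetical, and the rule is a simple heuristic rather than a standard test:

```python
def column_mean(rows, col):
    """Average one column of a list of tuples."""
    return sum(r[col] for r in rows) / len(rows)

def looks_overfit(history, window=3):
    """Heuristic: training loss fell but validation loss rose over the last
    `window` records compared with the `window` records before them."""
    if len(history) < 2 * window:
        return False  # not enough records to compare
    earlier = history[-2 * window:-window]
    recent = history[-window:]
    train_fell = column_mean(recent, 1) < column_mean(earlier, 1)
    valid_rose = column_mean(recent, 2) > column_mean(earlier, 2)
    return train_fell and valid_rose

# made-up loss histories: (step, train_loss, valid_loss)
healthy = [(0, 1.0, 1.1), (1, 0.8, 0.9), (2, 0.6, 0.7),
           (3, 0.5, 0.6), (4, 0.4, 0.5), (5, 0.3, 0.4)]
overfit = [(0, 1.0, 1.1), (1, 0.8, 0.9), (2, 0.6, 0.7),
           (3, 0.5, 0.9), (4, 0.4, 1.0), (5, 0.3, 1.1)]
print(looks_overfit(healthy), looks_overfit(overfit))  # False True
```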
df_loss.set_index('step').plot(logy=True, figsize=(15,5));
I'll be honest with you. This function was giving me a lot of trouble. It is probably something I am doing wrong but it took me a while to get to a *90% accuracy* on the test set. I changed the size of each layer, the batch size, the learning rate, and even the initial sample size with no success.
What finally worked was making sure I was saving the *best* model to disk and simply re-running another 60,000 training iterations on that model. I believe I completed ~6 runs of 60,000 training iterations. The loss chart above also became smoother and the accuracy started to climb.
Think about how you plan on saving your models. This tutorial saved the model by a date but that may not work for you. You might also consider writing a function that will pass in the name of your model. You may also consider saving a text file with all of the parameters you used during training.
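Both suggestions can be sketched with the standard library. The helper names (*model_path*, *save_params*), the model name, the date, and the JSON file layout below are assumptions for illustration, not something this tutorial prescribes:

```python
import datetime
import json
import os
import tempfile

def model_path(base_dir, model_name, when=None):
    """Build a checkpoint path from a model name and a date."""
    when = when or datetime.datetime.now()
    return os.path.join(base_dir, "%s_%s" % (model_name, when.strftime('%Y-%m-%d')))

def save_params(path, params):
    """Write the training parameters next to the checkpoint as JSON."""
    params_file = path + "_params.json"
    with open(params_file, "w") as f:
        json.dump(params, f, indent=2, sort_keys=True)
    return params_file

base_dir = tempfile.mkdtemp()  # placeholder directory for this example
# the model name and date are arbitrary example values
save_path = model_path(base_dir, "poly_4x10", datetime.datetime(2017, 1, 15))
params_file = save_params(save_path, {
    "batch_size": 100, "hidden_size": 10,
    "learning_rate": 0.001, "keep_prob": 0.98})
```

Keeping the parameters in a file with the same prefix as the checkpoint makes it easier to tell later which settings produced which saved model.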
Try to come up with a much better model than mine. Try to beat 90% accuracy on the test set with fewer training iterations. When you do, please share.
This tutorial was created by HEDARO