$\newcommand{\xv}{\mathbf{x}} \newcommand{\Xv}{\mathbf{X}} \newcommand{\yv}{\mathbf{y}} \newcommand{\Yv}{\mathbf{Y}} \newcommand{\zv}{\mathbf{z}} \newcommand{\av}{\mathbf{a}} \newcommand{\Wv}{\mathbf{W}} \newcommand{\wv}{\mathbf{w}} \newcommand{\gv}{\mathbf{g}} \newcommand{\Hv}{\mathbf{H}} \newcommand{\dv}{\mathbf{d}} \newcommand{\Vv}{\mathbf{V}} \newcommand{\vv}{\mathbf{v}} \newcommand{\tv}{\mathbf{t}} \newcommand{\Tv}{\mathbf{T}} \newcommand{\zv}{\mathbf{z}} \newcommand{\Zv}{\mathbf{Z}} \newcommand{\muv}{\boldsymbol{\mu}} \newcommand{\sigmav}{\boldsymbol{\sigma}} \newcommand{\phiv}{\boldsymbol{\phi}} \newcommand{\Phiv}{\boldsymbol{\Phi}} \newcommand{\Sigmav}{\boldsymbol{\Sigma}} \newcommand{\Lambdav}{\boldsymbol{\Lambda}} \newcommand{\half}{\frac{1}{2}} \newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}} \newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}} \newcommand{\dimensionbar}[1]{\underset{#1}{\operatorname{|}}}$

# Autoencoder Neural Networks

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import neuralnetworks as nn

import matplotlib.gridspec as gridspec
import pickle
import gzip


Load the MNIST digit images from the file `mnist.pkl.gz`, which is distributed with the DeepLearning Tutorial.

In [2]:
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')

Xorig = np.vstack([a.reshape((28, 28, 1))[np.newaxis, :, :, :] for a in train_set[0]])
Torig = np.array(train_set[1]).reshape((-1,1))

Xtest = np.vstack([a.reshape((28,28,1))[np.newaxis,:,:,:] for a in test_set[0]])
Ttest = np.array(test_set[1]).reshape((-1,1))

Xorig.shape, Torig.shape, Xtest.shape, Ttest.shape

Out[2]:
((50000, 28, 28, 1), (50000, 1), (10000, 28, 28, 1), (10000, 1))
In [3]:
plt.figure(figsize=(10, 10))
for i in range(100):
    plt.subplot(10, 10, i + 1)
    plt.imshow(-Xorig[i, :].reshape((28, 28)), interpolation='nearest', cmap='gray')
    plt.axis('off')
    plt.title(str(Torig[i][0]))
plt.tight_layout()

In [4]:
if True:
    nEach = 1000
    useThese = []
    for digit in range(10):
        useThese += np.where(Torig == digit)[0][:nEach].tolist()
    useThese = np.array(useThese)
    np.random.shuffle(useThese)
    X = Xorig[useThese, :]
    T = Torig[useThese, :]
    del Xorig # to save memory
    del Torig
else:
    X = Xorig
    T = Torig
X.shape, T.shape

Out[4]:
((10000, 28, 28, 1), (10000, 1))

Flatten each 28 x 28 image into a vector of 784 pixel values.

In [5]:
X = X.reshape((-1,784))
X.shape

Out[5]:
(10000, 784)
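
As a small check on what this reshape does, the sketch below uses random stand-in data (not the MNIST arrays) to show that flattening keeps pixels in row-major order and can be undone exactly, which is what the later `reshape((28, 28))` calls rely on when displaying images. The names `images`, `flat`, and `restored` are illustrative only.

```python
import numpy as np

# Stand-in data: 5 random "images" with the same layout as the arrays above.
rng = np.random.default_rng(0)
images = rng.random((5, 28, 28, 1))

# Flatten: each row holds one image, pixels in row-major (C) order.
flat = images.reshape((-1, 784))

# Unflatten, as is done before plotting; this recovers the original array.
restored = flat.reshape((-1, 28, 28, 1))

print(flat.shape)
print(np.array_equal(images, restored))

# Pixel (r, c) of image i sits at column r * 28 + c of row i.
print(flat[2, 3 * 28 + 7] == images[2, 3, 7, 0])
```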
In [6]:
rowsShuffled = np.arange(X.shape[0])
np.random.shuffle(rowsShuffled)
nTrain = int(X.shape[0] * 0.8)
Xtrain = X[rowsShuffled[:nTrain], :]
Ttrain = T[rowsShuffled[:nTrain], :]
Xtest = X[rowsShuffled[nTrain:], :]
Ttest = T[rowsShuffled[nTrain:], :]
Xtrain.shape, Ttrain.shape, Xtest.shape, Ttest.shape

Out[6]:
((8000, 784), (8000, 1), (2000, 784), (2000, 1))
In [7]:
nnet = nn.NeuralNetwork(784, [100, 50, 2, 50, 100],784)
nnet.train(X, X, nIterations=50, verbose=True)
print('Training took', nnet.getTrainingTime()/60, 'minutes.')
plt.plot(nnet.getErrors());

SCG: Iteration 5 fValue 0.42614977409024957 Scale 0.012720102141146954
SCG: Iteration 10 fValue 0.40137634215911006 Scale 0.0003975031919108423
SCG: Iteration 15 fValue 0.39392649062653373 Scale 1.2421974747213822e-05
SCG: Iteration 20 fValue 0.3911832882085571 Scale 7.763734217008639e-07
SCG: Iteration 25 fValue 0.38978672559926714 Scale 2.4261669428151997e-08
SCG: Iteration 30 fValue 0.3883534581372223 Scale 7.581771696297499e-10
SCG: Iteration 35 fValue 0.3875108628558666 Scale 2.3693036550929684e-11
SCG: Iteration 40 fValue 0.3869076661997746 Scale 7.404073922165526e-13
SCG: Iteration 45 fValue 0.3863391446425705 Scale 2.313773100676727e-14
SCG: Iteration 50 fValue 0.3859281586759046 Scale 1e-15
Training took 0.5629754702250163 minutes.

In [8]:
Y,allOutputs = nnet.use(X,allOutputs=True)

In [9]:
plt.figure(figsize=(10,10))
for i in range(0,36,2):
    plt.subplot(6,6,i+1)
    plt.imshow(-X[i,:].reshape((28,28)), interpolation='nearest', cmap='gray')
    plt.axis('off')
    plt.subplot(6,6,i+2)
    plt.imshow(-Y[i,:].reshape((28,28)), interpolation='nearest', cmap='gray')
    plt.axis('off')


Now we try training with mini-batches, a method often used for large data sets.

In [10]:
from IPython.display import display, clear_output

In [11]:
fig = plt.figure()

nnet = nn.NeuralNetwork(784, [1000, 500, 250, 30, 250, 500, 1000], 784)
errs = []
batchSize = 1000
for epoch in range(10):
    for firstInBatch in range(0, Xtrain.shape[0], batchSize):
        batch = Xtrain[firstInBatch:firstInBatch + batchSize, :]
        nnet.train(batch, batch, nIterations=5)
        print('epoch', epoch, 'firstInBatch', firstInBatch, nnet.getErrors()[-1])
    err = np.mean((nnet.use(Xtrain) - Xtrain)**2)
    errtest = np.mean((nnet.use(Xtest) - Xtest)**2)
    errs.append([err, errtest])
    fig.clf()
    plt.plot(np.array(errs))
    clear_output(wait=True)
    display(fig)
    print('epoch', epoch + 1, err, errtest)

clear_output(wait=True)
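
The loop above visits the batches in the same order every epoch. A common variant reshuffles the rows before each pass. The following is a minimal sketch of that idea on toy data; `minibatches` is a hypothetical helper written for this illustration and is not part of the `neuralnetworks` module.

```python
import numpy as np

def minibatches(X, batch_size, rng):
    """Yield row subsets of X that cover every row once, in shuffled order."""
    order = rng.permutation(X.shape[0])
    for first in range(0, X.shape[0], batch_size):
        yield X[order[first:first + batch_size], :]

rng = np.random.default_rng(42)
X_demo = np.arange(10 * 3, dtype=float).reshape((10, 3))  # 10 samples, 3 features

batches = list(minibatches(X_demo, 4, rng))
batch_shapes = [b.shape for b in batches]
# The last batch is smaller because 10 is not a multiple of 4.
print(batch_shapes)
```

In the training loop, each yielded batch would take the place of `batch`, with a fresh call to the generator at the start of every epoch.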

In [12]:
Y, allOutputs = nnet.use(X, allOutputs=True)

In [13]:
plt.figure(figsize=(10,10))
for i in range(0,36,2):
    plt.subplot(6,6,i+1)
    plt.imshow(-X[i,:].reshape((28,28)), interpolation='nearest', cmap='gray')
    plt.axis('off')
    plt.subplot(6,6,i+2)
    plt.imshow(-Y[i,:].reshape((28,28)), interpolation='nearest', cmap='gray')
    plt.axis('off')


Let's go back to smaller networks, with two units in the bottleneck layer. We can then plot where each image falls in the two-dimensional space defined by the outputs of the two units in the bottleneck layer.

In [14]:
nnet = nn.NeuralNetwork(784, [1000, 200, 20, 2, 20, 200, 1000], 784)

nnet.train(Xtrain, Xtrain, nIterations=200, verbose=True)
print('Training took', nnet.getTrainingTime()/60, 'minutes.')
plt.plot(nnet.getErrors())

SCG: Iteration 20 fValue 0.38762461903549567 Scale 0.00015724869177570787
SCG: Iteration 40 fValue 0.38196361973328197 Scale 2.999280772699506e-10
SCG: Iteration 60 fValue 0.38053489132259644 Scale 1e-15
SCG: Iteration 80 fValue 0.3792235420687086 Scale 1e-15
SCG: Iteration 100 fValue 0.3739787808435733 Scale 1e-15
SCG: Iteration 120 fValue 0.36958515053334867 Scale 1e-15
SCG: Iteration 140 fValue 0.3640891874510645 Scale 1e-15
SCG: Iteration 160 fValue 0.3607817166780881 Scale 1e-15
SCG: Iteration 180 fValue 0.3593245761757301 Scale 1e-15
SCG: Iteration 200 fValue 0.3581065313101273 Scale 1e-15
Training took 6.646376502513886 minutes.

Out[14]:
[<matplotlib.lines.Line2D at 0x7f4cdcff0780>]
In [15]:
Y,allOutputs = nnet.use(X,allOutputs=True)

In [16]:
plt.figure(figsize=(10,10))
for i in range(0,36,2):
    plt.subplot(6,6,i+1)
    plt.imshow(-X[i,:].reshape((28,28)), interpolation='nearest', cmap='gray')
    plt.axis('off')
    plt.subplot(6,6,i+2)
    plt.imshow(-Y[i,:].reshape((28,28)), interpolation='nearest', cmap='gray')
    plt.axis('off')

In [17]:
nnet

Out[17]:
NeuralNetwork(784, [1000, 200, 20, 2, 20, 200, 1000], 784)
Network was trained for 201 iterations that took 398.7826 seconds. Final error is 0.5984200291685826.
In [18]:
Y,allOutputs = nnet.use(Xtrain, allOutputs=True)
bottleNeck = allOutputs[3]  # bottleneck layer is index 3
plt.figure(figsize=(10,10))
plt.scatter(bottleNeck[:,0], bottleNeck[:, 1], c=Ttrain.flat, alpha=0.5)
plt.colorbar()

Out[18]:
<matplotlib.colorbar.Colorbar at 0x7f4cdd04dcf8>
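
To illustrate what indexing into the per-layer outputs means, here is a rough NumPy sketch of a forward pass that keeps every hidden layer's output. It is not the `neuralnetworks` implementation: the function `forward_all`, the tanh hidden units, the linear output layer, and the bias-in-first-row weight layout are all assumptions made for this example, and the weights are random rather than trained.

```python
import numpy as np

def forward_all(X, weights):
    """Return the final output and a list of all hidden-layer outputs.

    Assumes tanh hidden units, a linear output layer, and weight
    matrices whose first row holds the bias (illustrative choices only).
    """
    hidden_outputs = []
    a = X
    for W in weights[:-1]:
        a = np.tanh(a @ W[1:, :] + W[0:1, :])
        hidden_outputs.append(a)
    Y = a @ weights[-1][1:, :] + weights[-1][0:1, :]
    return Y, hidden_outputs

rng = np.random.default_rng(0)
sizes = [784, 20, 2, 20, 784]  # a much smaller network than the one above
weights = [rng.normal(scale=0.1, size=(n_in + 1, n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]

X_demo = rng.random((6, 784))
Y_demo, hidden = forward_all(X_demo, weights)
bottleneck = hidden[1]  # hidden layers are [20, 2, 20], so index 1 has 2 units
print(bottleneck.shape, Y_demo.shape)
```

With hidden layers `[1000, 200, 20, 2, 20, 200, 1000]`, the same counting puts the two-unit layer at index 3. Under the tanh assumption each bottleneck value lies strictly between -1 and 1.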
In [19]:
plt.figure(figsize=(10,10))
plt.xlim(-1,1)
plt.ylim(-1,1)
for i, txt in enumerate(Ttrain.flat):
    plt.annotate(txt, (bottleNeck[i, 0], bottleNeck[i, 1]))

In [20]:
plt.figure(figsize=(10,10))
plt.xlim(-1,1)
plt.ylim(-1,1)
for i, txt in enumerate(Ttrain.flat):
    plt.annotate(txt, (bottleNeck[i, 0], bottleNeck[i, 1]))

_,allOutputsTest = nnet.use(Xtest,allOutputs=True)
bottleTest = allOutputsTest[3]
for i, txt in enumerate(Ttest.flat):
    plt.annotate(txt, bottleTest[i,:], color='r');