Deep Learning from scratch

This is an implementation of a neural network from scratch, without using any machine-learning libraries. Only numpy is used for the network itself, and matplotlib for plotting the results.

Objective: The objective of this exercise is to understand what different layers learn, how different activation functions affect the learning rate, and, importantly, what individual neurons learn with different activation functions.

Features

The implementation includes the following:

  • Optimization: Gradient Descent, Momentum, RMSprop, Adam (RMSprop + Momentum; see the sketch after this list)

  • Regularization: L2 penalization, Dropout

  • Activation Functions: Sigmoid, Tanh, ReLU, Leaky ReLU, Softmax

  • Datasets: Two-class datasets (Gaussian, Linear, Moons, Spiral, Sinusoidal) and multiclass (Gaussian-distributed data with up to 9 classes)
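
As a reference for the optimization bullet above, here is a minimal sketch of a single Adam update step, written independently of the DeepNet code; the names (W, dW, vdW, sdW, adam_step) are illustrative, not taken from the library. Adam simply combines the Momentum and RMSprop running averages:

import numpy as np

def adam_step(W, dW, vdW, sdW, t, alpha=0.01, b1=0.9, b2=0.99, eps=1e-8):
    """One Adam parameter update = Momentum (1st moment) + RMSprop (2nd moment)."""
    vdW = b1*vdW + (1 - b1)*dW              # Momentum: running average of the gradients
    sdW = b2*sdW + (1 - b2)*(dW**2)         # RMSprop: running average of squared gradients
    vdW_hat = vdW / (1 - b1**t)             # bias correction for the first iterations (t >= 1)
    sdW_hat = sdW / (1 - b2**t)
    W = W - alpha * vdW_hat / (np.sqrt(sdW_hat) + eps)
    return W, vdW, sdW

Setting b1 = 0 recovers plain RMSprop, dropping the second-moment scaling recovers Momentum, and dropping both recovers plain gradient descent. The default b1, b2 above match the B1, B2 values (0.9, 0.99) printed by deepNet later in this notebook.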

All you need to import

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook

# DL library (code included)
from DeepNet import deepNet

# Toy Datasets (simulated) 
import DataSet as ds

# Other datasets
from sklearn import datasets

Toy examples

Example 1 (Moons, 2 classes)

Data

In [31]:
dtype = ['MOONS','GAUSSIANS','LINEAR','SINUSOIDAL','SPIRAL']

# Moons data
#Training: N=100 examples and no noise
Xr, yr,_ = ds.create_dataset(100, dtype[0],noise=0.0,varargin = 'PRESET');

#Testing: N=100 examples and 10% noise
Xs, ys,_ = ds.create_dataset(100, dtype[0],noise=0.1,varargin = 'PRESET');

print(Xr.shape, yr.shape,Xs.shape, ys.shape)
print('#Features: ',Xr.shape[0])
print('#Examples: ',Xr.shape[1])
(2, 100) (1, 100) (2, 100) (1, 100)
#Features:  2
#Examples:  100
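
To eyeball the generated data before training, a quick scatter plot can be made with matplotlib (a small sketch, assuming the labels in yr/ys are 0/1 values in a (1, N) array, as the shapes above suggest):

fig = plt.figure(figsize=(8,3))
for k, (X_, y_, title) in enumerate([(Xr, yr, 'Training (no noise)'),
                                     (Xs, ys, 'Testing (10% noise)')]):
    plt.subplot(1, 2, k+1)
    # each column of X_ is one example; color points by their class label
    plt.scatter(X_[0,:], X_[1,:], c=y_.ravel(), cmap='bwr', s=10)
    plt.title(title)
plt.tight_layout()
plt.show()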

Neural Network :: Hidden Layers : [3,4]

In [32]:
NN = deepNet(X=Xr,y=yr,Xts=Xs, yts=ys, Net = [3,4],NetAf =['tanh'], alpha=0.01,
             miniBatchSize = 0.3,printCostAt =20,AdamOpt=True,lambd=0,keepProb =[1.0])
#Classes   :  2
#Features  :  2
#Examples  :  100
Network    :  [2, 3, 4, 1]
ActiFun    :  ['tanh', 'tanh', 'sig']
keepProb   :  [1.0, 1.0, 1.0, 1.0]

Training and plotting

In [33]:
%matplotlib notebook
In [34]:
fig1=plt.figure(1,figsize=(8,4))
fig2=plt.figure(2,figsize=(8,5))

for i in range(20):         ## 20 times
    NN.fit(itr=10)          ## itr=10 iterations each time
    NN.PlotLCurve(pause=0)
    fig1.canvas.draw()
    NN.PlotBoundries(Layers=True,pause=0)
    fig2.canvas.draw()
    
NN.PlotLCurve()
NN.PlotBoundries(Layers=True)

print(NN)

yri,yrp = NN.predict(Xr)
ysi,ysp = NN.predict(Xs)

print('Training Accuracy ::',100*np.sum(yri==yr)/yri.shape[1])
print('Testing  Accuracy ::',100*np.sum(ysi==ys)/ysi.shape[1])
Epoc @ 20 : Training Cost 5.451194e-01  Testing Cost 6.252184e-01
Epoc @ 40 : Training Cost 5.325943e-01  Testing Cost 5.949402e-01
Epoc @ 60 : Training Cost 5.221049e-01  Testing Cost 5.588211e-01
Epoc @ 80 : Training Cost 5.134952e-01  Testing Cost 5.515395e-01
Epoc @ 100 : Training Cost 4.883630e-01  Testing Cost 5.397175e-01
Epoc @ 120 : Training Cost 4.773322e-01  Testing Cost 5.394530e-01
Epoc @ 140 : Training Cost 4.499698e-01  Testing Cost 5.214815e-01
Epoc @ 160 : Training Cost 4.343921e-01  Testing Cost 4.811901e-01
Epoc @ 180 : Training Cost 3.963799e-01  Testing Cost 4.621540e-01
Epoc @ 200 : Training Cost 3.614711e-01  Testing Cost 4.280268e-01
-------------Info---------------
#Classes   :  2
#Features  :  2
#Examples  :  100
Network    :  [2, 3, 4, 1]
ActiFun    :  ['tanh', 'tanh', 'sig']
keepProb   :  [1.0, 1.0, 1.0, 1.0]
Alpha      :  0.01
B1, B2     :  0.9 0.99
lambd      :  0
AdamOpt    :  True
---------------------------
Training Accuracy :: 85.0
Testing  Accuracy :: 80.0
In [35]:
plt.close(fig1)
plt.close(fig2)

Example 2 (Sinusoidal, 2 classes)

Data

In [7]:
dtype = ['MOONS','GAUSSIANS','LINEAR','SINUSOIDAL','SPIRAL']

#Training: N=200 examples and no noise
Xr, yr,_ = ds.create_dataset(200, dtype[3],0.0,varargin = 'PRESET');

#Testing: N=200 examples and 10% noise
Xs, ys,_ = ds.create_dataset(200, dtype[3],0.1,varargin = 'PRESET');

print(Xr.shape, yr.shape,Xs.shape, ys.shape)
print('#Features: ',Xr.shape[0])
print('#Examples: ',Xr.shape[1])
(2, 200) (1, 200) (2, 200) (1, 200)
#Features:  2
#Examples:  200

Neural Network :: Hidden Layers : [8,8,5]

In [8]:
NN = deepNet(X=Xr,y=yr,Xts=Xs, yts=ys, Net = [8,8,5],NetAf =['tanh'], alpha=0.01,
             miniBatchSize = 0.3, printCostAt =100, AdamOpt=True,lambd=0,keepProb =[1.0])
#Classes   :  2
#Features  :  2
#Examples  :  200
Network    :  [2, 8, 8, 5, 1]
ActiFun    :  ['tanh', 'tanh', 'tanh', 'sig']
keepProb   :  [1.0, 1.0, 1.0, 1.0, 1.0]

Training and plotting

In [10]:
%matplotlib notebook
In [9]:
plt.close(fig1)
plt.close(fig2)
fig1=plt.figure(1,figsize=(8,4))
fig2=plt.figure(2,figsize=(8,5))

for i in range(20):         ## 20 times
    NN.fit(itr=10)          ## itr=10 iterations each time
    NN.PlotLCurve(pause=0)
    fig1.canvas.draw()
    NN.PlotBoundries(Layers=True,pause=0)
    fig2.canvas.draw()
    
NN.PlotLCurve()
NN.PlotBoundries(Layers=True)

print(NN)

yri,yrp = NN.predict(Xr)
ysi,ysp = NN.predict(Xs)

print('Training Accuracy ::',100*np.sum(yri==yr)/yri.shape[1])
print('Testing  Accuracy ::',100*np.sum(ysi==ys)/ysi.shape[1])
Epoc @ 100 : Training Cost 1.819583e-01  Testing Cost 6.126728e-01
Epoc @ 200 : Training Cost 9.083479e-02  Testing Cost 6.960079e-01
-------------Info---------------
#Classes   :  2
#Features  :  2
#Examples  :  200
Network    :  [2, 8, 8, 5, 1]
ActiFun    :  ['tanh', 'tanh', 'tanh', 'sig']
keepProb   :  [1.0, 1.0, 1.0, 1.0, 1.0]
Alpha      :  0.01
B1, B2     :  0.9 0.99
lambd      :  0
AdamOpt    :  True
---------------------------
Training Accuracy :: 97.0
Testing  Accuracy :: 86.0
In [10]:
plt.close(fig1)
plt.close(fig2)

Example 3 (Gaussian, 4 classes)

Data (70-30 split)

In [11]:
# Generate 4-class Gaussian data; shapes follow the (features, examples) layout
X, y = ds.mclassGaus(N=500, nClasses=4, var=0.25, ShowPlot=False)

[n, N] = X.shape

# Shuffle example indices and keep 70% for training, 30% for testing
r = np.random.permutation(N)

split = int(0.7*N)

Xr = X[:, r[:split]]
yr = y[:, r[:split]]
Xs = X[:, r[split:]]
ys = y[:, r[split:]]

print(Xr.shape, yr.shape,Xs.shape,ys.shape)

print('#Features: ',Xr.shape[0])
print('#Examples: ',Xr.shape[1])
(2, 2000) (1, 2000)
(2, 1400) (1, 1400) (2, 600) (1, 600)
#Features:  2
#Examples:  1400

Neural Network :: Hidden Layers : [8,8,5]

In [12]:
NN = deepNet(X=Xr,y=yr,Xts=Xs, yts=ys, Net = [8,8,5],NetAf =['tanh'], alpha=0.01,
             miniBatchSize = 0.3,printCostAt =-1,AdamOpt=True,lambd=0,keepProb =[1.0])
1 1400 4
1 600 4
#Classes   :  4
#Features  :  2
#Examples  :  1400
Network    :  [2, 8, 8, 5, 4]
ActiFun    :  ['tanh', 'tanh', 'tanh', 'softmax']
keepProb   :  [1.0, 1.0, 1.0, 1.0, 1.0]
In [13]:
plt.close(fig1)
plt.close(fig2)
fig1=plt.figure(1,figsize=(8,4))
fig2=plt.figure(2,figsize=(8,5))

for i in range(20):         ## 20 times
    NN.fit(itr=10)          ## itr=10 iterations each time
    NN.PlotLCurve(pause=0)
    fig1.canvas.draw()
    NN.PlotBoundries(Layers=True,pause=0)
    fig2.canvas.draw()
    
NN.PlotLCurve()
NN.PlotBoundries(Layers=True)

print(NN)

yri,yrp = NN.predict(Xr)
ysi,ysp = NN.predict(Xs)

print('Training Accuracy ::',100*np.sum(yri==yr)/yri.shape[1])
print('Testing  Accuracy ::',100*np.sum(ysi==ys)/ysi.shape[1])
-------------Info---------------
#Classes   :  4
#Features  :  2
#Examples  :  1400
Network    :  [2, 8, 8, 5, 4]
ActiFun    :  ['tanh', 'tanh', 'tanh', 'softmax']
keepProb   :  [1.0, 1.0, 1.0, 1.0, 1.0]
Alpha      :  0.01
B1, B2     :  0.9 0.99
lambd      :  0
AdamOpt    :  True
---------------------------
Training Accuracy :: 97.07142857142857
Testing  Accuracy :: 97.66666666666667
In [14]:
plt.close(fig1)
plt.close(fig2)

Repeating Example 3 with ReLU activation

In [15]:
print(Xr.shape, yr.shape,Xs.shape,ys.shape)

print('#Features: ',Xr.shape[0])
print('#Examples: ',Xr.shape[1])

NN = deepNet(X=Xr,y=yr,Xts=Xs, yts=ys, Net = [8,8,5],NetAf =['relu'], alpha=0.01,
             miniBatchSize = 0.3,printCostAt =-1,AdamOpt=True,lambd=0,keepProb =[1.0])



plt.close(fig1)
plt.close(fig2)
fig1=plt.figure(1,figsize=(8,4))
fig2=plt.figure(2,figsize=(8,5))

for i in range(20):         ## 20 times
    NN.fit(itr=10)          ## itr=10 iterations each time
    NN.PlotLCurve(pause=0)
    fig1.canvas.draw()
    NN.PlotBoundries(Layers=True,pause=0)
    fig2.canvas.draw()
    
NN.PlotLCurve()
NN.PlotBoundries(Layers=True)

print(NN)

yri,yrp = NN.predict(Xr)
ysi,ysp = NN.predict(Xs)

print('Training Accuracy ::',100*np.sum(yri==yr)/yri.shape[1])
print('Testing  Accuracy ::',100*np.sum(ysi==ys)/ysi.shape[1])
(2, 1400) (1, 1400) (2, 600) (1, 600)
#Features:  2
#Examples:  1400
1 1400 4
1 600 4
#Classes   :  4
#Features  :  2
#Examples  :  1400
Network    :  [2, 8, 8, 5, 4]
ActiFun    :  ['relu', 'relu', 'relu', 'softmax']
keepProb   :  [1.0, 1.0, 1.0, 1.0, 1.0]
-------------Info---------------
#Classes   :  4
#Features  :  2
#Examples  :  1400
Network    :  [2, 8, 8, 5, 4]
ActiFun    :  ['relu', 'relu', 'relu', 'softmax']
keepProb   :  [1.0, 1.0, 1.0, 1.0, 1.0]
Alpha      :  0.01
B1, B2     :  0.9 0.99
lambd      :  0
AdamOpt    :  True
---------------------------
Training Accuracy :: 97.42857142857143
Testing  Accuracy :: 97.5
In [16]:
plt.close(fig1)
plt.close(fig2)

Real-world examples

Handwritten digits dataset (scikit-learn digits, 10 classes)

In [36]:
Xy= datasets.load_digits()
X = Xy['data']
y = Xy['target']
print(X.shape, y.shape)
(1797, 64) (1797,)
In [37]:
fig=plt.figure(1,figsize=(10,1))
for i in range(10):
    plt.subplot(1,10,i+1)
    plt.imshow(X[i].reshape([8,8]),cmap='gray',aspect='auto')
    plt.title('y :' + str(y[i]))
    plt.axis('off')
plt.subplots_adjust(top=0.8,wspace=0.12, hspace=0)
plt.show()
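
The notebook stops here, but the digits data could be fed to the same deepNet class. One caveat: scikit-learn returns examples in rows, while the examples above use a (features, examples) layout with a (1, examples) label array, so the data needs to be transposed and split first. A rough sketch, reusing the constructor arguments from the toy examples; the hidden-layer sizes [32, 16] and the division by 16 (the digits pixel range) are choices made here, not from the original:

# Assumption: deepNet expects X as (features, examples) and y as (1, examples),
# as in the toy examples above; scikit-learn gives examples in rows, so transpose.
Xd = X.T/16.0                      # (64, 1797), pixel values scaled from [0, 16] to [0, 1]
yd = y.reshape(1, -1)              # (1, 1797)

r = np.random.permutation(Xd.shape[1])
split = int(0.7*Xd.shape[1])       # 70-30 split, as in Example 3
Xr, yr = Xd[:, r[:split]], yd[:, r[:split]]
Xs, ys = Xd[:, r[split:]], yd[:, r[split:]]

NN = deepNet(X=Xr, y=yr, Xts=Xs, yts=ys, Net=[32, 16], NetAf=['relu'], alpha=0.01,
             miniBatchSize=0.3, printCostAt=-1, AdamOpt=True, lambd=0, keepProb=[1.0])
NN.fit(itr=100)

yri, yrp = NN.predict(Xr)
ysi, ysp = NN.predict(Xs)
print('Training Accuracy ::', 100*np.sum(yri==yr)/yri.shape[1])
print('Testing  Accuracy ::', 100*np.sum(ysi==ys)/ysi.shape[1])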