# The Mathematical Engineering of Deep Learning

## Practical 1 (Julia version)

For an R or Python version see the course website.

In this practical we will carry out some basic exploratory data analysis (EDA) of popular machine-learning datasets that will be used in the course.

In [1]:
# Run these if you haven't installed the packages earlier.
# import Pkg
# Pkg.add(["MLDatasets", "Plots", "PyPlot", "StatsBase", "ImageCore"])
# Pkg.precompile() #optional - saves time on the first run


### Fashion MNIST

In [2]:
using MLDatasets
fashionMNISTtrain_x, fashionMNISTtrain_y = FashionMNIST.traindata()
fashionMNISTtest_x,  fashionMNISTtest_y  = FashionMNIST.testdata()
classNames = FashionMNIST.classnames()

Out[2]:
10-element Array{String,1}:
"T-Shirt"
"Trouser"
"Pullover"
"Dress"
"Coat"
"Sandal"
"Shirt"
"Sneaker"
"Bag"
"Ankle boot"
In [3]:
typeof(fashionMNISTtest_x)

Out[3]:
Base.ReinterpretArray{FixedPointNumbers.Normed{UInt8,8},3,UInt8,Array{UInt8,3}}
In [4]:
size(fashionMNISTtrain_x)

Out[4]:
(28, 28, 60000)
In [5]:
fashionMNISTtest_x[22,2,3535]

Out[5]:
0.02N0f8
In [6]:
typeof(fashionMNISTtest_x[24,2,3535])

Out[6]:
FixedPointNumbers.Normed{UInt8,8}
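The `N0f8` type is an 8-bit fixed-point number: the raw `UInt8` byte `b` is interpreted as the value `b/255`, so pixel values live in `[0, 1]` while still occupying one byte each. A small sketch of this interpretation (not part of the practical itself):

```julia
using FixedPointNumbers

x = reinterpret(N0f8, 0x05)      # interpret the raw byte 5 as a normalized value
@assert float(x) ≈ 5 / 255       # N0f8 stores b/255, so byte 5 is about 0.0196
@assert reinterpret(N0f8, 0xff) == N0f8(1.0)  # byte 255 maps to exactly 1.0
```

This is why the entry above prints as `0.02N0f8`: the stored byte divided by 255, rounded for display.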

### Here is the first image of the training set

In [7]:
using Plots; pyplot()
heatmap(fashionMNISTtrain_x[:,:,1]',yflip=true,legend=false,color=:greys)

Out[7]:
In [8]:
using ImageCore
FashionMNIST.convert2image(fashionMNISTtrain_x[:,:,1])

Out[8]:
In [9]:
fashionMNISTtrain_y[1]+1

Out[9]:
10
In [10]:
classNames[fashionMNISTtrain_y[1]+1] #The +1 is because the labels start at 0 and in Julia arrays start at 1

Out[10]:
"Ankle boot"

Task 1: Present the second image of the training set in the same way as done for the first.

In [11]:
#Solution:
heatmap(fashionMNISTtrain_x[:,:,2]',yflip=true,legend=false,color=:greys)

Out[11]:

### Let's see whether the data is balanced

In [12]:
using StatsBase

In [13]:
? counts

search: counts addcounts! codeunits countlines count_ones ncodeunits count_zeros


Out[13]:
counts(x, [wv::AbstractWeights])
counts(x, levels::UnitRange{<:Integer}, [wv::AbstractWeights])
counts(x, k::Integer, [wv::AbstractWeights])

Count the number of times each value in x occurs. If levels is provided, only values falling in that range will be considered (the others will be ignored without raising an error or a warning). If an integer k is provided, only values in the range 1:k will be considered.

If a weighting vector wv is specified, the sum of the weights is used rather than the raw counts.

The output is a vector of length length(levels).

In [14]:
counts([1,2,2,2,2,3,4,5])

Out[14]:
5-element Array{Int64,1}:
1
4
1
1
1
In [15]:
using StatsBase
counts(fashionMNISTtrain_y)

Out[15]:
10-element Array{Int64,1}:
6000
6000
6000
6000
6000
6000
6000
6000
6000
6000

Task 2: Do the same for the test set labels.

In [18]:
length(fashionMNISTtest_y)

Out[18]:
10000
In [19]:
counts(fashionMNISTtest_y)

Out[19]:
10-element Array{Int64,1}:
1000
1000
1000
1000
1000
1000
1000
1000
1000
1000

### Is it the same for MNIST?

In [20]:
? @show

Out[20]:
@show

Show an expression and result, returning the result. See also show.

In [21]:
@show 1+1.0

1 + 1.0 = 2.0

Out[21]:
2.0
In [22]:
using ImageCore
digitsMNISTtrain_x, digitsMNISTtrain_y = MNIST.traindata()
digitsMNISTtest_x,  digitsMNISTtest_y  = MNIST.testdata()
@show size(digitsMNISTtrain_x)
@show size(digitsMNISTtest_x)
@show digitsMNISTtrain_y[1]
MNIST.convert2image(digitsMNISTtrain_x[:,:,1])

size(digitsMNISTtrain_x) = (28, 28, 60000)
size(digitsMNISTtest_x) = (28, 28, 10000)
digitsMNISTtrain_y[1] = 5

Out[22]:
In [23]:
counts(digitsMNISTtrain_y)

Out[23]:
10-element Array{Int64,1}:
5923
6742
5958
6131
5842
5421
5918
6265
5851
5949
In [24]:
counts(digitsMNISTtest_y)

Out[24]:
10-element Array{Int64,1}:
980
1135
1032
1010
982
892
958
1028
974
1009

Answer: The class counts are close to balanced (close enough that we can treat the data as balanced).
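One way to quantify "close to balanced" is the ratio between the largest and smallest class counts. The `balance_ratio` helper below is our own, not part of the practical:

```julia
using StatsBase  # provides counts

# Ratio of the most to the least frequent class count; 1.0 means perfectly balanced.
balance_ratio(labels) = (c = counts(labels); maximum(c) / minimum(c))

balance_ratio([1, 1, 2, 2])   # 1.0: perfectly balanced
balance_ratio([1, 1, 1, 2])   # 3.0: one class is three times as frequent
```

Applied to `digitsMNISTtrain_y`, the counts above give a ratio of roughly 6742/5421 ≈ 1.24, small enough to call the data balanced.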

### A rough comparison of the training and test set

In [25]:
sum(digitsMNISTtrain_x[:,:,1])/(28*28)

Out[25]:
0.13768007202881152
In [26]:
mean(digitsMNISTtrain_x[:,:,1])

Out[26]:
0.13768007f0
In [27]:
using Statistics
methods(mean)

Out[27]:
# 10 methods for generic function mean:
In [28]:
[i^2 for i in 1:10] #comprehension

Out[28]:
10-element Array{Int64,1}:
1
4
9
16
25
36
49
64
81
100
In [29]:
[1,2,3] .== 3

Out[29]:
3-element BitArray{1}:
0
0
1
In [30]:
using Statistics
meansTrain = [mean(digitsMNISTtrain_x[:,:,digitsMNISTtrain_y .== k]) for k in 0:9]

Out[30]:
10-element Array{Float32,1}:
0.17339931
0.075998634
0.14897512
0.14153014
0.12136559
0.1287494
0.13730177
0.11452771
0.15015598
0.12258995
In [31]:
meansTest = [mean(digitsMNISTtest_x[:,:,digitsMNISTtest_y .== k]) for k in 0:9]

Out[31]:
10-element Array{Float32,1}:
0.17231035
0.07673749
0.15018502
0.14330529
0.1226675
0.13205391
0.14357606
0.11490118
0.1531272
0.12526645
In [32]:
? sort

search: sort sort! sortperm sortperm! sortslices Cshort issorted @shorthands


Out[32]:
sort(v; alg::Algorithm=defalg(v), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)

Variant of sort! that returns a sorted copy of v leaving v itself unmodified.

# Examples

jldoctest
julia> v = [3, 1, 2];

julia> sort(v)
3-element Array{Int64,1}:
1
2
3

julia> v
3-element Array{Int64,1}:
3
1
2

sort(A; dims::Integer, alg::Algorithm=DEFAULT_UNSTABLE, lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)

Sort a multidimensional array A along the given dimension. See sort! for a description of possible keyword arguments.

To sort slices of an array, refer to sortslices.

# Examples

jldoctest
julia> A = [4 3; 1 2]
2×2 Array{Int64,2}:
4  3
1  2

julia> sort(A, dims = 1)
2×2 Array{Int64,2}:
1  2
4  3

julia> sort(A, dims = 2)
2×2 Array{Int64,2}:
3  4
1  2
In [39]:
? sort!

search: sort! partialsort! sortperm! partialsortperm! setproperty! showgradient!


Out[39]:
sort!(v; alg::Algorithm=defalg(v), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)

Sort the vector v in place. QuickSort is used by default for numeric arrays while MergeSort is used for other arrays. You can specify an algorithm to use via the alg keyword (see Sorting Algorithms for available algorithms). The by keyword lets you provide a function that will be applied to each element before comparison; the lt keyword allows providing a custom "less than" function; use rev=true to reverse the sorting order. These options are independent and can be used together in all possible combinations: if both by and lt are specified, the lt function is applied to the result of the by function; rev=true reverses whatever ordering specified via the by and lt keywords.

# Examples

jldoctest
julia> v = [3, 1, 2]; sort!(v); v
3-element Array{Int64,1}:
1
2
3

julia> v = [3, 1, 2]; sort!(v, rev = true); v
3-element Array{Int64,1}:
3
2
1

julia> v = [(1, "c"), (3, "a"), (2, "b")]; sort!(v, by = x -> x[1]); v
3-element Array{Tuple{Int64,String},1}:
(1, "c")
(2, "b")
(3, "a")

julia> v = [(1, "c"), (3, "a"), (2, "b")]; sort!(v, by = x -> x[2]); v
3-element Array{Tuple{Int64,String},1}:
(3, "a")
(2, "b")
(1, "c")

sort!(A; dims::Integer, alg::Algorithm=defalg(A), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)

Sort the multidimensional array A along dimension dims. See sort! for a description of possible keyword arguments.

To sort slices of an array, refer to sortslices.

!!! compat "Julia 1.1" This function requires at least Julia 1.1.

# Examples

jldoctest
julia> A = [4 3; 1 2]
2×2 Array{Int64,2}:
4  3
1  2

julia> sort!(A, dims = 1); A
2×2 Array{Int64,2}:
1  2
4  3

julia> sort!(A, dims = 2); A
2×2 Array{Int64,2}:
1  2
3  4
In [33]:
plot(0:9,meansTrain,label="Train")
plot!(0:9,meansTest,label="Test",ylim=(0,0.2))

Out[33]:

Task 3: Do the same for Fashion MNIST

In [34]:
using Statistics, Plots; pyplot()
meansFashionTrain = [mean(fashionMNISTtrain_x[:,:,fashionMNISTtrain_y .== k]) for k in 0:9]
meansFashionTest = [mean(fashionMNISTtest_x[:,:,fashionMNISTtest_y .== k]) for k in 0:9]
plot(0:9,meansFashionTrain,label="Train")
plot!(0:9,meansFashionTest,label="Test",ylim=(0,0.5))

Out[34]:

### Some linear binary classification

In [35]:
positiveTrain = digitsMNISTtrain_x[:,:,digitsMNISTtrain_y .== 3] #select the images that are '3'
nPos = size(positiveTrain)[3]
negativeTrain = digitsMNISTtrain_x[:,:,digitsMNISTtrain_y .== 8] #select '8'
nNeg = size(negativeTrain)[3]
@show nPos, nNeg;

(nPos, nNeg) = (6131, 5851)

In [36]:
vec([1 2; 3 4])

Out[36]:
4-element Array{Int64,1}:
1
3
2
4
In [37]:
? vec

search: vec Vector VecOrMat VecElement cvec BitVector DenseVector DenseVecOrMat


Out[37]:
vec(a::AbstractArray) -> AbstractVector

Reshape the array a as a one-dimensional column vector. Return a if it is already an AbstractVector. The resulting array shares the same underlying data as a, so it will only be mutable if a is mutable, in which case modifying one will also modify the other.

# Examples

jldoctest
julia> a = [1 2 3; 4 5 6]
2×3 Array{Int64,2}:
1  2  3
4  5  6

julia> vec(a)
6-element Array{Int64,1}:
1
4
2
5
3
6

julia> vec(1:3)
1:3

See also reshape.

In [38]:
#making a vector out of an image
Float32.(vec(positiveTrain[:,:,1])) #. is the broadcast operator

Out[38]:
784-element Array{Float32,1}:
 0.0
 0.0
 ⋮
 0.0
 0.0
In [39]:
#A data matrix of positive samples
tempPos = vcat([Float32.(vec(positiveTrain[:,:,i]))' for i in 1:nPos]...)

Out[39]:
6131×784 Array{Float32,2}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 ⋮                        ⋮              ⋱                 ⋮
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
In [40]:
heatmap(tempPos)

Out[40]:
In [42]:
tempNeg = vcat([Float32.(vec(negativeTrain[:,:,i]))' for i in 1:nNeg]...)

Out[42]:
5851×784 Array{Float32,2}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 ⋮                        ⋮              ⋱                 ⋮
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
In [43]:
ones(nPos+nNeg) #This will be for the intercept (bias) term

Out[43]:
11982-element Array{Float64,1}:
 1.0
 1.0
 ⋮
 1.0
 1.0
In [44]:
#The dataMatrix (in statistics = design matrix)
A = hcat(ones(nPos+nNeg),vcat(tempPos,tempNeg))
@show size(A)
A

size(A) = (11982, 785)

Out[44]:
11982×785 Array{Float64,2}:
 1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 ⋮                        ⋮              ⋱            ⋮
 1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
In [45]:
heatmap(A)

Out[45]:
In [46]:
nPos

Out[46]:
6131
In [47]:
nNeg

Out[47]:
5851
In [48]:
y = vcat(fill(+1,nPos),fill(-1,nNeg)) #These are the labels

Out[48]:
11982-element Array{Int64,1}:
  1
  1
  ⋮
 -1
 -1

We now minimize $$||y - A \beta ||^2$$ with $$\hat{\beta} = A^\dagger y.$$
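The pseudoinverse solution can be sanity-checked on a tiny full-column-rank system, where it agrees with the normal-equations solution $(A^\top A)^{-1} A^\top y$ and with Julia's backslash solver. A minimal sketch (the small matrix below is our own illustration):

```julia
using LinearAlgebra

A = [1.0 0.0; 1.0 1.0; 1.0 2.0]   # tiny 3×2 design matrix (intercept + one feature)
y = [1.0, 2.0, 2.0]

β = pinv(A) * y                   # least squares via the pseudoinverse
@assert β ≈ (A'A) \ (A'y)         # normal equations give the same minimizer
@assert β ≈ A \ y                 # so does the QR-based backslash solver
```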

In [ ]:
ϵ  η  β

In [49]:
using LinearAlgebra
β = pinv(A)*y  #use \beta +[TAB] to get a β character

Out[49]:
785-element Array{Float64,1}:
  0.2987432873524592
  4.2556719302901376e-12
 -2.643873035393093e-11
  ⋮
  0.0
  0.0
In [50]:
plot(β,label = "β",xlabel = "Index",ylabel ="Value")

Out[50]:
In [51]:
#This is the actual classifier: it takes an image (matrix or vector) and returns -1 or +1
classify(x) = sign(β'*vcat(1,vec(x))) #a one-line Julia function

Out[51]:
classify (generic function with 1 method)
In [52]:
classify(positiveTrain[:,:,2]), classify(negativeTrain[:,:,2])

Out[52]:
(1.0, -1.0)
In [53]:
positiveTest = digitsMNISTtest_x[:,:,digitsMNISTtest_y .== 3]
nPosTest = size(positiveTest)[3]
negativeTest = digitsMNISTtest_x[:,:,digitsMNISTtest_y .== 8]
nNegTest = size(negativeTest)[3]
@show nPosTest, nNegTest;

(nPosTest, nNegTest) = (1010, 974)

In [54]:
truePositives = sum([classify(positiveTest[:,:,i]) .== +1 for i in 1:nPosTest])

Out[54]:
968
In [55]:
trueNegatives = sum([classify(negativeTest[:,:,i]) .== -1 for i in 1:nNegTest])

Out[55]:
934

Task 4: What is the accuracy? What is the precision and recall? What is the $F_1$ score?

Reminder:

$$\text{Precision} = \frac{\big|\text{true positive}\big|}{\big|\text{true positive}\big| + \big|\text{false positive}\big|}, \qquad \text{Recall} = \frac{\big|\text{true positive}\big|}{\big|\text{true positive}\big| + \big|\text{false negative}\big|}.$$
In [58]:
#Solution:
accuracy = (truePositives + trueNegatives)/(nPosTest+nNegTest)

Out[58]:
0.9586693548387096
In [60]:
falsePositives = nNegTest - trueNegatives

Out[60]:
40
In [61]:
falseNegatives = nPosTest - truePositives

Out[61]:
42
In [67]:
precision = truePositives/(truePositives + falsePositives)

Out[67]:
0.9603174603174603
In [68]:
recall = truePositives/(truePositives + falseNegatives)

Out[68]:
0.9584158415841584
In [66]:
F1 = 1/mean(1 ./ [precision,recall])

Out[66]:
0.9593657086223986
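The computations above can be wrapped into a single helper; `metrics` is our own name, and the counts plugged in below are the ones computed earlier in this practical:

```julia
using Statistics

# Accuracy, precision, recall and F1 from the four confusion-matrix counts.
function metrics(tp, fp, fn, tn)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    (accuracy  = (tp + tn) / (tp + fp + fn + tn),
     precision = precision,
     recall    = recall,
     F1        = 1 / mean(1 ./ [precision, recall]))  # harmonic mean, as above
end

metrics(968, 40, 42, 934)  # counts for the 3-vs-8 classifier above
```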

Task 5: Repeat the above to make a classifier that distinguishes between the digits 0 and 3.

In [87]:
positiveTrain = digitsMNISTtrain_x[:,:,digitsMNISTtrain_y .== 3] #select the images that are '3'
nPos = size(positiveTrain)[3]
negativeTrain = digitsMNISTtrain_x[:,:,digitsMNISTtrain_y .== 0] #select '0'
nNeg = size(negativeTrain)[3]
tempPos = vcat([Float32.(vec(positiveTrain[:,:,i]))' for i in 1:nPos]...)
tempNeg = vcat([Float32.(vec(negativeTrain[:,:,i]))' for i in 1:nNeg]...)
#The dataMatrix (in statistics = design matrix)
A = hcat(ones(nPos+nNeg),vcat(tempPos,tempNeg))
y = vcat(fill(+1,nPos),fill(-1,nNeg)) #These are the labels
β = pinv(A)*y  #use \beta +[TAB] to get a β character
classify(positiveTrain[:,:,2]), classify(negativeTrain[:,:,2])

positiveTest = digitsMNISTtest_x[:,:,digitsMNISTtest_y .== 3]
nPosTest = size(positiveTest)[3]
negativeTest = digitsMNISTtest_x[:,:,digitsMNISTtest_y .== 0]
nNegTest = size(negativeTest)[3]
@show nPosTest, nNegTest;
truePositives = sum([classify(positiveTest[:,:,i]) .== +1 for i in 1:nPosTest])
trueNegatives = sum([classify(negativeTest[:,:,i]) .== -1 for i in 1:nNegTest])
falsePositives = nNegTest - trueNegatives
falseNegatives = nPosTest - truePositives
confusionmatrix = [trueNegatives falsePositives;
falseNegatives truePositives]
display(confusionmatrix)
precision = truePositives/(truePositives + falsePositives)
recall = truePositives/(truePositives + falseNegatives)
F1 = 1/mean(1 ./ [precision,recall])

2×2 Array{Int64,2}:
 977     3
  10  1000
(nPosTest, nNegTest) = (1010, 980)

Out[87]:
0.9935419771485345

### CIFAR10

In [88]:
CIFAR10train_x, CIFAR10train_y = CIFAR10.traindata()
CIFAR10test_x,  CIFAR10test_y  = CIFAR10.testdata()
classNames = CIFAR10.classnames()

Out[88]:
10-element Array{String,1}:
"airplane"
"automobile"
"bird"
"cat"
"deer"
"dog"
"frog"
"horse"
"ship"
"truck"
In [89]:
counts(CIFAR10train_y)

Out[89]:
10-element Array{Int64,1}:
5000
5000
5000
5000
5000
5000
5000
5000
5000
5000
In [90]:
size(CIFAR10train_x) #It is a 4-tensor

Out[90]:
(32, 32, 3, 50000)
In [91]:
#The first image
CIFAR10train_x[:,:,:,1]

Out[91]:
32×32×3 Array{N0f8,3} with eltype Normed{UInt8,8}:
[:, :, 1] =
 0.231  0.063  0.098  0.129  0.196  …  0.847  0.863  0.816  0.706  0.694
 ⋮                                  ⋱                       ⋮
 0.58   0.478  0.427  0.369  0.263     0.278  0.129  0.208  0.325  0.482

[:, :, 2] =
 0.243  0.078  0.094  0.098  0.125  …  0.682  0.714  0.667  0.545  0.565
 ⋮                                  ⋱                       ⋮
 0.486  0.341  0.286  0.243  0.165     0.188  0.075  0.133  0.208  0.361

[:, :, 3] =
 0.247  0.078  0.082  0.067  0.082  …  0.341  0.357  0.376  0.376  0.455
 ⋮                                  ⋱                       ⋮
 0.404  0.224  0.165  0.137  0.098     0.102  0.035  0.078  0.133  0.282
In [92]:
CIFAR10.convert2image(CIFAR10train_x[:,:,:,1])

Out[92]:
In [93]:
CIFAR10train_y[1]

Out[93]:
6
In [94]:
classNames[CIFAR10train_y[1]+1]

Out[94]:
"frog"
In [95]:
plot(CIFAR10.convert2image(CIFAR10train_x[:,:,:,1]),size=(120,120),ticks=false)

Out[95]:
In [96]:
pR = heatmap(CIFAR10train_x[:,:,1,1]',yflip=true,color=:reds,legend=false,size=(150,150),label="Red")
pG = heatmap(CIFAR10train_x[:,:,2,1]',yflip=true,color=:greens,legend=false,size=(150,150))
pB = heatmap(CIFAR10train_x[:,:,3,1]',yflip=true,color=:blues,legend=false,size=(150,150))
plot(pR,pG,pB,layout=(3,1),size=(150,450))

Out[96]:
In [97]:
first5Frogs = CIFAR10train_x[:,:,:,CIFAR10train_y .== 6][:,:,:,1:5]; #label 6 is "frog"
size(first5Frogs)

Out[97]:
(32, 32, 3, 5)
In [98]:
anim = Animation()

for frog in 1:5
plot(CIFAR10.convert2image(first5Frogs[:,:,:,frog]),size=(120,120),ticks=false)
annotate!(16, 16, text("Frog", :center, :center, 25,:red))
end

cd(@__DIR__)
gif(anim, "frogs.gif", fps = 2) #saves the animation with 2 frames per second

┌ Info: Saved animation to
│   fn = /Users/uqjnazar/Dropbox/DeepLearning/MathematicalEngineeringDeepLearning/Practical_Unit1/frogs.gif
└ @ Plots /Users/uqjnazar/.julia/packages/Plots/8GUYs/src/animation.jl:102

Out[98]:

Task 6: Create an animation that goes through the first 100 images and presents the label of each image as red text.

In [102]:
anim = Animation()

for i in 1:100
    plot(CIFAR10.convert2image(CIFAR10train_x[:,:,:,i]),size=(120,120),ticks=false)
    annotate!(16, 16, text(classNames[CIFAR10train_y[i]+1], :center, :center, 20,:red))
    frame(anim) #record each labelled image as a frame
end

cd(@__DIR__)
gif(anim, "labelled100.gif", fps = 2) #"labelled100.gif" is a placeholder file name

┌ Info: Saved animation to