This tutorial gives a brief introduction to using a CNN for training and prediction (i.e. inference).
using MLDatasets
using NumNN
using Plots
gr()
Uncomment the following line if you are running this code for the first time:
# ] add https://github.com/timholy/ProgressMeter.jl.git ;
using ProgressMeter
ProgressMeter.ijulia_behavior(:clear);
X_train, Y_train = FashionMNIST.traindata(Float64);
X_test, Y_test = FashionMNIST.testdata(Float64);
Since the shape of the MNIST data is (28, 28, size), it must be reshaped into a 4D array
before it can be used with the 2D CNN:
X_train = reshape(X_train, (size(X_train)[1:2]..., 1, size(X_train)[end]))
X_test = reshape(X_test, (size(X_test)[1:2]...,1,size(X_test)[end]))
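The reshape above only inserts a singleton channel dimension; the data itself is untouched. A minimal sketch with a small dummy array standing in for the real data:

```julia
# Dummy stand-in for the (28, 28, N) grayscale data: 5 images of 28×28.
A = rand(28, 28, 5)

# Insert a singleton channel dimension, exactly as done for X_train/X_test.
B = reshape(A, (size(A)[1:2]..., 1, size(A)[end]))

@show size(B)   # (28, 28, 1, 5)
```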
Y_train = oneHot(Y_train)
Y_test = oneHot(Y_test);
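The `oneHot` call converts the integer labels into one-hot vectors. A plain-Julia sketch of the idea (assuming, as is common, a classes × instances layout; NumNN's actual output layout may differ):

```julia
# One-hot encode 0-based labels (FashionMNIST labels are 0–9) into a
# nclasses × ninstances matrix with a single 1.0 per column.
function onehot_sketch(labels::Vector{Int}, nclasses::Int)
    out = zeros(nclasses, length(labels))
    for (j, y) in enumerate(labels)
        out[y + 1, j] = 1.0   # shift 0-based label to 1-based row index
    end
    return out
end

onehot_sketch([0, 2, 9], 10)   # 10×3 matrix, one 1.0 per column
```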
X_Input = Input(X_train)
X = Conv2D(10, (3,3))(X_Input)
X = BatchNorm(dim=3)(X) #to normalize across the channels
X = Activation(:relu)(X)
X = MaxPool2D((2,2))(X)
X = Conv2D(20, (5,5))(X)
X = BatchNorm(dim=3)(X)
X = Activation(:relu)(X)
X = AveragePool2D((3,3))(X)
X = Flatten()(X)
X_Output = FCLayer(10, :softmax)(X);
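It can help to trace the feature-map sizes through this stack. Assuming "valid" convolutions with stride 1 and non-overlapping pooling (stride equal to the window, both common defaults; NumNN's actual defaults may differ), the spatial size shrinks 28 → 26 → 13 → 9 → 3, and `Flatten` feeds 3·3·20 = 180 features into the final `FCLayer`:

```julia
# Size of a "valid" convolution output (stride 1).
conv_out(n, k) = n - k + 1
# Size of a non-overlapping pooling output (stride = window).
pool_out(n, k) = n ÷ k

h = 28
h = conv_out(h, 3)   # Conv2D(10, (3,3))     → 26
h = pool_out(h, 2)   # MaxPool2D((2,2))      → 13
h = conv_out(h, 5)   # Conv2D(20, (5,5))     → 9
h = pool_out(h, 3)   # AveragePool2D((3,3))  → 3
flat = h * h * 20    # Flatten → 180 features into FCLayer(10, :softmax)

@show h, flat   # (3, 180)
```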
Another way, when there are no side branches, is to use the chain
function as follows:
X_Input, X_Output = chain(X_train,[Conv2D(10, (3,3)),
BatchNorm(dim=3),
Activation(:relu),
MaxPool2D((2,2)),
Conv2D(20, (5,5)),
BatchNorm(dim=3),
Activation(:relu),
AveragePool2D((3,3)),
Flatten(),
FCLayer(10,:softmax)]);
chain returns a Tuple of two pointers: the Input Layer and the Output Layer.
This will also initialize the layers' parameters.
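Conceptually, a chain helper just threads the input through the layers in order and hands back both ends of the pipeline. A toy sketch of that idea (this is not NumNN's implementation; real layers operate on tensors, not numbers):

```julia
# Thread x through a list of callables, returning (input, output) —
# a toy analogue of chain's (input layer, output layer) Tuple.
function chain_sketch(x, layers)
    input = x
    for layer in layers
        x = layer(x)   # each "layer" is just a function here
    end
    return input, x
end

# Toy layers acting on a scalar instead of a tensor:
inp, out = chain_sketch(2.0, [x -> x + 1, x -> 3x])
@show inp, out   # (2.0, 9.0)
```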
model = Model(X_train,Y_train,X_Input,X_Output, 0.005; optimizer=:adam);
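The `Model` here is configured with the Adam optimizer and a learning rate of 0.005. For reference, a sketch of one standard Adam update step (the textbook algorithm with its usual default hyperparameters; NumNN's internal implementation may differ in details):

```julia
# One Adam step: update parameters θ in place from gradient g, using
# first/second moment buffers m and v and the step counter t.
function adam_step!(θ, g, m, v, t; α=0.005, β1=0.9, β2=0.999, ϵ=1e-8)
    @. m = β1 * m + (1 - β1) * g        # biased first moment
    @. v = β2 * v + (1 - β2) * g^2      # biased second moment
    m̂ = m ./ (1 - β1^t)                 # bias-corrected moments
    v̂ = v ./ (1 - β2^t)
    @. θ -= α * m̂ / (sqrt(v̂) + ϵ)      # parameter update
    return θ
end

θ = [1.0, -2.0]; g = [0.5, -0.5]
m = zeros(2); v = zeros(2)
adam_step!(θ, g, m, v, 1)   # first step moves each weight by ≈ α
```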
Use predict to see the current accuracy:
TestP = predict(model, X_test, Y_test);
println()
println("The accuracy of Test Data before the training process $(round(TestP[:accuracy], digits=4))")
println("The cost of Test Data before the training process $(round(TestP[:cost], digits=4))")
Progress: 100%|█████████████████████████████████████████| Time: 0:00:15 Instances 10000: 10000
The accuracy of Test Data before the training process 0.1278
The cost of Test Data before the training process 2.4328
TrainP = predict(model, X_train, Y_train);
println()
println("The accuracy of Train Data before the training process $(round(TrainP[:accuracy], digits=4))")
println("The cost of Train Data before the training process $(round(TrainP[:cost], digits=4))")
Progress: 100%|█████████████████████████████████████████| Time: 0:00:22 Instances 60000: 60000
The accuracy of Train Data before the training process 0.1287
The cost of Train Data before the training process 2.4326
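These pre-training numbers are what one would expect from an untrained 10-class classifier: the accuracy sits near random guessing (0.1), and the cost sits near the cross-entropy of a uniform prediction, −log(1/10) ≈ 2.30 (slightly above it here, since randomly initialized weights are not exactly uniform). A quick check:

```julia
# Cross-entropy of a uniform 10-class prediction against any one-hot label.
nclasses = 10
uniform_cost = -log(1 / nclasses)
@show uniform_cost   # ≈ 2.3026, close to the observed ≈ 2.43
```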
TrainD = train(X_train, Y_train, model, 10);# testData = X_test, testLabels = Y_test);
Progress: 100%|█████████████████████████████████████████| Time: 0:16:24 Epoch 10: 10 Instances 60000: 60000 Train Cost: 0.2496 Train Accuracy: 0.9099
The train function provides extra kwargs to pass test data/labels, so that the cost and accuracy on them are also computed during each training epoch.
Note: this will take extra time during training.
Instead, it can be used as follows:
TrainD = train(X_train, Y_train, model, 10)
plot(1:10, TrainD[:trainAccuracies], label="Training Accuracies")
plot!(1:10, TrainD[:trainCosts], label="Training Costs")
# plot!(1:10, TrainD[:testAccuracies], label="Test Accuracies")
# plot!(1:10, TrainD[:testCosts], label="Test Costs")
xlabel!("Epochs")
TrainP = predict(model, X_train, Y_train);
println()
println("The accuracy of Train Data after the training process $(round(TrainP[:accuracy], digits=4))")
println("The cost of Train Data after the training process $(round(TrainP[:cost], digits=4))")
Progress: 100%|█████████████████████████████████████████| Time: 0:00:22 Instances 60000: 60000
The accuracy of Train Data after the training process 0.902
The cost of Train Data after the training process 0.275
TestP = predict(model, X_test, Y_test);
println()
println("The accuracy of Test Data after the training process $(round(TestP[:accuracy], digits=4))")
println("The cost of Test Data after the training process $(round(TestP[:cost], digits=4))")
Progress: 100%|█████████████████████████████████████████| Time: 0:00:03 Instances 10000: 10000
The accuracy of Test Data after the training process 0.8857
The cost of Test Data after the training process 0.342