Logistic Regression

Logistic Regression is a simple classification technique that analyzes the relationship between a quantitative variable x and a dichotomous categorical variable y. Similar to linear regression, this relationship is inferred by applying a linear transformation to the data

$$ Y = Xw $$

Binary-Cross-Entropy

However, Logistic Regression is unique in the way that it learns relationships. Given that we have to classify y given x, a natural choice of performance/criterion is the Binary Cross Entropy Loss.

In short, Cross Entropy is a popular choice of option to judge classification task models whose output is a probability mass function. It is computed by independently applying below function to each prediction

$$ L(\hat{y},y)=y\cdot -log(\hat{y})^T $$

And its binary re-formulation

$$ L(\hat{y},y)=y\cdot-log(\hat{y})+(1-y)\cdot(-log(1-\hat{y})) $$

In general, Cross-Entropy loss exponentially increases as our predictions diverge from the truth; conversely, as our predictions become infinitely close to our target, the loss equates to zero.

Below graph models Binary-Cross-Entropy when target is either 1 or 0 at different levels of prediction

image-20200508081156737

We can derive the gradient of our Loss function w.r.t. our prediction for the backward pass by using some simple Calculus

$$ \begin{split}\frac{\partial L(\hat{y},y)}{\partial \hat{y}} & =\frac{\partial}{\hat{y}}(y\cdot-log(\hat{y})) + \frac{\partial}{\hat{y}}(1-y)\cdot(-log(1-\hat{y})) \\& =\frac{-y}{\hat{y}} - (\frac{1-y}{1-\hat{y}}* -1) \\& = \frac{-y}{\hat{y}} + \frac{1-y}{1-\hat{y}}\end{split} $$

Given that the Loss function is usually the last forward operation, it also becomes the first gradient we need to compute for the backward pass. For this reason, there is no incoming gradient that we need to worry about integrating.

However, instead of calculating a backward pass of each prediction w.r.t. our weight parameters, we usually take the mean prediction confidence as a way to better gauge our model's performance.

In this case, we will be calculating below gradients

$$ \frac{\partial }{\partial \hat{w}}avg(L(\hat{y},y)) $$

Once our loss has been computed, we follow the general DL procedure of taking a "step" towards steepest descent by computing the gradient of our Loss/criterion function w.r.t. weight parameters

$$ w_j=w_j-\alpha\frac{\partial }{\partial w_j}L(w_j) $$

Sigmoid

In order for us to use the Binary-Cross-Entropy as our criterion, we must first ensure that our model's output ranges between [0,1]. Of coarse, once rounded, this ensures our binary prediction:

  • 1 = Yes
  • 0 = No

We will do this by applying a Sigmoid layer before feeding our inputs to the Loss function.

A sigmoid layers is an activation function that "squeezes" all our inputs to a range between [0,1] by applying below function

$$ \sigma(y)=\frac{1}{1+e^{-y}} $$

Sigmoid function - Wikipedia

One distinct property of the Sigmoid function is that its derivative can be calculated by a simple re-formulation of its forward operation

$$ \frac{\partial \sigma}{\partial y} = \sigma(y)(1-\sigma(y)) $$

Given that activation function is applied independently to each element, its derivative function will be equivalent for all inputs.

$$ \sigma(y) = \sigma\begin{pmatrix}y_1 & y_2 &y_3\end{pmatrix} = \begin{pmatrix}\sigma(y_1) & \sigma(y_2) & \sigma(y_3)\end{pmatrix} \\\frac{\partial \sigma}{\partial y} = \begin{pmatrix}\sigma(y_1)(1-\sigma(y_1) & \sigma(y_2)(1-\sigma(y_2) &\sigma(y_3)(1-\sigma(y_3)\end{pmatrix} $$

Further, given that Sigmoid introduces no new parameters, its backward pass classifies as an intermediate operation. As a result, we can integrate the latest incoming gradient of the chain rule ($\frac{\partial L}{\partial \sigma}$) with the partial of our sigmoid function ($\frac{\partial \sigma}{\partial y}$) by a simple Hadamard product

$$ \frac{\partial L}{\partial y}=\frac{\partial L}{\partial \sigma}\odot \frac{\partial \sigma}{\partial y} $$

NOTE: if the above statements do not make much sense, make sure to review the Linear Layer and/or ReLU tutorial where I expand on such concepts

Build Logistic Regression

Now that we have defined all our needed methods, we will now manually implement the forward/backward pass of each operation using PyTorch's capabilities.

NOTE: We will not go in-depth on our Linear Layer implementation as this was done on this tutorial

In [220]:
import torch
torch.randn((2,2)).cuda()
import torch.nn as nn
In [221]:
####################### Linear Layer ###################


class Linear_Layer(torch.autograd.Function):
    """
    Define a Linear Layer operation
    """
    @staticmethod
    def forward(ctx, input,weights, bias = None):
        """
        In the forward pass, we feed this class all necessary objects to 
        compute a  linear layer (input, weights, and bias)
        """
        # input.dim = (B, in_dim)
        # weights.dim = (in_dim, out_dim)
        
        # given that the grad(output) wrt weight parameters equals the input,
        # we will save it to use for backpropagation
        ctx.save_for_backward(input, weights, bias)
        
        
        # linear transformation
        # (B, out_dim) = (B, in_dim) * (in_dim, out_dim)
        output = torch.mm(input, weights)
        
        if bias is not None:
            # bias.shape = (out_dim)
            
            # expanded_bias.shape = (B, out_dim), repeats bias B times
            expanded_bias = bias.unsqueeze(0).expand_as(output)
            
            # element-wise addition
            output += expanded_bias
        
        return output

    
    @staticmethod
    def backward(ctx, incoming_grad):
        """
        In the backward pass we receive a Tensor (output_grad) containing the 
        gradient of the loss with respect to our f(x) output, 
        and we now need to compute the gradient of the loss
        with respect to our defined function.
        """
        # incoming_grad.shape = (B, out_dim)
        
        # extract inputs from forward pass
        input, weights, bias = ctx.saved_tensors 
        
        # assume none of the inputs need gradients
        grad_input = grad_weight = grad_bias = None
        
        
        # if input requires grad
        if ctx.needs_input_grad[0]:
            # (B, in_dim) = (B, out_dim) * (out_dim, in_dim)
            grad_input = incoming_grad.mm(weights.t())
            
        # if weights require grad
        if ctx.needs_input_grad[1]:
            # (out_dim, in_dim) = (out_dim, B) * (B, in_dim) 
            grad_weight = incoming_grad.t().mm(input)
            
        # if bias requires grad
        if bias is not None and ctx.needs_input_grad[2]:
            # torch.ones((1,B)).mm(incoming_grad)  
            # (out) = (1,B)*(B,out_dim)
            grad_bias = incoming_grad.sum(0)
        
        
        # below, if any of the grads = None, they will simply be ignored
        
        # add grad_output.t() to match original layout of weight parameter
        return grad_input, grad_weight.t(), grad_bias
        
        
In [222]:
class Linear(nn.Module):
    def __init__(self, in_dim, out_dim, bias = True):
        super().__init__()
        self.in_dim = in_dim
        self.out_dim = out_dim
        
        # define parameters
        
        # weight parameter
        self.weight = nn.Parameter(torch.randn((in_dim, out_dim)))
        
        # bias parameter
        if bias:
            self.bias = nn.Parameter(torch.randn((out_dim)))
        else:
            # register parameter as None if not initialized
            self.register_parameter('bias',None)
        
    def forward(self, input):
        output = Linear_Layer.apply(input, self.weight, self.bias)
        return output
In [223]:
################## Sigmoid Layer #######################

# Remember that our incoming gradient will be of equal dims as our output
# b/c of this, output now becomes an intermediate variable
# input.shape == out.shape == incoming_gradient.shape

import torch.nn as nn
import torch

class sigmoid_layer(torch.autograd.Function):
    
    def __init__(self):
        ''
    
    def sigmoid(self,x):
        sig = 1 / (1 + (-1*x).exp())
        return sig
    
    # forward pass
    def forward(self, input):
        # save input for backward() pass 
        self.save_for_backward(input) 
        activated_input = self.sigmoid(input)
        return activated_input

    # integrate backward pass with incoming_grad
    def backward(self, incoming_grad):
        """
        In the backward pass we receive a Tensor containing the 
        gradient of the loss with respect to our f(x) output, 
        and we need to compute the gradient of the loss
        with respect to the input.
        """
        input, = self.saved_tensors
        chained_grad = (self.sigmoid(input) * (1- self.sigmoid(input))) * incoming_grad
        return chained_grad
In [224]:
# test forward pass

weight = torch.tensor([1.], requires_grad = True)
input = torch.tensor([1.])
x = input * weight
sig = sigmoid_layer()(x)
sig
Out[224]:
tensor([0.7311], grad_fn=<sigmoid_layer>)
In [225]:
# test backward pass

sig.backward(torch.tensor([1.]))
weight.grad
Out[225]:
tensor([0.1966])
In [226]:
# compare output with PyTorch's inherent Method

weight = torch.tensor([1.], requires_grad = True)
input = torch.tensor([1.])
x = input * weight

sig = nn.Sigmoid()(x)
sig
Out[226]:
tensor([0.7311], grad_fn=<SigmoidBackward>)
In [227]:
sig.backward()
weight.grad
Out[227]:
tensor([0.1966])
In [228]:
# Wrap ReLU_layer function in nn.module

class Sigmoid(nn.Module):
    def __init__(self):
        super().__init__()

        
    def forward(self, input):
        output = sigmoid_layer()(input)
        return output
    
In [229]:
####################  Binary Cross Entropy ################

# inputs must all be of type .float()
class BCE_loss(torch.autograd.Function):
    

    @staticmethod
    def forward(self, yhat, y):
        # save input for backward() pass 
        self.save_for_backward(y,yhat) 
        loss = - (y * yhat.log() + (1-y)* (1-yhat).log())
        return loss

    @staticmethod
    def backward(self, output_grad):
        y,yhat = self.saved_tensors
        chained_grad = ((yhat-y) / (yhat * (1- yhat)))
        
        # y does not need gradient and thus we pass None to signify this
        return chained_grad, None
In [230]:
# test above method
output = torch.tensor([.50], requires_grad = True)
y = torch.tensor([1.])
loss = BCE_loss.apply(output,y)
loss
Out[230]:
tensor([0.6931], grad_fn=<BCE_lossBackward>)
In [231]:
# test backward() method
loss.backward()
output.grad
Out[231]:
tensor([-2.])
In [232]:
# test with PyTorch
output = torch.tensor([.50], requires_grad = True)
y = torch.tensor([1.])
bce = nn.BCELoss()
loss = bce(output,y)
loss
Out[232]:
tensor(0.6931, grad_fn=<BinaryCrossEntropyBackward>)
In [233]:
# test backward() method
loss.backward()
output.grad
Out[233]:
tensor([-2.])
In [234]:
# Wrap BCELoss function in nn.module

class BCELoss(nn.Module):
    def __init__(self, reduction = 'mean'):
        super().__init__()

        
    def forward(self, pred, target):
        output = BCE_loss.apply(pred,target)
        # reduce output by average
        output = output.mean()
        return output
    

Now that we have all of our "ingredients", we can now create our Logistic model

In [235]:
# Create Logistic Regression function
class LogisticRegression(nn.Module):
    def __init__(self, input_dim = 30):
        super().__init__()
        self.linear = Linear(input_dim, 1) 
        self.sigmoid = Sigmoid()
        
    def forward(self,x):
        # output.shape = (B, 1)
        output = self.sigmoid(self.linear(x))
        return output.view(-1)

Wisconcin Breast Cancer Dataset

To showcase our Linear Regression, we will train our model to differentiate between malignant and benign cancer cells, given the characteristics of the cell nuclei.

Refer to link) to learn more about the data

In [236]:
# import data
import pandas as pd
url = 'https://raw.githubusercontent.com/PacktPublishing/Machine-Learning-with-R-Second-Edition/master/Chapter%2003/wisc_bc_data.csv'
df = pd.read_csv(url)
df.index = df.diagnosis
df.drop(columns = ['diagnosis','id'],inplace = True)
df.head()
Out[236]:
radius_mean texture_mean perimeter_mean area_mean smoothness_mean compactness_mean concavity_mean points_mean symmetry_mean dimension_mean ... radius_worst texture_worst perimeter_worst area_worst smoothness_worst compactness_worst concavity_worst points_worst symmetry_worst dimension_worst
diagnosis
B 12.32 12.39 78.85 464.1 0.10280 0.06981 0.03987 0.03700 0.1959 0.05955 ... 13.50 15.64 86.97 549.1 0.1385 0.1266 0.12420 0.09391 0.2827 0.06771
B 10.60 18.95 69.28 346.4 0.09688 0.11470 0.06387 0.02642 0.1922 0.06491 ... 11.88 22.94 78.28 424.8 0.1213 0.2515 0.19160 0.07926 0.2940 0.07587
B 11.04 16.83 70.92 373.2 0.10770 0.07804 0.03046 0.02480 0.1714 0.06340 ... 12.41 26.44 79.93 471.4 0.1369 0.1482 0.10670 0.07431 0.2998 0.07881
B 11.28 13.39 73.00 384.8 0.11640 0.11360 0.04635 0.04796 0.1771 0.06072 ... 11.92 15.77 76.53 434.0 0.1367 0.1822 0.08669 0.08611 0.2102 0.06784
B 15.19 13.21 97.65 711.8 0.07963 0.06934 0.03393 0.02657 0.1721 0.05544 ... 16.20 15.73 104.50 819.1 0.1126 0.1737 0.13620 0.08178 0.2487 0.06766

5 rows × 30 columns

In [237]:
df.info(verbose = True)
<class 'pandas.core.frame.DataFrame'>
Index: 569 entries, B to M
Data columns (total 30 columns):
radius_mean          569 non-null float64
texture_mean         569 non-null float64
perimeter_mean       569 non-null float64
area_mean            569 non-null float64
smoothness_mean      569 non-null float64
compactness_mean     569 non-null float64
concavity_mean       569 non-null float64
points_mean          569 non-null float64
symmetry_mean        569 non-null float64
dimension_mean       569 non-null float64
radius_se            569 non-null float64
texture_se           569 non-null float64
perimeter_se         569 non-null float64
area_se              569 non-null float64
smoothness_se        569 non-null float64
compactness_se       569 non-null float64
concavity_se         569 non-null float64
points_se            569 non-null float64
symmetry_se          569 non-null float64
dimension_se         569 non-null float64
radius_worst         569 non-null float64
texture_worst        569 non-null float64
perimeter_worst      569 non-null float64
area_worst           569 non-null float64
smoothness_worst     569 non-null float64
compactness_worst    569 non-null float64
concavity_worst      569 non-null float64
points_worst         569 non-null float64
symmetry_worst       569 non-null float64
dimension_worst      569 non-null float64
dtypes: float64(30)
memory usage: 137.8+ KB
In [238]:
# visualize the distribution of our binary classes

import seaborn as sns
import matplotlib.pyplot as plt
plt.style.use('ggplot')

sns.countplot(df.index);plt.show()

Given that there is about twice more data on Benign cells than Malignant, a model will become bias towards classifying Benign cells

Data Preprocessing

In [239]:
# Separate features (X) from target (y)
import numpy as np

X = df.values
y = (df.index == 'M')
y = y.astype(np.double)
In [240]:
# normalize features
from sklearn.preprocessing import normalize
X = normalize(X, axis = 0)
In [241]:
# parse data to training and testing set for evaluation

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = .20, random_state = 42, shuffle = True)
In [242]:
# Transform data to PyTorch tensors and separate data into batches
from skorch.dataset import Dataset
from torch.utils.data import DataLoader

# Wrap each observation with its corresponding target
train = Dataset(X_train,y_train) 
test = Dataset(X_test,y_test) 

# separate data into batches of 16
train_dl = DataLoader(train, batch_size = 16, pin_memory = True)
test_dl = DataLoader(test, batch_size = 16, pin_memory = True)

Now that we have all the data formatted, let's instatiate our model, criterion, and optimizer

Instantiate Logistic Regression

In [243]:
# instantiate model and place it on GPU

device = torch.device('cuda')
model = LogisticRegression(30).to(device)
model
Out[243]:
LogisticRegression(
  (linear): Linear()
  (sigmoid): Sigmoid()
)
In [244]:
# initiate loss function
criterion = BCELoss()
In [245]:
# initiate optimizer
from torch import optim
optimizer = optim.SGD(model.parameters(), lr = .01)

Make one forward pass to make sure everything works as it should

In [246]:
# test train_dl
batch_X,batch_y = next(iter(train_dl))
print(f"batch_X.shape: {batch_X.shape}")
print('-'*35)
print(f"batch_X.shape: {batch_y.shape}")
batch_X.shape: torch.Size([16, 30])
-----------------------------------
batch_X.shape: torch.Size([16])
In [247]:
# Assert our model makes as many predictions according to the batch
# all inputs must be of type .float()

output = model(batch_X.cuda().float())
output.shape
Out[247]:
torch.Size([16])
In [248]:
# average loss
loss = criterion(output,batch_y.cuda().float())
loss
Out[248]:
tensor(0.6896, device='cuda:0', grad_fn=<MeanBackward0>)
In [249]:
# compute gradients by calling .backward()
loss.backward()
In [250]:
# take a step
optimizer.step()

Train Logistic Regression

Now that we have asserted our model works as should, it's time to train it

In [251]:
def train(model, iterator, optimizer, criterion):
    
    # hold avg loss and acc sum of all batches
    epoch_loss = 0
    epoch_acc = 0
    
    
    for batch in iterator:
        
        # zero-out all gradients (if any) from our model parameters
        model.zero_grad()
        
        
        
        # extract input and label
        
        # input.shape = (B, fetures)
        input = batch[0].cuda().float()
        # label.shape = (B)
        label = batch[1].cuda().float()
        
        
        # Start PyTorch's Dynamic Graph
        
        # predictions.shape = (B)
        predictions = model(input)
        
        # average batch loss 
        loss = criterion(predictions, label)
        
        # calculate grad(loss) / grad(parameters)
        # "clears" PyTorch's dynamic graph
        loss.backward()
        
        
        # perform SGD "step" operation
        optimizer.step()
        
        
        # Given that PyTorch variables are "contagious" (they record all operations)
        # we need to ".detach()" to stop them from recording any performance
        # statistics
        
        
        # average batch accuracy
        acc = binary_accuracy(predictions.detach(), label)
        

        
        # record our stats
        epoch_loss += loss.detach()
        epoch_acc += acc
        
    # NOTE: tense.item() unpacks Tensor item to a regular python object 
    # tense.tensor([1]).item() == 1
        
    # return average loss and acc of epoch
    return epoch_loss.item() / len(iterator), epoch_acc / len(iterator)
In [252]:
# compute average accuracy per batch

def binary_accuracy(preds, y):
    # preds.shape = (B)
    # y.shape = (B)

    #round predictions to the closest integer
    rounded_preds = torch.round(preds)
    correct = (rounded_preds == y).sum()
    acc = correct.item() / len(y)
    return acc
In [253]:
import time

def epoch_time(start_time, end_time):
    elapsed_time = end_time - start_time
    elapsed_mins = int(elapsed_time / 60)
    elapsed_secs = int(elapsed_time - (elapsed_mins * 60))
    return elapsed_mins, elapsed_secs
    
    
In [254]:
def evaluate(model, iterator, criterion):
    
    epoch_loss = 0
    epoch_acc = 0
        
    # turn off grad tracking as we are only evaluation performance
    with torch.no_grad():
    
        for batch in iterator:

            # extract input and label       
            input = batch[0].cuda().float()
            label = batch[1].cuda().float()


            # predictions.shape = (B,1)
            predictions = model(input)

            # average batch loss 
            loss = criterion(predictions, label)

            # average batch accuracy
            acc = binary_accuracy(predictions, label)

            epoch_loss += loss
            epoch_acc += acc
        
    return epoch_loss.item() / len(iterator), epoch_acc / len(iterator)
In [255]:
N_EPOCHS = 150

# track statistics
track_stats = {'epoch': [],
               'train_loss': [],
              'train_acc': [],
              'valid_loss':[],
              'valid_acc':[]}


best_valid_loss = float('inf')

for epoch in range(N_EPOCHS):

    start_time = time.time()
    
    train_loss, train_acc = train(model, train_dl, optimizer, criterion)
    valid_loss, valid_acc = evaluate(model, test_dl, criterion)
    
    end_time = time.time()
    
    # record operations
    track_stats['epoch'].append(epoch + 1)
    track_stats['train_loss'].append(train_loss)
    track_stats['train_acc'].append(train_acc)
    track_stats['valid_loss'].append(valid_loss)
    track_stats['valid_acc'].append(valid_acc)
    
    

    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
    # if this was our best performance, record model parameters
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(model.state_dict(), 'best_log_regression.pt')
    
    # print out stats
    print('-'*75)
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Acc: {valid_acc*100:.2f}%')
---------------------------------------------------------------------------
Epoch: 01 | Epoch Time: 0m 0s
	Train Loss: 0.662 | Train Acc: 62.65%
	 Val. Loss: 0.653 |  Val. Acc: 63.28%
---------------------------------------------------------------------------
Epoch: 02 | Epoch Time: 0m 0s
	Train Loss: 0.658 | Train Acc: 62.65%
	 Val. Loss: 0.649 |  Val. Acc: 63.28%
---------------------------------------------------------------------------
Epoch: 03 | Epoch Time: 0m 0s
	Train Loss: 0.655 | Train Acc: 62.65%
	 Val. Loss: 0.646 |  Val. Acc: 63.28%
---------------------------------------------------------------------------
Epoch: 04 | Epoch Time: 0m 0s
	Train Loss: 0.652 | Train Acc: 62.65%
	 Val. Loss: 0.643 |  Val. Acc: 63.28%
---------------------------------------------------------------------------
Epoch: 05 | Epoch Time: 0m 0s
	Train Loss: 0.649 | Train Acc: 62.65%
	 Val. Loss: 0.639 |  Val. Acc: 63.28%
---------------------------------------------------------------------------
Epoch: 06 | Epoch Time: 0m 0s
	Train Loss: 0.645 | Train Acc: 62.65%
	 Val. Loss: 0.636 |  Val. Acc: 63.28%
---------------------------------------------------------------------------
Epoch: 07 | Epoch Time: 0m 0s
	Train Loss: 0.642 | Train Acc: 62.87%
	 Val. Loss: 0.633 |  Val. Acc: 63.28%
---------------------------------------------------------------------------
Epoch: 08 | Epoch Time: 0m 0s
	Train Loss: 0.639 | Train Acc: 62.87%
	 Val. Loss: 0.630 |  Val. Acc: 63.28%
---------------------------------------------------------------------------
Epoch: 09 | Epoch Time: 0m 0s
	Train Loss: 0.636 | Train Acc: 62.87%
	 Val. Loss: 0.627 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 10 | Epoch Time: 0m 0s
	Train Loss: 0.633 | Train Acc: 62.87%
	 Val. Loss: 0.624 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 11 | Epoch Time: 0m 0s
	Train Loss: 0.630 | Train Acc: 62.87%
	 Val. Loss: 0.621 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 12 | Epoch Time: 0m 0s
	Train Loss: 0.627 | Train Acc: 62.65%
	 Val. Loss: 0.618 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 13 | Epoch Time: 0m 0s
	Train Loss: 0.624 | Train Acc: 62.65%
	 Val. Loss: 0.615 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 14 | Epoch Time: 0m 0s
	Train Loss: 0.621 | Train Acc: 62.44%
	 Val. Loss: 0.612 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 15 | Epoch Time: 0m 0s
	Train Loss: 0.618 | Train Acc: 62.44%
	 Val. Loss: 0.609 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 16 | Epoch Time: 0m 0s
	Train Loss: 0.615 | Train Acc: 62.44%
	 Val. Loss: 0.606 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 17 | Epoch Time: 0m 0s
	Train Loss: 0.612 | Train Acc: 62.87%
	 Val. Loss: 0.603 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 18 | Epoch Time: 0m 0s
	Train Loss: 0.609 | Train Acc: 63.52%
	 Val. Loss: 0.600 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 19 | Epoch Time: 0m 0s
	Train Loss: 0.607 | Train Acc: 63.95%
	 Val. Loss: 0.597 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 20 | Epoch Time: 0m 0s
	Train Loss: 0.604 | Train Acc: 63.95%
	 Val. Loss: 0.594 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 21 | Epoch Time: 0m 0s
	Train Loss: 0.601 | Train Acc: 64.59%
	 Val. Loss: 0.592 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 22 | Epoch Time: 0m 0s
	Train Loss: 0.599 | Train Acc: 64.81%
	 Val. Loss: 0.589 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 23 | Epoch Time: 0m 0s
	Train Loss: 0.596 | Train Acc: 65.02%
	 Val. Loss: 0.586 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 24 | Epoch Time: 0m 0s
	Train Loss: 0.593 | Train Acc: 65.02%
	 Val. Loss: 0.584 |  Val. Acc: 64.06%
---------------------------------------------------------------------------
Epoch: 25 | Epoch Time: 0m 0s
	Train Loss: 0.591 | Train Acc: 65.24%
	 Val. Loss: 0.581 |  Val. Acc: 64.84%
---------------------------------------------------------------------------
Epoch: 26 | Epoch Time: 0m 0s
	Train Loss: 0.588 | Train Acc: 65.46%
	 Val. Loss: 0.578 |  Val. Acc: 64.84%
---------------------------------------------------------------------------
Epoch: 27 | Epoch Time: 0m 0s
	Train Loss: 0.586 | Train Acc: 65.89%
	 Val. Loss: 0.576 |  Val. Acc: 64.84%
---------------------------------------------------------------------------
Epoch: 28 | Epoch Time: 0m 0s
	Train Loss: 0.583 | Train Acc: 66.32%
	 Val. Loss: 0.573 |  Val. Acc: 65.62%
---------------------------------------------------------------------------
Epoch: 29 | Epoch Time: 0m 0s
	Train Loss: 0.581 | Train Acc: 66.75%
	 Val. Loss: 0.571 |  Val. Acc: 65.62%
---------------------------------------------------------------------------
Epoch: 30 | Epoch Time: 0m 0s
	Train Loss: 0.578 | Train Acc: 66.96%
	 Val. Loss: 0.568 |  Val. Acc: 65.62%
---------------------------------------------------------------------------
Epoch: 31 | Epoch Time: 0m 0s
	Train Loss: 0.576 | Train Acc: 66.96%
	 Val. Loss: 0.566 |  Val. Acc: 66.41%
---------------------------------------------------------------------------
Epoch: 32 | Epoch Time: 0m 0s
	Train Loss: 0.573 | Train Acc: 67.40%
	 Val. Loss: 0.564 |  Val. Acc: 68.75%
---------------------------------------------------------------------------
Epoch: 33 | Epoch Time: 0m 0s
	Train Loss: 0.571 | Train Acc: 67.61%
	 Val. Loss: 0.561 |  Val. Acc: 68.75%
---------------------------------------------------------------------------
Epoch: 34 | Epoch Time: 0m 0s
	Train Loss: 0.569 | Train Acc: 68.04%
	 Val. Loss: 0.559 |  Val. Acc: 69.53%
---------------------------------------------------------------------------
Epoch: 35 | Epoch Time: 0m 0s
	Train Loss: 0.566 | Train Acc: 68.26%
	 Val. Loss: 0.556 |  Val. Acc: 69.53%
---------------------------------------------------------------------------
Epoch: 36 | Epoch Time: 0m 0s
	Train Loss: 0.564 | Train Acc: 68.26%
	 Val. Loss: 0.554 |  Val. Acc: 69.53%
---------------------------------------------------------------------------
Epoch: 37 | Epoch Time: 0m 0s
	Train Loss: 0.562 | Train Acc: 68.69%
	 Val. Loss: 0.552 |  Val. Acc: 69.53%
---------------------------------------------------------------------------
Epoch: 38 | Epoch Time: 0m 0s
	Train Loss: 0.560 | Train Acc: 68.69%
	 Val. Loss: 0.550 |  Val. Acc: 69.53%
---------------------------------------------------------------------------
Epoch: 39 | Epoch Time: 0m 0s
	Train Loss: 0.558 | Train Acc: 68.90%
	 Val. Loss: 0.547 |  Val. Acc: 69.53%
---------------------------------------------------------------------------
Epoch: 40 | Epoch Time: 0m 0s
	Train Loss: 0.555 | Train Acc: 69.77%
	 Val. Loss: 0.545 |  Val. Acc: 69.53%
---------------------------------------------------------------------------
Epoch: 41 | Epoch Time: 0m 0s
	Train Loss: 0.553 | Train Acc: 70.84%
	 Val. Loss: 0.543 |  Val. Acc: 69.53%
---------------------------------------------------------------------------
Epoch: 42 | Epoch Time: 0m 0s
	Train Loss: 0.551 | Train Acc: 71.55%
	 Val. Loss: 0.541 |  Val. Acc: 69.53%
---------------------------------------------------------------------------
Epoch: 43 | Epoch Time: 0m 0s
	Train Loss: 0.549 | Train Acc: 71.77%
	 Val. Loss: 0.539 |  Val. Acc: 70.31%
---------------------------------------------------------------------------
Epoch: 44 | Epoch Time: 0m 0s
	Train Loss: 0.547 | Train Acc: 71.77%
	 Val. Loss: 0.536 |  Val. Acc: 70.31%
---------------------------------------------------------------------------
Epoch: 45 | Epoch Time: 0m 0s
	Train Loss: 0.545 | Train Acc: 71.98%
	 Val. Loss: 0.534 |  Val. Acc: 70.31%
---------------------------------------------------------------------------
Epoch: 46 | Epoch Time: 0m 0s
	Train Loss: 0.543 | Train Acc: 72.20%
	 Val. Loss: 0.532 |  Val. Acc: 70.31%
---------------------------------------------------------------------------
Epoch: 47 | Epoch Time: 0m 0s
	Train Loss: 0.541 | Train Acc: 73.06%
	 Val. Loss: 0.530 |  Val. Acc: 71.09%
---------------------------------------------------------------------------
Epoch: 48 | Epoch Time: 0m 0s
	Train Loss: 0.539 | Train Acc: 73.06%
	 Val. Loss: 0.528 |  Val. Acc: 71.09%
---------------------------------------------------------------------------
Epoch: 49 | Epoch Time: 0m 0s
	Train Loss: 0.537 | Train Acc: 73.06%
	 Val. Loss: 0.526 |  Val. Acc: 71.09%
---------------------------------------------------------------------------
Epoch: 50 | Epoch Time: 0m 0s
	Train Loss: 0.535 | Train Acc: 73.49%
	 Val. Loss: 0.524 |  Val. Acc: 71.09%
---------------------------------------------------------------------------
Epoch: 51 | Epoch Time: 0m 0s
	Train Loss: 0.533 | Train Acc: 73.71%
	 Val. Loss: 0.522 |  Val. Acc: 71.88%
---------------------------------------------------------------------------
Epoch: 52 | Epoch Time: 0m 0s
	Train Loss: 0.531 | Train Acc: 73.92%
	 Val. Loss: 0.520 |  Val. Acc: 71.88%
---------------------------------------------------------------------------
Epoch: 53 | Epoch Time: 0m 0s
	Train Loss: 0.529 | Train Acc: 74.14%
	 Val. Loss: 0.518 |  Val. Acc: 71.88%
---------------------------------------------------------------------------
Epoch: 54 | Epoch Time: 0m 0s
	Train Loss: 0.527 | Train Acc: 74.14%
	 Val. Loss: 0.516 |  Val. Acc: 72.66%
---------------------------------------------------------------------------
Epoch: 55 | Epoch Time: 0m 0s
	Train Loss: 0.525 | Train Acc: 74.57%
	 Val. Loss: 0.514 |  Val. Acc: 73.44%
---------------------------------------------------------------------------
Epoch: 56 | Epoch Time: 0m 0s
	Train Loss: 0.524 | Train Acc: 74.57%
	 Val. Loss: 0.512 |  Val. Acc: 74.22%
---------------------------------------------------------------------------
Epoch: 57 | Epoch Time: 0m 0s
	Train Loss: 0.522 | Train Acc: 74.78%
	 Val. Loss: 0.511 |  Val. Acc: 74.22%
---------------------------------------------------------------------------
Epoch: 58 | Epoch Time: 0m 0s
	Train Loss: 0.520 | Train Acc: 75.00%
	 Val. Loss: 0.509 |  Val. Acc: 74.22%
---------------------------------------------------------------------------
Epoch: 59 | Epoch Time: 0m 0s
	Train Loss: 0.518 | Train Acc: 75.43%
	 Val. Loss: 0.507 |  Val. Acc: 74.22%
---------------------------------------------------------------------------
Epoch: 60 | Epoch Time: 0m 0s
	Train Loss: 0.516 | Train Acc: 76.08%
	 Val. Loss: 0.505 |  Val. Acc: 74.22%
---------------------------------------------------------------------------
Epoch: 61 | Epoch Time: 0m 0s
	Train Loss: 0.515 | Train Acc: 76.51%
	 Val. Loss: 0.503 |  Val. Acc: 74.22%
---------------------------------------------------------------------------
Epoch: 62 | Epoch Time: 0m 0s
	Train Loss: 0.513 | Train Acc: 76.72%
	 Val. Loss: 0.502 |  Val. Acc: 75.78%
---------------------------------------------------------------------------
Epoch: 63 | Epoch Time: 0m 0s
	Train Loss: 0.511 | Train Acc: 77.16%
	 Val. Loss: 0.500 |  Val. Acc: 75.78%
---------------------------------------------------------------------------
Epoch: 64 | Epoch Time: 0m 0s
	Train Loss: 0.510 | Train Acc: 77.37%
	 Val. Loss: 0.498 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 65 | Epoch Time: 0m 0s
	Train Loss: 0.508 | Train Acc: 77.37%
	 Val. Loss: 0.496 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 66 | Epoch Time: 0m 0s
	Train Loss: 0.506 | Train Acc: 77.37%
	 Val. Loss: 0.495 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 67 | Epoch Time: 0m 0s
	Train Loss: 0.505 | Train Acc: 77.59%
	 Val. Loss: 0.493 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 68 | Epoch Time: 0m 0s
	Train Loss: 0.503 | Train Acc: 77.59%
	 Val. Loss: 0.491 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 69 | Epoch Time: 0m 0s
	Train Loss: 0.501 | Train Acc: 78.02%
	 Val. Loss: 0.490 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 70 | Epoch Time: 0m 0s
	Train Loss: 0.500 | Train Acc: 78.02%
	 Val. Loss: 0.488 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 71 | Epoch Time: 0m 0s
	Train Loss: 0.498 | Train Acc: 78.45%
	 Val. Loss: 0.486 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 72 | Epoch Time: 0m 0s
	Train Loss: 0.497 | Train Acc: 78.45%
	 Val. Loss: 0.485 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 73 | Epoch Time: 0m 0s
	Train Loss: 0.495 | Train Acc: 78.66%
	 Val. Loss: 0.483 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 74 | Epoch Time: 0m 0s
	Train Loss: 0.494 | Train Acc: 79.09%
	 Val. Loss: 0.482 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 75 | Epoch Time: 0m 0s
	Train Loss: 0.492 | Train Acc: 79.31%
	 Val. Loss: 0.480 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 76 | Epoch Time: 0m 0s
	Train Loss: 0.491 | Train Acc: 79.31%
	 Val. Loss: 0.478 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 77 | Epoch Time: 0m 0s
	Train Loss: 0.489 | Train Acc: 79.53%
	 Val. Loss: 0.477 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 78 | Epoch Time: 0m 0s
	Train Loss: 0.488 | Train Acc: 79.74%
	 Val. Loss: 0.475 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 79 | Epoch Time: 0m 0s
	Train Loss: 0.486 | Train Acc: 80.17%
	 Val. Loss: 0.474 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 80 | Epoch Time: 0m 0s
	Train Loss: 0.485 | Train Acc: 80.17%
	 Val. Loss: 0.472 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 81 | Epoch Time: 0m 0s
	Train Loss: 0.483 | Train Acc: 80.60%
	 Val. Loss: 0.471 |  Val. Acc: 76.56%
---------------------------------------------------------------------------
Epoch: 82 | Epoch Time: 0m 0s
	Train Loss: 0.482 | Train Acc: 81.03%
	 Val. Loss: 0.469 |  Val. Acc: 77.34%
---------------------------------------------------------------------------
Epoch: 83 | Epoch Time: 0m 0s
	Train Loss: 0.480 | Train Acc: 81.47%
	 Val. Loss: 0.468 |  Val. Acc: 77.34%
---------------------------------------------------------------------------
Epoch: 84 | Epoch Time: 0m 0s
	Train Loss: 0.479 | Train Acc: 81.68%
	 Val. Loss: 0.466 |  Val. Acc: 78.12%
---------------------------------------------------------------------------
Epoch: 85 | Epoch Time: 0m 0s
	Train Loss: 0.478 | Train Acc: 81.68%
	 Val. Loss: 0.465 |  Val. Acc: 78.12%
---------------------------------------------------------------------------
Epoch: 86 | Epoch Time: 0m 0s
	Train Loss: 0.476 | Train Acc: 81.68%
	 Val. Loss: 0.464 |  Val. Acc: 78.12%
---------------------------------------------------------------------------
Epoch: 87 | Epoch Time: 0m 0s
	Train Loss: 0.475 | Train Acc: 81.68%
	 Val. Loss: 0.462 |  Val. Acc: 78.12%
---------------------------------------------------------------------------
Epoch: 88 | Epoch Time: 0m 0s
	Train Loss: 0.474 | Train Acc: 82.11%
	 Val. Loss: 0.461 |  Val. Acc: 78.91%
---------------------------------------------------------------------------
Epoch: 89 | Epoch Time: 0m 0s
	Train Loss: 0.472 | Train Acc: 82.11%
	 Val. Loss: 0.459 |  Val. Acc: 78.91%
---------------------------------------------------------------------------
Epoch: 90 | Epoch Time: 0m 0s
	Train Loss: 0.471 | Train Acc: 82.11%
	 Val. Loss: 0.458 |  Val. Acc: 78.91%
---------------------------------------------------------------------------
Epoch: 91 | Epoch Time: 0m 0s
	Train Loss: 0.470 | Train Acc: 82.33%
	 Val. Loss: 0.457 |  Val. Acc: 78.91%
---------------------------------------------------------------------------
Epoch: 92 | Epoch Time: 0m 0s
	Train Loss: 0.468 | Train Acc: 82.76%
	 Val. Loss: 0.455 |  Val. Acc: 78.91%
---------------------------------------------------------------------------
Epoch: 93 | Epoch Time: 0m 0s
	Train Loss: 0.467 | Train Acc: 83.41%
	 Val. Loss: 0.454 |  Val. Acc: 79.69%
---------------------------------------------------------------------------
Epoch: 94 | Epoch Time: 0m 0s
	Train Loss: 0.466 | Train Acc: 83.62%
	 Val. Loss: 0.453 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 95 | Epoch Time: 0m 0s
	Train Loss: 0.465 | Train Acc: 83.84%
	 Val. Loss: 0.451 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 96 | Epoch Time: 0m 0s
	Train Loss: 0.463 | Train Acc: 84.27%
	 Val. Loss: 0.450 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 97 | Epoch Time: 0m 0s
	Train Loss: 0.462 | Train Acc: 84.48%
	 Val. Loss: 0.449 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 98 | Epoch Time: 0m 0s
	Train Loss: 0.461 | Train Acc: 84.48%
	 Val. Loss: 0.447 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 99 | Epoch Time: 0m 0s
	Train Loss: 0.460 | Train Acc: 84.48%
	 Val. Loss: 0.446 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 100 | Epoch Time: 0m 0s
	Train Loss: 0.458 | Train Acc: 84.48%
	 Val. Loss: 0.445 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 101 | Epoch Time: 0m 0s
	Train Loss: 0.457 | Train Acc: 84.48%
	 Val. Loss: 0.444 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 102 | Epoch Time: 0m 0s
	Train Loss: 0.456 | Train Acc: 84.48%
	 Val. Loss: 0.442 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 103 | Epoch Time: 0m 0s
	Train Loss: 0.455 | Train Acc: 84.91%
	 Val. Loss: 0.441 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 104 | Epoch Time: 0m 0s
	Train Loss: 0.454 | Train Acc: 84.91%
	 Val. Loss: 0.440 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 105 | Epoch Time: 0m 0s
	Train Loss: 0.452 | Train Acc: 84.91%
	 Val. Loss: 0.439 |  Val. Acc: 80.47%
---------------------------------------------------------------------------
Epoch: 106 | Epoch Time: 0m 0s
	Train Loss: 0.451 | Train Acc: 84.91%
	 Val. Loss: 0.438 |  Val. Acc: 81.25%
---------------------------------------------------------------------------
Epoch: 107 | Epoch Time: 0m 0s
	Train Loss: 0.450 | Train Acc: 84.91%
	 Val. Loss: 0.436 |  Val. Acc: 81.25%
---------------------------------------------------------------------------
Epoch: 108 | Epoch Time: 0m 0s
	Train Loss: 0.449 | Train Acc: 84.91%
	 Val. Loss: 0.435 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 109 | Epoch Time: 0m 0s
	Train Loss: 0.448 | Train Acc: 85.13%
	 Val. Loss: 0.434 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 110 | Epoch Time: 0m 0s
	Train Loss: 0.447 | Train Acc: 85.13%
	 Val. Loss: 0.433 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 111 | Epoch Time: 0m 0s
	Train Loss: 0.446 | Train Acc: 85.13%
	 Val. Loss: 0.432 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 112 | Epoch Time: 0m 0s
	Train Loss: 0.445 | Train Acc: 85.34%
	 Val. Loss: 0.430 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 113 | Epoch Time: 0m 0s
	Train Loss: 0.444 | Train Acc: 85.34%
	 Val. Loss: 0.429 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 114 | Epoch Time: 0m 0s
	Train Loss: 0.443 | Train Acc: 85.99%
	 Val. Loss: 0.428 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 115 | Epoch Time: 0m 0s
	Train Loss: 0.441 | Train Acc: 85.99%
	 Val. Loss: 0.427 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 116 | Epoch Time: 0m 0s
	Train Loss: 0.440 | Train Acc: 85.99%
	 Val. Loss: 0.426 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 117 | Epoch Time: 0m 0s
	Train Loss: 0.439 | Train Acc: 85.99%
	 Val. Loss: 0.425 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 118 | Epoch Time: 0m 0s
	Train Loss: 0.438 | Train Acc: 86.21%
	 Val. Loss: 0.424 |  Val. Acc: 82.03%
---------------------------------------------------------------------------
Epoch: 119 | Epoch Time: 0m 0s
	Train Loss: 0.437 | Train Acc: 86.42%
	 Val. Loss: 0.423 |  Val. Acc: 89.84%
---------------------------------------------------------------------------
Epoch: 120 | Epoch Time: 0m 0s
	Train Loss: 0.436 | Train Acc: 86.42%
	 Val. Loss: 0.422 |  Val. Acc: 89.84%
---------------------------------------------------------------------------
Epoch: 121 | Epoch Time: 0m 0s
	Train Loss: 0.435 | Train Acc: 86.42%
	 Val. Loss: 0.421 |  Val. Acc: 89.84%
---------------------------------------------------------------------------
Epoch: 122 | Epoch Time: 0m 0s
	Train Loss: 0.434 | Train Acc: 86.64%
	 Val. Loss: 0.419 |  Val. Acc: 89.84%
---------------------------------------------------------------------------
Epoch: 123 | Epoch Time: 0m 0s
	Train Loss: 0.433 | Train Acc: 86.85%
	 Val. Loss: 0.418 |  Val. Acc: 89.84%
---------------------------------------------------------------------------
Epoch: 124 | Epoch Time: 0m 0s
	Train Loss: 0.432 | Train Acc: 86.85%
	 Val. Loss: 0.417 |  Val. Acc: 89.84%
---------------------------------------------------------------------------
Epoch: 125 | Epoch Time: 0m 0s
	Train Loss: 0.431 | Train Acc: 86.85%
	 Val. Loss: 0.416 |  Val. Acc: 89.84%
---------------------------------------------------------------------------
Epoch: 126 | Epoch Time: 0m 0s
	Train Loss: 0.430 | Train Acc: 86.85%
	 Val. Loss: 0.415 |  Val. Acc: 89.84%
---------------------------------------------------------------------------
Epoch: 127 | Epoch Time: 0m 0s
	Train Loss: 0.429 | Train Acc: 86.85%
	 Val. Loss: 0.414 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 128 | Epoch Time: 0m 0s
	Train Loss: 0.428 | Train Acc: 86.85%
	 Val. Loss: 0.413 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 129 | Epoch Time: 0m 0s
	Train Loss: 0.427 | Train Acc: 86.85%
	 Val. Loss: 0.412 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 130 | Epoch Time: 0m 0s
	Train Loss: 0.426 | Train Acc: 86.85%
	 Val. Loss: 0.411 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 131 | Epoch Time: 0m 0s
	Train Loss: 0.425 | Train Acc: 86.85%
	 Val. Loss: 0.410 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 132 | Epoch Time: 0m 0s
	Train Loss: 0.425 | Train Acc: 86.85%
	 Val. Loss: 0.409 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 133 | Epoch Time: 0m 0s
	Train Loss: 0.424 | Train Acc: 86.85%
	 Val. Loss: 0.408 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 134 | Epoch Time: 0m 0s
	Train Loss: 0.423 | Train Acc: 86.85%
	 Val. Loss: 0.407 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 135 | Epoch Time: 0m 0s
	Train Loss: 0.422 | Train Acc: 87.07%
	 Val. Loss: 0.406 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 136 | Epoch Time: 0m 0s
	Train Loss: 0.421 | Train Acc: 87.28%
	 Val. Loss: 0.405 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 137 | Epoch Time: 0m 0s
	Train Loss: 0.420 | Train Acc: 87.28%
	 Val. Loss: 0.404 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 138 | Epoch Time: 0m 0s
	Train Loss: 0.419 | Train Acc: 87.28%
	 Val. Loss: 0.403 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 139 | Epoch Time: 0m 0s
	Train Loss: 0.418 | Train Acc: 87.28%
	 Val. Loss: 0.402 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 140 | Epoch Time: 0m 0s
	Train Loss: 0.417 | Train Acc: 87.28%
	 Val. Loss: 0.402 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 141 | Epoch Time: 0m 0s
	Train Loss: 0.416 | Train Acc: 87.50%
	 Val. Loss: 0.401 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 142 | Epoch Time: 0m 0s
	Train Loss: 0.416 | Train Acc: 87.50%
	 Val. Loss: 0.400 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 143 | Epoch Time: 0m 0s
	Train Loss: 0.415 | Train Acc: 87.50%
	 Val. Loss: 0.399 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 144 | Epoch Time: 0m 0s
	Train Loss: 0.414 | Train Acc: 87.50%
	 Val. Loss: 0.398 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 145 | Epoch Time: 0m 0s
	Train Loss: 0.413 | Train Acc: 87.50%
	 Val. Loss: 0.397 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 146 | Epoch Time: 0m 0s
	Train Loss: 0.412 | Train Acc: 87.50%
	 Val. Loss: 0.396 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 147 | Epoch Time: 0m 0s
	Train Loss: 0.411 | Train Acc: 87.50%
	 Val. Loss: 0.395 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 148 | Epoch Time: 0m 0s
	Train Loss: 0.411 | Train Acc: 87.50%
	 Val. Loss: 0.394 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 149 | Epoch Time: 0m 0s
	Train Loss: 0.410 | Train Acc: 87.50%
	 Val. Loss: 0.393 |  Val. Acc: 90.62%
---------------------------------------------------------------------------
Epoch: 150 | Epoch Time: 0m 0s
	Train Loss: 0.409 | Train Acc: 87.50%
	 Val. Loss: 0.393 |  Val. Acc: 90.62%

Visualization

Our model performed very well! With a top validation accuracy of 90.62%

Now, let us graph our results

In [256]:
# format data 
import pandas as pd

stats = pd.DataFrame(track_stats)
stats
Out[256]:
epoch train_loss train_acc valid_loss valid_acc
0 1 0.661884 0.626539 0.652851 0.632812
1 2 0.658471 0.626539 0.649458 0.632812
2 3 0.655121 0.626539 0.646100 0.632812
3 4 0.651815 0.626539 0.642780 0.632812
4 5 0.648550 0.626539 0.639496 0.632812
5 6 0.645323 0.626539 0.636249 0.632812
6 7 0.642134 0.628695 0.633039 0.632812
7 8 0.638983 0.628695 0.629865 0.632812
8 9 0.635868 0.628695 0.626728 0.640625
9 10 0.632790 0.628695 0.623626 0.640625
10 11 0.629748 0.628695 0.620559 0.640625
11 12 0.626742 0.626539 0.617527 0.640625
12 13 0.623771 0.626539 0.614529 0.640625
13 14 0.620835 0.624384 0.611565 0.640625
14 15 0.617933 0.624384 0.608634 0.640625
15 16 0.615065 0.624384 0.605737 0.640625
16 17 0.612231 0.628695 0.602873 0.640625
17 18 0.609430 0.635160 0.600040 0.640625
18 19 0.606661 0.639470 0.597240 0.640625
19 20 0.603925 0.639470 0.594471 0.640625
20 21 0.601220 0.645936 0.591733 0.640625
21 22 0.598547 0.648091 0.589025 0.640625
22 23 0.595905 0.650246 0.586348 0.640625
23 24 0.593293 0.650246 0.583701 0.640625
24 25 0.590712 0.652401 0.581083 0.648438
25 26 0.588160 0.654557 0.578494 0.648438
26 27 0.585638 0.658867 0.575934 0.648438
27 28 0.583144 0.663177 0.573402 0.656250
28 29 0.580679 0.667488 0.570897 0.656250
29 30 0.578242 0.669643 0.568421 0.656250
... ... ... ... ... ...
120 121 0.435226 0.864224 0.420519 0.898438
121 122 0.434218 0.866379 0.419455 0.898438
122 123 0.433219 0.868534 0.418398 0.898438
123 124 0.432226 0.868534 0.417349 0.898438
124 125 0.431242 0.868534 0.416308 0.898438
125 126 0.430265 0.868534 0.415275 0.898438
126 127 0.429295 0.868534 0.414249 0.906250
127 128 0.428333 0.868534 0.413231 0.906250
128 129 0.427378 0.868534 0.412220 0.906250
129 130 0.426430 0.868534 0.411216 0.906250
130 131 0.425490 0.868534 0.410220 0.906250
131 132 0.424556 0.868534 0.409231 0.906250
132 133 0.423630 0.868534 0.408249 0.906250
133 134 0.422710 0.868534 0.407273 0.906250
134 135 0.421797 0.870690 0.406305 0.906250
135 136 0.420891 0.872845 0.405344 0.906250
136 137 0.419992 0.872845 0.404389 0.906250
137 138 0.419099 0.872845 0.403441 0.906250
138 139 0.418212 0.872845 0.402499 0.906250
139 140 0.417332 0.872845 0.401564 0.906250
140 141 0.416458 0.875000 0.400636 0.906250
141 142 0.415591 0.875000 0.399714 0.906250
142 143 0.414730 0.875000 0.398798 0.906250
143 144 0.413874 0.875000 0.397888 0.906250
144 145 0.413025 0.875000 0.396985 0.906250
145 146 0.412182 0.875000 0.396088 0.906250
146 147 0.411345 0.875000 0.395196 0.906250
147 148 0.410514 0.875000 0.394311 0.906250
148 149 0.409689 0.875000 0.393432 0.906250
149 150 0.408869 0.875000 0.392558 0.906250

150 rows × 5 columns

In [257]:
data = []
for row in stats.iterrows():
    data.append(row[1].to_dict())
data
Out[257]:
[{'epoch': 1.0,
  'train_loss': 0.6618835843842605,
  'train_acc': 0.6265394088669951,
  'valid_loss': 0.6528506278991699,
  'valid_acc': 0.6328125},
 {'epoch': 2.0,
  'train_loss': 0.6584710745975889,
  'train_acc': 0.6265394088669951,
  'valid_loss': 0.6494578719139099,
  'valid_acc': 0.6328125},
 {'epoch': 3.0,
  'train_loss': 0.6551205207561624,
  'train_acc': 0.6265394088669951,
  'valid_loss': 0.6461004018783569,
  'valid_acc': 0.6328125},
 {'epoch': 4.0,
  'train_loss': 0.6518148882635708,
  'train_acc': 0.6265394088669951,
  'valid_loss': 0.642779529094696,
  'valid_acc': 0.6328125},
 {'epoch': 5.0,
  'train_loss': 0.6485495074041958,
  'train_acc': 0.6265394088669951,
  'valid_loss': 0.6394957304000854,
  'valid_acc': 0.6328125},
 {'epoch': 6.0,
  'train_loss': 0.6453227339119747,
  'train_acc': 0.6265394088669951,
  'valid_loss': 0.6362490057945251,
  'valid_acc': 0.6328125},
 {'epoch': 7.0,
  'train_loss': 0.6421338443098397,
  'train_acc': 0.6286945812807881,
  'valid_loss': 0.6330390572547913,
  'valid_acc': 0.6328125},
 {'epoch': 8.0,
  'train_loss': 0.6389825755152209,
  'train_acc': 0.6286945812807881,
  'valid_loss': 0.6298654079437256,
  'valid_acc': 0.6328125},
 {'epoch': 9.0,
  'train_loss': 0.6358680067391231,
  'train_acc': 0.6286945812807881,
  'valid_loss': 0.6267277598381042,
  'valid_acc': 0.640625},
 {'epoch': 10.0,
  'train_loss': 0.6327900722109038,
  'train_acc': 0.6286945812807881,
  'valid_loss': 0.623625636100769,
  'valid_acc': 0.640625},
 {'epoch': 11.0,
  'train_loss': 0.629748245765423,
  'train_acc': 0.6286945812807881,
  'valid_loss': 0.6205587387084961,
  'valid_acc': 0.640625},
 {'epoch': 12.0,
  'train_loss': 0.6267420670081829,
  'train_acc': 0.6265394088669951,
  'valid_loss': 0.6175265908241272,
  'valid_acc': 0.640625},
 {'epoch': 13.0,
  'train_loss': 0.6237710097740436,
  'train_acc': 0.6265394088669951,
  'valid_loss': 0.6145287752151489,
  'valid_acc': 0.640625},
 {'epoch': 14.0,
  'train_loss': 0.6208349425217201,
  'train_acc': 0.624384236453202,
  'valid_loss': 0.6115648746490479,
  'valid_acc': 0.640625},
 {'epoch': 15.0,
  'train_loss': 0.6179331417741447,
  'train_acc': 0.624384236453202,
  'valid_loss': 0.6086344718933105,
  'valid_acc': 0.640625},
 {'epoch': 16.0,
  'train_loss': 0.6150652129074623,
  'train_acc': 0.624384236453202,
  'valid_loss': 0.6057370901107788,
  'valid_acc': 0.640625},
 {'epoch': 17.0,
  'train_loss': 0.6122307612978178,
  'train_acc': 0.6286945812807881,
  'valid_loss': 0.6028725504875183,
  'valid_acc': 0.640625},
 {'epoch': 18.0,
  'train_loss': 0.609429721174569,
  'train_acc': 0.6351600985221675,
  'valid_loss': 0.6000401973724365,
  'valid_acc': 0.640625},
 {'epoch': 19.0,
  'train_loss': 0.6066611059780779,
  'train_acc': 0.6394704433497537,
  'valid_loss': 0.5972397327423096,
  'valid_acc': 0.640625},
 {'epoch': 20.0,
  'train_loss': 0.603924718396417,
  'train_acc': 0.6394704433497537,
  'valid_loss': 0.5944706201553345,
  'valid_acc': 0.640625},
 {'epoch': 21.0,
  'train_loss': 0.6012202295763739,
  'train_acc': 0.645935960591133,
  'valid_loss': 0.5917326211929321,
  'valid_acc': 0.640625},
 {'epoch': 22.0,
  'train_loss': 0.598547179123451,
  'train_acc': 0.6480911330049262,
  'valid_loss': 0.5890252590179443,
  'valid_acc': 0.640625},
 {'epoch': 23.0,
  'train_loss': 0.5959050408725081,
  'train_acc': 0.6502463054187192,
  'valid_loss': 0.5863481163978577,
  'valid_acc': 0.640625},
 {'epoch': 24.0,
  'train_loss': 0.5932934201996902,
  'train_acc': 0.6502463054187192,
  'valid_loss': 0.5837007761001587,
  'valid_acc': 0.640625},
 {'epoch': 25.0,
  'train_loss': 0.5907119882517847,
  'train_acc': 0.6524014778325123,
  'valid_loss': 0.581082820892334,
  'valid_acc': 0.6484375},
 {'epoch': 26.0,
  'train_loss': 0.5881601530930092,
  'train_acc': 0.6545566502463054,
  'valid_loss': 0.5784939527511597,
  'valid_acc': 0.6484375},
 {'epoch': 27.0,
  'train_loss': 0.5856377831820784,
  'train_acc': 0.6588669950738917,
  'valid_loss': 0.5759336352348328,
  'valid_acc': 0.6484375},
 {'epoch': 28.0,
  'train_loss': 0.5831442865832098,
  'train_acc': 0.6631773399014779,
  'valid_loss': 0.5734015703201294,
  'valid_acc': 0.65625},
 {'epoch': 29.0,
  'train_loss': 0.580679202901906,
  'train_acc': 0.6674876847290641,
  'valid_loss': 0.5708974003791809,
  'valid_acc': 0.65625},
 {'epoch': 30.0,
  'train_loss': 0.5782424005968817,
  'train_acc': 0.6696428571428572,
  'valid_loss': 0.5684206485748291,
  'valid_acc': 0.65625},
 {'epoch': 31.0,
  'train_loss': 0.5758331561910695,
  'train_acc': 0.6696428571428572,
  'valid_loss': 0.5659710168838501,
  'valid_acc': 0.6640625},
 {'epoch': 32.0,
  'train_loss': 0.5734513381431843,
  'train_acc': 0.6739532019704434,
  'valid_loss': 0.5635480284690857,
  'valid_acc': 0.6875},
 {'epoch': 33.0,
  'train_loss': 0.5710964202880859,
  'train_acc': 0.6761083743842364,
  'valid_loss': 0.561151385307312,
  'valid_acc': 0.6875},
 {'epoch': 34.0,
  'train_loss': 0.5687679422312769,
  'train_acc': 0.6804187192118227,
  'valid_loss': 0.5587806701660156,
  'valid_acc': 0.6953125},
 {'epoch': 35.0,
  'train_loss': 0.5664656408901872,
  'train_acc': 0.6825738916256158,
  'valid_loss': 0.5564355850219727,
  'valid_acc': 0.6953125},
 {'epoch': 36.0,
  'train_loss': 0.5641893847235318,
  'train_acc': 0.6825738916256158,
  'valid_loss': 0.5541156530380249,
  'valid_acc': 0.6953125},
 {'epoch': 37.0,
  'train_loss': 0.5619383844836005,
  'train_acc': 0.686884236453202,
  'valid_loss': 0.5518207550048828,
  'valid_acc': 0.6953125},
 {'epoch': 38.0,
  'train_loss': 0.5597124428584658,
  'train_acc': 0.686884236453202,
  'valid_loss': 0.5495502352714539,
  'valid_acc': 0.6953125},
 {'epoch': 39.0,
  'train_loss': 0.5575110994536301,
  'train_acc': 0.6890394088669951,
  'valid_loss': 0.5473039746284485,
  'valid_acc': 0.6953125},
 {'epoch': 40.0,
  'train_loss': 0.5553342227278084,
  'train_acc': 0.6976600985221675,
  'valid_loss': 0.5450814962387085,
  'valid_acc': 0.6953125},
 {'epoch': 41.0,
  'train_loss': 0.5531812865158607,
  'train_acc': 0.708435960591133,
  'valid_loss': 0.5428824424743652,
  'valid_acc': 0.6953125},
 {'epoch': 42.0,
  'train_loss': 0.551051929079253,
  'train_acc': 0.7155172413793104,
  'valid_loss': 0.5407066345214844,
  'valid_acc': 0.6953125},
 {'epoch': 43.0,
  'train_loss': 0.5489459531060581,
  'train_acc': 0.7176724137931034,
  'valid_loss': 0.5385535955429077,
  'valid_acc': 0.703125},
 {'epoch': 44.0,
  'train_loss': 0.5468627995458143,
  'train_acc': 0.7176724137931034,
  'valid_loss': 0.5364230871200562,
  'valid_acc': 0.703125},
 {'epoch': 45.0,
  'train_loss': 0.5448023697425579,
  'train_acc': 0.7198275862068966,
  'valid_loss': 0.5343146920204163,
  'valid_acc': 0.703125},
 {'epoch': 46.0,
  'train_loss': 0.5427641704164702,
  'train_acc': 0.7219827586206896,
  'valid_loss': 0.5322281718254089,
  'valid_acc': 0.703125},
 {'epoch': 47.0,
  'train_loss': 0.5407478398290174,
  'train_acc': 0.7306034482758621,
  'valid_loss': 0.5301631689071655,
  'valid_acc': 0.7109375},
 {'epoch': 48.0,
  'train_loss': 0.538753345094878,
  'train_acc': 0.7306034482758621,
  'valid_loss': 0.5281195044517517,
  'valid_acc': 0.7109375},
 {'epoch': 49.0,
  'train_loss': 0.5367800942782698,
  'train_acc': 0.7306034482758621,
  'valid_loss': 0.5260966420173645,
  'valid_acc': 0.7109375},
 {'epoch': 50.0,
  'train_loss': 0.5348278900672649,
  'train_acc': 0.7349137931034483,
  'valid_loss': 0.5240945219993591,
  'valid_acc': 0.7109375},
 {'epoch': 51.0,
  'train_loss': 0.5328963049526872,
  'train_acc': 0.7370689655172413,
  'valid_loss': 0.5221127271652222,
  'valid_acc': 0.71875},
 {'epoch': 52.0,
  'train_loss': 0.5309851745079304,
  'train_acc': 0.7392241379310345,
  'valid_loss': 0.520150899887085,
  'valid_acc': 0.71875},
 {'epoch': 53.0,
  'train_loss': 0.5290941369944605,
  'train_acc': 0.7413793103448276,
  'valid_loss': 0.5182088613510132,
  'valid_acc': 0.71875},
 {'epoch': 54.0,
  'train_loss': 0.5272229951003502,
  'train_acc': 0.7413793103448276,
  'valid_loss': 0.516286313533783,
  'valid_acc': 0.7265625},
 {'epoch': 55.0,
  'train_loss': 0.5253712555457806,
  'train_acc': 0.7456896551724138,
  'valid_loss': 0.5143830180168152,
  'valid_acc': 0.734375},
 {'epoch': 56.0,
  'train_loss': 0.5235388525601091,
  'train_acc': 0.7456896551724138,
  'valid_loss': 0.5124986171722412,
  'valid_acc': 0.7421875},
 {'epoch': 57.0,
  'train_loss': 0.5217253586341595,
  'train_acc': 0.7478448275862069,
  'valid_loss': 0.5106328129768372,
  'valid_acc': 0.7421875},
 {'epoch': 58.0,
  'train_loss': 0.5199305764560042,
  'train_acc': 0.75,
  'valid_loss': 0.5087854862213135,
  'valid_acc': 0.7421875},
 {'epoch': 59.0,
  'train_loss': 0.5181542429430731,
  'train_acc': 0.7543103448275862,
  'valid_loss': 0.5069563388824463,
  'valid_acc': 0.7421875},
 {'epoch': 60.0,
  'train_loss': 0.5163960950127964,
  'train_acc': 0.7607758620689655,
  'valid_loss': 0.5051449537277222,
  'valid_acc': 0.7421875},
 {'epoch': 61.0,
  'train_loss': 0.5146557380413187,
  'train_acc': 0.7650862068965517,
  'valid_loss': 0.5033513307571411,
  'valid_acc': 0.7421875},
 {'epoch': 62.0,
  'train_loss': 0.512933139143319,
  'train_acc': 0.7672413793103449,
  'valid_loss': 0.5015749931335449,
  'valid_acc': 0.7578125},
 {'epoch': 63.0,
  'train_loss': 0.5112278379242996,
  'train_acc': 0.771551724137931,
  'valid_loss': 0.49981582164764404,
  'valid_acc': 0.7578125},
 {'epoch': 64.0,
  'train_loss': 0.509539768613618,
  'train_acc': 0.7737068965517241,
  'valid_loss': 0.4980735778808594,
  'valid_acc': 0.765625},
 {'epoch': 65.0,
  'train_loss': 0.5078684708167767,
  'train_acc': 0.7737068965517241,
  'valid_loss': 0.4963480234146118,
  'valid_acc': 0.765625},
 {'epoch': 66.0,
  'train_loss': 0.5062139116484543,
  'train_acc': 0.7737068965517241,
  'valid_loss': 0.4946388006210327,
  'valid_acc': 0.765625},
 {'epoch': 67.0,
  'train_loss': 0.5045756964847959,
  'train_acc': 0.7758620689655172,
  'valid_loss': 0.4929458796977997,
  'valid_acc': 0.765625},
 {'epoch': 68.0,
  'train_loss': 0.5029537924404802,
  'train_acc': 0.7758620689655172,
  'valid_loss': 0.49126893281936646,
  'valid_acc': 0.765625},
 {'epoch': 69.0,
  'train_loss': 0.501347673350367,
  'train_acc': 0.7801724137931034,
  'valid_loss': 0.48960769176483154,
  'valid_acc': 0.765625},
 {'epoch': 70.0,
  'train_loss': 0.4997573720997778,
  'train_acc': 0.7801724137931034,
  'valid_loss': 0.487962007522583,
  'valid_acc': 0.765625},
 {'epoch': 71.0,
  'train_loss': 0.49818255983549975,
  'train_acc': 0.7844827586206896,
  'valid_loss': 0.4863317012786865,
  'valid_acc': 0.765625},
 {'epoch': 72.0,
  'train_loss': 0.4966230063602842,
  'train_acc': 0.7844827586206896,
  'valid_loss': 0.4847164750099182,
  'valid_acc': 0.765625},
 {'epoch': 73.0,
  'train_loss': 0.4950785472475249,
  'train_acc': 0.7866379310344828,
  'valid_loss': 0.48311617970466614,
  'valid_acc': 0.765625},
 {'epoch': 74.0,
  'train_loss': 0.4935490509559368,
  'train_acc': 0.790948275862069,
  'valid_loss': 0.4815305769443512,
  'valid_acc': 0.765625},
 {'epoch': 75.0,
  'train_loss': 0.49203392554973735,
  'train_acc': 0.7931034482758621,
  'valid_loss': 0.47995948791503906,
  'valid_acc': 0.765625},
 {'epoch': 76.0,
  'train_loss': 0.49053353276746026,
  'train_acc': 0.7931034482758621,
  'valid_loss': 0.4784027636051178,
  'valid_acc': 0.765625},
 {'epoch': 77.0,
  'train_loss': 0.489047280673323,
  'train_acc': 0.7952586206896551,
  'valid_loss': 0.47686007618904114,
  'valid_acc': 0.765625},
 {'epoch': 78.0,
  'train_loss': 0.4875750706113618,
  'train_acc': 0.7974137931034483,
  'valid_loss': 0.4753313660621643,
  'valid_acc': 0.765625},
 {'epoch': 79.0,
  'train_loss': 0.4861166723843279,
  'train_acc': 0.8017241379310345,
  'valid_loss': 0.4738163650035858,
  'valid_acc': 0.765625},
 {'epoch': 80.0,
  'train_loss': 0.4846720531068999,
  'train_acc': 0.8017241379310345,
  'valid_loss': 0.4723149538040161,
  'valid_acc': 0.765625},
 {'epoch': 81.0,
  'train_loss': 0.4832408181552229,
  'train_acc': 0.8060344827586207,
  'valid_loss': 0.4708269238471985,
  'valid_acc': 0.765625},
 {'epoch': 82.0,
  'train_loss': 0.48182293464397563,
  'train_acc': 0.8103448275862069,
  'valid_loss': 0.4693520665168762,
  'valid_acc': 0.7734375},
 {'epoch': 83.0,
  'train_loss': 0.48041820526123047,
  'train_acc': 0.8146551724137931,
  'valid_loss': 0.4678902328014374,
  'valid_acc': 0.7734375},
 {'epoch': 84.0,
  'train_loss': 0.47902633403909617,
  'train_acc': 0.8168103448275862,
  'valid_loss': 0.46644127368927,
  'valid_acc': 0.78125},
 {'epoch': 85.0,
  'train_loss': 0.4776471894362877,
  'train_acc': 0.8168103448275862,
  'valid_loss': 0.4650050103664398,
  'valid_acc': 0.78125},
 {'epoch': 86.0,
  'train_loss': 0.4762807056821626,
  'train_acc': 0.8168103448275862,
  'valid_loss': 0.46358129382133484,
  'valid_acc': 0.78125},
 {'epoch': 87.0,
  'train_loss': 0.47492652103818694,
  'train_acc': 0.8168103448275862,
  'valid_loss': 0.4621698558330536,
  'valid_acc': 0.78125},
 {'epoch': 88.0,
  'train_loss': 0.47358470127500335,
  'train_acc': 0.8211206896551724,
  'valid_loss': 0.4607706367969513,
  'valid_acc': 0.7890625},
 {'epoch': 89.0,
  'train_loss': 0.4722549504247205,
  'train_acc': 0.8211206896551724,
  'valid_loss': 0.45938345789909363,
  'valid_acc': 0.7890625},
 {'epoch': 90.0,
  'train_loss': 0.4709371040607321,
  'train_acc': 0.8211206896551724,
  'valid_loss': 0.45800817012786865,
  'valid_acc': 0.7890625},
 {'epoch': 91.0,
  'train_loss': 0.46963099775643186,
  'train_acc': 0.8232758620689655,
  'valid_loss': 0.45664459466934204,
  'valid_acc': 0.7890625},
 {'epoch': 92.0,
  'train_loss': 0.468336532855856,
  'train_acc': 0.8275862068965517,
  'valid_loss': 0.45529258251190186,
  'valid_acc': 0.7890625},
 {'epoch': 93.0,
  'train_loss': 0.46705347916175577,
  'train_acc': 0.834051724137931,
  'valid_loss': 0.45395195484161377,
  'valid_acc': 0.796875},
 {'epoch': 94.0,
  'train_loss': 0.4657817709034887,
  'train_acc': 0.8362068965517241,
  'valid_loss': 0.452622652053833,
  'valid_acc': 0.8046875},
 {'epoch': 95.0,
  'train_loss': 0.4645211449984846,
  'train_acc': 0.8383620689655172,
  'valid_loss': 0.45130446553230286,
  'valid_acc': 0.8046875},
 {'epoch': 96.0,
  'train_loss': 0.4632716014467437,
  'train_acc': 0.8426724137931034,
  'valid_loss': 0.44999733567237854,
  'valid_acc': 0.8046875},
 {'epoch': 97.0,
  'train_loss': 0.46203294293633823,
  'train_acc': 0.8448275862068966,
  'valid_loss': 0.4487009644508362,
  'valid_acc': 0.8046875},
 {'epoch': 98.0,
  'train_loss': 0.4608049721553408,
  'train_acc': 0.8448275862068966,
  'valid_loss': 0.447415292263031,
  'valid_acc': 0.8046875},
 {'epoch': 99.0,
  'train_loss': 0.45958755756246633,
  'train_acc': 0.8448275862068966,
  'valid_loss': 0.4461402893066406,
  'valid_acc': 0.8046875},
 {'epoch': 100.0,
  'train_loss': 0.45838060050175106,
  'train_acc': 0.8448275862068966,
  'valid_loss': 0.44487571716308594,
  'valid_acc': 0.8046875},
 {'epoch': 101.0,
  'train_loss': 0.4571839036612675,
  'train_acc': 0.8448275862068966,
  'valid_loss': 0.443621426820755,
  'valid_acc': 0.8046875},
 {'epoch': 102.0,
  'train_loss': 0.4559974341556944,
  'train_acc': 0.8448275862068966,
  'valid_loss': 0.44237738847732544,
  'valid_acc': 0.8046875},
 {'epoch': 103.0,
  'train_loss': 0.45482102755842535,
  'train_acc': 0.8491379310344828,
  'valid_loss': 0.44114333391189575,
  'valid_acc': 0.8046875},
 {'epoch': 104.0,
  'train_loss': 0.45365445367221174,
  'train_acc': 0.8491379310344828,
  'valid_loss': 0.43991929292678833,
  'valid_acc': 0.8046875},
 {'epoch': 105.0,
  'train_loss': 0.45249781115301724,
  'train_acc': 0.8491379310344828,
  'valid_loss': 0.4387049674987793,
  'valid_acc': 0.8046875},
 {'epoch': 106.0,
  'train_loss': 0.45135073826230804,
  'train_acc': 0.8491379310344828,
  'valid_loss': 0.4375004470348358,
  'valid_acc': 0.8125},
 {'epoch': 107.0,
  'train_loss': 0.45021320211476296,
  'train_acc': 0.8491379310344828,
  'valid_loss': 0.4363054931163788,
  'valid_acc': 0.8125},
 {'epoch': 108.0,
  'train_loss': 0.4490851040544181,
  'train_acc': 0.8491379310344828,
  'valid_loss': 0.4351199269294739,
  'valid_acc': 0.8203125},
 {'epoch': 109.0,
  'train_loss': 0.4479663783106311,
  'train_acc': 0.8512931034482759,
  'valid_loss': 0.4339437484741211,
  'valid_acc': 0.8203125},
 {'epoch': 110.0,
  'train_loss': 0.4468568604567955,
  'train_acc': 0.8512931034482759,
  'valid_loss': 0.4327767789363861,
  'valid_acc': 0.8203125},
 {'epoch': 111.0,
  'train_loss': 0.4457563202956627,
  'train_acc': 0.8512931034482759,
  'valid_loss': 0.43161892890930176,
  'valid_acc': 0.8203125},
 {'epoch': 112.0,
  'train_loss': 0.4446647249419114,
  'train_acc': 0.853448275862069,
  'valid_loss': 0.4304701089859009,
  'valid_acc': 0.8203125},
 {'epoch': 113.0,
  'train_loss': 0.4435821072808627,
  'train_acc': 0.853448275862069,
  'valid_loss': 0.4293302297592163,
  'valid_acc': 0.8203125},
 {'epoch': 114.0,
  'train_loss': 0.442508105573983,
  'train_acc': 0.8599137931034483,
  'valid_loss': 0.42819908261299133,
  'valid_acc': 0.8203125},
 {'epoch': 115.0,
  'train_loss': 0.44144278559191474,
  'train_acc': 0.8599137931034483,
  'valid_loss': 0.42707669734954834,
  'valid_acc': 0.8203125},
 {'epoch': 116.0,
  'train_loss': 0.44038595002273034,
  'train_acc': 0.8599137931034483,
  'valid_loss': 0.4259628653526306,
  'valid_acc': 0.8203125},
 {'epoch': 117.0,
  'train_loss': 0.43933756598110857,
  'train_acc': 0.8599137931034483,
  'valid_loss': 0.424857497215271,
  'valid_acc': 0.8203125},
 {'epoch': 118.0,
  'train_loss': 0.43829746904044314,
  'train_acc': 0.8620689655172413,
  'valid_loss': 0.4237605333328247,
  'valid_acc': 0.8203125},
 {'epoch': 119.0,
  'train_loss': 0.4372657249713766,
  'train_acc': 0.8642241379310345,
  'valid_loss': 0.4226718544960022,
  'valid_acc': 0.8984375},
 {'epoch': 120.0,
  'train_loss': 0.43624200492069637,
  'train_acc': 0.8642241379310345,
  'valid_loss': 0.4215913414955139,
  'valid_acc': 0.8984375},
 {'epoch': 121.0,
  'train_loss': 0.43522624311776,
  'train_acc': 0.8642241379310345,
  'valid_loss': 0.4205189347267151,
  'valid_acc': 0.8984375},
 {'epoch': 122.0,
  'train_loss': 0.4342184724478886,
  'train_acc': 0.8663793103448276,
  'valid_loss': 0.41945451498031616,
  'valid_acc': 0.8984375},
 {'epoch': 123.0,
  'train_loss': 0.43321859425511855,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.41839802265167236,
  'valid_acc': 0.8984375},
 {'epoch': 124.0,
  'train_loss': 0.4322263454568797,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.41734933853149414,
  'valid_acc': 0.8984375},
 {'epoch': 125.0,
  'train_loss': 0.43124179182381467,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.4163084030151367,
  'valid_acc': 0.8984375},
 {'epoch': 126.0,
  'train_loss': 0.43026470315867454,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.41527506709098816,
  'valid_acc': 0.8984375},
 {'epoch': 127.0,
  'train_loss': 0.42929527677338697,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.41424936056137085,
  'valid_acc': 0.90625},
 {'epoch': 128.0,
  'train_loss': 0.42833305227345436,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.4132310152053833,
  'valid_acc': 0.90625},
 {'epoch': 129.0,
  'train_loss': 0.4273781940854829,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.4122201204299927,
  'valid_acc': 0.90625},
 {'epoch': 130.0,
  'train_loss': 0.4264304391269026,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.41121649742126465,
  'valid_acc': 0.90625},
 {'epoch': 131.0,
  'train_loss': 0.4254899189389985,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.4102201461791992,
  'valid_acc': 0.90625},
 {'epoch': 132.0,
  'train_loss': 0.42455633755387934,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.4092308580875397,
  'valid_acc': 0.90625},
 {'epoch': 133.0,
  'train_loss': 0.42362982651283,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.4082486629486084,
  'valid_acc': 0.90625},
 {'epoch': 134.0,
  'train_loss': 0.4227101556186018,
  'train_acc': 0.8685344827586207,
  'valid_loss': 0.40727338194847107,
  'valid_acc': 0.90625},
 {'epoch': 135.0,
  'train_loss': 0.4217972919858735,
  'train_acc': 0.8706896551724138,
  'valid_loss': 0.4063050448894501,
  'valid_acc': 0.90625},
 {'epoch': 136.0,
  'train_loss': 0.4208910711880388,
  'train_acc': 0.8728448275862069,
  'valid_loss': 0.40534353256225586,
  'valid_acc': 0.90625},
 {'epoch': 137.0,
  'train_loss': 0.41999155899574014,
  'train_acc': 0.8728448275862069,
  'valid_loss': 0.40438878536224365,
  'valid_acc': 0.90625},
 {'epoch': 138.0,
  'train_loss': 0.41909852521172886,
  'train_acc': 0.8728448275862069,
  'valid_loss': 0.4034406840801239,
  'valid_acc': 0.90625},
 {'epoch': 139.0,
  'train_loss': 0.41821210137728987,
  'train_acc': 0.8728448275862069,
  'valid_loss': 0.4024991989135742,
  'valid_acc': 0.90625},
 {'epoch': 140.0,
  'train_loss': 0.4173319586392107,
  'train_acc': 0.8728448275862069,
  'valid_loss': 0.40156421065330505,
  'valid_acc': 0.90625},
 {'epoch': 141.0,
  'train_loss': 0.41645826142409753,
  'train_acc': 0.875,
  'valid_loss': 0.4006357192993164,
  'valid_acc': 0.90625},
 {'epoch': 142.0,
  'train_loss': 0.4155908781906654,
  'train_acc': 0.875,
  'valid_loss': 0.39971357583999634,
  'valid_acc': 0.90625},
 {'epoch': 143.0,
  'train_loss': 0.4147295458563443,
  'train_acc': 0.875,
  'valid_loss': 0.39879775047302246,
  'valid_acc': 0.90625},
 {'epoch': 144.0,
  'train_loss': 0.4138744617330617,
  'train_acc': 0.875,
  'valid_loss': 0.39788818359375,
  'valid_acc': 0.90625},
 {'epoch': 145.0,
  'train_loss': 0.41302542850889007,
  'train_acc': 0.875,
  'valid_loss': 0.39698484539985657,
  'valid_acc': 0.90625},
 {'epoch': 146.0,
  'train_loss': 0.4121824461838295,
  'train_acc': 0.875,
  'valid_loss': 0.3960876166820526,
  'valid_acc': 0.90625},
 {'epoch': 147.0,
  'train_loss': 0.4113452845606311,
  'train_acc': 0.875,
  'valid_loss': 0.3951963782310486,
  'valid_acc': 0.90625},
 {'epoch': 148.0,
  'train_loss': 0.4105140422952586,
  'train_acc': 0.875,
  'valid_loss': 0.3943111300468445,
  'valid_acc': 0.90625},
 {'epoch': 149.0,
  'train_loss': 0.40968858784642714,
  'train_acc': 0.875,
  'valid_loss': 0.3934318423271179,
  'valid_acc': 0.90625},
 {'epoch': 150.0,
  'train_loss': 0.40886885544349405,
  'train_acc': 0.875,
  'valid_loss': 0.39255842566490173,
  'valid_acc': 0.90625}]
In [258]:
import hiplot as hip
hip.Experiment.from_iterable(data).display(force_full_width = True)
HiPlot
Loading HiPlot...
Out[258]:
<hiplot.ipython.IPythonExperimentDisplayed at 0x1e0b3bc9c18>

The above graph gives us a very nice way to visualize our expected general patterns: as the number of epoch increases, train and validation loss decreases while train and validation accuracy increase

Further, we can investigate the magnitude of each weight parameter to shed insight on the variables that had a higher level of influence on our prediction (assuming that higher magnitudes correlate with higher importance)

NOTE: this is only possible as our model is just a one layer linear operation. If this was a "deep" model, interpretation by weight magnitude would not be possible

In [259]:
params = list(model.parameters())[0].detach().cpu().view(-1).numpy()
param_df = pd.DataFrame(params).T
param_df.columns = df.columns
param_df
Out[259]:
radius_mean texture_mean perimeter_mean area_mean smoothness_mean compactness_mean concavity_mean points_mean symmetry_mean dimension_mean ... radius_worst texture_worst perimeter_worst area_worst smoothness_worst compactness_worst concavity_worst points_worst symmetry_worst dimension_worst
0 1.161765 0.677237 2.255095 3.453925 -0.381582 2.046112 5.107539 3.268645 0.249327 0.61387 ... 2.525464 0.678428 -0.570225 5.125237 0.906672 3.380883 1.896072 4.835514 2.884102 2.432787

1 rows × 30 columns

In [260]:
plt.figure(figsize = (4,8))
sns.heatmap(param_df.T.sort_values(0,ascending = False))
Out[260]:
<matplotlib.axes._subplots.AxesSubplot at 0x1e09d070828>

Assuming that weights with higher magnitudes equate to a higher level of importance, we can see that

  1. area_worst
  2. area_se and
  3. points_worst

were the top 3 variables that helped us differentiate between Malign and Benign cells.

Conclusion

Logistic Regression is a poweful method to analyze the relationship between quantitative and binary qualitative variables that uses the techniques of DL to present a meaninful relationship.

Given its "shallow" architecture in comparison with alternative DL architectures, it does not necessitate much data to learn relationships.

As such, Logistic Regression is an important concept to have on anyone's Data Science arsenal