Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

Bidirectional Multi-layer RNN with LSTM, Using Your Own Dataset in CSV Format (AG News)

Dataset Description

AG's News Topic Classification Dataset

Version 3, Updated 09/09/2015


ORIGIN

AG is a collection of more than 1 million news articles. The articles were gathered from more than 2,000 news sources by ComeToMyHead, an academic news search engine that has been running since July 2004, over the course of more than a year of activity. The dataset is provided to the academic community for research purposes in data mining (clustering, classification, etc.), information retrieval (ranking, search, etc.), XML, data compression, data streaming, and any other non-commercial activity. For more information, please refer to http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html .

The AG's news topic classification dataset was constructed by Xiang Zhang ([email protected]) from the collection above. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).


DESCRIPTION

The AG's news topic classification dataset is constructed by choosing the 4 largest classes from the original corpus. Each class contains 30,000 training samples and 1,900 testing samples, for a total of 120,000 training samples and 7,600 testing samples.

The file classes.txt contains a list of classes corresponding to each label.

The files train.csv and test.csv contain the training and testing samples, respectively, as comma-separated values. There are 3 columns in them, corresponding to class index (1 to 4), title, and description. The title and description are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed by an "n" character, that is, "\n".
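The escaping convention above is exactly what Python's standard csv module expects for the quoting; only the literal "\n" markers need to be handled separately. A minimal sketch with a made-up sample row:

```python
import csv
import io

# A sample row in the AG News convention: fields are double-quoted,
# internal double quotes are doubled, and newlines are stored as the
# two literal characters backslash + "n". (This row is made up for
# illustration; it is not from the actual dataset.)
row = '3,"Wall St. ""Bears""","Short-sellers retreat.\\nMarkets rally."'

label, title, description = next(csv.reader(io.StringIO(row)))
print(label)   # class index comes back as the string '3'
print(title)   # doubled quotes are unescaped: Wall St. "Bears"

# Restore the escaped newlines (here replaced by spaces):
print(description.replace('\\n', ' '))
```

Note that csv.reader handles the quote-doubling automatically; the "\n" replacement is the only dataset-specific step.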
In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch


import torch
import torch.nn.functional as F
from torchtext import data
from torchtext import datasets
import time
import random
import pandas as pd
import numpy as np

torch.backends.cudnn.deterministic = True
Sebastian Raschka 

CPython 3.7.3
IPython 7.9.0

torch 1.4.0

General Settings

In [2]:
RANDOM_SEED = 123
torch.manual_seed(RANDOM_SEED)

VOCABULARY_SIZE = 5000
LEARNING_RATE = 1e-3
BATCH_SIZE = 128
NUM_EPOCHS = 50
DROPOUT = 0.5
DEVICE = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

EMBEDDING_DIM = 128
BIDIRECTIONAL = True
HIDDEN_DIM = 256
NUM_LAYERS = 2
OUTPUT_DIM = 4

Dataset

The AG News dataset is available from Xiang Zhang's Google Drive folder at

https://drive.google.com/drive/u/0/folders/0Bz8a_Dbh9Qhbfll6bVpmNUtUcFdjYmF2SEpmZUZUcVNiMUw1TWN6RDV3a0JHT3kxLVhVR2M

From the Google Drive folder, download the file

  • ag_news_csv.tar.gz
In [3]:
# !tar xvzf  ag_news_csv.tar.gz
In [4]:
!cat ag_news_csv/classes.txt
World
Sports
Business
Sci/Tech

Check that the dataset looks okay:

In [5]:
df = pd.read_csv('ag_news_csv/train.csv', header=None, index_col=None)
df.columns = ['classlabel', 'title', 'content']
df['classlabel'] = df['classlabel']-1
df.head()
Out[5]:
classlabel title content
0 2 Wall St. Bears Claw Back Into the Black (Reuters) Reuters - Short-sellers, Wall Street's dwindli...
1 2 Carlyle Looks Toward Commercial Aerospace (Reu... Reuters - Private investment firm Carlyle Grou...
2 2 Oil and Economy Cloud Stocks' Outlook (Reuters) Reuters - Soaring crude prices plus worries\ab...
3 2 Iraq Halts Oil Exports from Main Southern Pipe... Reuters - Authorities have halted oil export\f...
4 2 Oil prices soar to all-time record, posing new... AFP - Tearaway world oil prices, toppling reco...
In [6]:
np.unique(df['classlabel'].values)
Out[6]:
array([0, 1, 2, 3])
In [7]:
np.bincount(df['classlabel'])
Out[7]:
array([30000, 30000, 30000, 30000])
In [8]:
df[['classlabel', 'content']].to_csv('ag_news_csv/train_prepocessed.csv', index=None)
In [9]:
df = pd.read_csv('ag_news_csv/test.csv', header=None, index_col=None)
df.columns = ['classlabel', 'title', 'content']
df['classlabel'] = df['classlabel']-1
df.head()
Out[9]:
classlabel title content
0 2 Fears for T N pension after talks Unions representing workers at Turner Newall...
1 3 The Race is On: Second Private Team Sets Launc... SPACE.com - TORONTO, Canada -- A second\team o...
2 3 Ky. Company Wins Grant to Study Peptides (AP) AP - A company founded by a chemistry research...
3 3 Prediction Unit Helps Forecast Wildfires (AP) AP - It's barely dawn when Mike Fitzpatrick st...
4 3 Calif. Aims to Limit Farm-Related Smog (AP) AP - Southern California's smog-fighting agenc...
In [10]:
np.unique(df['classlabel'].values)
Out[10]:
array([0, 1, 2, 3])
In [11]:
np.bincount(df['classlabel'])
Out[11]:
array([1900, 1900, 1900, 1900])
In [12]:
df[['classlabel', 'content']].to_csv('ag_news_csv/test_prepocessed.csv', index=None)
In [13]:
del df

Define the Label and Text field formatters:

In [14]:
TEXT = data.Field(sequential=True,
                  tokenize='spacy',
                  include_lengths=True) # necessary for packed_padded_sequence

LABEL = data.LabelField(dtype=torch.float)


# If you get an error [E050] Can't find model 'en'
# you need to run the following on your command line:
#  python -m spacy download en

Process the dataset:

In [15]:
fields = [('classlabel', LABEL), ('content', TEXT)]

train_dataset = data.TabularDataset(
    path="ag_news_csv/train_prepocessed.csv", format='csv',
    skip_header=True, fields=fields)

test_dataset = data.TabularDataset(
    path="ag_news_csv/test_prepocessed.csv", format='csv',
    skip_header=True, fields=fields)

Split the training dataset into training and validation:

In [16]:
train_data, valid_data = train_dataset.split(
    split_ratio=[0.95, 0.05],
    random_state=random.seed(RANDOM_SEED))

print(f'Num Train: {len(train_data)}')
print(f'Num Valid: {len(valid_data)}')
Num Train: 114000
Num Valid: 6000

Build the vocabulary based on the top "VOCABULARY_SIZE" words:

In [17]:
TEXT.build_vocab(train_data,
                 max_size=VOCABULARY_SIZE,
                 vectors='glove.6B.100d',
                 unk_init=torch.Tensor.normal_)

LABEL.build_vocab(train_data)

print(f'Vocabulary size: {len(TEXT.vocab)}')
print(f'Number of classes: {len(LABEL.vocab)}')
Vocabulary size: 5002
Number of classes: 4
In [18]:
list(LABEL.vocab.freqs)[-10:]
Out[18]:
['1', '3', '0', '2']

The TEXT.vocab dictionary will contain the word counts and indices. The reason why the number of words is VOCABULARY_SIZE + 2 is that it contains two special tokens for unknown and padded words: <unk> and <pad>.
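The vocabulary construction that build_vocab performs can be sketched without torchtext: count token frequencies, keep the max_size most frequent tokens, and prepend the two special tokens (torchtext places <unk> at index 0 and <pad> at index 1 by default). This is a simplified illustration, not torchtext's actual implementation:

```python
from collections import Counter

def build_vocab(tokenized_texts, max_size):
    # Count token frequencies over the (tokenized) training corpus
    counts = Counter(tok for text in tokenized_texts for tok in text)
    # Keep the top max_size tokens; special tokens come first, so the
    # final vocabulary has max_size + 2 entries, as in the cell above
    itos = ['<unk>', '<pad>'] + [w for w, _ in counts.most_common(max_size)]
    stoi = {w: i for i, w in enumerate(itos)}
    return itos, stoi

texts = [['the', 'market', 'fell'], ['the', 'team', 'won']]
itos, stoi = build_vocab(texts, max_size=3)
print(len(itos))   # 5 = max_size + 2 special tokens
print(itos[2])     # 'the' (the most frequent token)
```

Out-of-vocabulary tokens would then map to stoi['<unk>'], which is why the <unk> row of the embedding matrix also needs an initialization (the unk_init argument above).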

Make dataset iterators:

In [19]:
train_loader, valid_loader, test_loader = data.BucketIterator.splits(
    (train_data, valid_data, test_dataset), 
    batch_size=BATCH_SIZE,
    sort_within_batch=True, # necessary for packed_padded_sequence
    sort_key=lambda x: len(x.content),
    device=DEVICE)

Testing the iterators (note that the number of rows depends on the longest document in the respective batch):
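The reason the number of rows varies is that each batch is padded to its own longest document, producing a matrix of shape (max_len, batch_size); BucketIterator additionally groups documents of similar length (via sort_key) so that little padding is wasted. A pure-Python sketch of the padding step, assuming <pad> sits at index 1 as in torchtext's default vocabulary:

```python
PAD_IDX = 1  # index of <pad> in torchtext's default vocabulary

def pad_batch(sequences, pad_idx=PAD_IDX):
    # Pad every sequence of token indices to the length of the longest
    # one in the batch, then transpose to torchtext's default
    # sequence-first layout: (max_len, batch_size)
    max_len = max(len(seq) for seq in sequences)
    padded = [seq + [pad_idx] * (max_len - len(seq)) for seq in sequences]
    return [list(col) for col in zip(*padded)]

batch = [[4, 9, 7], [5, 2], [8]]
matrix = pad_batch(batch)
print(len(matrix), len(matrix[0]))  # 3 3 -> max_len x batch_size
```

The per-sequence lengths (here 3, 2, 1) are what include_lengths=True returns alongside the matrix, and what pack_padded_sequence later uses to skip the padded positions.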

In [20]:
print('Train')
for batch in train_loader:
    print(f'Text matrix size: {batch.content[0].size()}')
    print(f'Target vector size: {batch.classlabel.size()}')
    break
    
print('\nValid:')
for batch in valid_loader:
    print(f'Text matrix size: {batch.content[0].size()}')
    print(f'Target vector size: {batch.classlabel.size()}')
    break
    
print('\nTest:')
for batch in test_loader:
    print(f'Text matrix size: {batch.content[0].size()}')
    print(f'Target vector size: {batch.classlabel.size()}')
    break
Train
Text matrix size: torch.Size([35, 128])
Target vector size: torch.Size([128])

Valid:
Text matrix size: torch.Size([17, 128])
Target vector size: torch.Size([128])

Test:
Text matrix size: torch.Size([16, 128])
Target vector size: torch.Size([128])

Model

In [21]:
import torch.nn as nn


class RNN(nn.Module):
    def __init__(self, input_dim, embedding_dim, bidirectional, hidden_dim, num_layers, output_dim, dropout, pad_idx):
        
        super().__init__()
        
        self.embedding = nn.Embedding(input_dim, embedding_dim, padding_idx=pad_idx)
        self.rnn = nn.LSTM(embedding_dim, 
                           hidden_dim,
                           num_layers=num_layers,
                           bidirectional=bidirectional, 
                           dropout=dropout)
        # the 2 accounts for the forward & backward direction of the last
        # layer (here it coincides with num_layers, which is also 2)
        self.fc1 = nn.Linear(hidden_dim * 2, 64)
        self.fc2 = nn.Linear(64, output_dim)
        self.dropout = nn.Dropout(dropout)
        
    def forward(self, text, text_length):

        embedded = self.dropout(self.embedding(text))
        packed_embedded = nn.utils.rnn.pack_padded_sequence(embedded, text_length)
        packed_output, (hidden, cell) = self.rnn(packed_embedded)
        # output, output_lengths = nn.utils.rnn.pad_packed_sequence(packed_output)
        # concatenate the final forward (hidden[-2]) and backward
        # (hidden[-1]) hidden states of the top layer
        hidden = self.dropout(torch.cat((hidden[-2,:,:], hidden[-1,:,:]), dim=1))
        hidden = self.fc1(hidden)
        hidden = self.dropout(hidden)
        hidden = self.fc2(hidden)
        return hidden
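For a multi-layer bidirectional LSTM, PyTorch returns the final hidden state with shape (num_layers * num_directions, batch, hidden_dim), ordered layer by layer with the forward direction before the backward one; hidden[-2] and hidden[-1] are therefore the forward and backward states of the top layer, which is what the forward method above concatenates. A pure-Python sketch of that indexing, with strings standing in for the actual state tensors:

```python
# Layout of the final hidden state for num_layers=2, bidirectional=True:
#   index 0: layer 0 forward    index 1: layer 0 backward
#   index 2: layer 1 forward    index 3: layer 1 backward
hidden = ['l0_fwd', 'l0_bwd', 'l1_fwd', 'l1_bwd']

# The model concatenates the two directions of the *last* layer:
top_fwd, top_bwd = hidden[-2], hidden[-1]
print(top_fwd, top_bwd)  # l1_fwd l1_bwd
```

Indexing from the end (-2, -1) keeps the selection correct for any num_layers, since the top layer's states are always last.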
In [22]:
INPUT_DIM = len(TEXT.vocab)

PAD_IDX = TEXT.vocab.stoi[TEXT.pad_token]

torch.manual_seed(RANDOM_SEED)
model = RNN(INPUT_DIM, EMBEDDING_DIM, BIDIRECTIONAL, HIDDEN_DIM, NUM_LAYERS, OUTPUT_DIM, DROPOUT, PAD_IDX)
model = model.to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

Training

In [23]:
def compute_accuracy(model, data_loader, device):
    model.eval()
    correct_pred, num_examples = 0, 0
    with torch.no_grad():
        for batch_idx, batch_data in enumerate(data_loader):
            text, text_lengths = batch_data.content
            logits = model(text, text_lengths)
            _, predicted_labels = torch.max(logits, 1)
            num_examples += batch_data.classlabel.size(0)
            correct_pred += (predicted_labels.long() == batch_data.classlabel.long()).sum()
        return correct_pred.float()/num_examples * 100
In [24]:
start_time = time.time()

for epoch in range(NUM_EPOCHS):
    model.train()
    for batch_idx, batch_data in enumerate(train_loader):
        
        text, text_lengths = batch_data.content
        
        ### FORWARD AND BACK PROP
        logits = model(text, text_lengths)
        cost = F.cross_entropy(logits, batch_data.classlabel.long())
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 50:
            print (f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} | '
                   f'Cost: {cost:.4f}')

    with torch.set_grad_enabled(False):
        print(f'training accuracy: '
              f'{compute_accuracy(model, train_loader, DEVICE):.2f}%'
              f'\nvalid accuracy: '
              f'{compute_accuracy(model, valid_loader, DEVICE):.2f}%')
        
    print(f'Time elapsed: {(time.time() - start_time)/60:.2f} min')
    
print(f'Total Training Time: {(time.time() - start_time)/60:.2f} min')
print(f'Test accuracy: {compute_accuracy(model, test_loader, DEVICE):.2f}%')
Epoch: 001/050 | Batch 000/891 | Cost: 1.3877
Epoch: 001/050 | Batch 050/891 | Cost: 1.2299
Epoch: 001/050 | Batch 100/891 | Cost: 1.0337
Epoch: 001/050 | Batch 150/891 | Cost: 0.8675
Epoch: 001/050 | Batch 200/891 | Cost: 0.8217
Epoch: 001/050 | Batch 250/891 | Cost: 0.6656
Epoch: 001/050 | Batch 300/891 | Cost: 0.6976
Epoch: 001/050 | Batch 350/891 | Cost: 0.7211
Epoch: 001/050 | Batch 400/891 | Cost: 0.5315
Epoch: 001/050 | Batch 450/891 | Cost: 0.5550
Epoch: 001/050 | Batch 500/891 | Cost: 0.5794
Epoch: 001/050 | Batch 550/891 | Cost: 0.5368
Epoch: 001/050 | Batch 600/891 | Cost: 0.4791
Epoch: 001/050 | Batch 650/891 | Cost: 0.6736
Epoch: 001/050 | Batch 700/891 | Cost: 0.4740
Epoch: 001/050 | Batch 750/891 | Cost: 0.9449
Epoch: 001/050 | Batch 800/891 | Cost: 0.5111
Epoch: 001/050 | Batch 850/891 | Cost: 0.4126
training accuracy: 84.93%
valid accuracy: 84.10%
Time elapsed: 0.50 min
Epoch: 002/050 | Batch 000/891 | Cost: 0.4338
Epoch: 002/050 | Batch 050/891 | Cost: 0.4728
Epoch: 002/050 | Batch 100/891 | Cost: 0.4738
Epoch: 002/050 | Batch 150/891 | Cost: 0.5698
Epoch: 002/050 | Batch 200/891 | Cost: 0.5169
Epoch: 002/050 | Batch 250/891 | Cost: 0.3508
Epoch: 002/050 | Batch 300/891 | Cost: 0.5877
Epoch: 002/050 | Batch 350/891 | Cost: 0.5818
Epoch: 002/050 | Batch 400/891 | Cost: 0.4324
Epoch: 002/050 | Batch 450/891 | Cost: 0.3107
Epoch: 002/050 | Batch 500/891 | Cost: 0.4036
Epoch: 002/050 | Batch 550/891 | Cost: 0.4403
Epoch: 002/050 | Batch 600/891 | Cost: 0.6188
Epoch: 002/050 | Batch 650/891 | Cost: 0.3749
Epoch: 002/050 | Batch 700/891 | Cost: 0.4714
Epoch: 002/050 | Batch 750/891 | Cost: 0.4166
Epoch: 002/050 | Batch 800/891 | Cost: 0.4081
Epoch: 002/050 | Batch 850/891 | Cost: 0.5608
training accuracy: 89.13%
valid accuracy: 88.40%
Time elapsed: 1.00 min
Epoch: 003/050 | Batch 000/891 | Cost: 0.3957
Epoch: 003/050 | Batch 050/891 | Cost: 0.4003
Epoch: 003/050 | Batch 100/891 | Cost: 0.3422
Epoch: 003/050 | Batch 150/891 | Cost: 0.3350
Epoch: 003/050 | Batch 200/891 | Cost: 0.2862
Epoch: 003/050 | Batch 250/891 | Cost: 0.7255
Epoch: 003/050 | Batch 300/891 | Cost: 0.3194
Epoch: 003/050 | Batch 350/891 | Cost: 0.4845
Epoch: 003/050 | Batch 400/891 | Cost: 0.3754
Epoch: 003/050 | Batch 450/891 | Cost: 0.4159
Epoch: 003/050 | Batch 500/891 | Cost: 0.3210
Epoch: 003/050 | Batch 550/891 | Cost: 0.3639
Epoch: 003/050 | Batch 600/891 | Cost: 0.2480
Epoch: 003/050 | Batch 650/891 | Cost: 0.3586
Epoch: 003/050 | Batch 700/891 | Cost: 0.8477
Epoch: 003/050 | Batch 750/891 | Cost: 0.2967
Epoch: 003/050 | Batch 800/891 | Cost: 0.3125
Epoch: 003/050 | Batch 850/891 | Cost: 0.2451
training accuracy: 90.50%
valid accuracy: 89.47%
Time elapsed: 1.51 min
Epoch: 004/050 | Batch 000/891 | Cost: 0.2751
Epoch: 004/050 | Batch 050/891 | Cost: 0.3306
Epoch: 004/050 | Batch 100/891 | Cost: 0.8538
Epoch: 004/050 | Batch 150/891 | Cost: 0.5015
Epoch: 004/050 | Batch 200/891 | Cost: 0.3141
Epoch: 004/050 | Batch 250/891 | Cost: 0.2756
Epoch: 004/050 | Batch 300/891 | Cost: 0.2920
Epoch: 004/050 | Batch 350/891 | Cost: 0.4124
Epoch: 004/050 | Batch 400/891 | Cost: 0.4118
Epoch: 004/050 | Batch 450/891 | Cost: 0.3355
Epoch: 004/050 | Batch 500/891 | Cost: 0.2594
Epoch: 004/050 | Batch 550/891 | Cost: 0.2008
Epoch: 004/050 | Batch 600/891 | Cost: 0.2917
Epoch: 004/050 | Batch 650/891 | Cost: 0.1437
Epoch: 004/050 | Batch 700/891 | Cost: 0.2682
Epoch: 004/050 | Batch 750/891 | Cost: 0.2572
Epoch: 004/050 | Batch 800/891 | Cost: 0.2653
Epoch: 004/050 | Batch 850/891 | Cost: 0.1637
training accuracy: 91.44%
valid accuracy: 90.28%
Time elapsed: 2.02 min
Epoch: 005/050 | Batch 000/891 | Cost: 0.3751
Epoch: 005/050 | Batch 050/891 | Cost: 0.3224
Epoch: 005/050 | Batch 100/891 | Cost: 0.4595
Epoch: 005/050 | Batch 150/891 | Cost: 0.4083
Epoch: 005/050 | Batch 200/891 | Cost: 0.3154
Epoch: 005/050 | Batch 250/891 | Cost: 0.2272
Epoch: 005/050 | Batch 300/891 | Cost: 0.2790
Epoch: 005/050 | Batch 350/891 | Cost: 0.3233
Epoch: 005/050 | Batch 400/891 | Cost: 0.3187
Epoch: 005/050 | Batch 450/891 | Cost: 0.2227
Epoch: 005/050 | Batch 500/891 | Cost: 0.3384
Epoch: 005/050 | Batch 550/891 | Cost: 0.3132
Epoch: 005/050 | Batch 600/891 | Cost: 0.3325
Epoch: 005/050 | Batch 650/891 | Cost: 0.2679
Epoch: 005/050 | Batch 700/891 | Cost: 0.4807
Epoch: 005/050 | Batch 750/891 | Cost: 0.2496
Epoch: 005/050 | Batch 800/891 | Cost: 0.2778
Epoch: 005/050 | Batch 850/891 | Cost: 0.2846
training accuracy: 92.07%
valid accuracy: 90.63%
Time elapsed: 2.55 min
Epoch: 006/050 | Batch 000/891 | Cost: 0.6337
Epoch: 006/050 | Batch 050/891 | Cost: 0.1976
Epoch: 006/050 | Batch 100/891 | Cost: 0.3846
Epoch: 006/050 | Batch 150/891 | Cost: 0.2781
Epoch: 006/050 | Batch 200/891 | Cost: 0.2588
Epoch: 006/050 | Batch 250/891 | Cost: 0.3977
Epoch: 006/050 | Batch 300/891 | Cost: 0.3890
Epoch: 006/050 | Batch 350/891 | Cost: 0.3945
Epoch: 006/050 | Batch 400/891 | Cost: 0.3439
Epoch: 006/050 | Batch 450/891 | Cost: 0.2981
Epoch: 006/050 | Batch 500/891 | Cost: 0.3398
Epoch: 006/050 | Batch 550/891 | Cost: 0.3683
Epoch: 006/050 | Batch 600/891 | Cost: 0.2633
Epoch: 006/050 | Batch 650/891 | Cost: 0.2803
Epoch: 006/050 | Batch 700/891 | Cost: 0.3132
Epoch: 006/050 | Batch 750/891 | Cost: 0.1624
Epoch: 006/050 | Batch 800/891 | Cost: 0.2108
Epoch: 006/050 | Batch 850/891 | Cost: 0.3767
training accuracy: 92.38%
valid accuracy: 91.03%
Time elapsed: 3.08 min
Epoch: 007/050 | Batch 000/891 | Cost: 0.2237
Epoch: 007/050 | Batch 050/891 | Cost: 0.2486
Epoch: 007/050 | Batch 100/891 | Cost: 0.4113
Epoch: 007/050 | Batch 150/891 | Cost: 0.1935
Epoch: 007/050 | Batch 200/891 | Cost: 0.1993
Epoch: 007/050 | Batch 250/891 | Cost: 0.2414
Epoch: 007/050 | Batch 300/891 | Cost: 0.3459
Epoch: 007/050 | Batch 350/891 | Cost: 0.2281
Epoch: 007/050 | Batch 400/891 | Cost: 0.3353
Epoch: 007/050 | Batch 450/891 | Cost: 0.2310
Epoch: 007/050 | Batch 500/891 | Cost: 0.2138
Epoch: 007/050 | Batch 550/891 | Cost: 0.2781
Epoch: 007/050 | Batch 600/891 | Cost: 0.1706
Epoch: 007/050 | Batch 650/891 | Cost: 0.3315
Epoch: 007/050 | Batch 700/891 | Cost: 0.4015
Epoch: 007/050 | Batch 750/891 | Cost: 0.6616
Epoch: 007/050 | Batch 800/891 | Cost: 0.1962
Epoch: 007/050 | Batch 850/891 | Cost: 0.3632
training accuracy: 92.55%
valid accuracy: 90.55%
Time elapsed: 3.61 min
Epoch: 008/050 | Batch 000/891 | Cost: 0.1526
Epoch: 008/050 | Batch 050/891 | Cost: 0.2569
Epoch: 008/050 | Batch 100/891 | Cost: 0.2024
Epoch: 008/050 | Batch 150/891 | Cost: 0.4151
Epoch: 008/050 | Batch 200/891 | Cost: 0.3168
Epoch: 008/050 | Batch 250/891 | Cost: 0.2224
Epoch: 008/050 | Batch 300/891 | Cost: 0.2139
Epoch: 008/050 | Batch 350/891 | Cost: 0.1567
Epoch: 008/050 | Batch 400/891 | Cost: 0.1942
Epoch: 008/050 | Batch 450/891 | Cost: 0.3651
Epoch: 008/050 | Batch 500/891 | Cost: 0.4346
Epoch: 008/050 | Batch 550/891 | Cost: 0.2333
Epoch: 008/050 | Batch 600/891 | Cost: 0.4014
Epoch: 008/050 | Batch 650/891 | Cost: 0.2443
Epoch: 008/050 | Batch 700/891 | Cost: 0.2304
Epoch: 008/050 | Batch 750/891 | Cost: 0.3688
Epoch: 008/050 | Batch 800/891 | Cost: 0.2376
Epoch: 008/050 | Batch 850/891 | Cost: 0.2561
training accuracy: 93.36%
valid accuracy: 91.45%
Time elapsed: 4.13 min
Epoch: 009/050 | Batch 000/891 | Cost: 0.2353
Epoch: 009/050 | Batch 050/891 | Cost: 0.3266
Epoch: 009/050 | Batch 100/891 | Cost: 0.2345
Epoch: 009/050 | Batch 150/891 | Cost: 0.3621
Epoch: 009/050 | Batch 200/891 | Cost: 0.2544
Epoch: 009/050 | Batch 250/891 | Cost: 0.3480
Epoch: 009/050 | Batch 300/891 | Cost: 0.4042
Epoch: 009/050 | Batch 350/891 | Cost: 0.2110
Epoch: 009/050 | Batch 400/891 | Cost: 0.1583
Epoch: 009/050 | Batch 450/891 | Cost: 0.2829
Epoch: 009/050 | Batch 500/891 | Cost: 0.2130
Epoch: 009/050 | Batch 550/891 | Cost: 0.2088
Epoch: 009/050 | Batch 600/891 | Cost: 0.3278
Epoch: 009/050 | Batch 650/891 | Cost: 0.3618
Epoch: 009/050 | Batch 700/891 | Cost: 0.2778
Epoch: 009/050 | Batch 750/891 | Cost: 0.4374
Epoch: 009/050 | Batch 800/891 | Cost: 0.2463
Epoch: 009/050 | Batch 850/891 | Cost: 0.2187
training accuracy: 93.74%
valid accuracy: 91.37%
Time elapsed: 4.65 min
Epoch: 010/050 | Batch 000/891 | Cost: 0.1810
Epoch: 010/050 | Batch 050/891 | Cost: 0.2491
Epoch: 010/050 | Batch 100/891 | Cost: 0.1872
Epoch: 010/050 | Batch 150/891 | Cost: 0.5379
Epoch: 010/050 | Batch 200/891 | Cost: 0.3171
Epoch: 010/050 | Batch 250/891 | Cost: 0.1732
Epoch: 010/050 | Batch 300/891 | Cost: 0.2367
Epoch: 010/050 | Batch 350/891 | Cost: 0.2784
Epoch: 010/050 | Batch 400/891 | Cost: 0.4789
Epoch: 010/050 | Batch 450/891 | Cost: 0.2235
Epoch: 010/050 | Batch 500/891 | Cost: 0.2694
Epoch: 010/050 | Batch 550/891 | Cost: 0.2759
Epoch: 010/050 | Batch 600/891 | Cost: 0.2000
Epoch: 010/050 | Batch 650/891 | Cost: 0.2420
Epoch: 010/050 | Batch 700/891 | Cost: 0.2196
Epoch: 010/050 | Batch 750/891 | Cost: 0.3454
Epoch: 010/050 | Batch 800/891 | Cost: 0.2498
Epoch: 010/050 | Batch 850/891 | Cost: 0.2910
training accuracy: 93.97%
valid accuracy: 91.20%
Time elapsed: 5.18 min
Epoch: 011/050 | Batch 000/891 | Cost: 0.1767
Epoch: 011/050 | Batch 050/891 | Cost: 0.1857
Epoch: 011/050 | Batch 100/891 | Cost: 0.1880
Epoch: 011/050 | Batch 150/891 | Cost: 0.3116
Epoch: 011/050 | Batch 200/891 | Cost: 0.1706
Epoch: 011/050 | Batch 250/891 | Cost: 0.2218
Epoch: 011/050 | Batch 300/891 | Cost: 0.1673
Epoch: 011/050 | Batch 350/891 | Cost: 0.4530
Epoch: 011/050 | Batch 400/891 | Cost: 0.2309
Epoch: 011/050 | Batch 450/891 | Cost: 0.1871
Epoch: 011/050 | Batch 500/891 | Cost: 0.1490
Epoch: 011/050 | Batch 550/891 | Cost: 0.2857
Epoch: 011/050 | Batch 600/891 | Cost: 0.2446
Epoch: 011/050 | Batch 650/891 | Cost: 0.1511
Epoch: 011/050 | Batch 700/891 | Cost: 0.1921
Epoch: 011/050 | Batch 750/891 | Cost: 0.3078
Epoch: 011/050 | Batch 800/891 | Cost: 0.1326
Epoch: 011/050 | Batch 850/891 | Cost: 0.1922
training accuracy: 94.24%
valid accuracy: 91.45%
Time elapsed: 5.71 min
Epoch: 012/050 | Batch 000/891 | Cost: 0.1514
Epoch: 012/050 | Batch 050/891 | Cost: 0.2781
Epoch: 012/050 | Batch 100/891 | Cost: 0.1480
Epoch: 012/050 | Batch 150/891 | Cost: 0.1900
Epoch: 012/050 | Batch 200/891 | Cost: 0.2881
Epoch: 012/050 | Batch 250/891 | Cost: 0.3169
Epoch: 012/050 | Batch 300/891 | Cost: 0.1878
Epoch: 012/050 | Batch 350/891 | Cost: 0.1954
Epoch: 012/050 | Batch 400/891 | Cost: 0.2762
Epoch: 012/050 | Batch 450/891 | Cost: 0.2427
Epoch: 012/050 | Batch 500/891 | Cost: 0.1896
Epoch: 012/050 | Batch 550/891 | Cost: 0.2148
Epoch: 012/050 | Batch 600/891 | Cost: 0.1580
Epoch: 012/050 | Batch 650/891 | Cost: 0.2480
Epoch: 012/050 | Batch 700/891 | Cost: 0.2760
Epoch: 012/050 | Batch 750/891 | Cost: 0.3120
Epoch: 012/050 | Batch 800/891 | Cost: 0.1030
Epoch: 012/050 | Batch 850/891 | Cost: 0.2267
training accuracy: 94.46%
valid accuracy: 91.68%
Time elapsed: 6.23 min
Epoch: 013/050 | Batch 000/891 | Cost: 0.2121
Epoch: 013/050 | Batch 050/891 | Cost: 0.2147
Epoch: 013/050 | Batch 100/891 | Cost: 0.2820
Epoch: 013/050 | Batch 150/891 | Cost: 0.3664
Epoch: 013/050 | Batch 200/891 | Cost: 0.1671
Epoch: 013/050 | Batch 250/891 | Cost: 0.2233
Epoch: 013/050 | Batch 300/891 | Cost: 0.2576
Epoch: 013/050 | Batch 350/891 | Cost: 0.4332
Epoch: 013/050 | Batch 400/891 | Cost: 0.2597
Epoch: 013/050 | Batch 450/891 | Cost: 0.1760
Epoch: 013/050 | Batch 500/891 | Cost: 0.3148
Epoch: 013/050 | Batch 550/891 | Cost: 0.2573
Epoch: 013/050 | Batch 600/891 | Cost: 0.2120
Epoch: 013/050 | Batch 650/891 | Cost: 0.2722
Epoch: 013/050 | Batch 700/891 | Cost: 0.0958
Epoch: 013/050 | Batch 750/891 | Cost: 0.1930
Epoch: 013/050 | Batch 800/891 | Cost: 0.0750
Epoch: 013/050 | Batch 850/891 | Cost: 0.2382
training accuracy: 94.74%
valid accuracy: 91.47%
Time elapsed: 6.76 min
Epoch: 014/050 | Batch 000/891 | Cost: 0.1844
Epoch: 014/050 | Batch 050/891 | Cost: 0.1602
Epoch: 014/050 | Batch 100/891 | Cost: 0.2462
Epoch: 014/050 | Batch 150/891 | Cost: 0.1282
Epoch: 014/050 | Batch 200/891 | Cost: 0.1453
Epoch: 014/050 | Batch 250/891 | Cost: 0.2589
Epoch: 014/050 | Batch 300/891 | Cost: 0.2492
Epoch: 014/050 | Batch 350/891 | Cost: 0.0958
Epoch: 014/050 | Batch 400/891 | Cost: 0.6354
Epoch: 014/050 | Batch 450/891 | Cost: 0.1346
Epoch: 014/050 | Batch 500/891 | Cost: 0.3579
Epoch: 014/050 | Batch 550/891 | Cost: 0.1079
Epoch: 014/050 | Batch 600/891 | Cost: 0.1896
Epoch: 014/050 | Batch 650/891 | Cost: 0.2278
Epoch: 014/050 | Batch 700/891 | Cost: 0.4933
Epoch: 014/050 | Batch 750/891 | Cost: 0.3213
Epoch: 014/050 | Batch 800/891 | Cost: 0.2413
Epoch: 014/050 | Batch 850/891 | Cost: 0.2485
training accuracy: 94.84%
valid accuracy: 91.70%
Time elapsed: 7.28 min
Epoch: 015/050 | Batch 000/891 | Cost: 0.2655
Epoch: 015/050 | Batch 050/891 | Cost: 0.0850
Epoch: 015/050 | Batch 100/891 | Cost: 0.2339
Epoch: 015/050 | Batch 150/891 | Cost: 0.1445
Epoch: 015/050 | Batch 200/891 | Cost: 0.1013
Epoch: 015/050 | Batch 250/891 | Cost: 0.2296
Epoch: 015/050 | Batch 300/891 | Cost: 0.1205
Epoch: 015/050 | Batch 350/891 | Cost: 0.1492
Epoch: 015/050 | Batch 400/891 | Cost: 0.3134
Epoch: 015/050 | Batch 450/891 | Cost: 0.2489
Epoch: 015/050 | Batch 500/891 | Cost: 0.1313
Epoch: 015/050 | Batch 550/891 | Cost: 0.2463
Epoch: 015/050 | Batch 600/891 | Cost: 0.1853
Epoch: 015/050 | Batch 650/891 | Cost: 0.1878
Epoch: 015/050 | Batch 700/891 | Cost: 0.2329
Epoch: 015/050 | Batch 750/891 | Cost: 0.1648
Epoch: 015/050 | Batch 800/891 | Cost: 0.1891
Epoch: 015/050 | Batch 850/891 | Cost: 0.1200
training accuracy: 95.22%
valid accuracy: 91.83%
Time elapsed: 7.80 min
Epoch: 016/050 | Batch 000/891 | Cost: 0.2548
Epoch: 016/050 | Batch 050/891 | Cost: 0.3054
Epoch: 016/050 | Batch 100/891 | Cost: 0.1123
Epoch: 016/050 | Batch 150/891 | Cost: 0.1788
Epoch: 016/050 | Batch 200/891 | Cost: 0.0968
Epoch: 016/050 | Batch 250/891 | Cost: 0.2611
Epoch: 016/050 | Batch 300/891 | Cost: 0.1720
Epoch: 016/050 | Batch 350/891 | Cost: 0.1352
Epoch: 016/050 | Batch 400/891 | Cost: 0.2122
Epoch: 016/050 | Batch 450/891 | Cost: 0.3495
Epoch: 016/050 | Batch 500/891 | Cost: 0.2742
Epoch: 016/050 | Batch 550/891 | Cost: 0.3351
Epoch: 016/050 | Batch 600/891 | Cost: 0.0711
Epoch: 016/050 | Batch 650/891 | Cost: 0.1606
Epoch: 016/050 | Batch 700/891 | Cost: 0.1502
Epoch: 016/050 | Batch 750/891 | Cost: 0.1500
Epoch: 016/050 | Batch 800/891 | Cost: 0.1290
Epoch: 016/050 | Batch 850/891 | Cost: 0.1974
training accuracy: 95.38%
valid accuracy: 91.82%
Time elapsed: 8.33 min
Epoch: 017/050 | Batch 000/891 | Cost: 0.3753
Epoch: 017/050 | Batch 050/891 | Cost: 0.2603
Epoch: 017/050 | Batch 100/891 | Cost: 0.0900
Epoch: 017/050 | Batch 150/891 | Cost: 0.1902
Epoch: 017/050 | Batch 200/891 | Cost: 0.2403
Epoch: 017/050 | Batch 250/891 | Cost: 0.1488
Epoch: 017/050 | Batch 300/891 | Cost: 0.1474
Epoch: 017/050 | Batch 350/891 | Cost: 0.2314
Epoch: 017/050 | Batch 400/891 | Cost: 0.1752
Epoch: 017/050 | Batch 450/891 | Cost: 0.1610
Epoch: 017/050 | Batch 500/891 | Cost: 0.2189
Epoch: 017/050 | Batch 550/891 | Cost: 0.2283
Epoch: 017/050 | Batch 600/891 | Cost: 0.2098
Epoch: 017/050 | Batch 650/891 | Cost: 0.2482
Epoch: 017/050 | Batch 700/891 | Cost: 0.1573
Epoch: 017/050 | Batch 750/891 | Cost: 0.1941
Epoch: 017/050 | Batch 800/891 | Cost: 0.1842
Epoch: 017/050 | Batch 850/891 | Cost: 0.1926
training accuracy: 95.67%
valid accuracy: 92.17%
Time elapsed: 8.85 min
Epoch: 018/050 | Batch 000/891 | Cost: 0.2376
Epoch: 018/050 | Batch 050/891 | Cost: 0.1245
Epoch: 018/050 | Batch 100/891 | Cost: 0.1663
Epoch: 018/050 | Batch 150/891 | Cost: 0.1179
Epoch: 018/050 | Batch 200/891 | Cost: 0.2016
Epoch: 018/050 | Batch 250/891 | Cost: 0.1451
Epoch: 018/050 | Batch 300/891 | Cost: 0.1310
Epoch: 018/050 | Batch 350/891 | Cost: 0.2826
Epoch: 018/050 | Batch 400/891 | Cost: 0.1151
Epoch: 018/050 | Batch 450/891 | Cost: 0.0847
Epoch: 018/050 | Batch 500/891 | Cost: 0.3294
Epoch: 018/050 | Batch 550/891 | Cost: 0.2216
Epoch: 018/050 | Batch 600/891 | Cost: 0.3044
Epoch: 018/050 | Batch 650/891 | Cost: 0.2693
Epoch: 018/050 | Batch 700/891 | Cost: 0.1898
Epoch: 018/050 | Batch 750/891 | Cost: 0.1130
Epoch: 018/050 | Batch 800/891 | Cost: 0.4650
Epoch: 018/050 | Batch 850/891 | Cost: 0.1036
training accuracy: 95.91%
valid accuracy: 91.83%
Time elapsed: 9.37 min
Epoch: 019/050 | Batch 000/891 | Cost: 0.1979
Epoch: 019/050 | Batch 050/891 | Cost: 0.3851
Epoch: 019/050 | Batch 100/891 | Cost: 0.1857
Epoch: 019/050 | Batch 150/891 | Cost: 0.1524
Epoch: 019/050 | Batch 200/891 | Cost: 0.2859
Epoch: 019/050 | Batch 250/891 | Cost: 0.1978
Epoch: 019/050 | Batch 300/891 | Cost: 0.1859
Epoch: 019/050 | Batch 350/891 | Cost: 0.3971
Epoch: 019/050 | Batch 400/891 | Cost: 0.1393
Epoch: 019/050 | Batch 450/891 | Cost: 0.4079
Epoch: 019/050 | Batch 500/891 | Cost: 0.3164
Epoch: 019/050 | Batch 550/891 | Cost: 0.2275
Epoch: 019/050 | Batch 600/891 | Cost: 0.0996
Epoch: 019/050 | Batch 650/891 | Cost: 0.1961
Epoch: 019/050 | Batch 700/891 | Cost: 0.1276
Epoch: 019/050 | Batch 750/891 | Cost: 0.2926
Epoch: 019/050 | Batch 800/891 | Cost: 0.1130
Epoch: 019/050 | Batch 850/891 | Cost: 0.1441
training accuracy: 95.94%
valid accuracy: 91.77%
Time elapsed: 9.89 min
Epoch: 020/050 | Batch 000/891 | Cost: 0.1467
Epoch: 020/050 | Batch 050/891 | Cost: 0.1647
Epoch: 020/050 | Batch 100/891 | Cost: 0.2161
Epoch: 020/050 | Batch 150/891 | Cost: 0.1547
Epoch: 020/050 | Batch 200/891 | Cost: 0.4020
Epoch: 020/050 | Batch 250/891 | Cost: 0.2447
Epoch: 020/050 | Batch 300/891 | Cost: 0.2506
Epoch: 020/050 | Batch 350/891 | Cost: 0.1571
Epoch: 020/050 | Batch 400/891 | Cost: 0.1199
Epoch: 020/050 | Batch 450/891 | Cost: 0.0830
Epoch: 020/050 | Batch 500/891 | Cost: 0.1957
Epoch: 020/050 | Batch 550/891 | Cost: 0.1530
Epoch: 020/050 | Batch 600/891 | Cost: 0.4454
Epoch: 020/050 | Batch 650/891 | Cost: 0.1160
Epoch: 020/050 | Batch 700/891 | Cost: 0.1137
Epoch: 020/050 | Batch 750/891 | Cost: 0.1297
Epoch: 020/050 | Batch 800/891 | Cost: 0.3041
Epoch: 020/050 | Batch 850/891 | Cost: 0.0867
training accuracy: 96.05%
valid accuracy: 91.98%
Time elapsed: 10.41 min
Epoch: 021/050 | Batch 000/891 | Cost: 0.1451
Epoch: 021/050 | Batch 050/891 | Cost: 0.1487
Epoch: 021/050 | Batch 100/891 | Cost: 0.0903
Epoch: 021/050 | Batch 150/891 | Cost: 0.1391
Epoch: 021/050 | Batch 200/891 | Cost: 0.1830
Epoch: 021/050 | Batch 250/891 | Cost: 0.1728
Epoch: 021/050 | Batch 300/891 | Cost: 0.2294
Epoch: 021/050 | Batch 350/891 | Cost: 0.0633
Epoch: 021/050 | Batch 400/891 | Cost: 0.1786
Epoch: 021/050 | Batch 450/891 | Cost: 0.0823
Epoch: 021/050 | Batch 500/891 | Cost: 0.0895
Epoch: 021/050 | Batch 550/891 | Cost: 0.1229
Epoch: 021/050 | Batch 600/891 | Cost: 0.2732
Epoch: 021/050 | Batch 650/891 | Cost: 0.1437
Epoch: 021/050 | Batch 700/891 | Cost: 0.0879
Epoch: 021/050 | Batch 750/891 | Cost: 0.1119
Epoch: 021/050 | Batch 800/891 | Cost: 0.1358
Epoch: 021/050 | Batch 850/891 | Cost: 0.1967
training accuracy: 96.15%
valid accuracy: 91.85%
Time elapsed: 10.94 min
Epoch: 022/050 | Batch 000/891 | Cost: 0.2011
Epoch: 022/050 | Batch 050/891 | Cost: 0.1103
Epoch: 022/050 | Batch 100/891 | Cost: 0.2082
Epoch: 022/050 | Batch 150/891 | Cost: 0.1258
Epoch: 022/050 | Batch 200/891 | Cost: 0.3730
Epoch: 022/050 | Batch 250/891 | Cost: 0.4325
Epoch: 022/050 | Batch 300/891 | Cost: 0.2348
Epoch: 022/050 | Batch 350/891 | Cost: 0.1401
Epoch: 022/050 | Batch 400/891 | Cost: 0.3020
Epoch: 022/050 | Batch 450/891 | Cost: 0.1173
Epoch: 022/050 | Batch 500/891 | Cost: 0.1262
Epoch: 022/050 | Batch 550/891 | Cost: 0.2594
Epoch: 022/050 | Batch 600/891 | Cost: 0.1213
Epoch: 022/050 | Batch 650/891 | Cost: 0.0961
Epoch: 022/050 | Batch 700/891 | Cost: 0.1579
Epoch: 022/050 | Batch 750/891 | Cost: 0.1669
Epoch: 022/050 | Batch 800/891 | Cost: 0.1836
Epoch: 022/050 | Batch 850/891 | Cost: 0.1857
training accuracy: 96.38%
valid accuracy: 91.85%
Time elapsed: 11.46 min
Epoch: 023/050 | Batch 000/891 | Cost: 0.1616
Epoch: 023/050 | Batch 050/891 | Cost: 0.0922
Epoch: 023/050 | Batch 100/891 | Cost: 0.2086
Epoch: 023/050 | Batch 150/891 | Cost: 0.4053
Epoch: 023/050 | Batch 200/891 | Cost: 0.2502
Epoch: 023/050 | Batch 250/891 | Cost: 0.1509
Epoch: 023/050 | Batch 300/891 | Cost: 0.3012
Epoch: 023/050 | Batch 350/891 | Cost: 0.1202
Epoch: 023/050 | Batch 400/891 | Cost: 0.4270
Epoch: 023/050 | Batch 450/891 | Cost: 0.2277
Epoch: 023/050 | Batch 500/891 | Cost: 0.1788
Epoch: 023/050 | Batch 550/891 | Cost: 0.1663
Epoch: 023/050 | Batch 600/891 | Cost: 0.1667
Epoch: 023/050 | Batch 650/891 | Cost: 0.1578
Epoch: 023/050 | Batch 700/891 | Cost: 0.1768
Epoch: 023/050 | Batch 750/891 | Cost: 0.1270
Epoch: 023/050 | Batch 800/891 | Cost: 0.1632
Epoch: 023/050 | Batch 850/891 | Cost: 0.2621
training accuracy: 96.47%
valid accuracy: 92.28%
Time elapsed: 11.99 min
Epoch: 024/050 | Batch 000/891 | Cost: 0.0953
Epoch: 024/050 | Batch 050/891 | Cost: 0.0845
Epoch: 024/050 | Batch 100/891 | Cost: 0.1797
Epoch: 024/050 | Batch 150/891 | Cost: 0.1241
Epoch: 024/050 | Batch 200/891 | Cost: 0.1801
Epoch: 024/050 | Batch 250/891 | Cost: 0.2227
Epoch: 024/050 | Batch 300/891 | Cost: 0.4965
Epoch: 024/050 | Batch 350/891 | Cost: 0.1874
Epoch: 024/050 | Batch 400/891 | Cost: 0.1172
Epoch: 024/050 | Batch 450/891 | Cost: 0.2244
Epoch: 024/050 | Batch 500/891 | Cost: 0.1262
Epoch: 024/050 | Batch 550/891 | Cost: 0.2427
Epoch: 024/050 | Batch 600/891 | Cost: 0.1131
Epoch: 024/050 | Batch 650/891 | Cost: 0.2320
Epoch: 024/050 | Batch 700/891 | Cost: 0.1078
Epoch: 024/050 | Batch 750/891 | Cost: 0.0839
Epoch: 024/050 | Batch 800/891 | Cost: 0.2036
Epoch: 024/050 | Batch 850/891 | Cost: 0.1953
training accuracy: 96.49%
valid accuracy: 91.95%
Time elapsed: 12.51 min
Epoch: 025/050 | Batch 000/891 | Cost: 0.2558
Epoch: 025/050 | Batch 050/891 | Cost: 0.1072
Epoch: 025/050 | Batch 100/891 | Cost: 0.2158
Epoch: 025/050 | Batch 150/891 | Cost: 0.1381
Epoch: 025/050 | Batch 200/891 | Cost: 0.0871
Epoch: 025/050 | Batch 250/891 | Cost: 0.3461
Epoch: 025/050 | Batch 300/891 | Cost: 0.0968
Epoch: 025/050 | Batch 350/891 | Cost: 0.3009
Epoch: 025/050 | Batch 400/891 | Cost: 0.1789
Epoch: 025/050 | Batch 450/891 | Cost: 0.1351
Epoch: 025/050 | Batch 500/891 | Cost: 0.4432
Epoch: 025/050 | Batch 550/891 | Cost: 0.1543
Epoch: 025/050 | Batch 600/891 | Cost: 0.1859
Epoch: 025/050 | Batch 650/891 | Cost: 0.2304
Epoch: 025/050 | Batch 700/891 | Cost: 0.1545
Epoch: 025/050 | Batch 750/891 | Cost: 0.2133
Epoch: 025/050 | Batch 800/891 | Cost: 0.1626
Epoch: 025/050 | Batch 850/891 | Cost: 0.1345
training accuracy: 96.52%
valid accuracy: 91.67%
Time elapsed: 13.04 min
Epoch: 026/050 | Batch 000/891 | Cost: 0.1949
Epoch: 026/050 | Batch 050/891 | Cost: 0.3824
Epoch: 026/050 | Batch 100/891 | Cost: 0.1669
Epoch: 026/050 | Batch 150/891 | Cost: 0.0921
Epoch: 026/050 | Batch 200/891 | Cost: 0.1204
Epoch: 026/050 | Batch 250/891 | Cost: 0.2094
Epoch: 026/050 | Batch 300/891 | Cost: 0.3778
Epoch: 026/050 | Batch 350/891 | Cost: 0.1472
Epoch: 026/050 | Batch 400/891 | Cost: 0.2276
Epoch: 026/050 | Batch 450/891 | Cost: 0.3556
Epoch: 026/050 | Batch 500/891 | Cost: 0.2241
Epoch: 026/050 | Batch 550/891 | Cost: 0.4314
Epoch: 026/050 | Batch 600/891 | Cost: 0.2155
Epoch: 026/050 | Batch 650/891 | Cost: 0.1677
Epoch: 026/050 | Batch 700/891 | Cost: 0.1383
Epoch: 026/050 | Batch 750/891 | Cost: 0.1661
Epoch: 026/050 | Batch 800/891 | Cost: 0.3100
Epoch: 026/050 | Batch 850/891 | Cost: 0.1083
training accuracy: 96.67%
valid accuracy: 91.70%
Time elapsed: 13.56 min
Epoch: 027/050 | Batch 000/891 | Cost: 0.0772
Epoch: 027/050 | Batch 050/891 | Cost: 0.0812
Epoch: 027/050 | Batch 100/891 | Cost: 0.1793
Epoch: 027/050 | Batch 150/891 | Cost: 0.1480
Epoch: 027/050 | Batch 200/891 | Cost: 0.1768
Epoch: 027/050 | Batch 250/891 | Cost: 0.3068
Epoch: 027/050 | Batch 300/891 | Cost: 0.1652
Epoch: 027/050 | Batch 350/891 | Cost: 0.1633
Epoch: 027/050 | Batch 400/891 | Cost: 0.2064
Epoch: 027/050 | Batch 450/891 | Cost: 0.1655
Epoch: 027/050 | Batch 500/891 | Cost: 0.1756
Epoch: 027/050 | Batch 550/891 | Cost: 0.1434
Epoch: 027/050 | Batch 600/891 | Cost: 0.2568
Epoch: 027/050 | Batch 650/891 | Cost: 0.0844
Epoch: 027/050 | Batch 700/891 | Cost: 0.0799
Epoch: 027/050 | Batch 750/891 | Cost: 0.1349
Epoch: 027/050 | Batch 800/891 | Cost: 0.2556
Epoch: 027/050 | Batch 850/891 | Cost: 0.1254
training accuracy: 96.94%
valid accuracy: 92.20%
Time elapsed: 14.09 min
Epoch: 028/050 | Batch 000/891 | Cost: 0.1303
Epoch: 028/050 | Batch 050/891 | Cost: 0.2345
Epoch: 028/050 | Batch 100/891 | Cost: 0.1625
Epoch: 028/050 | Batch 150/891 | Cost: 0.1978
Epoch: 028/050 | Batch 200/891 | Cost: 0.1598
Epoch: 028/050 | Batch 250/891 | Cost: 0.1072
Epoch: 028/050 | Batch 300/891 | Cost: 0.1831
Epoch: 028/050 | Batch 350/891 | Cost: 0.0910
Epoch: 028/050 | Batch 400/891 | Cost: 0.0870
Epoch: 028/050 | Batch 450/891 | Cost: 0.1054
Epoch: 028/050 | Batch 500/891 | Cost: 0.1814
Epoch: 028/050 | Batch 550/891 | Cost: 0.1450
Epoch: 028/050 | Batch 600/891 | Cost: 0.1180
Epoch: 028/050 | Batch 650/891 | Cost: 0.1368
Epoch: 028/050 | Batch 700/891 | Cost: 0.1233
Epoch: 028/050 | Batch 750/891 | Cost: 0.0832
Epoch: 028/050 | Batch 800/891 | Cost: 0.1648
Epoch: 028/050 | Batch 850/891 | Cost: 0.1635
training accuracy: 97.06%
valid accuracy: 92.13%
Time elapsed: 14.61 min
Epoch: 029/050 | Batch 000/891 | Cost: 0.0895
Epoch: 029/050 | Batch 050/891 | Cost: 0.0893
Epoch: 029/050 | Batch 100/891 | Cost: 0.2326
Epoch: 029/050 | Batch 150/891 | Cost: 0.2078
Epoch: 029/050 | Batch 200/891 | Cost: 0.0743
Epoch: 029/050 | Batch 250/891 | Cost: 0.1169
Epoch: 029/050 | Batch 300/891 | Cost: 0.2828
Epoch: 029/050 | Batch 350/891 | Cost: 0.1916
Epoch: 029/050 | Batch 400/891 | Cost: 0.1416
Epoch: 029/050 | Batch 450/891 | Cost: 0.1501
Epoch: 029/050 | Batch 500/891 | Cost: 0.2920
Epoch: 029/050 | Batch 550/891 | Cost: 0.1433
Epoch: 029/050 | Batch 600/891 | Cost: 0.1443
Epoch: 029/050 | Batch 650/891 | Cost: 0.4024
Epoch: 029/050 | Batch 700/891 | Cost: 0.1745
Epoch: 029/050 | Batch 750/891 | Cost: 0.1506
Epoch: 029/050 | Batch 800/891 | Cost: 0.1827
Epoch: 029/050 | Batch 850/891 | Cost: 0.1941
training accuracy: 97.06%
valid accuracy: 91.80%
Time elapsed: 15.13 min
Epoch: 030/050 | Batch 000/891 | Cost: 0.0538
Epoch: 030/050 | Batch 050/891 | Cost: 0.1574
Epoch: 030/050 | Batch 100/891 | Cost: 0.1078
Epoch: 030/050 | Batch 150/891 | Cost: 0.0910
Epoch: 030/050 | Batch 200/891 | Cost: 0.4213
Epoch: 030/050 | Batch 250/891 | Cost: 0.4354
Epoch: 030/050 | Batch 300/891 | Cost: 0.1978
Epoch: 030/050 | Batch 350/891 | Cost: 0.3105
Epoch: 030/050 | Batch 400/891 | Cost: 0.0855
Epoch: 030/050 | Batch 450/891 | Cost: 0.0950
Epoch: 030/050 | Batch 500/891 | Cost: 0.1578
Epoch: 030/050 | Batch 550/891 | Cost: 0.1812
Epoch: 030/050 | Batch 600/891 | Cost: 0.1503
Epoch: 030/050 | Batch 650/891 | Cost: 0.2524
Epoch: 030/050 | Batch 700/891 | Cost: 0.2850
Epoch: 030/050 | Batch 750/891 | Cost: 0.2929
Epoch: 030/050 | Batch 800/891 | Cost: 0.1662
Epoch: 030/050 | Batch 850/891 | Cost: 0.1461
training accuracy: 96.99%
valid accuracy: 91.63%
Time elapsed: 15.65 min
Epoch: 031/050 | Batch 000/891 | Cost: 0.2477
Epoch: 031/050 | Batch 050/891 | Cost: 0.0818
Epoch: 031/050 | Batch 100/891 | Cost: 0.3006
Epoch: 031/050 | Batch 150/891 | Cost: 0.1007
Epoch: 031/050 | Batch 200/891 | Cost: 0.1521
Epoch: 031/050 | Batch 250/891 | Cost: 0.2553
Epoch: 031/050 | Batch 300/891 | Cost: 0.1161
Epoch: 031/050 | Batch 350/891 | Cost: 0.1272
Epoch: 031/050 | Batch 400/891 | Cost: 0.1417
Epoch: 031/050 | Batch 450/891 | Cost: 0.2192
Epoch: 031/050 | Batch 500/891 | Cost: 0.1461
Epoch: 031/050 | Batch 550/891 | Cost: 0.0548
Epoch: 031/050 | Batch 600/891 | Cost: 0.0588
Epoch: 031/050 | Batch 650/891 | Cost: 0.1124
Epoch: 031/050 | Batch 700/891 | Cost: 0.1215
Epoch: 031/050 | Batch 750/891 | Cost: 0.1673
Epoch: 031/050 | Batch 800/891 | Cost: 0.3652
Epoch: 031/050 | Batch 850/891 | Cost: 0.1577
training accuracy: 97.28%
valid accuracy: 91.93%
Time elapsed: 16.18 min
Epoch: 032/050 | Batch 000/891 | Cost: 0.2157
Epoch: 032/050 | Batch 050/891 | Cost: 0.1044
Epoch: 032/050 | Batch 100/891 | Cost: 0.1418
Epoch: 032/050 | Batch 150/891 | Cost: 0.1295
Epoch: 032/050 | Batch 200/891 | Cost: 0.1992
Epoch: 032/050 | Batch 250/891 | Cost: 0.1287
Epoch: 032/050 | Batch 300/891 | Cost: 0.1237
Epoch: 032/050 | Batch 350/891 | Cost: 0.1700
Epoch: 032/050 | Batch 400/891 | Cost: 0.0834
Epoch: 032/050 | Batch 450/891 | Cost: 0.1187
Epoch: 032/050 | Batch 500/891 | Cost: 0.1210
Epoch: 032/050 | Batch 550/891 | Cost: 0.1013
Epoch: 032/050 | Batch 600/891 | Cost: 0.1093
Epoch: 032/050 | Batch 650/891 | Cost: 0.1273
Epoch: 032/050 | Batch 700/891 | Cost: 0.0825
Epoch: 032/050 | Batch 750/891 | Cost: 0.0576
Epoch: 032/050 | Batch 800/891 | Cost: 0.3141
Epoch: 032/050 | Batch 850/891 | Cost: 0.1311
training accuracy: 97.36%
valid accuracy: 91.75%
Time elapsed: 16.70 min
Epoch: 033/050 | Batch 000/891 | Cost: 0.1296
Epoch: 033/050 | Batch 050/891 | Cost: 0.1740
Epoch: 033/050 | Batch 100/891 | Cost: 0.1495
Epoch: 033/050 | Batch 150/891 | Cost: 0.1684
Epoch: 033/050 | Batch 200/891 | Cost: 0.1388
Epoch: 033/050 | Batch 250/891 | Cost: 0.0879
Epoch: 033/050 | Batch 300/891 | Cost: 0.1247
Epoch: 033/050 | Batch 350/891 | Cost: 0.0976
Epoch: 033/050 | Batch 400/891 | Cost: 0.1558
Epoch: 033/050 | Batch 450/891 | Cost: 0.1188
Epoch: 033/050 | Batch 500/891 | Cost: 0.2809
Epoch: 033/050 | Batch 550/891 | Cost: 0.1375
Epoch: 033/050 | Batch 600/891 | Cost: 0.1907
Epoch: 033/050 | Batch 650/891 | Cost: 0.2093
Epoch: 033/050 | Batch 700/891 | Cost: 0.1889
Epoch: 033/050 | Batch 750/891 | Cost: 0.1331
Epoch: 033/050 | Batch 800/891 | Cost: 0.1284
Epoch: 033/050 | Batch 850/891 | Cost: 0.1431
training accuracy: 97.42%
valid accuracy: 91.95%
Time elapsed: 17.22 min
Epoch: 034/050 | Batch 000/891 | Cost: 0.1271
Epoch: 034/050 | Batch 050/891 | Cost: 0.1313
Epoch: 034/050 | Batch 100/891 | Cost: 0.1259
Epoch: 034/050 | Batch 150/891 | Cost: 0.1604
Epoch: 034/050 | Batch 200/891 | Cost: 0.1298
Epoch: 034/050 | Batch 250/891 | Cost: 0.2076
Epoch: 034/050 | Batch 300/891 | Cost: 0.1235
Epoch: 034/050 | Batch 350/891 | Cost: 0.1878
Epoch: 034/050 | Batch 400/891 | Cost: 0.1428
Epoch: 034/050 | Batch 450/891 | Cost: 0.1437
Epoch: 034/050 | Batch 500/891 | Cost: 0.2830
Epoch: 034/050 | Batch 550/891 | Cost: 0.1939
Epoch: 034/050 | Batch 600/891 | Cost: 0.2164
Epoch: 034/050 | Batch 650/891 | Cost: 0.1532
Epoch: 034/050 | Batch 700/891 | Cost: 0.0598
Epoch: 034/050 | Batch 750/891 | Cost: 0.2219
Epoch: 034/050 | Batch 800/891 | Cost: 0.0449
Epoch: 034/050 | Batch 850/891 | Cost: 0.1881
training accuracy: 97.50%
valid accuracy: 91.83%
Time elapsed: 17.75 min
Epoch: 035/050 | Batch 000/891 | Cost: 0.2045
Epoch: 035/050 | Batch 050/891 | Cost: 0.0852
Epoch: 035/050 | Batch 100/891 | Cost: 0.1590
Epoch: 035/050 | Batch 150/891 | Cost: 0.1173
Epoch: 035/050 | Batch 200/891 | Cost: 0.0929
Epoch: 035/050 | Batch 250/891 | Cost: 0.1028
Epoch: 035/050 | Batch 300/891 | Cost: 0.1426
Epoch: 035/050 | Batch 350/891 | Cost: 0.1643
Epoch: 035/050 | Batch 400/891 | Cost: 0.1684
Epoch: 035/050 | Batch 450/891 | Cost: 0.1423
Epoch: 035/050 | Batch 500/891 | Cost: 0.0537
Epoch: 035/050 | Batch 550/891 | Cost: 0.1361
Epoch: 035/050 | Batch 600/891 | Cost: 0.1196
Epoch: 035/050 | Batch 650/891 | Cost: 0.2022
Epoch: 035/050 | Batch 700/891 | Cost: 0.1325
Epoch: 035/050 | Batch 750/891 | Cost: 0.1634
Epoch: 035/050 | Batch 800/891 | Cost: 0.0780
Epoch: 035/050 | Batch 850/891 | Cost: 0.0622
training accuracy: 97.54%
valid accuracy: 92.08%
Time elapsed: 18.27 min
Epoch: 036/050 | Batch 000/891 | Cost: 0.2047
Epoch: 036/050 | Batch 050/891 | Cost: 0.1147
Epoch: 036/050 | Batch 100/891 | Cost: 0.1562
Epoch: 036/050 | Batch 150/891 | Cost: 0.1287
Epoch: 036/050 | Batch 200/891 | Cost: 0.1003
Epoch: 036/050 | Batch 250/891 | Cost: 0.0321
Epoch: 036/050 | Batch 300/891 | Cost: 0.0996
Epoch: 036/050 | Batch 350/891 | Cost: 0.3548
Epoch: 036/050 | Batch 400/891 | Cost: 0.3519
Epoch: 036/050 | Batch 450/891 | Cost: 0.1706
Epoch: 036/050 | Batch 500/891 | Cost: 0.0928
Epoch: 036/050 | Batch 550/891 | Cost: 0.2362
Epoch: 036/050 | Batch 600/891 | Cost: 0.0272
Epoch: 036/050 | Batch 650/891 | Cost: 0.1204
Epoch: 036/050 | Batch 700/891 | Cost: 0.1232
Epoch: 036/050 | Batch 750/891 | Cost: 0.0554
Epoch: 036/050 | Batch 800/891 | Cost: 0.1261
Epoch: 036/050 | Batch 850/891 | Cost: 0.1711
training accuracy: 97.61%
valid accuracy: 91.78%
Time elapsed: 18.79 min
Epoch: 037/050 | Batch 000/891 | Cost: 0.2958
Epoch: 037/050 | Batch 050/891 | Cost: 0.1589
Epoch: 037/050 | Batch 100/891 | Cost: 0.1260
Epoch: 037/050 | Batch 150/891 | Cost: 0.1790
Epoch: 037/050 | Batch 200/891 | Cost: 0.1086
Epoch: 037/050 | Batch 250/891 | Cost: 0.1195
Epoch: 037/050 | Batch 300/891 | Cost: 0.0967
Epoch: 037/050 | Batch 350/891 | Cost: 0.1505
Epoch: 037/050 | Batch 400/891 | Cost: 0.1043
Epoch: 037/050 | Batch 450/891 | Cost: 0.0591
Epoch: 037/050 | Batch 500/891 | Cost: 0.1217
Epoch: 037/050 | Batch 550/891 | Cost: 0.1842
Epoch: 037/050 | Batch 600/891 | Cost: 0.1192
Epoch: 037/050 | Batch 650/891 | Cost: 0.1334
Epoch: 037/050 | Batch 700/891 | Cost: 0.1788
Epoch: 037/050 | Batch 750/891 | Cost: 0.0667
Epoch: 037/050 | Batch 800/891 | Cost: 0.3219
Epoch: 037/050 | Batch 850/891 | Cost: 0.1975
training accuracy: 97.71%
valid accuracy: 91.72%
Time elapsed: 19.32 min
Epoch: 038/050 | Batch 000/891 | Cost: 0.1844
Epoch: 038/050 | Batch 050/891 | Cost: 0.1545
Epoch: 038/050 | Batch 100/891 | Cost: 0.1334
Epoch: 038/050 | Batch 150/891 | Cost: 0.1063
Epoch: 038/050 | Batch 200/891 | Cost: 0.2812
Epoch: 038/050 | Batch 250/891 | Cost: 0.0981
Epoch: 038/050 | Batch 300/891 | Cost: 0.1523
Epoch: 038/050 | Batch 350/891 | Cost: 0.2879
Epoch: 038/050 | Batch 400/891 | Cost: 0.2729
Epoch: 038/050 | Batch 450/891 | Cost: 0.0612
Epoch: 038/050 | Batch 500/891 | Cost: 0.1598
Epoch: 038/050 | Batch 550/891 | Cost: 0.0723
Epoch: 038/050 | Batch 600/891 | Cost: 0.2697
Epoch: 038/050 | Batch 650/891 | Cost: 0.1282
Epoch: 038/050 | Batch 700/891 | Cost: 0.1593
Epoch: 038/050 | Batch 750/891 | Cost: 0.0659
Epoch: 038/050 | Batch 800/891 | Cost: 0.1573
Epoch: 038/050 | Batch 850/891 | Cost: 0.1656
training accuracy: 97.69%
valid accuracy: 91.58%
Time elapsed: 19.84 min
Epoch: 039/050 | Batch 000/891 | Cost: 0.1314
Epoch: 039/050 | Batch 050/891 | Cost: 0.1625
Epoch: 039/050 | Batch 100/891 | Cost: 0.0831
Epoch: 039/050 | Batch 150/891 | Cost: 0.1587
Epoch: 039/050 | Batch 200/891 | Cost: 0.1787
Epoch: 039/050 | Batch 250/891 | Cost: 0.1757
Epoch: 039/050 | Batch 300/891 | Cost: 0.1766
Epoch: 039/050 | Batch 350/891 | Cost: 0.0869
Epoch: 039/050 | Batch 400/891 | Cost: 0.1955
Epoch: 039/050 | Batch 450/891 | Cost: 0.1461
Epoch: 039/050 | Batch 500/891 | Cost: 0.1332
Epoch: 039/050 | Batch 550/891 | Cost: 0.1721
Epoch: 039/050 | Batch 600/891 | Cost: 0.1060
Epoch: 039/050 | Batch 650/891 | Cost: 0.1121
Epoch: 039/050 | Batch 700/891 | Cost: 0.0702
Epoch: 039/050 | Batch 750/891 | Cost: 0.1067
Epoch: 039/050 | Batch 800/891 | Cost: 0.1447
Epoch: 039/050 | Batch 850/891 | Cost: 0.4161
training accuracy: 97.93%
valid accuracy: 91.78%
Time elapsed: 20.36 min
Epoch: 040/050 | Batch 000/891 | Cost: 0.2074
Epoch: 040/050 | Batch 050/891 | Cost: 0.1328
Epoch: 040/050 | Batch 100/891 | Cost: 0.4158
Epoch: 040/050 | Batch 150/891 | Cost: 0.1248
Epoch: 040/050 | Batch 200/891 | Cost: 0.1959
Epoch: 040/050 | Batch 250/891 | Cost: 0.0962
Epoch: 040/050 | Batch 300/891 | Cost: 0.1825
Epoch: 040/050 | Batch 350/891 | Cost: 0.1554
Epoch: 040/050 | Batch 400/891 | Cost: 0.1273
Epoch: 040/050 | Batch 450/891 | Cost: 0.1137
Epoch: 040/050 | Batch 500/891 | Cost: 0.1901
Epoch: 040/050 | Batch 550/891 | Cost: 0.0814
Epoch: 040/050 | Batch 600/891 | Cost: 0.1345
Epoch: 040/050 | Batch 650/891 | Cost: 0.2639
Epoch: 040/050 | Batch 700/891 | Cost: 0.1025
Epoch: 040/050 | Batch 750/891 | Cost: 0.1327
Epoch: 040/050 | Batch 800/891 | Cost: 0.1714
Epoch: 040/050 | Batch 850/891 | Cost: 0.1343
training accuracy: 97.81%
valid accuracy: 92.17%
Time elapsed: 20.89 min
Epoch: 041/050 | Batch 000/891 | Cost: 0.1353
Epoch: 041/050 | Batch 050/891 | Cost: 0.1946
Epoch: 041/050 | Batch 100/891 | Cost: 0.0811
Epoch: 041/050 | Batch 150/891 | Cost: 0.1745
Epoch: 041/050 | Batch 200/891 | Cost: 0.1002
Epoch: 041/050 | Batch 250/891 | Cost: 0.1357
Epoch: 041/050 | Batch 300/891 | Cost: 0.1622
Epoch: 041/050 | Batch 350/891 | Cost: 0.2214
Epoch: 041/050 | Batch 400/891 | Cost: 0.1607
Epoch: 041/050 | Batch 450/891 | Cost: 0.1431
Epoch: 041/050 | Batch 500/891 | Cost: 0.2578
Epoch: 041/050 | Batch 550/891 | Cost: 0.1356
Epoch: 041/050 | Batch 600/891 | Cost: 0.1696
Epoch: 041/050 | Batch 650/891 | Cost: 0.1122
Epoch: 041/050 | Batch 700/891 | Cost: 0.0957
Epoch: 041/050 | Batch 750/891 | Cost: 0.0836
Epoch: 041/050 | Batch 800/891 | Cost: 0.1506
Epoch: 041/050 | Batch 850/891 | Cost: 0.0962
training accuracy: 97.78%
valid accuracy: 91.58%
Time elapsed: 21.41 min
Epoch: 042/050 | Batch 000/891 | Cost: 0.2229
Epoch: 042/050 | Batch 050/891 | Cost: 0.1423
Epoch: 042/050 | Batch 100/891 | Cost: 0.1003
Epoch: 042/050 | Batch 150/891 | Cost: 0.0959
Epoch: 042/050 | Batch 200/891 | Cost: 0.1080
Epoch: 042/050 | Batch 250/891 | Cost: 0.1520
Epoch: 042/050 | Batch 300/891 | Cost: 0.0732
Epoch: 042/050 | Batch 350/891 | Cost: 0.1583
Epoch: 042/050 | Batch 400/891 | Cost: 0.1231
Epoch: 042/050 | Batch 450/891 | Cost: 0.2447
Epoch: 042/050 | Batch 500/891 | Cost: 0.0683
Epoch: 042/050 | Batch 550/891 | Cost: 0.1204
Epoch: 042/050 | Batch 600/891 | Cost: 0.1543
Epoch: 042/050 | Batch 650/891 | Cost: 0.1600
Epoch: 042/050 | Batch 700/891 | Cost: 0.0901
Epoch: 042/050 | Batch 750/891 | Cost: 0.1604
Epoch: 042/050 | Batch 800/891 | Cost: 0.1715
Epoch: 042/050 | Batch 850/891 | Cost: 0.2226
training accuracy: 97.77%
valid accuracy: 91.15%
Time elapsed: 21.94 min
Epoch: 043/050 | Batch 000/891 | Cost: 0.1232
Epoch: 043/050 | Batch 050/891 | Cost: 0.1437
Epoch: 043/050 | Batch 100/891 | Cost: 0.0858
Epoch: 043/050 | Batch 150/891 | Cost: 0.1087
Epoch: 043/050 | Batch 200/891 | Cost: 0.0706
Epoch: 043/050 | Batch 250/891 | Cost: 0.1048
Epoch: 043/050 | Batch 300/891 | Cost: 0.1699
Epoch: 043/050 | Batch 350/891 | Cost: 0.1475
Epoch: 043/050 | Batch 400/891 | Cost: 0.2350
Epoch: 043/050 | Batch 450/891 | Cost: 0.1415
Epoch: 043/050 | Batch 500/891 | Cost: 0.1563
Epoch: 043/050 | Batch 550/891 | Cost: 0.2188
Epoch: 043/050 | Batch 600/891 | Cost: 0.1957
Epoch: 043/050 | Batch 650/891 | Cost: 0.1960
Epoch: 043/050 | Batch 700/891 | Cost: 0.2074
Epoch: 043/050 | Batch 750/891 | Cost: 0.2902
Epoch: 043/050 | Batch 800/891 | Cost: 0.1978
Epoch: 043/050 | Batch 850/891 | Cost: 0.0669
training accuracy: 97.89%
valid accuracy: 91.38%
Time elapsed: 22.46 min
Epoch: 044/050 | Batch 000/891 | Cost: 0.2068
Epoch: 044/050 | Batch 050/891 | Cost: 0.1964
Epoch: 044/050 | Batch 100/891 | Cost: 0.1017
Epoch: 044/050 | Batch 150/891 | Cost: 0.0945
Epoch: 044/050 | Batch 200/891 | Cost: 0.1398
Epoch: 044/050 | Batch 250/891 | Cost: 0.1392
Epoch: 044/050 | Batch 300/891 | Cost: 0.1261
Epoch: 044/050 | Batch 350/891 | Cost: 0.2008
Epoch: 044/050 | Batch 400/891 | Cost: 0.2173
Epoch: 044/050 | Batch 450/891 | Cost: 0.0855
Epoch: 044/050 | Batch 500/891 | Cost: 0.0770
Epoch: 044/050 | Batch 550/891 | Cost: 0.1380
Epoch: 044/050 | Batch 600/891 | Cost: 0.3052
Epoch: 044/050 | Batch 650/891 | Cost: 0.0486
Epoch: 044/050 | Batch 700/891 | Cost: 0.1263
Epoch: 044/050 | Batch 750/891 | Cost: 0.1256
Epoch: 044/050 | Batch 800/891 | Cost: 0.1150
Epoch: 044/050 | Batch 850/891 | Cost: 0.0973
training accuracy: 97.76%
valid accuracy: 91.35%
Time elapsed: 22.98 min
Epoch: 045/050 | Batch 000/891 | Cost: 0.1594
Epoch: 045/050 | Batch 050/891 | Cost: 0.1549
Epoch: 045/050 | Batch 100/891 | Cost: 0.0711
Epoch: 045/050 | Batch 150/891 | Cost: 0.1032
Epoch: 045/050 | Batch 200/891 | Cost: 0.0720
Epoch: 045/050 | Batch 250/891 | Cost: 0.1090
Epoch: 045/050 | Batch 300/891 | Cost: 0.0773
Epoch: 045/050 | Batch 350/891 | Cost: 0.0606
Epoch: 045/050 | Batch 400/891 | Cost: 0.0950
Epoch: 045/050 | Batch 450/891 | Cost: 0.1379
Epoch: 045/050 | Batch 500/891 | Cost: 0.0536
Epoch: 045/050 | Batch 550/891 | Cost: 0.1675
Epoch: 045/050 | Batch 600/891 | Cost: 0.0619
Epoch: 045/050 | Batch 650/891 | Cost: 0.1666
Epoch: 045/050 | Batch 700/891 | Cost: 0.1070
Epoch: 045/050 | Batch 750/891 | Cost: 0.1447
Epoch: 045/050 | Batch 800/891 | Cost: 0.1363
Epoch: 045/050 | Batch 850/891 | Cost: 0.1717
training accuracy: 98.00%
valid accuracy: 91.67%
Time elapsed: 23.50 min
Epoch: 046/050 | Batch 000/891 | Cost: 0.0955
Epoch: 046/050 | Batch 050/891 | Cost: 0.0806
Epoch: 046/050 | Batch 100/891 | Cost: 0.0657
Epoch: 046/050 | Batch 150/891 | Cost: 0.2222
Epoch: 046/050 | Batch 200/891 | Cost: 0.0978
Epoch: 046/050 | Batch 250/891 | Cost: 0.0767
Epoch: 046/050 | Batch 300/891 | Cost: 0.1464
Epoch: 046/050 | Batch 350/891 | Cost: 0.1771
Epoch: 046/050 | Batch 400/891 | Cost: 0.2743
Epoch: 046/050 | Batch 450/891 | Cost: 0.1303
Epoch: 046/050 | Batch 500/891 | Cost: 0.2106
Epoch: 046/050 | Batch 550/891 | Cost: 0.0764
Epoch: 046/050 | Batch 600/891 | Cost: 0.0796
Epoch: 046/050 | Batch 650/891 | Cost: 0.0901
Epoch: 046/050 | Batch 700/891 | Cost: 0.2567
Epoch: 046/050 | Batch 750/891 | Cost: 0.1266
Epoch: 046/050 | Batch 800/891 | Cost: 0.0914
Epoch: 046/050 | Batch 850/891 | Cost: 0.1228
training accuracy: 97.94%
valid accuracy: 91.57%
Time elapsed: 24.02 min
Epoch: 047/050 | Batch 000/891 | Cost: 0.0675
Epoch: 047/050 | Batch 050/891 | Cost: 0.1272
Epoch: 047/050 | Batch 100/891 | Cost: 0.1254
Epoch: 047/050 | Batch 150/891 | Cost: 0.1105
Epoch: 047/050 | Batch 200/891 | Cost: 0.1292
Epoch: 047/050 | Batch 250/891 | Cost: 0.1707
Epoch: 047/050 | Batch 300/891 | Cost: 0.2328
Epoch: 047/050 | Batch 350/891 | Cost: 0.2123
Epoch: 047/050 | Batch 400/891 | Cost: 0.0974
Epoch: 047/050 | Batch 450/891 | Cost: 0.1456
Epoch: 047/050 | Batch 500/891 | Cost: 0.1195
Epoch: 047/050 | Batch 550/891 | Cost: 0.1078
Epoch: 047/050 | Batch 600/891 | Cost: 0.1064
Epoch: 047/050 | Batch 650/891 | Cost: 0.0680
Epoch: 047/050 | Batch 700/891 | Cost: 0.0793
Epoch: 047/050 | Batch 750/891 | Cost: 0.1284
Epoch: 047/050 | Batch 800/891 | Cost: 0.1557
Epoch: 047/050 | Batch 850/891 | Cost: 0.1397
training accuracy: 97.87%
valid accuracy: 91.40%
Time elapsed: 24.55 min
Epoch: 048/050 | Batch 000/891 | Cost: 0.0918
Epoch: 048/050 | Batch 050/891 | Cost: 0.1379
Epoch: 048/050 | Batch 100/891 | Cost: 0.2946
Epoch: 048/050 | Batch 150/891 | Cost: 0.1350
Epoch: 048/050 | Batch 200/891 | Cost: 0.1663
Epoch: 048/050 | Batch 250/891 | Cost: 0.0810
Epoch: 048/050 | Batch 300/891 | Cost: 0.1619
Epoch: 048/050 | Batch 350/891 | Cost: 0.0793
Epoch: 048/050 | Batch 400/891 | Cost: 0.0792
Epoch: 048/050 | Batch 450/891 | Cost: 0.1441
Epoch: 048/050 | Batch 500/891 | Cost: 0.3115
Epoch: 048/050 | Batch 550/891 | Cost: 0.0545
Epoch: 048/050 | Batch 600/891 | Cost: 0.0591
Epoch: 048/050 | Batch 650/891 | Cost: 0.0831
Epoch: 048/050 | Batch 700/891 | Cost: 0.1871
Epoch: 048/050 | Batch 750/891 | Cost: 0.0829
Epoch: 048/050 | Batch 800/891 | Cost: 0.2762
Epoch: 048/050 | Batch 850/891 | Cost: 0.1183
training accuracy: 98.02%
valid accuracy: 91.68%
Time elapsed: 25.07 min
Epoch: 049/050 | Batch 000/891 | Cost: 0.0937
Epoch: 049/050 | Batch 050/891 | Cost: 0.0760
Epoch: 049/050 | Batch 100/891 | Cost: 0.1527
Epoch: 049/050 | Batch 150/891 | Cost: 0.2894
Epoch: 049/050 | Batch 200/891 | Cost: 0.0581
Epoch: 049/050 | Batch 250/891 | Cost: 0.1349
Epoch: 049/050 | Batch 300/891 | Cost: 0.0351
Epoch: 049/050 | Batch 350/891 | Cost: 0.2301
Epoch: 049/050 | Batch 400/891 | Cost: 0.0575
Epoch: 049/050 | Batch 450/891 | Cost: 0.1455
Epoch: 049/050 | Batch 500/891 | Cost: 0.1668
Epoch: 049/050 | Batch 550/891 | Cost: 0.2178
Epoch: 049/050 | Batch 600/891 | Cost: 0.1040
Epoch: 049/050 | Batch 650/891 | Cost: 0.0888
Epoch: 049/050 | Batch 700/891 | Cost: 0.0934
Epoch: 049/050 | Batch 750/891 | Cost: 0.2147
Epoch: 049/050 | Batch 800/891 | Cost: 0.0826
Epoch: 049/050 | Batch 850/891 | Cost: 0.0803
training accuracy: 98.09%
valid accuracy: 91.72%
Time elapsed: 25.59 min
Epoch: 050/050 | Batch 000/891 | Cost: 0.0851
Epoch: 050/050 | Batch 050/891 | Cost: 0.0672
Epoch: 050/050 | Batch 100/891 | Cost: 0.1876
Epoch: 050/050 | Batch 150/891 | Cost: 0.1164
Epoch: 050/050 | Batch 200/891 | Cost: 0.0853
Epoch: 050/050 | Batch 250/891 | Cost: 0.1113
Epoch: 050/050 | Batch 300/891 | Cost: 0.1476
Epoch: 050/050 | Batch 350/891 | Cost: 0.2833
Epoch: 050/050 | Batch 400/891 | Cost: 0.0722
Epoch: 050/050 | Batch 450/891 | Cost: 0.1272
Epoch: 050/050 | Batch 500/891 | Cost: 0.0763
Epoch: 050/050 | Batch 550/891 | Cost: 0.1446
Epoch: 050/050 | Batch 600/891 | Cost: 0.1152
Epoch: 050/050 | Batch 650/891 | Cost: 0.2281
Epoch: 050/050 | Batch 700/891 | Cost: 0.2060
Epoch: 050/050 | Batch 750/891 | Cost: 0.1476
Epoch: 050/050 | Batch 800/891 | Cost: 0.0931
Epoch: 050/050 | Batch 850/891 | Cost: 0.0703
training accuracy: 98.20%
valid accuracy: 91.88%
Time elapsed: 26.11 min
Total Training Time: 26.11 min
Test accuracy: 90.87%

Evaluation

Evaluating the model on new text collected from recent news articles; these examples are not part of the training or test sets.

In [25]:
import spacy

# 'en' is the shortcut link for the English model in spaCy 2.x
# (the version used here); in spaCy 3+ use spacy.load('en_core_web_sm')
nlp = spacy.load('en')


map_dictionary = {
    0: "World",
    1: "Sports",
    2: "Business",
    3: "Sci/Tech",
}


def predict_class(model, sentence, min_len=4):
    # Somewhat based on
    # https://github.com/bentrevett/pytorch-sentiment-analysis/
    # blob/master/5%20-%20Multi-class%20Sentiment%20Analysis.ipynb
    model.eval()
    # tokenize and pad the input so it is at least min_len tokens long
    tokenized = [tok.text for tok in nlp.tokenizer(sentence)]
    if len(tokenized) < min_len:
        tokenized += ['<pad>'] * (min_len - len(tokenized))
    # map tokens to vocabulary indices and build a (seq_len, 1) batch
    indexed = [TEXT.vocab.stoi[t] for t in tokenized]
    length = [len(indexed)]
    tensor = torch.LongTensor(indexed).to(DEVICE)
    tensor = tensor.unsqueeze(1)
    length_tensor = torch.LongTensor(length)
    # disable gradient tracking during inference
    with torch.no_grad():
        preds = model(tensor, length_tensor)
    preds = torch.softmax(preds, dim=1)
    
    proba, class_label = preds.max(dim=1)
    return proba.item(), class_label.item()
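The preprocessing inside `predict_class` (tokenize, pad to `min_len`, map tokens to vocabulary indices) can be illustrated without the trained model. The sketch below uses a tiny hypothetical vocabulary; in the notebook, `TEXT.vocab.stoi` is the torchtext string-to-index mapping, in which unknown tokens map to the `<unk>` index.

```python
# Toy illustration of the index-mapping step in predict_class.
# The vocabulary here is made up; torchtext builds the real one.
toy_stoi = {'<unk>': 0, '<pad>': 1, 'stocks': 2, 'fell': 3, 'today': 4}

def to_indices(tokens, stoi, min_len=4):
    # pad short inputs so the sequence is at least min_len tokens long
    if len(tokens) < min_len:
        tokens = tokens + ['<pad>'] * (min_len - len(tokens))
    # out-of-vocabulary tokens fall back to the '<unk>' index
    return [stoi.get(t, stoi['<unk>']) for t in tokens]

print(to_indices(['stocks', 'fell'], toy_stoi))  # → [2, 3, 1, 1]
```

Note how the two-token input is padded to length 4 before indexing, matching the `min_len=4` default above.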
In [26]:
text = """
The windfall follows a tender offer by Z Holdings, which is controlled by SoftBank’s domestic wireless unit, 
for half of Zozo’s shares this month.
"""

proba, pred_label = predict_class(model, text)

print(f'Class Label: {pred_label} -> {map_dictionary[pred_label]}')
print(f'Probability: {proba}')
Class Label: 2 -> Business
Probability: 0.6041601896286011
In [27]:
text = """
EU data regulator issues first-ever sanction of an EU institution, 
against the European parliament over its use of US-based NationBuilder to process voter data 
"""

proba, pred_label = predict_class(model, text)

print(f'Class Label: {pred_label} -> {map_dictionary[pred_label]}')
print(f'Probability: {proba}')
Class Label: 0 -> World
Probability: 0.932104229927063
In [28]:
text = """
LG announces CEO Jo Seong-jin will be replaced by Brian Kwon Dec. 1, amid 2020 
leadership shakeup and LG smartphone division's 18th straight quarterly loss
"""

proba, pred_label = predict_class(model, text)

print(f'Class Label: {pred_label} -> {map_dictionary[pred_label]}')
print(f'Probability: {proba}')
Class Label: 3 -> Sci/Tech
Probability: 0.5513855814933777
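The ~0.55 probability above is only the largest entry of the softmax output; the remaining mass is spread over the other three classes. A minimal, framework-free sketch of how softmax converts logits into that class distribution (the logits below are made up for illustration, not taken from the model):

```python
import math

def softmax(logits):
    # subtract the max logit for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# hypothetical logits for the four AG News classes
logits = [0.2, -1.0, 0.5, 0.9]
probs = softmax(logits)
pred = max(range(len(probs)), key=lambda i: probs[i])
print(pred)  # → 3 (the class with the largest logit)
```

A low winning probability, as in the LG example, signals that the model was torn between classes (here plausibly Business vs. Sci/Tech).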
In [29]:
%watermark -iv
numpy     1.17.4
torchtext 0.4.0
torch     1.4.0
pandas    0.24.2
spacy     2.2.3

In [ ]: