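Using seq2seq, a new neural network architecture, we have an AI write text.
Suppose we give the word "I" to a language model trained on the corpus "you say goodbye and I say hello." Suppose also that this model outputs the highest probability for "say". How, then, can we generate new words?
First, we see that "say" has the highest probability given "I". Next, we feed "say" into the model and again select the word with the highest probability; suppose here that it is "hello". Repeating this step produces a sequence of words.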
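The procedure described above can be sketched with a hand-written probability table in place of a trained model. The table `NEXT_WORD_PROBS` and both function names are hypothetical and exist only for illustration. The sketch contrasts always taking the most probable word (as in the description) with drawing the next word at random according to its probability.

```python
import random

# Hypothetical next-word probabilities, written by hand for illustration
# (NOT the output of a trained language model).
NEXT_WORD_PROBS = {
    "you": {"say": 1.0},
    "say": {"goodbye": 0.4, "hello": 0.6},
    "goodbye": {"and": 1.0},
    "and": {"i": 1.0},
    "i": {"say": 1.0},
    "hello": {".": 1.0},
    ".": {"you": 1.0},
}


def generate_greedy(start_word, length):
    """Deterministic generation: always take the most probable next word."""
    words = [start_word]
    while len(words) < length:
        probs = NEXT_WORD_PROBS[words[-1]]
        words.append(max(probs, key=probs.get))
    return words


def generate_sampled(start_word, length, rng):
    """Probabilistic generation: draw the next word according to its probability."""
    words = [start_word]
    while len(words) < length:
        probs = NEXT_WORD_PROBS[words[-1]]
        candidates = list(probs)
        weights = [probs[w] for w in candidates]
        words.append(rng.choices(candidates, weights=weights, k=1)[0])
    return words


greedy = generate_greedy("i", 4)
sampled = generate_sampled("i", 4, random.Random(0))
print(" ".join(greedy))   # i say hello .
print(" ".join(sampled))
```

Greedy selection returns the same text every time, while sampling can produce a different continuation on each run.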
Implementation of the RnnlmGen class.
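# The module paths below assume the book's repository layout (common/, ch06/).
import numpy as np
from common.functions import softmax
from ch06.rnnlm import Rnnlm


class RnnlmGen(Rnnlm):
    def generate(self, start_id, skip_ids=None, sample_size=100):
        word_ids = [start_id]

        x = start_id
        while len(word_ids) < sample_size:
            x = np.array(x).reshape(1, 1)
            score = self.predict(x)          # scores for the next word
            p = softmax(score.flatten())     # scores -> probabilities
            # Sample the next word ID according to p
            sampled = np.random.choice(len(p), size=1, p=p)
            # Keep the sample unless it is one of the IDs to skip
            if (skip_ids is None) or (sampled not in skip_ids):
                x = sampled
                word_ids.append(int(x))

        return word_ids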
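The sampling step in `generate` can be tried in isolation. The following is a minimal sketch using only the standard library, with made-up scores for a 4-word vocabulary. The helper `sample_id` is hypothetical: it simply redraws when the sampled ID is in `skip_ids`, which plays the same role as the skip check above but is not the same control flow (the class calls `predict` again before drawing a new sample).

```python
import math
import random


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_id(scores, skip_ids, rng):
    """Sample an ID from softmax(scores), redrawing while it is in skip_ids."""
    probs = softmax(scores)
    ids = list(range(len(probs)))
    while True:
        sampled = rng.choices(ids, weights=probs, k=1)[0]
        if sampled not in skip_ids:
            return sampled


scores = [2.0, 1.0, 0.1, 3.0]  # made-up scores, for illustration only
probs = softmax(scores)
rng = random.Random(0)
samples = [sample_id(scores, skip_ids={3}, rng=rng) for _ in range(1000)]
print([round(p, 3) for p in probs])
print(sorted(set(samples)))
```

ID 3 has the highest probability but never appears in `samples`, because it is listed in `skip_ids`.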
Let's try generating some text.
# coding: utf-8
import sys
sys.path.append('..')
from rnnlm_gen import RnnlmGen
from dataset import ptb
corpus, word_to_id, id_to_word = ptb.load_data('train')
vocab_size = len(word_to_id)
corpus_size = len(corpus)
model = RnnlmGen()
model.load_params('../ch06/Rnnlm.pkl')
# Set the start word and the words to skip
start_word = 'you'
start_id = word_to_id[start_word]
skip_words = ['N', '<unk>', '$']
skip_ids = [word_to_id[w] for w in skip_words]
# Generate text
word_ids = model.generate(start_id, skip_ids)
txt = ' '.join([id_to_word[i] for i in word_ids])
txt = txt.replace(' <eos>', '.\n')
print(txt)
you need surprised did n't work to wait this game.
last news is pending.
yesterday refcorp has run the remaining because of it proposal at the time for hurricane hugo production as always and lloyd.
they canceled lots of smallest extremely fixed in their day at least half of damage patients.
legislative calls growth while a small public remains designed to convince dialing americans to proponents of an constituents and secrets.
coastal confirmed mr. stoltzman 's positive policy of orders clearly invites approved by the launch of time to recommend even a product sports four
Next, let's generate text with a better language model.
# coding: utf-8
import sys
sys.path.append('..')
from common.np import *
from rnnlm_gen import BetterRnnlmGen
from dataset import ptb
corpus, word_to_id, id_to_word = ptb.load_data('train')
vocab_size = len(word_to_id)
corpus_size = len(corpus)
model = BetterRnnlmGen()
model.load_params('../ch06/BetterRnnlm.pkl')
# Set the start word and the words to skip
start_word = 'you'
start_id = word_to_id[start_word]
skip_words = ['N', '<unk>', '$']
skip_ids = [word_to_id[w] for w in skip_words]
# Generate text
word_ids = model.generate(start_id, skip_ids)
txt = ' '.join([id_to_word[i] for i in word_ids])
txt = txt.replace(' <eos>', '.\n')
print(txt)
model.reset_state()
start_words = 'the meaning of life is'
start_ids = [word_to_id[w] for w in start_words.split(' ')]
for x in start_ids[:-1]:
    x = np.array(x).reshape(1, 1)
    model.predict(x)
word_ids = model.generate(start_ids[-1], skip_ids)
word_ids = start_ids[:-1] + word_ids
txt = ' '.join([id_to_word[i] for i in word_ids])
txt = txt.replace(' <eos>', '.\n')
print('-' * 50)
print(txt)
you talk to the senate.
the house begins back instance much less carefully.
it stopped the sound of welcome steel prices pollution programs such as containers says they had transferred women to rule for acquisitions.
we could see that sometimes it 's going to buy or followed in the west states.
the companies said it wants to even earn a recession publicly to replace additional lawsuits or that it expects a special work contest of daily care.
nor would have a right grown to mr. jones.
anyone will are lifted off.
and any
--------------------------------------------------
the meaning of life is either delivering and falls the fanfare against chemical with and a commitment to provide telephone data.
none of the new ventures has iron and limited a tight british air event.
most of the defendants have all time strongly on the ministry of coors depending on determination that in asset acceptance in the industry alex russell used.
's sugar and begin for every whole picks information and make such a conflict.
along between the wisconsin guaranty and mail company and japan 's plane response to any outside properties ca n't be made in early the case
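The second half of the script feeds every word of the prompt except the last through `predict` before calling `generate`, so that the model's internal state reflects the whole prompt. The following is a minimal sketch of that idea with a toy stand-in model. `ToyStatefulModel` and `generate_with_prefix` are hypothetical names, and the toy "prediction" rule (sum of the inputs seen so far, modulo 10) is an assumption chosen only to make the state visible.

```python
class ToyStatefulModel:
    """Hypothetical stand-in for a stateful language model.

    Its "state" is just the list of IDs it has seen so far.
    """

    def __init__(self):
        self.history = []

    def reset_state(self):
        self.history = []

    def predict(self, word_id):
        # Update the state, then "predict" the next ID from the whole history.
        self.history.append(word_id)
        return sum(self.history) % 10

    def generate(self, start_id, sample_size):
        word_ids = [start_id]
        x = start_id
        while len(word_ids) < sample_size:
            x = self.predict(x)
            word_ids.append(x)
        return word_ids


def generate_with_prefix(model, prefix_ids, sample_size):
    """Warm up the state on all but the last prefix ID, then generate."""
    model.reset_state()
    for x in prefix_ids[:-1]:
        model.predict(x)  # output ignored; only the state update matters
    generated = model.generate(prefix_ids[-1], sample_size)
    return prefix_ids[:-1] + generated


model = ToyStatefulModel()
primed = generate_with_prefix(model, [1, 2, 3], sample_size=4)

model.reset_state()
unprimed = model.generate(3, sample_size=4)

print(primed)    # [1, 2, 3, 6, 2, 4]
print(unprimed)  # [3, 3, 6, 2]
```

Starting from the same last ID, the primed and unprimed runs continue differently, which is why the prompt is passed through the model first.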
We use an Encoder-Decoder model.
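A "toy problem" is a simple problem designed for evaluating machine learning methods.
The toy problem here is addition: given a string such as "57+5", the model learns to answer "62" correctly. Examples from the dataset:
16+75 _91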
52+607 _659
75+22 _97
63+22 _85
795+3 _798
706+796_1502
8+4 _12
84+317 _401
9+3 _12
6+2 _8
18+8 _26
85+52 _137
9+1 _10
8+20 _28
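The sample lines above suggest a fixed-width format: a question padded with spaces, and an answer that starts with `_` and is also padded. The following is a minimal sketch of producing one such pair. The function `format_example` is hypothetical, and the widths 7 and 5 are assumptions taken from the array shapes the loader reports; this is not necessarily how `addition.txt` was actually generated.

```python
def format_example(a, b, question_len=7, answer_len=5):
    """Return (question, answer) strings padded to fixed widths.

    The answer is prefixed with '_' as a start-of-answer delimiter.
    """
    question = f"{a}+{b}".ljust(question_len)
    answer = ("_" + str(a + b)).ljust(answer_len)
    return question, answer


for a, b in [(57, 5), (16, 75), (706, 796)]:
    question, answer = format_example(a, b)
    # repr() makes the trailing padding visible
    print(repr(question), repr(answer))
```

Fixed widths let every example be stored in one rectangular array, so mini-batches need no per-example length handling.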
# coding: utf-8
import sys
sys.path.append('..')
from dataset import sequence
(x_train, t_train), (x_test, t_test) = \
    sequence.load_data('addition.txt', seed=1984)
char_to_id, id_to_char = sequence.get_vocab()
print(x_train.shape, t_train.shape)
print(x_test.shape, t_test.shape)
# (45000, 7) (45000, 5)
# (5000, 7) (5000, 5)
print(x_train[0])
print(t_train[0])
# [ 3 0 2 0 0 11 5]
# [ 6 0 11 7 5]
print(''.join([id_to_char[c] for c in x_train[0]]))
print(''.join([id_to_char[c] for c in t_train[0]]))
# 71+118
# _189
(45000, 7) (45000, 5)
(5000, 7) (5000, 5)
[ 3 0 2 0 0 11 5]
[ 6 0 11 7 5]
71+118
_189
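The output above shows each character stored as an integer ID and converted back with `id_to_char`. The following is a minimal sketch of building such a mapping, assuming IDs are assigned in order of first appearance. `build_vocab` is a hypothetical helper, not the implementation of `sequence.get_vocab()`, and the IDs it assigns depend on the strings it is given.

```python
def build_vocab(texts):
    """Assign an integer ID to each character in order of first appearance."""
    char_to_id = {}
    for text in texts:
        for ch in text:
            if ch not in char_to_id:
                char_to_id[ch] = len(char_to_id)
    id_to_char = {i: ch for ch, i in char_to_id.items()}
    return char_to_id, id_to_char


questions = ["16+75  ", "52+607 "]
char_to_id, id_to_char = build_vocab(questions)

# Encode each string as a list of IDs, then decode back to text.
encoded = [[char_to_id[ch] for ch in q] for q in questions]
decoded = ["".join(id_to_char[i] for i in row) for row in encoded]

print(encoded[0])  # [0, 1, 2, 3, 4, 5, 5]
print(decoded)
```

The round trip from text to IDs and back is lossless, which is what the `''.join(...)` lines in the script rely on.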
# coding: utf-8
import sys
sys.path.append('..')
import numpy as np
import matplotlib.pyplot as plt
from dataset import sequence
from common.optimizer import Adam
from common.trainer import Trainer
from common.util import eval_seq2seq
from seq2seq import Seq2seq
from peeky_seq2seq import PeekySeq2seq
# Load the dataset
(x_train, t_train), (x_test, t_test) = sequence.load_data('addition.txt')
char_to_id, id_to_char = sequence.get_vocab()
# Reverse input? =================================================
is_reverse = False  # set to True to reverse the input sequences
if is_reverse:
    x_train, x_test = x_train[:, ::-1], x_test[:, ::-1]
# ================================================================
# Set the hyperparameters
vocab_size = len(char_to_id)
wordvec_size = 16
hidden_size = 128
batch_size = 128
max_epoch = 25
max_grad = 5.0
# Normal or Peeky? ==============================================
model = Seq2seq(vocab_size, wordvec_size, hidden_size)
# model = PeekySeq2seq(vocab_size, wordvec_size, hidden_size)
# ================================================================
optimizer = Adam()
trainer = Trainer(model, optimizer)
acc_list_seq2 = []
for epoch in range(max_epoch):
    trainer.fit(x_train, t_train, max_epoch=1,
                batch_size=batch_size, max_grad=max_grad)

    correct_num = 0
    for i in range(len(x_test)):
        question, correct = x_test[[i]], t_test[[i]]
        verbose = i < 10
        correct_num += eval_seq2seq(model, question, correct,
                                    id_to_char, verbose, is_reverse)

    acc = float(correct_num) / len(x_test)
    acc_list_seq2.append(acc)
    print('val acc %.3f%%' % (acc * 100))
# Plot the graph
x_seq2 = np.arange(len(acc_list_seq2))
plt.plot(x_seq2, acc_list_seq2, marker='o')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.ylim(0, 1.0)
plt.show()
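Two small pieces of the script above can be illustrated without the model: reversing the input sequences (the `is_reverse` option) and counting exact matches to get the accuracy. The following is a minimal sketch using plain Python lists and strings. Both helper names are hypothetical, and it assumes that an answer counts as correct only when it matches the target exactly, as the ☑/☒ marks in the training log suggest.

```python
def reverse_rows(rows):
    """Reverse each input sequence, as x[:, ::-1] does for a 2-D array."""
    return [row[::-1] for row in rows]


def exact_match_accuracy(predictions, targets):
    """Fraction of predictions that equal the target exactly."""
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    return correct / len(targets)


# Targets (T) of the first ten test questions and the predictions
# printed for them after epoch 4 in the training log.
targets = ["162", "1139", "666", "163", "422",
           "857", "1053", "1427", "864", "236"]
predictions = ["146", "1189", "666", "162", "432",
               "866", "1002", "1406", "862", "202"]

accuracy = exact_match_accuracy(predictions, targets)
reversed_question = reverse_rows(["71+118 "])

print(accuracy)           # 0.1
print(reversed_question)  # [' 811+17']
```

Under exact matching, a prediction such as 162 for the target 163 earns no credit, so accuracy on this task rises slowly even while the loss keeps falling.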
| epoch 1 | iter 1 / 351 | time 0[s] | loss 2.56 | epoch 1 | iter 21 / 351 | time 0[s] | loss 2.53 | epoch 1 | iter 41 / 351 | time 1[s] | loss 2.17 | epoch 1 | iter 61 / 351 | time 1[s] | loss 1.96 | epoch 1 | iter 81 / 351 | time 2[s] | loss 1.92 | epoch 1 | iter 101 / 351 | time 3[s] | loss 1.87 | epoch 1 | iter 121 / 351 | time 4[s] | loss 1.85 | epoch 1 | iter 141 / 351 | time 4[s] | loss 1.83 | epoch 1 | iter 161 / 351 | time 5[s] | loss 1.79 | epoch 1 | iter 181 / 351 | time 5[s] | loss 1.77 | epoch 1 | iter 201 / 351 | time 6[s] | loss 1.77 | epoch 1 | iter 221 / 351 | time 7[s] | loss 1.76 | epoch 1 | iter 241 / 351 | time 7[s] | loss 1.76 | epoch 1 | iter 261 / 351 | time 8[s] | loss 1.76 | epoch 1 | iter 281 / 351 | time 9[s] | loss 1.75 | epoch 1 | iter 301 / 351 | time 9[s] | loss 1.74 | epoch 1 | iter 321 / 351 | time 10[s] | loss 1.75 | epoch 1 | iter 341 / 351 | time 11[s] | loss 1.74 Q 77+85 T 162 ☒ 100 --- Q 975+164 T 1139 ☒ 1000 --- Q 582+84 T 666 ☒ 1000 --- Q 8+155 T 163 ☒ 100 --- Q 367+55 T 422 ☒ 1000 --- Q 600+257 T 857 ☒ 1000 --- Q 761+292 T 1053 ☒ 1000 --- Q 830+597 T 1427 ☒ 1000 --- Q 26+838 T 864 ☒ 1000 --- Q 143+93 T 236 ☒ 100 --- val acc 0.180% | epoch 2 | iter 1 / 351 | time 0[s] | loss 1.74 | epoch 2 | iter 21 / 351 | time 0[s] | loss 1.73 | epoch 2 | iter 41 / 351 | time 1[s] | loss 1.74 | epoch 2 | iter 61 / 351 | time 2[s] | loss 1.74 | epoch 2 | iter 81 / 351 | time 3[s] | loss 1.73 | epoch 2 | iter 101 / 351 | time 3[s] | loss 1.73 | epoch 2 | iter 121 / 351 | time 4[s] | loss 1.72 | epoch 2 | iter 141 / 351 | time 5[s] | loss 1.71 | epoch 2 | iter 161 / 351 | time 5[s] | loss 1.71 | epoch 2 | iter 181 / 351 | time 6[s] | loss 1.71 | epoch 2 | iter 201 / 351 | time 7[s] | loss 1.70 | epoch 2 | iter 221 / 351 | time 8[s] | loss 1.71 | epoch 2 | iter 241 / 351 | time 9[s] | loss 1.70 | epoch 2 | iter 261 / 351 | time 9[s] | loss 1.69 | epoch 2 | iter 281 / 351 | time 10[s] | loss 1.69 | epoch 2 | iter 301 / 351 | time 11[s] | loss 
1.69 | epoch 2 | iter 321 / 351 | time 12[s] | loss 1.68 | epoch 2 | iter 341 / 351 | time 13[s] | loss 1.67 Q 77+85 T 162 ☒ 994 --- Q 975+164 T 1139 ☒ 1000 --- Q 582+84 T 666 ☒ 700 --- Q 8+155 T 163 ☒ 100 --- Q 367+55 T 422 ☒ 400 --- Q 600+257 T 857 ☒ 1000 --- Q 761+292 T 1053 ☒ 1000 --- Q 830+597 T 1427 ☒ 1544 --- Q 26+838 T 864 ☒ 400 --- Q 143+93 T 236 ☒ 400 --- val acc 0.220% | epoch 3 | iter 1 / 351 | time 0[s] | loss 1.66 | epoch 3 | iter 21 / 351 | time 0[s] | loss 1.66 | epoch 3 | iter 41 / 351 | time 1[s] | loss 1.65 | epoch 3 | iter 61 / 351 | time 2[s] | loss 1.63 | epoch 3 | iter 81 / 351 | time 2[s] | loss 1.62 | epoch 3 | iter 101 / 351 | time 3[s] | loss 1.62 | epoch 3 | iter 121 / 351 | time 4[s] | loss 1.60 | epoch 3 | iter 141 / 351 | time 4[s] | loss 1.59 | epoch 3 | iter 161 / 351 | time 5[s] | loss 1.57 | epoch 3 | iter 181 / 351 | time 6[s] | loss 1.57 | epoch 3 | iter 201 / 351 | time 6[s] | loss 1.56 | epoch 3 | iter 221 / 351 | time 7[s] | loss 1.54 | epoch 3 | iter 241 / 351 | time 8[s] | loss 1.52 | epoch 3 | iter 261 / 351 | time 9[s] | loss 1.52 | epoch 3 | iter 281 / 351 | time 10[s] | loss 1.52 | epoch 3 | iter 301 / 351 | time 10[s] | loss 1.50 | epoch 3 | iter 321 / 351 | time 11[s] | loss 1.49 | epoch 3 | iter 341 / 351 | time 12[s] | loss 1.48 Q 77+85 T 162 ☒ 108 --- Q 975+164 T 1139 ☒ 1001 --- Q 582+84 T 666 ☒ 648 --- Q 8+155 T 163 ☒ 138 --- Q 367+55 T 422 ☒ 448 --- Q 600+257 T 857 ☒ 848 --- Q 761+292 T 1053 ☒ 1011 --- Q 830+597 T 1427 ☒ 1373 --- Q 26+838 T 864 ☒ 868 --- Q 143+93 T 236 ☒ 348 --- val acc 0.560% | epoch 4 | iter 1 / 351 | time 0[s] | loss 1.47 | epoch 4 | iter 21 / 351 | time 0[s] | loss 1.46 | epoch 4 | iter 41 / 351 | time 1[s] | loss 1.44 | epoch 4 | iter 61 / 351 | time 2[s] | loss 1.43 | epoch 4 | iter 81 / 351 | time 3[s] | loss 1.42 | epoch 4 | iter 101 / 351 | time 3[s] | loss 1.41 | epoch 4 | iter 121 / 351 | time 5[s] | loss 1.40 | epoch 4 | iter 141 / 351 | time 5[s] | loss 1.40 | epoch 4 | iter 161 / 
351 | time 6[s] | loss 1.38 | epoch 4 | iter 181 / 351 | time 7[s] | loss 1.38 | epoch 4 | iter 201 / 351 | time 8[s] | loss 1.37 | epoch 4 | iter 221 / 351 | time 9[s] | loss 1.35 | epoch 4 | iter 241 / 351 | time 9[s] | loss 1.33 | epoch 4 | iter 261 / 351 | time 10[s] | loss 1.33 | epoch 4 | iter 281 / 351 | time 11[s] | loss 1.33 | epoch 4 | iter 301 / 351 | time 11[s] | loss 1.32 | epoch 4 | iter 321 / 351 | time 12[s] | loss 1.31 | epoch 4 | iter 341 / 351 | time 13[s] | loss 1.30 Q 77+85 T 162 ☒ 146 --- Q 975+164 T 1139 ☒ 1189 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 432 --- Q 600+257 T 857 ☒ 866 --- Q 761+292 T 1053 ☒ 1002 --- Q 830+597 T 1427 ☒ 1406 --- Q 26+838 T 864 ☒ 862 --- Q 143+93 T 236 ☒ 202 --- val acc 1.060% | epoch 5 | iter 1 / 351 | time 0[s] | loss 1.28 | epoch 5 | iter 21 / 351 | time 0[s] | loss 1.29 | epoch 5 | iter 41 / 351 | time 1[s] | loss 1.28 | epoch 5 | iter 61 / 351 | time 2[s] | loss 1.27 | epoch 5 | iter 81 / 351 | time 2[s] | loss 1.27 | epoch 5 | iter 101 / 351 | time 3[s] | loss 1.26 | epoch 5 | iter 121 / 351 | time 4[s] | loss 1.26 | epoch 5 | iter 141 / 351 | time 4[s] | loss 1.27 | epoch 5 | iter 161 / 351 | time 5[s] | loss 1.26 | epoch 5 | iter 181 / 351 | time 6[s] | loss 1.25 | epoch 5 | iter 201 / 351 | time 6[s] | loss 1.23 | epoch 5 | iter 221 / 351 | time 7[s] | loss 1.22 | epoch 5 | iter 241 / 351 | time 8[s] | loss 1.21 | epoch 5 | iter 261 / 351 | time 8[s] | loss 1.21 | epoch 5 | iter 281 / 351 | time 9[s] | loss 1.21 | epoch 5 | iter 301 / 351 | time 10[s] | loss 1.20 | epoch 5 | iter 321 / 351 | time 11[s] | loss 1.19 | epoch 5 | iter 341 / 351 | time 11[s] | loss 1.18 Q 77+85 T 162 ☒ 145 --- Q 975+164 T 1139 ☒ 1168 --- Q 582+84 T 666 ☒ 665 --- Q 8+155 T 163 ☒ 192 --- Q 367+55 T 422 ☒ 431 --- Q 600+257 T 857 ☒ 895 --- Q 761+292 T 1053 ☒ 1015 --- Q 830+597 T 1427 ☒ 1493 --- Q 26+838 T 864 ☒ 891 --- Q 143+93 T 236 ☒ 221 --- val acc 2.260% | epoch 6 | iter 1 / 351 | time 0[s] | loss 
1.17 | epoch 6 | iter 21 / 351 | time 0[s] | loss 1.17 | epoch 6 | iter 41 / 351 | time 1[s] | loss 1.18 | epoch 6 | iter 61 / 351 | time 2[s] | loss 1.17 | epoch 6 | iter 81 / 351 | time 2[s] | loss 1.16 | epoch 6 | iter 101 / 351 | time 3[s] | loss 1.16 | epoch 6 | iter 121 / 351 | time 4[s] | loss 1.16 | epoch 6 | iter 141 / 351 | time 4[s] | loss 1.14 | epoch 6 | iter 161 / 351 | time 5[s] | loss 1.14 | epoch 6 | iter 181 / 351 | time 6[s] | loss 1.13 | epoch 6 | iter 201 / 351 | time 6[s] | loss 1.15 | epoch 6 | iter 221 / 351 | time 7[s] | loss 1.13 | epoch 6 | iter 241 / 351 | time 8[s] | loss 1.13 | epoch 6 | iter 261 / 351 | time 8[s] | loss 1.16 | epoch 6 | iter 281 / 351 | time 9[s] | loss 1.19 | epoch 6 | iter 301 / 351 | time 10[s] | loss 1.16 | epoch 6 | iter 321 / 351 | time 11[s] | loss 1.13 | epoch 6 | iter 341 / 351 | time 11[s] | loss 1.10 Q 77+85 T 162 ☒ 166 --- Q 975+164 T 1139 ☒ 1169 --- Q 582+84 T 666 ☒ 660 --- Q 8+155 T 163 ☒ 174 --- Q 367+55 T 422 ☒ 412 --- Q 600+257 T 857 ☒ 846 --- Q 761+292 T 1053 ☒ 1011 --- Q 830+597 T 1427 ☒ 1412 --- Q 26+838 T 864 ☒ 846 --- Q 143+93 T 236 ☒ 207 --- val acc 2.800% | epoch 7 | iter 1 / 351 | time 0[s] | loss 1.11 | epoch 7 | iter 21 / 351 | time 0[s] | loss 1.11 | epoch 7 | iter 41 / 351 | time 1[s] | loss 1.11 | epoch 7 | iter 61 / 351 | time 2[s] | loss 1.10 | epoch 7 | iter 81 / 351 | time 2[s] | loss 1.09 | epoch 7 | iter 101 / 351 | time 3[s] | loss 1.08 | epoch 7 | iter 121 / 351 | time 4[s] | loss 1.08 | epoch 7 | iter 141 / 351 | time 4[s] | loss 1.08 | epoch 7 | iter 161 / 351 | time 5[s] | loss 1.08 | epoch 7 | iter 181 / 351 | time 6[s] | loss 1.09 | epoch 7 | iter 201 / 351 | time 6[s] | loss 1.07 | epoch 7 | iter 221 / 351 | time 7[s] | loss 1.08 | epoch 7 | iter 241 / 351 | time 8[s] | loss 1.06 | epoch 7 | iter 261 / 351 | time 8[s] | loss 1.06 | epoch 7 | iter 281 / 351 | time 9[s] | loss 1.06 | epoch 7 | iter 301 / 351 | time 10[s] | loss 1.06 | epoch 7 | iter 321 / 351 | time 11[s] | 
loss 1.06 | epoch 7 | iter 341 / 351 | time 11[s] | loss 1.04 Q 77+85 T 162 ☒ 166 --- Q 975+164 T 1139 ☒ 1160 --- Q 582+84 T 666 ☒ 655 --- Q 8+155 T 163 ☒ 161 --- Q 367+55 T 422 ☒ 409 --- Q 600+257 T 857 ☒ 892 --- Q 761+292 T 1053 ☒ 1039 --- Q 830+597 T 1427 ☒ 1444 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 228 --- val acc 4.160% | epoch 8 | iter 1 / 351 | time 0[s] | loss 1.04 | epoch 8 | iter 21 / 351 | time 0[s] | loss 1.03 | epoch 8 | iter 41 / 351 | time 1[s] | loss 1.03 | epoch 8 | iter 61 / 351 | time 2[s] | loss 1.04 | epoch 8 | iter 81 / 351 | time 2[s] | loss 1.03 | epoch 8 | iter 101 / 351 | time 3[s] | loss 1.05 | epoch 8 | iter 121 / 351 | time 4[s] | loss 1.05 | epoch 8 | iter 141 / 351 | time 4[s] | loss 1.04 | epoch 8 | iter 161 / 351 | time 5[s] | loss 1.05 | epoch 8 | iter 181 / 351 | time 6[s] | loss 1.03 | epoch 8 | iter 201 / 351 | time 6[s] | loss 1.03 | epoch 8 | iter 221 / 351 | time 7[s] | loss 1.02 | epoch 8 | iter 241 / 351 | time 8[s] | loss 1.02 | epoch 8 | iter 261 / 351 | time 8[s] | loss 1.01 | epoch 8 | iter 281 / 351 | time 9[s] | loss 1.03 | epoch 8 | iter 301 / 351 | time 10[s] | loss 1.01 | epoch 8 | iter 321 / 351 | time 10[s] | loss 1.05 | epoch 8 | iter 341 / 351 | time 11[s] | loss 1.00 Q 77+85 T 162 ☒ 158 --- Q 975+164 T 1139 ☒ 1109 --- Q 582+84 T 666 ☒ 656 --- Q 8+155 T 163 ☒ 156 --- Q 367+55 T 422 ☒ 431 --- Q 600+257 T 857 ☒ 838 --- Q 761+292 T 1053 ☒ 1009 --- Q 830+597 T 1427 ☒ 1411 --- Q 26+838 T 864 ☒ 861 --- Q 143+93 T 236 ☒ 227 --- val acc 3.460% | epoch 9 | iter 1 / 351 | time 0[s] | loss 0.99 | epoch 9 | iter 21 / 351 | time 0[s] | loss 1.01 | epoch 9 | iter 41 / 351 | time 1[s] | loss 1.00 | epoch 9 | iter 61 / 351 | time 2[s] | loss 1.00 | epoch 9 | iter 81 / 351 | time 2[s] | loss 1.00 | epoch 9 | iter 101 / 351 | time 3[s] | loss 0.99 | epoch 9 | iter 121 / 351 | time 4[s] | loss 1.04 | epoch 9 | iter 141 / 351 | time 4[s] | loss 1.04 | epoch 9 | iter 161 / 351 | time 5[s] | loss 1.04 | epoch 9 | iter 181 / 
351 | time 6[s] | loss 1.01 | epoch 9 | iter 201 / 351 | time 7[s] | loss 0.99 | epoch 9 | iter 221 / 351 | time 7[s] | loss 0.98 | epoch 9 | iter 241 / 351 | time 8[s] | loss 0.98 | epoch 9 | iter 261 / 351 | time 9[s] | loss 1.00 | epoch 9 | iter 281 / 351 | time 10[s] | loss 0.99 | epoch 9 | iter 301 / 351 | time 10[s] | loss 0.99 | epoch 9 | iter 321 / 351 | time 11[s] | loss 0.99 | epoch 9 | iter 341 / 351 | time 12[s] | loss 1.00 Q 77+85 T 162 ☒ 157 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☒ 655 --- Q 8+155 T 163 ☒ 172 --- Q 367+55 T 422 ☒ 418 --- Q 600+257 T 857 ☒ 846 --- Q 761+292 T 1053 ☒ 1039 --- Q 830+597 T 1427 ☒ 1449 --- Q 26+838 T 864 ☒ 861 --- Q 143+93 T 236 ☒ 218 --- val acc 3.680% | epoch 10 | iter 1 / 351 | time 0[s] | loss 0.97 | epoch 10 | iter 21 / 351 | time 0[s] | loss 1.01 | epoch 10 | iter 41 / 351 | time 1[s] | loss 1.00 | epoch 10 | iter 61 / 351 | time 2[s] | loss 1.00 | epoch 10 | iter 81 / 351 | time 3[s] | loss 1.00 | epoch 10 | iter 101 / 351 | time 3[s] | loss 0.97 | epoch 10 | iter 121 / 351 | time 4[s] | loss 0.96 | epoch 10 | iter 141 / 351 | time 5[s] | loss 0.97 | epoch 10 | iter 161 / 351 | time 5[s] | loss 0.96 | epoch 10 | iter 181 / 351 | time 6[s] | loss 0.97 | epoch 10 | iter 201 / 351 | time 7[s] | loss 0.96 | epoch 10 | iter 221 / 351 | time 8[s] | loss 0.96 | epoch 10 | iter 241 / 351 | time 8[s] | loss 0.95 | epoch 10 | iter 261 / 351 | time 9[s] | loss 0.96 | epoch 10 | iter 281 / 351 | time 10[s] | loss 0.95 | epoch 10 | iter 301 / 351 | time 11[s] | loss 0.95 | epoch 10 | iter 321 / 351 | time 11[s] | loss 0.96 | epoch 10 | iter 341 / 351 | time 12[s] | loss 0.95 Q 77+85 T 162 ☒ 157 --- Q 975+164 T 1139 ☒ 1160 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 160 --- Q 367+55 T 422 ☒ 407 --- Q 600+257 T 857 ☒ 856 --- Q 761+292 T 1053 ☒ 1039 --- Q 830+597 T 1427 ☒ 1460 --- Q 26+838 T 864 ☒ 867 --- Q 143+93 T 236 ☒ 237 --- val acc 4.640% | epoch 11 | iter 1 / 351 | time 0[s] | loss 0.99 | epoch 11 | iter 21 / 351 | 
time 0[s] | loss 0.99 | epoch 11 | iter 41 / 351 | time 1[s] | loss 0.98 | epoch 11 | iter 61 / 351 | time 2[s] | loss 0.95 | epoch 11 | iter 81 / 351 | time 2[s] | loss 0.95 | epoch 11 | iter 101 / 351 | time 3[s] | loss 0.96 | epoch 11 | iter 121 / 351 | time 4[s] | loss 0.94 | epoch 11 | iter 141 / 351 | time 5[s] | loss 0.94 | epoch 11 | iter 161 / 351 | time 6[s] | loss 0.94 | epoch 11 | iter 181 / 351 | time 6[s] | loss 0.98 | epoch 11 | iter 201 / 351 | time 7[s] | loss 0.97 | epoch 11 | iter 221 / 351 | time 8[s] | loss 0.96 | epoch 11 | iter 241 / 351 | time 9[s] | loss 0.96 | epoch 11 | iter 261 / 351 | time 10[s] | loss 0.95 | epoch 11 | iter 281 / 351 | time 11[s] | loss 0.92 | epoch 11 | iter 301 / 351 | time 12[s] | loss 0.93 | epoch 11 | iter 321 / 351 | time 12[s] | loss 0.94 | epoch 11 | iter 341 / 351 | time 13[s] | loss 0.94 Q 77+85 T 162 ☒ 158 --- Q 975+164 T 1139 ☒ 1107 --- Q 582+84 T 666 ☒ 668 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 419 --- Q 600+257 T 857 ☒ 875 --- Q 761+292 T 1053 ☒ 1038 --- Q 830+597 T 1427 ☒ 1418 --- Q 26+838 T 864 ☒ 861 --- Q 143+93 T 236 ☒ 238 --- val acc 6.300% | epoch 12 | iter 1 / 351 | time 0[s] | loss 0.93 | epoch 12 | iter 21 / 351 | time 0[s] | loss 0.92 | epoch 12 | iter 41 / 351 | time 1[s] | loss 0.92 | epoch 12 | iter 61 / 351 | time 2[s] | loss 0.93 | epoch 12 | iter 81 / 351 | time 2[s] | loss 0.96 | epoch 12 | iter 101 / 351 | time 3[s] | loss 0.94 | epoch 12 | iter 121 / 351 | time 4[s] | loss 0.91 | epoch 12 | iter 141 / 351 | time 5[s] | loss 0.92 | epoch 12 | iter 161 / 351 | time 5[s] | loss 0.97 | epoch 12 | iter 181 / 351 | time 6[s] | loss 0.94 | epoch 12 | iter 201 / 351 | time 7[s] | loss 0.92 | epoch 12 | iter 221 / 351 | time 8[s] | loss 0.94 | epoch 12 | iter 241 / 351 | time 8[s] | loss 0.92 | epoch 12 | iter 261 / 351 | time 9[s] | loss 0.90 | epoch 12 | iter 281 / 351 | time 10[s] | loss 0.91 | epoch 12 | iter 301 / 351 | time 11[s] | loss 0.91 | epoch 12 | iter 321 / 351 | time 11[s] | 
loss 0.91 | epoch 12 | iter 341 / 351 | time 12[s] | loss 0.90 Q 77+85 T 162 ☒ 157 --- Q 975+164 T 1139 ☒ 1129 --- Q 582+84 T 666 ☒ 670 --- Q 8+155 T 163 ☒ 155 --- Q 367+55 T 422 ☒ 419 --- Q 600+257 T 857 ☒ 846 --- Q 761+292 T 1053 ☒ 1039 --- Q 830+597 T 1427 ☒ 1404 --- Q 26+838 T 864 ☒ 849 --- Q 143+93 T 236 ☒ 227 --- val acc 4.440% | epoch 13 | iter 1 / 351 | time 0[s] | loss 0.96 | epoch 13 | iter 21 / 351 | time 0[s] | loss 0.97 | epoch 13 | iter 41 / 351 | time 1[s] | loss 0.93 | epoch 13 | iter 61 / 351 | time 2[s] | loss 0.93 | epoch 13 | iter 81 / 351 | time 3[s] | loss 0.90 | epoch 13 | iter 101 / 351 | time 3[s] | loss 0.89 | epoch 13 | iter 121 / 351 | time 4[s] | loss 0.89 | epoch 13 | iter 141 / 351 | time 5[s] | loss 0.90 | epoch 13 | iter 161 / 351 | time 5[s] | loss 0.90 | epoch 13 | iter 181 / 351 | time 6[s] | loss 0.90 | epoch 13 | iter 201 / 351 | time 7[s] | loss 0.93 | epoch 13 | iter 221 / 351 | time 8[s] | loss 0.94 | epoch 13 | iter 241 / 351 | time 9[s] | loss 0.92 | epoch 13 | iter 261 / 351 | time 9[s] | loss 0.92 | epoch 13 | iter 281 / 351 | time 10[s] | loss 0.92 | epoch 13 | iter 301 / 351 | time 11[s] | loss 0.91 | epoch 13 | iter 321 / 351 | time 12[s] | loss 0.91 | epoch 13 | iter 341 / 351 | time 13[s] | loss 0.91 Q 77+85 T 162 ☒ 161 --- Q 975+164 T 1139 ☒ 1110 --- Q 582+84 T 666 ☒ 659 --- Q 8+155 T 163 ☒ 164 --- Q 367+55 T 422 ☒ 411 --- Q 600+257 T 857 ☒ 841 --- Q 761+292 T 1053 ☒ 1031 --- Q 830+597 T 1427 ☒ 1394 --- Q 26+838 T 864 ☒ 859 --- Q 143+93 T 236 ☒ 239 --- val acc 4.300% | epoch 14 | iter 1 / 351 | time 0[s] | loss 0.96 | epoch 14 | iter 21 / 351 | time 0[s] | loss 0.92 | epoch 14 | iter 41 / 351 | time 1[s] | loss 0.91 | epoch 14 | iter 61 / 351 | time 2[s] | loss 0.90 | epoch 14 | iter 81 / 351 | time 2[s] | loss 0.89 | epoch 14 | iter 101 / 351 | time 3[s] | loss 0.89 | epoch 14 | iter 121 / 351 | time 4[s] | loss 0.89 | epoch 14 | iter 141 / 351 | time 5[s] | loss 0.88 | epoch 14 | iter 161 / 351 | time 5[s] | loss 
0.89 | epoch 14 | iter 181 / 351 | time 6[s] | loss 0.88 | epoch 14 | iter 201 / 351 | time 7[s] | loss 0.92 | epoch 14 | iter 221 / 351 | time 8[s] | loss 0.88 | epoch 14 | iter 241 / 351 | time 9[s] | loss 0.90 | epoch 14 | iter 261 / 351 | time 9[s] | loss 0.88 | epoch 14 | iter 281 / 351 | time 10[s] | loss 0.88 | epoch 14 | iter 301 / 351 | time 11[s] | loss 0.90 | epoch 14 | iter 321 / 351 | time 12[s] | loss 0.92 | epoch 14 | iter 341 / 351 | time 13[s] | loss 0.90 Q 77+85 T 162 ☒ 164 --- Q 975+164 T 1139 ☒ 1128 --- Q 582+84 T 666 ☒ 685 --- Q 8+155 T 163 ☒ 175 --- Q 367+55 T 422 ☒ 428 --- Q 600+257 T 857 ☒ 859 --- Q 761+292 T 1053 ☒ 1072 --- Q 830+597 T 1427 ☒ 1418 --- Q 26+838 T 864 ☒ 875 --- Q 143+93 T 236 ☒ 245 --- val acc 4.460% | epoch 15 | iter 1 / 351 | time 0[s] | loss 0.89 | epoch 15 | iter 21 / 351 | time 0[s] | loss 0.89 | epoch 15 | iter 41 / 351 | time 1[s] | loss 0.89 | epoch 15 | iter 61 / 351 | time 2[s] | loss 0.89 | epoch 15 | iter 81 / 351 | time 3[s] | loss 0.89 | epoch 15 | iter 101 / 351 | time 4[s] | loss 0.90 | epoch 15 | iter 121 / 351 | time 4[s] | loss 0.89 | epoch 15 | iter 141 / 351 | time 5[s] | loss 0.89 | epoch 15 | iter 161 / 351 | time 6[s] | loss 0.91 | epoch 15 | iter 181 / 351 | time 7[s] | loss 0.90 | epoch 15 | iter 201 / 351 | time 8[s] | loss 0.90 | epoch 15 | iter 221 / 351 | time 9[s] | loss 0.90 | epoch 15 | iter 241 / 351 | time 9[s] | loss 0.88 | epoch 15 | iter 261 / 351 | time 10[s] | loss 0.87 | epoch 15 | iter 281 / 351 | time 11[s] | loss 0.88 | epoch 15 | iter 301 / 351 | time 12[s] | loss 0.87 | epoch 15 | iter 321 / 351 | time 12[s] | loss 0.86 | epoch 15 | iter 341 / 351 | time 13[s] | loss 0.86 Q 77+85 T 162 ☒ 161 --- Q 975+164 T 1139 ☒ 1107 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 158 --- Q 367+55 T 422 ☒ 412 --- Q 600+257 T 857 ☒ 849 --- Q 761+292 T 1053 ☒ 1039 --- Q 830+597 T 1427 ☒ 1414 --- Q 26+838 T 864 ☒ 868 --- Q 143+93 T 236 ☒ 232 --- val acc 6.020% | epoch 16 | iter 1 / 351 | time 0[s] | 
loss 0.86 | epoch 16 | iter 21 / 351 | time 0[s] | loss 0.86 | epoch 16 | iter 41 / 351 | time 1[s] | loss 0.85 | epoch 16 | iter 61 / 351 | time 2[s] | loss 0.86 | epoch 16 | iter 81 / 351 | time 3[s] | loss 0.87 | epoch 16 | iter 101 / 351 | time 3[s] | loss 0.87 | epoch 16 | iter 121 / 351 | time 4[s] | loss 0.87 | epoch 16 | iter 141 / 351 | time 5[s] | loss 0.86 | epoch 16 | iter 161 / 351 | time 6[s] | loss 0.86 | epoch 16 | iter 181 / 351 | time 7[s] | loss 0.87 | epoch 16 | iter 201 / 351 | time 8[s] | loss 0.86 | epoch 16 | iter 221 / 351 | time 8[s] | loss 0.87 | epoch 16 | iter 241 / 351 | time 9[s] | loss 0.84 | epoch 16 | iter 261 / 351 | time 10[s] | loss 0.87 | epoch 16 | iter 281 / 351 | time 10[s] | loss 0.86 | epoch 16 | iter 301 / 351 | time 12[s] | loss 0.86 | epoch 16 | iter 321 / 351 | time 12[s] | loss 0.87 | epoch 16 | iter 341 / 351 | time 13[s] | loss 0.84 Q 77+85 T 162 ☒ 161 --- Q 975+164 T 1139 ☒ 1120 --- Q 582+84 T 666 ☒ 662 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☒ 420 --- Q 600+257 T 857 ☒ 859 --- Q 761+292 T 1053 ☒ 1039 --- Q 830+597 T 1427 ☒ 1414 --- Q 26+838 T 864 ☒ 868 --- Q 143+93 T 236 ☒ 237 --- val acc 8.960% | epoch 17 | iter 1 / 351 | time 0[s] | loss 0.83 | epoch 17 | iter 21 / 351 | time 0[s] | loss 0.87 | epoch 17 | iter 41 / 351 | time 1[s] | loss 0.87 | epoch 17 | iter 61 / 351 | time 2[s] | loss 0.85 | epoch 17 | iter 81 / 351 | time 3[s] | loss 0.85 | epoch 17 | iter 101 / 351 | time 4[s] | loss 0.91 | epoch 17 | iter 121 / 351 | time 5[s] | loss 0.89 | epoch 17 | iter 141 / 351 | time 6[s] | loss 0.90 | epoch 17 | iter 161 / 351 | time 6[s] | loss 0.83 | epoch 17 | iter 181 / 351 | time 7[s] | loss 0.84 | epoch 17 | iter 201 / 351 | time 8[s] | loss 0.84 | epoch 17 | iter 221 / 351 | time 9[s] | loss 0.82 | epoch 17 | iter 241 / 351 | time 10[s] | loss 0.88 | epoch 17 | iter 261 / 351 | time 11[s] | loss 0.86 | epoch 17 | iter 281 / 351 | time 11[s] | loss 0.86 | epoch 17 | iter 301 / 351 | time 12[s] | loss 0.83 | 
epoch 17 | iter 321 / 351 | time 13[s] | loss 0.84 | epoch 17 | iter 341 / 351 | time 14[s] | loss 0.83 Q 77+85 T 162 ☒ 161 --- Q 975+164 T 1139 ☒ 1128 --- Q 582+84 T 666 ☒ 659 --- Q 8+155 T 163 ☒ 158 --- Q 367+55 T 422 ☒ 419 --- Q 600+257 T 857 ☒ 849 --- Q 761+292 T 1053 ☒ 1039 --- Q 830+597 T 1427 ☒ 1414 --- Q 26+838 T 864 ☒ 861 --- Q 143+93 T 236 ☒ 239 --- val acc 8.440% | epoch 18 | iter 1 / 351 | time 0[s] | loss 0.84 | epoch 18 | iter 21 / 351 | time 1[s] | loss 0.82 | epoch 18 | iter 41 / 351 | time 2[s] | loss 0.83 | epoch 18 | iter 61 / 351 | time 3[s] | loss 0.82 | epoch 18 | iter 81 / 351 | time 4[s] | loss 0.83 | epoch 18 | iter 101 / 351 | time 5[s] | loss 0.82 | epoch 18 | iter 121 / 351 | time 7[s] | loss 0.83 | epoch 18 | iter 141 / 351 | time 8[s] | loss 0.88 | epoch 18 | iter 161 / 351 | time 9[s] | loss 0.85 | epoch 18 | iter 181 / 351 | time 10[s] | loss 0.85 | epoch 18 | iter 201 / 351 | time 11[s] | loss 0.88 | epoch 18 | iter 221 / 351 | time 11[s] | loss 0.89 | epoch 18 | iter 241 / 351 | time 13[s] | loss 0.83 | epoch 18 | iter 261 / 351 | time 14[s] | loss 0.82 | epoch 18 | iter 281 / 351 | time 15[s] | loss 0.82 | epoch 18 | iter 301 / 351 | time 16[s] | loss 0.83 | epoch 18 | iter 321 / 351 | time 17[s] | loss 0.82 | epoch 18 | iter 341 / 351 | time 18[s] | loss 0.82 Q 77+85 T 162 ☒ 164 --- Q 975+164 T 1139 ☒ 1138 --- Q 582+84 T 666 ☒ 662 --- Q 8+155 T 163 ☒ 164 --- Q 367+55 T 422 ☒ 417 --- Q 600+257 T 857 ☒ 849 --- Q 761+292 T 1053 ☒ 1073 --- Q 830+597 T 1427 ☒ 1414 --- Q 26+838 T 864 ☒ 867 --- Q 143+93 T 236 ☒ 239 --- val acc 8.800% | epoch 19 | iter 1 / 351 | time 0[s] | loss 0.83 | epoch 19 | iter 21 / 351 | time 1[s] | loss 0.83 | epoch 19 | iter 41 / 351 | time 1[s] | loss 0.82 | epoch 19 | iter 61 / 351 | time 2[s] | loss 0.83 | epoch 19 | iter 81 / 351 | time 3[s] | loss 0.86 | epoch 19 | iter 101 / 351 | time 4[s] | loss 0.85 | epoch 19 | iter 121 / 351 | time 5[s] | loss 0.84 | epoch 19 | iter 141 / 351 | time 6[s] | loss 0.86 
| epoch 19 | iter 161 / 351 | time 7[s] | loss 0.83 | epoch 19 | iter 181 / 351 | time 8[s] | loss 0.82 | epoch 19 | iter 201 / 351 | time 9[s] | loss 0.80 | epoch 19 | iter 221 / 351 | time 10[s] | loss 0.82 | epoch 19 | iter 241 / 351 | time 11[s] | loss 0.82 | epoch 19 | iter 261 / 351 | time 12[s] | loss 0.86 | epoch 19 | iter 281 / 351 | time 13[s] | loss 0.84 | epoch 19 | iter 301 / 351 | time 14[s] | loss 0.84 | epoch 19 | iter 321 / 351 | time 15[s] | loss 0.83 | epoch 19 | iter 341 / 351 | time 16[s] | loss 0.83 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1160 --- Q 582+84 T 666 ☒ 672 --- Q 8+155 T 163 ☒ 167 --- Q 367+55 T 422 ☒ 419 --- Q 600+257 T 857 ☒ 856 --- Q 761+292 T 1053 ☒ 1049 --- Q 830+597 T 1427 ☒ 1424 --- Q 26+838 T 864 ☒ 867 --- Q 143+93 T 236 ☒ 239 --- val acc 10.500% | epoch 20 | iter 1 / 351 | time 0[s] | loss 0.79 | epoch 20 | iter 21 / 351 | time 0[s] | loss 0.83 | epoch 20 | iter 41 / 351 | time 1[s] | loss 0.82 | epoch 20 | iter 61 / 351 | time 2[s] | loss 0.85 | epoch 20 | iter 81 / 351 | time 4[s] | loss 0.83 | epoch 20 | iter 101 / 351 | time 5[s] | loss 0.80 | epoch 20 | iter 121 / 351 | time 6[s] | loss 0.82 | epoch 20 | iter 141 / 351 | time 8[s] | loss 0.80 | epoch 20 | iter 161 / 351 | time 9[s] | loss 0.84 | epoch 20 | iter 181 / 351 | time 10[s] | loss 0.83 | epoch 20 | iter 201 / 351 | time 11[s] | loss 0.82 | epoch 20 | iter 221 / 351 | time 12[s] | loss 0.82 | epoch 20 | iter 241 / 351 | time 13[s] | loss 0.84 | epoch 20 | iter 261 / 351 | time 14[s] | loss 0.83 | epoch 20 | iter 281 / 351 | time 15[s] | loss 0.82 | epoch 20 | iter 301 / 351 | time 16[s] | loss 0.82 | epoch 20 | iter 321 / 351 | time 17[s] | loss 0.84 | epoch 20 | iter 341 / 351 | time 18[s] | loss 0.79 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1131 --- Q 582+84 T 666 ☒ 667 --- Q 8+155 T 163 ☒ 164 --- Q 367+55 T 422 ☒ 412 --- Q 600+257 T 857 ☒ 849 --- Q 761+292 T 1053 ☒ 1039 --- Q 830+597 T 1427 ☒ 1424 --- Q 26+838 T 864 ☒ 862 --- Q 143+93 T 236 ☒ 239 --- 
val acc 10.140% | epoch 21 | iter 1 / 351 | time 0[s] | loss 0.78 | epoch 21 | iter 21 / 351 | time 0[s] | loss 0.80 | epoch 21 | iter 41 / 351 | time 1[s] | loss 0.79 | epoch 21 | iter 61 / 351 | time 2[s] | loss 0.81 | epoch 21 | iter 81 / 351 | time 3[s] | loss 0.80 | epoch 21 | iter 101 / 351 | time 4[s] | loss 0.84 | epoch 21 | iter 121 / 351 | time 5[s] | loss 0.80 | epoch 21 | iter 141 / 351 | time 6[s] | loss 0.82 | epoch 21 | iter 161 / 351 | time 7[s] | loss 0.80 | epoch 21 | iter 181 / 351 | time 8[s] | loss 0.79 | epoch 21 | iter 201 / 351 | time 9[s] | loss 0.79 | epoch 21 | iter 221 / 351 | time 10[s] | loss 0.79 | epoch 21 | iter 241 / 351 | time 11[s] | loss 0.83 | epoch 21 | iter 261 / 351 | time 12[s] | loss 0.81 | epoch 21 | iter 281 / 351 | time 13[s] | loss 0.79 | epoch 21 | iter 301 / 351 | time 14[s] | loss 0.82 | epoch 21 | iter 321 / 351 | time 15[s] | loss 0.82 | epoch 21 | iter 341 / 351 | time 16[s] | loss 0.82 Q 77+85 T 162 ☒ 160 --- Q 975+164 T 1139 ☒ 1126 --- Q 582+84 T 666 ☒ 662 --- Q 8+155 T 163 ☒ 164 --- Q 367+55 T 422 ☒ 417 --- Q 600+257 T 857 ☒ 846 --- Q 761+292 T 1053 ☒ 1051 --- Q 830+597 T 1427 ☒ 1404 --- Q 26+838 T 864 ☒ 857 --- Q 143+93 T 236 ☒ 237 --- val acc 6.700% | epoch 22 | iter 1 / 351 | time 0[s] | loss 0.84 | epoch 22 | iter 21 / 351 | time 1[s] | loss 0.80 | epoch 22 | iter 41 / 351 | time 2[s] | loss 0.81 | epoch 22 | iter 61 / 351 | time 2[s] | loss 0.81 | epoch 22 | iter 81 / 351 | time 3[s] | loss 0.81 | epoch 22 | iter 101 / 351 | time 4[s] | loss 0.80 | epoch 22 | iter 121 / 351 | time 5[s] | loss 0.79 | epoch 22 | iter 141 / 351 | time 6[s] | loss 0.81 | epoch 22 | iter 161 / 351 | time 7[s] | loss 0.80 | epoch 22 | iter 181 / 351 | time 8[s] | loss 0.84 | epoch 22 | iter 201 / 351 | time 9[s] | loss 0.79 | epoch 22 | iter 221 / 351 | time 10[s] | loss 0.81 | epoch 22 | iter 241 / 351 | time 11[s] | loss 0.80 | epoch 22 | iter 261 / 351 | time 12[s] | loss 0.78 | epoch 22 | iter 281 / 351 | time 13[s] | loss 
0.79 | epoch 22 | iter 301 / 351 | time 14[s] | loss 0.83 | epoch 22 | iter 321 / 351 | time 15[s] | loss 0.82 | epoch 22 | iter 341 / 351 | time 16[s] | loss 0.81 Q 77+85 T 162 ☒ 160 --- Q 975+164 T 1139 ☒ 1129 --- Q 582+84 T 666 ☒ 657 --- Q 8+155 T 163 ☒ 157 --- Q 367+55 T 422 ☒ 419 --- Q 600+257 T 857 ☒ 855 --- Q 761+292 T 1053 ☒ 1059 --- Q 830+597 T 1427 ☒ 1418 --- Q 26+838 T 864 ☒ 867 --- Q 143+93 T 236 ☒ 232 --- val acc 8.980% | epoch 23 | iter 1 / 351 | time 0[s] | loss 0.76 | epoch 23 | iter 21 / 351 | time 0[s] | loss 0.80 | epoch 23 | iter 41 / 351 | time 1[s] | loss 0.78 | epoch 23 | iter 61 / 351 | time 2[s] | loss 0.78 | epoch 23 | iter 81 / 351 | time 3[s] | loss 0.79 | epoch 23 | iter 101 / 351 | time 4[s] | loss 0.79 | epoch 23 | iter 121 / 351 | time 5[s] | loss 0.80 | epoch 23 | iter 141 / 351 | time 6[s] | loss 0.79 | epoch 23 | iter 161 / 351 | time 7[s] | loss 0.79 | epoch 23 | iter 181 / 351 | time 8[s] | loss 0.80 | epoch 23 | iter 201 / 351 | time 8[s] | loss 0.78 | epoch 23 | iter 221 / 351 | time 9[s] | loss 0.78 | epoch 23 | iter 241 / 351 | time 10[s] | loss 0.77 | epoch 23 | iter 261 / 351 | time 11[s] | loss 0.78 | epoch 23 | iter 281 / 351 | time 12[s] | loss 0.81 | epoch 23 | iter 301 / 351 | time 13[s] | loss 0.78 | epoch 23 | iter 321 / 351 | time 13[s] | loss 0.76 | epoch 23 | iter 341 / 351 | time 14[s] | loss 0.77 Q 77+85 T 162 ☒ 158 --- Q 975+164 T 1139 ☒ 1107 --- Q 582+84 T 666 ☒ 665 --- Q 8+155 T 163 ☒ 165 --- Q 367+55 T 422 ☒ 420 --- Q 600+257 T 857 ☒ 856 --- Q 761+292 T 1053 ☒ 1039 --- Q 830+597 T 1427 ☒ 1404 --- Q 26+838 T 864 ☒ 861 --- Q 143+93 T 236 ☒ 235 --- val acc 7.680% | epoch 24 | iter 1 / 351 | time 0[s] | loss 0.77 | epoch 24 | iter 21 / 351 | time 1[s] | loss 0.77 | epoch 24 | iter 41 / 351 | time 1[s] | loss 0.79 | epoch 24 | iter 61 / 351 | time 2[s] | loss 0.78 | epoch 24 | iter 81 / 351 | time 3[s] | loss 0.78 | epoch 24 | iter 101 / 351 | time 4[s] | loss 0.82 | epoch 24 | iter 121 / 351 | time 5[s] | loss 
0.80 | epoch 24 | iter 141 / 351 | time 6[s] | loss 0.80 | epoch 24 | iter 161 / 351 | time 7[s] | loss 0.80 | epoch 24 | iter 181 / 351 | time 8[s] | loss 0.78 | epoch 24 | iter 201 / 351 | time 9[s] | loss 0.77 | epoch 24 | iter 221 / 351 | time 9[s] | loss 0.77 | epoch 24 | iter 241 / 351 | time 10[s] | loss 0.78 | epoch 24 | iter 261 / 351 | time 11[s] | loss 0.78 | epoch 24 | iter 281 / 351 | time 12[s] | loss 0.76 | epoch 24 | iter 301 / 351 | time 13[s] | loss 0.76 | epoch 24 | iter 321 / 351 | time 14[s] | loss 0.80 | epoch 24 | iter 341 / 351 | time 15[s] | loss 0.77 Q 77+85 T 162 ☒ 159 --- Q 975+164 T 1139 ☒ 1127 --- Q 582+84 T 666 ☒ 661 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☒ 417 --- Q 600+257 T 857 ☒ 841 --- Q 761+292 T 1053 ☒ 1049 --- Q 830+597 T 1427 ☒ 1420 --- Q 26+838 T 864 ☒ 859 --- Q 143+93 T 236 ☒ 229 --- val acc 4.920% | epoch 25 | iter 1 / 351 | time 0[s] | loss 0.84 | epoch 25 | iter 21 / 351 | time 1[s] | loss 0.79 | epoch 25 | iter 41 / 351 | time 1[s] | loss 0.78 | epoch 25 | iter 61 / 351 | time 3[s] | loss 0.78 | epoch 25 | iter 81 / 351 | time 3[s] | loss 0.78 | epoch 25 | iter 101 / 351 | time 4[s] | loss 0.77 | epoch 25 | iter 121 / 351 | time 5[s] | loss 0.78 | epoch 25 | iter 141 / 351 | time 6[s] | loss 0.77 | epoch 25 | iter 161 / 351 | time 7[s] | loss 0.77 | epoch 25 | iter 181 / 351 | time 8[s] | loss 0.77 | epoch 25 | iter 201 / 351 | time 9[s] | loss 0.76 | epoch 25 | iter 221 / 351 | time 10[s] | loss 0.77 | epoch 25 | iter 241 / 351 | time 11[s] | loss 0.75 | epoch 25 | iter 261 / 351 | time 11[s] | loss 0.75 | epoch 25 | iter 281 / 351 | time 12[s] | loss 0.76 | epoch 25 | iter 301 / 351 | time 13[s] | loss 0.80 | epoch 25 | iter 321 / 351 | time 14[s] | loss 0.85 | epoch 25 | iter 341 / 351 | time 15[s] | loss 0.75 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1146 --- Q 582+84 T 666 ☒ 659 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 419 --- Q 600+257 T 857 ☒ 859 --- Q 761+292 T 1053 ☒ 1061 --- Q 830+597 T 1427 ☒ 1418 
--- Q 26+838 T 864 ☒ 862 --- Q 143+93 T 236 ☑ 236 --- val acc 10.640%
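The `val acc` value printed after each epoch appears to be an exact-match rate: a near miss such as 1146 for 1139 is marked ☒ just like a wildly wrong answer. As a rough sketch of that scoring rule, the snippet below recomputes the rate for the ten sample questions printed after epoch 25 in the log above. The variable names are illustrative and not part of the original code, and the result covers only these ten displayed rows, not the full test set behind the reported 10.640%.

```python
# (target, prediction) pairs copied from the ten sample rows shown after epoch 25.
samples = [
    ("162", "162"), ("1139", "1146"), ("666", "659"), ("163", "162"),
    ("422", "419"), ("857", "859"), ("1053", "1061"), ("1427", "1418"),
    ("864", "862"), ("236", "236"),
]

# A prediction scores 1 only when the whole string matches the target.
correct_num = sum(1 for target, guess in samples if target == guess)
accuracy = correct_num / len(samples)

print('sample acc %.3f%%' % (accuracy * 100))
```

Because partial credit is never given, the accuracy can stay low even while the loss keeps falling and most predictions are off by only a few units.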
# coding: utf-8
import sys
sys.path.append('..')
import numpy as np
import matplotlib.pyplot as plt
from dataset import sequence
from common.optimizer import Adam
from common.trainer import Trainer
from common.util import eval_seq2seq
from seq2seq import Seq2seq
from peeky_seq2seq import PeekySeq2seq
# Load the dataset
(x_train, t_train), (x_test, t_test) = sequence.load_data('addition.txt')
char_to_id, id_to_char = sequence.get_vocab()
# Reverse input? =================================================
is_reverse = True
if is_reverse:
    x_train, x_test = x_train[:, ::-1], x_test[:, ::-1]
# ================================================================
# Hyperparameter settings
vocab_size = len(char_to_id)
wordvec_size = 16
hidden_size = 128
batch_size = 128
max_epoch = 25
max_grad = 5.0
# Normal or Peeky? ==============================================
model = Seq2seq(vocab_size, wordvec_size, hidden_size)
# model = PeekySeq2seq(vocab_size, wordvec_size, hidden_size)
# ================================================================
optimizer = Adam()
trainer = Trainer(model, optimizer)
acc_list_rev = []
for epoch in range(max_epoch):
    # Train for one epoch, then evaluate on the whole test set
    trainer.fit(x_train, t_train, max_epoch=1,
                batch_size=batch_size, max_grad=max_grad)
    correct_num = 0
    for i in range(len(x_test)):
        question, correct = x_test[[i]], t_test[[i]]
        verbose = i < 10  # print only the first 10 questions
        correct_num += eval_seq2seq(model, question, correct,
                                    id_to_char, verbose, is_reverse)
    acc = float(correct_num) / len(x_test)
    acc_list_rev.append(acc)
    print('val acc %.3f%%' % (acc * 100))
# Plot the accuracy curve
x_rev = np.arange(len(acc_list_rev))
plt.plot(x_rev, acc_list_rev, marker='o')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.ylim(0, 1.0)
plt.show()
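The only change from the previous script is the `[:, ::-1]` slice, which reverses every row of the input along the time axis. The following is a minimal sketch of what that slice does, using a made-up batch of character IDs (the array values are hypothetical, with 0 standing in for the padding ID; they are not taken from `addition.txt`).

```python
import numpy as np

# Hypothetical toy batch: two padded questions encoded as character IDs.
x = np.array([[1, 2, 3, 0],
              [4, 5, 0, 0]])

# Reverse each row (the sequence axis); the batch axis is left alone.
x_reversed = x[:, ::-1]

print(x_reversed)
```

Note that the padding is reversed along with the characters, so it ends up at the front of each question. Only the inputs are reversed; the target sequences `t_train` and `t_test` stay in their original order.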
| epoch 1 | iter 1 / 351 | time 0[s] | loss 2.56 | epoch 1 | iter 21 / 351 | time 0[s] | loss 2.52 | epoch 1 | iter 41 / 351 | time 1[s] | loss 2.17 | epoch 1 | iter 61 / 351 | time 2[s] | loss 1.96 | epoch 1 | iter 81 / 351 | time 3[s] | loss 1.91 | epoch 1 | iter 101 / 351 | time 3[s] | loss 1.87 | epoch 1 | iter 121 / 351 | time 4[s] | loss 1.86 | epoch 1 | iter 141 / 351 | time 5[s] | loss 1.84 | epoch 1 | iter 161 / 351 | time 5[s] | loss 1.80 | epoch 1 | iter 181 / 351 | time 6[s] | loss 1.78 | epoch 1 | iter 201 / 351 | time 7[s] | loss 1.77 | epoch 1 | iter 221 / 351 | time 8[s] | loss 1.77 | epoch 1 | iter 241 / 351 | time 9[s] | loss 1.76 | epoch 1 | iter 261 / 351 | time 9[s] | loss 1.75 | epoch 1 | iter 281 / 351 | time 10[s] | loss 1.74 | epoch 1 | iter 301 / 351 | time 11[s] | loss 1.74 | epoch 1 | iter 321 / 351 | time 11[s] | loss 1.74 | epoch 1 | iter 341 / 351 | time 12[s] | loss 1.73 Q 77+85 T 162 ☒ 100 --- Q 975+164 T 1139 ☒ 1000 --- Q 582+84 T 666 ☒ 1001 --- Q 8+155 T 163 ☒ 100 --- Q 367+55 T 422 ☒ 1001 --- Q 600+257 T 857 ☒ 1000 --- Q 761+292 T 1053 ☒ 1000 --- Q 830+597 T 1427 ☒ 1000 --- Q 26+838 T 864 ☒ 1001 --- Q 143+93 T 236 ☒ 703 --- val acc 0.120% | epoch 2 | iter 1 / 351 | time 0[s] | loss 1.73 | epoch 2 | iter 21 / 351 | time 0[s] | loss 1.72 | epoch 2 | iter 41 / 351 | time 1[s] | loss 1.72 | epoch 2 | iter 61 / 351 | time 2[s] | loss 1.72 | epoch 2 | iter 81 / 351 | time 2[s] | loss 1.70 | epoch 2 | iter 101 / 351 | time 3[s] | loss 1.70 | epoch 2 | iter 121 / 351 | time 4[s] | loss 1.69 | epoch 2 | iter 141 / 351 | time 4[s] | loss 1.68 | epoch 2 | iter 161 / 351 | time 5[s] | loss 1.67 | epoch 2 | iter 181 / 351 | time 6[s] | loss 1.66 | epoch 2 | iter 201 / 351 | time 6[s] | loss 1.66 | epoch 2 | iter 221 / 351 | time 7[s] | loss 1.65 | epoch 2 | iter 241 / 351 | time 8[s] | loss 1.63 | epoch 2 | iter 261 / 351 | time 8[s] | loss 1.62 | epoch 2 | iter 281 / 351 | time 9[s] | loss 1.61 | epoch 2 | iter 301 / 351 | time 10[s] | loss 
1.60 | epoch 2 | iter 321 / 351 | time 11[s] | loss 1.58 | epoch 2 | iter 341 / 351 | time 11[s] | loss 1.56 Q 77+85 T 162 ☒ 100 --- Q 975+164 T 1139 ☒ 1000 --- Q 582+84 T 666 ☒ 690 --- Q 8+155 T 163 ☒ 1000 --- Q 367+55 T 422 ☒ 470 --- Q 600+257 T 857 ☒ 700 --- Q 761+292 T 1053 ☒ 1000 --- Q 830+597 T 1427 ☒ 1444 --- Q 26+838 T 864 ☒ 700 --- Q 143+93 T 236 ☒ 370 --- val acc 0.400% | epoch 3 | iter 1 / 351 | time 0[s] | loss 1.52 | epoch 3 | iter 21 / 351 | time 0[s] | loss 1.53 | epoch 3 | iter 41 / 351 | time 1[s] | loss 1.51 | epoch 3 | iter 61 / 351 | time 2[s] | loss 1.49 | epoch 3 | iter 81 / 351 | time 2[s] | loss 1.47 | epoch 3 | iter 101 / 351 | time 3[s] | loss 1.45 | epoch 3 | iter 121 / 351 | time 4[s] | loss 1.44 | epoch 3 | iter 141 / 351 | time 5[s] | loss 1.42 | epoch 3 | iter 161 / 351 | time 5[s] | loss 1.40 | epoch 3 | iter 181 / 351 | time 6[s] | loss 1.38 | epoch 3 | iter 201 / 351 | time 7[s] | loss 1.37 | epoch 3 | iter 221 / 351 | time 7[s] | loss 1.35 | epoch 3 | iter 241 / 351 | time 8[s] | loss 1.33 | epoch 3 | iter 261 / 351 | time 9[s] | loss 1.32 | epoch 3 | iter 281 / 351 | time 10[s] | loss 1.30 | epoch 3 | iter 301 / 351 | time 10[s] | loss 1.29 | epoch 3 | iter 321 / 351 | time 11[s] | loss 1.28 | epoch 3 | iter 341 / 351 | time 12[s] | loss 1.27 Q 77+85 T 162 ☒ 158 --- Q 975+164 T 1139 ☒ 1148 --- Q 582+84 T 666 ☒ 662 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 382 --- Q 600+257 T 857 ☒ 818 --- Q 761+292 T 1053 ☒ 1008 --- Q 830+597 T 1427 ☒ 1434 --- Q 26+838 T 864 ☒ 838 --- Q 143+93 T 236 ☒ 202 --- val acc 1.940% | epoch 4 | iter 1 / 351 | time 0[s] | loss 1.26 | epoch 4 | iter 21 / 351 | time 0[s] | loss 1.25 | epoch 4 | iter 41 / 351 | time 1[s] | loss 1.23 | epoch 4 | iter 61 / 351 | time 2[s] | loss 1.22 | epoch 4 | iter 81 / 351 | time 3[s] | loss 1.20 | epoch 4 | iter 101 / 351 | time 3[s] | loss 1.19 | epoch 4 | iter 121 / 351 | time 4[s] | loss 1.18 | epoch 4 | iter 141 / 351 | time 5[s] | loss 1.17 | epoch 4 | iter 161 / 
351 | time 5[s] | loss 1.15 | epoch 4 | iter 181 / 351 | time 6[s] | loss 1.13 | epoch 4 | iter 201 / 351 | time 7[s] | loss 1.12 | epoch 4 | iter 221 / 351 | time 8[s] | loss 1.11 | epoch 4 | iter 241 / 351 | time 8[s] | loss 1.09 | epoch 4 | iter 261 / 351 | time 9[s] | loss 1.08 | epoch 4 | iter 281 / 351 | time 10[s] | loss 1.07 | epoch 4 | iter 301 / 351 | time 11[s] | loss 1.07 | epoch 4 | iter 321 / 351 | time 11[s] | loss 1.05 | epoch 4 | iter 341 / 351 | time 12[s] | loss 1.03 Q 77+85 T 162 ☒ 166 --- Q 975+164 T 1139 ☒ 1196 --- Q 582+84 T 666 ☒ 668 --- Q 8+155 T 163 ☒ 166 --- Q 367+55 T 422 ☒ 419 --- Q 600+257 T 857 ☒ 896 --- Q 761+292 T 1053 ☒ 1010 --- Q 830+597 T 1427 ☒ 1496 --- Q 26+838 T 864 ☒ 868 --- Q 143+93 T 236 ☒ 239 --- val acc 5.780% | epoch 5 | iter 1 / 351 | time 0[s] | loss 1.01 | epoch 5 | iter 21 / 351 | time 0[s] | loss 1.01 | epoch 5 | iter 41 / 351 | time 1[s] | loss 1.00 | epoch 5 | iter 61 / 351 | time 2[s] | loss 0.99 | epoch 5 | iter 81 / 351 | time 2[s] | loss 0.97 | epoch 5 | iter 101 / 351 | time 3[s] | loss 0.95 | epoch 5 | iter 121 / 351 | time 4[s] | loss 0.95 | epoch 5 | iter 141 / 351 | time 5[s] | loss 0.94 | epoch 5 | iter 161 / 351 | time 5[s] | loss 0.93 | epoch 5 | iter 181 / 351 | time 6[s] | loss 0.93 | epoch 5 | iter 201 / 351 | time 7[s] | loss 0.91 | epoch 5 | iter 221 / 351 | time 7[s] | loss 0.89 | epoch 5 | iter 241 / 351 | time 8[s] | loss 0.89 | epoch 5 | iter 261 / 351 | time 9[s] | loss 0.88 | epoch 5 | iter 281 / 351 | time 10[s] | loss 0.87 | epoch 5 | iter 301 / 351 | time 10[s] | loss 0.86 | epoch 5 | iter 321 / 351 | time 11[s] | loss 0.85 | epoch 5 | iter 341 / 351 | time 12[s] | loss 0.84 Q 77+85 T 162 ☒ 161 --- Q 975+164 T 1139 ☒ 1192 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 166 --- Q 367+55 T 422 ☒ 421 --- Q 600+257 T 857 ☒ 860 --- Q 761+292 T 1053 ☒ 1066 --- Q 830+597 T 1427 ☒ 1414 --- Q 26+838 T 864 ☒ 865 --- Q 143+93 T 236 ☒ 232 --- val acc 12.460% | epoch 6 | iter 1 / 351 | time 0[s] | loss 
0.86 | epoch 6 | iter 21 / 351 | time 0[s] | loss 0.82 | epoch 6 | iter 41 / 351 | time 1[s] | loss 0.82 | epoch 6 | iter 61 / 351 | time 2[s] | loss 0.81 | epoch 6 | iter 81 / 351 | time 3[s] | loss 0.80 | epoch 6 | iter 101 / 351 | time 3[s] | loss 0.80 | epoch 6 | iter 121 / 351 | time 4[s] | loss 0.79 | epoch 6 | iter 141 / 351 | time 5[s] | loss 0.78 | epoch 6 | iter 161 / 351 | time 5[s] | loss 0.77 | epoch 6 | iter 181 / 351 | time 6[s] | loss 0.77 | epoch 6 | iter 201 / 351 | time 7[s] | loss 0.78 | epoch 6 | iter 221 / 351 | time 7[s] | loss 0.76 | epoch 6 | iter 241 / 351 | time 8[s] | loss 0.75 | epoch 6 | iter 261 / 351 | time 9[s] | loss 0.74 | epoch 6 | iter 281 / 351 | time 10[s] | loss 0.73 | epoch 6 | iter 301 / 351 | time 10[s] | loss 0.73 | epoch 6 | iter 321 / 351 | time 11[s] | loss 0.72 | epoch 6 | iter 341 / 351 | time 12[s] | loss 0.72 Q 77+85 T 162 ☒ 161 --- Q 975+164 T 1139 ☒ 1141 --- Q 582+84 T 666 ☒ 661 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 851 --- Q 761+292 T 1053 ☒ 1061 --- Q 830+597 T 1427 ☒ 1391 --- Q 26+838 T 864 ☒ 866 --- Q 143+93 T 236 ☒ 234 --- val acc 14.260% | epoch 7 | iter 1 / 351 | time 0[s] | loss 0.71 | epoch 7 | iter 21 / 351 | time 0[s] | loss 0.71 | epoch 7 | iter 41 / 351 | time 1[s] | loss 0.70 | epoch 7 | iter 61 / 351 | time 2[s] | loss 0.70 | epoch 7 | iter 81 / 351 | time 3[s] | loss 0.68 | epoch 7 | iter 101 / 351 | time 3[s] | loss 0.68 | epoch 7 | iter 121 / 351 | time 4[s] | loss 0.67 | epoch 7 | iter 141 / 351 | time 5[s] | loss 0.67 | epoch 7 | iter 161 / 351 | time 5[s] | loss 0.67 | epoch 7 | iter 181 / 351 | time 6[s] | loss 0.66 | epoch 7 | iter 201 / 351 | time 7[s] | loss 0.66 | epoch 7 | iter 221 / 351 | time 7[s] | loss 0.66 | epoch 7 | iter 241 / 351 | time 8[s] | loss 0.64 | epoch 7 | iter 261 / 351 | time 9[s] | loss 0.65 | epoch 7 | iter 281 / 351 | time 10[s] | loss 0.64 | epoch 7 | iter 301 / 351 | time 10[s] | loss 0.63 | epoch 7 | iter 321 / 351 | time 11[s] | 
loss 0.63 | epoch 7 | iter 341 / 351 | time 12[s] | loss 0.62 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1142 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 859 --- Q 761+292 T 1053 ☒ 1144 --- Q 830+597 T 1427 ☒ 1431 --- Q 26+838 T 864 ☒ 866 --- Q 143+93 T 236 ☒ 239 --- val acc 17.500% | epoch 8 | iter 1 / 351 | time 0[s] | loss 0.66 | epoch 8 | iter 21 / 351 | time 0[s] | loss 0.61 | epoch 8 | iter 41 / 351 | time 1[s] | loss 0.62 | epoch 8 | iter 61 / 351 | time 2[s] | loss 0.61 | epoch 8 | iter 81 / 351 | time 3[s] | loss 0.61 | epoch 8 | iter 101 / 351 | time 3[s] | loss 0.61 | epoch 8 | iter 121 / 351 | time 4[s] | loss 0.60 | epoch 8 | iter 141 / 351 | time 5[s] | loss 0.60 | epoch 8 | iter 161 / 351 | time 5[s] | loss 0.59 | epoch 8 | iter 181 / 351 | time 6[s] | loss 0.58 | epoch 8 | iter 201 / 351 | time 7[s] | loss 0.59 | epoch 8 | iter 221 / 351 | time 7[s] | loss 0.60 | epoch 8 | iter 241 / 351 | time 8[s] | loss 0.59 | epoch 8 | iter 261 / 351 | time 9[s] | loss 0.58 | epoch 8 | iter 281 / 351 | time 10[s] | loss 0.59 | epoch 8 | iter 301 / 351 | time 10[s] | loss 0.58 | epoch 8 | iter 321 / 351 | time 11[s] | loss 0.57 | epoch 8 | iter 341 / 351 | time 12[s] | loss 0.57 Q 77+85 T 162 ☒ 163 --- Q 975+164 T 1139 ☒ 1134 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 423 --- Q 600+257 T 857 ☒ 759 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☒ 1431 --- Q 26+838 T 864 ☒ 866 --- Q 143+93 T 236 ☒ 238 --- val acc 23.080% | epoch 9 | iter 1 / 351 | time 0[s] | loss 0.55 | epoch 9 | iter 21 / 351 | time 0[s] | loss 0.56 | epoch 9 | iter 41 / 351 | time 1[s] | loss 0.56 | epoch 9 | iter 61 / 351 | time 2[s] | loss 0.55 | epoch 9 | iter 81 / 351 | time 2[s] | loss 0.54 | epoch 9 | iter 101 / 351 | time 3[s] | loss 0.55 | epoch 9 | iter 121 / 351 | time 4[s] | loss 0.55 | epoch 9 | iter 141 / 351 | time 4[s] | loss 0.54 | epoch 9 | iter 161 / 351 | time 5[s] | loss 0.55 | epoch 9 | iter 181 
/ 351 | time 6[s] | loss 0.53 | epoch 9 | iter 201 / 351 | time 7[s] | loss 0.54 | epoch 9 | iter 221 / 351 | time 8[s] | loss 0.54 | epoch 9 | iter 241 / 351 | time 8[s] | loss 0.53 | epoch 9 | iter 261 / 351 | time 9[s] | loss 0.53 | epoch 9 | iter 281 / 351 | time 10[s] | loss 0.54 | epoch 9 | iter 301 / 351 | time 10[s] | loss 0.54 | epoch 9 | iter 321 / 351 | time 11[s] | loss 0.53 | epoch 9 | iter 341 / 351 | time 12[s] | loss 0.53 Q 77+85 T 162 ☒ 158 --- Q 975+164 T 1139 ☒ 1142 --- Q 582+84 T 666 ☒ 664 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 854 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☒ 1428 --- Q 26+838 T 864 ☒ 862 --- Q 143+93 T 236 ☒ 238 --- val acc 26.540% | epoch 10 | iter 1 / 351 | time 0[s] | loss 0.50 | epoch 10 | iter 21 / 351 | time 0[s] | loss 0.51 | epoch 10 | iter 41 / 351 | time 1[s] | loss 0.52 | epoch 10 | iter 61 / 351 | time 2[s] | loss 0.55 | epoch 10 | iter 81 / 351 | time 2[s] | loss 0.52 | epoch 10 | iter 101 / 351 | time 3[s] | loss 0.51 | epoch 10 | iter 121 / 351 | time 4[s] | loss 0.50 | epoch 10 | iter 141 / 351 | time 5[s] | loss 0.51 | epoch 10 | iter 161 / 351 | time 5[s] | loss 0.52 | epoch 10 | iter 181 / 351 | time 6[s] | loss 0.53 | epoch 10 | iter 201 / 351 | time 7[s] | loss 0.50 | epoch 10 | iter 221 / 351 | time 7[s] | loss 0.50 | epoch 10 | iter 241 / 351 | time 8[s] | loss 0.50 | epoch 10 | iter 261 / 351 | time 9[s] | loss 0.50 | epoch 10 | iter 281 / 351 | time 10[s] | loss 0.49 | epoch 10 | iter 301 / 351 | time 10[s] | loss 0.48 | epoch 10 | iter 321 / 351 | time 11[s] | loss 0.48 | epoch 10 | iter 341 / 351 | time 12[s] | loss 0.48 Q 77+85 T 162 ☒ 163 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☒ 664 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 421 --- Q 600+257 T 857 ☒ 859 --- Q 761+292 T 1053 ☒ 1054 --- Q 830+597 T 1427 ☒ 1431 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 235 --- val acc 29.820% | epoch 11 | iter 1 / 351 | time 0[s] | loss 0.47 | epoch 11 | iter 21 / 
351 | time 0[s] | loss 0.48 | epoch 11 | iter 41 / 351 | time 1[s] | loss 0.48 | epoch 11 | iter 61 / 351 | time 2[s] | loss 0.48 | epoch 11 | iter 81 / 351 | time 2[s] | loss 0.47 | epoch 11 | iter 101 / 351 | time 3[s] | loss 0.47 | epoch 11 | iter 121 / 351 | time 4[s] | loss 0.47 | epoch 11 | iter 141 / 351 | time 4[s] | loss 0.47 | epoch 11 | iter 161 / 351 | time 5[s] | loss 0.48 | epoch 11 | iter 181 / 351 | time 6[s] | loss 0.48 | epoch 11 | iter 201 / 351 | time 6[s] | loss 0.47 | epoch 11 | iter 221 / 351 | time 7[s] | loss 0.47 | epoch 11 | iter 241 / 351 | time 8[s] | loss 0.46 | epoch 11 | iter 261 / 351 | time 8[s] | loss 0.46 | epoch 11 | iter 281 / 351 | time 9[s] | loss 0.46 | epoch 11 | iter 301 / 351 | time 10[s] | loss 0.48 | epoch 11 | iter 321 / 351 | time 11[s] | loss 0.45 | epoch 11 | iter 341 / 351 | time 11[s] | loss 0.45 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1140 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 421 --- Q 600+257 T 857 ☒ 859 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☒ 1426 --- Q 26+838 T 864 ☒ 866 --- Q 143+93 T 236 ☒ 238 --- val acc 28.480% | epoch 12 | iter 1 / 351 | time 0[s] | loss 0.46 | epoch 12 | iter 21 / 351 | time 0[s] | loss 0.45 | epoch 12 | iter 41 / 351 | time 1[s] | loss 0.45 | epoch 12 | iter 61 / 351 | time 2[s] | loss 0.46 | epoch 12 | iter 81 / 351 | time 2[s] | loss 0.45 | epoch 12 | iter 101 / 351 | time 3[s] | loss 0.46 | epoch 12 | iter 121 / 351 | time 4[s] | loss 0.46 | epoch 12 | iter 141 / 351 | time 4[s] | loss 0.46 | epoch 12 | iter 161 / 351 | time 5[s] | loss 0.45 | epoch 12 | iter 181 / 351 | time 6[s] | loss 0.44 | epoch 12 | iter 201 / 351 | time 7[s] | loss 0.45 | epoch 12 | iter 221 / 351 | time 7[s] | loss 0.44 | epoch 12 | iter 241 / 351 | time 8[s] | loss 0.43 | epoch 12 | iter 261 / 351 | time 9[s] | loss 0.43 | epoch 12 | iter 281 / 351 | time 9[s] | loss 0.44 | epoch 12 | iter 301 / 351 | time 10[s] | loss 0.45 | epoch 12 | iter 321 / 351 | time 
11[s] | loss 0.44 | epoch 12 | iter 341 / 351 | time 12[s] | loss 0.43 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☒ 667 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☒ 1051 --- Q 830+597 T 1427 ☒ 1428 --- Q 26+838 T 864 ☒ 862 --- Q 143+93 T 236 ☒ 237 --- val acc 36.640% | epoch 13 | iter 1 / 351 | time 0[s] | loss 0.41 | epoch 13 | iter 21 / 351 | time 0[s] | loss 0.43 | epoch 13 | iter 41 / 351 | time 1[s] | loss 0.42 | epoch 13 | iter 61 / 351 | time 2[s] | loss 0.42 | epoch 13 | iter 81 / 351 | time 2[s] | loss 0.42 | epoch 13 | iter 101 / 351 | time 3[s] | loss 0.44 | epoch 13 | iter 121 / 351 | time 4[s] | loss 0.43 | epoch 13 | iter 141 / 351 | time 4[s] | loss 0.41 | epoch 13 | iter 161 / 351 | time 5[s] | loss 0.42 | epoch 13 | iter 181 / 351 | time 6[s] | loss 0.42 | epoch 13 | iter 201 / 351 | time 7[s] | loss 0.42 | epoch 13 | iter 221 / 351 | time 7[s] | loss 0.43 | epoch 13 | iter 241 / 351 | time 8[s] | loss 0.43 | epoch 13 | iter 261 / 351 | time 9[s] | loss 0.41 | epoch 13 | iter 281 / 351 | time 9[s] | loss 0.42 | epoch 13 | iter 301 / 351 | time 10[s] | loss 0.41 | epoch 13 | iter 321 / 351 | time 11[s] | loss 0.43 | epoch 13 | iter 341 / 351 | time 11[s] | loss 0.40 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1140 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 424 --- Q 600+257 T 857 ☒ 856 --- Q 761+292 T 1053 ☒ 1054 --- Q 830+597 T 1427 ☒ 1429 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 239 --- val acc 39.420% | epoch 14 | iter 1 / 351 | time 0[s] | loss 0.41 | epoch 14 | iter 21 / 351 | time 0[s] | loss 0.41 | epoch 14 | iter 41 / 351 | time 1[s] | loss 0.41 | epoch 14 | iter 61 / 351 | time 2[s] | loss 0.40 | epoch 14 | iter 81 / 351 | time 2[s] | loss 0.40 | epoch 14 | iter 101 / 351 | time 3[s] | loss 0.41 | epoch 14 | iter 121 / 351 | time 4[s] | loss 0.39 | epoch 14 | iter 141 / 351 | time 4[s] | loss 0.39 | epoch 14 | iter 161 / 351 | time 
5[s] | loss 0.38 | epoch 14 | iter 181 / 351 | time 6[s] | loss 0.38 | epoch 14 | iter 201 / 351 | time 7[s] | loss 0.38 | epoch 14 | iter 221 / 351 | time 7[s] | loss 0.38 | epoch 14 | iter 241 / 351 | time 8[s] | loss 0.39 | epoch 14 | iter 261 / 351 | time 9[s] | loss 0.40 | epoch 14 | iter 281 / 351 | time 9[s] | loss 0.41 | epoch 14 | iter 301 / 351 | time 10[s] | loss 0.39 | epoch 14 | iter 321 / 351 | time 11[s] | loss 0.39 | epoch 14 | iter 341 / 351 | time 11[s] | loss 0.39 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1137 --- Q 582+84 T 666 ☒ 667 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 858 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☒ 1426 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 235 --- val acc 36.680% | epoch 15 | iter 1 / 351 | time 0[s] | loss 0.38 | epoch 15 | iter 21 / 351 | time 0[s] | loss 0.39 | epoch 15 | iter 41 / 351 | time 1[s] | loss 0.39 | epoch 15 | iter 61 / 351 | time 2[s] | loss 0.38 | epoch 15 | iter 81 / 351 | time 2[s] | loss 0.38 | epoch 15 | iter 101 / 351 | time 3[s] | loss 0.38 | epoch 15 | iter 121 / 351 | time 4[s] | loss 0.38 | epoch 15 | iter 141 / 351 | time 5[s] | loss 0.38 | epoch 15 | iter 161 / 351 | time 5[s] | loss 0.38 | epoch 15 | iter 181 / 351 | time 6[s] | loss 0.38 | epoch 15 | iter 201 / 351 | time 7[s] | loss 0.38 | epoch 15 | iter 221 / 351 | time 8[s] | loss 0.39 | epoch 15 | iter 241 / 351 | time 8[s] | loss 0.38 | epoch 15 | iter 261 / 351 | time 9[s] | loss 0.37 | epoch 15 | iter 281 / 351 | time 10[s] | loss 0.37 | epoch 15 | iter 301 / 351 | time 11[s] | loss 0.39 | epoch 15 | iter 321 / 351 | time 11[s] | loss 0.39 | epoch 15 | iter 341 / 351 | time 12[s] | loss 0.37 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1137 --- Q 582+84 T 666 ☒ 667 --- Q 8+155 T 163 ☒ 164 --- Q 367+55 T 422 ☒ 420 --- Q 600+257 T 857 ☒ 856 --- Q 761+292 T 1053 ☒ 1052 --- Q 830+597 T 1427 ☒ 1431 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 237 --- val acc 41.100% | epoch 16 | iter 1 / 351 | 
time 0[s] | loss 0.36 | epoch 16 | iter 21 / 351 | time 0[s] | loss 0.36 | epoch 16 | iter 41 / 351 | time 1[s] | loss 0.36 | epoch 16 | iter 61 / 351 | time 2[s] | loss 0.36 | epoch 16 | iter 81 / 351 | time 2[s] | loss 0.37 | epoch 16 | iter 101 / 351 | time 3[s] | loss 0.36 | epoch 16 | iter 121 / 351 | time 4[s] | loss 0.37 | epoch 16 | iter 141 / 351 | time 4[s] | loss 0.36 | epoch 16 | iter 161 / 351 | time 5[s] | loss 0.37 | epoch 16 | iter 181 / 351 | time 6[s] | loss 0.36 | epoch 16 | iter 201 / 351 | time 7[s] | loss 0.38 | epoch 16 | iter 221 / 351 | time 7[s] | loss 0.38 | epoch 16 | iter 241 / 351 | time 8[s] | loss 0.36 | epoch 16 | iter 261 / 351 | time 9[s] | loss 0.35 | epoch 16 | iter 281 / 351 | time 9[s] | loss 0.35 | epoch 16 | iter 301 / 351 | time 10[s] | loss 0.35 | epoch 16 | iter 321 / 351 | time 11[s] | loss 0.35 | epoch 16 | iter 341 / 351 | time 11[s] | loss 0.37 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1142 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 858 --- Q 761+292 T 1053 ☒ 1054 --- Q 830+597 T 1427 ☒ 1430 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 237 --- val acc 42.700% | epoch 17 | iter 1 / 351 | time 0[s] | loss 0.34 | epoch 17 | iter 21 / 351 | time 0[s] | loss 0.36 | epoch 17 | iter 41 / 351 | time 1[s] | loss 0.36 | epoch 17 | iter 61 / 351 | time 2[s] | loss 0.35 | epoch 17 | iter 81 / 351 | time 2[s] | loss 0.36 | epoch 17 | iter 101 / 351 | time 3[s] | loss 0.34 | epoch 17 | iter 121 / 351 | time 4[s] | loss 0.34 | epoch 17 | iter 141 / 351 | time 5[s] | loss 0.34 | epoch 17 | iter 161 / 351 | time 5[s] | loss 0.34 | epoch 17 | iter 181 / 351 | time 6[s] | loss 0.35 | epoch 17 | iter 201 / 351 | time 7[s] | loss 0.35 | epoch 17 | iter 221 / 351 | time 7[s] | loss 0.34 | epoch 17 | iter 241 / 351 | time 8[s] | loss 0.35 | epoch 17 | iter 261 / 351 | time 9[s] | loss 0.36 | epoch 17 | iter 281 / 351 | time 9[s] | loss 0.37 | epoch 17 | iter 301 / 351 | time 10[s] | 
loss 0.37 | epoch 17 | iter 321 / 351 | time 11[s] | loss 0.37 | epoch 17 | iter 341 / 351 | time 11[s] | loss 0.36 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1138 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 856 --- Q 761+292 T 1053 ☒ 1051 --- Q 830+597 T 1427 ☒ 1429 --- Q 26+838 T 864 ☒ 865 --- Q 143+93 T 236 ☑ 236 --- val acc 42.860% | epoch 18 | iter 1 / 351 | time 0[s] | loss 0.34 | epoch 18 | iter 21 / 351 | time 0[s] | loss 0.36 | epoch 18 | iter 41 / 351 | time 1[s] | loss 0.35 | epoch 18 | iter 61 / 351 | time 2[s] | loss 0.35 | epoch 18 | iter 81 / 351 | time 2[s] | loss 0.34 | epoch 18 | iter 101 / 351 | time 3[s] | loss 0.34 | epoch 18 | iter 121 / 351 | time 4[s] | loss 0.33 | epoch 18 | iter 141 / 351 | time 4[s] | loss 0.33 | epoch 18 | iter 161 / 351 | time 5[s] | loss 0.34 | epoch 18 | iter 181 / 351 | time 6[s] | loss 0.33 | epoch 18 | iter 201 / 351 | time 7[s] | loss 0.33 | epoch 18 | iter 221 / 351 | time 7[s] | loss 0.33 | epoch 18 | iter 241 / 351 | time 8[s] | loss 0.33 | epoch 18 | iter 261 / 351 | time 9[s] | loss 0.33 | epoch 18 | iter 281 / 351 | time 9[s] | loss 0.33 | epoch 18 | iter 301 / 351 | time 10[s] | loss 0.33 | epoch 18 | iter 321 / 351 | time 11[s] | loss 0.35 | epoch 18 | iter 341 / 351 | time 11[s] | loss 0.35 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1138 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 423 --- Q 600+257 T 857 ☒ 856 --- Q 761+292 T 1053 ☒ 1052 --- Q 830+597 T 1427 ☒ 1425 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 237 --- val acc 40.640% | epoch 19 | iter 1 / 351 | time 0[s] | loss 0.34 | epoch 19 | iter 21 / 351 | time 0[s] | loss 0.34 | epoch 19 | iter 41 / 351 | time 1[s] | loss 0.34 | epoch 19 | iter 61 / 351 | time 2[s] | loss 0.34 | epoch 19 | iter 81 / 351 | time 2[s] | loss 0.34 | epoch 19 | iter 101 / 351 | time 3[s] | loss 0.32 | epoch 19 | iter 121 / 351 | time 4[s] | loss 0.31 | epoch 19 | iter 141 / 351 | time 4[s] | 
loss 0.33 | epoch 19 | iter 161 / 351 | time 5[s] | loss 0.31 | epoch 19 | iter 181 / 351 | time 6[s] | loss 0.30 | epoch 19 | iter 201 / 351 | time 7[s] | loss 0.31 | epoch 19 | iter 221 / 351 | time 7[s] | loss 0.31 | epoch 19 | iter 241 / 351 | time 8[s] | loss 0.32 | epoch 19 | iter 261 / 351 | time 9[s] | loss 0.33 | epoch 19 | iter 281 / 351 | time 10[s] | loss 0.33 | epoch 19 | iter 301 / 351 | time 10[s] | loss 0.32 | epoch 19 | iter 321 / 351 | time 11[s] | loss 0.33 | epoch 19 | iter 341 / 351 | time 12[s] | loss 0.33 Q 77+85 T 162 ☒ 161 --- Q 975+164 T 1139 ☒ 1140 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 858 --- Q 761+292 T 1053 ☒ 1052 --- Q 830+597 T 1427 ☒ 1430 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 238 --- val acc 47.560% | epoch 20 | iter 1 / 351 | time 0[s] | loss 0.32 | epoch 20 | iter 21 / 351 | time 0[s] | loss 0.35 | epoch 20 | iter 41 / 351 | time 1[s] | loss 0.33 | epoch 20 | iter 61 / 351 | time 2[s] | loss 0.33 | epoch 20 | iter 81 / 351 | time 2[s] | loss 0.31 | epoch 20 | iter 101 / 351 | time 3[s] | loss 0.31 | epoch 20 | iter 121 / 351 | time 4[s] | loss 0.33 | epoch 20 | iter 141 / 351 | time 4[s] | loss 0.32 | epoch 20 | iter 161 / 351 | time 5[s] | loss 0.33 | epoch 20 | iter 181 / 351 | time 6[s] | loss 0.31 | epoch 20 | iter 201 / 351 | time 7[s] | loss 0.30 | epoch 20 | iter 221 / 351 | time 7[s] | loss 0.32 | epoch 20 | iter 241 / 351 | time 8[s] | loss 0.33 | epoch 20 | iter 261 / 351 | time 9[s] | loss 0.35 | epoch 20 | iter 281 / 351 | time 9[s] | loss 0.36 | epoch 20 | iter 301 / 351 | time 10[s] | loss 0.34 | epoch 20 | iter 321 / 351 | time 11[s] | loss 0.32 | epoch 20 | iter 341 / 351 | time 12[s] | loss 0.32 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☒ 665 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 421 --- Q 600+257 T 857 ☒ 859 --- Q 761+292 T 1053 ☒ 1051 --- Q 830+597 T 1427 ☒ 1428 --- Q 26+838 T 864 ☒ 862 --- Q 143+93 T 236 ☑ 236 --- 
val acc 50.020% | epoch 21 | iter 1 / 351 | time 0[s] | loss 0.31 | epoch 21 | iter 21 / 351 | time 0[s] | loss 0.31 | epoch 21 | iter 41 / 351 | time 1[s] | loss 0.30 | epoch 21 | iter 61 / 351 | time 2[s] | loss 0.30 | epoch 21 | iter 81 / 351 | time 2[s] | loss 0.32 | epoch 21 | iter 101 / 351 | time 3[s] | loss 0.32 | epoch 21 | iter 121 / 351 | time 4[s] | loss 0.31 | epoch 21 | iter 141 / 351 | time 5[s] | loss 0.30 | epoch 21 | iter 161 / 351 | time 6[s] | loss 0.31 | epoch 21 | iter 181 / 351 | time 6[s] | loss 0.31 | epoch 21 | iter 201 / 351 | time 7[s] | loss 0.31 | epoch 21 | iter 221 / 351 | time 8[s] | loss 0.32 | epoch 21 | iter 241 / 351 | time 8[s] | loss 0.31 | epoch 21 | iter 261 / 351 | time 9[s] | loss 0.29 | epoch 21 | iter 281 / 351 | time 10[s] | loss 0.30 | epoch 21 | iter 301 / 351 | time 10[s] | loss 0.29 | epoch 21 | iter 321 / 351 | time 11[s] | loss 0.29 | epoch 21 | iter 341 / 351 | time 12[s] | loss 0.29 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☒ 420 --- Q 600+257 T 857 ☒ 859 --- Q 761+292 T 1053 ☒ 1052 --- Q 830+597 T 1427 ☒ 1428 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 50.860% | epoch 22 | iter 1 / 351 | time 0[s] | loss 0.32 | epoch 22 | iter 21 / 351 | time 0[s] | loss 0.29 | epoch 22 | iter 41 / 351 | time 1[s] | loss 0.28 | epoch 22 | iter 61 / 351 | time 2[s] | loss 0.29 | epoch 22 | iter 81 / 351 | time 2[s] | loss 0.29 | epoch 22 | iter 101 / 351 | time 3[s] | loss 0.28 | epoch 22 | iter 121 / 351 | time 4[s] | loss 0.29 | epoch 22 | iter 141 / 351 | time 5[s] | loss 0.31 | epoch 22 | iter 161 / 351 | time 5[s] | loss 0.31 | epoch 22 | iter 181 / 351 | time 6[s] | loss 0.30 | epoch 22 | iter 201 / 351 | time 7[s] | loss 0.28 | epoch 22 | iter 221 / 351 | time 7[s] | loss 0.32 | epoch 22 | iter 241 / 351 | time 8[s] | loss 0.33 | epoch 22 | iter 261 / 351 | time 9[s] | loss 0.32 | epoch 22 | iter 281 / 351 | time 10[s] | loss 0.31 
| epoch 22 | iter 301 / 351 | time 10[s] | loss 0.30 | epoch 22 | iter 321 / 351 | time 11[s] | loss 0.29 | epoch 22 | iter 341 / 351 | time 12[s] | loss 0.31 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☒ 423 --- Q 600+257 T 857 ☒ 856 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☒ 1430 --- Q 26+838 T 864 ☒ 865 --- Q 143+93 T 236 ☒ 238 --- val acc 47.720% | epoch 23 | iter 1 / 351 | time 0[s] | loss 0.31 | epoch 23 | iter 21 / 351 | time 0[s] | loss 0.31 | epoch 23 | iter 41 / 351 | time 1[s] | loss 0.28 | epoch 23 | iter 61 / 351 | time 2[s] | loss 0.28 | epoch 23 | iter 81 / 351 | time 2[s] | loss 0.29 | epoch 23 | iter 101 / 351 | time 3[s] | loss 0.27 | epoch 23 | iter 121 / 351 | time 4[s] | loss 0.28 | epoch 23 | iter 141 / 351 | time 4[s] | loss 0.28 | epoch 23 | iter 161 / 351 | time 5[s] | loss 0.30 | epoch 23 | iter 181 / 351 | time 6[s] | loss 0.30 | epoch 23 | iter 201 / 351 | time 7[s] | loss 0.29 | epoch 23 | iter 221 / 351 | time 7[s] | loss 0.29 | epoch 23 | iter 241 / 351 | time 8[s] | loss 0.28 | epoch 23 | iter 261 / 351 | time 9[s] | loss 0.28 | epoch 23 | iter 281 / 351 | time 9[s] | loss 0.30 | epoch 23 | iter 301 / 351 | time 10[s] | loss 0.29 | epoch 23 | iter 321 / 351 | time 11[s] | loss 0.29 | epoch 23 | iter 341 / 351 | time 11[s] | loss 0.28 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1142 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 858 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☒ 1429 --- Q 26+838 T 864 ☒ 866 --- Q 143+93 T 236 ☒ 238 --- val acc 45.180% | epoch 24 | iter 1 / 351 | time 0[s] | loss 0.31 | epoch 24 | iter 21 / 351 | time 0[s] | loss 0.29 | epoch 24 | iter 41 / 351 | time 1[s] | loss 0.28 | epoch 24 | iter 61 / 351 | time 2[s] | loss 0.29 | epoch 24 | iter 81 / 351 | time 2[s] | loss 0.30 | epoch 24 | iter 101 / 351 | time 3[s] | loss 0.29 | epoch 24 | iter 121 / 351 | time 4[s] | loss 0.29 | 
epoch 24 | iter 141 / 351 | time 4[s] | loss 0.29 | epoch 24 | iter 161 / 351 | time 5[s] | loss 0.28 | epoch 24 | iter 181 / 351 | time 6[s] | loss 0.29 | epoch 24 | iter 201 / 351 | time 7[s] | loss 0.28 | epoch 24 | iter 221 / 351 | time 7[s] | loss 0.28 | epoch 24 | iter 241 / 351 | time 8[s] | loss 0.29 | epoch 24 | iter 261 / 351 | time 9[s] | loss 0.29 | epoch 24 | iter 281 / 351 | time 9[s] | loss 0.29 | epoch 24 | iter 301 / 351 | time 10[s] | loss 0.28 | epoch 24 | iter 321 / 351 | time 11[s] | loss 0.27 | epoch 24 | iter 341 / 351 | time 11[s] | loss 0.29 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☒ 421 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☒ 1054 --- Q 830+597 T 1427 ☒ 1428 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 237 --- val acc 51.920% | epoch 25 | iter 1 / 351 | time 0[s] | loss 0.29 | epoch 25 | iter 21 / 351 | time 0[s] | loss 0.29 | epoch 25 | iter 41 / 351 | time 1[s] | loss 0.28 | epoch 25 | iter 61 / 351 | time 2[s] | loss 0.26 | epoch 25 | iter 81 / 351 | time 2[s] | loss 0.26 | epoch 25 | iter 101 / 351 | time 3[s] | loss 0.27 | epoch 25 | iter 121 / 351 | time 4[s] | loss 0.29 | epoch 25 | iter 141 / 351 | time 4[s] | loss 0.28 | epoch 25 | iter 161 / 351 | time 5[s] | loss 0.28 | epoch 25 | iter 181 / 351 | time 6[s] | loss 0.28 | epoch 25 | iter 201 / 351 | time 7[s] | loss 0.27 | epoch 25 | iter 221 / 351 | time 7[s] | loss 0.29 | epoch 25 | iter 241 / 351 | time 8[s] | loss 0.27 | epoch 25 | iter 261 / 351 | time 9[s] | loss 0.28 | epoch 25 | iter 281 / 351 | time 9[s] | loss 0.28 | epoch 25 | iter 301 / 351 | time 10[s] | loss 0.27 | epoch 25 | iter 321 / 351 | time 11[s] | loss 0.28 | epoch 25 | iter 341 / 351 | time 11[s] | loss 0.28 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1140 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 856 --- Q 761+292 T 1053 ☒ 1052 --- Q 830+597 T 1427 ☒ 1426 --- Q 26+838 T 
864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 54.200%
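The `val acc` figure printed after each epoch in the log above is an exact-match rate: an answer is marked ☑ only when the whole predicted string equals the target, so 1140 against 1139 counts as a miss. As a minimal sketch (the helper name `exact_match_accuracy` is hypothetical and not part of the original code), the ten sample predictions printed for epoch 25 can be scored like this:

```python
def exact_match_accuracy(guesses, targets):
    """Return the fraction of guesses that equal their target exactly."""
    if len(guesses) != len(targets):
        raise ValueError("guesses and targets must have the same length")
    if not targets:
        raise ValueError("at least one example is required")
    correct = sum(1 for g, t in zip(guesses, targets) if g == t)
    return correct / len(targets)


# The ten sample answers printed after epoch 25 in the log above.
targets = ["162", "1139", "666", "163", "422", "857", "1053", "1427", "864", "236"]
guesses = ["162", "1140", "666", "163", "422", "856", "1052", "1426", "864", "236"]

sample_acc = exact_match_accuracy(guesses, targets)
print('sample acc %.3f%%' % (sample_acc * 100))  # sample acc 60.000%
```

These ten samples score 60%, while the logged 54.200% is presumably taken over the full test set, so the two numbers need not agree.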
# coding: utf-8
import sys
sys.path.append('..')
import numpy as np
import matplotlib.pyplot as plt
from dataset import sequence
from common.optimizer import Adam
from common.trainer import Trainer
from common.util import eval_seq2seq
from seq2seq import Seq2seq
from peeky_seq2seq import PeekySeq2seq
# Load the dataset
(x_train, t_train), (x_test, t_test) = sequence.load_data('addition.txt')
char_to_id, id_to_char = sequence.get_vocab()
# Reverse input? =================================================
is_reverse = True
if is_reverse:
    x_train, x_test = x_train[:, ::-1], x_test[:, ::-1]
# ================================================================
# Set hyperparameters
vocab_size = len(char_to_id)
wordvec_size = 16
hidden_size = 128
batch_size = 128
max_epoch = 25
max_grad = 5.0
# Normal or Peeky? ==============================================
# model = Seq2seq(vocab_size, wordvec_size, hidden_size)
model = PeekySeq2seq(vocab_size, wordvec_size, hidden_size)
# ================================================================
optimizer = Adam()
trainer = Trainer(model, optimizer)
acc_list_peeky = []
for epoch in range(max_epoch):
    # Train for one epoch, then evaluate on the whole test set
    trainer.fit(x_train, t_train, max_epoch=1,
                batch_size=batch_size, max_grad=max_grad)
    correct_num = 0
    for i in range(len(x_test)):
        question, correct = x_test[[i]], t_test[[i]]
        verbose = i < 10  # print only the first 10 questions
        correct_num += eval_seq2seq(model, question, correct,
                                    id_to_char, verbose, is_reverse)
    acc = float(correct_num) / len(x_test)
    acc_list_peeky.append(acc)
    print('val acc %.3f%%' % (acc * 100))
# Plot the graph
x_peeky = np.arange(len(acc_list_peeky))
plt.plot(x_peeky, acc_list_peeky, marker='o')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.ylim(0, 1.0)
plt.show()
| epoch 1 | iter 1 / 351 | time 0[s] | loss 2.57 | epoch 1 | iter 21 / 351 | time 0[s] | loss 2.48 | epoch 1 | iter 41 / 351 | time 1[s] | loss 2.20 | epoch 1 | iter 61 / 351 | time 2[s] | loss 1.99 | epoch 1 | iter 81 / 351 | time 2[s] | loss 1.89 | epoch 1 | iter 101 / 351 | time 3[s] | loss 1.82 | epoch 1 | iter 121 / 351 | time 4[s] | loss 1.82 | epoch 1 | iter 141 / 351 | time 4[s] | loss 1.80 | epoch 1 | iter 161 / 351 | time 5[s] | loss 1.79 | epoch 1 | iter 181 / 351 | time 6[s] | loss 1.78 | epoch 1 | iter 201 / 351 | time 6[s] | loss 1.77 | epoch 1 | iter 221 / 351 | time 7[s] | loss 1.76 | epoch 1 | iter 241 / 351 | time 8[s] | loss 1.76 | epoch 1 | iter 261 / 351 | time 8[s] | loss 1.75 | epoch 1 | iter 281 / 351 | time 9[s] | loss 1.74 | epoch 1 | iter 301 / 351 | time 10[s] | loss 1.74 | epoch 1 | iter 321 / 351 | time 11[s] | loss 1.73 | epoch 1 | iter 341 / 351 | time 11[s] | loss 1.73 Q 77+85 T 162 ☒ 100 --- Q 975+164 T 1139 ☒ 1013 --- Q 582+84 T 666 ☒ 102 --- Q 8+155 T 163 ☒ 100 --- Q 367+55 T 422 ☒ 1023 --- Q 600+257 T 857 ☒ 1023 --- Q 761+292 T 1053 ☒ 1023 --- Q 830+597 T 1427 ☒ 1111 --- Q 26+838 T 864 ☒ 102 --- Q 143+93 T 236 ☒ 102 --- val acc 0.280% | epoch 2 | iter 1 / 351 | time 0[s] | loss 1.71 | epoch 2 | iter 21 / 351 | time 0[s] | loss 1.71 | epoch 2 | iter 41 / 351 | time 1[s] | loss 1.71 | epoch 2 | iter 61 / 351 | time 2[s] | loss 1.71 | epoch 2 | iter 81 / 351 | time 2[s] | loss 1.70 | epoch 2 | iter 101 / 351 | time 3[s] | loss 1.68 | epoch 2 | iter 121 / 351 | time 4[s] | loss 1.69 | epoch 2 | iter 141 / 351 | time 4[s] | loss 1.68 | epoch 2 | iter 161 / 351 | time 5[s] | loss 1.67 | epoch 2 | iter 181 / 351 | time 6[s] | loss 1.67 | epoch 2 | iter 201 / 351 | time 6[s] | loss 1.65 | epoch 2 | iter 221 / 351 | time 7[s] | loss 1.65 | epoch 2 | iter 241 / 351 | time 8[s] | loss 1.65 | epoch 2 | iter 261 / 351 | time 9[s] | loss 1.63 | epoch 2 | iter 281 / 351 | time 9[s] | loss 1.62 | epoch 2 | iter 301 / 351 | time 10[s] | loss 
1.61 | epoch 2 | iter 321 / 351 | time 11[s] | loss 1.61 | epoch 2 | iter 341 / 351 | time 12[s] | loss 1.60 Q 77+85 T 162 ☒ 100 --- Q 975+164 T 1139 ☒ 1200 --- Q 582+84 T 666 ☒ 690 --- Q 8+155 T 163 ☒ 100 --- Q 367+55 T 422 ☒ 690 --- Q 600+257 T 857 ☒ 999 --- Q 761+292 T 1053 ☒ 1029 --- Q 830+597 T 1427 ☒ 1240 --- Q 26+838 T 864 ☒ 792 --- Q 143+93 T 236 ☒ 290 --- val acc 0.400% | epoch 3 | iter 1 / 351 | time 0[s] | loss 1.58 | epoch 3 | iter 21 / 351 | time 0[s] | loss 1.59 | epoch 3 | iter 41 / 351 | time 1[s] | loss 1.58 | epoch 3 | iter 61 / 351 | time 2[s] | loss 1.56 | epoch 3 | iter 81 / 351 | time 3[s] | loss 1.55 | epoch 3 | iter 101 / 351 | time 3[s] | loss 1.53 | epoch 3 | iter 121 / 351 | time 4[s] | loss 1.51 | epoch 3 | iter 141 / 351 | time 5[s] | loss 1.50 | epoch 3 | iter 161 / 351 | time 6[s] | loss 1.49 | epoch 3 | iter 181 / 351 | time 7[s] | loss 1.47 | epoch 3 | iter 201 / 351 | time 7[s] | loss 1.46 | epoch 3 | iter 221 / 351 | time 8[s] | loss 1.43 | epoch 3 | iter 241 / 351 | time 9[s] | loss 1.42 | epoch 3 | iter 261 / 351 | time 10[s] | loss 1.41 | epoch 3 | iter 281 / 351 | time 11[s] | loss 1.39 | epoch 3 | iter 301 / 351 | time 11[s] | loss 1.37 | epoch 3 | iter 321 / 351 | time 12[s] | loss 1.36 | epoch 3 | iter 341 / 351 | time 13[s] | loss 1.35 Q 77+85 T 162 ☒ 154 --- Q 975+164 T 1139 ☒ 1033 --- Q 582+84 T 666 ☒ 644 --- Q 8+155 T 163 ☒ 161 --- Q 367+55 T 422 ☒ 433 --- Q 600+257 T 857 ☒ 818 --- Q 761+292 T 1053 ☒ 1018 --- Q 830+597 T 1427 ☒ 1344 --- Q 26+838 T 864 ☒ 834 --- Q 143+93 T 236 ☒ 211 --- val acc 1.600% | epoch 4 | iter 1 / 351 | time 0[s] | loss 1.32 | epoch 4 | iter 21 / 351 | time 0[s] | loss 1.32 | epoch 4 | iter 41 / 351 | time 1[s] | loss 1.30 | epoch 4 | iter 61 / 351 | time 2[s] | loss 1.30 | epoch 4 | iter 81 / 351 | time 2[s] | loss 1.28 | epoch 4 | iter 101 / 351 | time 3[s] | loss 1.27 | epoch 4 | iter 121 / 351 | time 4[s] | loss 1.25 | epoch 4 | iter 141 / 351 | time 4[s] | loss 1.24 | epoch 4 | iter 161 / 
351 | time 5[s] | loss 1.22 | epoch 4 | iter 181 / 351 | time 6[s] | loss 1.21 | epoch 4 | iter 201 / 351 | time 7[s] | loss 1.20 | epoch 4 | iter 221 / 351 | time 7[s] | loss 1.20 | epoch 4 | iter 241 / 351 | time 8[s] | loss 1.17 | epoch 4 | iter 261 / 351 | time 9[s] | loss 1.16 | epoch 4 | iter 281 / 351 | time 9[s] | loss 1.14 | epoch 4 | iter 301 / 351 | time 10[s] | loss 1.12 | epoch 4 | iter 321 / 351 | time 11[s] | loss 1.11 | epoch 4 | iter 341 / 351 | time 12[s] | loss 1.10 Q 77+85 T 162 ☒ 158 --- Q 975+164 T 1139 ☒ 1123 --- Q 582+84 T 666 ☒ 657 --- Q 8+155 T 163 ☒ 165 --- Q 367+55 T 422 ☒ 423 --- Q 600+257 T 857 ☒ 777 --- Q 761+292 T 1053 ☒ 1023 --- Q 830+597 T 1427 ☒ 1388 --- Q 26+838 T 864 ☒ 887 --- Q 143+93 T 236 ☒ 223 --- val acc 5.140% | epoch 5 | iter 1 / 351 | time 0[s] | loss 1.08 | epoch 5 | iter 21 / 351 | time 0[s] | loss 1.07 | epoch 5 | iter 41 / 351 | time 1[s] | loss 1.05 | epoch 5 | iter 61 / 351 | time 2[s] | loss 1.04 | epoch 5 | iter 81 / 351 | time 2[s] | loss 1.02 | epoch 5 | iter 101 / 351 | time 3[s] | loss 1.01 | epoch 5 | iter 121 / 351 | time 4[s] | loss 1.00 | epoch 5 | iter 141 / 351 | time 4[s] | loss 0.99 | epoch 5 | iter 161 / 351 | time 5[s] | loss 0.99 | epoch 5 | iter 181 / 351 | time 6[s] | loss 0.96 | epoch 5 | iter 201 / 351 | time 7[s] | loss 0.95 | epoch 5 | iter 221 / 351 | time 7[s] | loss 0.94 | epoch 5 | iter 241 / 351 | time 8[s] | loss 0.92 | epoch 5 | iter 261 / 351 | time 9[s] | loss 0.91 | epoch 5 | iter 281 / 351 | time 9[s] | loss 0.90 | epoch 5 | iter 301 / 351 | time 10[s] | loss 0.89 | epoch 5 | iter 321 / 351 | time 11[s] | loss 0.88 | epoch 5 | iter 341 / 351 | time 11[s] | loss 0.87 Q 77+85 T 162 ☒ 160 --- Q 975+164 T 1139 ☒ 1135 --- Q 582+84 T 666 ☒ 668 --- Q 8+155 T 163 ☒ 169 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 861 --- Q 761+292 T 1053 ☒ 1045 --- Q 830+597 T 1427 ☒ 1324 --- Q 26+838 T 864 ☒ 861 --- Q 143+93 T 236 ☒ 239 --- val acc 9.380% | epoch 6 | iter 1 / 351 | time 0[s] | loss 0.90 
| epoch 6 | iter 21 / 351 | time 0[s] | loss 0.86 | epoch 6 | iter 41 / 351 | time 1[s] | loss 0.83 | epoch 6 | iter 61 / 351 | time 2[s] | loss 0.84 | epoch 6 | iter 81 / 351 | time 2[s] | loss 0.82 | epoch 6 | iter 101 / 351 | time 3[s] | loss 0.81 | epoch 6 | iter 121 / 351 | time 4[s] | loss 0.80 | epoch 6 | iter 141 / 351 | time 5[s] | loss 0.79 | epoch 6 | iter 161 / 351 | time 5[s] | loss 0.78 | epoch 6 | iter 181 / 351 | time 6[s] | loss 0.77 | epoch 6 | iter 201 / 351 | time 7[s] | loss 0.76 | epoch 6 | iter 221 / 351 | time 8[s] | loss 0.76 | epoch 6 | iter 241 / 351 | time 9[s] | loss 0.74 | epoch 6 | iter 261 / 351 | time 10[s] | loss 0.74 | epoch 6 | iter 281 / 351 | time 11[s] | loss 0.73 | epoch 6 | iter 301 / 351 | time 12[s] | loss 0.72 | epoch 6 | iter 321 / 351 | time 12[s] | loss 0.72 | epoch 6 | iter 341 / 351 | time 13[s] | loss 0.71 Q 77+85 T 162 ☒ 163 --- Q 975+164 T 1139 ☒ 1138 --- Q 582+84 T 666 ☒ 668 --- Q 8+155 T 163 ☒ 166 --- Q 367+55 T 422 ☒ 423 --- Q 600+257 T 857 ☒ 858 --- Q 761+292 T 1053 ☒ 1048 --- Q 830+597 T 1427 ☒ 1428 --- Q 26+838 T 864 ☒ 873 --- Q 143+93 T 236 ☒ 239 --- val acc 15.040% | epoch 7 | iter 1 / 351 | time 0[s] | loss 0.68 | epoch 7 | iter 21 / 351 | time 1[s] | loss 0.69 | epoch 7 | iter 41 / 351 | time 2[s] | loss 0.67 | epoch 7 | iter 61 / 351 | time 3[s] | loss 0.66 | epoch 7 | iter 81 / 351 | time 4[s] | loss 0.66 | epoch 7 | iter 101 / 351 | time 5[s] | loss 0.65 | epoch 7 | iter 121 / 351 | time 6[s] | loss 0.65 | epoch 7 | iter 141 / 351 | time 7[s] | loss 0.64 | epoch 7 | iter 161 / 351 | time 7[s] | loss 0.63 | epoch 7 | iter 181 / 351 | time 8[s] | loss 0.61 | epoch 7 | iter 201 / 351 | time 9[s] | loss 0.61 | epoch 7 | iter 221 / 351 | time 10[s] | loss 0.60 | epoch 7 | iter 241 / 351 | time 11[s] | loss 0.57 | epoch 7 | iter 261 / 351 | time 12[s] | loss 0.57 | epoch 7 | iter 281 / 351 | time 12[s] | loss 0.57 | epoch 7 | iter 301 / 351 | time 13[s] | loss 0.55 | epoch 7 | iter 321 / 351 | time 14[s] | 
loss 0.54 | epoch 7 | iter 341 / 351 | time 15[s] | loss 0.53 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☒ 665 --- Q 8+155 T 163 ☒ 156 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☒ 858 --- Q 761+292 T 1053 ☒ 1052 --- Q 830+597 T 1427 ☒ 1428 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☒ 235 --- val acc 39.100% | epoch 8 | iter 1 / 351 | time 0[s] | loss 0.51 | epoch 8 | iter 21 / 351 | time 0[s] | loss 0.50 | epoch 8 | iter 41 / 351 | time 1[s] | loss 0.49 | epoch 8 | iter 61 / 351 | time 2[s] | loss 0.48 | epoch 8 | iter 81 / 351 | time 3[s] | loss 0.47 | epoch 8 | iter 101 / 351 | time 3[s] | loss 0.46 | epoch 8 | iter 121 / 351 | time 4[s] | loss 0.46 | epoch 8 | iter 141 / 351 | time 5[s] | loss 0.44 | epoch 8 | iter 161 / 351 | time 5[s] | loss 0.41 | epoch 8 | iter 181 / 351 | time 6[s] | loss 0.42 | epoch 8 | iter 201 / 351 | time 7[s] | loss 0.41 | epoch 8 | iter 221 / 351 | time 8[s] | loss 0.40 | epoch 8 | iter 241 / 351 | time 9[s] | loss 0.39 | epoch 8 | iter 261 / 351 | time 10[s] | loss 0.37 | epoch 8 | iter 281 / 351 | time 11[s] | loss 0.36 | epoch 8 | iter 301 / 351 | time 12[s] | loss 0.36 | epoch 8 | iter 321 / 351 | time 12[s] | loss 0.35 | epoch 8 | iter 341 / 351 | time 13[s] | loss 0.34 Q 77+85 T 162 ☒ 161 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☒ 657 --- Q 8+155 T 163 ☒ 155 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☒ 1438 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 65.060% | epoch 9 | iter 1 / 351 | time 0[s] | loss 0.32 | epoch 9 | iter 21 / 351 | time 0[s] | loss 0.31 | epoch 9 | iter 41 / 351 | time 1[s] | loss 0.31 | epoch 9 | iter 61 / 351 | time 2[s] | loss 0.31 | epoch 9 | iter 81 / 351 | time 3[s] | loss 0.29 | epoch 9 | iter 101 / 351 | time 3[s] | loss 0.29 | epoch 9 | iter 121 / 351 | time 4[s] | loss 0.29 | epoch 9 | iter 141 / 351 | time 5[s] | loss 0.27 | epoch 9 | iter 161 / 351 | time 5[s] | loss 0.27 | epoch 9 | iter 
181 / 351 | time 6[s] | loss 0.26 | epoch 9 | iter 201 / 351 | time 7[s] | loss 0.25 | epoch 9 | iter 221 / 351 | time 8[s] | loss 0.25 | epoch 9 | iter 241 / 351 | time 8[s] | loss 0.24 | epoch 9 | iter 261 / 351 | time 9[s] | loss 0.24 | epoch 9 | iter 281 / 351 | time 10[s] | loss 0.23 | epoch 9 | iter 301 / 351 | time 11[s] | loss 0.22 | epoch 9 | iter 321 / 351 | time 11[s] | loss 0.22 | epoch 9 | iter 341 / 351 | time 12[s] | loss 0.21 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☒ 1140 --- Q 582+84 T 666 ☒ 657 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 83.280% | epoch 10 | iter 1 / 351 | time 0[s] | loss 0.22 | epoch 10 | iter 21 / 351 | time 0[s] | loss 0.20 | epoch 10 | iter 41 / 351 | time 1[s] | loss 0.20 | epoch 10 | iter 61 / 351 | time 2[s] | loss 0.20 | epoch 10 | iter 81 / 351 | time 3[s] | loss 0.18 | epoch 10 | iter 101 / 351 | time 4[s] | loss 0.17 | epoch 10 | iter 121 / 351 | time 5[s] | loss 0.18 | epoch 10 | iter 141 / 351 | time 6[s] | loss 0.17 | epoch 10 | iter 161 / 351 | time 6[s] | loss 0.17 | epoch 10 | iter 181 / 351 | time 7[s] | loss 0.17 | epoch 10 | iter 201 / 351 | time 8[s] | loss 0.17 | epoch 10 | iter 221 / 351 | time 9[s] | loss 0.16 | epoch 10 | iter 241 / 351 | time 9[s] | loss 0.15 | epoch 10 | iter 261 / 351 | time 10[s] | loss 0.15 | epoch 10 | iter 281 / 351 | time 11[s] | loss 0.15 | epoch 10 | iter 301 / 351 | time 12[s] | loss 0.15 | epoch 10 | iter 321 / 351 | time 13[s] | loss 0.14 | epoch 10 | iter 341 / 351 | time 13[s] | loss 0.14 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☒ 656 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 88.400% | epoch 11 | iter 1 / 351 | time 0[s] | loss 0.13 | epoch 11 | iter 21 
/ 351 | time 0[s] | loss 0.13 | epoch 11 | iter 41 / 351 | time 1[s] | loss 0.13 | epoch 11 | iter 61 / 351 | time 2[s] | loss 0.12 | epoch 11 | iter 81 / 351 | time 3[s] | loss 0.12 | epoch 11 | iter 101 / 351 | time 3[s] | loss 0.12 | epoch 11 | iter 121 / 351 | time 4[s] | loss 0.11 | epoch 11 | iter 141 / 351 | time 5[s] | loss 0.12 | epoch 11 | iter 161 / 351 | time 6[s] | loss 0.11 | epoch 11 | iter 181 / 351 | time 7[s] | loss 0.11 | epoch 11 | iter 201 / 351 | time 7[s] | loss 0.12 | epoch 11 | iter 221 / 351 | time 8[s] | loss 0.11 | epoch 11 | iter 241 / 351 | time 9[s] | loss 0.11 | epoch 11 | iter 261 / 351 | time 10[s] | loss 0.10 | epoch 11 | iter 281 / 351 | time 11[s] | loss 0.10 | epoch 11 | iter 301 / 351 | time 11[s] | loss 0.10 | epoch 11 | iter 321 / 351 | time 12[s] | loss 0.09 | epoch 11 | iter 341 / 351 | time 13[s] | loss 0.09 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 90.940% | epoch 12 | iter 1 / 351 | time 0[s] | loss 0.09 | epoch 12 | iter 21 / 351 | time 0[s] | loss 0.09 | epoch 12 | iter 41 / 351 | time 1[s] | loss 0.09 | epoch 12 | iter 61 / 351 | time 2[s] | loss 0.09 | epoch 12 | iter 81 / 351 | time 3[s] | loss 0.09 | epoch 12 | iter 101 / 351 | time 3[s] | loss 0.08 | epoch 12 | iter 121 / 351 | time 4[s] | loss 0.08 | epoch 12 | iter 141 / 351 | time 5[s] | loss 0.08 | epoch 12 | iter 161 / 351 | time 5[s] | loss 0.08 | epoch 12 | iter 181 / 351 | time 6[s] | loss 0.08 | epoch 12 | iter 201 / 351 | time 7[s] | loss 0.08 | epoch 12 | iter 221 / 351 | time 7[s] | loss 0.09 | epoch 12 | iter 241 / 351 | time 8[s] | loss 0.09 | epoch 12 | iter 261 / 351 | time 9[s] | loss 0.09 | epoch 12 | iter 281 / 351 | time 10[s] | loss 0.08 | epoch 12 | iter 301 / 351 | time 10[s] | loss 0.08 | epoch 12 | iter 321 / 351 | 
time 11[s] | loss 0.07 | epoch 12 | iter 341 / 351 | time 12[s] | loss 0.08 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 92.220% | epoch 13 | iter 1 / 351 | time 0[s] | loss 0.07 | epoch 13 | iter 21 / 351 | time 0[s] | loss 0.07 | epoch 13 | iter 41 / 351 | time 1[s] | loss 0.07 | epoch 13 | iter 61 / 351 | time 2[s] | loss 0.07 | epoch 13 | iter 81 / 351 | time 2[s] | loss 0.06 | epoch 13 | iter 101 / 351 | time 3[s] | loss 0.06 | epoch 13 | iter 121 / 351 | time 4[s] | loss 0.07 | epoch 13 | iter 141 / 351 | time 5[s] | loss 0.06 | epoch 13 | iter 161 / 351 | time 5[s] | loss 0.06 | epoch 13 | iter 181 / 351 | time 6[s] | loss 0.06 | epoch 13 | iter 201 / 351 | time 7[s] | loss 0.06 | epoch 13 | iter 221 / 351 | time 8[s] | loss 0.06 | epoch 13 | iter 241 / 351 | time 8[s] | loss 0.06 | epoch 13 | iter 261 / 351 | time 9[s] | loss 0.06 | epoch 13 | iter 281 / 351 | time 10[s] | loss 0.06 | epoch 13 | iter 301 / 351 | time 11[s] | loss 0.05 | epoch 13 | iter 321 / 351 | time 11[s] | loss 0.05 | epoch 13 | iter 341 / 351 | time 12[s] | loss 0.06 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 94.380% | epoch 14 | iter 1 / 351 | time 0[s] | loss 0.05 | epoch 14 | iter 21 / 351 | time 1[s] | loss 0.05 | epoch 14 | iter 41 / 351 | time 1[s] | loss 0.05 | epoch 14 | iter 61 / 351 | time 2[s] | loss 0.05 | epoch 14 | iter 81 / 351 | time 3[s] | loss 0.05 | epoch 14 | iter 101 / 351 | time 3[s] | loss 0.05 | epoch 14 | iter 121 / 351 | time 4[s] | loss 0.05 | epoch 14 | iter 141 / 351 | time 5[s] | loss 0.05 | epoch 14 | iter 161 / 351 | 
time 6[s] | loss 0.05 | epoch 14 | iter 181 / 351 | time 6[s] | loss 0.05 | epoch 14 | iter 201 / 351 | time 7[s] | loss 0.05 | epoch 14 | iter 221 / 351 | time 8[s] | loss 0.06 | epoch 14 | iter 241 / 351 | time 9[s] | loss 0.06 | epoch 14 | iter 261 / 351 | time 9[s] | loss 0.07 | epoch 14 | iter 281 / 351 | time 10[s] | loss 0.06 | epoch 14 | iter 301 / 351 | time 11[s] | loss 0.06 | epoch 14 | iter 321 / 351 | time 12[s] | loss 0.05 | epoch 14 | iter 341 / 351 | time 12[s] | loss 0.05 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 94.880% | epoch 15 | iter 1 / 351 | time 0[s] | loss 0.05 | epoch 15 | iter 21 / 351 | time 0[s] | loss 0.05 | epoch 15 | iter 41 / 351 | time 1[s] | loss 0.04 | epoch 15 | iter 61 / 351 | time 2[s] | loss 0.04 | epoch 15 | iter 81 / 351 | time 2[s] | loss 0.04 | epoch 15 | iter 101 / 351 | time 3[s] | loss 0.05 | epoch 15 | iter 121 / 351 | time 4[s] | loss 0.04 | epoch 15 | iter 141 / 351 | time 5[s] | loss 0.04 | epoch 15 | iter 161 / 351 | time 6[s] | loss 0.05 | epoch 15 | iter 181 / 351 | time 6[s] | loss 0.06 | epoch 15 | iter 201 / 351 | time 7[s] | loss 0.05 | epoch 15 | iter 221 / 351 | time 8[s] | loss 0.04 | epoch 15 | iter 241 / 351 | time 9[s] | loss 0.04 | epoch 15 | iter 261 / 351 | time 10[s] | loss 0.05 | epoch 15 | iter 281 / 351 | time 11[s] | loss 0.05 | epoch 15 | iter 301 / 351 | time 12[s] | loss 0.05 | epoch 15 | iter 321 / 351 | time 12[s] | loss 0.05 | epoch 15 | iter 341 / 351 | time 13[s] | loss 0.05 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 94.660% | epoch 16 | iter 1 / 
351 | time 0[s] | loss 0.03 | epoch 16 | iter 21 / 351 | time 0[s] | loss 0.05 | epoch 16 | iter 41 / 351 | time 1[s] | loss 0.04 | epoch 16 | iter 61 / 351 | time 2[s] | loss 0.03 | epoch 16 | iter 81 / 351 | time 3[s] | loss 0.03 | epoch 16 | iter 101 / 351 | time 3[s] | loss 0.03 | epoch 16 | iter 121 / 351 | time 4[s] | loss 0.04 | epoch 16 | iter 141 / 351 | time 5[s] | loss 0.04 | epoch 16 | iter 161 / 351 | time 5[s] | loss 0.03 | epoch 16 | iter 181 / 351 | time 6[s] | loss 0.04 | epoch 16 | iter 201 / 351 | time 7[s] | loss 0.03 | epoch 16 | iter 221 / 351 | time 8[s] | loss 0.03 | epoch 16 | iter 241 / 351 | time 8[s] | loss 0.03 | epoch 16 | iter 261 / 351 | time 9[s] | loss 0.03 | epoch 16 | iter 281 / 351 | time 10[s] | loss 0.03 | epoch 16 | iter 301 / 351 | time 11[s] | loss 0.03 | epoch 16 | iter 321 / 351 | time 11[s] | loss 0.03 | epoch 16 | iter 341 / 351 | time 12[s] | loss 0.04 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 95.280% | epoch 17 | iter 1 / 351 | time 0[s] | loss 0.04 | epoch 17 | iter 21 / 351 | time 0[s] | loss 0.03 | epoch 17 | iter 41 / 351 | time 1[s] | loss 0.04 | epoch 17 | iter 61 / 351 | time 2[s] | loss 0.03 | epoch 17 | iter 81 / 351 | time 3[s] | loss 0.03 | epoch 17 | iter 101 / 351 | time 3[s] | loss 0.04 | epoch 17 | iter 121 / 351 | time 4[s] | loss 0.04 | epoch 17 | iter 141 / 351 | time 5[s] | loss 0.05 | epoch 17 | iter 161 / 351 | time 6[s] | loss 0.05 | epoch 17 | iter 181 / 351 | time 7[s] | loss 0.05 | epoch 17 | iter 201 / 351 | time 7[s] | loss 0.05 | epoch 17 | iter 221 / 351 | time 8[s] | loss 0.04 | epoch 17 | iter 241 / 351 | time 9[s] | loss 0.03 | epoch 17 | iter 261 / 351 | time 10[s] | loss 0.03 | epoch 17 | iter 281 / 351 | time 10[s] | loss 0.03 | epoch 17 | iter 301 / 351 | time 
11[s] | loss 0.03 | epoch 17 | iter 321 / 351 | time 12[s] | loss 0.03 | epoch 17 | iter 341 / 351 | time 12[s] | loss 0.04 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 94.500% | epoch 18 | iter 1 / 351 | time 0[s] | loss 0.04 | epoch 18 | iter 21 / 351 | time 0[s] | loss 0.03 | epoch 18 | iter 41 / 351 | time 1[s] | loss 0.02 | epoch 18 | iter 61 / 351 | time 2[s] | loss 0.02 | epoch 18 | iter 81 / 351 | time 3[s] | loss 0.03 | epoch 18 | iter 101 / 351 | time 3[s] | loss 0.02 | epoch 18 | iter 121 / 351 | time 4[s] | loss 0.02 | epoch 18 | iter 141 / 351 | time 5[s] | loss 0.02 | epoch 18 | iter 161 / 351 | time 6[s] | loss 0.02 | epoch 18 | iter 181 / 351 | time 7[s] | loss 0.02 | epoch 18 | iter 201 / 351 | time 8[s] | loss 0.02 | epoch 18 | iter 221 / 351 | time 8[s] | loss 0.02 | epoch 18 | iter 241 / 351 | time 9[s] | loss 0.03 | epoch 18 | iter 261 / 351 | time 10[s] | loss 0.04 | epoch 18 | iter 281 / 351 | time 11[s] | loss 0.05 | epoch 18 | iter 301 / 351 | time 12[s] | loss 0.04 | epoch 18 | iter 321 / 351 | time 12[s] | loss 0.04 | epoch 18 | iter 341 / 351 | time 13[s] | loss 0.03 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 94.660% | epoch 19 | iter 1 / 351 | time 0[s] | loss 0.04 | epoch 19 | iter 21 / 351 | time 0[s] | loss 0.04 | epoch 19 | iter 41 / 351 | time 1[s] | loss 0.07 | epoch 19 | iter 61 / 351 | time 2[s] | loss 0.06 | epoch 19 | iter 81 / 351 | time 3[s] | loss 0.05 | epoch 19 | iter 101 / 351 | time 3[s] | loss 0.04 | epoch 19 | iter 121 / 351 | time 4[s] | loss 0.03 | epoch 19 | iter 141 / 351 | time 
5[s] | loss 0.03 | epoch 19 | iter 161 / 351 | time 6[s] | loss 0.03 | epoch 19 | iter 181 / 351 | time 6[s] | loss 0.03 | epoch 19 | iter 201 / 351 | time 7[s] | loss 0.02 | epoch 19 | iter 221 / 351 | time 8[s] | loss 0.02 | epoch 19 | iter 241 / 351 | time 9[s] | loss 0.02 | epoch 19 | iter 261 / 351 | time 10[s] | loss 0.02 | epoch 19 | iter 281 / 351 | time 10[s] | loss 0.02 | epoch 19 | iter 301 / 351 | time 11[s] | loss 0.02 | epoch 19 | iter 321 / 351 | time 12[s] | loss 0.02 | epoch 19 | iter 341 / 351 | time 12[s] | loss 0.02 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 98.640% | epoch 20 | iter 1 / 351 | time 0[s] | loss 0.01 | epoch 20 | iter 21 / 351 | time 0[s] | loss 0.01 | epoch 20 | iter 41 / 351 | time 1[s] | loss 0.01 | epoch 20 | iter 61 / 351 | time 2[s] | loss 0.01 | epoch 20 | iter 81 / 351 | time 3[s] | loss 0.01 | epoch 20 | iter 101 / 351 | time 3[s] | loss 0.01 | epoch 20 | iter 121 / 351 | time 4[s] | loss 0.01 | epoch 20 | iter 141 / 351 | time 5[s] | loss 0.01 | epoch 20 | iter 161 / 351 | time 6[s] | loss 0.02 | epoch 20 | iter 181 / 351 | time 6[s] | loss 0.02 | epoch 20 | iter 201 / 351 | time 7[s] | loss 0.02 | epoch 20 | iter 221 / 351 | time 8[s] | loss 0.02 | epoch 20 | iter 241 / 351 | time 9[s] | loss 0.02 | epoch 20 | iter 261 / 351 | time 10[s] | loss 0.02 | epoch 20 | iter 281 / 351 | time 10[s] | loss 0.02 | epoch 20 | iter 301 / 351 | time 11[s] | loss 0.02 | epoch 20 | iter 321 / 351 | time 12[s] | loss 0.02 | epoch 20 | iter 341 / 351 | time 13[s] | loss 0.03 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☒ 162 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☒ 1426 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 
☑ 236 --- val acc 95.200% | epoch 21 | iter 1 / 351 | time 0[s] | loss 0.06 | epoch 21 | iter 21 / 351 | time 0[s] | loss 0.03 | epoch 21 | iter 41 / 351 | time 1[s] | loss 0.03 | epoch 21 | iter 61 / 351 | time 2[s] | loss 0.03 | epoch 21 | iter 81 / 351 | time 3[s] | loss 0.03 | epoch 21 | iter 101 / 351 | time 3[s] | loss 0.04 | epoch 21 | iter 121 / 351 | time 4[s] | loss 0.05 | epoch 21 | iter 141 / 351 | time 5[s] | loss 0.06 | epoch 21 | iter 161 / 351 | time 5[s] | loss 0.04 | epoch 21 | iter 181 / 351 | time 6[s] | loss 0.03 | epoch 21 | iter 201 / 351 | time 7[s] | loss 0.03 | epoch 21 | iter 221 / 351 | time 8[s] | loss 0.03 | epoch 21 | iter 241 / 351 | time 8[s] | loss 0.02 | epoch 21 | iter 261 / 351 | time 9[s] | loss 0.03 | epoch 21 | iter 281 / 351 | time 10[s] | loss 0.03 | epoch 21 | iter 301 / 351 | time 11[s] | loss 0.02 | epoch 21 | iter 321 / 351 | time 11[s] | loss 0.02 | epoch 21 | iter 341 / 351 | time 12[s] | loss 0.02 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 97.720% | epoch 22 | iter 1 / 351 | time 0[s] | loss 0.02 | epoch 22 | iter 21 / 351 | time 0[s] | loss 0.01 | epoch 22 | iter 41 / 351 | time 1[s] | loss 0.02 | epoch 22 | iter 61 / 351 | time 2[s] | loss 0.01 | epoch 22 | iter 81 / 351 | time 2[s] | loss 0.01 | epoch 22 | iter 101 / 351 | time 3[s] | loss 0.01 | epoch 22 | iter 121 / 351 | time 4[s] | loss 0.01 | epoch 22 | iter 141 / 351 | time 5[s] | loss 0.01 | epoch 22 | iter 161 / 351 | time 5[s] | loss 0.01 | epoch 22 | iter 181 / 351 | time 6[s] | loss 0.01 | epoch 22 | iter 201 / 351 | time 7[s] | loss 0.01 | epoch 22 | iter 221 / 351 | time 8[s] | loss 0.01 | epoch 22 | iter 241 / 351 | time 8[s] | loss 0.01 | epoch 22 | iter 261 / 351 | time 9[s] | loss 0.01 | epoch 22 | iter 281 / 351 | time 10[s] | 
loss 0.01 | epoch 22 | iter 301 / 351 | time 11[s] | loss 0.01 | epoch 22 | iter 321 / 351 | time 11[s] | loss 0.01 | epoch 22 | iter 341 / 351 | time 12[s] | loss 0.02 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 97.860% | epoch 23 | iter 1 / 351 | time 0[s] | loss 0.01 | epoch 23 | iter 21 / 351 | time 0[s] | loss 0.01 | epoch 23 | iter 41 / 351 | time 1[s] | loss 0.01 | epoch 23 | iter 61 / 351 | time 2[s] | loss 0.01 | epoch 23 | iter 81 / 351 | time 2[s] | loss 0.01 | epoch 23 | iter 101 / 351 | time 3[s] | loss 0.01 | epoch 23 | iter 121 / 351 | time 4[s] | loss 0.01 | epoch 23 | iter 141 / 351 | time 5[s] | loss 0.02 | epoch 23 | iter 161 / 351 | time 5[s] | loss 0.02 | epoch 23 | iter 181 / 351 | time 6[s] | loss 0.02 | epoch 23 | iter 201 / 351 | time 7[s] | loss 0.02 | epoch 23 | iter 221 / 351 | time 8[s] | loss 0.02 | epoch 23 | iter 241 / 351 | time 8[s] | loss 0.03 | epoch 23 | iter 261 / 351 | time 9[s] | loss 0.05 | epoch 23 | iter 281 / 351 | time 10[s] | loss 0.09 | epoch 23 | iter 301 / 351 | time 11[s] | loss 0.08 | epoch 23 | iter 321 / 351 | time 12[s] | loss 0.05 | epoch 23 | iter 341 / 351 | time 12[s] | loss 0.04 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 96.600% | epoch 24 | iter 1 / 351 | time 0[s] | loss 0.02 | epoch 24 | iter 21 / 351 | time 0[s] | loss 0.02 | epoch 24 | iter 41 / 351 | time 1[s] | loss 0.02 | epoch 24 | iter 61 / 351 | time 2[s] | loss 0.02 | epoch 24 | iter 81 / 351 | time 3[s] | loss 0.01 | epoch 24 | iter 101 / 351 | time 3[s] | loss 0.01 | epoch 24 | iter 121 / 351 | time 4[s] | 
loss 0.01 | epoch 24 | iter 141 / 351 | time 5[s] | loss 0.01 | epoch 24 | iter 161 / 351 | time 6[s] | loss 0.01 | epoch 24 | iter 181 / 351 | time 6[s] | loss 0.01 | epoch 24 | iter 201 / 351 | time 7[s] | loss 0.01 | epoch 24 | iter 221 / 351 | time 8[s] | loss 0.01 | epoch 24 | iter 241 / 351 | time 9[s] | loss 0.01 | epoch 24 | iter 261 / 351 | time 9[s] | loss 0.01 | epoch 24 | iter 281 / 351 | time 10[s] | loss 0.01 | epoch 24 | iter 301 / 351 | time 11[s] | loss 0.01 | epoch 24 | iter 321 / 351 | time 12[s] | loss 0.01 | epoch 24 | iter 341 / 351 | time 13[s] | loss 0.01 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 --- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 99.480% | epoch 25 | iter 1 / 351 | time 0[s] | loss 0.01 | epoch 25 | iter 21 / 351 | time 0[s] | loss 0.01 | epoch 25 | iter 41 / 351 | time 1[s] | loss 0.01 | epoch 25 | iter 61 / 351 | time 2[s] | loss 0.01 | epoch 25 | iter 81 / 351 | time 2[s] | loss 0.01 | epoch 25 | iter 101 / 351 | time 3[s] | loss 0.01 | epoch 25 | iter 121 / 351 | time 4[s] | loss 0.01 | epoch 25 | iter 141 / 351 | time 5[s] | loss 0.01 | epoch 25 | iter 161 / 351 | time 6[s] | loss 0.01 | epoch 25 | iter 181 / 351 | time 6[s] | loss 0.01 | epoch 25 | iter 201 / 351 | time 7[s] | loss 0.01 | epoch 25 | iter 221 / 351 | time 8[s] | loss 0.01 | epoch 25 | iter 241 / 351 | time 9[s] | loss 0.01 | epoch 25 | iter 261 / 351 | time 9[s] | loss 0.01 | epoch 25 | iter 281 / 351 | time 10[s] | loss 0.01 | epoch 25 | iter 301 / 351 | time 11[s] | loss 0.01 | epoch 25 | iter 321 / 351 | time 12[s] | loss 0.01 | epoch 25 | iter 341 / 351 | time 13[s] | loss 0.01 Q 77+85 T 162 ☑ 162 --- Q 975+164 T 1139 ☑ 1139 --- Q 582+84 T 666 ☑ 666 --- Q 8+155 T 163 ☑ 163 --- Q 367+55 T 422 ☑ 422 --- Q 600+257 T 857 ☑ 857 --- Q 761+292 T 1053 ☑ 1053 --- Q 830+597 T 1427 ☑ 1427 
--- Q 26+838 T 864 ☑ 864 --- Q 143+93 T 236 ☑ 236 --- val acc 98.840%
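The `is_reverse` branch in the script above flips every input row along the time axis with `x[:, ::-1]`. A small NumPy sketch of that slice, using made-up IDs in which 0 stands in for padding (the real IDs come from `char_to_id`):

```python
import numpy as np

# Two toy input sequences, padded on the right with 0 (hypothetical IDs).
x = np.array([[1, 2, 3, 0],
              [4, 5, 0, 0]])

# Reverse each row; the batch axis (axis 0) is left untouched.
x_flipped = x[:, ::-1]
print(x_flipped)
# [[0 3 2 1]
#  [0 0 5 4]]
```

Only the inputs are reversed in the script; the targets `t_train` and `t_test` keep their order. The slice returns a view of the original array rather than a copy, and the padding ends up at the front of each row.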
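# Compare the runs; x_rev/acc_list_rev and x_seq2/acc_list_seq2 must already exist from earlier runs
plt.plot(x_peeky, acc_list_peeky, marker='o', label='peeky')
plt.plot(x_rev, acc_list_rev, marker='x', label='rev')
plt.plot(x_seq2, acc_list_seq2, marker='.', label='seq2')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend()
plt.show()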
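The peeky curve plotted above comes from `PeekySeq2seq`, whose implementation is not shown in this section. The sketch below assumes the usual peeky idea: the encoder's final hidden state `h` is copied to every decoder time step and concatenated to that step's input. All shapes and array names are illustrative, not taken from the original code.

```python
import numpy as np

N, T, H, D = 2, 3, 4, 5  # batch, time steps, hidden size, embedding size (toy values)

rng = np.random.default_rng(0)
h = rng.standard_normal((N, H))       # stand-in for the encoder's final hidden state
out = rng.standard_normal((N, T, D))  # stand-in for decoder embeddings, one per time step

# Copy h across the time axis: (N, H) -> (N, T, H).
hs = np.repeat(h, T, axis=0).reshape(N, T, H)

# Concatenate along the feature axis so every time step "peeks" at h.
peeky_in = np.concatenate((hs, out), axis=2)
print(peeky_in.shape)  # (2, 3, 9)
```

Because the input width grows from `D` to `H + D`, whichever layer consumes `peeky_in` needs weights sized for the wider input.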