Dogs vs Cats Redux

  • Worked through Lesson 1 of Practical Deep Learning for Coders
  • Written from scratch using only the Keras API, without utils.py or vgg16.py (though I used them as reference)
  • Reimplemented with Python 3 and Keras 2 (TensorFlow backend)
  • My earlier post "VGG16のFine-tuningによる犬猫認識" (2017/1/8) used only a subset of the data; this time the full dataset is used

Data setup

  • Download train.zip and test.zip from Kaggle's Dogs vs. Cats Redux: Kernels Edition and extract them
  • Assume the files are laid out as follows

    lesson1/
      redux.ipynb
      data/
          redux/
              train/
                  cat.437.jpg
                  dog.9924.jpg
                  cat.1029.jpg
                  dog.4374.jpg
              test/
                  231.jpg
                  325.jpg
                  1235.jpg
                  9923.jpg
  • To keep a record of how the data was prepared, it is recommended to do all of the data setup inside the Jupyter Notebook as well

  • Be careful not to run the same cell again once the data has been created!
  • It may be better to split this into a separate notebook
  • Part of the train data is set aside as validation data
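To guard against accidental re-runs, the directory creation below can also be made idempotent. A minimal sketch (the `setup_dirs` helper is my own, not part of the original notebook):

```python
import os

# Hypothetical helper: create the valid/ and test/unknown/ directories
# only if they do not already exist, so re-running the cell is harmless.
def setup_dirs(data_dir):
    for d in ['valid', 'test/unknown']:
        os.makedirs(os.path.join(data_dir, d), exist_ok=True)

# Cells that move files can be guarded the same way, e.g.
# only moving when the target directory is still empty.
```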
In [1]:
import os
current_dir = os.getcwd()
data_dir = current_dir + '/data/redux'
In [16]:
# Create the directories
%cd $data_dir
%mkdir valid
%mkdir -p test/unknown
/Users/koichiro.mori/Notebooks/fastai/blog/data/redux
In [20]:
# Move 2000 randomly chosen training images to the validation set
%cd $data_dir/train

from glob import glob
import numpy as np
g = glob('*.jpg')
shuf = np.random.permutation(g)
for i in range(2000):
    os.rename(shuf[i], data_dir + '/valid/' + shuf[i])
/Users/koichiro.mori/Notebooks/fastai/blog/data/redux/train
  • Create per-class subdirectories so the data is easy to handle in Keras
  • flow_from_directory(directory) automatically treats each subdirectory as a class when loading images
  • Kaggle's test data has no ground-truth labels, so put it in a subdirectory named unknown
  • As long as some subdirectory exists, the test data can be handled the same way
  • For test data that does have labels, creating the same kind of subdirectories makes evaluating test accuracy easy
In [21]:
# train
%cd $data_dir/train
%mkdir cats dogs
%mv cat.*.jpg cats/
%mv dog.*.jpg dogs/

# valid
%cd $data_dir/valid
%mkdir cats dogs
%mv cat.*.jpg cats/
%mv dog.*.jpg dogs/
/Users/koichiro.mori/Notebooks/fastai/blog/data/redux/train
/Users/koichiro.mori/Notebooks/fastai/blog/data/redux/valid
In [22]:
# test
%cd $data_dir/test
%mv *.jpg unknown/
/Users/koichiro.mori/Notebooks/fastai/blog/data/redux/test

Add Dropout to the VGG16 model

In [59]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.models import Model
from keras.layers import Input, Dense, Flatten, Dropout
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import CSVLogger, ModelCheckpoint
In [2]:
vgg = VGG16(weights='imagenet', include_top=True)
In [3]:
vgg.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
In [4]:
# Keep references to the layers
fc1 = vgg.layers[-3]
fc2 = vgg.layers[-2]
predictions = vgg.layers[-1]

# Create the Dropout layers
dropout1 = Dropout(0.5)
dropout2 = Dropout(0.5)

# Rewire the layers
x = dropout1(fc1.output)
x = fc2(x)
x = dropout2(x)

# Build the model
model = Model(inputs=vgg.input, outputs=predictions(x))
In [5]:
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
dropout_1 (Dropout)          (None, 4096)              0         
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
dropout_2 (Dropout)          (None, 4096)              0         
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________

Replace the output layer for dog/cat classification

  • The default VGG16 is for ImageNet's 1000-class classification
  • Replace the output layer so it can classify the two classes dog and cat
  • Freeze the weights of all layers except the output layer
In [6]:
# Remove the final output layer
model.layers.pop()
Out[6]:
<keras.layers.core.Dense at 0x11f9b72b0>
In [7]:
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
dropout_1 (Dropout)          (None, 4096)              0         
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
dropout_2 (Dropout)          (None, 4096)              0         
=================================================================
Total params: 134,260,544
Trainable params: 134,260,544
Non-trainable params: 0
_________________________________________________________________
In [8]:
# Freeze the weights of all layers
for layer in model.layers:
    layer.trainable = False
In [9]:
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
dropout_1 (Dropout)          (None, 4096)              0         
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
dropout_2 (Dropout)          (None, 4096)              0         
=================================================================
Total params: 134,260,544
Trainable params: 0
Non-trainable params: 134,260,544
_________________________________________________________________
In [11]:
# Add an output layer for 2-class classification
# Unfortunately add() cannot be used here, so use the Functional API
x = model.layers[-1].output
predictions = Dense(2, activation='softmax', name='predictions')
model = Model(inputs=model.input, outputs=predictions(x))
In [12]:
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
dropout_1 (Dropout)          (None, 4096)              0         
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
dropout_2 (Dropout)          (None, 4096)              0         
_________________________________________________________________
predictions (Dense)          (None, 2)                 8194      
=================================================================
Total params: 134,268,738
Trainable params: 8,194
Non-trainable params: 134,260,544
_________________________________________________________________
In [17]:
# Don't forget to compile
model.compile(optimizer=SGD(lr=0.01, momentum=0.9, decay=1e-6, nesterov=True),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Create the data generator

  • Since VGG16 was trained on ImageNet, its input images need a specific preprocessing step
  • keras.applications.vgg16.preprocess_input provides it, but it expects a 4D tensor as input
  • ImageDataGenerator passes a 3D tensor, so write a wrapper
  • fast.ai builds its own VGG16 and inserts a Lambda layer that runs the preprocessing function; in that case the Lambda layer preprocesses automatically, so no preprocessing is needed in ImageDataGenerator

    model = self.model = Sequential()
    model.add(Lambda(vgg_preprocess, input_shape=(3,224,224), output_shape=(3,224,224)))
    
In [25]:
def vgg_preprocess(x):
    """
    Preprocessing function for ImageDataGenerator.
    x: image (3D tensor)
    returns: preprocessed image (3D tensor)
    """
    x = np.expand_dims(x, axis=0)  # 4D tensor
    x = preprocess_input(x)
    return x[0]
In [26]:
gen = ImageDataGenerator(preprocessing_function=vgg_preprocess)
In [70]:
batch_size = 64
train_batches = gen.flow_from_directory('data/redux/train/',
                                        target_size=(224, 224),
                                        class_mode='categorical',
                                        shuffle=True,
                                        batch_size=batch_size)
Found 23000 images belonging to 2 classes.
In [71]:
val_batches = gen.flow_from_directory('data/redux/valid/',
                                      target_size=(224, 224),
                                      class_mode='categorical',
                                      shuffle=True,
                                      batch_size=batch_size)
Found 2000 images belonging to 2 classes.
  • Check that the generator actually produces data
  • The images are already preprocessed, so they look off-color when shown with imshow()
  • train_batches is an Iterator, so rewind it with reset()
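To view the generated images with imshow() in natural colors, the VGG preprocessing has to be undone first. A sketch, assuming the standard Keras VGG16 preprocessing (RGB→BGR plus subtraction of the ImageNet channel means); the `deprocess` helper is my own, not part of the notebook:

```python
import numpy as np

# Reverse keras.applications.vgg16.preprocess_input for display:
# add the ImageNet channel means back, convert BGR -> RGB, clip to [0, 255].
def deprocess(x):
    x = x.copy()
    x[..., 0] += 103.939  # B mean
    x[..., 1] += 116.779  # G mean
    x[..., 2] += 123.68   # R mean
    x = x[..., ::-1]      # BGR -> RGB
    return np.clip(x, 0, 255).astype('uint8')

# Usage: plt.imshow(deprocess(data[0]))
```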
In [64]:
data, labels = train_batches.next()
print(train_batches.batch_index)
print(data.shape)
print(labels[:5])
1
(64, 224, 224, 3)
[[ 0.  1.]
 [ 0.  1.]
 [ 0.  1.]
 [ 0.  1.]
 [ 1.  0.]]
In [65]:
train_batches.reset()

Create callbacks

In [68]:
# Write each epoch's loss/acc/val_loss/val_acc to a file
logger = CSVLogger('history.log')

# Save the model (architecture and weights) to a file whenever the monitored metric improves
checkpoint = ModelCheckpoint(
    'vgg.{epoch:02d}-{val_loss:.3f}-{val_acc:.3f}.h5',
    monitor='val_loss',
    verbose=1,
    save_best_only=True,
    mode='auto')

Train the model

  • This step is painful without a GPU
  • Since Keras 2, the number of batches per epoch must be specified with steps_per_epoch
  • Round up with np.ceil(), otherwise the leftover samples in the final partial batch seem to get dropped?
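The effect of the rounding can be checked with a quick calculation; the numbers below just illustrate the 23000-image training set with batch_size=64:

```python
import numpy as np

samples, batch_size = 23000, 64
print(samples / batch_size)                # 359.375 batches worth of data
print(samples // batch_size)               # 359 -> the last 24 images would be skipped
print(int(np.ceil(samples / batch_size)))  # 360 -> every image is covered
```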
In [ ]:
model.fit_generator(
    train_batches,
    steps_per_epoch=int(np.ceil(train_batches.samples / batch_size)),
    epochs=5,
    validation_data=val_batches,
    validation_steps=int(np.ceil(val_batches.samples / batch_size)),
    callbacks=[logger, checkpoint])

Save the model

  • Since Keras 2, the architecture and weights can be saved together
  • To save the final result, the command below writes it to a file
  • This time, however, ModelCheckpoint was used, so the best-scoring model is used instead
  • Note that VGG models have many parameters and the file is close to 500MB!
In [ ]:
model.save('model_dogs_vs_cats.h5')

Load the trained model

In [78]:
from keras.models import load_model
model = load_model('vgg.00-0.129-0.986.h5')

Predict on the test data

  • predict_generator() produces predictions for the test data
  • evaluate_generator() would give the test accuracy, but it cannot be used here since Kaggle's test data has no ground-truth labels
  • Even though it is only a forward pass, VGG16 has many convolutions and is heavy
  • Painful without a GPU
  • NumPy arrays are conveniently saved with bcolz
  • Is there a bug in flow_from_directory()'s shuffle option?
In [88]:
test_batches = gen.flow_from_directory('data/redux/test/',
                                       target_size=(224, 224),
                                       class_mode='categorical',
                                       shuffle=False,
                                       batch_size=batch_size)
Found 12500 images belonging to 1 classes.
In [ ]:
preds = model.predict_generator(test_batches, int(np.ceil(test_batches.samples / batch_size)))
In [79]:
import bcolz  # pip install bcolz
def save_array(fname, arr):
    c = bcolz.carray(arr, rootdir=fname, mode='w')
    c.flush()
def load_array(fname):
    return bcolz.open(fname)[:]
In [ ]:
# Also save filenames just in case
# preds and filenames are aligned element-wise
save_array('test_preds.dat', preds)
save_array('filenames.dat', test_batches.filenames)

Create the Kaggle submission file

In [107]:
preds = load_array('test_preds.dat')
filenames = load_array('filenames.dat')
In [108]:
print(preds.shape)
print(filenames.shape)
print(preds[:5])
print(filenames[:5])
(12500, 2)
(12500,)
[[  1.00000000e+00   3.22960868e-27]
 [  2.87068018e-04   9.99712884e-01]
 [  1.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   1.00000000e+00]
 [  0.00000000e+00   1.00000000e+00]]
['unknown/8250.jpg' 'unknown/11194.jpg' 'unknown/9368.jpg'
 'unknown/3037.jpg' 'unknown/2378.jpg']
In [109]:
# Use only the probability of being a dog
isdog = preds[:, 1]

# A wrong prediction at an extreme probability like 0.0 or 1.0 incurs a huge LogLoss, so clip
isdog = isdog.clip(min=0.05, max=0.95)

# Extract the ID from the filename
# unknown/8250.jpg => 8250
ids = np.array([int(f[8:f.find('.')]) for f in filenames])
print(ids[:5])
[ 8250 11194  9368  3037  2378]
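The clipping values can be sanity-checked against the log loss formula: for a sample whose true label is 1, the loss is -log(p). The numbers below are just illustrative arithmetic:

```python
import numpy as np

# Log loss for a single sample with true label 1
print(-np.log(1e-15))  # ~34.5: an unclipped confident wrong answer is catastrophic
print(-np.log(0.05))   # ~3.0: worst case after clipping
print(-np.log(0.95))   # ~0.05: small cost paid on confident correct answers
```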
In [110]:
# Concatenate IDs and predicted probabilities column-wise
subm = np.stack([ids, isdog], axis=1)
print(subm[:5])
[[  8.25000000e+03   5.00000007e-02]
 [  1.11940000e+04   9.49999988e-01]
 [  9.36800000e+03   5.00000007e-02]
 [  3.03700000e+03   9.49999988e-01]
 [  2.37800000e+03   9.49999988e-01]]
In [111]:
# Write the file
np.savetxt('submission1.csv', subm, fmt='%d,%.5f', header='id,label', comments='')
In [114]:
# Check the file
# Very handy when using Jupyter Notebook remotely
from IPython.display import FileLink
FileLink('submission1.csv')
Out[114]:
In [118]:
# Plot Log Loss, the evaluation metric
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import log_loss

def logloss(true_label, predicted, eps=1e-15):
    p = np.clip(predicted, eps, 1 - eps)
    if true_label == 1:
        return -np.log(p)
    else:
        return -np.log(1 - p)

class_labels = [0, 1]  # 0=cats, 1=dogs
x = [i * 0.0001 for i in range(1, 10000)]
y = [logloss(1, i * 0.0001, eps=1e-15) for i in range(1, 10000)]

plt.plot(x, y)
plt.axis([-0.05, 1.1, -0.8, 10])
plt.title('Log Loss when true label = 1')
plt.xlabel('predicted probability')
plt.ylabel('log loss')
plt.grid()

plt.show()
  • The score was 0.10323

Validate the model on the validation data

  • Kaggle's test data has no labels, which makes validation difficult
  • Analyze the model using the validation data, which does have ground-truth labels
In [119]:
val_batches = gen.flow_from_directory('data/redux/valid/',
                                      target_size=(224, 224),
                                      class_mode='categorical',
                                      shuffle=False,
                                      batch_size=batch_size)
Found 2000 images belonging to 2 classes.
In [186]:
class_indices = val_batches.class_indices
print(class_indices)
idx2class = {v:k for k, v in class_indices.items()}
print(idx2class)
{'cats': 0, 'dogs': 1}
{0: 'cats', 1: 'dogs'}
In [121]:
# Load the trained model
from keras.models import load_model
model = load_model('vgg.00-0.129-0.986.h5')

# Compute accuracy on the validation data
batch_size = 64
loss, acc = model.evaluate_generator(val_batches,
                                     int(np.ceil(val_batches.samples / batch_size)))
print('loss:', loss)
print('acc:', acc)
loss: 0.178096582163
acc: 0.9835
In [161]:
# Compute predictions and save them to a file
val_batches.reset()  # required because the generator was already used in the cell above!
preds = model.predict_generator(val_batches,
                                int(np.ceil(val_batches.samples / batch_size)))
save_array('val_preds.dat', preds)
save_array('val_filenames.dat', val_batches.filenames)
In [162]:
# Load the results back from the files
preds = load_array('val_preds.dat')
filenames = load_array('val_filenames.dat')
print(preds.shape)
print(filenames.shape)
(2000, 2)
(2000,)
In [163]:
expected_labels = val_batches.classes
In [164]:
# Ground-truth labels (0 = cats, 1 = dogs)
expected_labels[:10]
Out[164]:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
In [165]:
filenames[:10]
Out[165]:
array(['cats/cat.1000.jpg', 'cats/cat.10014.jpg', 'cats/cat.10017.jpg',
       'cats/cat.10020.jpg', 'cats/cat.10028.jpg', 'cats/cat.1004.jpg',
       'cats/cat.10042.jpg', 'cats/cat.10073.jpg', 'cats/cat.10078.jpg',
       'cats/cat.10105.jpg'], 
      dtype='<U18')
In [211]:
# Predicted probability of being a cat
our_predictions = preds[:, 0]

# Predicted labels (0 = cats, 1 = dogs)
# Take the index with the higher probability as the classification result
our_labels = np.argmax(preds, axis=1)
In [167]:
our_labels[:10]
Out[167]:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In [168]:
# Predicted and ground-truth labels should have the same shape
assert our_labels.shape == expected_labels.shape
In [171]:
# Double-check the accuracy
# Matches the result from evaluate_generator()
print(np.sum(our_labels == expected_labels) / our_labels.shape[0])
0.9835

Visualize the classification results

  1. Correctly classified samples
  2. Incorrectly classified samples
  3. Samples confidently classified as cats that really were cats
  4. Samples confidently classified as dogs that really were dogs
  5. Samples confidently classified as cats that were actually dogs
  6. Samples confidently classified as dogs that were actually cats
  7. Samples with ambiguous classifications
In [178]:
from keras.preprocessing.image import load_img

# Plotting helper
def plots(imgs, figsize=(12, 6), rows=1, titles=None):
    f = plt.figure(figsize=figsize)
    for i in range(len(imgs)):
        sp = f.add_subplot(rows, len(imgs) // rows, i + 1)
        sp.axis('off')
        if titles is not None:
            sp.set_title(titles[i])
        plt.imshow(imgs[i])

def plots_idx(idx, titles=None):
    plots([load_img('data/redux/valid/' + filenames[i]) for i in idx], titles=titles)

n_view = 4
In [212]:
# 1. Correctly classified samples
# The number in each title is the predicted probability of being a cat
correct = np.where(our_labels == expected_labels)[0]
print("Found {} correct labels".format(len(correct)))
idx = np.random.permutation(correct)[:n_view]
plots_idx(idx, our_predictions[idx])
Found 1967 correct labels
In [216]:
# 2. Incorrectly classified samples
incorrect = np.where(our_labels != expected_labels)[0]
print("Found {} incorrect labels".format(len(incorrect)))
idx = np.random.permutation(incorrect)[:n_view]
plots_idx(idx, our_predictions[idx])
Found 33 incorrect labels
In [220]:
# 3. Samples confidently classified as cats that really were cats
correct_cats = np.where((our_labels == 0) & (our_labels == expected_labels))[0]
print("Found {} confident correct cats labels".format(len(correct_cats)))
# Sort in descending order, since values closer to 1.0 are more cat-like
# Note: these indices are relative to correct_cats
most_correct_cats = np.argsort(our_predictions[correct_cats])[::-1][:n_view]
plots_idx(correct_cats[most_correct_cats], our_predictions[correct_cats][most_correct_cats])
Found 1029 confident correct cats labels