Training an image-classification model on very little data is a common situation in practice; if you have worked on computer-vision projects before, you have almost certainly run into it.
Having "few" training samples can mean anywhere from a few hundred to a few tens of thousands of images, depending on the application. As a working example, we will focus on classifying images as "dog" or "cat", using a dataset containing 4,000 photos of cats and dogs (2,000 cats, 2,000 dogs). We will use 2,000 images for training, 1,000 for validation, and the final 1,000 for testing.
We will walk through one basic strategy for tackling this problem: training a new model from scratch on the little data we have. We will start by naively training a small convolutional network (convnet) on our 2,000 training samples, without any regularization, to set a baseline for later improvements.
This will get us to a classification accuracy of about 71%. At that point, our main problem will be overfitting. We will then introduce data augmentation, a powerful technique for mitigating overfitting in computer vision. With data augmentation, we will improve the network and reach an accuracy of 82%.
In another article, we will explore two more essential techniques for applying deep learning to small datasets: feature extraction with a pre-trained network (which will get us to an accuracy of 90% to 93%), and fine-tuning a pre-trained network (which will get us to a final accuracy of 95%). Together, these three strategies
will form your future toolbox for tackling computer-vision problems with small datasets.
# Environment for this Jupyter notebook
import platform
import tensorflow
import keras
print("Platform: {}".format(platform.platform()))
print("Tensorflow version: {}".format(tensorflow.__version__))
print("Keras version: {}".format(keras.__version__))
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
from IPython.display import Image
Using TensorFlow backend.
Platform: Windows-10-10.0.15063-SP0 Tensorflow version: 1.4.0 Keras version: 2.0.9
The dataset we will use (pictures of cats and dogs) is not packaged with Keras, so it has to be downloaded separately. Kaggle.com made it available in late 2013 as part of a computer-vision competition. You can download the original dataset here: https://www.kaggle.com/c/dogs-vs-cats/data
The pictures are medium-resolution color JPEGs. They look like this:
The original dataset contains 25,000 images of dogs and cats (12,500 of each class) and is 543 MB large (compressed). After downloading and uncompressing it, we will create a new dataset containing three subsets: a training set with 1,000 samples of each class, a validation set with 500 samples of each class, and finally a test set with 500 samples of each class.
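Before touching the file system, the split described above can be sanity-checked as plain Python. The filename pattern `cat.<i>.jpg` follows the Kaggle archive's naming convention, and the index ranges mirror the copy operations performed below; this is only a sketch of the plan, not part of the pipeline:

```python
# Plan the small-dataset split: which original files go to which subset.
# Filenames follow the Kaggle convention "cat.0.jpg" ... "dog.12499.jpg".
def split_plan(label):
    return {
        'train':      ['{}.{}.jpg'.format(label, i) for i in range(1000)],
        'validation': ['{}.{}.jpg'.format(label, i) for i in range(1000, 1500)],
        'test':       ['{}.{}.jpg'.format(label, i) for i in range(1500, 2000)],
    }

plan = {label: split_plan(label) for label in ('cat', 'dog')}

# 2 classes x 1000 training files = 2000 training images in total.
total_train = sum(len(plan[label]['train']) for label in plan)
print(total_train)  # 2000
```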
import os
# Root directory of the project
ROOT_DIR = os.getcwd()
# Directory holding the image data
DATA_PATH = os.path.join(ROOT_DIR, "data")
import os, shutil
# Path to the original dataset
original_dataset_dir = os.path.join(DATA_PATH, "train")
# Directory where we will store our smaller dataset
base_dir = os.path.join(DATA_PATH, "cats_and_dogs_small")
if not os.path.exists(base_dir):
    os.mkdir(base_dir)
# Directory for our training data
train_dir = os.path.join(base_dir, 'train')
if not os.path.exists(train_dir):
    os.mkdir(train_dir)
# Directory for our validation data
validation_dir = os.path.join(base_dir, 'validation')
if not os.path.exists(validation_dir):
    os.mkdir(validation_dir)
# Directory for our test data
test_dir = os.path.join(base_dir, 'test')
if not os.path.exists(test_dir):
    os.mkdir(test_dir)
# Directory with our training cat pictures
train_cats_dir = os.path.join(train_dir, 'cats')
if not os.path.exists(train_cats_dir):
    os.mkdir(train_cats_dir)
# Directory with our training dog pictures
train_dogs_dir = os.path.join(train_dir, 'dogs')
if not os.path.exists(train_dogs_dir):
    os.mkdir(train_dogs_dir)
# Directory with our validation cat pictures
validation_cats_dir = os.path.join(validation_dir, 'cats')
if not os.path.exists(validation_cats_dir):
    os.mkdir(validation_cats_dir)
# Directory with our validation dog pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs')
if not os.path.exists(validation_dogs_dir):
    os.mkdir(validation_dogs_dir)
# Directory with our test cat pictures
test_cats_dir = os.path.join(test_dir, 'cats')
if not os.path.exists(test_cats_dir):
    os.mkdir(test_cats_dir)
# Directory with our test dog pictures
test_dogs_dir = os.path.join(test_dir, 'dogs')
if not os.path.exists(test_dogs_dir):
    os.mkdir(test_dogs_dir)
# Copy the first 1000 cat images to train_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_cats_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)
print('Copy first 1000 cat images to train_cats_dir complete!')
# Copy the next 500 cat images to validation_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_cats_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)
print('Copy next 500 cat images to validation_cats_dir complete!')
# Copy the next 500 cat images to test_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_cats_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)
print('Copy next 500 cat images to test_cats_dir complete!')
# Copy the first 1000 dog images to train_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_dogs_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)
print('Copy first 1000 dog images to train_dogs_dir complete!')
# Copy the next 500 dog images to validation_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_dogs_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)
print('Copy next 500 dog images to validation_dogs_dir complete!')
# Copy the next 500 dog images to test_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_dogs_dir, fname)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)
print('Copy next 500 dog images to test_dogs_dir complete!')
Copy first 1000 cat images to train_cats_dir complete! Copy next 500 cat images to validation_cats_dir complete! Copy next 500 cat images to test_cats_dir complete! Copy first 1000 dog images to train_dogs_dir complete! Copy next 500 dog images to validation_dogs_dir complete! Copy next 500 dog images to test_dogs_dir complete!
As a sanity check, let's count how many pictures ended up in each split (train/validation/test):
print('total training cat images:', len(os.listdir(train_cats_dir)))
print('total training dog images:', len(os.listdir(train_dogs_dir)))
print('total validation cat images:', len(os.listdir(validation_cats_dir)))
print('total validation dog images:', len(os.listdir(validation_dogs_dir)))
print('total test cat images:', len(os.listdir(test_cats_dir)))
print('total test dog images:', len(os.listdir(test_dogs_dir)))
total training cat images: 1000 total training dog images: 1000 total validation cat images: 500 total validation dog images: 500 total test cat images: 500 total test dog images: 500
So we do indeed have 2,000 training images, then 1,000 validation images and 1,000 test images. Within each split, every class has the same number of samples: this is a balanced binary-classification problem, which means classification accuracy will be an appropriate metric.
As you already know by now, data should be formatted into appropriately preprocessed floating-point tensors before being fed into our network. Currently, our data sits on disk as JPEG files, so the preprocessing pipeline into our network roughly consists of reading the image files, decoding the JPEG content into RGB grids of pixels, converting those into floating-point tensors, and rescaling the pixel values (between 0 and 255) into the [0, 1] interval.
This may seem a bit daunting, but thankfully Keras has utilities that take care of these steps automatically. Keras has a module of image-processing helper tools located at keras.preprocessing.image. In particular, its ImageDataGenerator class can quickly and automatically turn image files on disk into batches of tensors. This is what we will use here.
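As a rough illustration of what the rescale=1./255 step below does, here is the same normalization applied to a hypothetical decoded image in plain NumPy (the random array merely stands in for real JPEG pixel data):

```python
import numpy as np

# Stand-in for one decoded 150x150 RGB JPEG: uint8 pixels in [0, 255].
img = np.random.randint(0, 256, size=(150, 150, 3), dtype=np.uint8)

# The same normalization that ImageDataGenerator(rescale=1./255) applies:
# convert to float and squash pixel values into the [0, 1] interval.
x = img.astype('float32') / 255.0

print(x.shape, x.min() >= 0.0, x.max() <= 1.0)
```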
from keras.preprocessing.image import ImageDataGenerator
# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
# Read image files straight from the directory
train_generator = train_datagen.flow_from_directory(
    # The directory holding the image data
    train_dir,
    # All images will be resized to 150x150
    target_size=(150, 150),
    # Yield batches of 20 images at a time
    batch_size=20,
    # Since this is a binary-classification problem, we need binary labels
    class_mode='binary')
# Read image files straight from the directory
validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')
Found 2000 images belonging to 2 classes. Found 1000 images belonging to 2 classes.
Let's look at the output of one of these generators: it yields batches of 150x150 RGB images (shape (20, 150, 150, 3)) and binary labels (shape (20,)). 20 is the number of samples in each batch (the batch size). Note that the generator yields these batches indefinitely: it simply loops endlessly over the images present in the target folder. That is why we need to break the iteration loop at some point.
for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break
data batch shape: (20, 150, 150, 3) labels batch shape: (20,)
Let's fit the model to the data using the generator, via the fit_generator method.
Because the data is generated endlessly, the generator-based training process needs to know how many samples to draw from the generator in each epoch. This is the role of the steps_per_epoch argument: after drawing steps_per_epoch batches from the generator (that is, after running steps_per_epoch gradient-descent steps), the training process will move on to the next epoch. In our case, batches contain 20 samples, so it will take 100 steps until our model has seen its 2,000 target samples.
When using fit_generator, you can pass a validation_data argument, much like with the fit method. Importantly, this argument is allowed to be a data generator itself, but it could also be a tuple of NumPy arrays. If you pass a generator as validation_data, then this generator is expected to yield batches of validation data endlessly, so you should also specify the validation_steps argument, which tells the process how many batches to draw from the validation generator for evaluation.
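The batch arithmetic in the paragraphs above is simple enough to check directly; a minimal sketch, using the sizes of this example (2,000 training images, 1,000 validation images, batches of 20):

```python
# How many generator batches make up one full pass over each subset.
train_samples, validation_samples = 2000, 1000
batch_size = 20

steps_per_epoch = train_samples // batch_size        # gradient steps per epoch
validation_steps = validation_samples // batch_size  # batches drawn for evaluation

print(steps_per_epoch, validation_steps)  # 100 50
```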
Our convnet will be a stack of alternating Conv2D (with relu activation) and MaxPooling2D layers.
We start from inputs of size 150x150 (a somewhat arbitrary choice), and we end up with feature maps of size 7x7 right before the Flatten layer.
Note that the depth of the feature maps progressively increases in the network (from 32 to 128), while their spatial size decreases (from 148x148 to 7x7). This is a pattern you will see in almost every convnet architecture.
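The shrinking feature-map sizes quoted above follow directly from the layer arithmetic: a 3x3 "valid" convolution trims 2 pixels from each spatial dimension, and a 2x2 max-pool halves it (with integer division). A small sketch of that calculation:

```python
# Spatial size after a 3x3 valid convolution followed by 2x2 max-pooling.
def conv3x3_then_pool2x2(size):
    return (size - 2) // 2

size = 150
sizes = []
for _ in range(4):  # four Conv2D + MaxPooling2D blocks, as in the model below
    size = conv3x3_then_pool2x2(size)
    sizes.append(size)

print(sizes)  # [74, 36, 17, 7]
```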
Since we are attacking a binary-classification problem, we end the network with a single unit (a Dense layer of size 1) and a sigmoid activation. This unit will encode the probability that the image belongs to one class or the other.
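The sigmoid maps any real-valued activation into a probability in (0, 1); outputs above 0.5 are read as the class labelled 1 (with flow_from_directory, labels are assigned alphabetically, so here cats would be 0 and dogs 1). A minimal numeric sketch:

```python
import math

def sigmoid(z):
    # Squash a real-valued score into a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

for z in (-2.0, 0.0, 2.0):
    print(z, round(sigmoid(z), 3))

# sigmoid(0) is exactly 0.5: the model is maximally unsure.
assert sigmoid(0.0) == 0.5
```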
from keras import layers
from keras import models
from keras.utils import plot_model
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
Let's look at how the dimensions of the feature maps change with every successive layer:
# Print the network architecture
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 148, 148, 32)      896
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 74, 74, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 72, 72, 64)        18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 36, 36, 64)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 34, 34, 128)       73856
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 17, 17, 128)       0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 15, 15, 128)       147584
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 7, 7, 128)         0
_________________________________________________________________
flatten_1 (Flatten)          (None, 6272)              0
_________________________________________________________________
dense_1 (Dense)              (None, 512)               3211776
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 513
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
For our compilation step, we will use the RMSprop optimizer. Since we ended our network with a single unit (with a sigmoid activation), we will use binary crossentropy as our loss function.
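Binary crossentropy compares the single sigmoid output p with the true label y in {0, 1}. A minimal sketch of the per-sample loss, using only the standard library:

```python
import math

def binary_crossentropy(y_true, p):
    # Per-sample loss: -[y*log(p) + (1-y)*log(1-p)].
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A confident correct prediction costs little...
print(binary_crossentropy(1, 0.99))
# ...while a confident wrong one is heavily penalized.
print(binary_crossentropy(0, 0.99))
```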
from keras import optimizers
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)
Epoch 1/30 100/100 [==============================] - 11s 112ms/step - loss: 0.6955 - acc: 0.5295 - val_loss: 0.6809 - val_acc: 0.5340 Epoch 2/30 100/100 [==============================] - 8s 78ms/step - loss: 0.6572 - acc: 0.6135 - val_loss: 0.6676 - val_acc: 0.5710 Epoch 3/30 100/100 [==============================] - 8s 76ms/step - loss: 0.6214 - acc: 0.6590 - val_loss: 0.6408 - val_acc: 0.6290 Epoch 4/30 100/100 [==============================] - 8s 77ms/step - loss: 0.5882 - acc: 0.6920 - val_loss: 0.6296 - val_acc: 0.6440 Epoch 5/30 100/100 [==============================] - 8s 77ms/step - loss: 0.5517 - acc: 0.7130 - val_loss: 0.7241 - val_acc: 0.5950 Epoch 6/30 100/100 [==============================] - 8s 76ms/step - loss: 0.5108 - acc: 0.7530 - val_loss: 0.5785 - val_acc: 0.7010 Epoch 7/30 100/100 [==============================] - 8s 78ms/step - loss: 0.4899 - acc: 0.7680 - val_loss: 0.5620 - val_acc: 0.7060 Epoch 8/30 100/100 [==============================] - 8s 76ms/step - loss: 0.4575 - acc: 0.7810 - val_loss: 0.5803 - val_acc: 0.6970 Epoch 9/30 100/100 [==============================] - 8s 76ms/step - loss: 0.4193 - acc: 0.8070 - val_loss: 0.5881 - val_acc: 0.7120 Epoch 10/30 100/100 [==============================] - 8s 75ms/step - loss: 0.3869 - acc: 0.8195 - val_loss: 0.5986 - val_acc: 0.7050 Epoch 11/30 100/100 [==============================] - 8s 75ms/step - loss: 0.3620 - acc: 0.8355 - val_loss: 0.6368 - val_acc: 0.7090 Epoch 12/30 100/100 [==============================] - 8s 76ms/step - loss: 0.3434 - acc: 0.8480 - val_loss: 0.6214 - val_acc: 0.6970 Epoch 13/30 100/100 [==============================] - 8s 75ms/step - loss: 0.3165 - acc: 0.8670 - val_loss: 0.6897 - val_acc: 0.7010 Epoch 14/30 100/100 [==============================] - 8s 76ms/step - loss: 0.2878 - acc: 0.8755 - val_loss: 0.6249 - val_acc: 0.7100 Epoch 15/30 100/100 [==============================] - 8s 77ms/step - loss: 0.2650 - acc: 0.8975 - val_loss: 0.6438 - val_acc: 
0.7060 Epoch 16/30 100/100 [==============================] - 8s 76ms/step - loss: 0.2362 - acc: 0.9090 - val_loss: 0.7780 - val_acc: 0.6920 Epoch 17/30 100/100 [==============================] - 8s 76ms/step - loss: 0.2098 - acc: 0.9165 - val_loss: 0.8215 - val_acc: 0.6750 Epoch 18/30 100/100 [==============================] - 8s 76ms/step - loss: 0.1862 - acc: 0.9305 - val_loss: 0.7044 - val_acc: 0.7120 Epoch 19/30 100/100 [==============================] - 8s 75ms/step - loss: 0.1669 - acc: 0.9425 - val_loss: 0.7941 - val_acc: 0.6990 Epoch 20/30 100/100 [==============================] - 8s 75ms/step - loss: 0.1522 - acc: 0.9475 - val_loss: 0.8285 - val_acc: 0.6960 Epoch 21/30 100/100 [==============================] - 8s 75ms/step - loss: 0.1254 - acc: 0.9575 - val_loss: 0.8199 - val_acc: 0.7070 Epoch 22/30 100/100 [==============================] - 8s 78ms/step - loss: 0.1117 - acc: 0.9620 - val_loss: 0.9325 - val_acc: 0.7090 Epoch 23/30 100/100 [==============================] - 8s 76ms/step - loss: 0.0907 - acc: 0.9750 - val_loss: 0.8740 - val_acc: 0.7220 Epoch 24/30 100/100 [==============================] - 8s 75ms/step - loss: 0.0806 - acc: 0.9755 - val_loss: 1.0178 - val_acc: 0.6900 Epoch 25/30 100/100 [==============================] - 8s 75ms/step - loss: 0.0602 - acc: 0.9815 - val_loss: 0.9158 - val_acc: 0.7260 Epoch 26/30 100/100 [==============================] - 8s 76ms/step - loss: 0.0591 - acc: 0.9810 - val_loss: 1.1284 - val_acc: 0.7030 Epoch 27/30 100/100 [==============================] - 8s 75ms/step - loss: 0.0511 - acc: 0.9820 - val_loss: 1.1136 - val_acc: 0.7140 Epoch 28/30 100/100 [==============================] - 8s 76ms/step - loss: 0.0335 - acc: 0.9930 - val_loss: 1.4372 - val_acc: 0.6820 Epoch 29/30 100/100 [==============================] - 8s 75ms/step - loss: 0.0409 - acc: 0.9860 - val_loss: 1.2121 - val_acc: 0.6930 Epoch 30/30 100/100 [==============================] - 8s 76ms/step - loss: 0.0271 - acc: 0.9920 - val_loss: 1.3055 
- val_acc: 0.7010
It is good practice to always save your model after training:
model.save('cats_and_dogs_small_1.h5')
Let's plot the loss and accuracy of the model over the training and validation data during training:
import matplotlib.pyplot as plt
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, label='Training acc')
plt.plot(epochs, val_acc, label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, label='Training loss')
plt.plot(epochs, val_loss, label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
These plots are characteristic of overfitting. Our training accuracy increases linearly over time until it reaches nearly 100%, while our validation accuracy stalls at 70-72%. Our validation loss reaches its minimum after only five epochs and then stalls, while the training loss keeps decreasing roughly linearly until it approaches 0.
Because we have relatively few training samples (2,000), overfitting will be our number-one concern. You already know about a number of techniques that can help mitigate it, such as dropout and weight decay (L2 regularization). We are now going to introduce a new one, specific to computer vision and used almost universally when processing images with deep-learning models: data augmentation.
Overfitting is caused by having too few samples to learn from, rendering us unable to train a model that can generalize to new data.
Given infinite data, our model would be exposed to every possible aspect of the data distribution at hand: we would never overfit. Data augmentation takes the approach of generating more training data from existing training samples, by "augmenting" the samples via a number of random transformations that yield believable-looking images. The goal is that at training time, our model never sees the exact same picture twice. This helps expose the model to more aspects of the data and generalize better.
In Keras, this can be done by configuring a number of random transformations to be performed on the images read by our ImageDataGenerator instance. Let's get started with an example:
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
These are just a few of the options available (for more, see the Keras documentation). Let's quickly go over these parameters:
rotation_range is a value in degrees (0-180), a range within which to randomly rotate pictures.
width_shift and height_shift are ranges (as a fraction of total width or height) within which to randomly translate pictures vertically or horizontally.
shear_range is for randomly applying shearing transformations.
zoom_range is for randomly zooming inside pictures.
horizontal_flip is for randomly flipping half of the images horizontally, which is relevant when there are no assumptions of horizontal asymmetry (e.g. real-world pictures).
fill_mode is the strategy used for filling in newly created pixels, which can appear after a rotation or a width/height shift.
Let's take a look at our augmented images:
import matplotlib.pyplot as plt
from keras.preprocessing import image
# List the cat image files in the training set
fnames = [os.path.join(train_cats_dir, fname) for fname in os.listdir(train_cats_dir)]
# Pick one image
img_path = fnames[3]
# Read the image and resize it
img = image.load_img(img_path, target_size=(150, 150))
# Convert it to a NumPy array with shape (150, 150, 3)
x = image.img_to_array(img)
# Reshape it to (1, 150, 150, 3) so it can be fed to the generator
x = x.reshape((1,) + x.shape)
# The flow() method generates batches of randomly transformed images.
# It loops indefinitely, so we need to break the loop at some point.
i = 0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break
plt.show()
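One of these transforms is easy to picture in array terms: horizontal_flip amounts to reversing the width axis of the image tensor. A minimal NumPy equivalent (the arange array merely stands in for a real image):

```python
import numpy as np

# Stand-in for one 150x150 RGB image tensor of shape (height, width, channels).
img = np.arange(150 * 150 * 3, dtype='float32').reshape((150, 150, 3))

# Horizontal flip = reverse the width (second) axis.
flipped = img[:, ::-1, :]

# The leftmost column of the flipped image is the rightmost of the original.
print(np.array_equal(flipped[:, 0, :], img[:, -1, :]))  # True
```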
If we train a new network using this data-augmentation configuration, the network will never see the same input twice. However, the inputs it sees are still heavily intercorrelated, since they come from a small number of original images; we cannot produce new information, we can only remix existing information. As such, this may not be enough to completely get rid of overfitting. To further fight overfitting, we will also add a Dropout layer to our model, right before the densely-connected classifier:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
Let's train our network using data augmentation and dropout:
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    # The directory holding the image data
    train_dir,
    # All images will be resized to 150x150
    target_size=(150, 150),
    batch_size=32,
    # Since this is a binary-classification problem, we need binary labels
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=50,
    validation_data=validation_generator,
    validation_steps=50)
Found 2000 images belonging to 2 classes. Found 1000 images belonging to 2 classes. Epoch 1/50 100/100 [==============================] - 21s 208ms/step - loss: 0.6937 - acc: 0.5078 - val_loss: 0.6878 - val_acc: 0.4848 Epoch 2/50 100/100 [==============================] - 19s 189ms/step - loss: 0.6829 - acc: 0.5572 - val_loss: 0.6779 - val_acc: 0.5647 Epoch 3/50 100/100 [==============================] - 19s 189ms/step - loss: 0.6743 - acc: 0.5816 - val_loss: 0.6520 - val_acc: 0.6136 Epoch 4/50 100/100 [==============================] - 19s 189ms/step - loss: 0.6619 - acc: 0.6122 - val_loss: 0.6348 - val_acc: 0.6218 Epoch 5/50 100/100 [==============================] - 19s 189ms/step - loss: 0.6448 - acc: 0.6338 - val_loss: 0.6166 - val_acc: 0.6428 Epoch 6/50 100/100 [==============================] - 19s 187ms/step - loss: 0.6342 - acc: 0.6453 - val_loss: 0.6137 - val_acc: 0.6656 Epoch 7/50 100/100 [==============================] - 19s 189ms/step - loss: 0.6255 - acc: 0.6525 - val_loss: 0.5948 - val_acc: 0.6713 Epoch 8/50 100/100 [==============================] - 19s 188ms/step - loss: 0.6269 - acc: 0.6444 - val_loss: 0.5899 - val_acc: 0.6891 Epoch 9/50 100/100 [==============================] - 19s 187ms/step - loss: 0.6105 - acc: 0.6691 - val_loss: 0.6440 - val_acc: 0.6313 Epoch 10/50 100/100 [==============================] - 19s 189ms/step - loss: 0.5952 - acc: 0.6794 - val_loss: 0.6291 - val_acc: 0.6263 Epoch 11/50 100/100 [==============================] - 21s 209ms/step - loss: 0.5926 - acc: 0.6850 - val_loss: 0.5518 - val_acc: 0.7049 Epoch 12/50 100/100 [==============================] - 19s 193ms/step - loss: 0.5830 - acc: 0.6844 - val_loss: 0.5418 - val_acc: 0.7234 Epoch 13/50 100/100 [==============================] - 19s 189ms/step - loss: 0.5839 - acc: 0.6903 - val_loss: 0.5382 - val_acc: 0.7354 Epoch 14/50 100/100 [==============================] - 19s 187ms/step - loss: 0.5663 - acc: 0.6944 - val_loss: 0.5891 - val_acc: 0.6662 Epoch 15/50 100/100 
[==============================] - 19s 187ms/step - loss: 0.5620 - acc: 0.7175 - val_loss: 0.5613 - val_acc: 0.6923 Epoch 16/50 100/100 [==============================] - 19s 188ms/step - loss: 0.5458 - acc: 0.7228 - val_loss: 0.4970 - val_acc: 0.7582 Epoch 17/50 100/100 [==============================] - 19s 187ms/step - loss: 0.5478 - acc: 0.7106 - val_loss: 0.5104 - val_acc: 0.7335 Epoch 18/50 100/100 [==============================] - 19s 188ms/step - loss: 0.5479 - acc: 0.7250 - val_loss: 0.4990 - val_acc: 0.7544 Epoch 19/50 100/100 [==============================] - 19s 189ms/step - loss: 0.5390 - acc: 0.7275 - val_loss: 0.4918 - val_acc: 0.7557 Epoch 20/50 100/100 [==============================] - 19s 187ms/step - loss: 0.5391 - acc: 0.7209 - val_loss: 0.4965 - val_acc: 0.7532 Epoch 21/50 100/100 [==============================] - 19s 187ms/step - loss: 0.5379 - acc: 0.7262 - val_loss: 0.4888 - val_acc: 0.7640 Epoch 22/50 100/100 [==============================] - 19s 188ms/step - loss: 0.5168 - acc: 0.7400 - val_loss: 0.5499 - val_acc: 0.7056 Epoch 23/50 100/100 [==============================] - 19s 188ms/step - loss: 0.5250 - acc: 0.7369 - val_loss: 0.4768 - val_acc: 0.7697 Epoch 24/50 100/100 [==============================] - 19s 189ms/step - loss: 0.5088 - acc: 0.7359 - val_loss: 0.4716 - val_acc: 0.7766 Epoch 25/50 100/100 [==============================] - 19s 188ms/step - loss: 0.5218 - acc: 0.7359 - val_loss: 0.4922 - val_acc: 0.7544 Epoch 26/50 100/100 [==============================] - 19s 187ms/step - loss: 0.5143 - acc: 0.7391 - val_loss: 0.4687 - val_acc: 0.7716 Epoch 27/50 100/100 [==============================] - 19s 188ms/step - loss: 0.5111 - acc: 0.7494 - val_loss: 0.4637 - val_acc: 0.7671 Epoch 28/50 100/100 [==============================] - 19s 190ms/step - loss: 0.4974 - acc: 0.7506 - val_loss: 0.4899 - val_acc: 0.7557 Epoch 29/50 100/100 [==============================] - 19s 188ms/step - loss: 0.5136 - acc: 0.7463 - val_loss: 
0.5077 - val_acc: 0.7557 Epoch 30/50 100/100 [==============================] - 19s 190ms/step - loss: 0.5019 - acc: 0.7559 - val_loss: 0.4595 - val_acc: 0.7830 Epoch 31/50 100/100 [==============================] - 19s 188ms/step - loss: 0.4961 - acc: 0.7628 - val_loss: 0.4805 - val_acc: 0.7709 Epoch 32/50 100/100 [==============================] - 19s 190ms/step - loss: 0.4925 - acc: 0.7638 - val_loss: 0.4463 - val_acc: 0.7874 Epoch 33/50 100/100 [==============================] - 19s 189ms/step - loss: 0.4783 - acc: 0.7700 - val_loss: 0.4667 - val_acc: 0.7824 Epoch 34/50 100/100 [==============================] - 19s 190ms/step - loss: 0.4792 - acc: 0.7738 - val_loss: 0.4307 - val_acc: 0.8084 Epoch 35/50 100/100 [==============================] - 19s 190ms/step - loss: 0.4774 - acc: 0.7753 - val_loss: 0.4269 - val_acc: 0.8027 Epoch 36/50 100/100 [==============================] - 19s 191ms/step - loss: 0.4756 - acc: 0.7725 - val_loss: 0.4642 - val_acc: 0.7652 Epoch 37/50 100/100 [==============================] - 19s 190ms/step - loss: 0.4796 - acc: 0.7684 - val_loss: 0.4349 - val_acc: 0.7995 Epoch 38/50 100/100 [==============================] - 19s 190ms/step - loss: 0.4895 - acc: 0.7665 - val_loss: 0.4588 - val_acc: 0.7836 Epoch 39/50 100/100 [==============================] - 19s 190ms/step - loss: 0.4832 - acc: 0.7694 - val_loss: 0.4243 - val_acc: 0.8001 Epoch 40/50 100/100 [==============================] - 19s 191ms/step - loss: 0.4678 - acc: 0.7772 - val_loss: 0.4442 - val_acc: 0.7773 Epoch 41/50 100/100 [==============================] - 19s 188ms/step - loss: 0.4623 - acc: 0.7797 - val_loss: 0.4565 - val_acc: 0.7874 Epoch 42/50 100/100 [==============================] - 19s 190ms/step - loss: 0.4668 - acc: 0.7697 - val_loss: 0.5352 - val_acc: 0.7297 Epoch 43/50 100/100 [==============================] - 19s 191ms/step - loss: 0.4612 - acc: 0.7906 - val_loss: 0.4236 - val_acc: 0.7951 Epoch 44/50 100/100 [==============================] - 19s 189ms/step 
- loss: 0.4598 - acc: 0.7816 - val_loss: 0.4343 - val_acc: 0.7893 Epoch 45/50 100/100 [==============================] - 19s 189ms/step - loss: 0.4553 - acc: 0.7881 - val_loss: 0.4315 - val_acc: 0.7970 Epoch 46/50 100/100 [==============================] - 19s 189ms/step - loss: 0.4621 - acc: 0.7734 - val_loss: 0.4303 - val_acc: 0.8027 Epoch 47/50 100/100 [==============================] - 19s 189ms/step - loss: 0.4516 - acc: 0.7912 - val_loss: 0.4099 - val_acc: 0.8065 Epoch 48/50 100/100 [==============================] - 19s 190ms/step - loss: 0.4524 - acc: 0.7822 - val_loss: 0.4088 - val_acc: 0.8115 Epoch 49/50 100/100 [==============================] - 19s 189ms/step - loss: 0.4508 - acc: 0.7944 - val_loss: 0.4048 - val_acc: 0.8128 Epoch 50/50 100/100 [==============================] - 19s 188ms/step - loss: 0.4368 - acc: 0.7953 - val_loss: 0.4746 - val_acc: 0.7722
Let's save our model; we will be using it in the article on convnet visualization.
model.save('cats_and_dogs_small_2.h5')
Let's plot our results again:
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, label='Training acc')
plt.plot(epochs, val_acc, label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, label='Training loss')
plt.plot(epochs, val_loss, label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
Thanks to data augmentation and dropout, we are no longer overfitting: the training curves closely track the validation curves. We are now able to reach an accuracy of 82%, a 15% relative improvement over the non-regularized model.
By leveraging regularization techniques even further, and by tuning the network's parameters (such as the number of filters per convolution layer, or the number of layers in the network), we could probably get an even better accuracy, likely up to 86-87%. However, as long as we are training our own convnet from scratch, it is very difficult to reach a high accuracy with this little data. As a next step to improve our accuracy on this problem, we will leverage a pre-trained model.
A few interesting personal takeaways from this article:
MIT License
Copyright (c) 2017 François Chollet
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.