Transfer learning is a machine learning technique that takes knowledge gained from solving a problem in one domain and transfers it to another, in order to improve accuracy and reduce training time.
A common application of transfer learning is using the features learned by large convolutional neural networks as additional information that we can transfer to another domain. Models like VGG16 have learned generalized features seen in the 1000 classes of ImageNet, and these features can be used for a more specialized task.
One of the biggest benefits and motivations of transfer learning is to take advantage of small datasets for domains and tasks with little data available.
There are typically two approaches to applying transfer learning in deep learning: feature extraction, in which a pretrained network computes features that a new classifier is trained on, and fine tuning, in which some of the pretrained layers are unfrozen and continue training on the new data. We will use both below.
Applying transfer learning to machine learning originated in a NIPS-95 (Conference on Neural Information Processing Systems) workshop on "Learning to Learn", which focused on machine-learning methods that retain and reuse previously learned knowledge.
Since then, transfer learning has appeared in different contexts under many different names, such as learning to learn, knowledge transfer, inductive transfer, and multitask learning.
Multitask learning is closely related, and tries to learn multiple tasks at once. We will cover it later in the semester.
In 2005, DARPA sought to apply transfer learning with the intent of extracting knowledge learned from a source task and applying it to a target task, without particular regard for performance on the source task. This differed from the goal of multitask learning, in which all tasks are equally important.
Today with the abundance of pretrained models available, transfer learning is very prominent in the fields of machine learning and data mining.
A domain $\mathcal{D}$ consists of a feature space $\mathcal{X}$ and a marginal probability distribution $P(X)$ over the feature space, where $X = \{x_1, \ldots, x_n\} \in \mathcal{X}$.
Given a domain, $\mathcal{D} = \left\{\mathcal{X}, P(X)\right\}$, a task $\mathcal{T}$ consists of a label space $\mathcal{Y}$ and a conditional probability distribution $P(Y|X)$ learned from the training data.
Given a source domain $\mathcal{D}_S$, a corresponding source task $\mathcal{T}_S$, as well as a target domain $\mathcal{D}_T$ and a target task $\mathcal{T}_T$, the objective of transfer learning is to learn the target conditional probability distribution $P(Y_T|X_T)$ in $\mathcal{D_T}$ with information gained from $\mathcal{D}_S$ and $\mathcal{D}_T$ where $\mathcal{D}_S \neq \mathcal{D}_T$ or $\mathcal{T}_S \neq \mathcal{T}_T$.
Early approaches to transfer learning defined three different settings: inductive transfer learning, transductive transfer learning, and unsupervised transfer learning.
Along with these settings are four different approaches that can be used within them: instance transfer, feature-representation transfer, parameter transfer, and relational-knowledge transfer.
The table below shows how these approaches can be used in different settings:
(Sinno Jialin Pan and Qiang Yang, A Survey on Transfer Learning)
A transfer learning example in which the domains are the same, but tasks are different, would be a classifier to detect spam in email. A model could be trained on emails from multiple users. A new email user could then use this model to filter their own messages.
An example in which domains differ but tasks are the same would be a classifier to detect bicycles, but the target domain has very little data (bikes in the wild), and the source domain has a lot of data (bikes in the lab).
(Sun, B., Feng, J., & Saenko, K. (2016). Return of Frustratingly Easy Domain Adaptation)
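The cited CORAL method is simple enough to sketch in a few lines of NumPy: it whitens the source features and then re-colors them with the target covariance, so the second-order statistics of the two domains match. The synthetic "lab"/"wild" data and the helper names below are illustrative assumptions, not taken from the paper; `eps` is the identity-regularization weight (the paper uses 1).

```python
import numpy as np

def coral(source, target, eps=1.0):
    """Align source features to the target covariance (CORAL-style sketch).

    source, target: (n_samples, n_features) arrays.
    """
    cs = np.cov(source, rowvar=False) + eps * np.eye(source.shape[1])
    ct = np.cov(target, rowvar=False) + eps * np.eye(target.shape[1])

    def mat_pow(m, p):
        # matrix power via eigendecomposition (m is symmetric positive definite)
        vals, vecs = np.linalg.eigh(m)
        return vecs @ np.diag(vals ** p) @ vecs.T

    # whiten the source, then re-color with the target covariance
    return source @ mat_pow(cs, -0.5) @ mat_pow(ct, 0.5)

rng = np.random.default_rng(0)
src = rng.normal(size=(500, 4)) * 3.0   # high-variance "lab" features
tgt = rng.normal(size=(500, 4))         # low-variance "wild" features
aligned = coral(src, tgt)
```

After alignment, a classifier trained on `aligned` source features should transfer better to the target domain, since the feature scales now agree.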
Pre-trained CNN features are often used as a form of transfer learning in a variety of tasks such as image classification, object detection, style transfer and generative models. Large CNNs trained on ImageNet learn the underlying structure of images, which is knowledge that can be transferred. Much recent research in computer vision and deep learning uses VGG16 or VGG19 as a base to extract features from the target domain.
Object detectors such as YOLO pretrain on ImageNet, learning from a classification task before the model is trained for object detection.
Johnson et al.'s real-time style transfer uses features extracted from a pretrained VGG network in the loss function used to train the style transformation network; the features serve as statistics measuring the style and content of images.
Another application of a VGG-based loss function is super resolution, where the loss is the difference between the extracted features of the output image and those of the actual high-resolution image.
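As a sketch of how such a feature-based ("perceptual") loss can be written in Keras: the factory below truncates VGG19 at one mid-level layer and compares feature maps instead of raw pixels. The layer choice `block3_conv3` and the helper name `make_perceptual_loss` are illustrative assumptions, not taken from a specific paper.

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG19
from tensorflow.keras.applications.vgg19 import preprocess_input

def make_perceptual_loss(layer_name='block3_conv3', weights='imagenet'):
    """Build a loss that compares VGG19 feature maps instead of raw pixels."""
    vgg = VGG19(weights=weights, include_top=False)
    vgg.trainable = False  # the loss network stays fixed; only the generator trains
    feat = tf.keras.Model(vgg.input, vgg.get_layer(layer_name).output)

    def perceptual_loss(y_true, y_pred):
        # images are expected in 0-255 RGB; preprocess_input applies VGG's scaling
        f_true = feat(preprocess_input(y_true))
        f_pred = feat(preprocess_input(y_pred))
        return tf.reduce_mean(tf.square(f_true - f_pred))

    return perceptual_loss
```

A loss built this way could then be passed to `model.compile(loss=make_perceptual_loss())` when training a super-resolution or style-transformation network.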
Applications in NLP include learning from a large set of labeled reviews, and transferring to a model to analyze reviews for a new product.
Some code adapted from Francois Chollet:
We will create a classifier to detect whether a given image contains a cat or a dog.
This classifier will be trained on features extracted from VGG16. We will then "fine tune" the last convolutional block of VGG, with our classifier on top using data augmentation.
Our training data consists of about 20,000 images (compared to ImageNet's ~14M images), which, as we will see, can still give us impressive accuracy.
# watermark the notebook
import sys
import platform

import numpy as np
import scipy as sp
import tensorflow as tf
from tensorflow.keras import layers
import matplotlib.pyplot as plt

print(f"Python Platform: {platform.platform()}")
print(f"Tensor Flow Version: {tf.__version__}")
print(f"Keras Version: {tf.keras.__version__}")
print()
print(f"Python {sys.version}")
gpus = tf.config.list_physical_devices('GPU')
print("GPU Resources Available:\n\t", gpus)
Python Platform: macOS-14.2-arm64-arm-64bit Tensor Flow Version: 2.12.0 Keras Version: 2.12.0 Python 3.8.16 (default, Mar 1 2023, 21:18:45) [Clang 14.0.6 ] GPU Resources Available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
import os
from tensorflow.io import read_file, write_file
from tensorflow.image import decode_image

should_rewrite_image = True  # set to True if you are getting "Corrupt Data" errors
num_skipped = 0
for folder_name in ("Cat", "Dog"):
    folder_path = os.path.join("data/cats_dogs", folder_name)
    for fname in os.listdir(folder_path):
        fpath = os.path.join(folder_path, fname)
        is_jfif = True
        should_remove = False
        with open(fpath, "rb") as fobj:
            is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)
        try:
            img = read_file(fpath)
            if not tf.io.is_jpeg(img):
                should_remove = True
            img = decode_image(img)
            if img.ndim != 3:
                should_remove = True
        except Exception:
            should_remove = True
        if (not is_jfif) or should_remove:
            num_skipped += 1
            # Delete corrupted image
            os.remove(fpath)
        elif should_rewrite_image:
            # Re-encode and rewrite the (valid) image to clean up stray bytes
            tmp = tf.io.encode_jpeg(img)
            write_file(fpath, tmp)
print("Deleted %d images" % num_skipped)
Metal device set to: Apple M2
Corrupt JPEG data: 1153 extraneous bytes before marker 0xd9 (similar "Corrupt JPEG data" warnings repeated for a number of files) Warning: unknown JFIF revision number 0.00
Deleted 1592 images
The code above eliminated corrupt JPEG files, removing about 1,600 images from the cats/dogs dataset. If you do not run it, you will eventually hit an error when we try to read an image that cannot be properly decoded. It takes a few minutes to run.
Now we will use the image directories to create two tf.data datasets. TensorFlow datasets are convenient data structures that let us rapidly loop through data and batch it for input to a TensorFlow model. They also support pre-processing via mapping functions and can save out various versions of themselves (though we will not really use that mapping functionality here).
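As a toy illustration of the batching just described (and of the mapping functionality we will not otherwise use), here is a minimal tf.data pipeline over a synthetic range dataset standing in for our images:

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10)        # elements 0..9
ds = ds.map(lambda x: x * 2)          # element-wise pre-processing
ds = ds.batch(4)                      # group elements into batches
ds = ds.prefetch(tf.data.AUTOTUNE)    # prepare the next batch while the model trains

batches = [b.numpy().tolist() for b in ds]
print(batches)  # [[0, 2, 4, 6], [8, 10, 12, 14], [16, 18]]
```

The same `map`/`batch`/`prefetch` pattern applies unchanged when the elements are image tensors rather than integers.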
# dimensions of our images.
img_width, img_height = 150, 150
batch_size = 64
data_dir = 'data/cats_dogs/'

train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Found 23410 files belonging to 2 classes. Using 18728 files for training. Found 23410 files belonging to 2 classes. Using 4682 files for validation.
%matplotlib inline

# create two sequential models that simply rescale and augment the data
# By default, these operations will be on the same device
data_rescale = tf.keras.Sequential([
    layers.Rescaling(1./255)
], name='normalizer')

# WARNING: this will not run on an M1 Mac as of January 2023
# Some are supported on newer M1 Macs, but then cause problems
# down the line for training convolutional layers...
# UPDATE: As of 2024, with tf version 2.12, this does work!
# All the augmentations are present and they help to generalize the model.
on_m1_mac = False
if not on_m1_mac:
    data_augmentation = tf.keras.Sequential([
        layers.RandomFlip('horizontal'),  # not supported on M1
        layers.RandomHeight(0.2),         # not supported on M1
        layers.RandomWidth(0.2),          # not supported on M1
        layers.RandomZoom(0.2),           # zoom in randomly, up to 20%, not supported on M1
        layers.RandomRotation(0.2),
    ], name='augmentation')

# show some of the augmented images for a batch
plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
    for i in range(9):
        # we can get the augmented images from the model here:
        augmented_images = data_augmentation(images)
        ax = plt.subplot(3, 3, i + 1)
        # and then convert the data into numpy here:
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")
2024-02-08 11:03:46.159062: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
images = []
titles = []
for image, labels in train_ds.take(4):
    print(image.shape)
    image = [x.numpy().astype("uint8") for x in image]
    images.extend(image)
    labels = ['Dog' if x == 1 else 'Cat' for x in labels]
    titles.extend(labels)
print(f'Found {len(images)} images to plot')
(64, 150, 150, 3) (64, 150, 150, 3) (64, 150, 150, 3) (64, 150, 150, 3) Found 256 images to plot
def plot_gallery(images, titles, n_row=3, n_col=6):
    """Helper function to plot a gallery of images with titles"""
    plt.figure(figsize=(1.7 * n_col, 2.3 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i])
        plt.title(titles[i], size=12)
        plt.xticks(())
        plt.yticks(())

plot_gallery(images, titles, n_row=6, n_col=6)
from tensorflow.keras.applications import VGG16

# build the VGG16 network, leaving off the top classifier layer
# so we just get the features as output
print('Building VGG16...')
input_tensor = layers.Input(shape=(img_width, img_height, 3))

# UPDATE: this does now work with M1/M2 macs
if not on_m1_mac:
    # here is where you can add in augmentation...
    x = data_augmentation(input_tensor)  # add in flips, scale, rotate, and everything else
    x = data_rescale(x)  # 1/255 from above
else:
    x = data_rescale(input_tensor)  # 1/255 from above, just normalize the data in the pipe

base_model = VGG16(weights='imagenet', include_top=False,
                   input_tensor=x)
Building VGG16... Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5 58889256/58889256 [==============================] - 4s 0us/step
%%time
# Save bottleneck features from VGG
def save_bottleneck(ds, filename_addendum):
    bottleneck_features = []
    labels_train = []
    for data, label in ds:
        # loop through and get features and labels as lists
        bottleneck_features.extend(base_model.predict(data, verbose=0))
        labels_train.extend(label)
    # convert to numpy and save
    bottleneck_features = np.array(bottleneck_features)
    labels_train = np.array(labels_train)
    np.save(f'data/bottleneck_features_{filename_addendum}.npy', bottleneck_features)
    np.save(f'data/bottleneck_labels_{filename_addendum}.npy', labels_train)

# Save training features
print('Saving bottleneck features (train)...')
save_bottleneck(train_ds, 'train')

# Save validation features
print('Saving bottleneck features (test)...')
save_bottleneck(val_ds, 'test')

# Submitted fix on GitHub: a handful of corrupted JPEG files were not getting caught;
# these need to be re-written to disk for this code to work.
# If you get an error in this block, a corrupt JPEG is likely to blame.
Saving bottleneck features (train)... Saving bottleneck features (test)... CPU times: user 42.9 s, sys: 15.6 s, total: 58.5 s Wall time: 2min 7s
# Build model
print('Building top model...')
top_model = tf.keras.Sequential(name='transfer_top')
# perform an average pooling here to collapse the features down
# some implementations opt to flatten the output convolutions
# this global avg pooling 2D collapses each filter output to a single value
# therefore each image is represented by 512 features, each the avg filter output
top_model.add(layers.GlobalAveragePooling2D())
# add two fully connected layers and some dropout
top_model.add(layers.Dense(256, activation='relu'))
top_model.add(layers.Dropout(0.5))
top_model.add(layers.Dense(1, activation='sigmoid'))
# compile and add loss function. Cross entropy seems like a good choice
# using the "binary_crossentropy" allows us to represent the output as 0 or 1
top_model.compile(optimizer='adam',
                  loss='binary_crossentropy', metrics=['accuracy'])
Building top model...
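To make the pooling comment above concrete — each 4x4 feature map collapses to a single averaged value, so each image ends up represented by 512 features — here is a small shape check, with random data standing in for real VGG16 outputs:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# random stand-in for a batch of two VGG16 bottleneck outputs
features = tf.random.uniform((2, 4, 4, 512))
pooled = layers.GlobalAveragePooling2D()(features)
print(pooled.shape)  # (2, 512): one averaged value per filter

# each pooled value is the mean over the 4x4 spatial grid of one filter
np.testing.assert_allclose(
    pooled[0, 0].numpy(),
    tf.reduce_mean(features[0, :, :, 0]).numpy(),
    rtol=1e-5)
```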
%%time
# Train model
# Load bottleneck features that have already been extracted by base model
train_data = np.load('data/bottleneck_features_train.npy')
# labels are 0 for cat, 1 for dog
train_labels = np.load('data/bottleneck_labels_train.npy')
print('Training Data Shape: ', train_data.shape, 'Training Label Shape: ', train_labels.shape)

validation_data = np.load('data/bottleneck_features_test.npy')
validation_labels = np.load('data/bottleneck_labels_test.npy')
print('Val Data Shape: ', validation_data.shape, 'Val Label Shape: ', validation_labels.shape)

# setup params and where to save checkpoints
epochs = 30
checkpoint_filepath = 'models/checkpoint'
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='val_accuracy',
    mode='max',
    save_best_only=True)

print('Training transfer model from bottleneck...')
history = top_model.fit(train_data, train_labels,
                        epochs=epochs,
                        batch_size=batch_size,
                        callbacks=[model_checkpoint_callback],
                        validation_data=(validation_data, validation_labels),
                        verbose=1)

# notice that the model gets accurate on the validation set VERY quickly.
# So quickly that we need to be careful not to overtrain.
top_model_weights_path = 'models/cat_dog_fc.h5'
top_model.save_weights(top_model_weights_path)
top_model.save('models/cat_dog_full')
Training Data Shape: (18728, 4, 4, 512) Training Label Shape: (18728,) Val Data Shape: (4682, 4, 4, 512) Val Label Shape: (4682,) Training transfer model from bottleneck... Epoch 1/30
loc("mps_select"("(mpsFileLoc): .../MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x64x1x256xi1>' (this Metal/MPS warning repeats many times during training)
293/293 [==============================] - ETA: 0s - loss: 0.3264 - accuracy: 0.8547
293/293 [==============================] - 5s 10ms/step - loss: 0.3264 - accuracy: 0.8547 - val_loss: 0.2576 - val_accuracy: 0.8872 Epoch 2/30 8/293 [..............................] - ETA: 2s - loss: 0.2362 - accuracy: 0.8965
293/293 [==============================] - 2s 8ms/step - loss: 0.2535 - accuracy: 0.8911 - val_loss: 0.2457 - val_accuracy: 0.8924 Epoch 3/30 293/293 [==============================] - 2s 8ms/step - loss: 0.2379 - accuracy: 0.8980 - val_loss: 0.2377 - val_accuracy: 0.8964 Epoch 4/30 293/293 [==============================] - 2s 7ms/step - loss: 0.2298 - accuracy: 0.9006 - val_loss: 0.2320 - val_accuracy: 0.9028 Epoch 5/30 293/293 [==============================] - 2s 7ms/step - loss: 0.2227 - accuracy: 0.9056 - val_loss: 0.2274 - val_accuracy: 0.9043 Epoch 6/30 293/293 [==============================] - 2s 7ms/step - loss: 0.2158 - accuracy: 0.9075 - val_loss: 0.2264 - val_accuracy: 0.9058 Epoch 7/30 293/293 [==============================] - 2s 7ms/step - loss: 0.2167 - accuracy: 0.9077 - val_loss: 0.2241 - val_accuracy: 0.9045 Epoch 8/30 293/293 [==============================] - 2s 7ms/step - loss: 0.2126 - accuracy: 0.9083 - val_loss: 0.2231 - val_accuracy: 0.9067 Epoch 9/30 293/293 [==============================] - 2s 7ms/step - loss: 0.2128 - accuracy: 0.9086 - val_loss: 0.2182 - val_accuracy: 0.9067 Epoch 10/30 293/293 [==============================] - 2s 7ms/step - loss: 0.2013 - accuracy: 0.9132 - val_loss: 0.2179 - val_accuracy: 0.9077 Epoch 11/30 293/293 [==============================] - 2s 7ms/step - loss: 0.2011 - accuracy: 0.9144 - val_loss: 0.2165 - val_accuracy: 0.9094 Epoch 12/30 293/293 [==============================] - 2s 7ms/step - loss: 0.1949 - accuracy: 0.9165 - val_loss: 0.2160 - val_accuracy: 0.9088 Epoch 13/30 293/293 [==============================] - 2s 6ms/step - loss: 0.1931 - accuracy: 0.9197 - val_loss: 0.2179 - val_accuracy: 0.9077 Epoch 14/30 293/293 [==============================] - 2s 6ms/step - loss: 0.1934 - accuracy: 0.9179 - val_loss: 0.2130 - val_accuracy: 0.9073 Epoch 15/30 293/293 [==============================] - 2s 6ms/step - loss: 0.1867 - accuracy: 0.9215 - val_loss: 0.2161 - val_accuracy: 0.9077 Epoch 16/30 
293/293 [==============================] - 2s 7ms/step - loss: 0.1856 - accuracy: 0.9213 - val_loss: 0.2086 - val_accuracy: 0.9090 Epoch 17/30 293/293 [==============================] - 2s 7ms/step - loss: 0.1799 - accuracy: 0.9248 - val_loss: 0.2187 - val_accuracy: 0.9088 Epoch 18/30 293/293 [==============================] - 2s 6ms/step - loss: 0.1773 - accuracy: 0.9243 - val_loss: 0.2065 - val_accuracy: 0.9131 Epoch 19/30 293/293 [==============================] - 2s 7ms/step - loss: 0.1747 - accuracy: 0.9262 - val_loss: 0.2126 - val_accuracy: 0.9090 Epoch 20/30 293/293 [==============================] - 2s 7ms/step - loss: 0.1728 - accuracy: 0.9276 - val_loss: 0.2057 - val_accuracy: 0.9135 Epoch 21/30 293/293 [==============================] - 2s 6ms/step - loss: 0.1703 - accuracy: 0.9295 - val_loss: 0.2156 - val_accuracy: 0.9122 Epoch 22/30 293/293 [==============================] - 2s 7ms/step - loss: 0.1656 - accuracy: 0.9307 - val_loss: 0.2101 - val_accuracy: 0.9137 Epoch 23/30 293/293 [==============================] - 2s 7ms/step - loss: 0.1605 - accuracy: 0.9339 - val_loss: 0.2086 - val_accuracy: 0.9165 Epoch 24/30 293/293 [==============================] - 2s 6ms/step - loss: 0.1612 - accuracy: 0.9323 - val_loss: 0.2129 - val_accuracy: 0.9111 Epoch 25/30 293/293 [==============================] - 2s 6ms/step - loss: 0.1580 - accuracy: 0.9339 - val_loss: 0.2080 - val_accuracy: 0.9152 Epoch 26/30 293/293 [==============================] - 2s 6ms/step - loss: 0.1571 - accuracy: 0.9352 - val_loss: 0.2170 - val_accuracy: 0.9129 Epoch 27/30 293/293 [==============================] - 2s 7ms/step - loss: 0.1511 - accuracy: 0.9381 - val_loss: 0.2081 - val_accuracy: 0.9169 Epoch 28/30 293/293 [==============================] - 2s 6ms/step - loss: 0.1499 - accuracy: 0.9382 - val_loss: 0.2073 - val_accuracy: 0.9148 Epoch 29/30 293/293 [==============================] - 2s 7ms/step - loss: 0.1461 - accuracy: 0.9392 - val_loss: 0.2072 - val_accuracy: 0.9180 Epoch 
30/30 293/293 [==============================] - 2s 7ms/step - loss: 0.1437 - accuracy: 0.9403 - val_loss: 0.2105 - val_accuracy: 0.9148 INFO:tensorflow:Assets written to: models/cat_dog_full/assets CPU times: user 48.1 s, sys: 19.6 s, total: 1min 7s Wall time: 1min 3s
# Plot training and validation accuracy
def plot_training_validation_acc(history, smooth=False, smooth_factor=0.8):
    def smooth_curve(points, factor=0.8):
        smoothed_points = []
        for point in points:
            if smoothed_points:
                previous = smoothed_points[-1]
                smoothed_points.append(previous * factor + point * (1 - factor))
            else:
                smoothed_points.append(point)
        return smoothed_points

    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    if smooth:
        acc = smooth_curve(acc, smooth_factor)
        val_acc = smooth_curve(val_acc, smooth_factor)
        loss = smooth_curve(loss, smooth_factor)
        val_loss = smooth_curve(val_loss, smooth_factor)

    epochs = range(len(acc))
    plt.plot(epochs, acc, 'bo', label='Training acc')
    plt.plot(epochs, val_acc, 'b', label='Validation acc')
    plt.title('Training and validation accuracy')
    plt.legend()
    plt.figure()
    plt.plot(epochs, loss, 'bo', label='Training loss')
    plt.plot(epochs, val_loss, 'b', label='Validation loss')
    plt.title('Training and validation loss')
    plt.legend()
    plt.show()

plot_training_validation_acc(history)
We get pretty good validation accuracy, especially with such a small dataset, but there seems to be some overfitting.
To improve our classifier, we can place it on top of VGG, "unfreeze" the last convolutional block of VGG, then continue training with augmented samples from our dataset.
This will "fine tune" the last block in VGG, tweaking it to our domain, as well as continuing to train the classifier.
It is necessary to pre-train the classifier. If we were to place randomly initialized layers on top, large gradient updates would wreck the learned weights in the block we are fine tuning.
top_model.summary()
Model: "transfer_top" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= global_average_pooling2d (G (None, 512) 0 lobalAveragePooling2D) dense (Dense) (None, 256) 131328 dropout (Dropout) (None, 256) 0 dense_1 (Dense) (None, 1) 257 ================================================================= Total params: 131,585 Trainable params: 131,585 Non-trainable params: 0 _________________________________________________________________
512*256+256
131328
%%time
# Fine tune the top convolutional block, with augmentation
print('Building combined model...')

# note that it is necessary to start with a fully-trained
# classifier in order to successfully do fine-tuning.
# Optionally load the previously trained weights if resuming from a prior run:
# top_model.load_weights(top_model_weights_path)
# top_model = tf.keras.models.load_model('models/cat_dog_full')

# add the classifier on top of the convolutional base
model = tf.keras.Model(inputs=base_model.input,
                       outputs=top_model(base_model.output))

# now let's fine tune the last block within VGG.
# Freeze all blocks before block5 (the block we are fine tuning)
set_trainable = False
for layer in base_model.layers:
    # go through each layer sequentially
    # and start training at block 5
    if layer.name == 'block5_conv1':
        set_trainable = True
    # all layers from block5_conv1 onward will be trainable
    layer.trainable = set_trainable

# compile the model with an SGD/momentum optimizer
# and a very small learning rate.
model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9),
              metrics=['accuracy'])
model.summary()
Building combined model... Model: "model" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_1 (InputLayer) [(None, 150, 150, 3)] 0 normalizer (Sequential) (None, 150, 150, 3) 0 block1_conv1 (Conv2D) (None, 150, 150, 64) 1792 block1_conv2 (Conv2D) (None, 150, 150, 64) 36928 block1_pool (MaxPooling2D) (None, 75, 75, 64) 0 block2_conv1 (Conv2D) (None, 75, 75, 128) 73856 block2_conv2 (Conv2D) (None, 75, 75, 128) 147584 block2_pool (MaxPooling2D) (None, 37, 37, 128) 0 block3_conv1 (Conv2D) (None, 37, 37, 256) 295168 block3_conv2 (Conv2D) (None, 37, 37, 256) 590080 block3_conv3 (Conv2D) (None, 37, 37, 256) 590080 block3_pool (MaxPooling2D) (None, 18, 18, 256) 0 block4_conv1 (Conv2D) (None, 18, 18, 512) 1180160 block4_conv2 (Conv2D) (None, 18, 18, 512) 2359808 block4_conv3 (Conv2D) (None, 18, 18, 512) 2359808 block4_pool (MaxPooling2D) (None, 9, 9, 512) 0 block5_conv1 (Conv2D) (None, 9, 9, 512) 2359808 block5_conv2 (Conv2D) (None, 9, 9, 512) 2359808 block5_conv3 (Conv2D) (None, 9, 9, 512) 2359808 block5_pool (MaxPooling2D) (None, 4, 4, 512) 0 transfer_top (Sequential) (None, 1) 131585 ================================================================= Total params: 14,846,273 Trainable params: 7,211,009 Non-trainable params: 7,635,264 _________________________________________________________________ CPU times: user 31.3 ms, sys: 6.15 ms, total: 37.4 ms Wall time: 37.3 ms
epochs = 30
checkpoint_filepath = 'models/checkpoint'
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='val_accuracy',
    mode='max',
    save_best_only=True)
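With `save_best_only=True`, the callback overwrites the checkpoint only when the monitored metric improves, so at the end of training the file holds the weights from the best validation epoch rather than the last one. The bookkeeping amounts to the following sketch (the `BestOnlyTracker` class is a hypothetical stand-in; the real callback writes weight files to disk instead of recording epoch numbers):

```python
# Minimal sketch of ModelCheckpoint's save_best_only logic.
class BestOnlyTracker:
    """Track the best monitored value; 'save' only when it improves."""

    def __init__(self, mode='max'):
        self.mode = mode
        self.best = float('-inf') if mode == 'max' else float('inf')
        self.saved_epochs = []

    def on_epoch_end(self, epoch, monitored):
        improved = (monitored > self.best if self.mode == 'max'
                    else monitored < self.best)
        if improved:
            self.best = monitored
            self.saved_epochs.append(epoch)  # real callback saves weights here

# val_accuracy from the first four epochs of the run below
tracker = BestOnlyTracker(mode='max')
for epoch, val_acc in enumerate([0.9288, 0.9322, 0.9318, 0.9369], start=1):
    tracker.on_epoch_end(epoch, val_acc)
print(tracker.saved_epochs)  # → [1, 2, 4]: epoch 3 did not improve, no save
```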
print('Fine tuning combined model...')
history = model.fit(train_ds,
                    epochs=epochs,
                    batch_size=batch_size,
                    callbacks=[model_checkpoint_callback],
                    validation_data=val_ds,
                    verbose=1)
Fine tuning combined model...
2023-01-24 23:25:27.155283: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
Epoch 1/30  - 48s - loss: 0.1230 - accuracy: 0.9517 - val_loss: 0.1838 - val_accuracy: 0.9288
Epoch 2/30  - 47s - loss: 0.0951 - accuracy: 0.9630 - val_loss: 0.1787 - val_accuracy: 0.9322
Epoch 3/30  - 47s - loss: 0.0744 - accuracy: 0.9709 - val_loss: 0.1967 - val_accuracy: 0.9318
Epoch 4/30  - 48s - loss: 0.0654 - accuracy: 0.9742 - val_loss: 0.1741 - val_accuracy: 0.9369
Epoch 5/30  - 47s - loss: 0.0550 - accuracy: 0.9806 - val_loss: 0.1717 - val_accuracy: 0.9386
Epoch 6/30  - 47s - loss: 0.0472 - accuracy: 0.9839 - val_loss: 0.1698 - val_accuracy: 0.9384
Epoch 7/30  - 48s - loss: 0.0393 - accuracy: 0.9870 - val_loss: 0.1712 - val_accuracy: 0.9388
Epoch 8/30  - 47s - loss: 0.0351 - accuracy: 0.9886 - val_loss: 0.1907 - val_accuracy: 0.9365
Epoch 9/30  - 48s - loss: 0.0295 - accuracy: 0.9911 - val_loss: 0.1798 - val_accuracy: 0.9421
Epoch 10/30 - 48s - loss: 0.0269 - accuracy: 0.9920 - val_loss: 0.1830 - val_accuracy: 0.9425
Epoch 11/30 - 48s - loss: 0.0247 - accuracy: 0.9929 - val_loss: 0.1772 - val_accuracy: 0.9446
Epoch 12/30 - 47s - loss: 0.0213 - accuracy: 0.9936 - val_loss: 0.1766 - val_accuracy: 0.9446
Epoch 13/30 - 48s - loss: 0.0189 - accuracy: 0.9943 - val_loss: 0.1850 - val_accuracy: 0.9474
Epoch 14/30 - 47s - loss: 0.0177 - accuracy: 0.9955 - val_loss: 0.1908 - val_accuracy: 0.9425
Epoch 15/30 - 47s - loss: 0.0143 - accuracy: 0.9968 - val_loss: 0.1900 - val_accuracy: 0.9444
Epoch 16/30 - 47s - loss: 0.0137 - accuracy: 0.9973 - val_loss: 0.1960 - val_accuracy: 0.9446
Epoch 17/30 - 47s - loss: 0.0127 - accuracy: 0.9969 - val_loss: 0.1895 - val_accuracy: 0.9457
Epoch 18/30 - 47s - loss: 0.0109 - accuracy: 0.9973 - val_loss: 0.1936 - val_accuracy: 0.9448
Epoch 19/30 - 47s - loss: 0.0089 - accuracy: 0.9984 - val_loss: 0.1973 - val_accuracy: 0.9472
Epoch 20/30 - 47s - loss: 0.0103 - accuracy: 0.9975 - val_loss: 0.1964 - val_accuracy: 0.9470
Epoch 21/30 - 47s - loss: 0.0091 - accuracy: 0.9984 - val_loss: 0.1894 - val_accuracy: 0.9461
Epoch 22/30 - 47s - loss: 0.0083 - accuracy: 0.9985 - val_loss: 0.2005 - val_accuracy: 0.9463
Epoch 23/30 - 48s - loss: 0.0078 - accuracy: 0.9987 - val_loss: 0.1952 - val_accuracy: 0.9483
Epoch 24/30 - 47s - loss: 0.0085 - accuracy: 0.9983 - val_loss: 0.2053 - val_accuracy: 0.9451
Epoch 25/30 - 48s - loss: 0.0066 - accuracy: 0.9988 - val_loss: 0.2022 - val_accuracy: 0.9489
Epoch 26/30 - 47s - loss: 0.0068 - accuracy: 0.9986 - val_loss: 0.2046 - val_accuracy: 0.9470
Epoch 27/30 - 48s - loss: 0.0063 - accuracy: 0.9991 - val_loss: 0.2008 - val_accuracy: 0.9493
Epoch 28/30 - 47s - loss: 0.0062 - accuracy: 0.9987 - val_loss: 0.2009 - val_accuracy: 0.9478
Epoch 29/30 - 47s - loss: 0.0057 - accuracy: 0.9990 - val_loss: 0.2033 - val_accuracy: 0.9478
Epoch 30/30 - 47s - loss: 0.0043 - accuracy: 0.9994 - val_loss: 0.2256 - val_accuracy: 0.9465
plot_training_validation_acc(history, smooth=True)
# note: we evaluate on the validation split, which serves as our held-out test set
test_loss, test_acc = model.evaluate(val_ds)
print('test acc:', test_acc)
74/74 [==============================] - 8s 109ms/step - loss: 0.2256 - accuracy: 0.9465 test acc: 0.9465469121932983
We get an increase in validation accuracy, and there seems to be less overfitting now.
We can plot selected images from the validation set and see what the classifier predicted (for misclassified images, the true label is shown in parentheses).
images = []
titles = []
for image, labels in val_ds.take(8):
    y_hat = model.predict(image)
    image = [x.numpy().astype("uint8") for x in image]
    images.extend(image)
    labels = ['Dog' if x == 1 else 'Cat' for x in labels]
    predictions = ['Dog' if x > 0.5 else 'Cat' for x in y_hat]
    # raw f-string so that \h in $\hat Y$ is not treated as an escape
    tmp = [rf'Y:({x}) $\hat Y$:{y}' if x != y else f'Match: {x}'
           for x, y in zip(labels, predictions)]
    titles.extend(tmp)
plot_gallery(images, titles, n_row=12, n_col=6)
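Because the classifier head ends in a single sigmoid unit trained with binary cross-entropy, each prediction is an estimated probability that the image is class 1 (a dog), and the loop above thresholds it at 0.5. The mapping in isolation (the `to_labels` helper is an illustrative name, not from the notebook):

```python
# Convert sigmoid outputs to class names, as done in the loop above.
def to_labels(probs, threshold=0.5):
    """Map each P(class == 'Dog') estimate to a 'Dog'/'Cat' string."""
    return ['Dog' if p > threshold else 'Cat' for p in probs]

print(to_labels([0.97, 0.12, 0.51, 0.49]))  # → ['Dog', 'Cat', 'Dog', 'Cat']
```

A different threshold could be chosen to trade precision against recall, but 0.5 is the natural choice when the two classes are balanced.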
2/2 [==============================] - 0s 157ms/step
2/2 [==============================] - 0s 55ms/step
2/2 [==============================] - 0s 55ms/step
2/2 [==============================] - 0s 55ms/step
2/2 [==============================] - 0s 55ms/step
2/2 [==============================] - 0s 54ms/step
2/2 [==============================] - 0s 54ms/step
2/2 [==============================] - 0s 56ms/step
By fine-tuning the last convolutional block, we were able to achieve a slight increase in accuracy. If we had instead used data augmentation from the beginning, training the classifier directly on top of VGG16 rather than precomputing and saving out the features, training would take longer, but we could get a further slight increase in accuracy.
import IPython
url = 'https://storage.googleapis.com/tfjs-examples/webcam-transfer-learning/dist/index.html'
IPython.display.IFrame(url, width=900, height=700)
import IPython
url = 'https://storage.googleapis.com/tfjs-examples/webcam-transfer-learning/dist/index.html'
iframe = '<iframe src="' + url + '" width=900 height=700></iframe>'
IPython.display.HTML(iframe)
/Users/eclarson/opt/anaconda3/envs/mlenv2023/lib/python3.9/site-packages/IPython/core/display.py:431: UserWarning: Consider using IPython.display.IFrame instead warnings.warn("Consider using IPython.display.IFrame instead")
Info:
https://machinelearningmastery.com/transfer-learning-for-deep-learning/
http://ruder.io/transfer-learning
https://www.cse.ust.hk/~qyang/Docs/2009/tkde_transfer_learning.pdf
Image Graph: Suzuki, Masahiro, Sato, Haruhiko, Oyama, Satoshi, & Kurihara, Masahito (2014). Transfer learning based on the observation probability of each attribute.
Pacman: https://github.com/tensorflow/tfjs-examples/tree/master/webcam-transfer-learning