Typical run times:
import re, sys
if 'google.colab' in sys.modules: # Colab-only Tensorflow version selector
%tensorflow_version 2.x
import tensorflow as tf
import numpy as np
from matplotlib import pyplot as plt
print("Tensorflow version " + tf.__version__)
Tensorflow version 2.1.0-dev20191029
TPUClusterResolver() automatically detects a connected TPU on all of Google's platforms: Colaboratory, AI Platform (ML Engine), Kubernetes and Deep Learning VMs, provided the TPU_NAME environment variable is set on the VM.
# Detect the available hardware and pick the matching distribution strategy.
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()  # TPU detection
    print('Running on TPU ', tpu.cluster_spec().as_dict()['worker'])
except ValueError:
    # No TPU found (TPUClusterResolver raises ValueError when detection fails).
    tpu = None

if tpu:
    tf.config.experimental_connect_to_cluster(tpu)
    tf.tpu.experimental.initialize_tpu_system(tpu)
    strategy = tf.distribute.experimental.TPUStrategy(tpu)
else:
    # Default strategy: runs on CPU or a single GPU.
    strategy = tf.distribute.get_strategy()

print("REPLICAS: ", strategy.num_replicas_in_sync)
Running on TPU ['192.168.30.2:8470'] INFO:tensorflow:Initializing the TPU system: martin-tpu-nightly
INFO:tensorflow:Initializing the TPU system: martin-tpu-nightly
INFO:tensorflow:Clearing out eager caches
INFO:tensorflow:Clearing out eager caches
INFO:tensorflow:Finished initializing TPU system.
INFO:tensorflow:Finished initializing TPU system.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 0, 0)
REPLICAS: 8
EPOCHS = 60

# Hyperparameters tuned per hardware configuration. The learning-rate values
# feed the lrfn() schedule defined below.
if strategy.num_replicas_in_sync == 1:  # single GPU
    # Achieves 80% accuracy on a GPU (final loss 0.45).
    BATCH_SIZE = 16
    VALIDATION_BATCH_SIZE = 16
    start_lr = 0.01
    max_lr = 0.01
    min_lr = 0.01
    rampup_epochs = 0
    sustain_epochs = 0
    exp_decay = 1
elif strategy.num_replicas_in_sync == 8:  # single TPU
    # Achieves 80% accuracy on a TPU v3-8 (final loss 0.44).
    BATCH_SIZE = 32 * strategy.num_replicas_in_sync  # global batch size
    VALIDATION_BATCH_SIZE = 256
    start_lr = 0.01
    max_lr = 0.01 * strategy.num_replicas_in_sync  # scale LR with replica count
    min_lr = 0.001
    rampup_epochs = 0
    sustain_epochs = 13
    exp_decay = .95
else:  # TPU pod
    # Achieves 80% accuracy on a TPU v2-32 pod (final loss 0.54).
    BATCH_SIZE = 16 * strategy.num_replicas_in_sync  # global batch size
    VALIDATION_BATCH_SIZE = 256
    start_lr = 0.06
    max_lr = 0.012 * strategy.num_replicas_in_sync
    min_lr = 0.01
    rampup_epochs = 5
    sustain_epochs = 8
    exp_decay = 0.95

CLASSES = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']  # do not change, maps to the labels in the data (folder names)
IMAGE_SIZE = [331, 331]  # supported images sizes: 192x192, 331x331, 512,512
# make sure you load the appropriate dataset on the next line
#GCS_PATTERN = 'gs://flowers-public/tfrecords-jpeg-192x192-2/*.tfrec'
GCS_PATTERN = 'gs://flowers-public/tfrecords-jpeg-331x331/*.tfrec'
#GCS_PATTERN = 'gs://flowers-public/tfrecords-jpeg-512x512/*.tfrec'
VALIDATION_SPLIT = 0.19
# Learning rate schedule
def lrfn(epoch, *, start=None, maximum=None, minimum=None, rampup=None,
         sustain=None, decay=None):
    """Piecewise learning-rate schedule for one epoch.

    Three phases: linear ramp from `start` to `maximum` over `rampup` epochs,
    hold at `maximum` for `sustain` epochs, then exponential decay toward
    `minimum` with per-epoch factor `decay`.

    All keyword overrides default to the module-level hyperparameters
    (start_lr, max_lr, min_lr, rampup_epochs, sustain_epochs, exp_decay),
    so existing `lrfn(epoch)` callers behave exactly as before.
    The pointless inner closure of the original has been flattened.
    """
    start = start_lr if start is None else start
    maximum = max_lr if maximum is None else maximum
    minimum = min_lr if minimum is None else minimum
    rampup = rampup_epochs if rampup is None else rampup
    sustain = sustain_epochs if sustain is None else sustain
    decay = exp_decay if decay is None else decay

    if epoch < rampup:
        # Linear warm-up.
        return (maximum - start) / rampup * epoch + start
    if epoch < rampup + sustain:
        # Plateau at the peak rate.
        return maximum
    # Exponential decay toward (but never below) the minimum rate.
    return (maximum - minimum) * decay ** (epoch - rampup - sustain) + minimum
# lrfn already has the (epoch) -> lr signature LearningRateScheduler expects,
# so the redundant `lambda epoch: lrfn(epoch)` wrapper is dropped.
lr_callback = tf.keras.callbacks.LearningRateScheduler(lrfn, verbose=True)

# Preview the schedule. The original computed the schedule twice
# (once into `y`, once again inside plt.plot); reuse `y` instead.
print("Learning rate schedule:")
rng = list(range(EPOCHS))
y = [lrfn(x) for x in rng]
plt.plot(rng, y)
plt.show()
Learning rate schedule:
#@title display utilities [RUN ME]
def dataset_to_numpy_util(dataset, N):
    """Extract one batch of N (images, labels) from a tf.data.Dataset as numpy arrays."""
    batched = dataset.batch(N)
    if tf.executing_eagerly():
        # Eager mode: the dataset is directly iterable; take the first batch.
        for images, labels in batched:
            numpy_images = images.numpy()
            numpy_labels = labels.numpy()
            break
    else:
        # Graph mode: build the op that yields the next item and
        # evaluate it inside a tf.Session.
        next_item_op = batched.make_one_shot_iterator().get_next()
        with tf.Session() as session:
            numpy_images, numpy_labels = session.run(next_item_op)
    return numpy_images, numpy_labels
def title_from_label_and_target(label, correct_label, classes=None):
    """Build a display title comparing a one-hot prediction to the true label.

    Returns (title, correct) where `correct` is True when the argmax of
    `label` matches the argmax of `correct_label`. Fixes the original's
    runtime-string typo 'shoud be' -> 'should be'. `classes` defaults to the
    module-level CLASSES list (backward compatible).
    """
    classes = CLASSES if classes is None else classes
    label_idx = np.argmax(label, axis=-1)            # one-hot to class number
    correct_idx = np.argmax(correct_label, axis=-1)  # one-hot to class number
    correct = (label_idx == correct_idx)
    title = "{} [{}{}{}]".format(classes[label_idx], str(correct),
                                 ', should be ' if not correct else '',
                                 classes[correct_idx] if not correct else '')
    return title, correct
def display_one_flower(image, title, subplot, red=False):
    """Draw one image into the given subplot cell; return the next subplot index."""
    plt.subplot(subplot)
    plt.axis('off')
    plt.imshow(image)
    # Red title marks a wrong prediction.
    title_color = 'red' if red else 'black'
    plt.title(title, fontsize=16, color=title_color)
    return subplot + 1
def display_9_images_from_dataset(dataset):
    """Show a 3x3 grid of dataset images titled with their class names."""
    plt.figure(figsize=(13, 13))
    images, labels = dataset_to_numpy_util(dataset, 9)
    subplot = 331  # 3 rows x 3 cols, starting at cell 1
    for i, image in enumerate(images):
        class_name = CLASSES[np.argmax(labels[i], axis=-1)]
        subplot = display_one_flower(image, class_name, subplot)
        if i >= 8:
            break
    plt.tight_layout()
    plt.subplots_adjust(wspace=0.1, hspace=0.1)
    plt.show()
def display_9_images_with_predictions(images, predictions, labels):
    """Show a 3x3 grid of images; wrong predictions get a red title."""
    plt.figure(figsize=(13, 13))
    subplot = 331  # 3 rows x 3 cols, starting at cell 1
    for i, image in enumerate(images):
        title, correct = title_from_label_and_target(predictions[i], labels[i])
        subplot = display_one_flower(image, title, subplot, red=not correct)
        if i >= 8:
            break
    plt.tight_layout()
    plt.subplots_adjust(wspace=0.1, hspace=0.1)
    plt.show()
def display_training_curves(training, validation, title, subplot):
    """Plot train/validation curves for one metric into a shared figure.

    `subplot` is a 3-digit matplotlib code (rows, cols, index); the figure is
    created on the first call, i.e. when the index digit is 1.
    """
    if subplot % 10 == 1:
        # First call: set up the shared figure for all subplots.
        plt.subplots(figsize=(10, 10), facecolor='#F0F0F0')
        plt.tight_layout()
    ax = plt.subplot(subplot)
    ax.set_facecolor('#F8F8F8')
    ax.plot(training)
    ax.plot(validation)
    ax.set_title('model ' + title)
    ax.set_ylabel(title)
    # ax.set_ylim(0.28, 1.05)
    ax.set_xlabel('epoch')
    ax.legend(['train', 'valid.'])
# Let tf.data choose parallelism and prefetch buffer sizes dynamically.
AUTOTUNE = tf.data.experimental.AUTOTUNE
def count_data_items(filenames):
    """Total number of examples across the given TFRecord files.

    Trick: each file's item count is encoded in its name, e.g.
    flowers00-230.tfrec holds 230 items. The pattern is compiled once per
    call (the original recompiled it for every filename), and uses [0-9]+
    instead of [0-9]* so an empty match can't reach int('') and crash.
    """
    pattern = re.compile(r"-([0-9]+)\.")
    counts = [int(pattern.search(filename).group(1)) for filename in filenames]
    return np.sum(counts)
def data_augment(image, one_hot_class):
    """Training-time augmentation: random horizontal flip plus saturation jitter."""
    flipped = tf.image.random_flip_left_right(image)
    recolored = tf.image.random_saturation(flipped, 0, 2)
    return recolored, one_hot_class
def read_tfrecord(example):
    """Parse one serialized tf.Example into an (image, one_hot_class) pair.

    The image is decoded from JPEG, scaled to floats in [0, 1], and reshaped
    to IMAGE_SIZE x 3 so the static tensor shape is known to TensorFlow.
    Fix: the original also computed an int32 `class_label` from the 'class'
    feature but never used or returned it; that dead local is removed.
    """
    features = {
        "image": tf.io.FixedLenFeature([], tf.string),    # tf.string means bytestring
        "class": tf.io.FixedLenFeature([], tf.int64),     # shape [] means scalar
        "one_hot_class": tf.io.VarLenFeature(tf.float32), # sparse one-hot label
    }
    example = tf.io.parse_single_example(example, features)
    image = tf.image.decode_jpeg(example['image'], channels=3)
    image = tf.cast(image, tf.float32) / 255.0  # convert image to floats in [0, 1] range
    image = tf.reshape(image, [*IMAGE_SIZE, 3])  # force a known static shape
    one_hot_class = tf.sparse.to_dense(example['one_hot_class'])
    one_hot_class = tf.reshape(one_hot_class, [5])  # 5 flower classes
    return image, one_hot_class
def load_dataset(filenames):
    """Read TFRecord files into a parsed (image, one_hot_class) dataset.

    For optimal performance, read from multiple TFRecord files at once and
    set experimental_deterministic = False to allow order-altering
    optimizations.

    Bug fix: the original built a from_tensor_slices dataset carrying the
    options and then immediately overwrote it with a fresh TFRecordDataset,
    so the non-deterministic-read option was silently discarded. The options
    are now applied to the dataset that is actually used.
    """
    opt = tf.data.Options()
    opt.experimental_deterministic = False
    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=16)  # can be AUTOTUNE in TF 2.1
    dataset = dataset.with_options(opt)
    dataset = dataset.map(read_tfrecord, num_parallel_calls=AUTOTUNE)
    return dataset
def batch_dataset(filenames, batch_size, train):
    """Batch the dataset for training or validation.

    Training: repeat indefinitely, augment, and shuffle.
    Validation: optionally switch auto-sharding from FILE to DATA.
    Returns (dataset, steps) where steps = item_count // batch_size.
    """
    ds = load_dataset(filenames)
    item_count = count_data_items(filenames)
    if train:
        ds = ds.repeat()  # training dataset must repeat
        ds = ds.map(data_augment, num_parallel_calls=AUTOTUNE)
        ds = ds.shuffle(2048)
    else:
        # Usually fewer validation files than workers, so disable FILE
        # auto-sharding on validation. Not useful without sharding
        # (i.e. a single replica) but not harmful either.
        if strategy.num_replicas_in_sync > 1:
            options = tf.data.Options()
            options.experimental_distribute.auto_shard_policy = tf.data.experimental.AutoShardPolicy.DATA
            ds = ds.with_options(options)
        # Validation needs no repeat, shuffle, or augmentation.
    ds = ds.batch(batch_size)
    ds = ds.prefetch(AUTOTUNE)  # overlap input pipeline with training
    return ds, item_count // batch_size
def get_training_dataset(filenames):
    """Repeating, augmented, shuffled dataset batched at the global BATCH_SIZE."""
    return batch_dataset(filenames, BATCH_SIZE, train=True)
def get_validation_dataset(filenames):
    """Non-repeating evaluation dataset batched at VALIDATION_BATCH_SIZE."""
    return batch_dataset(filenames, VALIDATION_BATCH_SIZE, train=False)
# Instantiate the datasets: list the TFRecord files on GCS and split them
# into training and validation shards.
filenames = tf.io.gfile.glob(GCS_PATTERN)
split = len(filenames) - int(len(filenames) * VALIDATION_SPLIT)
train_filenames = filenames[:split]
valid_filenames = filenames[split:]

training_dataset, steps_per_epoch = get_training_dataset(train_filenames)
validation_dataset, validation_steps = get_validation_dataset(valid_filenames)
print("TRAINING IMAGES: ", count_data_items(train_filenames), ", STEPS PER EPOCH: ", steps_per_epoch)
print("VALIDATION IMAGES: ", count_data_items(valid_filenames), ", STEPS PER EPOCH: ", validation_steps)

# A fixed numpy batch used later to test predictions.
some_flowers, some_labels = dataset_to_numpy_util(load_dataset(valid_filenames), 160)
TRAINING IMAGES: 2990 , STEPS PER EPOCH: 11 VALIDATION IMAGES: 680 , STEPS PER EPOCH: 2
# Sanity-check the training data by displaying 9 sample images with labels.
display_9_images_from_dataset(load_dataset(train_filenames))
def create_model():
    """Build a small SqueezeNet-style convnet over IMAGE_SIZE RGB inputs,
    ending in a 5-way softmax (one output per flower class)."""
    # With only a handful of batches per epoch, the batch-norm running
    # average period must be lowered.
    bn_momentum = 0.9

    def fire(x, squeeze, expand):
        # Squeeze: 1x1 bottleneck conv + BN + ReLU.
        s = tf.keras.layers.Conv2D(filters=squeeze, kernel_size=1, activation=None, padding='same', use_bias=False)(x)
        s = tf.keras.layers.BatchNormalization(momentum=bn_momentum, scale=False, center=True)(s)
        s = tf.keras.layers.Activation('relu')(s)
        # Expand: parallel 1x1 and 3x3 branches, concatenated on channels.
        e1 = tf.keras.layers.Conv2D(filters=expand // 2, kernel_size=1, activation=None, padding='same', use_bias=False)(s)
        e1 = tf.keras.layers.BatchNormalization(momentum=bn_momentum, scale=False, center=True)(e1)
        e1 = tf.keras.layers.Activation('relu')(e1)
        e3 = tf.keras.layers.Conv2D(filters=expand // 2, kernel_size=3, activation=None, padding='same', use_bias=False)(s)
        e3 = tf.keras.layers.BatchNormalization(momentum=bn_momentum, scale=False, center=True)(e3)
        e3 = tf.keras.layers.Activation('relu')(e3)
        return tf.keras.layers.concatenate([e1, e3])

    x = tf.keras.layers.Input(shape=(*IMAGE_SIZE, 3))  # input is 331x331 pixels RGB
    y = tf.keras.layers.Conv2D(kernel_size=3, filters=32, padding='same', use_bias=True, activation='relu')(x)
    y = tf.keras.layers.BatchNormalization(momentum=bn_momentum)(y)
    # Four fire + max-pool stages, then a final fire module without pooling.
    for squeeze, expand in [(24, 48), (48, 96), (64, 128), (48, 96)]:
        y = fire(y, squeeze, expand)
        y = tf.keras.layers.MaxPooling2D(pool_size=2)(y)
    y = fire(y, 24, 48)
    y = tf.keras.layers.GlobalAveragePooling2D()(y)
    y = tf.keras.layers.Dropout(0.4)(y)
    y = tf.keras.layers.Dense(5, activation='softmax')(y)
    return tf.keras.Model(x, y)
# Build and compile the model inside the distribution strategy scope so that
# model variables are created on the TPU replicas (or the default device).
with strategy.scope():
model = create_model()
# SGD with Nesterov momentum; the learning rate itself is set per epoch by
# the LearningRateScheduler callback passed to model.fit below.
model.compile(optimizer=tf.keras.optimizers.SGD(nesterov=True, momentum=0.9),
loss='categorical_crossentropy',
metrics=['accuracy'])
model.summary()
Model: "model" __________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== input_1 (InputLayer) [(None, 331, 331, 3) 0 __________________________________________________________________________________________________ conv2d (Conv2D) (None, 331, 331, 32) 896 input_1[0][0] __________________________________________________________________________________________________ batch_normalization (BatchNorma (None, 331, 331, 32) 128 conv2d[0][0] __________________________________________________________________________________________________ conv2d_1 (Conv2D) (None, 331, 331, 24) 768 batch_normalization[0][0] __________________________________________________________________________________________________ batch_normalization_1 (BatchNor (None, 331, 331, 24) 72 conv2d_1[0][0] __________________________________________________________________________________________________ activation (Activation) (None, 331, 331, 24) 0 batch_normalization_1[0][0] __________________________________________________________________________________________________ conv2d_2 (Conv2D) (None, 331, 331, 24) 576 activation[0][0] __________________________________________________________________________________________________ conv2d_3 (Conv2D) (None, 331, 331, 24) 5184 activation[0][0] __________________________________________________________________________________________________ batch_normalization_2 (BatchNor (None, 331, 331, 24) 72 conv2d_2[0][0] __________________________________________________________________________________________________ batch_normalization_3 (BatchNor (None, 331, 331, 24) 72 conv2d_3[0][0] __________________________________________________________________________________________________ activation_1 (Activation) (None, 331, 331, 24) 0 batch_normalization_2[0][0] 
__________________________________________________________________________________________________ activation_2 (Activation) (None, 331, 331, 24) 0 batch_normalization_3[0][0] __________________________________________________________________________________________________ concatenate (Concatenate) (None, 331, 331, 48) 0 activation_1[0][0] activation_2[0][0] __________________________________________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 165, 165, 48) 0 concatenate[0][0] __________________________________________________________________________________________________ conv2d_4 (Conv2D) (None, 165, 165, 48) 2304 max_pooling2d[0][0] __________________________________________________________________________________________________ batch_normalization_4 (BatchNor (None, 165, 165, 48) 144 conv2d_4[0][0] __________________________________________________________________________________________________ activation_3 (Activation) (None, 165, 165, 48) 0 batch_normalization_4[0][0] __________________________________________________________________________________________________ conv2d_5 (Conv2D) (None, 165, 165, 48) 2304 activation_3[0][0] __________________________________________________________________________________________________ conv2d_6 (Conv2D) (None, 165, 165, 48) 20736 activation_3[0][0] __________________________________________________________________________________________________ batch_normalization_5 (BatchNor (None, 165, 165, 48) 144 conv2d_5[0][0] __________________________________________________________________________________________________ batch_normalization_6 (BatchNor (None, 165, 165, 48) 144 conv2d_6[0][0] __________________________________________________________________________________________________ activation_4 (Activation) (None, 165, 165, 48) 0 batch_normalization_5[0][0] __________________________________________________________________________________________________ activation_5 
(Activation) (None, 165, 165, 48) 0 batch_normalization_6[0][0] __________________________________________________________________________________________________ concatenate_1 (Concatenate) (None, 165, 165, 96) 0 activation_4[0][0] activation_5[0][0] __________________________________________________________________________________________________ max_pooling2d_1 (MaxPooling2D) (None, 82, 82, 96) 0 concatenate_1[0][0] __________________________________________________________________________________________________ conv2d_7 (Conv2D) (None, 82, 82, 64) 6144 max_pooling2d_1[0][0] __________________________________________________________________________________________________ batch_normalization_7 (BatchNor (None, 82, 82, 64) 192 conv2d_7[0][0] __________________________________________________________________________________________________ activation_6 (Activation) (None, 82, 82, 64) 0 batch_normalization_7[0][0] __________________________________________________________________________________________________ conv2d_8 (Conv2D) (None, 82, 82, 64) 4096 activation_6[0][0] __________________________________________________________________________________________________ conv2d_9 (Conv2D) (None, 82, 82, 64) 36864 activation_6[0][0] __________________________________________________________________________________________________ batch_normalization_8 (BatchNor (None, 82, 82, 64) 192 conv2d_8[0][0] __________________________________________________________________________________________________ batch_normalization_9 (BatchNor (None, 82, 82, 64) 192 conv2d_9[0][0] __________________________________________________________________________________________________ activation_7 (Activation) (None, 82, 82, 64) 0 batch_normalization_8[0][0] __________________________________________________________________________________________________ activation_8 (Activation) (None, 82, 82, 64) 0 batch_normalization_9[0][0] 
__________________________________________________________________________________________________ concatenate_2 (Concatenate) (None, 82, 82, 128) 0 activation_7[0][0] activation_8[0][0] __________________________________________________________________________________________________ max_pooling2d_2 (MaxPooling2D) (None, 41, 41, 128) 0 concatenate_2[0][0] __________________________________________________________________________________________________ conv2d_10 (Conv2D) (None, 41, 41, 48) 6144 max_pooling2d_2[0][0] __________________________________________________________________________________________________ batch_normalization_10 (BatchNo (None, 41, 41, 48) 144 conv2d_10[0][0] __________________________________________________________________________________________________ activation_9 (Activation) (None, 41, 41, 48) 0 batch_normalization_10[0][0] __________________________________________________________________________________________________ conv2d_11 (Conv2D) (None, 41, 41, 48) 2304 activation_9[0][0] __________________________________________________________________________________________________ conv2d_12 (Conv2D) (None, 41, 41, 48) 20736 activation_9[0][0] __________________________________________________________________________________________________ batch_normalization_11 (BatchNo (None, 41, 41, 48) 144 conv2d_11[0][0] __________________________________________________________________________________________________ batch_normalization_12 (BatchNo (None, 41, 41, 48) 144 conv2d_12[0][0] __________________________________________________________________________________________________ activation_10 (Activation) (None, 41, 41, 48) 0 batch_normalization_11[0][0] __________________________________________________________________________________________________ activation_11 (Activation) (None, 41, 41, 48) 0 batch_normalization_12[0][0] __________________________________________________________________________________________________ concatenate_3 
(Concatenate) (None, 41, 41, 96) 0 activation_10[0][0] activation_11[0][0] __________________________________________________________________________________________________ max_pooling2d_3 (MaxPooling2D) (None, 20, 20, 96) 0 concatenate_3[0][0] __________________________________________________________________________________________________ conv2d_13 (Conv2D) (None, 20, 20, 24) 2304 max_pooling2d_3[0][0] __________________________________________________________________________________________________ batch_normalization_13 (BatchNo (None, 20, 20, 24) 72 conv2d_13[0][0] __________________________________________________________________________________________________ activation_12 (Activation) (None, 20, 20, 24) 0 batch_normalization_13[0][0] __________________________________________________________________________________________________ conv2d_14 (Conv2D) (None, 20, 20, 24) 576 activation_12[0][0] __________________________________________________________________________________________________ conv2d_15 (Conv2D) (None, 20, 20, 24) 5184 activation_12[0][0] __________________________________________________________________________________________________ batch_normalization_14 (BatchNo (None, 20, 20, 24) 72 conv2d_14[0][0] __________________________________________________________________________________________________ batch_normalization_15 (BatchNo (None, 20, 20, 24) 72 conv2d_15[0][0] __________________________________________________________________________________________________ activation_13 (Activation) (None, 20, 20, 24) 0 batch_normalization_14[0][0] __________________________________________________________________________________________________ activation_14 (Activation) (None, 20, 20, 24) 0 batch_normalization_15[0][0] __________________________________________________________________________________________________ concatenate_4 (Concatenate) (None, 20, 20, 48) 0 activation_13[0][0] activation_14[0][0] 
__________________________________________________________________________________________________ global_average_pooling2d (Globa (None, 48) 0 concatenate_4[0][0] __________________________________________________________________________________________________ dropout (Dropout) (None, 48) 0 global_average_pooling2d[0][0] __________________________________________________________________________________________________ dense (Dense) (None, 5) 245 dropout[0][0] ================================================================================================== Total params: 119,365 Trainable params: 118,053 Non-trainable params: 1,312 __________________________________________________________________________________________________
# Train the model; the per-epoch learning rate schedule is applied via lr_callback.
history = model.fit(
    training_dataset,
    steps_per_epoch=steps_per_epoch,
    epochs=EPOCHS,
    validation_data=validation_dataset,
    callbacks=[lr_callback],
)
# Report the mean validation accuracy over the last five epochs.
last_five_val_acc = history.history["val_accuracy"][-5:]
print("FINAL ACCURACY MEAN-5: ", np.mean(last_five_val_acc))
Train for 11 steps Epoch 00001: LearningRateScheduler reducing learning rate to 0.08. Epoch 1/60 11/11 [==============================] - 41s 4s/step - loss: 1.5309 - accuracy: 0.3377 - val_loss: 1.6670 - val_accuracy: 0.2471 Epoch 00002: LearningRateScheduler reducing learning rate to 0.08. Epoch 2/60 11/11 [==============================] - 5s 443ms/step - loss: 1.3051 - accuracy: 0.4748 - val_loss: 1.6256 - val_accuracy: 0.3059 Epoch 00003: LearningRateScheduler reducing learning rate to 0.08. Epoch 3/60 11/11 [==============================] - 5s 433ms/step - loss: 1.1993 - accuracy: 0.5078 - val_loss: 1.2215 - val_accuracy: 0.4956 Epoch 00004: LearningRateScheduler reducing learning rate to 0.08. Epoch 4/60 11/11 [==============================] - 5s 453ms/step - loss: 1.1245 - accuracy: 0.5572 - val_loss: 1.0812 - val_accuracy: 0.5500 Epoch 00005: LearningRateScheduler reducing learning rate to 0.08. Epoch 5/60 11/11 [==============================] - 5s 434ms/step - loss: 1.0772 - accuracy: 0.5753 - val_loss: 1.0880 - val_accuracy: 0.5809 Epoch 00006: LearningRateScheduler reducing learning rate to 0.08. Epoch 6/60 11/11 [==============================] - 5s 436ms/step - loss: 1.0711 - accuracy: 0.5771 - val_loss: 1.2590 - val_accuracy: 0.5485 Epoch 00007: LearningRateScheduler reducing learning rate to 0.08. Epoch 7/60 11/11 [==============================] - 5s 440ms/step - loss: 1.0477 - accuracy: 0.5856 - val_loss: 1.0739 - val_accuracy: 0.5824 Epoch 00008: LearningRateScheduler reducing learning rate to 0.08. Epoch 8/60 11/11 [==============================] - 5s 451ms/step - loss: 1.0192 - accuracy: 0.6023 - val_loss: 2.1184 - val_accuracy: 0.4176 Epoch 00009: LearningRateScheduler reducing learning rate to 0.08. Epoch 9/60 11/11 [==============================] - 5s 441ms/step - loss: 0.9859 - accuracy: 0.6183 - val_loss: 0.9828 - val_accuracy: 0.6515 Epoch 00010: LearningRateScheduler reducing learning rate to 0.08. 
Epoch 10/60 11/11 [==============================] - 5s 452ms/step - loss: 0.9885 - accuracy: 0.6158 - val_loss: 1.0052 - val_accuracy: 0.6132 Epoch 00011: LearningRateScheduler reducing learning rate to 0.08. Epoch 11/60 11/11 [==============================] - 5s 451ms/step - loss: 0.9541 - accuracy: 0.6289 - val_loss: 0.9258 - val_accuracy: 0.6485 Epoch 00012: LearningRateScheduler reducing learning rate to 0.08. Epoch 12/60 11/11 [==============================] - 6s 532ms/step - loss: 0.9369 - accuracy: 0.6463 - val_loss: 1.0385 - val_accuracy: 0.6044 Epoch 00013: LearningRateScheduler reducing learning rate to 0.08. Epoch 13/60 11/11 [==============================] - 5s 439ms/step - loss: 0.9104 - accuracy: 0.6445 - val_loss: 0.9956 - val_accuracy: 0.6382 Epoch 00014: LearningRateScheduler reducing learning rate to 0.08. Epoch 14/60 11/11 [==============================] - 6s 514ms/step - loss: 0.8707 - accuracy: 0.6676 - val_loss: 1.7858 - val_accuracy: 0.5074 Epoch 00015: LearningRateScheduler reducing learning rate to 0.07604999999999999. Epoch 15/60 11/11 [==============================] - 6s 548ms/step - loss: 0.8836 - accuracy: 0.6673 - val_loss: 0.8594 - val_accuracy: 0.6544 Epoch 00016: LearningRateScheduler reducing learning rate to 0.0722975. Epoch 16/60 11/11 [==============================] - 5s 441ms/step - loss: 0.8548 - accuracy: 0.6729 - val_loss: 0.8791 - val_accuracy: 0.6603 Epoch 00017: LearningRateScheduler reducing learning rate to 0.06873262499999999. Epoch 17/60 11/11 [==============================] - 5s 449ms/step - loss: 0.8555 - accuracy: 0.6779 - val_loss: 1.0839 - val_accuracy: 0.6176 Epoch 00018: LearningRateScheduler reducing learning rate to 0.06534599375. Epoch 18/60 11/11 [==============================] - 5s 447ms/step - loss: 0.7878 - accuracy: 0.7131 - val_loss: 0.7593 - val_accuracy: 0.7324 Epoch 00019: LearningRateScheduler reducing learning rate to 0.06212869406249998. 
Epoch 19/60 11/11 [==============================] - 5s 457ms/step - loss: 0.7952 - accuracy: 0.6935 - val_loss: 0.9663 - val_accuracy: 0.6485 Epoch 00020: LearningRateScheduler reducing learning rate to 0.05907225935937498. Epoch 20/60 11/11 [==============================] - 5s 433ms/step - loss: 0.7962 - accuracy: 0.6957 - val_loss: 0.8895 - val_accuracy: 0.6426 Epoch 00021: LearningRateScheduler reducing learning rate to 0.05616864639140623. Epoch 21/60 11/11 [==============================] - 5s 442ms/step - loss: 0.7727 - accuracy: 0.7102 - val_loss: 0.7556 - val_accuracy: 0.7235 Epoch 00022: LearningRateScheduler reducing learning rate to 0.05341021407183592. Epoch 22/60 11/11 [==============================] - 5s 433ms/step - loss: 0.7611 - accuracy: 0.7077 - val_loss: 0.8697 - val_accuracy: 0.7015 Epoch 00023: LearningRateScheduler reducing learning rate to 0.05078970336824412. Epoch 23/60 11/11 [==============================] - 5s 437ms/step - loss: 0.7470 - accuracy: 0.7219 - val_loss: 0.8027 - val_accuracy: 0.6912 Epoch 00024: LearningRateScheduler reducing learning rate to 0.048300218199831914. Epoch 24/60 11/11 [==============================] - 5s 433ms/step - loss: 0.7202 - accuracy: 0.7362 - val_loss: 0.7410 - val_accuracy: 0.7544 Epoch 00025: LearningRateScheduler reducing learning rate to 0.04593520728984032. Epoch 25/60 11/11 [==============================] - 5s 455ms/step - loss: 0.7128 - accuracy: 0.7362 - val_loss: 0.7587 - val_accuracy: 0.7265 Epoch 00026: LearningRateScheduler reducing learning rate to 0.0436884469253483. Epoch 26/60 11/11 [==============================] - 5s 445ms/step - loss: 0.7016 - accuracy: 0.7401 - val_loss: 0.8714 - val_accuracy: 0.7044 Epoch 00027: LearningRateScheduler reducing learning rate to 0.04155402457908088. 
Epoch 27/60 11/11 [==============================] - 6s 517ms/step - loss: 0.6964 - accuracy: 0.7450 - val_loss: 0.7005 - val_accuracy: 0.7618 Epoch 00028: LearningRateScheduler reducing learning rate to 0.03952632335012683. Epoch 28/60 11/11 [==============================] - 5s 446ms/step - loss: 0.6676 - accuracy: 0.7493 - val_loss: 0.6607 - val_accuracy: 0.7809 Epoch 00029: LearningRateScheduler reducing learning rate to 0.03760000718262049. Epoch 29/60 11/11 [==============================] - 5s 447ms/step - loss: 0.7039 - accuracy: 0.7365 - val_loss: 0.7674 - val_accuracy: 0.7676 Epoch 00030: LearningRateScheduler reducing learning rate to 0.035770006823489464. Epoch 30/60 11/11 [==============================] - 5s 461ms/step - loss: 0.6408 - accuracy: 0.7653 - val_loss: 0.7429 - val_accuracy: 0.7603 Epoch 00031: LearningRateScheduler reducing learning rate to 0.03403150648231499. Epoch 31/60 11/11 [==============================] - 5s 465ms/step - loss: 0.6406 - accuracy: 0.7653 - val_loss: 1.1098 - val_accuracy: 0.6485 Epoch 00032: LearningRateScheduler reducing learning rate to 0.03237993115819924. Epoch 32/60 11/11 [==============================] - 5s 456ms/step - loss: 0.6168 - accuracy: 0.7710 - val_loss: 0.7020 - val_accuracy: 0.7441 Epoch 00033: LearningRateScheduler reducing learning rate to 0.030810934600289275. Epoch 33/60 11/11 [==============================] - 5s 442ms/step - loss: 0.6285 - accuracy: 0.7660 - val_loss: 0.6638 - val_accuracy: 0.7691 Epoch 00034: LearningRateScheduler reducing learning rate to 0.02932038787027481. Epoch 34/60 11/11 [==============================] - 5s 451ms/step - loss: 0.5960 - accuracy: 0.7965 - val_loss: 0.6516 - val_accuracy: 0.7632 Epoch 00035: LearningRateScheduler reducing learning rate to 0.02790436847676107. 
Epoch 35/60 11/11 [==============================] - 5s 444ms/step - loss: 0.6118 - accuracy: 0.7837 - val_loss: 0.6187 - val_accuracy: 0.7941 Epoch 00036: LearningRateScheduler reducing learning rate to 0.026559150052923013. Epoch 36/60 11/11 [==============================] - 5s 452ms/step - loss: 0.5924 - accuracy: 0.7788 - val_loss: 0.6799 - val_accuracy: 0.7824 Epoch 00037: LearningRateScheduler reducing learning rate to 0.025281192550276863. Epoch 37/60 11/11 [==============================] - 5s 449ms/step - loss: 0.5698 - accuracy: 0.7894 - val_loss: 0.6188 - val_accuracy: 0.7985 Epoch 00038: LearningRateScheduler reducing learning rate to 0.02406713292276302. Epoch 38/60 11/11 [==============================] - 6s 522ms/step - loss: 0.5981 - accuracy: 0.7894 - val_loss: 0.6554 - val_accuracy: 0.7618 Epoch 00039: LearningRateScheduler reducing learning rate to 0.02291377627662487. Epoch 39/60 11/11 [==============================] - 5s 449ms/step - loss: 0.5766 - accuracy: 0.7919 - val_loss: 0.6247 - val_accuracy: 0.7838 Epoch 00040: LearningRateScheduler reducing learning rate to 0.021818087462793623. Epoch 40/60 11/11 [==============================] - 5s 455ms/step - loss: 0.5731 - accuracy: 0.7937 - val_loss: 0.6269 - val_accuracy: 0.7662 Epoch 00041: LearningRateScheduler reducing learning rate to 0.02077718308965394. Epoch 41/60 11/11 [==============================] - 6s 542ms/step - loss: 0.5768 - accuracy: 0.7947 - val_loss: 1.0723 - val_accuracy: 0.6662 Epoch 00042: LearningRateScheduler reducing learning rate to 0.019788323935171243. Epoch 42/60 11/11 [==============================] - 5s 454ms/step - loss: 0.5631 - accuracy: 0.7912 - val_loss: 0.5907 - val_accuracy: 0.8000 Epoch 00043: LearningRateScheduler reducing learning rate to 0.01884890773841268. 
Epoch 43/60 11/11 [==============================] - 5s 447ms/step - loss: 0.5435 - accuracy: 0.8058 - val_loss: 0.6120 - val_accuracy: 0.7809 Epoch 00044: LearningRateScheduler reducing learning rate to 0.017956462351492047. Epoch 44/60 11/11 [==============================] - 5s 458ms/step - loss: 0.5520 - accuracy: 0.7915 - val_loss: 0.5497 - val_accuracy: 0.8206 Epoch 00045: LearningRateScheduler reducing learning rate to 0.017108639233917443. Epoch 45/60 11/11 [==============================] - 5s 456ms/step - loss: 0.5223 - accuracy: 0.8047 - val_loss: 0.6351 - val_accuracy: 0.7941 Epoch 00046: LearningRateScheduler reducing learning rate to 0.01630320727222157. Epoch 46/60 11/11 [==============================] - 5s 435ms/step - loss: 0.5303 - accuracy: 0.8129 - val_loss: 0.7617 - val_accuracy: 0.7471 Epoch 00047: LearningRateScheduler reducing learning rate to 0.015538046908610489. Epoch 47/60 11/11 [==============================] - 5s 449ms/step - loss: 0.5287 - accuracy: 0.8153 - val_loss: 0.6121 - val_accuracy: 0.8147 Epoch 00048: LearningRateScheduler reducing learning rate to 0.014811144563179963. Epoch 48/60 11/11 [==============================] - 5s 446ms/step - loss: 0.5392 - accuracy: 0.7969 - val_loss: 0.6052 - val_accuracy: 0.7912 Epoch 00049: LearningRateScheduler reducing learning rate to 0.014120587335020966. Epoch 49/60 11/11 [==============================] - 5s 455ms/step - loss: 0.5131 - accuracy: 0.8196 - val_loss: 0.5801 - val_accuracy: 0.8059 Epoch 00050: LearningRateScheduler reducing learning rate to 0.013464557968269918. Epoch 50/60 11/11 [==============================] - 5s 461ms/step - loss: 0.5296 - accuracy: 0.8182 - val_loss: 0.5829 - val_accuracy: 0.7971 Epoch 00051: LearningRateScheduler reducing learning rate to 0.01284133006985642. 
Epoch 51/60 11/11 [==============================] - 5s 448ms/step - loss: 0.5029 - accuracy: 0.8210 - val_loss: 0.5856 - val_accuracy: 0.7897 Epoch 00052: LearningRateScheduler reducing learning rate to 0.012249263566363598. Epoch 52/60 11/11 [==============================] - 5s 450ms/step - loss: 0.5163 - accuracy: 0.8132 - val_loss: 0.6040 - val_accuracy: 0.7853 Epoch 00053: LearningRateScheduler reducing learning rate to 0.01168680038804542. Epoch 53/60 11/11 [==============================] - 5s 443ms/step - loss: 0.5095 - accuracy: 0.8114 - val_loss: 0.5741 - val_accuracy: 0.8118 Epoch 00054: LearningRateScheduler reducing learning rate to 0.011152460368643147. Epoch 54/60 11/11 [==============================] - 5s 438ms/step - loss: 0.4929 - accuracy: 0.8253 - val_loss: 0.5673 - val_accuracy: 0.7985 Epoch 00055: LearningRateScheduler reducing learning rate to 0.01064483735021099. Epoch 55/60 11/11 [==============================] - 5s 451ms/step - loss: 0.5018 - accuracy: 0.8161 - val_loss: 0.5933 - val_accuracy: 0.8191 Epoch 00056: LearningRateScheduler reducing learning rate to 0.010162595482700439. Epoch 56/60 11/11 [==============================] - 5s 476ms/step - loss: 0.4639 - accuracy: 0.8324 - val_loss: 0.5662 - val_accuracy: 0.8118 Epoch 00057: LearningRateScheduler reducing learning rate to 0.009704465708565417. Epoch 57/60 11/11 [==============================] - 5s 450ms/step - loss: 0.4750 - accuracy: 0.8338 - val_loss: 0.5769 - val_accuracy: 0.8147 Epoch 00058: LearningRateScheduler reducing learning rate to 0.009269242423137147. Epoch 58/60 11/11 [==============================] - 5s 441ms/step - loss: 0.4973 - accuracy: 0.8200 - val_loss: 0.5813 - val_accuracy: 0.8118 Epoch 00059: LearningRateScheduler reducing learning rate to 0.008855780301980289. 
Epoch 59/60 11/11 [==============================] - 5s 454ms/step - loss: 0.4575 - accuracy: 0.8430 - val_loss: 0.5927 - val_accuracy: 0.7985 Epoch 00060: LearningRateScheduler reducing learning rate to 0.008462991286881274. Epoch 60/60 11/11 [==============================] - 5s 456ms/step - loss: 0.4806 - accuracy: 0.8281 - val_loss: 0.5778 - val_accuracy: 0.8132 FINAL ACCURACY MEAN-5: 0.81000006
# Show which metrics were recorded during training, then plot the curves.
print(history.history.keys())
display_training_curves(history.history['accuracy'], history.history['val_accuracy'], 'accuracy', 211)
# Skip the first 15 epochs of loss so the early large values don't flatten the plot.
display_training_curves(history.history['loss'][15:], history.history['val_loss'][15:], 'loss', 212)
dict_keys(['val_loss', 'lr', 'accuracy', 'loss', 'val_accuracy'])
# randomize the input so that you can execute multiple times to change results
# Use the actual number of samples instead of the hard-coded 8*20 = 160, so this
# cell stays correct if the sample count passed to dataset_to_numpy_util changes.
permutation = np.random.permutation(len(some_flowers))
some_flowers, some_labels = (some_flowers[permutation], some_labels[permutation])

predictions = model.predict(some_flowers, batch_size=16)
# Little wrinkle: eval progress bar broken in this version of TF 2.0 preview, hence verbose=0.
evaluations = model.evaluate(some_flowers, some_labels, batch_size=16, verbose=0)

# Print the predicted class name for each sample, then [loss, accuracy].
print(np.array(CLASSES)[np.argmax(predictions, axis=-1)].tolist())
print('[val_loss, val_acc]', evaluations)
['sunflowers', 'tulips', 'roses', 'daisy', 'tulips', 'tulips', 'daisy', 'daisy', 'sunflowers', 'tulips', 'dandelion', 'daisy', 'roses', 'dandelion', 'sunflowers', 'dandelion', 'daisy', 'tulips', 'roses', 'sunflowers', 'tulips', 'dandelion', 'daisy', 'roses', 'tulips', 'dandelion', 'dandelion', 'tulips', 'roses', 'dandelion', 'tulips', 'roses', 'daisy', 'daisy', 'sunflowers', 'roses', 'roses', 'daisy', 'sunflowers', 'daisy', 'roses', 'sunflowers', 'dandelion', 'roses', 'daisy', 'dandelion', 'dandelion', 'roses', 'sunflowers', 'daisy', 'daisy', 'dandelion', 'dandelion', 'tulips', 'sunflowers', 'tulips', 'dandelion', 'sunflowers', 'daisy', 'daisy', 'dandelion', 'tulips', 'dandelion', 'tulips', 'tulips', 'roses', 'dandelion', 'roses', 'sunflowers', 'dandelion', 'daisy', 'daisy', 'roses', 'sunflowers', 'dandelion', 'daisy', 'dandelion', 'roses', 'dandelion', 'roses', 'roses', 'daisy', 'dandelion', 'dandelion', 'roses', 'daisy', 'daisy', 'dandelion', 'sunflowers', 'daisy', 'dandelion', 'dandelion', 'roses', 'dandelion', 'tulips', 'dandelion', 'tulips', 'sunflowers', 'sunflowers', 'daisy', 'dandelion', 'tulips', 'tulips', 'tulips', 'tulips', 'tulips', 'tulips', 'roses', 'dandelion', 'tulips', 'dandelion', 'sunflowers', 'tulips', 'daisy', 'sunflowers', 'dandelion', 'dandelion', 'daisy', 'roses', 'daisy', 'sunflowers', 'tulips', 'tulips', 'daisy', 'daisy', 'tulips', 'tulips', 'dandelion', 'tulips', 'dandelion', 'tulips', 'dandelion', 'dandelion', 'sunflowers', 'sunflowers', 'sunflowers', 'dandelion', 'dandelion', 'dandelion', 'dandelion', 'daisy', 'sunflowers', 'dandelion', 'dandelion', 'roses', 'dandelion', 'daisy', 'roses', 'roses', 'dandelion', 'dandelion', 'dandelion', 'dandelion', 'tulips', 'dandelion', 'tulips', 'sunflowers', 'sunflowers', 'dandelion', 'daisy'] [val_loss, val_acc] [0.7120346754789353, 0.7625]
display_9_images_with_predictions(some_flowers, predictions, some_labels)
author: Martin Gorner
twitter: @martin_gorner
Copyright 2019 Google LLC
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This is not an official Google product but sample code provided for an educational purpose