Copyright (c) Microsoft Corporation. All rights reserved.
Licensed under the MIT License.
This example shows how to build, train, evaluate, and deploy a model that runs on an FPGA. Only Windows is supported for transfer learning, whereas inference works across all platforms. We use TensorFlow and Keras to build the model, applying transfer learning with ResNet152 as a featurizer: instead of using the last layer of ResNet152, we add and train our own classification layer.
We will use the Kaggle Cats and Dogs dataset to train the classifier. The dataset can be downloaded from https://www.microsoft.com/en-us/download/details.aspx?id=54765. Download the zip and extract it to a directory named 'catsanddogs' under your user directory ("~/catsanddogs").
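If you prefer to script the extraction, a minimal sketch is below; the zip filename is an assumption, so adjust it to match the file you actually downloaded.
import os
import zipfile
zip_path = os.path.expanduser('~/kagglecatsanddogs.zip')  # hypothetical filename; use your actual download
with zipfile.ZipFile(zip_path) as zf:
    zf.extractall(os.path.expanduser('~/catsanddogs'))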
Please set up your environment as described in the quick start.
import os
import tensorflow as tf
import numpy as np
Load the files we are going to use for training and testing. By default this notebook uses only a very small subset of the Cats and Dogs dataset. That makes it run quickly, but doesn't create a very accurate classifier. You can improve the classifier by using more of the dataset.
import glob
import imghdr
datadir = os.path.expanduser("~/catsanddogs")
cat_files = glob.glob(os.path.join(datadir, 'PetImages', 'Cat', '*.jpg'))
dog_files = glob.glob(os.path.join(datadir, 'PetImages', 'Dog', '*.jpg'))
# Limit the data set to make the notebook execute quickly.
cat_files = cat_files[:64]
dog_files = dog_files[:64]
# The data set has a few images that are not jpeg. Remove them.
cat_files = [f for f in cat_files if imghdr.what(f) == 'jpeg']
dog_files = [f for f in dog_files if imghdr.what(f) == 'jpeg']
if not len(cat_files) or not len(dog_files):
    print("Please download the Kaggle Cats and Dogs dataset from https://www.microsoft.com/en-us/download/details.aspx?id=54765 and extract the zip to " + datadir)
    raise ValueError("Data not found")
else:
    print(cat_files[0])
    print(dog_files[0])
# Construct a numpy array of labels: 0 for cats, 1 for dogs
image_paths = cat_files + dog_files
total_files = len(cat_files) + len(dog_files)
labels = np.zeros(total_files)
labels[len(cat_files):] = 1
We need to preprocess the input file to get it into the form expected by ResNet152. We've provided a default implementation of the preprocessing that you can use.
# Input images as a two-dimensional tensor containing an arbitrary number of images represented as strings
import azureml.contrib.brainwave.models.utils as utils
in_images = tf.placeholder(tf.string)
image_tensors = utils.preprocess_array(in_images)
print(image_tensors.shape)
Alternatively, if you would like to customize the preprocessing, you can write your own preprocessor using TensorFlow operations.
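For example, a custom preprocessor might decode and resize the images with TensorFlow image ops. This is only a sketch: my_preprocess is a name introduced here, and it assumes the featurizer expects 224x224 RGB float input, so check the model's actual requirements before substituting it for preprocess_array.
# Sketch of a custom preprocessor built from TensorFlow ops (assumed 224x224 RGB input).
def my_preprocess(images):
    def _decode_and_resize(image_bytes):
        img = tf.image.decode_jpeg(image_bytes, channels=3)  # decode one JPEG string
        img = tf.image.resize_images(img, [224, 224])        # resize to the assumed input size
        return tf.cast(img, tf.float32)
    return tf.map_fn(_decode_and_resize, images, dtype=tf.float32)
custom_image_tensors = my_preprocess(in_images)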
The input to the classifier we are training is the set of features produced by ResNet152. To train the classifier we need to featurize the images using ResNet152. You can also run the featurizer locally on CPU or GPU. We import the featurizer as frozen, so that we are only training the classifier.
from azureml.contrib.brainwave.models import QuantizedResnet152
model_path = os.path.expanduser('~/models')
bwmodel = QuantizedResnet152(model_path, is_frozen=True)
print(bwmodel.version)
Calling import_graph_def on the featurizer adds its frozen graph to the current TensorFlow graph, taking the preprocessed images as input; running the resulting features tensor computes the featurization.
features = bwmodel.import_graph_def(input_tensor=image_tensors)
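It can be worth checking that the featurizer's output shape matches the classifier input we define later (a 1 x 1 x 2048 feature map per image):
# Expect a (batch, 1, 1, 2048) feature map, matching the classifier's input_shape below.
print(features.shape)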
Load the dataset and compute the features. These can be precomputed because they don't change during training. This step can take a while to run on a CPU.
from tqdm import tqdm
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]
def read_files(files):
    contents = []
    for path in files:
        with open(path, 'rb') as f:
            contents.append(f.read())
    return contents
feature_list = []
with tf.Session() as sess:
    for chunk in tqdm(chunks(image_paths, 5)):
        contents = read_files(chunk)
        result = sess.run([features], feed_dict={in_images: contents})
        feature_list.extend(result[0])
feature_results = np.array(feature_list)
print(feature_results.shape)
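Since the features don't change during training, you may want to cache them on disk so that reruns of the notebook can skip featurization. A sketch, with illustrative paths:
# Optional: cache the precomputed features and labels (paths are illustrative).
np.save(os.path.join(datadir, 'features.npy'), feature_results)
np.save(os.path.join(datadir, 'labels.npy'), labels)
# On a later run: feature_results = np.load(os.path.join(datadir, 'features.npy'))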
We use Keras to define and train a simple classifier.
from keras.models import Sequential
from keras.layers import Dropout, Dense, Flatten
from keras import optimizers
FC_SIZE = 1024
NUM_CLASSES = 2
model = Sequential()
model.add(Dropout(0.2, input_shape=(1, 1, 2048)))
model.add(Dense(FC_SIZE, activation='relu'))
model.add(Flatten())
model.add(Dense(NUM_CLASSES, activation='sigmoid'))
model.compile(optimizer=optimizers.SGD(lr=1e-4, momentum=0.9), loss='binary_crossentropy', metrics=['accuracy'])
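Before training, you can sanity-check the layer shapes and parameter counts:
# Print the classifier's layers, output shapes, and parameter counts.
model.summary()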
Prepare the train and test data.
from sklearn.model_selection import train_test_split
onehot_labels = np.array([[0,1] if i else [1,0] for i in labels])
X_train, X_test, y_train, y_test = train_test_split(feature_results, onehot_labels, random_state=42, shuffle=True)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
Train the classifier.
model.fit(X_train, y_train, epochs=16, batch_size=32)
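If you want to monitor generalization while training, Keras can also evaluate a held-out set after each epoch:
# Optional variant: report validation loss and accuracy after each epoch.
model.fit(X_train, y_train, epochs=16, batch_size=32, validation_data=(X_test, y_test))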
Let's test the classifier and see how well it does. Since we only trained on a few images, we are not expecting to win a Kaggle competition, but it will likely get most of the images correct.
y_probs = model.predict(X_test)
y_prob_max = np.argmax(y_probs, 1)
y_test_max = np.argmax(y_test, 1)
print(y_prob_max)
print(y_test_max)
from sklearn.metrics import confusion_matrix, roc_auc_score, accuracy_score, precision_score, recall_score, f1_score
import itertools
import matplotlib
from matplotlib import pyplot as plt
# compute a bunch of classification metrics
def classification_metrics(y_true, y_pred, y_prob):
    cm_dict = {}
    cm_dict['Accuracy'] = accuracy_score(y_true, y_pred)
    cm_dict['Precision'] = precision_score(y_true, y_pred)
    cm_dict['Recall'] = recall_score(y_true, y_pred)
    cm_dict['F1'] = f1_score(y_true, y_pred)
    cm_dict['AUC'] = roc_auc_score(y_true, y_prob[:, 1])  # probability of the positive class (dog = 1)
    cm_dict['Confusion Matrix'] = confusion_matrix(y_true, y_pred).tolist()
    return cm_dict
def plot_confusion_matrix(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    """Plots a confusion matrix.
    Source: http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html
    New BSD License - see appendix
    """
    cm_max = cm.max()
    cm_min = cm.min()
    if cm_min > 0: cm_min = 0
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        cm_max = 1
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    thresh = cm_max / 2.
    plt.clim(cm_min, cm_max)
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i,
                 round(cm[i, j], 3),  # round to 3 decimals if they are float
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.show()
cm_dict = classification_metrics(y_test_max, y_prob_max, y_probs)
for m in cm_dict:
    print(m, cm_dict[m])
cm = np.asarray(cm_dict['Confusion Matrix'])
plot_confusion_matrix(cm, ['cat', 'dog'], normalize=False)
As in the QuickStart notebook, our service definition pipeline consists of three stages. Because the preprocessing and featurizing stages don't contain any trainable variables, we can use a default session. Here we use the Keras classifier as the final stage.
from azureml.contrib.brainwave.pipeline import ModelDefinition, TensorflowStage, BrainWaveStage, KerasStage
model_def = ModelDefinition()
model_def.pipeline.append(TensorflowStage(tf.Session(), in_images, image_tensors))
model_def.pipeline.append(BrainWaveStage(tf.Session(), bwmodel))
model_def.pipeline.append(KerasStage(model))
model_def_path = os.path.join(datadir, 'save', 'model_def.zip')
os.makedirs(os.path.dirname(model_def_path), exist_ok=True)  # make sure the save directory exists
model_def.save(model_def_path)
print(model_def_path)
from azureml.core.model import Model
from azureml.core import Workspace
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\n')
model_name = "catsanddogs-model"
service_name = "modelbuild-service"
registered_model = Model.register(ws, model_def_path, model_name)
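To confirm the registration and see which version was assigned, you can list the models with that name in the workspace:
# Optional: list registered models with this name and their versions.
for m in Model.list(ws, name=model_name):
    print(m.name, m.version)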
The first time the code below runs, it creates a new service running your model. If you want to change the model, make your changes above and save a new service definition; you can then update the running service in place so that it runs the new model.
from azureml.core.webservice import Webservice
from azureml.exceptions import WebserviceException
from azureml.contrib.brainwave import BrainwaveWebservice, BrainwaveImage
try:
    service = Webservice(ws, service_name)
except WebserviceException:
    image_config = BrainwaveImage.image_configuration()
    deployment_config = BrainwaveWebservice.deploy_configuration()
    service = Webservice.deploy_from_model(ws, service_name, [registered_model], image_config, deployment_config)
    service.wait_for_deployment(True)
The service is now running in Azure and ready to serve requests. We can check the address and port.
print(service.ip_address + ':' + str(service.port))
There is a simple test client, azureml.contrib.brainwave.client.PredictionClient, which can be used for testing. We'll use this client to score an image with our new service.
from azureml.contrib.brainwave.client import PredictionClient
client = PredictionClient(service.ip_address, service.port)
Let's see how our service does on a few images. It may get a few wrong.
# Specify an image to classify
print('CATS')
for image_file in cat_files[:8]:
    results = client.score_image(image_file)
    result = 'CORRECT ' if results[0] > results[1] else 'WRONG '
    print(result + str(results))
print('DOGS')
for image_file in dog_files[:8]:
    results = client.score_image(image_file)
    result = 'CORRECT ' if results[1] > results[0] else 'WRONG '
    print(result + str(results))
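A small convenience wrapper can make ad-hoc scoring easier to read. classify is a name introduced here, using this notebook's label convention (index 0 is cat, index 1 is dog):
# Hypothetical helper: map raw scores to a label (index 0 = cat, index 1 = dog).
def classify(path):
    scores = client.score_image(path)
    return 'cat' if scores[0] > scores[1] else 'dog'
print(classify(cat_files[0]))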
Run the cell below to delete your service. In the next notebook, you will learn how to retrain all of the weights of one of the models.
service.delete()
registered_model.delete()
License for plot_confusion_matrix:
New BSD License
Copyright (c) 2007–2018 The scikit-learn developers. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
a. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. b. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. c. Neither the name of the Scikit-learn Developers nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.