Azure ML Hardware Accelerated Models Quickstart

This tutorial will show you how to deploy an image recognition service based on the ResNet 50 classifier in just a few minutes using the Azure Machine Learning Accelerated AI service. Get more help from our documentation or forum.

We will use an accelerated ResNet50 featurizer running on an FPGA. This functionality is powered by Project Brainwave, which handles translating deep neural networks (DNNs) into an FPGA program.

Request Quota

IMPORTANT: You must request quota and be approved before you can successfully run this notebook.

Environment setup

  1. Download and install Git 2.16 or later
  2. Open a Git prompt and clone this repo:

    git clone https://github.com/Azure/aml-real-time-ai

  3. Install conda (Python 3.6):

    https://conda.io/miniconda.html

  4. Open an Anaconda Prompt and run the rest of the commands in the prompt. On Windows the prompt will look like:

    (base) C:\>

  5. Create the environment:

    conda env create -f aml-real-time-ai/environment.yml

  6. Activate the environment:
    1. Windows: conda activate amlrealtimeai
    2. Mac/Linux: source activate amlrealtimeai
  7. Launch the Jupyter notebook browser:

    jupyter notebook

  8. In the browser, open this notebook by navigating to notebooks/resnet50/00_QuickStart.ipynb. (If you're using Chrome, copy and paste the URL with the notebook token into the address bar.)

  9. Run through each cell and enter the appropriate information as necessary (e.g. Azure subscription ID, resource group name, Model Management Account name).
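
Before running the cells, it can help to confirm that the activated environment actually exposes the packages this notebook imports. The following is a minimal sketch using only the standard library; the helper name missing_packages is ours, and the package list is taken from the Imports cell of this notebook.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the subset of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages imported by this notebook.
required = ["tensorflow", "amlrealtimeai"]

print("Python", sys.version.split()[0])
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported missing, the environment was probably not activated in the prompt that launched Jupyter.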

Imports

In [ ]:
import os
import tensorflow as tf

import amlrealtimeai
from amlrealtimeai import resnet50

Image preprocessing

We'd like our service to accept JPEG images as input. However, the input to ResNet50 is a tensor, so we need code that decodes JPEG images and does the preprocessing required by ResNet50. The Accelerated AI service can execute TensorFlow graphs as part of the service, and we'll use that ability to do the image preprocessing. This code defines a TensorFlow graph that preprocesses an array of JPEG images (as strings) and produces a tensor that is ready to be featurized by ResNet50.

In [ ]:
import amlrealtimeai.resnet50.utils

# Input images as a two-dimensional tensor containing an arbitrary number of images represented as strings
in_images = tf.placeholder(tf.string)
image_tensors = resnet50.utils.preprocess_array(in_images)
print(image_tensors.shape)
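
This notebook does not show what preprocess_array does internally. As a rough illustration only, ImageNet models commonly subtract a per-channel mean from each pixel before featurization. The sketch below shows that single step in plain Python; the function name subtract_channel_means and the choice of mean values are assumptions for illustration and are not taken from amlrealtimeai.

```python
# Assumed per-channel ImageNet means in RGB order, as used by many
# Caffe-derived models. Whether preprocess_array uses them is NOT confirmed.
CHANNEL_MEANS = (123.68, 116.779, 103.939)


def subtract_channel_means(image, means=CHANNEL_MEANS):
    """Subtract a per-channel mean from every pixel.

    `image` is a list of rows; each row is a list of (r, g, b) tuples
    holding values in the 0-255 range.
    """
    return [
        [tuple(channel - mean for channel, mean in zip(pixel, means)) for pixel in row]
        for row in image
    ]


# A tiny 1x2 "image" standing in for decoded JPEG pixels.
tiny_image = [[(255.0, 255.0, 255.0), (123.68, 116.779, 103.939)]]
print(subtract_channel_means(tiny_image))
```

In the real service this kind of work runs inside the TensorFlow graph defined above, not in Python loops.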

Featurizer

We use ResNet50 as a featurizer. In this step we initialize the model. This downloads a TensorFlow checkpoint of the quantized ResNet50.

In [ ]:
from amlrealtimeai.resnet50.model import LocalQuantizedResNet50
model_path = os.path.expanduser('~/models')
model = LocalQuantizedResNet50(model_path)
print(model.version)

Classifier

The model we downloaded includes a classifier that takes the output of the ResNet50 featurizer and classifies the image. This classifier is trained on the ImageNet dataset, and we are going to use it for our service. The next notebook shows how to train a classifier for a different dataset. The input to the classifier is a tensor matching the output of our ResNet50 featurizer.

In [ ]:
model.import_graph_def(include_featurizer=False)
print(model.classifier_input.shape)

Service Definition

Now that we've defined the image preprocessing, featurizer, and classifier that we will execute on our service, we can create a service definition. The service definition is a set of files generated from the model that allow us to deploy to the FPGA service. It consists of a pipeline: a series of stages that are executed in order on the service, with the output of each stage fed into the subsequent stage. We support TensorFlow stages, Keras stages, and BrainWave stages.

To create a TensorFlow stage we specify a session containing the graph (in this case we are using the default graph) and the input and output tensors to this stage. We use this information to save the graph so that we can execute it on the service.

In [ ]:
from amlrealtimeai.pipeline import ServiceDefinition, TensorflowStage, BrainWaveStage

save_path = os.path.expanduser('~/models/save')
service_def_path = os.path.join(save_path, 'service_def.zip')

service_def = ServiceDefinition()
service_def.pipeline.append(TensorflowStage(tf.Session(), in_images, image_tensors))
service_def.pipeline.append(BrainWaveStage(model))
service_def.pipeline.append(TensorflowStage(tf.Session(), model.classifier_input, model.classifier_output))
service_def.save(service_def_path)
print(service_def_path)
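
The idea of a pipeline whose stages run in order, each consuming the previous stage's output, can be sketched in plain Python. This is a conceptual illustration only: run_pipeline and the three toy stage functions are hypothetical stand-ins and do not reflect how ServiceDefinition executes stages on the service.

```python
from functools import reduce


def run_pipeline(stages, value):
    """Apply each stage in order, feeding each output into the next stage."""
    return reduce(lambda current, stage: stage(current), stages, value)


def preprocess(images):
    # Stand-in for the TensorFlow preprocessing stage: strings -> numbers.
    return [float(len(image)) for image in images]


def featurize(tensors):
    # Stand-in for the BrainWave featurizer stage.
    return [value * 2.0 for value in tensors]


def classify(features):
    # Stand-in for the TensorFlow classifier stage.
    return ["large" if feature > 10.0 else "small" for feature in features]


pipeline = [preprocess, featurize, classify]
labels = run_pipeline(pipeline, ["abc", "abcdefgh"])
print(labels)
```

The order of the list matters: swapping two stages changes what each one receives, which is why the stages in the cell above are appended in preprocessing, featurizer, classifier order.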

Deploy

Time to create a service from the service definition. You need a Model Management Account in the East US 2 location. Go to our GitHub repo "docs" folder to learn how to create a Model Management Account and find the required information below.

This code creates the deployment client that we will use to deploy the service. Follow the instructions in the output to sign in to your account.

In [ ]:
from amlrealtimeai import DeploymentClient

subscription_id = "<Your Azure Subscription ID>"
resource_group = "<Your Azure Resource Group Name>"
model_management_account = "<Your AzureML Model Management Account Name>"

model_name = "resnet50-model"
service_name = "quickstart-service"

deployment_client = DeploymentClient(subscription_id, resource_group, model_management_account)

Upload the service definition to the model management service.

In [ ]:
model_id = deployment_client.register_model(model_name, service_def_path)

Create a service from the model that we registered. If this is a new service then we create it. If you already have a service with this name then the existing service will be updated to use this model.

In [ ]:
service = deployment_client.get_service_by_name(service_name)
if service is None:
    service = deployment_client.create_service(service_name, model_id)
else:
    service = deployment_client.update_service(service.id, model_id)
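
The create-or-update logic can be tried out without an Azure account by running it against an in-memory stand-in. In the sketch below, FakeDeploymentClient and create_or_update are hypothetical names of our own; the fake only mimics the three method names used in the cell above and is not the amlrealtimeai DeploymentClient.

```python
from types import SimpleNamespace


class FakeDeploymentClient:
    """In-memory stand-in for illustration; NOT the real DeploymentClient."""

    def __init__(self):
        self._services = {}  # service name -> service object

    def get_service_by_name(self, name):
        return self._services.get(name)

    def create_service(self, name, model_id):
        service = SimpleNamespace(id=len(self._services) + 1, name=name, model_id=model_id)
        self._services[name] = service
        return service

    def update_service(self, service_id, model_id):
        for service in self._services.values():
            if service.id == service_id:
                service.model_id = model_id
                return service
        raise KeyError(service_id)


def create_or_update(client, name, model_id):
    """Create the named service if it does not exist, otherwise point it at model_id."""
    service = client.get_service_by_name(name)
    if service is None:
        return client.create_service(name, model_id)
    return client.update_service(service.id, model_id)


fake_client = FakeDeploymentClient()
first = create_or_update(fake_client, "quickstart-service", "model-v1")
second = create_or_update(fake_client, "quickstart-service", "model-v2")
print(first.id == second.id, second.model_id)
```

Running the function twice with the same name leaves a single service that points at the most recently registered model, which is the behavior described above.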

Client

The service supports gRPC and the TensorFlow Serving "predict" API. We provide a client that can call the service to get predictions.

In [ ]:
from amlrealtimeai import PredictionClient
client = PredictionClient(service.ipAddress, service.port)

To understand the results, we need a mapping from class IDs to the human-readable ImageNet class names.

In [ ]:
import requests
classes_entries = requests.get("https://raw.githubusercontent.com/Lasagne/Recipes/master/examples/resnet50/imagenet_classes.txt").text.splitlines()

We can now send an image to the service and get the predictions. Let's see if it can identify a snow leopard.

[Image: Snow leopard in a zoo. Photo by Peter Bolliger.]

In [ ]:
image_file = 'snowleopardgaze.jpg'
results = client.score_image(image_file)
# map results [class_id] => [confidence]
results = enumerate(results)
# sort results by confidence
sorted_results = sorted(results, key=lambda x: x[1], reverse=True)
# print top 5 results
for top in sorted_results[:5]:
    print(classes_entries[top[0]], 'confidence:', top[1])
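
The same top-5 selection can be written with heapq.nlargest, which avoids sorting the full list of class scores. This is a minimal sketch with made-up scores and labels; top_k, fake_scores, and fake_labels are illustrative names, and the scores are not real service output.

```python
import heapq


def top_k(scores, k=5):
    """Return the k (class_id, confidence) pairs with the highest confidence."""
    return heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])


# Made-up confidences, indexed by class ID.
fake_scores = [0.01, 0.70, 0.05, 0.20, 0.04]
fake_labels = ["tabby", "snow leopard", "lynx", "leopard", "cheetah"]

for class_id, confidence in top_k(fake_scores, k=3):
    print(fake_labels[class_id], "confidence:", confidence)
```

For a list of one thousand ImageNet scores the difference from a full sort is negligible, so either form is fine here.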

Cleanup

Run the cell below to delete your service.

In [ ]:
services = deployment_client.list_services()

for service in filter(lambda x: x.name == service_name, services):
    print(service.id)
    deployment_client.delete_service(service.id)
    
models = deployment_client.list_models()

for model in filter(lambda x: x.name == model_name, models):
    print(model.id)
    deployment_client.delete_model(model.id)

Congratulations! You've just created a service that does predictions using an FPGA. The next notebook shows how to customize the service using transfer learning to classify different types of images.