Basic classification: Classify images of clothing

BentoML makes moving trained ML models to production easy:

  • Package models trained with any ML framework and reproduce them for model serving in production
  • Deploy anywhere for online API serving or offline batch serving
  • High-performance API model server with adaptive micro-batching support
  • Central hub for managing models and the deployment process via Web UI and APIs
  • Modular and flexible design, adaptable to your infrastructure

BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between Data Science and DevOps, enabling teams to deliver prediction services in a fast, repeatable, and scalable way.

Setup

In [2]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
In [1]:
!pip install -q bentoml tensorflow matplotlib
In [2]:
from __future__ import absolute_import, division, print_function, unicode_literals

import io

# TensorFlow
import tensorflow as tf

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
print(tf.__version__)
2.1.0
In [3]:
fashion_mnist = tf.keras.datasets.fashion_mnist
(_train_images, train_labels), (_test_images, test_labels) = fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
train_images = _train_images / 255.0
test_images = _test_images / 255.0
In [4]:
class FashionMnist(tf.keras.Model):
    def __init__(self):
        super(FashionMnist, self).__init__()
        self.cnn = tf.keras.Sequential([
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(128, activation='relu'),
            tf.keras.layers.Dense(10, activation='softmax')
        ])
    
    @staticmethod
    def image_bytes2tensor(inputs):
        with tf.device("cpu:0"):  # map_fn has issues on GPU https://github.com/tensorflow/tensorflow/issues/28007
            inputs = tf.map_fn(lambda i: tf.io.decode_png(i, channels=1), inputs, dtype=tf.uint8)
        inputs = tf.cast(inputs, tf.float32)
        inputs = (255.0 - inputs) / 255.0
        inputs = tf.reshape(inputs, [-1, 28, 28])
        return inputs

    @tf.function(input_signature=[tf.TensorSpec(shape=(None,), dtype=tf.string)])
    def predict_image(self, inputs):
        inputs = self.image_bytes2tensor(inputs)
        return self(inputs)
    
    def call(self, inputs):
        return self.cnn(inputs)
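
The (255.0 - inputs) / 255.0 step in image_bytes2tensor undoes the inversion applied when the test image is saved below (the PNG is written as 255.0 - img), so decoded pixels land back on the same [0, 1] scale used for training. A minimal NumPy sketch of that mapping, with made-up pixel values:

```python
import numpy as np

# The decoder maps a saved, inverted grayscale PNG back to the training scale:
# pixel 255 (the background of the inverted image) -> 0.0, pixel 0 -> 1.0.
decoded = np.array([[255, 0], [128, 64]], dtype=np.uint8)
normalized = (255.0 - decoded.astype(np.float32)) / 255.0
print(normalized)
```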

Verify the image preprocessing

In [5]:
# pick a test image
d_test_img = _test_images[0]
print(class_names[test_labels[0]])

plt.imshow(255.0 - d_test_img, cmap='gray')
plt.imsave("test.png", 255.0 - d_test_img, cmap='gray')

# read bytes
with open("test.png", "rb") as f:
    img_bytes = f.read()

# verify the saved image round-trips through the preprocessing
# (compare against the normalized original; use the absolute error so the
# check cannot pass on a negative mean difference)
assert tf.reduce_mean(tf.abs(FashionMnist.image_bytes2tensor(tf.constant([img_bytes])) - d_test_img / 255.0)) < 0.01
Ankle boot

Train the model

In [9]:
model = FashionMnist()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=50)
Train on 60000 samples
Epoch 1/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.5019 - accuracy: 0.8245
Epoch 2/50
60000/60000 [==============================] - 3s 51us/sample - loss: 0.3775 - accuracy: 0.8638
Epoch 3/50
60000/60000 [==============================] - 3s 53us/sample - loss: 0.3397 - accuracy: 0.8765
Epoch 4/50
60000/60000 [==============================] - 3s 52us/sample - loss: 0.3135 - accuracy: 0.8852
Epoch 5/50
60000/60000 [==============================] - 3s 52us/sample - loss: 0.2946 - accuracy: 0.8908
Epoch 6/50
60000/60000 [==============================] - 3s 53us/sample - loss: 0.2792 - accuracy: 0.8968
Epoch 7/50
60000/60000 [==============================] - 3s 52us/sample - loss: 0.2695 - accuracy: 0.9012
Epoch 8/50
60000/60000 [==============================] - 3s 53us/sample - loss: 0.2575 - accuracy: 0.9033
Epoch 9/50
60000/60000 [==============================] - 3s 53us/sample - loss: 0.2478 - accuracy: 0.9072
Epoch 10/50
60000/60000 [==============================] - 3s 53us/sample - loss: 0.2403 - accuracy: 0.9105
Epoch 11/50
60000/60000 [==============================] - 3s 54us/sample - loss: 0.2319 - accuracy: 0.9134
Epoch 12/50
60000/60000 [==============================] - 3s 53us/sample - loss: 0.2239 - accuracy: 0.9163
Epoch 13/50
60000/60000 [==============================] - 3s 53us/sample - loss: 0.2181 - accuracy: 0.9187
Epoch 14/50
60000/60000 [==============================] - 3s 53us/sample - loss: 0.2129 - accuracy: 0.9200
Epoch 15/50
60000/60000 [==============================] - 3s 53us/sample - loss: 0.2053 - accuracy: 0.9235
Epoch 16/50
60000/60000 [==============================] - 3s 54us/sample - loss: 0.1995 - accuracy: 0.9245
Epoch 17/50
60000/60000 [==============================] - 3s 57us/sample - loss: 0.1942 - accuracy: 0.9268
Epoch 18/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1879 - accuracy: 0.9289
Epoch 19/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1849 - accuracy: 0.9305
Epoch 20/50
60000/60000 [==============================] - 3s 54us/sample - loss: 0.1788 - accuracy: 0.9328
Epoch 21/50
60000/60000 [==============================] - 3s 54us/sample - loss: 0.1779 - accuracy: 0.9330
Epoch 22/50
60000/60000 [==============================] - 3s 54us/sample - loss: 0.1715 - accuracy: 0.9363
Epoch 23/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1661 - accuracy: 0.9368
Epoch 24/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1605 - accuracy: 0.9390
Epoch 25/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1584 - accuracy: 0.9406
Epoch 26/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1543 - accuracy: 0.9420
Epoch 27/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1522 - accuracy: 0.9423
Epoch 28/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1486 - accuracy: 0.9437
Epoch 29/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1448 - accuracy: 0.9452
Epoch 30/50
60000/60000 [==============================] - 3s 57us/sample - loss: 0.1421 - accuracy: 0.9477
Epoch 31/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1394 - accuracy: 0.9478
Epoch 32/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1369 - accuracy: 0.9478
Epoch 33/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1323 - accuracy: 0.9506
Epoch 34/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1299 - accuracy: 0.9512
Epoch 35/50
60000/60000 [==============================] - 3s 56us/sample - loss: 0.1285 - accuracy: 0.9511
Epoch 36/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1252 - accuracy: 0.9530
Epoch 37/50
60000/60000 [==============================] - 3s 56us/sample - loss: 0.1255 - accuracy: 0.9526
Epoch 38/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1196 - accuracy: 0.9546
Epoch 39/50
60000/60000 [==============================] - 3s 57us/sample - loss: 0.1182 - accuracy: 0.9548
Epoch 40/50
60000/60000 [==============================] - 3s 56us/sample - loss: 0.1157 - accuracy: 0.9557
Epoch 41/50
60000/60000 [==============================] - 3s 56us/sample - loss: 0.1153 - accuracy: 0.9570
Epoch 42/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1089 - accuracy: 0.9596
Epoch 43/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1098 - accuracy: 0.9586
Epoch 44/50
60000/60000 [==============================] - 3s 54us/sample - loss: 0.1087 - accuracy: 0.9590
Epoch 45/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1037 - accuracy: 0.9609
Epoch 46/50
60000/60000 [==============================] - 3s 56us/sample - loss: 0.1059 - accuracy: 0.9600
Epoch 47/50
60000/60000 [==============================] - 3s 56us/sample - loss: 0.1024 - accuracy: 0.9623
Epoch 48/50
60000/60000 [==============================] - 3s 56us/sample - loss: 0.1004 - accuracy: 0.9622
Epoch 49/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.1000 - accuracy: 0.9624
Epoch 50/50
60000/60000 [==============================] - 3s 55us/sample - loss: 0.0957 - accuracy: 0.9641
Out[9]:
<tensorflow.python.keras.callbacks.History at 0x7f477003d9b0>

Model inference test run

In [10]:
predict = model.predict_image(tf.constant([img_bytes]))
klass = tf.argmax(predict, axis=1)
[class_names[c] for c in klass]
Out[10]:
['Ankle boot']

The model predicts the expected label.

Create BentoService class

In [18]:
%%writefile tensorflow_fashion_mnist.py

import bentoml
import tensorflow as tf

from bentoml.artifact import TensorflowSavedModelArtifact
from bentoml.adapters import TfTensorInput


FASHION_MNIST_CLASSES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


@bentoml.env(pip_dependencies=['tensorflow', 'numpy', 'pillow'])
@bentoml.artifacts([TensorflowSavedModelArtifact('model')])
class FashionMnistTensorflow(bentoml.BentoService):

    @bentoml.api(input=TfTensorInput(), batch=True)
    def predict(self, inputs):
        outputs = self.artifacts.model.predict_image(inputs)
        output_classes = tf.math.argmax(outputs, axis=1)
        return [FASHION_MNIST_CLASSES[c] for c in output_classes]
Overwriting tensorflow_fashion_mnist.py
In [19]:
from tensorflow_fashion_mnist import FashionMnistTensorflow

bento_svc = FashionMnistTensorflow()
bento_svc.pack("model", model)
saved_path = bento_svc.save()
[2020-07-28 15:36:46,482] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle.
WARNING:tensorflow:From /opt/anaconda3/envs/bentoml-dev-py36/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: /tmp/bentoml-temp-8dcup0pe/FashionMnistTensorflow/artifacts/model_saved_model/assets
[2020-07-28 15:36:58,212] INFO - Detect BentoML installed in development model, copying local BentoML module file to target saved bundle path
running sdist
...
Creating tar archive
removing 'BentoML-0.8.3+42.gb8d36b6' (and everything under it)
[2020-07-28 15:36:59,228] INFO - BentoService bundle 'FashionMnistTensorflow:20200728153646_C57052' saved to: /home/bentoml/bentoml/repository/FashionMnistTensorflow/20200728153646_C57052

Use the BentoService with the BentoML CLI

bentoml get <BentoService name> lists all of the BentoService's versions:

In [ ]:
!bentoml get FashionMnistTensorflow

bentoml get <BentoService name>:<BentoService version> displays detailed information for a specific BentoService version:

In [7]:
!bentoml get FashionMnistTensorflow:latest 
[2020-07-28 15:41:15,592] INFO - Getting latest version FashionMnistTensorflow:20200728153646_C57052
{
  "name": "FashionMnistTensorflow",
  "version": "20200728153646_C57052",
  "uri": {
    "type": "LOCAL",
    "uri": "/home/bentoml/bentoml/repository/FashionMnistTensorflow/20200728153646_C57052"
  },
  "bentoServiceMetadata": {
    "name": "FashionMnistTensorflow",
    "version": "20200728153646_C57052",
    "createdAt": "2020-07-28T07:36:59.200032Z",
    "env": {
      "condaEnv": "name: bentoml-FashionMnistTensorflow\nchannels:\n- defaults\ndependencies:\n- python=3.6.10\n- pip\n",
      "pipDependencies": "tensorflow\nbentoml==0.8.3\nnumpy\npillow",
      "pythonVersion": "3.6.10",
      "dockerBaseImage": "bentoml/model-server:0.8.3"
    },
    "artifacts": [
      {
        "name": "model",
        "artifactType": "TensorflowSavedModelArtifact"
      }
    ],
    "apis": [
      {
        "name": "predict",
        "inputType": "TfTensorInput",
        "docs": "BentoService inference API 'predict', input: 'TfTensorInput', output: 'DefaultOutput'",
        "inputConfig": {
          "method": "predict",
          "is_batch_input": true
        },
        "outputConfig": {
          "cors": "*"
        },
        "outputType": "DefaultOutput",
        "mbMaxLatency": 10000,
        "mbMaxBatchSize": 2000
      }
    ]
  }
}

Start the BentoML REST API server locally

In [2]:
!bentoml serve FashionMnistTensorflow:latest
[2020-07-28 15:39:14,893] INFO - Getting latest version FashionMnistTensorflow:20200728153646_C57052
[2020-07-28 15:39:14,893] INFO - Starting BentoML API server in development mode..
[2020-07-28 15:39:16,013] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle.
[2020-07-28 15:39:16,036] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.8.3, but loading from BentoML version 0.8.3+42.gb8d36b6
2020-07-28 15:39:17.628512: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-28 15:39:17.644615: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 5.93GiB deviceMemoryBandwidth: 178.99GiB/s
... (repeated CUDA dso_loader and NUMA-node initialization messages omitted) ...
2020-07-28 15:39:17.798261: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5065 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-07-28 15:39:17.821807: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
 * Serving Flask app "FashionMnistTensorflow" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
^C

Query the REST API with Python

In [8]:
import base64
import json
import requests

# Read the test image and base64-encode it for the JSON payload
with open("test.png", "rb") as f:
    img_bytes = f.read()
img_b64 = base64.b64encode(img_bytes).decode()

# TensorFlow Serving style request body: {"instances": [{"b64": ...}]}
headers = {"content-type": "application/json"}
data = json.dumps({"instances": [{"b64": img_b64}]})
print('Data: {} ... {}'.format(data[:50], data[-52:]))

json_response = requests.post('http://localhost:5000/predict', data=data, headers=headers)
print(json_response)
print(json_response.text)
Data: {"instances": [{"b64": "iVBORw0KGgoAAAANSUhEUgAAAB ... ufkz8DPG//sD/AX8I8DvdgnOxdB4B1wAAAAASUVORK5CYII="}]}
<Response [200]>
["Ankle boot"]
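Building the `{"instances": [{"b64": ...}]}` body by hand is easy to get wrong, so it can be factored into a small helper. This is an illustrative sketch, not part of BentoML's API; `build_b64_payload` is a name invented here:

```python
import base64
import json

def build_b64_payload(*image_bytes):
    """Wrap raw image bytes into the {"instances": [{"b64": ...}]}
    JSON body expected by the TfTensorInput adapter (one entry per image)."""
    instances = [{"b64": base64.b64encode(b).decode()} for b in image_bytes]
    return json.dumps({"instances": instances})

# Any bytes work for illustration; a real request would use PNG bytes.
payload = build_b64_payload(b"fake-png-1", b"fake-png-2")
print(len(json.loads(payload)["instances"]))  # 2
```

The same payload can then be POSTed to `/predict` exactly as in the cell above.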

Use the BentoService as a PyPI package

pip install {saved_path} also installs a CLI tool for accessing the BentoML service

In [10]:
!pip install -q {saved_path}
In [11]:
!FashionMnistTensorflow --help
Usage: FashionMnistTensorflow [OPTIONS] COMMAND [ARGS]...

  BentoML CLI tool

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  containerize        Containerizes given Bento into a ready-to-use Docker
                      image

  info                List APIs
  install-completion  Install shell command completion
  open-api-spec       Display OpenAPI/Swagger JSON specs
  run                 Run API function
  serve               Start local dev API server
  serve-gunicorn      Start production API server

Run the 'predict' API with JSON data:

In [13]:
!echo '{\"instances\":[{\"b64\":\"'$(base64 test.png)'\"}]}' > test.json
!cat test.json | xargs -I {} FashionMnistTensorflow run predict --input={}
2020-07-28 15:45:44.309676: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-28 15:45:44.324506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 5.93GiB deviceMemoryBandwidth: 178.99GiB/s
... (repeated CUDA dso_loader and NUMA-node initialization messages omitted) ...
2020-07-28 15:45:44.502893: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5065 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
['Ankle boot']
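The test.json request file built with the echo/base64 one-liner above can equivalently be produced from Python, which avoids shell-quoting pitfalls. `write_request_file` is an illustrative helper defined here, not part of the generated CLI:

```python
import base64
import json

def write_request_file(image_path, out_path="test.json"):
    """Write the same JSON body the shell one-liner produces:
    {"instances": [{"b64": "<base64 of the image>"}]}"""
    with open(image_path, "rb") as f:
        img_b64 = base64.b64encode(f.read()).decode()
    with open(out_path, "w") as f:
        json.dump({"instances": [{"b64": img_b64}]}, f)
    return out_path
```

For example, `write_request_file("test.png")` recreates test.json, which can then be passed to `FashionMnistTensorflow run predict` as above.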

Build a real-time prediction service in Docker with the BentoService

In [14]:
!docker build --quiet -t tensorflow2-fashion-mnist {saved_path}
sha256:dbf669648a388423aedf26c2124c31299c5453fc8a5740ad15bddafe22fda5e5
In [15]:
!docker run -p 5000:5000 tensorflow2-fashion-mnist --workers 1 --enable-microbatch
[2020-07-28 07:51:49,529] INFO - Starting BentoML API server in production mode..
[2020-07-28 07:51:49,927] INFO - Running micro batch service on :5000
[2020-07-28 07:51:49 +0000] [12] [INFO] Starting gunicorn 20.0.4
[2020-07-28 07:51:49 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2020-07-28 07:51:49 +0000] [12] [INFO] Listening at: http://0.0.0.0:5000 (12)
[2020-07-28 07:51:49 +0000] [12] [INFO] Using worker: aiohttp.worker.GunicornWebWorker
[2020-07-28 07:51:49 +0000] [1] [INFO] Listening at: http://0.0.0.0:37381 (1)
[2020-07-28 07:51:49 +0000] [1] [INFO] Using worker: sync
[2020-07-28 07:51:49 +0000] [13] [INFO] Booting worker with pid: 13
[2020-07-28 07:51:49 +0000] [14] [INFO] Booting worker with pid: 14
[2020-07-28 07:51:49,954] WARNING - Using BentoML not from official PyPI release. In order to find the same version of BentoML when deploying your BentoService, you must set the 'core/bentoml_deploy_version' config to a http/git location of your BentoML fork, e.g.: 'bentoml_deploy_version = git+https://github.com/{username}/bentoml.git@{branch}'
[2020-07-28 07:51:49,974] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle created with BentoML version 0.8.3, but loading with BentoML version 0.8.3+42.gb8d36b6
[2020-07-28 07:51:49,985] INFO - Micro batch enabled for API `predict`
[2020-07-28 07:51:49,985] INFO - Your system nofile limit is 1048576, which means each instance of the micro-batch service can hold this number of connections at the same time. You can increase the number of file descriptors for the server process, or launch more micro-batch instances to accept more concurrent connections.
... (repeated warnings from the worker process omitted) ...
2020-07-28 07:51:51.085351: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-07-28 07:51:51.085674: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-07-28 07:51:52.300840: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-07-28 07:51:52.300872: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-07-28 07:51:52.300898: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
2020-07-28 07:51:52.301165: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-07-28 07:51:52.325362: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2699905000 Hz
2020-07-28 07:51:52.325805: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55f10b5e5c00 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-28 07:51:52.325835: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
^C
[2020-07-28 07:51:59 +0000] [1] [INFO] Handling signal: int
[2020-07-28 07:51:59 +0000] [13] [INFO] Worker exiting (pid: 13)
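Adaptive micro-batching (enabled above with --enable-microbatch) groups concurrent requests into a single model call, bounded by a maximum batch size and a maximum latency — the mbMaxBatchSize and mbMaxLatency values recorded in the bundle metadata. The core idea can be sketched as a toy, single-threaded simulation; this is not BentoML's actual implementation, just an illustration of the flush conditions:

```python
import time
from collections import deque

class ToyMicroBatcher:
    """Collect requests until either max_batch_size is reached or
    max_latency_ms has elapsed since the oldest queued request,
    then run the model once on the whole batch."""

    def __init__(self, model_fn, max_batch_size=2000, max_latency_ms=10000):
        self.model_fn = model_fn
        self.max_batch_size = max_batch_size
        self.max_latency_ms = max_latency_ms
        self.queue = deque()  # entries: (enqueue_time, payload)

    def submit(self, payload, now=None):
        now = time.monotonic() if now is None else now
        self.queue.append((now, payload))
        return self._maybe_flush(now)

    def _maybe_flush(self, now):
        oldest, _ = self.queue[0]
        full = len(self.queue) >= self.max_batch_size
        stale = (now - oldest) * 1000 >= self.max_latency_ms
        if full or stale:
            batch = [p for _, p in self.queue]
            self.queue.clear()
            return self.model_fn(batch)  # one model call for the whole batch
        return None  # keep waiting for more requests

# Demo: a "model" that doubles each input, with batch size 3.
batcher = ToyMicroBatcher(lambda xs: [x * 2 for x in xs], max_batch_size=3)
print(batcher.submit(1))  # None - batch not full yet
print(batcher.submit(2))  # None
print(batcher.submit(3))  # [2, 4, 6] - batch is full, model runs once
```

The real server does this across concurrent HTTP connections and tunes the batch window adaptively, but the trade-off is the same: a bounded wait in exchange for far fewer model invocations.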

Deploy BentoService as REST API server to the cloud

BentoML supports deployment to multiple cloud provider services, such as AWS Lambda, AWS SageMaker, and Google Cloud Run. You can find the full list and deployment guides on the documentation site at https://docs.bentoml.org/en/latest/deployment/index.html

For this demo, we are going to deploy to AWS SageMaker

In [ ]:
bento_service_tag = f'{bento_svc.name}:{bento_svc.version}'
print(bento_service_tag)
In [ ]:
!bentoml sagemaker deploy first-tf-fashion -b {bento_service_tag} --api-name predict --verbose
In [ ]:
!bentoml sagemaker get first-tf-fashion
In [ ]:
!aws sagemaker-runtime invoke-endpoint --endpoint-name dev-first-tf-fashion --content-type 'application/json' \
--body "{\"instances\":[{\"b64\":\"iVBORw0KGgoAAAANSUhEUgAAABwAAAAcCAYAAAByDd+UAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAA2dJREFUSIntlk9L60oYh59Mp43axJoiVj2KboSuXPoJXLkQ/Fx+Ahe6ceFG3CtdWlFcFf9RKlhFatpSbWuaNLHJnIW3OR7ugcu9BzxccGAgZCbv887v/b1JNKWU4hOH+EzYF/AL+OeBYRjysct83wegUqn8HlApRRRFADw+PrK/v4/jOCQSCTRNi/fpug7AwcHB7wEBhHh/9Pj4mGKxyM7Ozt/2NBoNdnd3MU0zvif/CywMQ6SUnJ+fc3NzQy6Xo1KpsLGxgWVZ9Pt9FhYWaLVadLtd5ubmfiT6b2FRFCGlxHEc9vf3UUrR7/d5fX1FKRXPq6srpJRYlsVgMPhn4LD4URTF12EYxlJubW2Ry+WYnp7G8zz6/T65XC6uYzqdRtd1fN+n2+3iOM6vgcPgw+ILIdA0jTAMSSQSAOzt7WHbNtPT0xiGQbvdJpvNMjU1RTKZJAxD3t7e4ni9Xo/b29tfA4egKIoYDAZxAkPYzs4OpVKJ+fl5Wq0W7XYbz/PIZDK8vr6iaRpjY2Mkk0mUUnG8w8ND4INphjbXNA2lFEKIWD6AWq3GwcEBnuextLSE4zj4vk+r1SKVSqFpGq7rxsnpuo4QgnQ6jRCCYrH4DhxK9TH4MKtms0m1WqVcLvP09EQqlWJ8fJx2u0232+Xt7Q3f9xFCUK1WGQwGTExMkEwmEUKglGJ0dJQwDDEMg8vLS+RQqnq9zv39Pb1ej16vh+d53N3d4bouUkpM0ySKIjqdDp7nIaXEdV1GR0fRdZ0gCJidnaXT6eC6LpZl4TgOLy8vpNNpbNvm+fn5XdJCoUCtVkNKSbPZjA0yBDmOg23bKKXwfR/LsoiiCMdxCMOQdDqNYRhkMhkajUaslGVZCCHwPI8gCJBSIo+Ojtje3iafzzMzMxOfJJVKxe9G0zQJggAhRNxvnuehaRpRFGHbNvV6nevra4IgIAxDAAzDwHVddF3HMAympqaQKysrnJ6ecnFx8aOwf50sm82SzWbJZDIEQYBSilarRblcxnVdut0umqZRKpVYXl5mcXGRQqGA7/uxD6SUfPv2DdM031388SfKcRzOzs4ol8ucnJzQbDZ/atqhA7PZLPl8ntXVVdbW1hgZGYnX19fXeXh4YHJyEtM0MU0TKSW6rrO5ufkz8DPG//sD/AX8I8DvdgnOxdB4B1wAAAAASUVORK5CYII=\"}]}" \
output.json && cat output.json
In [ ]:
!bentoml sagemaker delete first-tf-fashion --force
In [ ]: