BentoML Example: TensorFlow 2.0 (Echo Model)

BentoML is an open source platform for machine learning model serving and deployment.

This notebook demonstrates how to use BentoML to turn a TensorFlow model into a Docker image containing a REST API server serving the model, how to use your ML service built with BentoML as a CLI tool, and how to distribute it as a PyPI package.

Setup

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
In [8]:
from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf
import numpy as np
print(tf.__version__)

import os
import time
import requests
import json
2.1.0
In [9]:
class EchoModel(tf.keras.Model):
    def call(self, x):
        return tf.multiply(x, 1)

custom_model = EchoModel()
custom_model.compile(optimizer='sgd',
              loss="mean_squared_error",
              metrics=['accuracy'])

test_input = tf.constant(np.zeros([1, 2, 2]))
test_output = tf.constant(np.zeros([1, 2, 2]))

custom_model.fit(test_input, test_output, epochs=1)  # Required: fitting once generates the serving signature automatically

# test
custom_model(tf.constant(np.ones([4, 2, 3]), dtype=tf.float32))
Train on 1 samples
WARNING:tensorflow:The list of trainable weights is empty. Make sure that you are not setting model.trainable to False before compiling the model.
1/1 [==============================] - 0s 53ms/sample - loss: 0.0000e+00 - accuracy: 1.0000
Out[9]:
<tf.Tensor: shape=(4, 2, 3), dtype=float32, numpy=
array([[[1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.]]], dtype=float32)>
In [10]:
test_tensor = tf.constant(np.zeros([2,4,1]), dtype=tf.float32)
custom_model(test_tensor)
Out[10]:
<tf.Tensor: shape=(2, 4, 1), dtype=float32, numpy=
array([[[0.],
        [0.],
        [0.],
        [0.]],

       [[0.],
        [0.],
        [0.],
        [0.]]], dtype=float32)>

Create a BentoService with BentoML

In [6]:
%%writefile tensorflow_echo.py

import bentoml
import tensorflow as tf
import numpy as np

from bentoml.artifact import TensorflowSavedModelArtifact
from bentoml.adapters import TfTensorInput


@bentoml.env(pip_dependencies=['tensorflow', 'numpy', 'scikit-learn'])
@bentoml.artifacts([TensorflowSavedModelArtifact('model')])
class EchoServicer(bentoml.BentoService):
    @bentoml.api(input=TfTensorInput())
    def predict(self, tensor):
        outputs = self.artifacts.model(tensor)
        return outputs
Overwriting tensorflow_echo.py
In [7]:
# save model
from tensorflow_echo import EchoServicer
bento_svc = EchoServicer()
bento_svc.pack("model", custom_model)
saved_path = bento_svc.save()
[2020-07-28 16:01:48,920] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle.
WARNING:tensorflow:From /opt/anaconda3/envs/bentoml-dev-py36/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: /tmp/bentoml-temp-z9s76fxu/EchoServicer/artifacts/model_saved_model/assets
[2020-07-28 16:01:59,163] INFO - Detect BentoML installed in development model, copying local BentoML module file to target saved bundle path
running sdist
running egg_info
writing BentoML.egg-info/PKG-INFO
writing dependency_links to BentoML.egg-info/dependency_links.txt
writing entry points to BentoML.egg-info/entry_points.txt
writing requirements to BentoML.egg-info/requires.txt
writing top-level names to BentoML.egg-info/top_level.txt
reading manifest file 'BentoML.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '*~' found anywhere in distribution
warning: no previously-included files matching '*.pyo' found anywhere in distribution
warning: no previously-included files matching '.git' found anywhere in distribution
warning: no previously-included files matching '.ipynb_checkpoints' found anywhere in distribution
warning: no previously-included files matching '__pycache__' found anywhere in distribution
warning: no directories found matching 'bentoml/server/static'
warning: no directories found matching 'bentoml/yatai/web/dist'
no previously-included directories found matching 'e2e_tests'
no previously-included directories found matching 'tests'
no previously-included directories found matching 'benchmark'
writing manifest file 'BentoML.egg-info/SOURCES.txt'
running check
creating BentoML-0.8.3+42.gb8d36b6
creating BentoML-0.8.3+42.gb8d36b6/BentoML.egg-info
creating BentoML-0.8.3+42.gb8d36b6/bentoml
creating BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
creating BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
creating BentoML-0.8.3+42.gb8d36b6/bentoml/cli
creating BentoML-0.8.3+42.gb8d36b6/bentoml/clipper
creating BentoML-0.8.3+42.gb8d36b6/bentoml/configuration
creating BentoML-0.8.3+42.gb8d36b6/bentoml/configuration/__pycache__
creating BentoML-0.8.3+42.gb8d36b6/bentoml/handlers
creating BentoML-0.8.3+42.gb8d36b6/bentoml/marshal
creating BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
creating BentoML-0.8.3+42.gb8d36b6/bentoml/server
creating BentoML-0.8.3+42.gb8d36b6/bentoml/utils
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/client
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/aws_lambda
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/azure_functions
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/sagemaker
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations/__pycache__
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations/versions
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations/versions/__pycache__
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/proto
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/repository
creating BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/validator
copying files to BentoML-0.8.3+42.gb8d36b6...
copying LICENSE -> BentoML-0.8.3+42.gb8d36b6
copying MANIFEST.in -> BentoML-0.8.3+42.gb8d36b6
copying README.md -> BentoML-0.8.3+42.gb8d36b6
copying pyproject.toml -> BentoML-0.8.3+42.gb8d36b6
copying setup.cfg -> BentoML-0.8.3+42.gb8d36b6
copying setup.py -> BentoML-0.8.3+42.gb8d36b6
copying versioneer.py -> BentoML-0.8.3+42.gb8d36b6
copying BentoML.egg-info/PKG-INFO -> BentoML-0.8.3+42.gb8d36b6/BentoML.egg-info
copying BentoML.egg-info/SOURCES.txt -> BentoML-0.8.3+42.gb8d36b6/BentoML.egg-info
copying BentoML.egg-info/dependency_links.txt -> BentoML-0.8.3+42.gb8d36b6/BentoML.egg-info
copying BentoML.egg-info/entry_points.txt -> BentoML-0.8.3+42.gb8d36b6/BentoML.egg-info
copying BentoML.egg-info/requires.txt -> BentoML-0.8.3+42.gb8d36b6/BentoML.egg-info
copying BentoML.egg-info/top_level.txt -> BentoML-0.8.3+42.gb8d36b6/BentoML.egg-info
copying bentoml/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml
copying bentoml/_version.py -> BentoML-0.8.3+42.gb8d36b6/bentoml
copying bentoml/exceptions.py -> BentoML-0.8.3+42.gb8d36b6/bentoml
copying bentoml/service.py -> BentoML-0.8.3+42.gb8d36b6/bentoml
copying bentoml/service_env.py -> BentoML-0.8.3+42.gb8d36b6/bentoml
copying bentoml/adapters/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/base_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/base_output.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/clipper_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/dataframe_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/dataframe_output.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/default_output.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/fastai_image_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/file_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/image_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/json_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/json_output.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/legacy_image_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/legacy_json_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/multi_image_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/pytorch_tensor_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/tensorflow_tensor_input.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/tensorflow_tensor_output.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/adapters/utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/adapters
copying bentoml/artifact/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/fastai2_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/fastai_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/fasttext_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/h2o_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/json_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/keras_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/lightgbm_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/onnx_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/pickle_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/pytorch_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/sklearn_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/spacy_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/text_file_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/tf_savedmodel_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/artifact/xgboost_model_artifact.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/artifact
copying bentoml/cli/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/aws_lambda.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/aws_sagemaker.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/azure_functions.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/bento_management.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/bento_service.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/click_utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/config.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/deployment.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/cli/yatai_service.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/cli
copying bentoml/clipper/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/clipper
copying bentoml/configuration/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/configuration
copying bentoml/configuration/configparser.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/configuration
copying bentoml/configuration/default_bentoml.cfg -> BentoML-0.8.3+42.gb8d36b6/bentoml/configuration
copying bentoml/configuration/__pycache__/__init__.cpython-36.pyc -> BentoML-0.8.3+42.gb8d36b6/bentoml/configuration/__pycache__
copying bentoml/configuration/__pycache__/__init__.cpython-37.pyc -> BentoML-0.8.3+42.gb8d36b6/bentoml/configuration/__pycache__
copying bentoml/configuration/__pycache__/__init__.cpython-38.pyc -> BentoML-0.8.3+42.gb8d36b6/bentoml/configuration/__pycache__
copying bentoml/configuration/__pycache__/configparser.cpython-36.pyc -> BentoML-0.8.3+42.gb8d36b6/bentoml/configuration/__pycache__
copying bentoml/configuration/__pycache__/configparser.cpython-37.pyc -> BentoML-0.8.3+42.gb8d36b6/bentoml/configuration/__pycache__
copying bentoml/configuration/__pycache__/configparser.cpython-38.pyc -> BentoML-0.8.3+42.gb8d36b6/bentoml/configuration/__pycache__
copying bentoml/handlers/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/handlers
copying bentoml/marshal/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/marshal
copying bentoml/marshal/dispatcher.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/marshal
copying bentoml/marshal/marshal.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/marshal
copying bentoml/marshal/utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/marshal
copying bentoml/saved_bundle/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
copying bentoml/saved_bundle/bentoml-init.sh -> BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
copying bentoml/saved_bundle/bundler.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
copying bentoml/saved_bundle/config.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
copying bentoml/saved_bundle/docker-entrypoint.sh -> BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
copying bentoml/saved_bundle/loader.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
copying bentoml/saved_bundle/pip_pkg.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
copying bentoml/saved_bundle/py_module_utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
copying bentoml/saved_bundle/templates.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/saved_bundle
copying bentoml/server/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/server
copying bentoml/server/api_server.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/server
copying bentoml/server/gunicorn_config.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/server
copying bentoml/server/gunicorn_server.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/server
copying bentoml/server/instruments.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/server
copying bentoml/server/marshal_server.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/server
copying bentoml/server/open_api.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/server
copying bentoml/server/trace.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/server
copying bentoml/server/utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/server
copying bentoml/utils/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/alg.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/benchmark.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/cloudpickle.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/dataframe_util.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/flask_ngrok.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/hybridmethod.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/lazy_loader.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/log.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/s3.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/tempdir.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/utils/usage_stats.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/utils
copying bentoml/yatai/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai
copying bentoml/yatai/alembic.ini -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai
copying bentoml/yatai/db.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai
copying bentoml/yatai/deployment_utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai
copying bentoml/yatai/status.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai
copying bentoml/yatai/utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai
copying bentoml/yatai/yatai_service.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai
copying bentoml/yatai/yatai_service_impl.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai
copying bentoml/yatai/client/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/client
copying bentoml/yatai/client/bento_repository_api.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/client
copying bentoml/yatai/client/deployment_api.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/client
copying bentoml/yatai/deployment/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment
copying bentoml/yatai/deployment/operator.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment
copying bentoml/yatai/deployment/store.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment
copying bentoml/yatai/deployment/utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment
copying bentoml/yatai/deployment/aws_lambda/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/aws_lambda
copying bentoml/yatai/deployment/aws_lambda/download_extra_resources.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/aws_lambda
copying bentoml/yatai/deployment/aws_lambda/lambda_app.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/aws_lambda
copying bentoml/yatai/deployment/aws_lambda/operator.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/aws_lambda
copying bentoml/yatai/deployment/aws_lambda/utils.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/aws_lambda
copying bentoml/yatai/deployment/azure_functions/Dockerfile -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/azure_functions
copying bentoml/yatai/deployment/azure_functions/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/azure_functions
copying bentoml/yatai/deployment/azure_functions/app_init.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/azure_functions
copying bentoml/yatai/deployment/azure_functions/constants.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/azure_functions
copying bentoml/yatai/deployment/azure_functions/host.json -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/azure_functions
copying bentoml/yatai/deployment/azure_functions/local.settings.json -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/azure_functions
copying bentoml/yatai/deployment/azure_functions/operator.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/azure_functions
copying bentoml/yatai/deployment/azure_functions/templates.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/azure_functions
copying bentoml/yatai/deployment/sagemaker/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/sagemaker
copying bentoml/yatai/deployment/sagemaker/model_server.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/sagemaker
copying bentoml/yatai/deployment/sagemaker/nginx.conf -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/sagemaker
copying bentoml/yatai/deployment/sagemaker/operator.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/sagemaker
copying bentoml/yatai/deployment/sagemaker/serve -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/sagemaker
copying bentoml/yatai/deployment/sagemaker/wsgi.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/deployment/sagemaker
copying bentoml/yatai/migrations/README -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations
copying bentoml/yatai/migrations/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations
copying bentoml/yatai/migrations/env.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations
copying bentoml/yatai/migrations/script.py.mako -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations
copying bentoml/yatai/migrations/__pycache__/env.cpython-36.pyc -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations/__pycache__
copying bentoml/yatai/migrations/versions/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations/versions
copying bentoml/yatai/migrations/versions/a6b00ae45279_add_last_updated_at_for_deployments.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations/versions
copying bentoml/yatai/migrations/versions/__pycache__/a6b00ae45279_add_last_updated_at_for_deployments.cpython-36.pyc -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/migrations/versions/__pycache__
copying bentoml/yatai/proto/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/proto
copying bentoml/yatai/proto/deployment_pb2.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/proto
copying bentoml/yatai/proto/repository_pb2.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/proto
copying bentoml/yatai/proto/status_pb2.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/proto
copying bentoml/yatai/proto/yatai_service_pb2.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/proto
copying bentoml/yatai/proto/yatai_service_pb2_grpc.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/proto
copying bentoml/yatai/repository/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/repository
copying bentoml/yatai/repository/base_repository.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/repository
copying bentoml/yatai/repository/local_repository.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/repository
copying bentoml/yatai/repository/metadata_store.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/repository
copying bentoml/yatai/repository/repository.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/repository
copying bentoml/yatai/repository/s3_repository.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/repository
copying bentoml/yatai/validator/__init__.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/validator
copying bentoml/yatai/validator/deployment_pb_validator.py -> BentoML-0.8.3+42.gb8d36b6/bentoml/yatai/validator
Writing BentoML-0.8.3+42.gb8d36b6/setup.cfg
UPDATING BentoML-0.8.3+42.gb8d36b6/bentoml/_version.py
set BentoML-0.8.3+42.gb8d36b6/bentoml/_version.py to '0.8.3+42.gb8d36b6'
Creating tar archive
removing 'BentoML-0.8.3+42.gb8d36b6' (and everything under it)
[2020-07-28 16:02:00,347] INFO - BentoService bundle 'EchoServicer:20200728160149_E7E0E9' saved to: /home/bentoml/bentoml/repository/EchoServicer/20200728160149_E7E0E9

Test the packed BentoML service

In [8]:
bento_svc.predict([1, 2, 3])
Out[8]:
<tf.Tensor: shape=(3,), dtype=int32, numpy=array([1, 2, 3], dtype=int32)>

Use the BentoService with the BentoML CLI

The bentoml get command retrieves the service and all of its versions

In [ ]:
!bentoml get EchoServicer

When a specific version is provided, bentoml get displays that version's metadata and configuration

In [10]:
!bentoml get EchoServicer:latest
[2020-07-28 16:02:25,624] INFO - Getting latest version EchoServicer:20200728160149_E7E0E9
{
  "name": "EchoServicer",
  "version": "20200728160149_E7E0E9",
  "uri": {
    "type": "LOCAL",
    "uri": "/home/bentoml/bentoml/repository/EchoServicer/20200728160149_E7E0E9"
  },
  "bentoServiceMetadata": {
    "name": "EchoServicer",
    "version": "20200728160149_E7E0E9",
    "createdAt": "2020-07-28T08:02:00.319914Z",
    "env": {
      "condaEnv": "name: bentoml-EchoServicer\nchannels:\n- defaults\ndependencies:\n- python=3.6.10\n- pip\n",
      "pipDependencies": "tensorflow\nbentoml==0.8.3\nnumpy\nscikit-learn",
      "pythonVersion": "3.6.10",
      "dockerBaseImage": "bentoml/model-server:0.8.3"
    },
    "artifacts": [
      {
        "name": "model",
        "artifactType": "TensorflowSavedModelArtifact"
      }
    ],
    "apis": [
      {
        "name": "predict",
        "inputType": "TfTensorInput",
        "docs": "BentoService inference API 'predict', input: 'TfTensorInput', output: 'DefaultOutput'",
        "inputConfig": {
          "method": "predict",
          "is_batch_input": true
        },
        "outputConfig": {
          "cors": "*"
        },
        "outputType": "DefaultOutput",
        "mbMaxLatency": 10000,
        "mbMaxBatchSize": 2000
      }
    ]
  }
}
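The output above (after the initial log line) is plain JSON, so the service metadata can also be consumed programmatically. A minimal sketch with the standard library, run against a trimmed copy of that output:

```python
import json

# A trimmed copy of the `bentoml get EchoServicer:latest` JSON output.
metadata_json = """
{
  "name": "EchoServicer",
  "version": "20200728160149_E7E0E9",
  "bentoServiceMetadata": {
    "apis": [
      {"name": "predict", "inputType": "TfTensorInput", "outputType": "DefaultOutput"}
    ]
  }
}
"""

meta = json.loads(metadata_json)

# Map each inference API name to its input adapter type.
apis = {api["name"]: api["inputType"] for api in meta["bentoServiceMetadata"]["apis"]}
print(apis)  # {'predict': 'TfTensorInput'}
```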

Making a prediction from the CLI is simple: use the bentoml run command to quickly get a prediction result

In [1]:
!bentoml run EchoServicer:latest predict --input='{"instances": [[1, 2]]}'
[2020-07-28 16:03:08,224] INFO - Getting latest version EchoServicer:20200728160149_E7E0E9
[2020-07-28 16:03:09,273] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle.
[2020-07-28 16:03:09,292] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.8.3, but loading from BentoML version 0.8.3+42.gb8d36b6
2020-07-28 16:03:10.932666: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-28 16:03:10.948708: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-28 16:03:10.949104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 5.93GiB deviceMemoryBandwidth: 178.99GiB/s
2020-07-28 16:03:10.949350: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-28 16:03:10.950906: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-28 16:03:10.952614: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-28 16:03:10.953035: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-28 16:03:10.954697: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-28 16:03:10.955657: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-28 16:03:10.959253: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-28 16:03:10.959392: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-28 16:03:10.959778: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-28 16:03:10.960078: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-07-28 16:03:10.960305: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-07-28 16:03:11.085597: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-28 16:03:11.086039: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55696c6c3400 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-28 16:03:11.086060: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1060, Compute Capability 6.1
2020-07-28 16:03:11.086246: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-28 16:03:11.086602: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 5.93GiB deviceMemoryBandwidth: 178.99GiB/s
2020-07-28 16:03:11.086637: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-28 16:03:11.086650: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-28 16:03:11.086692: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-28 16:03:11.086717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-28 16:03:11.086729: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-28 16:03:11.086754: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-28 16:03:11.086780: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-28 16:03:11.086873: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-28 16:03:11.087238: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-28 16:03:11.087564: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-07-28 16:03:11.087591: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-28 16:03:11.088060: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-28 16:03:11.088073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 
2020-07-28 16:03:11.088078: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N 
2020-07-28 16:03:11.088157: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-28 16:03:11.088496: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-28 16:03:11.088811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5084 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-07-28 16:03:11.109453: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2699905000 Hz
2020-07-28 16:03:11.109870: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55696ce812f0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-28 16:03:11.109939: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
[
  [
    1.0,
    2.0
  ]
]
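The --input payload follows the TensorFlow-Serving-style "instances" convention that TfTensorInput accepts: a JSON object whose "instances" key holds a batch of inputs. A small sketch constructing such a payload (the batch values here are illustrative):

```python
import json

# Build a TF-Serving-style request body for TfTensorInput:
# {"instances": [...]} where each element is one input in the batch.
batch = [[1, 2], [3, 4]]
payload = json.dumps({"instances": batch})
print(payload)  # {"instances": [[1, 2], [3, 4]]}
```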

Run the REST API server locally

In [2]:
!bentoml serve EchoServicer:latest
[2020-07-28 16:03:41,001] INFO - Getting latest version EchoServicer:20200728160149_E7E0E9
[2020-07-28 16:03:41,002] INFO - Starting BentoML API server in development mode..
[2020-07-28 16:03:42,320] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle.
[2020-07-28 16:03:42,355] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.8.3, but loading from BentoML version 0.8.3+42.gb8d36b6
2020-07-28 16:03:43.944466: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
... (TensorFlow device initialization output trimmed) ...
2020-07-28 16:03:44.097926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5080 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
 * Serving Flask app "EchoServicer" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
^C

Send prediction request to REST API server

Run the following command in a terminal to make an HTTP request to the API server:

curl -i \
--header "Content-Type: application/json" \
--request POST \
--data '{"instances": [[1, 2]]}' \
localhost:5000/predict
In [3]:
import requests
import json
headers = {"content-type": "application/json"}
data = json.dumps(
    {"instances": [[1, 2, 2, 3], [2, 3, 3, 4]]}
)
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
json_response = requests.post('http://127.0.0.1:5000/predict', data=data, headers=headers)
print(json_response)
print(json_response.text)
Data: {"instances": [[1, 2, 2, 3], [2, 3, 3, 4]]} ... , 3, 4]]}
<Response [200]>
[[1.0, 2.0, 2.0, 3.0], [2.0, 3.0, 3.0, 4.0]]
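Because the model is a pass-through, the JSON response should exactly echo the request instances. A quick sanity check on the response text shown above (plain Python, no running server needed):

```python
import json

# Request payload and response text copied from the cells above
payload = {"instances": [[1, 2, 2, 3], [2, 3, 3, 4]]}
response_text = "[[1.0, 2.0, 2.0, 3.0], [2.0, 3.0, 3.0, 4.0]]"

# The echo model should return the inputs unchanged (as floats)
predictions = json.loads(response_text)
expected = [[float(v) for v in row] for row in payload["instances"]]
assert predictions == expected
print("echo verified")
```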

"pip install" a BentoService bundle

BentoML users can pip install a saved BentoService bundle directly with pip install $SAVED_PATH and use it as a regular Python package.

In [4]:
!pip install -q {saved_path}
In [5]:
import EchoServicer

pip_installed_svc = EchoServicer.load()
In [11]:
pip_installed_svc.predict(test_tensor)
Out[11]:
<tf.Tensor: shape=(2, 4, 1), dtype=float32, numpy=
array([[[0.],
        [0.],
        [0.],
        [0.]],

       [[0.],
        [0.],
        [0.],
        [0.]]], dtype=float32)>

CLI access

Installing the bundle with pip install $SAVED_PATH also adds a CLI tool for accessing the BentoService.

In [12]:
!EchoServicer --help
Usage: EchoServicer [OPTIONS] COMMAND [ARGS]...

  BentoML CLI tool

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  containerize        Containerizes given Bento into a ready-to-use Docker
                      image

  info                List APIs
  install-completion  Install shell command completion
  open-api-spec       Display OpenAPI/Swagger JSON specs
  run                 Run API function
  serve               Start local dev API server
  serve-gunicorn      Start production API server
In [13]:
!EchoServicer info
{
  "name": "EchoServicer",
  "version": "20200728160149_E7E0E9",
  "created_at": "2020-07-28T08:01:59.060883Z",
  "env": {
    "conda_env": "name: bentoml-EchoServicer\nchannels:\n- defaults\ndependencies:\n- python=3.6.10\n- pip\n",
    "pip_dependencies": "tensorflow\nbentoml==0.8.3\nnumpy\nscikit-learn",
    "python_version": "3.6.10",
    "docker_base_image": "bentoml/model-server:0.8.3"
  },
  "artifacts": [
    {
      "name": "model",
      "artifact_type": "TensorflowSavedModelArtifact"
    }
  ],
  "apis": [
    {
      "name": "predict",
      "input_type": "TfTensorInput",
      "docs": "BentoService inference API 'predict', input: 'TfTensorInput', output: 'DefaultOutput'",
      "input_config": {
        "method": "predict",
        "is_batch_input": true
      },
      "output_config": {
        "cors": "*"
      },
      "output_type": "DefaultOutput",
      "mb_max_latency": 10000,
      "mb_max_batch_size": 2000
    }
  ]
}

Run the 'predict' API with JSON data:

In [1]:
!EchoServicer run predict --input='{"instances": [[1, 2]]}'
2020-07-28 16:28:36.129383: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
... (TensorFlow device initialization output trimmed) ...
2020-07-28 16:28:36.282752: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5079 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
[
  [
    1.0,
    2.0
  ]
]
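The value passed to --input is ordinary JSON. When scripting calls to this CLI, building the payload with json.dumps and quoting it with shlex.quote avoids shell-escaping mistakes. A small sketch (the command string is only constructed and printed here, not executed):

```python
import json
import shlex

# Build the same payload used in the CLI example above
payload = json.dumps({"instances": [[1, 2]]})

# shlex.quote wraps the JSON in single quotes so the shell passes it through intact
command = "EchoServicer run predict --input=" + shlex.quote(payload)
print(command)
```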

Deploy the BentoService as a REST API server in the cloud

BentoML supports deployment to multiple cloud providers, such as AWS Lambda, AWS SageMaker, and Google Cloud Run. You can find the full list and deployment guides in the documentation at https://docs.bentoml.org/en/latest/deployment/index.html

For this demo, we are going to deploy to AWS SageMaker.

In [49]:
!bentoml sagemaker deploy tf2-echo -b EchoServicer:latest --api-name predict
Deploying Sagemaker deployment -[2020-02-24 14:16:43,609] INFO - Step 1/11 : FROM continuumio/miniconda3:4.7.12
[2020-02-24 14:16:43,610] INFO - 

[2020-02-24 14:16:43,611] INFO -  ---> 406f2b43ea59

[2020-02-24 14:16:43,611] INFO - Step 2/11 : EXPOSE 8080
[2020-02-24 14:16:43,611] INFO - 

[2020-02-24 14:16:43,611] INFO -  ---> Using cache

[2020-02-24 14:16:43,611] INFO -  ---> 58636f0540f4

[2020-02-24 14:16:43,612] INFO - Step 3/11 : RUN set -x      && apt-get update      && apt-get install --no-install-recommends --no-install-suggests -y libpq-dev build-essential     && apt-get install -y nginx      && rm -rf /var/lib/apt/lists/*
[2020-02-24 14:16:43,612] INFO - 

[2020-02-24 14:16:43,612] INFO -  ---> Using cache

[2020-02-24 14:16:43,612] INFO -  ---> 70d334258584

[2020-02-24 14:16:43,612] INFO - Step 4/11 : RUN conda install pip numpy scipy       && pip install gunicorn gevent
[2020-02-24 14:16:43,612] INFO - 

[2020-02-24 14:16:43,612] INFO -  ---> Using cache

[2020-02-24 14:16:43,612] INFO -  ---> 0ebedc0aed5a

[2020-02-24 14:16:43,613] INFO - Step 5/11 : COPY . /opt/program
[2020-02-24 14:16:43,613] INFO - 

/[2020-02-24 14:16:44,052] INFO -  ---> 6a4c7ac8bd07

[2020-02-24 14:16:44,052] INFO - Step 6/11 : WORKDIR /opt/program
[2020-02-24 14:16:44,052] INFO - 

|[2020-02-24 14:16:44,179] INFO -  ---> Running in 1e660d5fb48d

/[2020-02-24 14:16:44,465] INFO -  ---> bbb45d19b9c3

[2020-02-24 14:16:44,465] INFO - Step 7/11 : RUN conda env update -n base -f /opt/program/environment.yml
[2020-02-24 14:16:44,465] INFO - 

|[2020-02-24 14:16:44,622] INFO -  ---> Running in 558466f27860

/[2020-02-24 14:16:46,924] INFO - Collecting package metadata (repodata.json): 
[2020-02-24 14:16:46,924] INFO - ...working... 
/[2020-02-24 14:16:53,104] INFO - done
Solving environment: ...working... 
-[2020-02-24 14:16:59,583] INFO - done

/[2020-02-24 14:16:59,699] INFO - 
Downloading and Extracting Packages
python-3.7.3         | 32.1 MB   | ########## | 100% 
[2020-02-24 14:17:03,854] INFO - 
Preparing transaction: ...working... 
|[2020-02-24 14:17:03,970] INFO - done

[2020-02-24 14:17:03,970] INFO - Verifying transaction: ...working... 
\[2020-02-24 14:17:04,839] INFO - done

[2020-02-24 14:17:04,839] INFO - Executing transaction: ...working... 
\[2020-02-24 14:17:08,969] INFO - done

-[2020-02-24 14:17:09,459] INFO - #
# To activate this environment, use
#
#     $ conda activate base
#
# To deactivate an active environment, use
#
#     $ conda deactivate


\[2020-02-24 14:17:12,632] INFO -  ---> 39cfe1acea9f

[2020-02-24 14:17:12,632] INFO - Step 8/11 : RUN pip install -r /opt/program/requirements.txt
[2020-02-24 14:17:12,632] INFO - 

-[2020-02-24 14:17:12,789] INFO -  ---> Running in de4a91384130

-[2020-02-24 14:17:14,833] INFO - Collecting bentoml==0.6.2

/[2020-02-24 14:17:14,908] INFO -   Downloading BentoML-0.6.2-py3-none-any.whl (554 kB)

\[2020-02-24 14:17:15,530] INFO - Collecting tensorflow

[2020-02-24 14:17:15,543] INFO -   Downloading tensorflow-2.1.0-cp37-cp37m-manylinux2010_x86_64.whl (421.8 MB)

-[2020-02-24 14:18:47,097] INFO - Requirement already satisfied: numpy in /opt/conda/lib/python3.7/site-packages (from -r /opt/program/requirements.txt (line 3)) (1.18.1)

\[2020-02-24 14:18:47,462] INFO - Collecting scikit-learn

[2020-02-24 14:18:47,474] INFO -   Downloading scikit_learn-0.22.1-cp37-cp37m-manylinux1_x86_64.whl (7.0 MB)

-[2020-02-24 14:18:49,551] INFO - Collecting Pillow

[2020-02-24 14:18:49,566] INFO -   Downloading Pillow-7.0.0-cp37-cp37m-manylinux1_x86_64.whl (2.1 MB)

-[2020-02-24 14:18:49,999] INFO - Collecting cerberus

[2020-02-24 14:18:50,011] INFO -   Downloading Cerberus-1.3.2.tar.gz (52 kB)

|[2020-02-24 14:18:50,563] INFO - Collecting tabulate

[2020-02-24 14:18:50,573] INFO -   Downloading tabulate-0.8.6.tar.gz (45 kB)

/[2020-02-24 14:18:50,953] INFO - Collecting sqlalchemy>=1.3.0

[2020-02-24 14:18:50,964] INFO -   Downloading SQLAlchemy-1.3.13.tar.gz (6.0 MB)

\[2020-02-24 14:18:52,799] INFO - Requirement already satisfied: gunicorn in /opt/conda/lib/python3.7/site-packages (from bentoml==0.6.2->-r /opt/program/requirements.txt (line 1)) (20.0.4)

-[2020-02-24 14:18:53,306] INFO - Collecting protobuf>=3.6.0

[2020-02-24 14:18:53,319] INFO -   Downloading protobuf-3.11.3-cp37-cp37m-manylinux1_x86_64.whl (1.3 MB)

\[2020-02-24 14:18:53,597] INFO - Collecting click>=7.0

[2020-02-24 14:18:53,608] INFO -   Downloading Click-7.0-py2.py3-none-any.whl (81 kB)

-[2020-02-24 14:18:53,708] INFO - Collecting packaging

[2020-02-24 14:18:53,719] INFO -   Downloading packaging-20.1-py2.py3-none-any.whl (36 kB)

/[2020-02-24 14:18:54,608] INFO - Collecting grpcio

[2020-02-24 14:18:54,635] INFO -   Downloading grpcio-1.27.2-cp37-cp37m-manylinux2010_x86_64.whl (2.7 MB)

|[2020-02-24 14:18:55,160] INFO - Collecting prometheus-client

[2020-02-24 14:18:55,172] INFO -   Downloading prometheus_client-0.7.1.tar.gz (38 kB)

-[2020-02-24 14:18:55,737] INFO - Collecting pandas

[2020-02-24 14:18:55,749] INFO -   Downloading pandas-1.0.1-cp37-cp37m-manylinux1_x86_64.whl (10.1 MB)

\[2020-02-24 14:18:58,468] INFO - Collecting ruamel.yaml>=0.15.0

[2020-02-24 14:18:58,479] INFO -   Downloading ruamel.yaml-0.16.10-py2.py3-none-any.whl (111 kB)

-[2020-02-24 14:18:58,607] INFO - Collecting docker

[2020-02-24 14:18:58,618] INFO -   Downloading docker-4.2.0-py2.py3-none-any.whl (143 kB)

|[2020-02-24 14:18:58,790] INFO - Collecting python-dateutil<2.8.1,>=2.1

[2020-02-24 14:18:58,803] INFO -   Downloading python_dateutil-2.8.0-py2.py3-none-any.whl (226 kB)

-[2020-02-24 14:18:58,990] INFO - Collecting humanfriendly

[2020-02-24 14:18:59,003] INFO -   Downloading humanfriendly-7.1.1-py2.py3-none-any.whl (77 kB)

[2020-02-24 14:18:59,078] INFO - Collecting python-json-logger

/[2020-02-24 14:18:59,088] INFO -   Downloading python-json-logger-0.1.11.tar.gz (6.0 kB)

/[2020-02-24 14:18:59,958] INFO - Collecting boto3

[2020-02-24 14:18:59,970] INFO -   Downloading boto3-1.12.6-py2.py3-none-any.whl (128 kB)

|[2020-02-24 14:19:00,077] INFO - Collecting configparser

[2020-02-24 14:19:00,088] INFO -   Downloading configparser-4.0.2-py2.py3-none-any.whl (22 kB)

\[2020-02-24 14:19:00,162] INFO - Collecting flask

[2020-02-24 14:19:00,174] INFO -   Downloading Flask-1.1.1-py2.py3-none-any.whl (94 kB)

-[2020-02-24 14:19:00,291] INFO - Collecting alembic

[2020-02-24 14:19:00,303] INFO -   Downloading alembic-1.4.0.tar.gz (1.1 MB)

\[2020-02-24 14:19:00,940] INFO - Requirement already satisfied: requests in /opt/conda/lib/python3.7/site-packages (from bentoml==0.6.2->-r /opt/program/requirements.txt (line 1)) (2.22.0)

[2020-02-24 14:19:00,991] INFO - Collecting keras-preprocessing>=1.1.0

[2020-02-24 14:19:01,003] INFO -   Downloading Keras_Preprocessing-1.1.0-py2.py3-none-any.whl (41 kB)

-[2020-02-24 14:19:01,085] INFO - Collecting tensorflow-estimator<2.2.0,>=2.1.0rc0

[2020-02-24 14:19:01,098] INFO -   Downloading tensorflow_estimator-2.1.0-py2.py3-none-any.whl (448 kB)

|[2020-02-24 14:19:01,290] INFO - Collecting keras-applications>=1.0.8

[2020-02-24 14:19:01,305] INFO -   Downloading Keras_Applications-1.0.8-py3-none-any.whl (50 kB)

[2020-02-24 14:19:01,341] INFO - Requirement already satisfied: scipy==1.4.1; python_version >= "3" in /opt/conda/lib/python3.7/site-packages (from tensorflow->-r /opt/program/requirements.txt (line 2)) (1.4.1)

\[2020-02-24 14:19:01,407] INFO - Collecting tensorboard<2.2.0,>=2.1.0

[2020-02-24 14:19:01,427] INFO -   Downloading tensorboard-2.1.0-py3-none-any.whl (3.8 MB)

\[2020-02-24 14:19:02,243] INFO - Collecting wrapt>=1.11.1

[2020-02-24 14:19:02,254] INFO -   Downloading wrapt-1.12.0.tar.gz (27 kB)

\[2020-02-24 14:19:02,578] INFO - Requirement already satisfied: six>=1.12.0 in /opt/conda/lib/python3.7/site-packages (from tensorflow->-r /opt/program/requirements.txt (line 2)) (1.12.0)

[2020-02-24 14:19:02,582] INFO - Requirement already satisfied: wheel>=0.26; python_version >= "3" in /opt/conda/lib/python3.7/site-packages (from tensorflow->-r /opt/program/requirements.txt (line 2)) (0.33.6)

-[2020-02-24 14:19:02,703] INFO - Collecting termcolor>=1.1.0

[2020-02-24 14:19:02,715] INFO -   Downloading termcolor-1.1.0.tar.gz (3.9 kB)

/[2020-02-24 14:19:03,251] INFO - Collecting google-pasta>=0.1.6

[2020-02-24 14:19:03,264] INFO -   Downloading google_pasta-0.1.8-py3-none-any.whl (57 kB)

|[2020-02-24 14:19:03,326] INFO - Collecting absl-py>=0.7.0

[2020-02-24 14:19:03,337] INFO -   Downloading absl-py-0.9.0.tar.gz (104 kB)

/[2020-02-24 14:19:03,672] INFO - Collecting opt-einsum>=2.3.2

[2020-02-24 14:19:03,687] INFO -   Downloading opt_einsum-3.1.0.tar.gz (69 kB)

/[2020-02-24 14:19:04,019] INFO - Collecting gast==0.2.2

[2020-02-24 14:19:04,030] INFO -   Downloading gast-0.2.2.tar.gz (10 kB)

-[2020-02-24 14:19:04,318] INFO - Collecting astor>=0.6.0

[2020-02-24 14:19:04,335] INFO -   Downloading astor-0.8.1-py2.py3-none-any.whl (27 kB)

/[2020-02-24 14:19:04,448] INFO - Collecting joblib>=0.11

[2020-02-24 14:19:04,461] INFO -   Downloading joblib-0.14.1-py2.py3-none-any.whl (294 kB)

|[2020-02-24 14:19:04,570] INFO - Requirement already satisfied: setuptools in /opt/conda/lib/python3.7/site-packages (from cerberus->bentoml==0.6.2->-r /opt/program/requirements.txt (line 1)) (41.4.0)

\[2020-02-24 14:19:04,714] INFO - Collecting pyparsing>=2.0.2

[2020-02-24 14:19:04,726] INFO -   Downloading pyparsing-2.4.6-py2.py3-none-any.whl (67 kB)

|[2020-02-24 14:19:04,971] INFO - Collecting pytz>=2017.2

[2020-02-24 14:19:04,981] INFO -   Downloading pytz-2019.3-py2.py3-none-any.whl (509 kB)

/[2020-02-24 14:19:05,256] INFO - Collecting ruamel.yaml.clib>=0.1.2; platform_python_implementation == "CPython" and python_version < "3.9"

[2020-02-24 14:19:05,269] INFO -   Downloading ruamel.yaml.clib-0.2.0-cp37-cp37m-manylinux1_x86_64.whl (547 kB)

|[2020-02-24 14:19:05,439] INFO - Collecting websocket-client>=0.32.0

[2020-02-24 14:19:05,450] INFO -   Downloading websocket_client-0.57.0-py2.py3-none-any.whl (200 kB)

\[2020-02-24 14:19:05,539] INFO - Collecting s3transfer<0.4.0,>=0.3.0

[2020-02-24 14:19:05,552] INFO -   Downloading s3transfer-0.3.3-py2.py3-none-any.whl (69 kB)

-[2020-02-24 14:19:05,640] INFO - Collecting jmespath<1.0.0,>=0.7.1

[2020-02-24 14:19:05,652] INFO -   Downloading jmespath-0.9.5-py2.py3-none-any.whl (24 kB)

|[2020-02-24 14:19:06,259] INFO - Collecting botocore<1.16.0,>=1.15.6

[2020-02-24 14:19:06,272] INFO -   Downloading botocore-1.15.6-py2.py3-none-any.whl (5.9 MB)

|[2020-02-24 14:19:07,870] INFO - Collecting Werkzeug>=0.15

[2020-02-24 14:19:07,882] INFO -   Downloading Werkzeug-1.0.0-py2.py3-none-any.whl (298 kB)

\[2020-02-24 14:19:08,007] INFO - Collecting itsdangerous>=0.24

-[2020-02-24 14:19:08,018] INFO -   Downloading itsdangerous-1.1.0-py2.py3-none-any.whl (16 kB)

[2020-02-24 14:19:08,081] INFO - Collecting Jinja2>=2.10.1

[2020-02-24 14:19:08,093] INFO -   Downloading Jinja2-2.11.1-py2.py3-none-any.whl (126 kB)

|[2020-02-24 14:19:08,223] INFO - Collecting Mako

[2020-02-24 14:19:08,236] INFO -   Downloading Mako-1.1.1.tar.gz (468 kB)

\[2020-02-24 14:19:08,734] INFO - Collecting python-editor>=0.3

[2020-02-24 14:19:08,744] INFO -   Downloading python_editor-1.0.4-py3-none-any.whl (4.9 kB)

[2020-02-24 14:19:08,766] INFO - Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.7/site-packages (from requests->bentoml==0.6.2->-r /opt/program/requirements.txt (line 1)) (1.24.2)

[2020-02-24 14:19:08,775] INFO - Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/lib/python3.7/site-packages (from requests->bentoml==0.6.2->-r /opt/program/requirements.txt (line 1)) (2.8)

[2020-02-24 14:19:08,778] INFO - Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/lib/python3.7/site-packages (from requests->bentoml==0.6.2->-r /opt/program/requirements.txt (line 1)) (3.0.4)

[2020-02-24 14:19:08,782] INFO - Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.7/site-packages (from requests->bentoml==0.6.2->-r /opt/program/requirements.txt (line 1)) (2019.11.28)

-[2020-02-24 14:19:08,890] INFO - Collecting h5py

[2020-02-24 14:19:08,909] INFO -   Downloading h5py-2.10.0-cp37-cp37m-manylinux1_x86_64.whl (2.9 MB)

/[2020-02-24 14:19:09,777] INFO - Collecting markdown>=2.6.8

[2020-02-24 14:19:09,790] INFO -   Downloading Markdown-3.2.1-py2.py3-none-any.whl (88 kB)

|[2020-02-24 14:19:09,925] INFO - Collecting google-auth<2,>=1.6.3

[2020-02-24 14:19:09,941] INFO -   Downloading google_auth-1.11.2-py2.py3-none-any.whl (76 kB)

\[2020-02-24 14:19:10,011] INFO - Collecting google-auth-oauthlib<0.5,>=0.4.1

[2020-02-24 14:19:10,024] INFO -   Downloading google_auth_oauthlib-0.4.1-py2.py3-none-any.whl (18 kB)

-[2020-02-24 14:19:10,104] INFO - Collecting docutils<0.16,>=0.10

[2020-02-24 14:19:10,124] INFO -   Downloading docutils-0.15.2-py3-none-any.whl (547 kB)

|[2020-02-24 14:19:10,344] INFO - Collecting MarkupSafe>=0.23

[2020-02-24 14:19:10,366] INFO -   Downloading MarkupSafe-1.1.1-cp37-cp37m-manylinux1_x86_64.whl (27 kB)

\[2020-02-24 14:19:10,426] INFO - Collecting cachetools<5.0,>=2.0.0

[2020-02-24 14:19:10,440] INFO -   Downloading cachetools-4.0.0-py3-none-any.whl (10 kB)

[2020-02-24 14:19:10,516] INFO - Collecting rsa<4.1,>=3.1.4

[2020-02-24 14:19:10,559] INFO -   Downloading rsa-4.0-py2.py3-none-any.whl (38 kB)

[2020-02-24 14:19:10,680] INFO - Collecting pyasn1-modules>=0.2.1

[2020-02-24 14:19:10,692] INFO -   Downloading pyasn1_modules-0.2.8-py2.py3-none-any.whl (155 kB)

[2020-02-24 14:19:10,797] INFO - Collecting requests-oauthlib>=0.7.0

[2020-02-24 14:19:10,809] INFO -   Downloading requests_oauthlib-1.3.0-py2.py3-none-any.whl (23 kB)

[2020-02-24 14:19:11,043] INFO - Collecting pyasn1>=0.1.3

[2020-02-24 14:19:11,059] INFO -   Downloading pyasn1-0.4.8-py2.py3-none-any.whl (77 kB)

[2020-02-24 14:19:11,136] INFO - Collecting oauthlib>=3.0.0

[2020-02-24 14:19:11,150] INFO -   Downloading oauthlib-3.1.0-py2.py3-none-any.whl (147 kB)

[2020-02-24 14:19:11,240] INFO - Building wheels for collected packages: cerberus, tabulate, sqlalchemy, prometheus-client, python-json-logger, alembic, wrapt, termcolor, absl-py, opt-einsum, gast, Mako

[2020-02-24 14:19:11,242] INFO -   Building wheel for cerberus (setup.py): started

[2020-02-24 14:19:11,646] INFO -   Building wheel for cerberus (setup.py): finished with status 'done'

[2020-02-24 14:19:11,647] INFO -   Created wheel for cerberus: filename=Cerberus-1.3.2-py3-none-any.whl size=54335 sha256=69082db526b62f5f2d899cb19224c220d2d92de7c93df152219a577c0ada6c78

[2020-02-24 14:19:11,647] INFO -   Stored in directory: /root/.cache/pip/wheels/17/3a/0d/e2fc48cf85cb858f5e65f1baa36180ebb5dce6397c35c4cfcb

[2020-02-24 14:19:11,650] INFO -   Building wheel for tabulate (setup.py): started

[2020-02-24 14:19:11,924] INFO -   Building wheel for tabulate (setup.py): finished with status 'done'

[2020-02-24 14:19:11,924] INFO -   Created wheel for tabulate: filename=tabulate-0.8.6-py3-none-any.whl size=23273 sha256=8e0cf6353fa4364eadb738ec3970154e906822898926396375f3c73733277ea9

[2020-02-24 14:19:11,925] INFO -   Stored in directory: /root/.cache/pip/wheels/09/b6/7e/08b4ee715a1239453e89a59081f0ac369a9036f232e013ecd8

[2020-02-24 14:19:11,927] INFO -   Building wheel for sqlalchemy (setup.py): started

[2020-02-24 14:19:13,569] INFO -   Building wheel for sqlalchemy (setup.py): finished with status 'done'

[2020-02-24 14:19:13,575] INFO -   Created wheel for sqlalchemy: filename=SQLAlchemy-1.3.13-cp37-cp37m-linux_x86_64.whl size=1223713 sha256=4683e575af82487b2bc25ad4d6c4e115e0c416c8ed7a410f25f4487c8584db3c
  Stored in directory: /root/.cache/pip/wheels/b9/ba/77/163f10f14bd489351530603e750c195b0ceceed2f3be2b32f1

[2020-02-24 14:19:13,577] INFO -   Building wheel for prometheus-client (setup.py): started

[2020-02-24 14:19:13,863] INFO -   Building wheel for prometheus-client (setup.py): finished with status 'done'

[2020-02-24 14:19:13,863] INFO -   Created wheel for prometheus-client: filename=prometheus_client-0.7.1-py3-none-any.whl size=41402 sha256=84326758be7a3885594d03780341bac8a07951f2c381c813643ffb45ab6750b8

[2020-02-24 14:19:13,863] INFO -   Stored in directory: /root/.cache/pip/wheels/30/0c/26/59ba285bf65dc79d195e9b25e2ddde4c61070422729b0cd914

[2020-02-24 14:19:13,865] INFO -   Building wheel for python-json-logger (setup.py): started

[2020-02-24 14:19:14,131] INFO -   Building wheel for python-json-logger (setup.py): finished with status 'done'
  Created wheel for python-json-logger: filename=python_json_logger-0.1.11-py2.py3-none-any.whl size=5076 sha256=9d4765cbf094e35ec5c3f556d9d9adfc6b6e4b3397e9c7e5c3e60dfac9f37229
  Stored in directory: /root/.cache/pip/wheels/fa/7f/fd/92ccdbb9d1a65486406e0363d2ba5b4ce52f400a915f602ecb

[2020-02-24 14:19:14,131] INFO -   Building wheel for alembic (setup.py): started

[2020-02-24 14:19:14,513] INFO -   Building wheel for alembic (setup.py): finished with status 'done'

[2020-02-24 14:19:14,513] INFO -   Created wheel for alembic: filename=alembic-1.4.0-py2.py3-none-any.whl size=157563 sha256=e54d11246d0705c4cb4e4e44f07155b14aa6f3726e680cb02f3f15e58f4af3bf

[2020-02-24 14:19:14,514] INFO -   Stored in directory: /root/.cache/pip/wheels/33/a9/f9/a53f885636269db5b76cf7afa3a1ab86d9d2fe96610d09274e

[2020-02-24 14:19:14,516] INFO -   Building wheel for wrapt (setup.py): started

[2020-02-24 14:19:15,512] INFO -   Building wheel for wrapt (setup.py): finished with status 'done'

[2020-02-24 14:19:15,514] INFO -   Created wheel for wrapt: filename=wrapt-1.12.0-cp37-cp37m-linux_x86_64.whl size=76405 sha256=f6e20708af6848bc1e885f0f4c63965a742327c10123d0865967c9c7f9ea3713
  Stored in directory: /root/.cache/pip/wheels/e5/78/69/f40ab7cae531c8f07003a9d1b4b81ebec14cda95519c57e7dd

[2020-02-24 14:19:15,516] INFO -   Building wheel for termcolor (setup.py): started

[2020-02-24 14:19:15,800] INFO -   Building wheel for termcolor (setup.py): finished with status 'done'

[2020-02-24 14:19:15,801] INFO -   Created wheel for termcolor: filename=termcolor-1.1.0-py3-none-any.whl size=4830 sha256=d5c64a354cfd7f0d880a9f820f51106ac9dc75e1dfa34740cad13063a28371a6
  Stored in directory: /root/.cache/pip/wheels/3f/e3/ec/8a8336ff196023622fbcb36de0c5a5c218cbb24111d1d4c7f2

[2020-02-24 14:19:15,804] INFO -   Building wheel for absl-py (setup.py): started

[2020-02-24 14:19:16,099] INFO -   Building wheel for absl-py (setup.py): finished with status 'done'

[2020-02-24 14:19:16,100] INFO -   Created wheel for absl-py: filename=absl_py-0.9.0-py3-none-any.whl size=121931 sha256=ec9579dddb9f23581938db1c92b20791018b90d28c13319b8273ebafb32ba47e
  Stored in directory: /root/.cache/pip/wheels/cc/af/1a/498a24d0730ef484019e007bb9e8cef3ac00311a672c049a3e

[2020-02-24 14:19:16,102] INFO -   Building wheel for opt-einsum (setup.py): started

[2020-02-24 14:19:16,414] INFO -   Building wheel for opt-einsum (setup.py): finished with status 'done'

[2020-02-24 14:19:16,415] INFO -   Created wheel for opt-einsum: filename=opt_einsum-3.1.0-py3-none-any.whl size=61681 sha256=3d87fb11189608acc9aa3e2908a18afccdab9c55bfe3147e895f578bd765ccf3

[2020-02-24 14:19:16,415] INFO -   Stored in directory: /root/.cache/pip/wheels/21/e3/31/0d3919995e859eff01713d381aac3b6b43c69915a2942e5c65

[2020-02-24 14:19:16,418] INFO -   Building wheel for gast (setup.py): started

[2020-02-24 14:19:16,708] INFO -   Building wheel for gast (setup.py): finished with status 'done'

[2020-02-24 14:19:16,709] INFO -   Created wheel for gast: filename=gast-0.2.2-py3-none-any.whl size=7539 sha256=f33f5a862e18d58e2545aeae152af84c2e2a5db3cf494d6b2af0a7f46ba8f4b3

[2020-02-24 14:19:16,709] INFO -   Stored in directory: /root/.cache/pip/wheels/21/7f/02/420f32a803f7d0967b48dd823da3f558c5166991bfd204eef3

[2020-02-24 14:19:16,715] INFO -   Building wheel for Mako (setup.py): started

[2020-02-24 14:19:17,040] INFO -   Building wheel for Mako (setup.py): finished with status 'done'

[2020-02-24 14:19:17,042] INFO -   Created wheel for Mako: filename=Mako-1.1.1-py3-none-any.whl size=75409 sha256=206afd9d16a54c9c57a34b5a945b750cf0813dd7bd7c4bb7db16aa86012aff3f

[2020-02-24 14:19:17,042] INFO -   Stored in directory: /root/.cache/pip/wheels/11/fe/fa/3693b62cf5ec2b2784b6496734f0ee3e2321eb66d66607e5f9

[2020-02-24 14:19:17,043] INFO - Successfully built cerberus tabulate sqlalchemy prometheus-client python-json-logger alembic wrapt termcolor absl-py opt-einsum gast Mako

[2020-02-24 14:19:17,594] INFO - Installing collected packages: cerberus, tabulate, sqlalchemy, protobuf, click, pyparsing, packaging, grpcio, prometheus-client, python-dateutil, pytz, pandas, ruamel.yaml.clib, ruamel.yaml, websocket-client, docker, humanfriendly, python-json-logger, docutils, jmespath, botocore, s3transfer, boto3, configparser, Werkzeug, itsdangerous, MarkupSafe, Jinja2, flask, Mako, python-editor, alembic, bentoml, keras-preprocessing, tensorflow-estimator, h5py, keras-applications, absl-py, markdown, cachetools, pyasn1, rsa, pyasn1-modules, google-auth, oauthlib, requests-oauthlib, google-auth-oauthlib, tensorboard, wrapt, termcolor, google-pasta, opt-einsum, gast, astor, tensorflow, joblib, scikit-learn, Pillow

[2020-02-24 14:20:04,607] INFO - Successfully installed Jinja2-2.11.1 Mako-1.1.1 MarkupSafe-1.1.1 Pillow-7.0.0 Werkzeug-1.0.0 absl-py-0.9.0 alembic-1.4.0 astor-0.8.1 bentoml-0.6.2 boto3-1.12.6 botocore-1.15.6 cachetools-4.0.0 cerberus-1.3.2 click-7.0 configparser-4.0.2 docker-4.2.0 docutils-0.15.2 flask-1.1.1 gast-0.2.2 google-auth-1.11.2 google-auth-oauthlib-0.4.1 google-pasta-0.1.8 grpcio-1.27.2 h5py-2.10.0 humanfriendly-7.1.1 itsdangerous-1.1.0 jmespath-0.9.5 joblib-0.14.1 keras-applications-1.0.8 keras-preprocessing-1.1.0 markdown-3.2.1 oauthlib-3.1.0 opt-einsum-3.1.0 packaging-20.1 pandas-1.0.1 prometheus-client-0.7.1 protobuf-3.11.3 pyasn1-0.4.8 pyasn1-modules-0.2.8 pyparsing-2.4.6 python-dateutil-2.8.0 python-editor-1.0.4 python-json-logger-0.1.11 pytz-2019.3 requests-oauthlib-1.3.0 rsa-4.0 ruamel.yaml-0.16.10 ruamel.yaml.clib-0.2.0 s3transfer-0.3.3 scikit-learn-0.22.1 sqlalchemy-1.3.13 tabulate-0.8.6 tensorboard-2.1.0 tensorflow-2.1.0 tensorflow-estimator-2.1.0 termcolor-1.1.0 websocket-client-0.57.0 wrapt-1.12.0

[2020-02-24 14:20:38,046] INFO -  ---> 144fa1bdca72

[2020-02-24 14:20:38,046] INFO - Step 9/11 : RUN if [ -f /bento/bentoml_init.sh ]; then /bin/bash -c /bento/bentoml_init.sh; fi
[2020-02-24 14:20:38,047] INFO - 

[2020-02-24 14:20:38,201] INFO -  ---> Running in 9b10cbac6693

[2020-02-24 14:20:39,603] INFO -  ---> efa7604cda10

[2020-02-24 14:20:39,604] INFO - Step 10/11 : RUN if [ -f /opt/program/setup.sh ]; then /bin/bash -c /opt/program/setup.sh; fi
[2020-02-24 14:20:39,604] INFO - 

[2020-02-24 14:20:39,687] INFO -  ---> Running in 7a17fd648d45

[2020-02-24 14:20:41,142] INFO -  ---> 5f4c15679f2c

[2020-02-24 14:20:41,143] INFO - Step 11/11 : ENV PATH="/opt/program:${PATH}"
[2020-02-24 14:20:41,143] INFO - 

[2020-02-24 14:20:41,245] INFO -  ---> Running in b2165707cc46

[2020-02-24 14:20:41,461] INFO -  ---> 2c7e9c0a53bf

[2020-02-24 14:20:41,478] INFO - Successfully built 2c7e9c0a53bf

[2020-02-24 14:20:41,482] INFO - Successfully tagged 192023623294.dkr.ecr.us-west-2.amazonaws.com/echoservicer-sagemaker:20200224141541_D891E3

[2020-02-24 14:23:27,955] INFO - ApplyDeployment (tf2-echo, namespace bobo) succeeded
Successfully created AWS Sagemaker deployment tf2-echo
{
  "namespace": "bobo",
  "name": "tf2-echo",
  "spec": {
    "bentoName": "EchoServicer",
    "bentoVersion": "20200224141541_D891E3",
    "operator": "AWS_SAGEMAKER",
    "sagemakerOperatorConfig": {
      "region": "us-west-2",
      "instanceType": "ml.m4.xlarge",
      "instanceCount": 1,
      "apiName": "predict"
    }
  },
  "state": {
    "state": "RUNNING",
    "infoJson": {
      "EndpointName": "bobo-tf2-echo",
      "EndpointArn": "arn:aws:sagemaker:us-west-2:192023623294:endpoint/bobo-tf2-echo",
      "EndpointConfigName": "bobo-tf2-echo-EchoServicer-20200224141541-D891E3",
      "ProductionVariants": [
        {
          "VariantName": "bobo-tf2-echo-EchoServicer-20200224141541-D891E3",
          "DeployedImages": [
            {
              "SpecifiedImage": "192023623294.dkr.ecr.us-west-2.amazonaws.com/echoservicer-sagemaker:20200224141541_D891E3",
              "ResolvedImage": "192023623294.dkr.ecr.us-west-2.amazonaws.com/[email protected]:5bb688c36f3bd5852d5fb6c3281fb8ccffdbd56bc14b1c867fd0ba6e81658791",
              "ResolutionTime": "2020-02-24 14:23:30.748000-08:00"
            }
          ],
          "CurrentWeight": 1.0,
          "DesiredWeight": 1.0,
          "CurrentInstanceCount": 1,
          "DesiredInstanceCount": 1
        }
      ],
      "EndpointStatus": "InService",
      "CreationTime": "2020-02-24 14:23:27.907000-08:00",
      "LastModifiedTime": "2020-02-24 14:32:52.807000-08:00",
      "ResponseMetadata": {
        "RequestId": "3b9e3fb3-2393-4ec8-ae86-12937997b6d9",
        "HTTPStatusCode": 200,
        "HTTPHeaders": {
          "x-amzn-requestid": "3b9e3fb3-2393-4ec8-ae86-12937997b6d9",
          "content-type": "application/x-amz-json-1.1",
          "content-length": "783",
          "date": "Mon, 24 Feb 2020 22:32:56 GMT"
        },
        "RetryAttempts": 0
      }
    },
    "timestamp": "2020-02-24T22:32:56.562589Z"
  },
  "createdAt": "2020-02-24T22:16:41.516237Z",
  "lastUpdatedAt": "2020-02-24T22:16:41.516285Z"
}
In [50]:
!bentoml sagemaker get tf2-echo
{
  "namespace": "bobo",
  "name": "tf2-echo",
  "spec": {
    "bentoName": "EchoServicer",
    "bentoVersion": "20200224141541_D891E3",
    "operator": "AWS_SAGEMAKER",
    "sagemakerOperatorConfig": {
      "region": "us-west-2",
      "instanceType": "ml.m4.xlarge",
      "instanceCount": 1,
      "apiName": "predict"
    }
  },
  "state": {
    "state": "RUNNING",
    "infoJson": {
      "EndpointName": "bobo-tf2-echo",
      "EndpointArn": "arn:aws:sagemaker:us-west-2:192023623294:endpoint/bobo-tf2-echo",
      "EndpointConfigName": "bobo-tf2-echo-EchoServicer-20200224141541-D891E3",
      "ProductionVariants": [
        {
          "VariantName": "bobo-tf2-echo-EchoServicer-20200224141541-D891E3",
          "DeployedImages": [
            {
              "SpecifiedImage": "192023623294.dkr.ecr.us-west-2.amazonaws.com/echoservicer-sagemaker:20200224141541_D891E3",
              "ResolvedImage": "192023623294.dkr.ecr.us-west-2.amazonaws.com/[email protected]:5bb688c36f3bd5852d5fb6c3281fb8ccffdbd56bc14b1c867fd0ba6e81658791",
              "ResolutionTime": "2020-02-24 14:23:30.748000-08:00"
            }
          ],
          "CurrentWeight": 1.0,
          "DesiredWeight": 1.0,
          "CurrentInstanceCount": 1,
          "DesiredInstanceCount": 1
        }
      ],
      "EndpointStatus": "InService",
      "CreationTime": "2020-02-24 14:23:27.907000-08:00",
      "LastModifiedTime": "2020-02-24 14:32:52.807000-08:00",
      "ResponseMetadata": {
        "RequestId": "9da57574-bf36-471a-ac2f-a0948cca7930",
        "HTTPStatusCode": 200,
        "HTTPHeaders": {
          "x-amzn-requestid": "9da57574-bf36-471a-ac2f-a0948cca7930",
          "content-type": "application/x-amz-json-1.1",
          "content-length": "783",
          "date": "Mon, 24 Feb 2020 22:33:15 GMT"
        },
        "RetryAttempts": 0
      }
    },
    "timestamp": "2020-02-24T22:33:15.706807Z"
  },
  "createdAt": "2020-02-24T22:16:41.516237Z",
  "lastUpdatedAt": "2020-02-24T22:16:41.516285Z"
}
In [51]:
!aws sagemaker-runtime invoke-endpoint --endpoint-name bobo-tf2-echo --content-type 'application/json' \
--body '{"instances": [[1, 2]]}' \
output.json && cat output.json
{
    "ContentType": "application/json",
    "InvokedProductionVariant": "bobo-tf2-echo-EchoServicer-20200224141541-D891E3"
}
[[1, 2]]
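The same invocation can also be made from Python via boto3's `sagemaker-runtime` client instead of the AWS CLI. A minimal sketch, assuming the endpoint name and payload from the CLI call above; the `build_payload` helper is hypothetical, introduced here only to keep the serialization testable without AWS credentials:

```python
import json


def build_payload(instances):
    """Serialize instances into the JSON body expected by the predict API."""
    return json.dumps({"instances": instances})


def invoke(endpoint_name, instances, region="us-west-2"):
    """Call the deployed SageMaker endpoint (requires AWS credentials)."""
    import boto3

    client = boto3.client("sagemaker-runtime", region_name=region)
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(instances),
    )
    return json.loads(response["Body"].read())


# invoke("bobo-tf2-echo", [[1, 2]])  # echo model returns the input back
```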
In [52]:
!bentoml sagemaker delete tf2-echo
Successfully deleted AWS Sagemaker deployment "tf2-echo"

Additional: Serve with tf-serving

The BentoML TensorFlow handler and artifact follow the TensorFlow Serving REST API, so the same model can also be served with tensorflow-serving directly.
To install tensorflow-serving, see: https://www.tensorflow.org/tfx/serving/setup

In [28]:
TMP_MODEL_DIR = "/tmp/test-echo-model"
TMP_MODEL_VERSION = "1"
TMP_MODEL_DIR_V = f"{TMP_MODEL_DIR}/{TMP_MODEL_VERSION}"
MODEL_NAME = "echo_model"

tf.saved_model.save(custom_model, TMP_MODEL_DIR_V)
!tensorflow_model_server --rest_api_port=5001 --model_name={MODEL_NAME} --model_base_path={TMP_MODEL_DIR}
INFO:tensorflow:Assets written to: /tmp/test-echo-model/2/assets
2019-12-20 12:03:01.458521: I tensorflow_serving/model_servers/server.cc:85] Building single TensorFlow model file config:  model_name: echo_model model_base_path: /tmp/test-echo-model
2019-12-20 12:03:01.458658: I tensorflow_serving/model_servers/server_core.cc:462] Adding/updating models.
2019-12-20 12:03:01.458673: I tensorflow_serving/model_servers/server_core.cc:573]  (Re-)adding model: echo_model
2019-12-20 12:03:01.559267: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: echo_model version: 2}
2019-12-20 12:03:01.559323: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: echo_model version: 2}
2019-12-20 12:03:01.559349: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: echo_model version: 2}
2019-12-20 12:03:01.559384: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /tmp/test-echo-model/2
2019-12-20 12:03:01.560368: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-12-20 12:03:01.561479: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-20 12:03:01.597759: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:202] Restoring SavedModel bundle.
2019-12-20 12:03:01.607093: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:151] Running initialization op on SavedModel bundle at path: /tmp/test-echo-model/2
2019-12-20 12:03:01.609348: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:311] SavedModel load for tags { serve }; Status: success. Took 49959 microseconds.
2019-12-20 12:03:01.609474: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:105] No warmup data file found at /tmp/test-echo-model/2/assets.extra/tf_serving_warmup_requests
2019-12-20 12:03:01.609596: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: echo_model version: 2}
2019-12-20 12:03:01.610932: I tensorflow_serving/model_servers/server.cc:353] Running gRPC ModelServer at 0.0.0.0:8500 ...
[evhttp_server.cc : 223] NET_LOG: Couldn't bind to port 5001
[evhttp_server.cc : 63] NET_LOG: Server has not been terminated. Force termination now.
[evhttp_server.cc : 258] NET_LOG: Server is not running ...
2019-12-20 12:03:01.612138: E tensorflow_serving/model_servers/server.cc:375] Failed to start HTTP Server at localhost:5001
^C
In [32]:
import requests
import json

TMP_MODEL_DIR = "/tmp/test-echo-model"
TMP_MODEL_VERSION = "1"
TMP_MODEL_DIR_V = f"{TMP_MODEL_DIR}/{TMP_MODEL_VERSION}"
MODEL_NAME = "echo_model"
headers = {"content-type": "application/json"}
data = json.dumps(
    {"instances": [[1, 2, 2, 3], [2, 3, 3, 4]]}
)
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
# note: "v1" is the fixed TF Serving REST API version, not the model version
json_response = requests.post(f'http://127.0.0.1:5001/v1/models/{MODEL_NAME}:predict',
                              data=data, headers=headers)
print(json_response)
print(json_response.text)
Data: {"instances": [[1, 2, 2, 3], [2, 3, 3, 4]]} ... , 3, 4]]}
<Response [200]>
{
    "predictions": [[1.0, 2.0, 2.0, 3.0], [2.0, 3.0, 3.0, 4.0]
    ]
}
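The `predictions` field of the TensorFlow Serving response can be converted straight back into an array. A small sketch parsing the response text shown above (the `response_text` variable is illustrative, standing in for `json_response.text`):

```python
import json

import numpy as np

# The body returned by tensorflow_model_server for the request above
response_text = '{"predictions": [[1.0, 2.0, 2.0, 3.0], [2.0, 3.0, 3.0, 4.0]]}'

# One row per input instance; the echo model returns its input unchanged
predictions = np.array(json.loads(response_text)["predictions"])
print(predictions.shape)  # (2, 4)
```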