Getting Started with BentoML

BentoML is an open source framework for serving and deploying machine learning models. It provides high-level APIs for defining a prediction service and for packaging trained models, source code, dependencies, and configurations into a format that is ready for production deployment.

This is a quick tutorial on how to use BentoML to create a prediction service with a trained scikit-learn model, serve the model via a REST API server, and deploy it to AWS Lambda as a serverless endpoint.

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

BentoML requires Python 3.6 or above. Install it via pip:

In [ ]:
# Install BentoML
!pip install bentoml

# Also install pandas and scikit-learn; we will use a scikit-learn model as an example
!pip install pandas scikit-learn
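
Before installing, it can help to confirm that the running interpreter meets the version requirement stated above. The following is a minimal sketch using only the standard library; the helper name `python_is_supported` is made up for this illustration and is not part of BentoML.

```python
import sys

# Minimum interpreter version stated in this tutorial.
MIN_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when (major, minor) of `version_info` is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if python_is_supported():
    print("Python %d.%d is new enough for BentoML" % tuple(sys.version_info[:2]))
else:
    print("Python %d.%d is too old; 3.6 or above is required" % tuple(sys.version_info[:2]))
```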

Let's get started with a simple scikit-learn model as an example:

In [3]:
from sklearn import svm
from sklearn import datasets

clf = svm.SVC(gamma='scale')
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)
Out[3]:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)

Creating a Prediction Service with BentoML

The first step in creating a prediction service with BentoML is to write a prediction service class that inherits from bentoml.BentoService, declaratively lists its dependencies and model artifacts, and defines the service API callback function. Here is what a simple prediction service looks like:

In [4]:
%%writefile iris_classifier.py
from bentoml import BentoService, api, env, artifacts
from bentoml.artifact import SklearnModelArtifact
from bentoml.handlers import DataframeHandler

@artifacts([SklearnModelArtifact('model')])
@env(pip_dependencies=["scikit-learn"])
class IrisClassifier(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        return self.artifacts.model.predict(df)
Overwriting iris_classifier.py

The bentoml.api decorator and DataframeHandler tell BentoML that the function that follows is the service API callback function and that pandas.DataFrame is its expected input format.

The bentoml.env decorator allows users to specify the dependencies and environment settings for this prediction service. Here we are creating the prediction service based on a scikit-learn model, so we add it to the list of pip dependencies.

Last but not least, the bentoml.artifacts decorator declares the trained model that must be bundled with this prediction service. Here it uses the built-in SklearnModelArtifact and simply names it 'model'. BentoML also provides model artifacts for other frameworks, such as PytorchModelArtifact, KerasModelArtifact, FastaiModelArtifact, and XgboostModelArtifact.
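
To make the declarative style more concrete, here is a toy sketch of how decorators can attach metadata to a class and its methods. It is not BentoML's implementation: the functions below merely reuse the names `artifacts`, `env`, and `api` for readability, and `ToyService`, `ToyIrisClassifier`, and the underscore-prefixed attributes are invented for this illustration.

```python
def artifacts(names):
    """Toy class decorator: record the declared artifact names on the class."""
    def decorate(cls):
        cls._declared_artifacts = list(names)
        return cls
    return decorate


def env(pip_dependencies=None):
    """Toy class decorator: record the declared pip dependencies on the class."""
    def decorate(cls):
        cls._pip_dependencies = list(pip_dependencies or [])
        return cls
    return decorate


def api(handler_name):
    """Toy method decorator: mark a function as a service API with an input handler."""
    def decorate(func):
        func._api_handler = handler_name
        return func
    return decorate


class ToyService:
    @classmethod
    def list_apis(cls):
        """Return the names of methods on this class that were marked with @api."""
        return sorted(
            name for name, attr in vars(cls).items()
            if callable(attr) and hasattr(attr, "_api_handler")
        )


@artifacts(["model"])
@env(pip_dependencies=["scikit-learn"])
class ToyIrisClassifier(ToyService):

    @api("dataframe")
    def predict(self, rows):
        # Placeholder logic: a real service would call the packed model here.
        return [0 for _ in rows]


print(ToyIrisClassifier._declared_artifacts)
print(ToyIrisClassifier._pip_dependencies)
print(ToyIrisClassifier.list_apis())
```

The point of the pattern is that everything a packaging tool needs to know (which artifacts, which dependencies, which API functions) can be read off the class without running any prediction code.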

Saving a versioned BentoService bundle

In [5]:
# 1) import the custom BentoService defined above
from iris_classifier import IrisClassifier

# 2) `pack` it with required artifacts
svc = IrisClassifier.pack(model=clf)

# 3) save BentoService to a BentoML bundle
saved_path = svc.save()
[2019-11-26 12:53:16,256] INFO - BentoService bundle 'IrisClassifier:20191126125258_4AB1D4' created at: /private/var/folders/ns/vc9qhmqx5dx_9fws7d869lqh0000gn/T/bentoml-temp-itiurepz
[2019-11-26 12:53:16,435] INFO - BentoService bundle 'IrisClassifier:20191126125258_4AB1D4' created at: /Users/chaoyuyang/bentoml/repository/IrisClassifier/20191126125258_4AB1D4

That's it. You've just created a BentoService SavedBundle, a versioned file archive that is ready for production deployment. It contains the BentoService you defined, as well as the packed trained model artifacts, pre-processing code, dependencies, and other configurations, in a single file directory.
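
Since the saved bundle is an ordinary directory, one way to get a feel for it is to list its contents. The sketch below uses only `pathlib`; the helper `list_bundle_files` and its `max_depth` parameter are hypothetical names for this illustration, and the exact files you see depend on your BentoML version.

```python
from pathlib import Path


def list_bundle_files(bundle_dir, max_depth=2):
    """Return relative paths under `bundle_dir`, at most `max_depth` levels deep.

    Directories are marked with a trailing slash.
    """
    root = Path(bundle_dir)
    entries = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if len(relative.parts) <= max_depth:
            suffix = "/" if path.is_dir() else ""
            entries.append(relative.as_posix() + suffix)
    return entries


# Usage in this notebook, assuming `saved_path` from the cell above:
# for entry in list_bundle_files(saved_path):
#     print(entry)
```

Later steps in this tutorial rely on files in this directory, for example the requirements.txt and environment.yml referenced in the Docker build log and the setup.py used for pip installation.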

Model Serving via REST API

From a BentoService SavedBundle, you can start a REST API server by providing the file path to the saved bundle:

In [6]:
# Note that REST API serving **does not work in Google Colab** because Colab's VM cannot be accessed
!bentoml serve {saved_path}
 * Serving Flask app "IrisClassifier" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [26/Nov/2019 12:53:23] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [26/Nov/2019 12:53:23] "GET /static/swagger-ui-bundle.js HTTP/1.1" 200 -
127.0.0.1 - - [26/Nov/2019 12:53:23] "GET /static/swagger-ui.css HTTP/1.1" 200 -
127.0.0.1 - - [26/Nov/2019 12:53:24] "GET /docs.json HTTP/1.1" 200 -
127.0.0.1 - - [26/Nov/2019 12:53:40] "POST /predict HTTP/1.1" 200 -
^C

View documentation for REST APIs

The REST API server provides a simple web UI for you to test and debug. If you are running this command on your local machine, visit http://127.0.0.1:5000 in your browser and try sending API requests to the server.

BentoML API Server Web UI Screenshot

Send prediction requests to the REST API server

You can also send prediction requests with curl from the command line:

curl -i \
--header "Content-Type: application/json" \
--request POST \
--data '[[5.1, 3.5, 1.4, 0.2]]' \
localhost:5000/predict

Or with Python and the requests library:

import requests
response = requests.post("http://127.0.0.1:5000/predict", json=[[5.1, 3.5, 1.4, 0.2]])
print(response.text)
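
If the requests library is not installed, the same call can be sketched with the standard library. The example below only builds the request so that it can be inspected without a running server; `build_predict_request` is a hypothetical helper name, the base URL is the local address used above, and the second input row is the second sample of the iris dataset, added here to show a batch of two.

```python
import json
import urllib.request


def build_predict_request(rows, base_url="http://127.0.0.1:5000"):
    """Build a POST request for the /predict endpoint with a JSON list of rows."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request([[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]])
print(request.get_method(), request.full_url)

# With the API server running, the request could be sent like this:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.read().decode("utf-8"))
```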

Containerize REST API server with Docker

The BentoService SavedBundle is structured to work as a Docker build context that can be used directly to build a Docker image for the API server. Simply use it as the Docker build context directory:

In [7]:
!cd {saved_path} && docker build -t iris-classifier .
Sending build context to Docker daemon  25.09kB
Step 1/12 : FROM continuumio/miniconda3:4.7.12
 ---> 406f2b43ea59
Step 2/12 : ENTRYPOINT [ "/bin/bash", "-c" ]
 ---> Using cache
 ---> 91b6d992bb33
Step 3/12 : EXPOSE 5000
 ---> Using cache
 ---> 73391506dd63
Step 4/12 : RUN set -x      && apt-get update      && apt-get install --no-install-recommends --no-install-suggests -y libpq-dev build-essential      && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 6fcef6d29bc7
Step 5/12 : RUN conda update conda -y       && conda install pip numpy scipy       && pip install gunicorn six
 ---> Using cache
 ---> 2a4c788d47dd
Step 6/12 : COPY . /bento
 ---> 88bae1d2b9da
Step 7/12 : WORKDIR /bento
 ---> Running in e0fe26bfa9d5
Removing intermediate container e0fe26bfa9d5
 ---> a4bb030d820a
Step 8/12 : RUN conda env update -n base -f /bento/environment.yml
 ---> Running in f62648e04909
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

Downloading and Extracting Packages
setuptools-42.0.1    | 670 KB    | ########## | 100% 
cffi-1.13.2          | 225 KB    | ########## | 100% 
python-3.7.3         | 32.1 MB   | ########## | 100% 
six-1.13.0           | 27 KB     | ########## | 100% 
tqdm-4.39.0          | 52 KB     | ########## | 100% 
pyopenssl-19.1.0     | 87 KB     | ########## | 100% 
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
#
# To activate this environment, use
#
#     $ conda activate base
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Removing intermediate container f62648e04909
 ---> abf7b8fc7702
Step 9/12 : RUN pip install -r /bento/requirements.txt
 ---> Running in 244adf8d6562
Collecting bentoml==0.5.2
  Downloading https://files.pythonhosted.org/packages/6d/9e/a612288dd4296b7c40f4e2d6826c94986947afa6918f9088f0de87d71286/BentoML-0.5.2-py3-none-any.whl (521kB)
Collecting scikit-learn
  Downloading https://files.pythonhosted.org/packages/9f/c5/e5267eb84994e9a92a2c6a6ee768514f255d036f3c8378acfa694e9f2c99/scikit_learn-0.21.3-cp37-cp37m-manylinux1_x86_64.whl (6.7MB)
Requirement already satisfied: gunicorn in /opt/conda/lib/python3.7/site-packages (from bentoml==0.5.2->-r /bento/requirements.txt (line 1)) (20.0.0)
Collecting sqlalchemy>=1.3.0
  Downloading https://files.pythonhosted.org/packages/34/5c/0e1d7ad0ca52544bb12f9cb8d5cc454af45821c92160ffedd38db0a317f6/SQLAlchemy-1.3.11.tar.gz (6.0MB)
Collecting flask
  Downloading https://files.pythonhosted.org/packages/9b/93/628509b8d5dc749656a9641f4caf13540e2cdec85276964ff8f43bbb1d3b/Flask-1.1.1-py2.py3-none-any.whl (94kB)
Collecting cerberus
  Downloading https://files.pythonhosted.org/packages/90/a7/71c6ed2d46a81065e68c007ac63378b96fa54c7bb614d653c68232f9c50c/Cerberus-1.3.2.tar.gz (52kB)
Collecting pathlib2
  Downloading https://files.pythonhosted.org/packages/e9/45/9c82d3666af4ef9f221cbb954e1d77ddbb513faf552aea6df5f37f1a4859/pathlib2-2.3.5-py2.py3-none-any.whl
Collecting alembic
  Downloading https://files.pythonhosted.org/packages/84/64/493c45119dce700a4b9eeecc436ef9e8835ab67bae6414f040cdc7b58f4b/alembic-1.3.1.tar.gz (1.1MB)
Requirement already satisfied: numpy in /opt/conda/lib/python3.7/site-packages (from bentoml==0.5.2->-r /bento/requirements.txt (line 1)) (1.17.3)
Collecting grpcio
  Downloading https://files.pythonhosted.org/packages/b5/68/070ee7609b452e950bd5af35f7161f0ceb0abd61cf16ff3b23c852d4594b/grpcio-1.25.0-cp37-cp37m-manylinux2010_x86_64.whl (2.4MB)
Collecting configparser
  Downloading https://files.pythonhosted.org/packages/7a/2a/95ed0501cf5d8709490b1d3a3f9b5cf340da6c433f896bbe9ce08dbe6785/configparser-4.0.2-py2.py3-none-any.whl
Collecting python-json-logger
  Downloading https://files.pythonhosted.org/packages/80/9d/1c3393a6067716e04e6fcef95104c8426d262b4adaf18d7aa2470eab028d/python-json-logger-0.1.11.tar.gz
Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from bentoml==0.5.2->-r /bento/requirements.txt (line 1)) (1.13.0)
Collecting docker
  Downloading https://files.pythonhosted.org/packages/cc/ca/699d4754a932787ef353a157ada74efd1ceb6d1fc0bfb7989ae1e7b33111/docker-4.1.0-py2.py3-none-any.whl (139kB)
Collecting boto3
  Downloading https://files.pythonhosted.org/packages/67/17/567f679dac6ec93611ce05637094e9736afa40e41c9ed0aaefd90de6f7e8/boto3-1.10.28-py2.py3-none-any.whl (128kB)
Requirement already satisfied: requests in /opt/conda/lib/python3.7/site-packages (from bentoml==0.5.2->-r /bento/requirements.txt (line 1)) (2.22.0)
Collecting prometheus-client
  Downloading https://files.pythonhosted.org/packages/b3/23/41a5a24b502d35a4ad50a5bb7202a5e1d9a0364d0c12f56db3dbf7aca76d/prometheus_client-0.7.1.tar.gz
Collecting pandas
  Downloading https://files.pythonhosted.org/packages/63/e0/a1b39cdcb2c391f087a1538bc8a6d62a82d0439693192aef541d7b123769/pandas-0.25.3-cp37-cp37m-manylinux1_x86_64.whl (10.4MB)
Collecting tabulate
  Downloading https://files.pythonhosted.org/packages/c4/41/523f6a05e6dc3329a5660f6a81254c6cd87e5cfb5b7482bae3391d86ec3a/tabulate-0.8.6.tar.gz (45kB)
Collecting ruamel.yaml>=0.15.0
  Downloading https://files.pythonhosted.org/packages/fa/90/ecff85a2e9c497e2fa7142496e10233556b5137db5bd46f3f3b006935ca8/ruamel.yaml-0.16.5-py2.py3-none-any.whl (123kB)
Collecting humanfriendly
  Downloading https://files.pythonhosted.org/packages/90/df/88bff450f333114680698dc4aac7506ff7cab164b794461906de31998665/humanfriendly-4.18-py2.py3-none-any.whl (73kB)
Collecting click>=7.0
  Downloading https://files.pythonhosted.org/packages/fa/37/45185cb5abbc30d7257104c434fe0b07e5a195a6847506c074527aa599ec/Click-7.0-py2.py3-none-any.whl (81kB)
Collecting protobuf>=3.6.0
  Downloading https://files.pythonhosted.org/packages/82/c0/371cf368e2d8b1b7bcf9f9bafd7cec962487e654ad8296d8e0ad62011537/protobuf-3.11.0-cp37-cp37m-manylinux1_x86_64.whl (1.3MB)
Collecting packaging
  Downloading https://files.pythonhosted.org/packages/cf/94/9672c2d4b126e74c4496c6b3c58a8b51d6419267be9e70660ba23374c875/packaging-19.2-py2.py3-none-any.whl
Collecting joblib>=0.11
  Downloading https://files.pythonhosted.org/packages/8f/42/155696f85f344c066e17af287359c9786b436b1bf86029bb3411283274f3/joblib-0.14.0-py2.py3-none-any.whl (294kB)
Requirement already satisfied: scipy>=0.17.0 in /opt/conda/lib/python3.7/site-packages (from scikit-learn->-r /bento/requirements.txt (line 2)) (1.3.1)
Requirement already satisfied: setuptools>=3.0 in /opt/conda/lib/python3.7/site-packages (from gunicorn->bentoml==0.5.2->-r /bento/requirements.txt (line 1)) (42.0.1.post20191125)
Collecting itsdangerous>=0.24
  Downloading https://files.pythonhosted.org/packages/76/ae/44b03b253d6fade317f32c24d100b3b35c2239807046a4c953c7b89fa49e/itsdangerous-1.1.0-py2.py3-none-any.whl
Collecting Jinja2>=2.10.1
  Downloading https://files.pythonhosted.org/packages/65/e0/eb35e762802015cab1ccee04e8a277b03f1d8e53da3ec3106882ec42558b/Jinja2-2.10.3-py2.py3-none-any.whl (125kB)
Collecting Werkzeug>=0.15
  Downloading https://files.pythonhosted.org/packages/ce/42/3aeda98f96e85fd26180534d36570e4d18108d62ae36f87694b476b83d6f/Werkzeug-0.16.0-py2.py3-none-any.whl (327kB)
Collecting Mako
  Downloading https://files.pythonhosted.org/packages/b0/3c/8dcd6883d009f7cae0f3157fb53e9afb05a0d3d33b3db1268ec2e6f4a56b/Mako-1.1.0.tar.gz (463kB)
Collecting python-editor>=0.3
  Downloading https://files.pythonhosted.org/packages/c6/d3/201fc3abe391bbae6606e6f1d598c15d367033332bd54352b12f35513717/python_editor-1.0.4-py3-none-any.whl
Collecting python-dateutil
  Downloading https://files.pythonhosted.org/packages/d4/70/d60450c3dd48ef87586924207ae8907090de0b306af2bce5d134d78615cb/python_dateutil-2.8.1-py2.py3-none-any.whl (227kB)
Collecting websocket-client>=0.32.0
  Downloading https://files.pythonhosted.org/packages/29/19/44753eab1fdb50770ac69605527e8859468f3c0fd7dc5a76dd9c4dbd7906/websocket_client-0.56.0-py2.py3-none-any.whl (200kB)
Collecting botocore<1.14.0,>=1.13.28
  Downloading https://files.pythonhosted.org/packages/8e/b3/ab9044d3aa14208f5b6f69665d5f82ad9b028809666e4e70dac1ab0bfd3a/botocore-1.13.28-py2.py3-none-any.whl (5.7MB)
Collecting s3transfer<0.3.0,>=0.2.0
  Downloading https://files.pythonhosted.org/packages/16/8a/1fc3dba0c4923c2a76e1ff0d52b305c44606da63f718d14d3231e21c51b0/s3transfer-0.2.1-py2.py3-none-any.whl (70kB)
Collecting jmespath<1.0.0,>=0.7.1
  Downloading https://files.pythonhosted.org/packages/83/94/7179c3832a6d45b266ddb2aac329e101367fbdb11f425f13771d27f225bb/jmespath-0.9.4-py2.py3-none-any.whl
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/lib/python3.7/site-packages (from requests->bentoml==0.5.2->-r /bento/requirements.txt (line 1)) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.7/site-packages (from requests->bentoml==0.5.2->-r /bento/requirements.txt (line 1)) (2019.9.11)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.7/site-packages (from requests->bentoml==0.5.2->-r /bento/requirements.txt (line 1)) (1.24.2)
Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/lib/python3.7/site-packages (from requests->bentoml==0.5.2->-r /bento/requirements.txt (line 1)) (2.8)
Collecting pytz>=2017.2
  Downloading https://files.pythonhosted.org/packages/e7/f9/f0b53f88060247251bf481fa6ea62cd0d25bf1b11a87888e53ce5b7c8ad2/pytz-2019.3-py2.py3-none-any.whl (509kB)
Collecting ruamel.yaml.clib>=0.1.2; platform_python_implementation == "CPython" and python_version < "3.8"
  Downloading https://files.pythonhosted.org/packages/40/80/da16b691d5e259dd9919a10628e541fca321cb4b078fbb88e1c7c22aa42d/ruamel.yaml.clib-0.2.0-cp37-cp37m-manylinux1_x86_64.whl (547kB)
Collecting pyparsing>=2.0.2
  Downloading https://files.pythonhosted.org/packages/c0/0c/fc2e007d9a992d997f04a80125b0f183da7fb554f1de701bbb70a8e7d479/pyparsing-2.4.5-py2.py3-none-any.whl (67kB)
Collecting MarkupSafe>=0.23
  Downloading https://files.pythonhosted.org/packages/98/7b/ff284bd8c80654e471b769062a9b43cc5d03e7a615048d96f4619df8d420/MarkupSafe-1.1.1-cp37-cp37m-manylinux1_x86_64.whl
Collecting docutils<0.16,>=0.10
  Downloading https://files.pythonhosted.org/packages/22/cd/a6aa959dca619918ccb55023b4cb151949c64d4d5d55b3f4ffd7eee0c6e8/docutils-0.15.2-py3-none-any.whl (547kB)
Building wheels for collected packages: sqlalchemy, cerberus, alembic, python-json-logger, prometheus-client, tabulate, Mako
  Building wheel for sqlalchemy (setup.py): started
  Building wheel for sqlalchemy (setup.py): finished with status 'done'
  Created wheel for sqlalchemy: filename=SQLAlchemy-1.3.11-cp37-cp37m-linux_x86_64.whl size=1216877 sha256=c3d55af5227bfa57bd1fb33fd4ee0f7f839387d70a9cb5e9d089367871faeb5d
  Stored in directory: /root/.cache/pip/wheels/a3/67/7d/6c41104a1a08ff1a25e260d3edec3ac19203141d1aaa2f0975
  Building wheel for cerberus (setup.py): started
  Building wheel for cerberus (setup.py): finished with status 'done'
  Created wheel for cerberus: filename=Cerberus-1.3.2-cp37-none-any.whl size=54336 sha256=1c524400d7a6fcdcd584dbee923bae4c91fb05c2079b1601b630e49a9cb2e5a8
  Stored in directory: /root/.cache/pip/wheels/e9/38/1f/f2cc84182676f3ae7134b9b2d744f9c235b24d2ddc8f7fe465
  Building wheel for alembic (setup.py): started
  Building wheel for alembic (setup.py): finished with status 'done'
  Created wheel for alembic: filename=alembic-1.3.1-py2.py3-none-any.whl size=144523 sha256=1050d97f8c11d766863d2adbcf3cb1ded9029961b1a53794eadce6792575d9c3
  Stored in directory: /root/.cache/pip/wheels/b2/d4/19/5ab879d30af7cbc79e6dcc1d421795b1aa9d78f455b0412ef7
  Building wheel for python-json-logger (setup.py): started
  Building wheel for python-json-logger (setup.py): finished with status 'done'
  Created wheel for python-json-logger: filename=python_json_logger-0.1.11-py2.py3-none-any.whl size=5076 sha256=f478fd7b32980268a9a191772050cd1812967d303c3327696a57f1b3ab3048e3
  Stored in directory: /root/.cache/pip/wheels/97/f7/a1/752e22bb30c1cfe38194ea0070a5c66e76ef4d06ad0c7dc401
  Building wheel for prometheus-client (setup.py): started
  Building wheel for prometheus-client (setup.py): finished with status 'done'
  Created wheel for prometheus-client: filename=prometheus_client-0.7.1-cp37-none-any.whl size=41402 sha256=6fe66bcc1d4f615a52c10f40f9d22d3aa1c0b43b1856f117f8e27b2db4c3f813
  Stored in directory: /root/.cache/pip/wheels/1c/54/34/fd47cd9b308826cc4292b54449c1899a30251ef3b506bc91ea
  Building wheel for tabulate (setup.py): started
  Building wheel for tabulate (setup.py): finished with status 'done'
  Created wheel for tabulate: filename=tabulate-0.8.6-cp37-none-any.whl size=23274 sha256=17cbc396f28cffb1de534307d83ca442f3f25c0da9c60fac28203b33fb15826d
  Stored in directory: /root/.cache/pip/wheels/9c/9b/f4/eb243fdb89676ec00588e8c54bb54360724c06e7fafe95278e
  Building wheel for Mako (setup.py): started
  Building wheel for Mako (setup.py): finished with status 'done'
  Created wheel for Mako: filename=Mako-1.1.0-cp37-none-any.whl size=75360 sha256=cebc330fb300656344b7470fd095c7b6e42d5fd31b4cd8e01849dab94e90549b
  Stored in directory: /root/.cache/pip/wheels/98/32/7b/a291926643fc1d1e02593e0d9e247c5a866a366b8343b7aa27
Successfully built sqlalchemy cerberus alembic python-json-logger prometheus-client tabulate Mako
ERROR: botocore 1.13.28 has requirement python-dateutil<2.8.1,>=2.1; python_version >= "2.7", but you'll have python-dateutil 2.8.1 which is incompatible.
Installing collected packages: sqlalchemy, click, itsdangerous, MarkupSafe, Jinja2, Werkzeug, flask, cerberus, pathlib2, Mako, python-editor, python-dateutil, alembic, grpcio, configparser, python-json-logger, websocket-client, docker, docutils, jmespath, botocore, s3transfer, boto3, prometheus-client, pytz, pandas, tabulate, ruamel.yaml.clib, ruamel.yaml, humanfriendly, protobuf, pyparsing, packaging, bentoml, joblib, scikit-learn
Successfully installed Jinja2-2.10.3 Mako-1.1.0 MarkupSafe-1.1.1 Werkzeug-0.16.0 alembic-1.3.1 bentoml-0.5.2 boto3-1.10.28 botocore-1.13.28 cerberus-1.3.2 click-7.0 configparser-4.0.2 docker-4.1.0 docutils-0.15.2 flask-1.1.1 grpcio-1.25.0 humanfriendly-4.18 itsdangerous-1.1.0 jmespath-0.9.4 joblib-0.14.0 packaging-19.2 pandas-0.25.3 pathlib2-2.3.5 prometheus-client-0.7.1 protobuf-3.11.0 pyparsing-2.4.5 python-dateutil-2.8.1 python-editor-1.0.4 python-json-logger-0.1.11 pytz-2019.3 ruamel.yaml-0.16.5 ruamel.yaml.clib-0.2.0 s3transfer-0.2.1 scikit-learn-0.21.3 sqlalchemy-1.3.11 tabulate-0.8.6 websocket-client-0.56.0
Removing intermediate container 244adf8d6562
 ---> 1f4c515d2966
Step 10/12 : RUN if [ -f /bento/bentoml_init.sh ]; then /bin/bash -c /bento/bentoml_init.sh; fi
 ---> Running in 93b1e32316cc
Removing intermediate container 93b1e32316cc
 ---> 1fbec0040243
Step 11/12 : RUN if [ -f /bento/setup.sh ]; then /bin/bash -c /bento/setup.sh; fi
 ---> Running in 54e01ffc60bc
Removing intermediate container 54e01ffc60bc
 ---> 934d729e5807
Step 12/12 : CMD ["bentoml serve-gunicorn /bento"]
 ---> Running in 15693d59564b
Removing intermediate container 15693d59564b
 ---> 56f469df7eca
Successfully built 56f469df7eca
Successfully tagged iris-classifier:latest

Note that Docker is not available in Google Colab. Download the notebook, ensure Docker is installed, and try it locally.

Next, you can docker push the image to your choice of registry for deployment, or run it locally for development and testing:

In [8]:
!docker run -p 5000:5000 iris-classifier
[2019-11-26 20:57:06 +0000] [1] [INFO] Starting gunicorn 20.0.0
[2019-11-26 20:57:06 +0000] [1] [INFO] Listening at: http://0.0.0.0:5000 (1)
[2019-11-26 20:57:06 +0000] [1] [INFO] Using worker: sync
[2019-11-26 20:57:06 +0000] [10] [INFO] Booting worker with pid: 10
[2019-11-26 20:57:06 +0000] [11] [INFO] Booting worker with pid: 11
[2019-11-26 20:57:06 +0000] [12] [INFO] Booting worker with pid: 12
^C
[2019-11-26 20:57:17 +0000] [1] [INFO] Handling signal: int
/opt/conda/lib/python3.7/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator SVC from version 0.21.2 when using version 0.21.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
[2019-11-26 20:57:17 +0000] [12] [INFO] Worker exiting (pid: 12)
/opt/conda/lib/python3.7/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator SVC from version 0.21.2 when using version 0.21.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
[2019-11-26 20:57:17 +0000] [10] [INFO] Worker exiting (pid: 10)
/opt/conda/lib/python3.7/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator SVC from version 0.21.2 when using version 0.21.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
[2019-11-26 20:57:17 +0000] [11] [INFO] Worker exiting (pid: 11)
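
When the container is started from a script rather than interactively, it may take a moment before the API server accepts connections. The following is a minimal standard-library sketch that polls a TCP port until it is reachable; `wait_for_port` and its parameters are hypothetical names for this illustration, and the port matches the `-p 5000:5000` mapping used above.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds, else False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Usage, assuming the container from the cell above is starting up:
# if wait_for_port("127.0.0.1", 5000):
#     print("API server is accepting connections")
```

A successful TCP connection only shows that the port is open; it does not prove that the model has finished loading, so a first prediction request is still the more reliable check.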

Load saved BentoService

bentoml.load is the essential API for loading a Bento into your Python application:

In [9]:
import bentoml
import pandas as pd

bento_svc = bentoml.load(saved_path)

# Test loaded bentoml service:
bento_svc.predict([X[0]])
[2019-11-26 12:57:20,718] WARNING - Module `iris_classifier` already loaded, using existing imported module.
Out[9]:
memmap([0])
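
The value returned above is an array of class indices rather than species names. As a small sketch, the indices can be mapped back to labels; the tuple below repeats the order of `iris.target_names` from scikit-learn's `load_iris`, and `to_species` is a hypothetical helper name.

```python
# Class names in the order used by sklearn.datasets.load_iris().target_names.
IRIS_TARGET_NAMES = ("setosa", "versicolor", "virginica")


def to_species(predictions):
    """Convert an iterable of predicted class indices into species names."""
    return [IRIS_TARGET_NAMES[int(label)] for label in predictions]


print(to_species([0]))  # prints ['setosa']
```

In this notebook the same function could be applied directly to the result of `bento_svc.predict([X[0]])`.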

Distribute BentoML SavedBundle as PyPI package

The BentoService SavedBundle is pip-installable and can be directly distributed as a PyPI package if you plan to use the model in your Python applications. You can install it as a system-wide Python package with pip:

In [10]:
!pip install {saved_path}
Processing /Users/chaoyuyang/bentoml/repository/IrisClassifier/20191126125258_4AB1D4
Requirement already satisfied: bentoml==0.5.2 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from IrisClassifier===20191126125258-4AB1D4) (0.5.2)
Requirement already satisfied: scikit-learn in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from IrisClassifier===20191126125258-4AB1D4) (0.21.2)
Requirement already satisfied: numpy in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.16.4)
Requirement already satisfied: packaging in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (19.0)
Requirement already satisfied: requests in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (2.22.0)
Requirement already satisfied: configparser in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (3.7.4)
Requirement already satisfied: flask in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.0.2)
Requirement already satisfied: sqlalchemy>=1.3.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.3.8)
Requirement already satisfied: cerberus in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.3.1)
Requirement already satisfied: docker in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (4.1.0)
Requirement already satisfied: python-json-logger in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.1.11)
Requirement already satisfied: pathlib2 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (2.3.4)
Requirement already satisfied: six in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.12.0)
Requirement already satisfied: tabulate in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.8.2)
Requirement already satisfied: pandas in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.24.2)
Requirement already satisfied: ruamel.yaml>=0.15.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.15.99)
Requirement already satisfied: alembic in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.9.10)
Requirement already satisfied: prometheus-client in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.7.1)
Requirement already satisfied: boto3 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.9.188)
Requirement already satisfied: gunicorn in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (20.0.0)
Requirement already satisfied: click>=7.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (7.0)
Requirement already satisfied: humanfriendly in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (4.18)
Requirement already satisfied: protobuf>=3.6.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (3.10.0)
Requirement already satisfied: grpcio in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.25.0)
Requirement already satisfied: joblib>=0.11 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from scikit-learn->IrisClassifier===20191126125258-4AB1D4) (0.13.2)
Requirement already satisfied: scipy>=0.17.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from scikit-learn->IrisClassifier===20191126125258-4AB1D4) (1.3.0)
Requirement already satisfied: pyparsing>=2.0.2 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from packaging->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (2.4.0)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from requests->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from requests->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (2019.9.11)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from requests->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.24.3)
Requirement already satisfied: idna<2.9,>=2.5 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from requests->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (2.8)
Requirement already satisfied: Jinja2>=2.10 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from flask->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (2.10.1)
Requirement already satisfied: Werkzeug>=0.14 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from flask->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.15.4)
Requirement already satisfied: itsdangerous>=0.24 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from flask->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.1.0)
Requirement already satisfied: websocket-client>=0.32.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from docker->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.56.0)
Requirement already satisfied: pytz>=2011k in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from pandas->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (2019.1)
Requirement already satisfied: python-dateutil>=2.5.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from pandas->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (2.8.0)
Requirement already satisfied: python-editor>=0.3 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from alembic->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.0.4)
Requirement already satisfied: Mako in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from alembic->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.0.10)
Requirement already satisfied: botocore<1.13.0,>=1.12.188 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from boto3->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.12.234)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from boto3->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.9.4)
Requirement already satisfied: s3transfer<0.3.0,>=0.2.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from boto3->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.2.1)
Requirement already satisfied: setuptools>=3.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from gunicorn->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (41.4.0)
Requirement already satisfied: MarkupSafe>=0.23 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from Jinja2>=2.10->flask->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (1.1.1)
Requirement already satisfied: docutils<0.16,>=0.10 in /Users/chaoyuyang/.local/lib/python3.7/site-packages (from botocore<1.13.0,>=1.12.188->boto3->bentoml==0.5.2->IrisClassifier===20191126125258-4AB1D4) (0.14)
Building wheels for collected packages: IrisClassifier
  Building wheel for IrisClassifier (setup.py) ... done
  Created wheel for IrisClassifier: filename=IrisClassifier-20191126125258_4AB1D4-cp37-none-any.whl size=5263 sha256=9424107f80e2a0a76b7734f7ddf3a765bb27310268703e6c8d0e075536e84e61
  Stored in directory: /private/var/folders/ns/vc9qhmqx5dx_9fws7d869lqh0000gn/T/pip-ephem-wheel-cache-5y170jlo/wheels/69/02/91/646c9c350cf887e495e9be46faae3c359b267cc4fea408e85c
Successfully built IrisClassifier
Installing collected packages: IrisClassifier
  Found existing installation: IrisClassifier 20191126114724-FBB606
    Uninstalling IrisClassifier-20191126114724-FBB606:
      Successfully uninstalled IrisClassifier-20191126114724-FBB606
Successfully installed IrisClassifier-20191126125258-4AB1D4
In [11]:
# Your BentoService class name becomes the name of the installed package
import IrisClassifier

installed_svc = IrisClassifier.load()
installed_svc.predict([X[0]])
Out[11]:
memmap([0])

This also allows users to upload their BentoService to pypi.org as a public Python package, or to their organization's private PyPI index, to share it with other developers.

cd {saved_path} && python setup.py sdist upload

You will need to configure a ".pypirc" file before uploading to a PyPI index. You can find more information about distributing Python packages at: https://docs.python.org/3.7/distributing/index.html#distributing-index

Model Serving via CLI

pip install {saved_path} also installs a CLI tool for accessing the BentoML service. Print the CLI help with --help:

In [12]:
!IrisClassifier --help
Usage: IrisClassifier [OPTIONS] COMMAND [ARGS]...

  BentoML CLI tool

Options:
  -q, --quiet  Hide process logs and only print command results
  --verbose    Print verbose debugging information for BentoML developer
  --version    Show the version and exit.
  --help       Show this message and exit.

Commands:
  <API_NAME>      Run API function
  info            List APIs
  open-api-spec   Display OpenAPI/Swagger JSON specs
  serve           Start local rest server
  serve-gunicorn  Start local gunicorn server

Print more information about this ML service with the info command:

In [ ]:
!IrisClassifier info

You can also print help and docs on individual commands:

In [ ]:
!IrisClassifier predict --help

Each service API you defined in the BentoService will be exposed as a CLI command with the same name as the API function:

In [ ]:
!IrisClassifier predict --input='[[5.1, 3.5, 1.4, 0.2]]'
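
When the input rows live in a Python list rather than a hand-typed string, the --input value can be generated with json.dumps. The following is a minimal sketch that only builds the argument list for the command above. The helper name predict_cli_args is hypothetical and not part of BentoML, and actually running the command assumes the IrisClassifier package has been pip-installed as described earlier.

```python
import json


def predict_cli_args(rows, cli_name="IrisClassifier", api_name="predict"):
    """Build the argv list for `<cli_name> <api_name> --input=<rows as JSON>`.

    `predict_cli_args` is a hypothetical helper for illustration only.
    """
    return [cli_name, api_name, "--input=" + json.dumps(rows)]


args = predict_cli_args([[5.1, 3.5, 1.4, 0.2]])
print(args)

# To actually invoke the installed CLI (not executed here):
#   import subprocess
#   subprocess.run(args, check=True)
```

Passing the arguments as a list to subprocess.run sidesteps shell quoting of the brackets and spaces in the JSON string.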

The BentoML CLI also supports reading input data from CSV or JSON files, either on the local machine or at a remote HTTP/S3 location:

In [13]:
import pandas as pd

# Write test data to a CSV file
pd.DataFrame(iris.data).to_csv('iris_data.csv', index=False)

# Invoke predict from the command line
!IrisClassifier predict --input='./iris_data.csv'
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2
 2 2]

Alternatively, you can use the bentoml CLI to load and run a BentoML service archive without installing it:

In [ ]:
!bentoml info {saved_path}
In [14]:
!bentoml predict {saved_path} --input='[[5.1, 3.5, 1.4, 0.2]]'
[0]

Deploy REST API server to the cloud

BentoML has a built-in deployment management tool called YataiService. YataiService can be deployed separately to manage all your team's trained models, BentoService bundles, and active deployments in the cloud or in your own Kubernetes cluster. You can also create simple model serving deployments with just the BentoML CLI, which launches a local YataiService backed by a SQLite database on your machine.

Now let's deploy the IrisClassifier to AWS Lambda as a serverless endpoint.

First you need to install the aws-sam-cli package, which is required by BentoML to work with AWS Lambda deployment:

    pip install -U aws-sam-cli==0.31.1

You will also need to configure your AWS account and credentials if they are not already configured on your machine. You can do this either via environment variables or through the aws configure command: install the AWS CLI via pip install awscli and follow the detailed instructions here.
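
If you take the environment-variable route, a quick check before deploying can save a failed run. The sketch below assumes the standard AWS variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The helper missing_aws_env_vars is hypothetical and not part of BentoML or the AWS tooling.

```python
import os

# Standard AWS credential variable names; this check covers only the
# environment-variable route, not credentials stored by `aws configure`.
REQUIRED_AWS_ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env_vars(environ=None):
    """Return the required AWS variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_AWS_ENV_VARS if not environ.get(name)]


missing = missing_aws_env_vars()
if missing:
    print("Not set in the environment:", ", ".join(missing))
else:
    print("AWS credential environment variables are set")
```

The check only tells you whether the variables are present, not whether the credentials are valid. A non-empty result is also fine if you configured credentials with aws configure instead.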

Now you can run the bentoml deployment create command to create an AWS Lambda deployment hosting the BentoService you've created:

In [16]:
!bentoml deployment create quick-start-guide-deployment \
    -b=IrisClassifier:{svc.version} \
    --platform=aws-lambda
[2019-11-26 12:59:06,859] INFO - Building lambda project
[2019-11-26 13:00:12,142] INFO - Packaging lambda project
[2019-11-26 13:00:33,427] INFO - Deploying lambda project

Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - default-quick-start-guide-deployment
[2019-11-26 13:01:32,152] INFO - Finish deployed lambda project, fetching latest status
Successfully created deployment quick-start-guide-deployment
{
  "namespace": "default",
  "name": "quick-start-guide-deployment",
  "spec": {
    "bentoName": "IrisClassifier",
    "bentoVersion": "20191126125258_4AB1D4",
    "operator": "AWS_LAMBDA",
    "awsLambdaOperatorConfig": {
      "region": "us-west-2",
      "memorySize": 1024,
      "timeout": 6
    }
  },
  "state": {
    "state": "RUNNING",
    "infoJson": {
      "endpoints": [
        "https://apcjn5h648.execute-api.us-west-2.amazonaws.com/Prod/predict"
      ],
      "s3_bucket": "btml-default-quick-start-guide-deployment-2b99b3"
    },
    "timestamp": "2019-11-26T21:01:32.544927Z"
  },
  "createdAt": "2019-11-26T20:59:02.611005Z",
  "lastUpdatedAt": "2019-11-26T20:59:02.611042Z"
}

Here 'quick-start-guide-deployment' is the deployment name. You can reference the deployment by this name and query its status. For example, to get the current deployment status:

In [17]:
!bentoml deployment describe quick-start-guide-deployment
{
  "namespace": "default",
  "name": "quick-start-guide-deployment",
  "spec": {
    "bentoName": "IrisClassifier",
    "bentoVersion": "20191126125258_4AB1D4",
    "operator": "AWS_LAMBDA",
    "awsLambdaOperatorConfig": {
      "region": "us-west-2",
      "memorySize": 1024,
      "timeout": 6
    }
  },
  "state": {
    "state": "RUNNING",
    "infoJson": {
      "endpoints": [
        "https://apcjn5h648.execute-api.us-west-2.amazonaws.com/Prod/predict"
      ],
      "s3_bucket": "btml-default-quick-start-guide-deployment-2b99b3"
    },
    "timestamp": "2019-11-26T21:02:55.201073Z"
  },
  "createdAt": "2019-11-26T20:59:02.611005Z",
  "lastUpdatedAt": "2019-11-26T20:59:02.611042Z"
}
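
The endpoint URL can also be pulled out of this JSON programmatically. The following is a minimal sketch, assuming the describe output has been captured as a string with the structure shown above. The sample is a trimmed copy of that output, and first_endpoint is a hypothetical helper, not a BentoML API.

```python
import json

# Trimmed copy of the describe output above, keeping only the fields used here.
describe_output = """
{
  "name": "quick-start-guide-deployment",
  "state": {
    "state": "RUNNING",
    "infoJson": {
      "endpoints": [
        "https://apcjn5h648.execute-api.us-west-2.amazonaws.com/Prod/predict"
      ]
    }
  }
}
"""


def first_endpoint(describe_json):
    """Return the first endpoint URL, or None if the deployment lists none."""
    info = json.loads(describe_json)
    endpoints = info.get("state", {}).get("infoJson", {}).get("endpoints", [])
    return endpoints[0] if endpoints else None


endpoint = first_endpoint(describe_output)
print(endpoint)
```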

To send a request to your AWS Lambda deployment, grab the endpoint URL from the JSON output above:

In [20]:
!curl -i \
--header "Content-Type: application/json" \
--request POST \
--data '[[5.1, 3.5, 1.4, 0.2]]' \
https://apcjn5h648.execute-api.us-west-2.amazonaws.com/Prod/predict
[0]
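
The same request can be made from Python using only the standard library. This sketch mirrors the curl call above, with the same URL, header, and JSON body. The helper build_predict_request is hypothetical, and the sending step is left as a comment because it needs network access and a deployment that is still running.

```python
import json
import urllib.request


def build_predict_request(url, rows):
    """Build a POST request carrying `rows` as a JSON body, mirroring the curl call."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(
    "https://apcjn5h648.execute-api.us-west-2.amazonaws.com/Prod/predict",
    [[5.1, 3.5, 1.4, 0.2]],
)

# Sending requires network access and a live deployment (not executed here):
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```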

And to delete an active deployment:

In [21]:
!bentoml deployment delete quick-start-guide-deployment
Successfully deleted deployment "quick-start-guide-deployment"

By default, BentoML stores deployment metadata on the local machine. For team settings, we recommend hosting a shared BentoML Yatai server so that your entire team can track all the BentoService saved bundles and deployments they've created in one central place.

Summary

This is what it looks like to use BentoML to serve and deploy a model as a prediction service running in the cloud. BentoML also supports many other machine learning frameworks and deployment platforms. You can find more BentoML example notebooks here.