BentoML makes moving trained ML models to production easy:
BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between Data Science and DevOps and enables teams to deliver prediction services in a fast, repeatable, and scalable way.
Before reading this example project, be sure to check out the Getting started guide to learn about the basic concepts in BentoML.
This notebook demonstrates how to use BentoML to turn an H2O model into a docker image containing a REST API server serving this model, as well as how to distribute the model as a command-line tool or a pip-installable PyPI package.
%reload_ext autoreload
%autoreload 2
%matplotlib inline
!pip install -q bentoml "h2o>=3.24.0.2"
import h2o
import bentoml
h2o.init()
This showcase uses prostate cancer data and tries to find an algorithm that predicts a certain stage of cancer. The dataset was collected at the Ohio State University Comprehensive Cancer Center and includes demographic and medical data from each of the 380 patients, as well as a classifier identifying whether the patient's tumor has already penetrated the prostatic capsule. This latter event is a clear sign of an advanced cancer state and also helps the doctor decide on biopsy and treatment methods.
In this showcase, a deep learning algorithm is used to classify each patient's tumor as 'penetrating prostatic capsule' or 'not penetrating prostatic capsule'.
prostate = h2o.import_file(path="https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv")
prostate.describe()
# import the deep learning estimator module
from h2o.estimators.deeplearning import H2ODeepLearningEstimator
# transform the target variable into a factor
prostate["CAPSULE"] = prostate["CAPSULE"].asfactor()
# construct and define the estimator object
model = H2ODeepLearningEstimator(activation = "Tanh", hidden = [10, 10, 10], epochs = 100)
# train the model on the whole prostate dataset
model.train(x = list(set(prostate.columns) - set(["ID","CAPSULE"])), y ="CAPSULE", training_frame = prostate)
model.show()
predictions = model.predict(prostate)
predictions.show()
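Before packaging the model, you may want a quick look at how it performs. The sketch below evaluates on the training frame, so the metrics are optimistic; a held-out frame would be more informative:
# optional: inspect binomial classification metrics on the training frame
perf = model.model_performance(prostate)
print(perf.auc())
perf.confusion_matrix()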
%%writefile h2o_model_service.py
import pandas as pd
import h2o
import bentoml
from bentoml.frameworks.h2o import H2oModelArtifact
from bentoml.adapters import DataframeInput
@bentoml.artifacts([H2oModelArtifact('model')])
@bentoml.env(
    pip_packages=['pandas', 'h2o==3.24.0.2'],
    conda_channels=['h2oai'],
    conda_dependencies=['h2o==3.24.0.2']
)
class H2oModelService(bentoml.BentoService):

    @bentoml.api(input=DataframeInput(), batch=True)
    def predict(self, df):
        hf = h2o.H2OFrame(df)
        predictions = self.artifacts.model.predict(hf)
        return predictions.as_data_frame()
# 1) import the custom BentoService defined above
from h2o_model_service import H2oModelService
# 2) `pack` it with required artifacts
bento_svc = H2oModelService()
bento_svc.pack('model', model)
# 3) save your BentoService
saved_path = bento_svc.save()
print(saved_path)
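The saved BentoService is now registered in BentoML's local repository. Assuming the default local YataiService configuration, you can verify it from the CLI:
!bentoml list
!bentoml get H2oModelService:latest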
To start a REST API model server with the BentoService saved above, use the bentoml serve command:
!bentoml serve {saved_path}
If you are running this notebook from Google Colab, you can start the dev server with the --run-with-ngrok
option to gain access to the API endpoint via a public URL managed by ngrok:
!bentoml serve H2oModelService:latest --run-with-ngrok
Run the following command in a terminal to make an HTTP request to the API server:
curl -i \
--header "Content-Type: text/csv" \
--request POST \
--data 'ID,CAPSULE,AGE,RACE,DPROS,DCAPS,PSA,VOL,GLEASON
1,0,65,1,2,1,1.4,0,6
2,0,72,1,3,2,6.7,0,7' \
localhost:5000/predict
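Alternatively, you can make the same request from Python with the requests library (a minimal sketch, assuming the dev server is still running on localhost:5000):
import requests

# the same two rows used in the curl example above, sent as CSV
csv_data = (
    "ID,CAPSULE,AGE,RACE,DPROS,DCAPS,PSA,VOL,GLEASON\n"
    "1,0,65,1,2,1,1.4,0,6\n"
    "2,0,72,1,3,2,6.7,0,7\n"
)

response = requests.post(
    "http://localhost:5000/predict",
    headers={"Content-Type": "text/csv"},
    data=csv_data,
)
print(response.text)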
One common way of distributing this model API server for production deployment is via Docker containers, and BentoML provides a convenient way to do that.
Note that docker is not available in Google Colab. You will need to download and run this notebook locally to try out the docker containerization feature.
If you already have docker configured, simply run the following commands to produce a docker image serving the H2oModelService prediction service created above and to run it:
!bentoml containerize H2oModelService:latest
!docker run -p 5000:5000 h2omodelservice
bentoml.load is the API for loading a BentoML packaged model in python:
import bentoml
import pandas as pd
# Load saved BentoService archive from file directory
loaded_bento_svc = bentoml.load(saved_path)
# Access the predict function of loaded BentoService
df = pd.read_csv("https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv")
loaded_bento_svc.predict(df)
The BentoML CLI supports loading and running a packaged model directly. With the DataframeInput adapter, the CLI command supports reading input DataFrame data from a CLI argument or from local csv or json files:
!wget https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv
!bentoml run H2oModelService:latest predict \
--input-file prostate.csv
If you are on a small team with limited engineering or DevOps resources, try out automated deployment with the BentoML CLI, which currently supports AWS Lambda, AWS SageMaker, and Azure Functions:
If the cloud platform you are working with is not on the list above, try out these step-by-step guides on manually deploying a BentoML packaged model to cloud platforms:
Lastly, if you have a DevOps or ML Engineering team operating a Kubernetes or OpenShift cluster, use the following guides as references for implementing your deployment strategy: