BentoML Example: H2O Classification

BentoML is an open-source platform for serving and deploying machine learning models.

This notebook demonstrates how to use BentoML to turn an H2O model into a Docker image containing a REST API server that serves the model, and how to distribute the model as a command line tool or a pip-installable PyPI package.

In [ ]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
In [ ]:
!pip install bentoml
!pip install h2o
In [ ]:
import h2o
import bentoml

h2o.init()

This showcase uses prostate cancer data to train a model that predicts a specific stage of the disease. The dataset was collected at the Ohio State University Comprehensive Cancer Center and includes demographic and medical data for each of the 380 patients, as well as a label (the CAPSULE column) indicating whether the patient's tumor has already penetrated the prostatic capsule. Such penetration is a clear sign of an advanced cancer stage and helps the doctor decide on biopsy and treatment methods.

In this showcase, a deep learning model classifies each patient's tumor as 'penetrating prostatic capsule' or 'not penetrating prostatic capsule'.

Prepare Dataset & Model Training

In [ ]:
prostate = h2o.import_file(path="https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv")
prostate.describe()
In [ ]:
# import the deep learning estimator module
from h2o.estimators.deeplearning import H2ODeepLearningEstimator
# transform the target variable into a factor
prostate["CAPSULE"] = prostate["CAPSULE"].asfactor()
# construct and define the estimator object 
model = H2ODeepLearningEstimator(activation = "Tanh", hidden = [10, 10, 10], epochs = 100)
# train the model on the whole prostate dataset
model.train(x = list(set(prostate.columns) - set(["ID","CAPSULE"])), y ="CAPSULE", training_frame = prostate)
model.show()
In [ ]:
predictions=model.predict(prostate)
predictions.show()
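
The training cell above selects predictor columns with a set difference, which does not guarantee any particular column order. The following is a minimal sketch of an order-preserving alternative in plain Python. The column names are copied from the CSV header used in the REST request later in this notebook; the names `columns`, `excluded`, and `features` are illustrative only.

```python
# Column names as they appear in the prostate CSV header
columns = ["ID", "CAPSULE", "AGE", "RACE", "DPROS", "DCAPS", "PSA", "VOL", "GLEASON"]

# ID is a row identifier and CAPSULE is the target, so neither is a predictor
excluded = {"ID", "CAPSULE"}

# A list comprehension keeps the original column order, unlike a set difference
features = [name for name in columns if name not in excluded]
print(features)
```

In the notebook, the same comprehension could be applied to `prostate.columns` and the result passed as `x` to `model.train`.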

Define BentoService for model serving

In [ ]:
%%writefile h2o_model_service.py
import pandas as pd
import h2o
import bentoml
from bentoml.artifact import H2oModelArtifact
from bentoml.handlers import DataframeHandler

@bentoml.artifacts([H2oModelArtifact('model')])
@bentoml.env(pip_dependencies=['h2o'])
class H2oModelService(bentoml.BentoService):

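    @bentoml.api(DataframeHandler)
    def predict(self, df):
        # Convert the incoming pandas DataFrame into an H2OFrame
        hf = h2o.H2OFrame(df)
        # Run the packed H2O model and return the result as a pandas DataFrame
        predictions = self.artifacts.model.predict(hf)
        return predictions.as_data_frame()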
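
Because the service accepts whatever DataFrame a client sends, it may help to check that the predictor columns used in training are present before calling the model. The sketch below shows one way to do that on plain column-name lists. The helper `missing_columns` and the constant `EXPECTED_COLUMNS` are hypothetical and not part of BentoML or H2O; the column names come from the CSV header used elsewhere in this notebook.

```python
# Predictor columns used for training (every column except ID and CAPSULE)
EXPECTED_COLUMNS = ["AGE", "RACE", "DPROS", "DCAPS", "PSA", "VOL", "GLEASON"]


def missing_columns(received, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from `received`."""
    received_set = set(received)
    return [name for name in expected if name not in received_set]


# Example: a request that only carries some of the predictors
print(missing_columns(["ID", "AGE", "RACE", "PSA"]))
```

Inside `predict`, such a check could be called as `missing_columns(df.columns)` and an error raised when the result is non-empty.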

Save BentoService to file archive

In [ ]:
# 1) import the custom BentoService defined above
from h2o_model_service import H2oModelService

# 2) `pack` it with required artifacts
bento_svc = H2oModelService.pack(model=model)

# 3) save your BentoService
saved_path = bento_svc.save()

Load BentoService from archive

In [ ]:
import bentoml
import pandas as pd

# Load saved BentoService archive from file directory
loaded_bento_svc = bentoml.load(saved_path)

# Access the predict function of loaded BentoService
df = pd.read_csv("https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv")
loaded_bento_svc.predict(df)

"pip install" a BentoService archive

BentoML users can install a saved BentoML archive directly with pip install $SAVED_PATH and then use it as a regular Python package.

For demo purposes, copy the generated model to the ./model folder.
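
The notebook does not show the copy step itself, so here is a minimal sketch using only the standard library. It assumes the saved archive is a directory, as the loading step above suggests; the helper name `copy_archive` is hypothetical.

```python
import shutil
from pathlib import Path


def copy_archive(saved_path, target="./model"):
    """Copy a saved archive directory into `target` and return the target path."""
    target_dir = Path(target)
    # dirs_exist_ok=True (Python 3.8+) lets repeated runs overwrite an earlier copy
    shutil.copytree(saved_path, target_dir, dirs_exist_ok=True)
    return target_dir
```

In this notebook the call would be `copy_archive(saved_path)`, which produces the ./model directory that the Docker build step later changes into.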

In [ ]:
!pip install {saved_path}
In [ ]:
# The BentoService class name becomes the name of the installed package
import H2oModelService

ms = H2oModelService.load() # call load to ensure all artifacts are loaded
ms.predict(pd.read_csv('https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv'))

Use installed BentoService as CLI tool

In [ ]:
!H2oModelService --help
In [ ]:
!H2oModelService info
In [ ]:
!H2oModelService predict --help
In [ ]:
!H2oModelService predict --input https://raw.githubusercontent.com/multicode/h2o-notebook/master/prostate.csv

Model Serving via REST API

In your terminal, run the following command to start the REST API server:

In [ ]:
!bentoml serve {saved_path}

Send prediction request to REST API server

Run the following command in a terminal to make an HTTP request to the API server:

curl -i \
--header "Content-Type: text/csv" \
--request POST \
--data-binary 'ID,CAPSULE,AGE,RACE,DPROS,DCAPS,PSA,VOL,GLEASON
1,0,65,1,2,1,1.4,0,6
2,0,72,1,3,2,6.7,0,7' \
localhost:5000/predict
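
The same request can also be built from Python. The sketch below uses only the standard library `urllib.request` module and mirrors the curl call: same URL, same `Content-Type` header, same CSV rows. The helper names `build_predict_request` and `send` are illustrative, and the request is only constructed here, not sent, because sending requires the server to be running.

```python
import urllib.request

# CSV payload with the same header and two rows as the curl example
CSV_ROWS = (
    "ID,CAPSULE,AGE,RACE,DPROS,DCAPS,PSA,VOL,GLEASON\n"
    "1,0,65,1,2,1,1.4,0,6\n"
    "2,0,72,1,3,2,6.7,0,7\n"
)


def build_predict_request(csv_text, url="http://localhost:5000/predict"):
    """Build (but do not send) a POST request carrying CSV data."""
    return urllib.request.Request(
        url,
        data=csv_text.encode("utf-8"),
        headers={"Content-Type": "text/csv"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return the decoded response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_predict_request(CSV_ROWS)
# With the REST API server running, the request could be sent with:
# print(send(request))
```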

Containerize REST API server with Docker

Note: Docker is not available when running in Google Colaboratory.

Build the Docker image

In [ ]:
!cd "./model" && docker build -t h2o-model .

Run the server with the Docker image

In [ ]:
!docker run -p 5000:5000 h2o-model