BentoML Example: fast.ai with Tabular Data

This notebook is based on lesson 4 of fastai's course v3. We are going to train a model that predicts a person's salary range from the provided tabular data.

BentoML is an open source platform for machine learning model serving and deployment. In this project we will use BentoML to package the trained fast.ai model, and build a containerized REST API model server.

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
In [ ]:
!pip install fastai
!pip install bentoml
In [2]:
from fastai.tabular import *

Prepare Training Data

In [3]:
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
In [4]:
dep_var = 'salary'
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [FillMissing, Categorify, Normalize]
In [5]:
test = TabularList.from_df(df.iloc[800:1000].copy(), path=path, cat_names=cat_names, cont_names=cont_names)
In [6]:
data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
                           .split_by_idx(list(range(800,1000)))
                           .label_from_df(cols=dep_var)
                           .add_test(test)
                           .databunch())
In [7]:
data.show_batch(rows=10)
workclass         education     marital-status      occupation       relationship   race                education-num_na  age      fnlwgt   education-num  target
State-gov         Masters       Never-married       Prof-specialty   Not-in-family  Asian-Pac-Islander  False             -1.0692  -0.3982  1.5334         <50k
Private           Some-college  Never-married       Sales            Own-child      White               False             -1.4357  -0.8773  -0.0312        <50k
Private           Prof-school   Married-civ-spouse  Prof-specialty   Wife           White               False             -0.5561  -0.7896  1.9245         >=50k
Private           HS-grad       Married-civ-spouse  Prof-specialty   Wife           Black               False             -0.8493  -0.2715  -0.4224        <50k
Self-emp-inc      Some-college  Married-civ-spouse  Prof-specialty   Husband        White               False             0.0303   -0.5977  -0.0312        <50k
Private           Bachelors     Married-civ-spouse  Prof-specialty   Husband        White               False             0.1036   -1.5616  1.1422         >=50k
Private           HS-grad       Separated           Craft-repair     Unmarried      Amer-Indian-Eskimo  False             1.4962   0.1539   -0.4224        <50k
Private           HS-grad       Married-civ-spouse  Exec-managerial  Husband        White               True              -0.4828  -0.5755  -0.0312        <50k
Federal-gov       Bachelors     Divorced            Adm-clerical     Own-child      White               False             0.2502   -0.6996  1.1422         <50k
Self-emp-not-inc  Some-college  Married-civ-spouse  Craft-repair     Husband        White               False             -0.3362  1.5115   -0.0312        <50k
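
The continuous columns in the batch above (age, fnlwgt, education-num) are displayed after the Normalize step, which is why they appear as small values centered near zero. The following is a minimal sketch of that kind of standardization using only the Python standard library. The `ages` list is made up for illustration, and the exact estimator fastai uses (sample standard deviation here, with no epsilon term) is an assumption rather than something this notebook shows.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores (x - mean) / std for a list of numbers.

    Assumption: sample standard deviation; fastai's exact formula may differ.
    """
    mu = mean(values)
    sigma = stdev(values)
    return [(x - mu) / sigma for x in values]


# Made-up ages for illustration only (not rows from adult.csv)
ages = [25, 35, 45, 55]
z_scores = standardize(ages)
print(z_scores)
```

The statistics come from the training rows and are reused at prediction time, so a raw value such as `age = 49` in a request is mapped onto the same scale before it reaches the network.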

Model Training

In [8]:
learn = tabular_learner(data, layers=[200,100], metrics=accuracy)
In [9]:
learn.fit(1, 1e-2)
epoch train_loss valid_loss accuracy time
0 0.369473 0.401396 0.790000 00:02
In [10]:
row = df.iloc[0] # sample input data for testing

learn.predict(row)
Out[10]:
(Category >=50k, tensor(1), tensor([0.3250, 0.6750]))
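
The tuple returned above holds the predicted category, its class index, and the per-class probabilities. A minimal sketch of how the three parts relate, in plain Python with the numbers copied from the output above. The class order `['<50k', '>=50k']` is an assumption inferred from index 1 mapping to `>=50k`.

```python
# Probabilities copied from the Out[10] line above
probs = [0.3250, 0.6750]

# Assumed class order: index 1 corresponds to '>=50k' in the output above
classes = ['<50k', '>=50k']

# The predicted index is the position of the largest probability
pred_idx = max(range(len(probs)), key=lambda i: probs[i])
pred_label = classes[pred_idx]

print(pred_idx, pred_label)
```

The BentoService defined in the next section returns only the label part of this tuple for each input row.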

Create BentoService for model serving

In [11]:
%%writefile tabular_csv.py

from bentoml import env, api, artifacts, BentoService
from bentoml.artifact import FastaiModelArtifact
from bentoml.handlers import DataframeHandler


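# Declare the pip dependency and the packed fastai learner, stored as 'model'
@env(pip_dependencies=['fastai'])
@artifacts([FastaiModelArtifact('model')])
class TabularModel(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        # The handler passes the request payload in as a pandas DataFrame
        results = []
        for _, row in df.iterrows():
            # The learner predicts one row at a time; the first element of the
            # result is the predicted category, and .obj is its label
            prediction = self.artifacts.model.predict(row)
            results.append(prediction[0].obj)
        return results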
Overwriting tabular_csv.py

Save BentoService to file archive

In [12]:
# 1) import the custom BentoService defined above
from tabular_csv import TabularModel

# 2) `pack` it with required artifacts
svc = TabularModel.pack(model=learn)

# 3) save your BentoService
saved_path = svc.save()
[2019-10-24 15:08:09,321] INFO - Successfully saved Bento 'TabularModel:20191024150753_25C9AB' to path: /Users/chaoyuyang/bentoml/repository/TabularModel/20191024150753_25C9AB

Install saved BentoService as PyPI package

In [13]:
!pip install {saved_path}
Processing /Users/chaoyuyang/bentoml/repository/TabularModel/20191024150753_25C9AB
Building wheels for collected packages: TabularModel
  Building wheel for TabularModel (setup.py) ... done
  Stored in directory: /private/var/folders/ns/vc9qhmqx5dx_9fws7d869lqh0000gn/T/pip-ephem-wheel-cache-0ji6sbpu/wheels/02/6d/d4/454ee8d2e19512660a7a5d74a1753fa1199549d7bbbebfed5a
Successfully built TabularModel
Installing collected packages: TabularModel
  Found existing installation: TabularModel 20191014172300-C209CA
    Uninstalling TabularModel-20191014172300-C209CA:
      Successfully uninstalled TabularModel-20191014172300-C209CA
Successfully installed TabularModel-20191024150753-25C9AB
In [14]:
# Use CSV data
!TabularModel predict --input=https://raw.githubusercontent.com/bentoml/gallery/master/fast-ai/salary-range-prediction/test.csv
['>=50k']
In [ ]:
# Use json data
!TabularModel predict --input=https://raw.githubusercontent.com/bentoml/gallery/master/fast-ai/salary-range-prediction/test.json
['<50k']

Model Serving via REST API

Note: Running a local REST API server does not work in Google Colab. To try this step, copy the notebook and run it locally.

In [ ]:
!bentoml serve {saved_path}
 * Serving Flask app "TabularModel" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [24/Oct/2019 15:08:25] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [24/Oct/2019 15:08:27] "POST /predict HTTP/1.1" 200 -

Send prediction request to the REST API server

JSON Request

curl -X POST \
  http://localhost:5000/predict \
  -H 'Content-Type: application/json' \
  -d '[{
  "age": 49,
  "workclass": "Private",
  "fnlwgt": 101320,
  "education": "Assoc-acdm",
  "education-num": 12.0,
  "marital-status": "Married-civ-spouse",
  "occupation": "",
  "relationship": "Wife",
  "race": "White",
  "race": "White",
  "sex": "Female",
  "capital-gain": 0,
  "capital-loss": 1902,
  "hours-per-week": 40,
  "native-country": "United-States",
  "salary": ">=50k"
}]'
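
The same JSON request could also be sent from Python. The sketch below uses only the standard library and builds the request without sending it; `build_json_request` and `send` are hypothetical helper names, and the record mirrors the curl example (with the relationship value spelled `Wife`, as in the dataset). It assumes the server started by `bentoml serve` is listening on port 5000 as shown in the log above.

```python
import json
import urllib.request


def build_json_request(records, url="http://localhost:5000/predict"):
    """Build a POST request whose body is a JSON list of records."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the raw response text (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


record = {
    "age": 49,
    "workclass": "Private",
    "fnlwgt": 101320,
    "education": "Assoc-acdm",
    "education-num": 12.0,
    "marital-status": "Married-civ-spouse",
    "occupation": "",
    "relationship": "Wife",
    "race": "White",
    "sex": "Female",
    "capital-gain": 0,
    "capital-loss": 1902,
    "hours-per-week": 40,
    "native-country": "United-States",
    "salary": ">=50k",
}

json_request = build_json_request([record])
# Call send(json_request) once the local server is running.
```

The body is a list because the service iterates over DataFrame rows, so several records can be scored in one call.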

CSV Request

curl -X POST "http://127.0.0.1:5000/predict" \
    -H "Content-Type: text/csv" \
    --data-binary @test.csv
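
The contents of `test.csv` are not shown in this notebook. As a rough sketch, a CSV payload with a header row could be built and posted from Python as follows. The row is made up to mirror the JSON example, restricted to the columns the model was trained on; `rows_to_csv` is a hypothetical helper, and whether the server accepts exactly this column subset is an assumption.

```python
import csv
import io
import urllib.request


def rows_to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up row for illustration; not the actual contents of test.csv
rows = [{
    "age": 49,
    "workclass": "Private",
    "fnlwgt": 101320,
    "education": "Assoc-acdm",
    "education-num": 12.0,
    "marital-status": "Married-civ-spouse",
    "occupation": "",
    "relationship": "Wife",
    "race": "White",
}]

payload = rows_to_csv(rows)

# Build the POST request; pass it to urllib.request.urlopen to send it
csv_request = urllib.request.Request(
    "http://127.0.0.1:5000/predict",
    data=payload.encode("utf-8"),
    headers={"Content-Type": "text/csv"},
    method="POST",
)
```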