BentoML Example: Keras Text Classification

BentoML is an open source platform for machine learning model serving and deployment.

This notebook demonstrates how to use BentoML to turn a Keras model into a docker image containing a REST API server serving this model, how to use your ML service built with BentoML as a CLI tool, and how to distribute it a pypi package.

This notebook is built based on Keras's IMDB LSTM tutorial here.

Impression

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
In [ ]:
!pip install bentoml
!pip install tensorflow==1.14.0
!pip install numpy
In [2]:
from __future__ import absolute_import, division, print_function

import numpy as np
import tensorflow as tf
print("Tensorflow Version: %s" % tf.__version__)

from tensorflow import keras
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding
from tensorflow.keras.layers import LSTM
from tensorflow.keras.datasets import imdb

import bentoml
print("BentoML Version: %s" % bentoml.__version__)
Tensorflow Version: 1.14.0
BentoML Version: 0.4.4
In [3]:
max_features = 1000
maxlen = 80 # cut texts after this number of words (among top max_features most common words)
batch_size = 300
index_from=3 # word index offset

Prepare Dataset

Download the IMDB dataset

In [4]:
# A dictionary mapping words to an integer index
imdb.load_data(num_words=max_features)
word_index = imdb.get_word_index()

# The first indices are reserved
word_index = {k:(v+index_from) for k,v in word_index.items()} 
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2  # unknown

# Use decode_review to look at original review text in training/testing data
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
def decode_review(encoded_text):
    return ' '.join([reverse_word_index.get(i, '?') for i in encoded_text])
In [5]:
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features, index_from=index_from)
In [6]:
x_train = sequence.pad_sequences(x_train,
                                 value=word_index["<PAD>"],
                                 padding='post',
                                 maxlen=maxlen)

x_test = sequence.pad_sequences(x_test,
                                value=word_index["<PAD>"],
                                padding='post',
                                maxlen=maxlen)

Model Training & Evaluation

In [7]:
model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))

model.summary()
WARNING: Logging before flag parsing goes to stderr.
W1014 17:10:04.254548 4541836608 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/keras/initializers.py:119: calling RandomUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:10:04.273470 4541836608 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, None, 128)         128000    
_________________________________________________________________
lstm (LSTM)                  (None, 128)               131584    
_________________________________________________________________
dense (Dense)                (None, 1)                 129       
=================================================================
Total params: 259,713
Trainable params: 259,713
Non-trainable params: 0
_________________________________________________________________
In [8]:
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=1, # for demo purpose :P
          validation_data=(x_test, y_test))
W1014 17:10:04.728079 4541836608 deprecation.py:323] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Train on 25000 samples, validate on 25000 samples
25000/25000 [==============================] - 35s 1ms/sample - loss: 0.5724 - acc: 0.6998 - val_loss: 0.4532 - val_acc: 0.7958
Out[8]:
<tensorflow.python.keras.callbacks.History at 0x1a38940240>
In [9]:
score, acc = model.evaluate(x_test, y_test,
                            batch_size=batch_size)

print('Test score:', score)
print('Test accuracy:', acc)
25000/25000 [==============================] - 8s 305us/sample - loss: 0.4532 - acc: 0.7958
Test score: 0.453183061003685
Test accuracy: 0.79584

Define BentoService for model serving

In [10]:
%%writefile text_classification_service.py
import pandas as pd
import numpy as np
from tensorflow import keras
from tensorflow.keras.preprocessing import sequence, text
from bentoml import api, env, BentoService, artifacts
from bentoml.artifact import KerasModelArtifact, PickleArtifact
from bentoml.handlers import JsonHandler

max_features = 1000

@artifacts([
    KerasModelArtifact('model'),
    PickleArtifact('word_index')
])
@env(pip_dependencies=['tensorflow==1.14.0', 'numpy', 'pandas'])
class TextClassificationService(BentoService):
   
    def word_to_index(self, word):
        if word in self.artifacts.word_index and self.artifacts.word_index[word] <= max_features:
            return self.artifacts.word_index[word]
        else:
            return self.artifacts.word_index["<UNK>"]
    
    def preprocessing(self, text_str):
        sequence = text.text_to_word_sequence(text_str)
        return list(map(self.word_to_index, sequence))
    
    @api(JsonHandler)
    def predict(self, parsed_json):
        if type(parsed_json) == list:
            input_data = list(map(self.preprocessing, parsed_json))
        else: # expecting type(parsed_json) == dict:
            input_data = [self.preprocessing(parsed_json['text'])]

        input_data = sequence.pad_sequences(input_data,
                                            value=self.artifacts.word_index["<PAD>"],
                                            padding='post',
                                            maxlen=80)

        return self.artifacts.model.predict_classes(input_data)
Overwriting text_classification_service.py

Save BentoService to file archive

In [11]:
# 1) import the custom BentoService defined above
from text_classification_service import TextClassificationService

# 2) `pack` it with required artifacts
bento_svc = TextClassificationService.pack(model=model, word_index=word_index)

# 3) save your BentoSerivce
saved_path = bento_svc.save()
W1014 17:10:48.681390 4541836608 deprecation_wrapper.py:119] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/bentoml/artifact/keras_model_artifact.py:90: The name tf.keras.backend.get_session is deprecated. Please use tf.compat.v1.keras.backend.get_session instead.

[2019-10-14 17:11:10,915] INFO - Successfully saved Bento 'TextClassificationService:20191014171048_7AB28F' to path: /Users/chaoyuyang/bentoml/repository/TextClassificationService/20191014171048_7AB28F

Test packed BentoML service

In [12]:
bento_svc.predict({ 'text': 'bad worst terrible' })
Out[12]:
array([[0]], dtype=int32)
In [13]:
bento_svc.predict(['the best movie I have ever seen', 'This is a bad movie'])
Out[13]:
array([[1],
       [1]], dtype=int32)

Load BentoML Service from archive

In [14]:
import bentoml

loaded_bento_svc = bentoml.load(saved_path)
[2019-10-14 17:11:11,437] WARNING - Module `text_classification_service` already loaded, using existing imported module.
W1014 17:11:11.451318 4541836608 deprecation_wrapper.py:119] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/bentoml/artifact/keras_model_artifact.py:104: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.

W1014 17:11:11.467543 4541836608 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling GlorotUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:11:11.468544 4541836608 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling Orthogonal.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:11:11.469242 4541836608 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
In [15]:
loaded_bento_svc.predict({ "text": "the best movie I have ever seen" })
Out[15]:
array([[1]], dtype=int32)
In [16]:
loaded_bento_svc.predict(['the best movie I have ever seen', 'This is a bad movie'])
Out[16]:
array([[1],
       [1]], dtype=int32)

Run REST API server locally

A saved BentoML service archive can be loaded as a REST API server with bentoml cli:

In [17]:
!bentoml serve {saved_path}
2019-10-14 17:11:20.920271: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
WARNING: Logging before flag parsing goes to stderr.
W1014 17:11:20.920843 4630154560 deprecation_wrapper.py:119] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/bentoml/artifact/keras_model_artifact.py:104: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.

W1014 17:11:20.923239 4630154560 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/keras/initializers.py:119: calling RandomUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:11:20.936422 4630154560 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling GlorotUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:11:20.936594 4630154560 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:11:20.936901 4630154560 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling Orthogonal.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:11:20.937156 4630154560 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:11:21.365413 4630154560 deprecation.py:323] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W1014 17:11:22.301192 4630154560 deprecation_wrapper.py:119] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/bentoml/artifact/keras_model_artifact.py:90: The name tf.keras.backend.get_session is deprecated. Please use tf.compat.v1.keras.backend.get_session instead.

 * Serving Flask app "TextClassificationService" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
I1014 17:11:22.737357 4630154560 _internal.py:122]  * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
I1014 17:11:49.733807 123145487081472 _internal.py:122] 127.0.0.1 - - [14/Oct/2019 17:11:49] "GET / HTTP/1.1" 200 -
I1014 17:11:49.805727 123145487081472 _internal.py:122] 127.0.0.1 - - [14/Oct/2019 17:11:49] "GET /static/swagger-ui.css HTTP/1.1" 200 -
I1014 17:11:49.806001 123145492336640 _internal.py:122] 127.0.0.1 - - [14/Oct/2019 17:11:49] "GET /static/swagger-ui-bundle.js HTTP/1.1" 200 -
I1014 17:11:50.272161 123145487081472 _internal.py:122] 127.0.0.1 - - [14/Oct/2019 17:11:50] "GET /docs.json HTTP/1.1" 200 -
I1014 17:11:50.437061 123145487081472 _internal.py:122] 127.0.0.1 - - [14/Oct/2019 17:11:50] "GET /favicon.ico HTTP/1.1" 404 -
I1014 17:12:00.931077 123145487081472 _internal.py:122] 127.0.0.1 - - [14/Oct/2019 17:12:00] "POST /predict HTTP/1.1" 200 -
^C

Send prediction request to REST API server

Run the following command in terminal to make a HTTP request to the API server

curl -i \
--header "Content-Type: application/json" \
--request POST \
--data '{"text": "best movie ever"}' \
localhost:5000/predict

"pip install" a BentoML archive

BentoML user can directly pip install saved BentoML archive with pip install $SAVED_PATH, and use it as a regular python package.

In [18]:
!pip install {saved_path}
Processing /Users/chaoyuyang/bentoml/repository/TextClassificationService/20191014171048_7AB28F
Building wheels for collected packages: TextClassificationService
  Building wheel for TextClassificationService (setup.py) ... done
  Stored in directory: /private/var/folders/ns/vc9qhmqx5dx_9fws7d869lqh0000gn/T/pip-ephem-wheel-cache-1qogzaiz/wheels/9e/61/f6/f72c31447059fe65311fb4a124d078c2056a8f48fff00306e3
Successfully built TextClassificationService
Installing collected packages: TextClassificationService
  Found existing installation: TextClassificationService 20191014161633-179F7E
    Uninstalling TextClassificationService-20191014161633-179F7E:
      Successfully uninstalled TextClassificationService-20191014161633-179F7E
Successfully installed TextClassificationService-20191014171048-7AB28F
In [19]:
import TextClassificationService

installed_svc = TextClassificationService.load()
In [20]:
installed_svc.predict({ 'text': 'the best movie I have ever seen' })
Out[20]:
array([[1]], dtype=int32)
In [21]:
installed_svc.predict({ 'text': 'This is a bad movie' })
Out[21]:
array([[1]], dtype=int32)

CLI access

pip install $SAVED_PATH also installs a CLI tool for accessing the BentoML service

In [ ]:
!TextClassificationService --help
In [22]:
!TextClassificationService info
{
  "name": "TextClassificationService",
  "version": "20191014171048_7AB28F",
  "created_at": "2019-10-15T00:11:10.905089Z",
  "env": {
    "conda_env": "name: bentoml-custom-conda-env\nchannels:\n- defaults\ndependencies:\n- python=3.7.3\n- pip\n- pip:\n  - bentoml[api_server]==0.4.4\n",
    "pip_dependencies": "bentoml==0.4.4\ntensorflow==1.14.0\nnumpy\npandas",
    "python_version": "3.7.3"
  },
  "artifacts": [
    {
      "name": "model",
      "artifact_type": "KerasModelArtifact"
    },
    {
      "name": "word_index",
      "artifact_type": "PickleArtifact"
    }
  ],
  "apis": [
    {
      "name": "predict",
      "handler_type": "JsonHandler",
      "docs": "BentoML generated API endpoint"
    }
  ]
}

Run 'predict' api with json data:

In [23]:
!TextClassificationService predict --input='{"text": "bad movie"}'
2019-10-14 17:12:28.122621: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
WARNING: Logging before flag parsing goes to stderr.
W1014 17:12:28.123153 4740504896 deprecation_wrapper.py:119] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/bentoml/artifact/keras_model_artifact.py:104: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.

W1014 17:12:28.125826 4740504896 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/keras/initializers.py:119: calling RandomUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:12:28.142730 4740504896 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling GlorotUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:12:28.143008 4740504896 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:12:28.143602 4740504896 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling Orthogonal.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:12:28.144154 4740504896 deprecation.py:506] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:97: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W1014 17:12:28.613065 4740504896 deprecation.py:323] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W1014 17:12:29.547184 4740504896 deprecation_wrapper.py:119] From /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages/bentoml/artifact/keras_model_artifact.py:90: The name tf.keras.backend.get_session is deprecated. Please use tf.compat.v1.keras.backend.get_session instead.

[[1]]