BentoML makes moving trained ML models to production easy:
BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between Data Science and DevOps, enabling teams to deliver prediction services in a fast, repeatable, and scalable way. Before reading this example project, be sure to check out the Getting Started guide to learn the basic concepts in BentoML.
This notebook demonstrates how to use BentoML to turn a PyTorch model into a Docker image containing a REST API server serving the model, how to use your ML service built with BentoML as a CLI tool, and how to distribute it as a PyPI package.
This example is based on https://github.com/baldassarreFe/zalando-pytorch/blob/master/notebooks/4.0-fb-autoencoder.ipynb; if you are already familiar with it, jump ahead to Model Serving using BentoML.
%reload_ext autoreload
%autoreload 2
%matplotlib inline
!pip install -q bentoml "torch==1.6.0" "torchvision==0.7.0" "scikit-learn>=0.23.2" "pillow==7.2.0" "pandas>=1.1.1" "numpy>=1.16.0"
import bentoml
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms
from torch.autograd import Variable
from sklearn.manifold import TSNE
from sklearn.metrics import accuracy_score
print("Torch version: ", torch.__version__)
print("CUDA: ", torch.cuda.is_available())
Torch version: 1.4.0 CUDA: True
torchvision supports FashionMNIST now, so we can import it directly.
from torchvision.datasets import FashionMNIST
FASHION_MNIST_CLASSES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
Load train and test set in batches of 1000.
The 28x28 images are scaled up to 29x29 so that combining convolutions and transposed convolutions does not chop off pixels from the reconstructed images.
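To see why 29x29 works where 28x28 would not, we can sketch the size arithmetic (this check is not part of the original notebook): a convolution with no padding, kernel k, and stride s maps size n to floor((n - k) / s) + 1.

```python
def conv_out(n, kernel=5, stride=2):
    # output size of a convolution with no padding
    return (n - kernel) // stride + 1

# 29 reduces cleanly to 1x1 through the three conv layers
sizes = [29]
for _ in range(3):
    sizes.append(conv_out(sizes[-1]))
print(sizes)  # [29, 13, 5, 1]

# 28 does not: 28 -> 12 already discards a pixel (23 positions do not
# divide evenly by stride 2), so the decoder could not rebuild the
# original size exactly
sizes28 = [28]
for _ in range(3):
    sizes28.append(conv_out(sizes28[-1]))
print(sizes28)  # [28, 12, 4, 0]
```

This is exactly the reduction annotated in the Encoder's forward pass below (1x29x29 -> 10x13x13 -> 20x5x5 -> 40x1x1).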
batch_size = 1000
train_dataset = FashionMNIST(
    '../data', train=True, download=True,
    transform=transforms.Compose([transforms.CenterCrop((29, 29)), transforms.ToTensor()]))
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_dataset = FashionMNIST(
    '../data', train=False, download=True,
    transform=transforms.Compose([transforms.CenterCrop((29, 29)), transforms.ToTensor()]))
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=True)
Note that in this section we never use the image labels; the whole training is unsupervised.
The two components of the autoencoder are defined by subclassing nn.Module, which gives more flexibility than nn.Sequential.
A series of convolutions with kernel_size=5 and stride=2 squeezes the images into a volume of 40x1x1; a fully connected layer then turns this into a vector of size embedding_size, which can be specified externally.
The decoder picks up where the encoder left off, first transforming the embedding of size embedding_size back into a volume of size 40x1x1, then applying a series of transposed convolutions to yield an image of the same size as the original input.
(At this point we could also display some sample images from the DataLoader.)
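As a quick sanity check on the decoder side (again, not part of the original notebook), a transposed convolution with no padding maps size n to (n - 1) * s + k, which exactly inverts the encoder's reduction:

```python
def tconv_out(n, kernel=5, stride=2):
    # output size of a transposed convolution with no padding / output_padding
    return (n - 1) * stride + kernel

# growing back from the 40x1x1 bottleneck
sizes = [1]
for _ in range(3):
    sizes.append(tconv_out(sizes[-1]))
print(sizes)  # [1, 5, 13, 29] -- mirrors the encoder's 29 -> 13 -> 5 -> 1
```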
class Encoder(nn.Module):
    def __init__(self, embedding_size):
        super(Encoder, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5, stride=2)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5, stride=2)
        self.conv3 = nn.Conv2d(20, 40, kernel_size=5, stride=2)
        self.fully = nn.Linear(40, embedding_size)

    def forward(self, x):
        # 1x29x29
        x = torch.relu(self.conv1(x))
        # 10x13x13
        x = torch.relu(self.conv2(x))
        # 20x5x5
        x = torch.relu(self.conv3(x))
        # 40x1x1
        x = x.view(x.data.shape[0], 40)
        # 40
        x = self.fully(x)
        # embedding_size
        return x

class Decoder(nn.Module):
    def __init__(self, input_size):
        super(Decoder, self).__init__()
        self.fully = nn.Linear(input_size, 40)
        self.conv1 = nn.ConvTranspose2d(40, 20, kernel_size=5, stride=2)
        self.conv2 = nn.ConvTranspose2d(20, 10, kernel_size=5, stride=2)
        self.conv3 = nn.ConvTranspose2d(10, 1, kernel_size=5, stride=2)

    def forward(self, x):
        x = self.fully(x)
        x = x.view(x.data.shape[0], 40, 1, 1)
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = torch.sigmoid(self.conv3(x))
        return x
We are going to use an embedding size of 20. There is no particular reason for this number, except that it is in the same range as the number of classes. Naively, the network could learn to encode coarse-grained information (i.e. the kind of dress) in half of the embedding vector and then use the other half for fine-grained information.
embedding_size = 20
encoder = Encoder(embedding_size)
decoder = Decoder(embedding_size)
autoencoder = nn.Sequential(encoder, decoder)
A 29x29 black-and-white image passed through the autoencoder should produce an output of the same dimensions:
x = Variable(torch.ones(1, 1, 29, 29))
e = encoder(x)
d = decoder(e)
print('Input\t ', list(x.data.shape))
print('Embedding', list(e.data.shape))
print('Output\t ', list(d.data.shape))
Input [1, 1, 29, 29] Embedding [1, 20] Output [1, 1, 29, 29]
autoencoder.train()
loss_fn = nn.MSELoss()
optimizer = optim.Adam(autoencoder.parameters())
epoch_loss = []
for epoch in range(5):
    batch_loss = []
    for batch_num, (data, _) in enumerate(train_loader):
        data = Variable(data)
        optimizer.zero_grad()
        output = autoencoder(data)
        loss = loss_fn(output, data)
        loss.backward()
        optimizer.step()
        batch_loss.append(loss.item())
    epoch_loss.append(sum(batch_loss) / len(batch_loss))
    print('Epoch {}:\tloss {:.4f}'.format(epoch, epoch_loss[-1]))
Epoch 0: loss 0.1321 Epoch 1: loss 0.0710 Epoch 2: loss 0.0438 Epoch 3: loss 0.0373 Epoch 4: loss 0.0321
plt.plot(epoch_loss)
plt.title('Final value {:.4f}'.format(epoch_loss[-1]))
plt.xlabel('Epoch')
plt.grid(True)
Reconstruction evaluation on a single batch
autoencoder.eval()
data, targets = next(iter(test_loader))
encodings = encoder(Variable(data))
outputs = decoder(encodings)
print('Test loss: {:.4f}'.format(loss_fn(outputs, Variable(data)).item()))
Test loss: 0.0295
fig, axes = plt.subplots(8, 8, figsize=(16, 16))
axes = axes.ravel()
zip_these = axes[::2], axes[1::2], data.numpy().squeeze(), outputs.data.numpy().squeeze(), targets
for ax1, ax2, original, reconstructed, target in zip(*zip_these):
    ax1.imshow(original, cmap='gray')
    ax1.axis('off')
    ax1.set_title(FASHION_MNIST_CLASSES[target])
    ax2.imshow(reconstructed, cmap='gray')
    ax2.axis('off')
The embeddings are 20-dimensional; t-SNE is used to visualize them as clusters in 2D space.
Even though the autoencoder learned the embeddings in a completely unsupervised way, we can observe the emergence of clusters:
tsne = TSNE(n_components=2)
encodings_2 = tsne.fit_transform(encodings.data.numpy())
plt.figure(figsize=(10, 10))
for k in range(len(FASHION_MNIST_CLASSES)):
    class_indexes = (targets.numpy() == k)
    plt.scatter(encodings_2[class_indexes, 0], encodings_2[class_indexes, 1], label=FASHION_MNIST_CLASSES[k])
plt.legend();
Once trained in an unsupervised fashion, the encoder module can be used to generate fashion embeddings (see what I did there?), which can then be used to train a simple classifier on the original labels.
The weights of the encoder are frozen, so only the classifier will be trained.
(Later on, when the classifier starts performing decently, we could unfreeze them and do some fine-tuning.)
for param in encoder.parameters():
    param.requires_grad = False
classifier = nn.Sequential(
    encoder,
    nn.Linear(embedding_size, 15),
    nn.ReLU(),
    nn.Linear(15, len(FASHION_MNIST_CLASSES)),
    nn.LogSoftmax(dim=1)  # explicit dim avoids the deprecation warning
)
classifier.train()
loss_fn = nn.NLLLoss()
optimizer = optim.Adam([p for p in classifier.parameters() if p.requires_grad])
epoch_loss = []
for epoch in range(5):
    batch_loss = []
    for batch_num, (data, targets) in enumerate(train_loader):
        data, targets = Variable(data), Variable(targets)
        optimizer.zero_grad()
        output = classifier(data)
        loss = loss_fn(output, targets)
        loss.backward()
        optimizer.step()
        batch_loss.append(loss.item())
    epoch_loss.append(sum(batch_loss) / len(batch_loss))
    accuracy = accuracy_score(targets.data.numpy(), output.data.numpy().argmax(axis=1))
    print('Epoch {}:\tloss {:.4f}\taccuracy {:.2%}'.format(epoch, epoch_loss[-1], accuracy))
/opt/conda/envs/bentoml-dev-py36/lib/python3.6/site-packages/torch/nn/modules/container.py:100: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument. input = module(input)
Epoch 0: loss 2.3179 accuracy 45.70% Epoch 1: loss 1.2234 accuracy 59.00% Epoch 2: loss 1.0394 accuracy 62.90% Epoch 3: loss 0.9558 accuracy 63.00% Epoch 4: loss 0.9095 accuracy 68.10%
plt.plot(epoch_loss)
plt.title('Final value {:.4f}'.format(epoch_loss[-1]))
plt.xlabel('Epoch')
plt.grid(True)
Classification evaluation on a single batch
classifier.eval()
data, targets = next(iter(test_loader))
outputs = classifier(Variable(data))
log_probs, output_classes = outputs.max(dim=1)
accuracy = accuracy_score(targets.numpy(), output_classes.data.numpy())
print('Accuracy: {:.2%}'.format(accuracy))
Accuracy: 64.70%
/opt/conda/envs/bentoml-dev-py36/lib/python3.6/site-packages/torch/nn/modules/container.py:100: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument. input = module(input)
fig, axes = plt.subplots(8, 8, figsize=(16, 16))
zip_these = axes.ravel(), log_probs.data.exp(), output_classes.data, targets, data.numpy().squeeze()
for ax, prob, output_class, target, img in zip(*zip_these):
    ax.imshow(img, cmap='gray' if output_class == target else 'autumn')
    ax.axis('off')
    ax.set_title('{} {:.1%}'.format(FASHION_MNIST_CLASSES[output_class], prob))
%%writefile pytorch_fashion_mnist.py
from typing import BinaryIO, List

import bentoml
from PIL import Image
import torch
from torchvision import transforms

from bentoml.frameworks.pytorch import PytorchModelArtifact
from bentoml.adapters import FileInput, JsonOutput

FASHION_MNIST_CLASSES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
                         'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

@bentoml.env(pip_packages=['torch', 'numpy', 'torchvision', 'scikit-learn'])
@bentoml.artifacts([PytorchModelArtifact('classifier')])
class PyTorchFashionClassifier(bentoml.BentoService):

    @bentoml.utils.cached_property  # reuse the transformer across requests
    def transform(self):
        return transforms.Compose([transforms.CenterCrop((29, 29)), transforms.ToTensor()])

    @bentoml.api(input=FileInput(), output=JsonOutput(), batch=True)
    def predict(self, file_streams: List[BinaryIO]) -> List[str]:
        img_tensors = []
        for fs in file_streams:
            img = Image.open(fs).convert(mode="L").resize((28, 28))
            img_tensors.append(self.transform(img))
        outputs = self.artifacts.classifier(torch.stack(img_tensors))
        _, output_classes = outputs.max(dim=1)
        return [FASHION_MNIST_CLASSES[output_class] for output_class in output_classes]
Overwriting pytorch_fashion_mnist.py
# 1) import the custom BentoService defined above
from pytorch_fashion_mnist import PyTorchFashionClassifier
# 2) `pack` it with required artifacts
bento_svc = PyTorchFashionClassifier()
bento_svc.pack('classifier', classifier)
# 3) save your BentoService
saved_path = bento_svc.save()
[2020-09-23 11:40:29,693] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle. [2020-09-23 11:40:29,733] WARNING - BentoML by default does not include spacy and torchvision package when using PytorchModelArtifact. To make sure BentoML bundle those packages if they are required for your model, either import those packages in BentoService definition file or manually add them via `@env(pip_packages=['torchvision'])` when defining a BentoService [2020-09-23 11:40:29,734] WARNING - pip package requirement torch already exist [2020-09-23 11:40:31,270] INFO - Detected non-PyPI-released BentoML installed, copying local BentoML modulefiles to target saved bundle path..
/opt/conda/envs/bentoml-dev-py36/lib/python3.6/site-packages/setuptools/distutils_patch.py:26: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first. "Distutils was imported before Setuptools. This usage is discouraged " /opt/conda/envs/bentoml-dev-py36/lib/python3.6/site-packages/setuptools/dist.py:458: UserWarning: Normalizing '0.9.0.pre+7.g8af1c8b' to '0.9.0rc0+7.g8af1c8b' warnings.warn(tmpl.format(**locals())) warning: no previously-included files matching '*~' found anywhere in distribution warning: no previously-included files matching '*.pyo' found anywhere in distribution warning: no previously-included files matching '.git' found anywhere in distribution warning: no previously-included files matching '.ipynb_checkpoints' found anywhere in distribution warning: no previously-included files matching '__pycache__' found anywhere in distribution warning: no directories found matching 'bentoml/yatai/web/dist' no previously-included directories found matching 'e2e_tests' no previously-included directories found matching 'tests' no previously-included directories found matching 'benchmark'
UPDATING BentoML-0.9.0rc0+7.g8af1c8b/bentoml/_version.py set BentoML-0.9.0rc0+7.g8af1c8b/bentoml/_version.py to '0.9.0.pre+7.g8af1c8b' [2020-09-23 11:40:32,018] INFO - BentoService bundle 'PyTorchFashionClassifier:20200923114030_0CC108' saved to: /home/bentoml/bentoml/repository/PyTorchFashionClassifier/20200923114030_0CC108
To start a REST API model server with the BentoService saved above, use the bentoml serve command:
!bentoml serve PyTorchFashionClassifier:latest
[2020-09-23 11:42:36,540] INFO - Getting latest version PyTorchFashionClassifier:20200923114030_0CC108 [2020-09-23 11:42:36,541] INFO - Starting BentoML API server in development mode.. [2020-09-23 11:42:37,708] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle. [2020-09-23 11:42:37,722] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.9.0.pre, but loading from BentoML version 0.9.0.pre+7.g8af1c8b [2020-09-23 11:42:38,187] WARNING - BentoML by default does not include spacy and torchvision package when using PytorchModelArtifact. To make sure BentoML bundle those packages if they are required for your model, either import those packages in BentoService definition file or manually add them via `@env(pip_packages=['torchvision'])` when defining a BentoService [2020-09-23 11:42:38,188] WARNING - pip package requirement torch already exist * Serving Flask app "PyTorchFashionClassifier" (lazy loading) * Environment: production WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead. * Debug mode: off * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit) /opt/conda/envs/bentoml-dev-py36/lib/python3.6/site-packages/torch/nn/modules/container.py:100: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument. 
input = module(input) [2020-09-23 11:42:50,983] INFO - {'service_name': 'PyTorchFashionClassifier', 'service_version': '20200923114030_0CC108', 'api': 'predict', 'task': {'data': {'name': 'sample_image.png'}, 'task_id': 'bf7a3fda-4130-47b5-9ce7-b9a9baabcfec', 'http_headers': (('Host', '127.0.0.1:5000'), ('User-Agent', 'curl/7.72.0'), ('Content-Length', '133877'), ('Accept', '*/*'), ('Content-Type', 'multipart/form-data; boundary=------------------------2302fdd2ee5a02cc'))}, 'result': {'data': '"Ankle boot"', 'http_status': 200, 'http_headers': (('Content-Type', 'application/json'),)}, 'request_id': 'bf7a3fda-4130-47b5-9ce7-b9a9baabcfec'} 127.0.0.1 - - [23/Sep/2020 11:42:50] "POST /predict HTTP/1.1" 200 - ^C
If you are running this notebook from Google Colab, you can start the dev server with the --run-with-ngrok option to access the API via a public URL managed by ngrok:
!bentoml serve PyTorchFashionClassifier:latest --run-with-ngrok
[2020-09-23 11:41:46,832] INFO - Getting latest version PyTorchFashionClassifier:20200923114030_0CC108 [2020-09-23 11:41:46,833] INFO - Starting BentoML API server in development mode.. [2020-09-23 11:41:47,995] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle. [2020-09-23 11:41:48,010] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.9.0.pre, but loading from BentoML version 0.9.0.pre+7.g8af1c8b [2020-09-23 11:41:48,473] WARNING - BentoML by default does not include spacy and torchvision package when using PytorchModelArtifact. To make sure BentoML bundle those packages if they are required for your model, either import those packages in BentoService definition file or manually add them via `@env(pip_packages=['torchvision'])` when defining a BentoService [2020-09-23 11:41:48,474] WARNING - pip package requirement torch already exist * Serving Flask app "PyTorchFashionClassifier" (lazy loading) * Environment: production WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead. * Debug mode: off * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit) 7=ngrok by @inconshreveable (Ctrl+C to quit) Session Status connecting Version 2.3.35 Region United States (us) Web Interface http://127.0.0.1:4040 Connections ttl opn rt1 rt5 p50 p90 0 0 0.00 0.00 0.00 0.00 /opt/conda/envs/bentoml-dev-py36/lib/python3.6/site-packages/torch/nn/modules/container.py:100: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument. 
input = module(input) [2020-09-23 11:42:00,516] INFO - {'service_name': 'PyTorchFashionClassifier', 'service_version': '20200923114030_0CC108', 'api': 'predict', 'task': {'data': {'name': 'sample_image.png'}, 'task_id': 'a5ca2ed5-1a5f-444f-a118-ba6f9217db9e', 'http_headers': (('Host', '127.0.0.1:5000'), ('User-Agent', 'curl/7.72.0'), ('Content-Length', '133877'), ('Accept', '*/*'), ('Content-Type', 'multipart/form-data; boundary=------------------------502072ce3c68f738'))}, 'result': {'data': '"Ankle boot"', 'http_status': 200, 'http_headers': (('Content-Type', 'application/json'),)}, 'request_id': 'a5ca2ed5-1a5f-444f-a118-ba6f9217db9e'} 127.0.0.1 - - [23/Sep/2020 11:42:00] "POST /predict HTTP/1.1" 200 - Exception in thread Thread-1: Traceback (most recent call last): File "/opt/conda/envs/bentoml-dev-py36/lib/python3.6/threading.py", line 916, in _bootstrap_inner self.run() File "/opt/conda/envs/bentoml-dev-py36/lib/python3.6/threading.py", line 1182, in run self.function(*self.args, **self.kwargs) File "/home/bentoml/BentoML/bentoml/utils/flask_ngrok.py", line 90, in start_ngrok ngrok_address = _run_ngrok(port) File "/home/bentoml/BentoML/bentoml/utils/flask_ngrok.py", line 47, in _run_ngrok tunnel_url = j['tunnels'][0]['public_url'] # Do the parsing of the get IndexError: list index out of range Session Status online Session Expires 7 hours, 59 minutes Version 2.3.35 Region United States (us) Web Interface http://127.0.0.1:4040 Forwarding https://ab6aa3a6575b.ngrok.io -> http://localhost:5000 [2020-09-23 11:42:02,526] INFO - {'service_name': 'PyTorchFashionClassifier', 'service_version': '20200923114030_0CC108', 'api': 'predict', 'task': {'data': {'name': 'sample_image.png'}, 'task_id': 'e591c974-f161-4bf2-8d4f-22ea8b0b7f80', 'http_headers': (('Host', '127.0.0.1:5000'), ('User-Agent', 'curl/7.72.0'), ('Content-Length', '133877'), ('Accept', '*/*'), ('Content-Type', 'multipart/form-data; boundary=------------------------2577acc72fed9629'))}, 'result': {'data': '"Ankle boot"', 'http_status': 200, 'http_headers': (('Content-Type', 'application/json'),)}, 'request_id': 'e591c974-f161-4bf2-8d4f-22ea8b0b7f80'} 127.0.0.1 - - [23/Sep/2020 11:42:02] "POST /predict HTTP/1.1" 200 -
Sending a POST request from the terminal:
curl -X POST "http://127.0.0.1:5000/predict" -F image=@sample_image.png
curl -X POST "http://127.0.0.1:5000/predict" -H "Content-Type: image/png" --data-binary @sample_image.png
Alternatively, visit http://127.0.0.1:5000/ from your browser, then click /predict -> Try it out -> Choose File -> Execute to submit an image from your computer.
One common way of distributing this model API server for production deployment is via Docker containers, and BentoML provides a convenient way to do that.
Note that Docker is not available in Google Colab. You will need to download and run this notebook locally to try out this containerization feature.
If you already have Docker configured, simply run the following command to produce a Docker container serving the PyTorchFashionClassifier prediction service created above:
!bentoml containerize PyTorchFashionClassifier:latest -t pytorch-fashion-mnist:latest
[2020-09-23 11:48:08,611] INFO - Getting latest version PyTorchFashionClassifier:20200923114030_0CC108
Found Bento: /home/bentoml/bentoml/repository/PyTorchFashionClassifier/20200923114030_0CC108
[2020-09-23 11:48:08,628] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle.
[2020-09-23 11:48:08,642] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.9.0.pre, but loading from BentoML version 0.9.0.pre+7.g8af1c8b
Building Docker image pytorch-fashion-mnist:latest from PyTorchFashionClassifier:latest
Step 1/15 : FROM bentoml/model-server:0.9.0.pre-py36
---> 4aac43d10e50
Step 2/15 : ARG EXTRA_PIP_INSTALL_ARGS=
---> Using cache
---> 790054f5ad85
Step 3/15 : ENV EXTRA_PIP_INSTALL_ARGS $EXTRA_PIP_INSTALL_ARGS
---> Using cache
---> 85b0a1b40542
Step 4/15 : COPY environment.yml requirements.txt setup.sh* bentoml-init.sh python_version* /bento/
---> Using cache
---> 402d59d511dd
Step 5/15 : WORKDIR /bento
---> Using cache
---> 1c1ac445d3fb
Step 6/15 : RUN chmod +x /bento/bentoml-init.sh
---> Using cache
---> 8a52a24d4cce
Step 7/15 : RUN if [ -f /bento/bentoml-init.sh ]; then bash -c /bento/bentoml-init.sh; fi
---> Using cache
---> 4b5cabf794af
Step 8/15 : COPY . /bento
---> Using cache
---> 28d18e3337dd
Step 9/15 : RUN if [ -d /bento/bundled_pip_dependencies ]; then pip install -U bundled_pip_dependencies/* ;fi
---> Using cache
---> e9298aab0108
Step 10/15 : ENV PORT 5000
---> Using cache
---> 6198b75aecbb
Step 11/15 : EXPOSE $PORT
---> Using cache
---> ca4e3a6bf3e6
Step 12/15 : COPY docker-entrypoint.sh /usr/local/bin/
---> Using cache
---> 62bb9d1d6295
Step 13/15 : RUN chmod +x /usr/local/bin/docker-entrypoint.sh
---> Using cache
---> 8643f9e6b6f5
Step 14/15 : ENTRYPOINT [ "docker-entrypoint.sh" ]
---> Using cache
---> 6d0bdfc6d739
Step 15/15 : CMD ["bentoml", "serve-gunicorn", "/bento"]
---> Using cache
---> 9c43cb9fcf07
Successfully built 9c43cb9fcf07
Successfully tagged pytorch-fashion-mnist:latest
Finished building pytorch-fashion-mnist:latest from PyTorchFashionClassifier:latest
!docker run -p 5000:5000 pytorch-fashion-mnist
[2020-09-23 03:48:13,508] INFO - Starting BentoML API server in production mode.. [2020-09-23 03:48:13,728] INFO - get_gunicorn_num_of_workers: 3, calculated by cpu count [2020-09-23 03:48:13 +0000] [1] [INFO] Starting gunicorn 20.0.4 [2020-09-23 03:48:13 +0000] [1] [INFO] Listening at: http://0.0.0.0:5000 (1) [2020-09-23 03:48:13 +0000] [1] [INFO] Using worker: sync [2020-09-23 03:48:13 +0000] [12] [INFO] Booting worker with pid: 12 [2020-09-23 03:48:13 +0000] [13] [INFO] Booting worker with pid: 13 [2020-09-23 03:48:13 +0000] [14] [INFO] Booting worker with pid: 14 [2020-09-23 03:48:14,747] WARNING - Using BentoML not from official PyPI release. In order to find the same version of BentoML when deploying your BentoService, you must set the 'core/bentoml_deploy_version' config to a http/git location of your BentoML fork, e.g.: 'bentoml_deploy_version = git+https://github.com/{username}/bentoml.git@{branch}' [2020-09-23 03:48:14,767] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.9.0.pre, but loading from BentoML version 0.9.0.pre+7.g8af1c8b [2020-09-23 03:48:14,767] WARNING - Saved BentoService Python version mismatch: loading BentoService bundle created with Python version 3.6.10, but current environment version is 3.6.12. [2020-09-23 03:48:14,773] WARNING - Using BentoML not from official PyPI release. 
In order to find the same version of BentoML when deploying your BentoService, you must set the 'core/bentoml_deploy_version' config to a http/git location of your BentoML fork, e.g.: 'bentoml_deploy_version = git+https://github.com/{username}/bentoml.git@{branch}' [2020-09-23 03:48:14,791] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.9.0.pre, but loading from BentoML version 0.9.0.pre+7.g8af1c8b [2020-09-23 03:48:14,791] WARNING - Saved BentoService Python version mismatch: loading BentoService bundle created with Python version 3.6.10, but current environment version is 3.6.12. [2020-09-23 03:48:14,864] WARNING - Using BentoML not from official PyPI release. In order to find the same version of BentoML when deploying your BentoService, you must set the 'core/bentoml_deploy_version' config to a http/git location of your BentoML fork, e.g.: 'bentoml_deploy_version = git+https://github.com/{username}/bentoml.git@{branch}' [2020-09-23 03:48:14,882] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.9.0.pre, but loading from BentoML version 0.9.0.pre+7.g8af1c8b [2020-09-23 03:48:14,883] WARNING - Saved BentoService Python version mismatch: loading BentoService bundle created with Python version 3.6.10, but current environment version is 3.6.12. [2020-09-23 03:48:15,102] WARNING - BentoML by default does not include spacy and torchvision package when using PytorchModelArtifact. To make sure BentoML bundle those packages if they are required for your model, either import those packages in BentoService definition file or manually add them via `@env(pip_packages=['torchvision'])` when defining a BentoService [2020-09-23 03:48:15,103] WARNING - pip package requirement torch already exist [2020-09-23 03:48:15,107] WARNING - BentoML by default does not include spacy and torchvision package when using PytorchModelArtifact. 
To make sure BentoML bundle those packages if they are required for your model, either import those packages in BentoService definition file or manually add them via `@env(pip_packages=['torchvision'])` when defining a BentoService [2020-09-23 03:48:15,107] WARNING - pip package requirement torch already exist [2020-09-23 03:48:15,209] WARNING - BentoML by default does not include spacy and torchvision package when using PytorchModelArtifact. To make sure BentoML bundle those packages if they are required for your model, either import those packages in BentoService definition file or manually add them via `@env(pip_packages=['torchvision'])` when defining a BentoService [2020-09-23 03:48:15,209] WARNING - pip package requirement torch already exist ^C [2020-09-23 03:49:53 +0000] [1] [INFO] Handling signal: int [2020-09-23 03:49:53 +0000] [12] [INFO] Worker exiting (pid: 12) [2020-09-23 03:49:53 +0000] [14] [INFO] Worker exiting (pid: 14) [2020-09-23 03:49:53 +0000] [13] [INFO] Worker exiting (pid: 13)
The BentoML CLI supports loading and running a packaged model directly. With the FileInput adapter used here, the predict command can read its input image from a local file passed via --input-file:
!bentoml run PyTorchFashionClassifier:latest predict --input-file sample_image.png
[2020-09-23 11:57:28,103] INFO - Getting latest version PyTorchFashionClassifier:20200923114030_0CC108 [2020-09-23 11:57:28,167] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle. [2020-09-23 11:57:28,182] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.9.0.pre, but loading from BentoML version 0.9.0.pre+7.g8af1c8b [2020-09-23 11:57:28,646] WARNING - BentoML by default does not include spacy and torchvision package when using PytorchModelArtifact. To make sure BentoML bundle those packages if they are required for your model, either import those packages in BentoService definition file or manually add them via `@env(pip_packages=['torchvision'])` when defining a BentoService [2020-09-23 11:57:28,647] WARNING - pip package requirement torch already exist /opt/conda/envs/bentoml-dev-py36/lib/python3.6/site-packages/torch/nn/modules/container.py:100: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument. input = module(input) [2020-09-23 11:57:28,711] INFO - {'service_name': 'PyTorchFashionClassifier', 'service_version': '20200923114030_0CC108', 'api': 'predict', 'task': {'data': {'uri': 'file:///home/bentoml/lab/gallery/pytorch/fashion-mnist/sample_image.png', 'name': 'sample_image.png'}, 'task_id': 'a02ae2b7-2fb7-4880-87ea-ac9f53878c1b', 'cli_args': ('--input-file', 'sample_image.png')}, 'result': {'data': '"Ankle boot"', 'http_status': 200, 'http_headers': (('Content-Type', 'application/json'),)}, 'request_id': 'a02ae2b7-2fb7-4880-87ea-ac9f53878c1b'} "Ankle boot"
If you are on a small team with limited engineering or DevOps resources, try out automated deployment with the BentoML CLI, currently supporting AWS Lambda, AWS SageMaker, and Azure Functions:
If the cloud platform you are working with is not on the list above, try out these step-by-step guides on manually deploying a BentoML packaged model to cloud platforms:
Lastly, if you have a DevOps or ML engineering team operating a Kubernetes or OpenShift cluster, use the following guides as references for implementing your deployment strategy: