BentoML Example

XGBoost League of legend Win prediction

This is a BentoML Demo Project demonstrating how to train a League of Legend win prdiction model, and use BentoML to package and serve the model for building applictions.

BentoML is an open source platform for machine learning model serving and deployment.

Example notebook built based on https://slundberg.github.io/shap/notebooks/League%20of%20Legends%20Win%20Prediction%20with%20XGBoost.html

Impression

In [ ]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

import warnings
warnings.filterwarnings("ignore")
In [1]:
!pip install bentoml
!pip install numpy xgboost sklearn matplotlib kaggle
Requirement already satisfied: bentoml in /Users/chaoyuyang/workspace/BentoML (0.3.1)
Requirement already satisfied: ruamel.yaml>=0.15.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (0.15.99)
Requirement already satisfied: numpy in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (1.16.4)
Requirement already satisfied: flask in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (1.1.1)
Requirement already satisfied: gunicorn in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (19.9.0)
Requirement already satisfied: six in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (1.12.0)
Requirement already satisfied: click in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (7.0)
Requirement already satisfied: pandas in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (0.24.2)
Requirement already satisfied: dill in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (0.3.0)
Requirement already satisfied: prometheus_client in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (0.7.1)
Requirement already satisfied: python-json-logger in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (0.1.11)
Requirement already satisfied: boto3 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (1.9.188)
Requirement already satisfied: pathlib2 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (2.3.4)
Requirement already satisfied: requests in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (2.22.0)
Requirement already satisfied: packaging in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (19.0)
Requirement already satisfied: docker in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (4.0.2)
Requirement already satisfied: configparser in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (3.7.4)
Requirement already satisfied: sqlalchemy in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (1.3.4)
Requirement already satisfied: protobuf>=3.6.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from bentoml) (3.7.0)
Requirement already satisfied: Werkzeug>=0.15 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from flask->bentoml) (0.15.4)
Requirement already satisfied: itsdangerous>=0.24 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from flask->bentoml) (1.1.0)
Requirement already satisfied: Jinja2>=2.10.1 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from flask->bentoml) (2.10.1)
Requirement already satisfied: python-dateutil>=2.5.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from pandas->bentoml) (2.8.0)
Requirement already satisfied: pytz>=2011k in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from pandas->bentoml) (2019.1)
Requirement already satisfied: botocore<1.13.0,>=1.12.188 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from boto3->bentoml) (1.12.188)
Requirement already satisfied: s3transfer<0.3.0,>=0.2.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from boto3->bentoml) (0.2.1)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /Users/chaoyuyang/.local/lib/python3.7/site-packages (from boto3->bentoml) (0.9.3)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from requests->bentoml) (1.25.3)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from requests->bentoml) (3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from requests->bentoml) (2.8)
Requirement already satisfied: certifi>=2017.4.17 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from requests->bentoml) (2019.6.16)
Requirement already satisfied: pyparsing>=2.0.2 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from packaging->bentoml) (2.4.0)
Requirement already satisfied: websocket-client>=0.32.0 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from docker->bentoml) (0.56.0)
Requirement already satisfied: setuptools in /Users/chaoyuyang/.local/lib/python3.7/site-packages (from protobuf>=3.6.0->bentoml) (40.6.3)
Requirement already satisfied: MarkupSafe>=0.23 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from Jinja2>=2.10.1->flask->bentoml) (1.1.1)
Requirement already satisfied: docutils>=0.10 in /Users/chaoyuyang/.local/lib/python3.7/site-packages (from botocore<1.13.0,>=1.12.188->boto3->bentoml) (0.14)
Requirement already satisfied: numpy in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (1.16.4)
Requirement already satisfied: xgboost in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (0.90)
Requirement already satisfied: sklearn in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (0.0)
Requirement already satisfied: matplotlib in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (3.1.1)
Requirement already satisfied: scipy in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from xgboost) (1.3.0)
Requirement already satisfied: scikit-learn in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from sklearn) (0.21.2)
Requirement already satisfied: python-dateutil>=2.1 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from matplotlib) (2.8.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from matplotlib) (1.0.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from matplotlib) (2.4.0)
Requirement already satisfied: cycler>=0.10 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from matplotlib) (0.10.0)
Requirement already satisfied: joblib>=0.11 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from scikit-learn->sklearn) (0.13.2)
Requirement already satisfied: six>=1.5 in /Users/chaoyuyang/anaconda3/envs/bentoml-dev/lib/python3.7/site-packages (from python-dateutil>=2.1->matplotlib) (1.12.0)
Requirement already satisfied: setuptools in /Users/chaoyuyang/.local/lib/python3.7/site-packages (from kiwisolver>=1.0.1->matplotlib) (40.6.3)
In [2]:
import pandas as pd
import numpy as np
import xgboost as xgb
import matplotlib.pyplot as pl
from sklearn.model_selection import train_test_split

Download Data

This notebook uses data from kaggle paololol/league-of-legends-ranked-matches

You can set your Kaggle credential below and download the dataset automatically. The kaggle key can be created by going to the 'Account' tab of your user profile (https://www.kaggle.com//account) and select 'Create API Token'. This will trigger the download of kaggle.json, a file containing your API credentials, and fill it in the cell below.

Alternativelly, you can download it manually from here and place unzip'd data in this folder.

In [15]:
%%bash
export KAGGLE_USERNAME=
export KAGGLE_KEY=

if [ ! -f ./league-of-legends-ranked-matches.zip ]; then
    kaggle datasets download paololol/league-of-legends-ranked-matches
    unzip -n league-of-legends-ranked-matches.zip
fi
Downloading league-of-legends-ranked-matches.zip to /Users/chaoyuyang/workspace/gallery/xgboost/league-of-legend-win-prediction

Archive:  league-of-legends-ranked-matches.zip
  inflating: champs.csv              
  inflating: matches.csv             
  inflating: participants.csv        
  inflating: stats1.csv              
  inflating: stats2.csv              
  inflating: teambans.csv            
  inflating: teamstats.csv           
100%|██████████| 183M/183M [00:10<00:00, 14.4MB/s]

Load data

In [16]:
# read in the data
matches = pd.read_csv("matches.csv")
participants = pd.read_csv("participants.csv")
stats1 = pd.read_csv("stats1.csv", low_memory=False)
stats2 = pd.read_csv("stats2.csv", low_memory=False)
stats = pd.concat([stats1,stats2])

# merge into a single DataFrame
a = pd.merge(participants, matches, left_on="matchid", right_on="id")
allstats_orig = pd.merge(a, stats, left_on="matchid", right_on="id")
allstats = allstats_orig.copy()

# drop games that lasted less than 10 minutes
allstats = allstats.loc[allstats["duration"] >= 10*60,:]

# Convert string-based categories to numeric values
cat_cols = ["role", "position", "version", "platformid"]
for c in cat_cols:
    allstats[c] = allstats[c].astype('category')
    allstats[c] = allstats[c].cat.codes
allstats["wardsbought"] = allstats["wardsbought"].astype(np.int32)

X = allstats.drop(["win"], axis=1)
y = allstats["win"]

# convert all features we want to consider as rates
rate_features = [
    "kills", "deaths", "assists", "killingsprees", "doublekills",
    "triplekills", "quadrakills", "pentakills", "legendarykills",
    "totdmgdealt", "magicdmgdealt", "physicaldmgdealt", "truedmgdealt",
    "totdmgtochamp", "magicdmgtochamp", "physdmgtochamp", "truedmgtochamp",
    "totheal", "totunitshealed", "dmgtoobj", "timecc", "totdmgtaken",
    "magicdmgtaken" , "physdmgtaken", "truedmgtaken", "goldearned", "goldspent",
    "totminionskilled", "neutralminionskilled", "ownjunglekills",
    "enemyjunglekills", "totcctimedealt", "pinksbought", "wardsbought",
    "wardsplaced", "wardskilled"
]
for feature_name in rate_features:
    X[feature_name] /= X["duration"] / 60 # per minute rate

# convert to fraction of game
X["longesttimespentliving"] /= X["duration"]

# define friendly names for the features
full_names = {
    "kills": "Kills per min.",
    "deaths": "Deaths per min.",
    "assists": "Assists per min.",
    "killingsprees": "Killing sprees per min.",
    "longesttimespentliving": "Longest time living as % of game",
    "doublekills": "Double kills per min.",
    "triplekills": "Triple kills per min.",
    "quadrakills": "Quadra kills per min.",
    "pentakills": "Penta kills per min.",
    "legendarykills": "Legendary kills per min.",
    "totdmgdealt": "Total damage dealt per min.",
    "magicdmgdealt": "Magic damage dealt per min.",
    "physicaldmgdealt": "Physical damage dealt per min.",
    "truedmgdealt": "True damage dealt per min.",
    "totdmgtochamp": "Total damage to champions per min.",
    "magicdmgtochamp": "Magic damage to champions per min.",
    "physdmgtochamp": "Physical damage to champions per min.",
    "truedmgtochamp": "True damage to champions per min.",
    "totheal": "Total healing per min.",
    "totunitshealed": "Total units healed per min.",
    "dmgtoobj": "Damage to objects per min.",
    "timecc": "Time spent with crown control per min.",
    "totdmgtaken": "Total damage taken per min.",
    "magicdmgtaken": "Magic damage taken per min.",
    "physdmgtaken": "Physical damage taken per min.",
    "truedmgtaken": "True damage taken per min.",
    "goldearned": "Gold earned per min.",
    "goldspent": "Gold spent per min.",
    "totminionskilled": "Total minions killed per min.",
    "neutralminionskilled": "Neutral minions killed per min.",
    "ownjunglekills": "Own jungle kills per min.",
    "enemyjunglekills": "Enemy jungle kills per min.",
    "totcctimedealt": "Total crown control time dealt per min.",
    "pinksbought": "Pink wards bought per min.",
    "wardsbought": "Wards bought per min.",
    "wardsplaced": "Wards placed per min.",
    "turretkills": "# of turret kills",
    "inhibkills": "# of inhibitor kills",
    "dmgtoturrets": "Damage to turrets"
}
feature_names = [full_names.get(n, n) for n in X.columns]
X.columns = feature_names

# create train/validation split
Xt, Xv, yt, yv = train_test_split(X,y, test_size=0.2, random_state=10)
dt = xgb.DMatrix(Xt, label=yt.values)
dv = xgb.DMatrix(Xv, label=yv.values)

Train the XGBoost model

In [17]:
params = {
    "eta": 0.5,
    "max_depth": 4,
    "objective": "binary:logistic",
    "silent": 1,
    "base_score": np.mean(yt),
    "eval_metric": "logloss"
}
model = xgb.train(params, dt, 300, [(dt, "train"),(dv, "valid")], early_stopping_rounds=5, verbose_eval=25)
[0]	train-logloss:0.543193	valid-logloss:0.541947
Multiple eval metrics have been passed: 'valid-logloss' will be used for early stopping.

Will train until valid-logloss hasn't improved in 5 rounds.
[25]	train-logloss:0.286065	valid-logloss:0.286603
[50]	train-logloss:0.251781	valid-logloss:0.253591
[75]	train-logloss:0.233638	valid-logloss:0.236202
[100]	train-logloss:0.221021	valid-logloss:0.2243
[125]	train-logloss:0.211856	valid-logloss:0.215922
[150]	train-logloss:0.2028	valid-logloss:0.207597
[175]	train-logloss:0.195752	valid-logloss:0.2011
[200]	train-logloss:0.1898	valid-logloss:0.195689
[225]	train-logloss:0.183355	valid-logloss:0.189494
[250]	train-logloss:0.1781	valid-logloss:0.184515
[275]	train-logloss:0.172431	valid-logloss:0.179242
[299]	train-logloss:0.167329	valid-logloss:0.174417
In [18]:
Xt[:3]
Out[18]:
id_x matchid player championid ss1 ss2 role position id_y gameid ... Neutral minions killed per min. Own jungle kills per min. Enemy jungle kills per min. Total crown control time dealt per min. champlvl Pink wards bought per min. Wards bought per min. Wards placed per min. wardskilled firstblood
1215555 1501034 150933 10 59 4 11 2 2 150933 3162804935 ... 0.023086 0.023086 0.000000 7.572143 18 0.069257 0.0 0.831089 0.253944 0
1427835 1713614 172357 6 35 11 14 3 1 172357 3186087472 ... 0.028262 0.028262 0.000000 10.937353 16 0.000000 0.0 0.197833 0.000000 0
1204118 1489597 149786 3 34 4 14 4 2 149786 3193266242 ... 0.882817 0.693642 0.189175 11.192853 18 0.063058 0.0 0.567525 0.189175 0

3 rows × 71 columns

In [25]:
model.predict(xgb.DMatrix(Xt[:3]))
Out[25]:
array([0.26397082, 0.05626516, 0.011282  ], dtype=float32)

Create ML service with BentoML

In [21]:
%%writefile lol_win_predictions.py

from bentoml import api, env, BentoService, artifacts
from bentoml.artifact import XgboostModelArtifact
from bentoml.handlers import DataframeHandler

import xgboost as xgb

@env(pip_dependencies=['xgboost'])
@artifacts([XgboostModelArtifact('model')])
class LeagueWinPrediction(BentoService):
    
    @api(DataframeHandler)
    def predict(self, df):
        dmatrix = xgb.DMatrix(df)
        return self.artifacts.model.predict(dmatrix)
Writing lol_win_predictions.py
In [22]:
# 1) import the custom BentoService defined above
from lol_win_predictions import LeagueWinPrediction

# 2) `pack` it with required artifacts
bento_svc = LeagueWinPrediction.pack(model=model)

# 3) save your BentoSerivce
saved_path = bento_svc.save()
[2019-08-02 20:21:59,449] INFO - Searching for dependant modules of lol_win_predictions:/Users/chaoyuyang/workspace/gallery/xgboost/league-of-legend-win-prediction/lol_win_predictions.py
[2019-08-02 20:22:19,387] INFO - Copying local python module '/Users/chaoyuyang/workspace/gallery/xgboost/league-of-legend-win-prediction/lol_win_predictions.py'
[2019-08-02 20:22:19,389] INFO - Done copying local python dependant modules
[2019-08-02 20:22:19,480] INFO - BentoService LeagueWinPrediction:2019_08_02_bc041570 saved to /tmp/bent_archive/LeagueWinPrediction/2019_08_02_bc041570
In [26]:
from bentoml import load

svc = load(saved_path)

print(svc.predict(Xt[:3]))
[2019-08-02 20:22:26,974] WARNING - Module `lol_win_predictions` already loaded, using existing imported module.
[0.26397082 0.05626516 0.011282  ]

Using BentoML archive as CLI tool

In [27]:
!pip install {saved_path}
Processing /tmp/bent_archive/LeagueWinPrediction/2019_08_02_bc041570
Building wheels for collected packages: LeagueWinPrediction
  Building wheel for LeagueWinPrediction (setup.py) ... done
  Stored in directory: /private/var/folders/ns/vc9qhmqx5dx_9fws7d869lqh0000gn/T/pip-ephem-wheel-cache-biki_9_b/wheels/3e/ff/49/21f15e9b7e99a81b164fb2cadf8de9c6c1d135e5a858f467f7
Successfully built LeagueWinPrediction
Installing collected packages: LeagueWinPrediction
Successfully installed LeagueWinPrediction-2019-08-02-bc041570
In [28]:
Xt[:3].to_csv('test.csv')
In [29]:
!LeagueWinPrediction predict --input=test.csv
[0.99658453 0.01096581 0.06465738]

Use archive as REST API server

notes: This doesn't work with Google Colab right now, because we can't access the local port from it.

In [ ]:
!bentoml serve {saved_path}

Make requeset to the REST server

After navigate to the location of this notebook, copy and paste the following code to your terminal and run it to make request

curl -i \
--request POST \
--header "Content-Type: text/csv" \
-d @test.csv \
localhost:5000/predict
In [ ]: