BentoML makes moving trained ML models to production easy:
BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between data science and DevOps, enabling teams to deliver prediction services in a fast, repeatable, and scalable way.
Before reading this example project, be sure to check out the Getting started guide to learn about the basic concepts in BentoML.
This notebook demonstrates how to use BentoML to turn an H2O model into a Docker image containing a REST API server that serves the model, and how to distribute the model as a command-line tool or a pip-installable PyPI package.
This notebook is based on: https://github.com/kguruswamy/H2O3-Driverless-AI-Code-Examples/blob/master/Lending%20Club%20Data%20-%20H2O3%20Auto%20ML%20-%20Python%20Tutorial.ipynb
%reload_ext autoreload
%autoreload 2
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")
!pip install -q bentoml "h2o>=3.24.0.2" "xlrd>=1.2.0" "scikit-learn>=0.23.2" "pandas>=1.1.1" "numpy>=1.18.4"
import h2o
import bentoml
import numpy as np
import pandas as pd
import requests
import math
from sklearn import model_selection
h2o.init(strict_version_check=False)
Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
Java Version: java version "9.0.1"; Java(TM) SE Runtime Environment (build 9.0.1+11); Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)
Starting server from /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/backend/bin/h2o.jar
Ice root: /var/folders/kn/xnc9k74x03567n1mx2tfqnpr0000gn/T/tmpm34g1lnd
JVM stdout: /var/folders/kn/xnc9k74x03567n1mx2tfqnpr0000gn/T/tmpm34g1lnd/h2o_bozhaoyu_started_from_python.out
JVM stderr: /var/folders/kn/xnc9k74x03567n1mx2tfqnpr0000gn/T/tmpm34g1lnd/h2o_bozhaoyu_started_from_python.err
Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.
H2O cluster uptime: | 02 secs |
H2O cluster timezone: | America/Los_Angeles |
H2O data parsing timezone: | UTC |
H2O cluster version: | 3.24.0.2 |
H2O cluster version age: | 1 year, 5 months and 5 days !!! |
H2O cluster name: | H2O_from_python_bozhaoyu_392ekt |
H2O cluster total nodes: | 1 |
H2O cluster free memory: | 4 Gb |
H2O cluster total cores: | 8 |
H2O cluster allowed cores: | 8 |
H2O cluster status: | accepting new members, healthy |
H2O connection url: | http://127.0.0.1:54321 |
H2O connection proxy: | None |
H2O internal security: | False |
H2O API Extensions: | Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4 |
Python version: | 3.7.3 final |
%%bash
# Download training dataset
if [ ! -f ./LoanStats3c.csv.zip ]; then
curl -O https://resources.lendingclub.com/LoanStats3c.csv.zip
fi
pd.set_option('expand_frame_repr', True)
pd.set_option('max_colwidth',9999)
pd.set_option('display.max_columns',9999)
pd.set_option('display.max_rows',9999)
data_dictionary = pd.read_excel("https://resources.lendingclub.com/LCDataDictionary.xlsx")
data_dictionary
LoanStatNew | Description | |
---|---|---|
0 | acc_now_delinq | The number of accounts on which the borrower is now delinquent. |
1 | acc_open_past_24mths | Number of trades opened in past 24 months. |
2 | addr_state | The state provided by the borrower in the loan application |
3 | all_util | Balance to credit limit on all trades |
4 | annual_inc | The self-reported annual income provided by the borrower during registration. |
5 | annual_inc_joint | The combined self-reported annual income provided by the co-borrowers during registration |
6 | application_type | Indicates whether the loan is an individual application or a joint application with two co-borrowers |
7 | avg_cur_bal | Average current balance of all accounts |
8 | bc_open_to_buy | Total open to buy on revolving bankcards. |
9 | bc_util | Ratio of total current balance to high credit/credit limit for all bankcard accounts. |
10 | chargeoff_within_12_mths | Number of charge-offs within 12 months |
11 | collection_recovery_fee | post charge off collection fee |
12 | collections_12_mths_ex_med | Number of collections in 12 months excluding medical collections |
13 | delinq_2yrs | The number of 30+ days past-due incidences of delinquency in the borrower's credit file for the past 2 years |
14 | delinq_amnt | The past-due amount owed for the accounts on which the borrower is now delinquent. |
15 | desc | Loan description provided by the borrower |
16 | dti | A ratio calculated using the borrower’s total monthly debt payments on the total debt obligations, excluding mortgage and the requested LC loan, divided by the borrower’s self-reported monthly income. |
17 | dti_joint | A ratio calculated using the co-borrowers' total monthly payments on the total debt obligations, excluding mortgages and the requested LC loan, divided by the co-borrowers' combined self-reported monthly income |
18 | earliest_cr_line | The month the borrower's earliest reported credit line was opened |
19 | emp_length | Employment length in years. Possible values are between 0 and 10 where 0 means less than one year and 10 means ten or more years. |
20 | emp_title | The job title supplied by the Borrower when applying for the loan.* |
21 | fico_range_high | The upper boundary range the borrower’s FICO at loan origination belongs to. |
22 | fico_range_low | The lower boundary range the borrower’s FICO at loan origination belongs to. |
23 | funded_amnt | The total amount committed to that loan at that point in time. |
24 | funded_amnt_inv | The total amount committed by investors for that loan at that point in time. |
25 | grade | LC assigned loan grade |
26 | home_ownership | The home ownership status provided by the borrower during registration or obtained from the credit report. Our values are: RENT, OWN, MORTGAGE, OTHER |
27 | id | A unique LC assigned ID for the loan listing. |
28 | il_util | Ratio of total current balance to high credit/credit limit on all install acct |
29 | initial_list_status | The initial listing status of the loan. Possible values are – W, F |
30 | inq_fi | Number of personal finance inquiries |
31 | inq_last_12m | Number of credit inquiries in past 12 months |
32 | inq_last_6mths | The number of inquiries in past 6 months (excluding auto and mortgage inquiries) |
33 | installment | The monthly payment owed by the borrower if the loan originates. |
34 | int_rate | Interest Rate on the loan |
35 | issue_d | The month which the loan was funded |
36 | last_credit_pull_d | The most recent month LC pulled credit for this loan |
37 | last_fico_range_high | The upper boundary range the borrower’s last FICO pulled belongs to. |
38 | last_fico_range_low | The lower boundary range the borrower’s last FICO pulled belongs to. |
39 | last_pymnt_amnt | Last total payment amount received |
40 | last_pymnt_d | Last month payment was received |
41 | loan_amnt | The listed amount of the loan applied for by the borrower. If at some point in time, the credit department reduces the loan amount, then it will be reflected in this value. |
42 | loan_status | Current status of the loan |
43 | max_bal_bc | Maximum current balance owed on all revolving accounts |
44 | member_id | A unique LC assigned Id for the borrower member. |
45 | mo_sin_old_il_acct | Months since oldest bank installment account opened |
46 | mo_sin_old_rev_tl_op | Months since oldest revolving account opened |
47 | mo_sin_rcnt_rev_tl_op | Months since most recent revolving account opened |
48 | mo_sin_rcnt_tl | Months since most recent account opened |
49 | mort_acc | Number of mortgage accounts. |
50 | mths_since_last_delinq | The number of months since the borrower's last delinquency. |
51 | mths_since_last_major_derog | Months since most recent 90-day or worse rating |
52 | mths_since_last_record | The number of months since the last public record. |
53 | mths_since_rcnt_il | Months since most recent installment accounts opened |
54 | mths_since_recent_bc | Months since most recent bankcard account opened. |
55 | mths_since_recent_bc_dlq | Months since most recent bankcard delinquency |
56 | mths_since_recent_inq | Months since most recent inquiry. |
57 | mths_since_recent_revol_delinq | Months since most recent revolving delinquency. |
58 | next_pymnt_d | Next scheduled payment date |
59 | num_accts_ever_120_pd | Number of accounts ever 120 or more days past due |
60 | num_actv_bc_tl | Number of currently active bankcard accounts |
61 | num_actv_rev_tl | Number of currently active revolving trades |
62 | num_bc_sats | Number of satisfactory bankcard accounts |
63 | num_bc_tl | Number of bankcard accounts |
64 | num_il_tl | Number of installment accounts |
65 | num_op_rev_tl | Number of open revolving accounts |
66 | num_rev_accts | Number of revolving accounts |
67 | num_rev_tl_bal_gt_0 | Number of revolving trades with balance >0 |
68 | num_sats | Number of satisfactory accounts |
69 | num_tl_120dpd_2m | Number of accounts currently 120 days past due (updated in past 2 months) |
70 | num_tl_30dpd | Number of accounts currently 30 days past due (updated in past 2 months) |
71 | num_tl_90g_dpd_24m | Number of accounts 90 or more days past due in last 24 months |
72 | num_tl_op_past_12m | Number of accounts opened in past 12 months |
73 | open_acc | The number of open credit lines in the borrower's credit file. |
74 | open_acc_6m | Number of open trades in last 6 months |
75 | open_il_12m | Number of installment accounts opened in past 12 months |
76 | open_il_24m | Number of installment accounts opened in past 24 months |
77 | open_act_il | Number of currently active installment trades |
78 | open_rv_12m | Number of revolving trades opened in past 12 months |
79 | open_rv_24m | Number of revolving trades opened in past 24 months |
80 | out_prncp | Remaining outstanding principal for total amount funded |
81 | out_prncp_inv | Remaining outstanding principal for portion of total amount funded by investors |
82 | pct_tl_nvr_dlq | Percent of trades never delinquent |
83 | percent_bc_gt_75 | Percentage of all bankcard accounts > 75% of limit. |
84 | policy_code | publicly available policy_code=1\nnew products not publicly available policy_code=2 |
85 | pub_rec | Number of derogatory public records |
86 | pub_rec_bankruptcies | Number of public record bankruptcies |
87 | purpose | A category provided by the borrower for the loan request. |
88 | pymnt_plan | Indicates if a payment plan has been put in place for the loan |
89 | recoveries | post charge off gross recovery |
90 | revol_bal | Total credit revolving balance |
91 | revol_util | Revolving line utilization rate, or the amount of credit the borrower is using relative to all available revolving credit. |
92 | sub_grade | LC assigned loan subgrade |
93 | tax_liens | Number of tax liens |
94 | term | The number of payments on the loan. Values are in months and can be either 36 or 60. |
95 | title | The loan title provided by the borrower |
96 | tot_coll_amt | Total collection amounts ever owed |
97 | tot_cur_bal | Total current balance of all accounts |
98 | tot_hi_cred_lim | Total high credit/credit limit |
99 | total_acc | The total number of credit lines currently in the borrower's credit file |
100 | total_bal_ex_mort | Total credit balance excluding mortgage |
101 | total_bal_il | Total current balance of all installment accounts |
102 | total_bc_limit | Total bankcard high credit/credit limit |
103 | total_cu_tl | Number of finance trades |
104 | total_il_high_credit_limit | Total installment high credit/credit limit |
105 | total_pymnt | Payments received to date for total amount funded |
106 | total_pymnt_inv | Payments received to date for portion of total amount funded by investors |
107 | total_rec_int | Interest received to date |
108 | total_rec_late_fee | Late fees received to date |
109 | total_rec_prncp | Principal received to date |
110 | total_rev_hi_lim | Total revolving high credit/credit limit |
111 | url | URL for the LC page with listing data. |
112 | verification_status | Indicates if income was verified by LC, not verified, or if the income source was verified |
113 | verified_status_joint | Indicates if the co-borrowers' joint income was verified by LC, not verified, or if the income source was verified |
114 | zip_code | The first 3 numbers of the zip code provided by the borrower in the loan application. |
115 | revol_bal_joint | Sum of revolving credit balance of the co-borrowers, net of duplicate balances |
116 | sec_app_fico_range_low | FICO range (high) for the secondary applicant |
117 | sec_app_fico_range_high | FICO range (low) for the secondary applicant |
118 | sec_app_earliest_cr_line | Earliest credit line at time of application for the secondary applicant |
119 | sec_app_inq_last_6mths | Credit inquiries in the last 6 months at time of application for the secondary applicant |
120 | sec_app_mort_acc | Number of mortgage accounts at time of application for the secondary applicant |
121 | sec_app_open_acc | Number of open trades at time of application for the secondary applicant |
122 | sec_app_revol_util | Ratio of total current balance to high credit/credit limit for all revolving accounts |
123 | sec_app_open_act_il | Number of currently active installment trades at time of application for the secondary applicant |
124 | sec_app_num_rev_accts | Number of revolving accounts at time of application for the secondary applicant |
125 | sec_app_chargeoff_within_12_mths | Number of charge-offs within last 12 months at time of application for the secondary applicant |
126 | sec_app_collections_12_mths_ex_med | Number of collections within last 12 months excluding medical collections at time of application for the secondary applicant |
127 | sec_app_mths_since_last_major_derog | Months since most recent 90-day or worse rating at time of application for the secondary applicant |
128 | hardship_flag | Flags whether or not the borrower is on a hardship plan |
129 | hardship_type | Describes the hardship plan offering |
130 | hardship_reason | Describes the reason the hardship plan was offered |
131 | hardship_status | Describes if the hardship plan is active, pending, canceled, completed, or broken |
132 | deferral_term | Amount of months that the borrower is expected to pay less than the contractual monthly payment amount due to a hardship plan |
133 | hardship_amount | The interest payment that the borrower has committed to make each month while they are on a hardship plan |
134 | hardship_start_date | The start date of the hardship plan period |
135 | hardship_end_date | The end date of the hardship plan period |
136 | payment_plan_start_date | The day the first hardship plan payment is due. For example, if a borrower has a hardship plan period of 3 months, the start date is the start of the three-month period in which the borrower is allowed to make interest-only payments. |
137 | hardship_length | The number of months the borrower will make smaller payments than normally obligated due to a hardship plan |
138 | hardship_dpd | Account days past due as of the hardship plan start date |
139 | hardship_loan_status | Loan Status as of the hardship plan start date |
140 | orig_projected_additional_accrued_interest | The original projected additional interest amount that will accrue for the given hardship payment plan as of the Hardship Start Date. This field will be null if the borrower has broken their hardship payment plan. |
141 | hardship_payoff_balance_amount | The payoff balance amount as of the hardship plan start date |
142 | hardship_last_payment_amount | The last payment amount as of the hardship plan start date |
143 | disbursement_method | The method by which the borrower receives their loan. Possible values are: CASH, DIRECT_PAY |
144 | debt_settlement_flag | Flags whether or not the borrower, who has charged-off, is working with a debt-settlement company. |
145 | debt_settlement_flag_date | The most recent date that the Debt_Settlement_Flag has been set |
146 | settlement_status | The status of the borrower’s settlement plan. Possible values are: COMPLETE, ACTIVE, BROKEN, CANCELLED, DENIED, DRAFT |
147 | settlement_date | The date that the borrower agrees to the settlement plan |
148 | settlement_amount | The loan amount that the borrower has agreed to settle for |
149 | settlement_percentage | The settlement amount as a percentage of the payoff balance amount on the loan |
150 | settlement_term | The number of months that the borrower will be on the settlement plan |
151 | NaN | NaN |
152 | NaN | * Employer Title replaces Employer Name for all loans listed after 9/23/2013 |
# The very first row of the file is not the header, so skip it while reading into a data frame.
# Helper for parsing "Mon-Year" strings into dates (assumes values such as "Dec-2014").
from datetime import datetime

def parse_dates(x):
    return datetime.strptime(x, "%b-%Y")

lc = pd.read_csv("LoanStats3c.csv.zip", skiprows=1, verbose=False, parse_dates=['issue_d'], low_memory=False)
lc.shape
(235631, 144)
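To make the "Mon-Year" handling concrete, here is a minimal standard-library sketch of how such a string parses. The sample strings are illustrative rather than read from the file, and the four-digit-year layout is an assumption about the raw CSV; `parse_month_year` is a hypothetical helper name.

```python
from datetime import datetime

def parse_month_year(text):
    """Parse a 'Mon-YYYY' string such as 'Dec-2014'; the day defaults to 1."""
    return datetime.strptime(text, "%b-%Y")

# Illustrative values only (assumed format, not taken from LoanStats3c.csv)
samples = ["Jan-2014", "Dec-2014"]
for s in samples:
    print(s, "->", parse_month_year(s).date())
```

Note that `%b` matches abbreviated month names in the current locale, so this sketch assumes an English locale.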
lc.loan_status.unique()
array(['Fully Paid', 'Charged Off', 'Current', 'In Grace Period', 'Late (31-120 days)', 'Default', 'Late (16-30 days)', nan], dtype=object)
# Keep just "Fully Paid" and "Charged Off" to make it a simple 'Yes' or 'No' - binary classification problem
lc = lc[lc.loan_status.isin(['Fully Paid','Charged Off'])]
lc.loan_status.unique()
array(['Fully Paid', 'Charged Off'], dtype=object)
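The filtering step can be pictured without pandas. The sketch below keeps only the two final outcomes from a small, made-up list of statuses and counts them; the list and the helper name `keep_final` are hypothetical and only illustrate the idea behind the `isin` filter.

```python
from collections import Counter

# The two outcomes kept for the binary classification problem
FINAL_STATUSES = {"Fully Paid", "Charged Off"}

def keep_final(statuses):
    """Return only the statuses that are a final outcome, preserving order."""
    return [s for s in statuses if s in FINAL_STATUSES]

# Made-up sample, including an in-progress loan and a missing value
statuses = ["Fully Paid", "Current", "Charged Off", "Late (31-120 days)", "Fully Paid", None]
kept = keep_final(statuses)
print(Counter(kept))
```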
# Drop the target-leakage columns from the data frame.
# Target-leakage columns are generally created in hindsight by analysts, data engineers, or operations
# after an outcome was detected in historical data. If we don't remove them now, they would climb to the
# top of the feature list after a model is built and falsely increase the accuracy to 95% :)
#
# In a production or real-life scoring environment, don't expect these columns to be available at
# scoring time, that is, when someone applies for a loan. So we don't train on those columns.
ignored_cols = [
    'out_prncp',                  # Remaining outstanding principal for total amount funded
    'out_prncp_inv',              # Remaining outstanding principal for portion of total amount
                                  # funded by investors
    'total_pymnt',                # Payments received to date for total amount funded
    'total_pymnt_inv',            # Payments received to date for portion of total amount
                                  # funded by investors
    'total_rec_prncp',            # Principal received to date
    'total_rec_int',              # Interest received to date
    'total_rec_late_fee',         # Late fees received to date
    'recoveries',                 # post charge off gross recovery
    'collection_recovery_fee',    # post charge off collection fee
    'last_pymnt_d',               # Last month payment was received
    'last_pymnt_amnt',            # Last total payment amount received
    'next_pymnt_d',               # Next scheduled payment date
    'last_credit_pull_d',         # The most recent month LC pulled credit for this loan
    'settlement_term',            # The number of months that the borrower will be on the settlement plan
    'settlement_date',            # The date that the borrower agrees to the settlement plan
    'settlement_amount',          # The loan amount that the borrower has agreed to settle for
    'settlement_percentage',      # The settlement amount as a percentage of the payoff balance amount on the loan
    'settlement_status',          # The status of the borrower’s settlement plan. Possible values are:
                                  # COMPLETE, ACTIVE, BROKEN, CANCELLED, DENIED, DRAFT
    'debt_settlement_flag',       # Flags whether or not the borrower, who has charged-off, is working with
                                  # a debt-settlement company.
    'debt_settlement_flag_date'   # The most recent date that the Debt_Settlement_Flag has been set
]
lc = lc.drop(columns=ignored_cols)
# After dropping the target-leakage columns, we have about 235K rows and 124 columns
lc.shape
(235543, 124)
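The column count is easy to check: 20 names are listed for removal, and 144 - 20 = 124. Because `DataFrame.drop` raises a `KeyError` by default when a requested label is absent, it can help to check the drop list against the header first. The sketch below does that with plain lists; the miniature header and the helper name `split_columns` are hypothetical.

```python
def split_columns(all_columns, to_drop):
    """Return (kept, missing): columns that survive the drop, and requested names not in the header."""
    present = set(all_columns)
    missing = [c for c in to_drop if c not in present]
    drop_set = set(to_drop)
    kept = [c for c in all_columns if c not in drop_set]
    return kept, missing

# Hypothetical miniature header, not the real 144-column file
columns = ["loan_amnt", "int_rate", "total_pymnt", "recoveries", "loan_status"]
leaky = ["total_pymnt", "recoveries", "settlement_date"]

kept, missing = split_columns(columns, leaky)
print("kept:", kept)
print("not in header:", missing)
```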
import csv
import os
train_path = os.path.join(os.getcwd(), "train_lc.csv.zip")
test_path = os.path.join(os.getcwd(), "test_lc.csv.zip")
train_lc, test_lc = model_selection.train_test_split(lc, test_size=0.2, random_state=10, stratify=lc['loan_status'])
train_lc.to_csv(train_path, index=False, compression="zip")
test_lc.to_csv(test_path, index=False, compression="zip")
print('Train LC shape', train_lc.shape)
print('Test LC shape', test_lc.shape)
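The shapes printed by this cell follow from simple rounding. As a rough sketch, assuming scikit-learn rounds the test share up and the train share down when `test_size` is a fraction, the row counts for an 80/20 split of 235543 rows can be reproduced with the standard library. `split_sizes` is a hypothetical helper, not part of scikit-learn.

```python
import math

def split_sizes(n_samples, test_size):
    """Approximate train/test row counts for a fractional test_size (assumed rounding rule)."""
    n_test = math.ceil(test_size * n_samples)          # test share rounded up
    n_train = math.floor((1 - test_size) * n_samples)  # train share rounded down
    return n_train, n_test

n_train, n_test = split_sizes(235543, 0.2)
print("train rows:", n_train, "test rows:", n_test)
```

Stratifying on `loan_status` does not change these totals; it only keeps the proportion of each class roughly equal in the two parts.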
# These two CSV files were created in the previous section
train_path = os.path.join(os.getcwd(), "train_lc.csv.zip")
test_path = os.path.join(os.getcwd(), "test_lc.csv.zip")
# Upload the local files to the H2O cluster as H2OFrames
train = h2o.upload_file(train_path)
test = h2o.upload_file(test_path)
train.describe()
Train LC shape (188434, 124)
Test LC shape (47109, 124)
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Rows:188434
Cols:124
id | member_id | loan_amnt | funded_amnt | funded_amnt_inv | term | int_rate | installment | grade | sub_grade | emp_title | emp_length | home_ownership | annual_inc | verification_status | issue_d | loan_status | pymnt_plan | url | desc | purpose | title | zip_code | addr_state | dti | delinq_2yrs | earliest_cr_line | inq_last_6mths | mths_since_last_delinq | mths_since_last_record | open_acc | pub_rec | revol_bal | revol_util | total_acc | initial_list_status | collections_12_mths_ex_med | mths_since_last_major_derog | policy_code | application_type | annual_inc_joint | dti_joint | verification_status_joint | acc_now_delinq | tot_coll_amt | tot_cur_bal | open_acc_6m | open_act_il | open_il_12m | open_il_24m | mths_since_rcnt_il | total_bal_il | il_util | open_rv_12m | open_rv_24m | max_bal_bc | all_util | total_rev_hi_lim | inq_fi | total_cu_tl | inq_last_12m | acc_open_past_24mths | avg_cur_bal | bc_open_to_buy | bc_util | chargeoff_within_12_mths | delinq_amnt | mo_sin_old_il_acct | mo_sin_old_rev_tl_op | mo_sin_rcnt_rev_tl_op | mo_sin_rcnt_tl | mort_acc | mths_since_recent_bc | mths_since_recent_bc_dlq | mths_since_recent_inq | mths_since_recent_revol_delinq | num_accts_ever_120_pd | num_actv_bc_tl | num_actv_rev_tl | num_bc_sats | num_bc_tl | num_il_tl | num_op_rev_tl | num_rev_accts | num_rev_tl_bal_gt_0 | num_sats | num_tl_120dpd_2m | num_tl_30dpd | num_tl_90g_dpd_24m | num_tl_op_past_12m | pct_tl_nvr_dlq | percent_bc_gt_75 | pub_rec_bankruptcies | tax_liens | tot_hi_cred_lim | total_bal_ex_mort | total_bc_limit | total_il_high_credit_limit | revol_bal_joint | sec_app_earliest_cr_line | sec_app_inq_last_6mths | sec_app_mort_acc | sec_app_open_acc | sec_app_revol_util | sec_app_open_act_il | sec_app_num_rev_accts | sec_app_chargeoff_within_12_mths | sec_app_collections_12_mths_ex_med | sec_app_mths_since_last_major_derog | hardship_flag | hardship_type | hardship_reason | hardship_status | deferral_term | hardship_amount | hardship_start_date | hardship_end_date | payment_plan_start_date | hardship_length | hardship_dpd | hardship_loan_status | orig_projected_additional_accrued_interest | hardship_payoff_balance_amount | hardship_last_payment_amount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
type | int | int | int | int | int | enum | real | real | enum | enum | enum | enum | enum | real | enum | time | enum | enum | int | string | enum | enum | enum | enum | real | int | time | int | int | int | int | int | int | real | int | enum | int | int | int | enum | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | real | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | real | real | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | enum | enum | enum | enum | int | real | time | time | time | int | int | enum | real | real | real |
mins | NaN | NaN | 1000.0 | 1000.0 | 950.0 | 0.06 | 23.36 | 3000.0 | 1388534400000.0 | NaN | NaN | 0.0 | 0.0 | -820540800000.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 1.0 | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 16.7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.0 | 1.47 | 1485907200000.0 | 1491004800000.0 | 1485907200000.0 | 3.0 | 0.0 | 4.41 | 174.15 | 0.04 | ||||||||||||||||||||
mean | 0.0 | 0.0 | 14884.780480168118 | 14884.780480168118 | 14879.984769203027 | 0.13768163813324627 | 443.02308527123586 | 74842.1021655859 | 1403772635222.944 | 0.0 | NaN | 18.038431917806744 | 0.3439559739749727 | 878010488536.0388 | 0.7575596760669507 | 33.40950190769873 | 70.73781512605042 | 11.671577316195588 | 0.2224598533173423 | 16517.317517008567 | 0.5562211189754319 | 26.019354256662876 | 0.015474914293598825 | 42.4452214452214 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0056200048823460734 | 280.02962841100884 | 139916.78260292622 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 30777.311727183012 | 0.0 | 0.0 | 0.0 | 4.403568358151916 | 13425.63404110787 | 8488.266061096774 | 64.59277420047215 | 0.010783616544784914 | 9.796506999798337 | 128.53483025519225 | 185.81975121262616 | 13.078101616481097 | 8.003645838861342 | 1.853216510820764 | 24.440105868864123 | 39.5963653177332 | 6.918806067907228 | 35.46866324059305 | 0.5053122048038038 | 3.686240275109585 | 5.803241453240905 | 4.646629589139977 | 8.54615939798552 | 8.57417451203075 | 8.277009456892086 | 15.304217922455612 | 5.767600326904917 | 11.622318689833035 | 0.0009551360519945328 | 0.0036511457592578833 | 0.09438848615430329 | 2.0082309986520546 | 94.24337699141348 | 50.68730479940775 | 0.13524098623390737 | 0.05543054862710553 | 170413.48133033328 | 48481.2246834436 | 20070.11836505092 | 39932.08656611851 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 114.21573863636364 | 1509994881818.182 | 1516242681818.182 | 1510936200000.0 | 3.0 | 14.59375 | 339.8909328358209 | 7947.003380681818 | 187.48076704545457 | ||||||||||||||||||||
maxs | NaN | NaN | 35000.0 | 35000.0 | 35000.0 | 0.2606 | 1409.99 | 7500000.0 | 1417392000000.0 | NaN | NaN | 39.99 | 22.0 | 1320105600000.0 | 6.0 | 188.0 | 121.0 | 84.0 | 63.0 | 2560703.0 | 8.923 | 156.0 | 20.0 | 188.0 | 1.0 | NaN | NaN | NaN | 3.0 | 9152545.0 | 3840795.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 9999999.0 | NaN | NaN | NaN | 53.0 | 497484.0 | 260250.0 | 255.2 | 7.0 | 65000.0 | 561.0 | 842.0 | 372.0 | 226.0 | 37.0 | 616.0 | 170.0 | 25.0 | 180.0 | 30.0 | 26.0 | 38.0 | 35.0 | 61.0 | 150.0 | 62.0 | 105.0 | 38.0 | 84.0 | 2.0 | 3.0 | 22.0 | 26.0 | 100.0 | 100.0 | 7.0 | 63.0 | 9999999.0 | 2688920.0 | 1090700.0 | 1027358.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.0 | 344.24 | 1556668800000.0 | 1561939200000.0 | 1556668800000.0 | 3.0 | 32.0 | 1032.72 | 20321.15 | 713.04 | ||||||||||||||||||||
sigma | -0.0 | -0.0 | 8444.529842237767 | 8444.529842237767 | 8441.74965734052 | 0.043236714795126696 | 245.55956150039466 | 55879.6254552511 | 8618925772.810982 | -0.0 | NaN | 8.023289119934871 | 0.9000809312845888 | 235685775429.60028 | 1.035364539099023 | 21.777363524527555 | 28.5075895180931 | 5.280407394680721 | 0.6053968655503703 | 21598.80009195033 | 0.23102402224846186 | 11.891471363727664 | 0.14101201867640875 | 20.880974652987707 | 0.0 | -0.0 | -0.0 | -0.0 | 0.07950349603856371 | 21174.68581402208 | 153006.06065658265 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | 38384.00007004573 | -0.0 | -0.0 | -0.0 | 2.8642976762774732 | 16026.702776462502 | 13412.411414410766 | 26.42401766142629 | 0.1173765114760668 | 565.4116442360041 | 51.34992108739378 | 93.00236919417999 | 16.134812986440643 | 8.759052551750663 | 2.1611275634408718 | 30.30743682243691 | 22.573879361083183 | 5.9354275780436705 | 22.304832497335703 | 1.2700994420156504 | 2.1527147426857067 | 3.1405449886489047 | 2.7200394282746365 | 4.812694785276978 | 7.3089541963331675 | 4.318223682539508 | 8.047168776251766 | 3.1221446203671164 | 5.278551769075766 | 0.03211042594205823 | 0.06423450317302083 | 0.49334840182534967 | 1.6084979574620704 | 8.481447276838344 | 34.90775600526837 | 0.3756866400997473 | 0.4123003808101904 | 172512.507823174 | 46113.3699393363 | 20243.507596068586 | 41490.84884582259 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | 0.0 | 77.0094025673706 | 14500662214.885572 | 14471630662.23209 | 14520502785.954859 | 0.0 | 9.499873480938898 | 240.1814929663525 | 4637.06552271656 | 147.17888898625247 | ||||||||||||||||||||
zeros | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60 | 149487 | 34 | 100647 | 222 | 3 | 2 | 155114 | 448 | 480 | 0 | 185745 | 37 | 0 | 0 | 0 | 0 | 187438 | 159941 | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 66 | 0 | 0 | 0 | 7733 | 31 | 3801 | 1595 | 186638 | 187742 | 2 | 0 | 2977 | 3084 | 72408 | 1245 | 97 | 15431 | 157 | 143588 | 3405 | 453 | 1839 | 330 | 5758 | 53 | 0 | 448 | 2 | 182006 | 187787 | 176622 | 32108 | 0 | 32367 | 164655 | 181918 | 3 | 68 | 2036 | 25148 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 68 | 0 | 0 | 0 | ||||||||||||||||||||
missing | 188434 | 188434 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10581 | 9589 | 0 | 0 | 0 | 0 | 0 | 0 | 188434 | 176217 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 92769 | 155114 | 0 | 0 | 0 | 101 | 0 | 0 | 0 | 135238 | 0 | 0 | 188434 | 188434 | 188434 | 0 | 0 | 0 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 0 | 188434 | 188434 | 188434 | 0 | 3 | 1943 | 2074 | 0 | 0 | 5748 | 0 | 0 | 0 | 0 | 1788 | 138691 | 17436 | 120718 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6261 | 0 | 0 | 0 | 0 | 2036 | 0 | 0 | 0 | 0 | 0 | 0 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 0 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188166 | 188082 | 188082 |
0 | nan | nan | 18700.0 | 18700.0 | 18700.0 | 60 months | 0.1629 | 457.64 | D | D2 | Assistant Manager | 4 years | MORTGAGE | 52000.0 | Not Verified | 2014-05-01 00:00:00 | Fully Paid | n | nan | credit_card | Credit card refinancing | 630xx | MO | 11.65 | 0.0 | 1999-08-01 00:00:00 | 5.0 | 59.0 | nan | 20.0 | 0.0 | 16920.0 | 0.502 | 37.0 | w | 0.0 | 59.0 | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 117999.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 33700.0 | nan | nan | nan | 6.0 | 6210.0 | 4113.0 | 80.4 | 0.0 | 0.0 | 177.0 | 123.0 | 1.0 | 1.0 | 1.0 | 8.0 | 59.0 | 1.0 | 59.0 | 1.0 | 5.0 | 6.0 | 9.0 | 10.0 | 17.0 | 16.0 | 19.0 | 6.0 | 20.0 | 0.0 | 0.0 | 0.0 | 2.0 | 91.7 | 50.0 | 0.0 | 0.0 | 141329.0 | 51831.0 | 21000.0 | 39404.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
1 | nan | nan | 20000.0 | 20000.0 | 20000.0 | 36 months | 0.0917 | 637.58 | B | B1 | Engineering | 4 years | RENT | 93000.0 | Source Verified | 2014-08-01 00:00:00 | Fully Paid | n | nan | debt_consolidation | Debt consolidation | 334xx | FL | 19.15 | 0.0 | 1996-09-01 00:00:00 | 0.0 | 43.0 | nan | 9.0 | 0.0 | 10597.0 | 0.609 | 24.0 | f | 1.0 | 43.0 | 1.0 | Individual | nan | nan | nan | 0.0 | 2305.0 | 61086.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 17400.0 | nan | nan | nan | 2.0 | 6787.0 | 2303.0 | 82.1 | 0.0 | 0.0 | 215.0 | 178.0 | 33.0 | 5.0 | 0.0 | 33.0 | nan | 9.0 | nan | 2.0 | 3.0 | 3.0 | 4.0 | 9.0 | 11.0 | 5.0 | 12.0 | 3.0 | 9.0 | 0.0 | 0.0 | 0.0 | 1.0 | 91.7 | 50.0 | 0.0 | 0.0 | 79108.0 | 61086.0 | 12900.0 | 61708.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
2 | nan | nan | 11000.0 | 11000.0 | 11000.0 | 36 months | 0.1099 | 360.08 | B | B2 | Teacher director | 10+ years | RENT | 30000.0 | Not Verified | 2014-02-01 00:00:00 | Fully Paid | n | nan | debt_consolidation | Debt consolidation | 010xx | MA | 27.84 | 0.0 | 1994-11-01 00:00:00 | 1.0 | nan | nan | 10.0 | 0.0 | 11523.0 | 0.546 | 21.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 90.0 | 17778.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 21100.0 | nan | nan | nan | 2.0 | 1778.0 | 2200.0 | 71.8 | 0.0 | 0.0 | 133.0 | 231.0 | 6.0 | 6.0 | 0.0 | 63.0 | nan | 6.0 | nan | 0.0 | 4.0 | 9.0 | 4.0 | 9.0 | 5.0 | 9.0 | 16.0 | 9.0 | 10.0 | 0.0 | 0.0 | 0.0 | 1.0 | 100.0 | 75.0 | 0.0 | 0.0 | 35076.0 | 17778.0 | 7800.0 | 13976.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
3 | nan | nan | 15000.0 | 15000.0 | 15000.0 | 60 months | 0.1561 | 361.67 | D | D1 | Accounting Assistant | < 1 year | RENT | 50000.0 | Verified | 2014-07-01 00:00:00 | Fully Paid | n | nan | credit_card | Credit card refinancing | 076xx | NJ | 20.91 | 0.0 | 1997-07-01 00:00:00 | 0.0 | nan | 94.0 | 24.0 | 1.0 | 7884.0 | 0.276 | 31.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 34486.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 28600.0 | nan | nan | nan | 14.0 | 1499.0 | 5285.0 | 57.0 | 0.0 | 0.0 | 148.0 | 203.0 | 3.0 | 1.0 | 0.0 | 10.0 | nan | 7.0 | nan | 0.0 | 3.0 | 7.0 | 4.0 | 6.0 | 11.0 | 16.0 | 20.0 | 7.0 | 24.0 | 0.0 | 0.0 | 0.0 | 11.0 | 100.0 | 25.0 | 1.0 | 0.0 | 88501.0 | 34486.0 | 12300.0 | 59901.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
4 | nan | nan | 6250.0 | 6250.0 | 6250.0 | 36 months | 0.0949 | 200.18 | B | B2 | Full Time Active Duty Military | 10+ years | MORTGAGE | 78000.0 | Source Verified | 2014-11-01 00:00:00 | Fully Paid | n | nan | debt_consolidation | Debt consolidation | 254xx | WV | 1.14 | 0.0 | 2000-12-01 00:00:00 | 0.0 | 55.0 | nan | 6.0 | 0.0 | 3132.0 | 0.344 | 14.0 | w | 0.0 | 55.0 | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 393615.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 9100.0 | nan | nan | nan | 2.0 | 78723.0 | 2968.0 | 51.3 | 0.0 | 0.0 | 167.0 | 117.0 | 12.0 | 12.0 | 4.0 | 35.0 | 55.0 | 20.0 | 55.0 | 3.0 | 1.0 | 1.0 | 2.0 | 5.0 | 2.0 | 4.0 | 8.0 | 1.0 | 6.0 | 0.0 | 0.0 | 0.0 | 1.0 | 71.4 | 50.0 | 0.0 | 0.0 | 454355.0 | 3132.0 | 6100.0 | 0.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
5 | nan | nan | 15000.0 | 15000.0 | 15000.0 | 36 months | 0.0712 | 463.98 | A | A3 | OWN | 60000.0 | Source Verified | 2014-10-01 00:00:00 | Fully Paid | n | nan | credit_card | Credit card refinancing | 301xx | GA | 8.42 | 0.0 | 1976-11-01 00:00:00 | 0.0 | 40.0 | 20.0 | 9.0 | 2.0 | 15165.0 | 0.442 | 11.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 15165.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 34300.0 | nan | nan | nan | 3.0 | 1685.0 | 14340.0 | 51.2 | 0.0 | 0.0 | nan | 454.0 | 11.0 | 11.0 | 0.0 | 11.0 | nan | 17.0 | 40.0 | 0.0 | 4.0 | 6.0 | 5.0 | 6.0 | 0.0 | 9.0 | 11.0 | 6.0 | 9.0 | 0.0 | 0.0 | 0.0 | 1.0 | 90.9 | 40.0 | 0.0 | 2.0 | 34300.0 | 15165.0 | 29400.0 | 0.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||||
6 | nan | nan | 35000.0 | 35000.0 | 35000.0 | 36 months | 0.1398 | 1195.88 | C | C3 | Teacher | 10+ years | MORTGAGE | 91886.0 | Verified | 2014-08-01 00:00:00 | Fully Paid | n | nan | debt_consolidation | Debt consolidation | 920xx | CA | 20.97 | 0.0 | 1994-07-01 00:00:00 | 2.0 | nan | nan | 10.0 | 0.0 | 38409.0 | 0.662 | 26.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 474161.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 58000.0 | nan | nan | nan | 3.0 | 52685.0 | 14206.0 | 61.1 | 0.0 | 0.0 | 138.0 | 240.0 | 11.0 | 5.0 | 8.0 | 71.0 | nan | 1.0 | nan | 0.0 | 3.0 | 7.0 | 3.0 | 7.0 | 6.0 | 8.0 | 12.0 | 7.0 | 10.0 | 0.0 | 0.0 | 0.0 | 3.0 | 100.0 | 66.7 | 0.0 | 0.0 | 505693.0 | 58911.0 | 36500.0 | 30694.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
7 | nan | nan | 4000.0 | 4000.0 | 3950.0 | 36 months | 0.0917 | 127.52 | B | B1 | Associate Professor | 5 years | MORTGAGE | 55000.0 | Source Verified | 2014-05-01 00:00:00 | Charged Off | n | nan | credit_card | Credit card refinancing | 275xx | NC | 22.81 | 1.0 | 1993-12-01 00:00:00 | 1.0 | 21.0 | nan | 15.0 | 0.0 | 22042.0 | 0.249 | 36.0 | f | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 43231.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 88400.0 | nan | nan | nan | 2.0 | 2882.0 | 56510.0 | 27.9 | 0.0 | 0.0 | 83.0 | 244.0 | 12.0 | 7.0 | 3.0 | 65.0 | 21.0 | 1.0 | 21.0 | 0.0 | 3.0 | 5.0 | 8.0 | 20.0 | 4.0 | 13.0 | 28.0 | 5.0 | 15.0 | 0.0 | 0.0 | 0.0 | 2.0 | 97.2 | 25.0 | 0.0 | 0.0 | 111174.0 | 43231.0 | 78400.0 | 17117.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
8 | nan | nan | 10000.0 | 10000.0 | 10000.0 | 36 months | 0.0649 | 306.45 | A | A2 | owner | 10+ years | OWN | 100000.0 | Source Verified | 2014-06-01 00:00:00 | Fully Paid | n | nan | home_improvement | Home improvement | 890xx | NV | 17.88 | 0.0 | 1991-02-01 00:00:00 | 0.0 | nan | nan | 6.0 | 0.0 | 5414.0 | 0.722 | 23.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 153711.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 7500.0 | nan | nan | nan | 2.0 | 25619.0 | 99.0 | 98.0 | 0.0 | 0.0 | 127.0 | 280.0 | 56.0 | 7.0 | 7.0 | 107.0 | nan | 7.0 | nan | 0.0 | 1.0 | 2.0 | 1.0 | 5.0 | 9.0 | 2.0 | 7.0 | 2.0 | 6.0 | 0.0 | 0.0 | 0.0 | 2.0 | 100.0 | 100.0 | 0.0 | 0.0 | 178016.0 | 83729.0 | 5000.0 | 95922.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
9 | nan | nan | 15000.0 | 15000.0 | 15000.0 | 36 months | 0.1099 | 491.01 | B | B3 | Sr Business Analyst | 10+ years | MORTGAGE | 78200.0 | Not Verified | 2014-05-01 00:00:00 | Fully Paid | n | nan | home_improvement | Home improvement | 786xx | TX | 14.01 | 1.0 | 1995-09-01 00:00:00 | 1.0 | 18.0 | nan | 13.0 | 0.0 | 7559.0 | 0.411 | 28.0 | w | 1.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 735.0 | 270873.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 19600.0 | nan | nan | nan | 3.0 | 22573.0 | 7306.0 | 48.9 | 0.0 | 0.0 | 145.0 | 224.0 | 12.0 | 2.0 | 4.0 | 12.0 | nan | 2.0 | 18.0 | 0.0 | 4.0 | 6.0 | 5.0 | 8.0 | 8.0 | 10.0 | 15.0 | 6.0 | 13.0 | 0.0 | 0.0 | 0.0 | 3.0 | 96.2 | 40.0 | 0.0 | 0.0 | 318951.0 | 32615.0 | 14300.0 | 29351.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan |
import os
# These two CSV files were created in the previous section
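train_path = os.path.join(os.getcwd(), "train_lc.csv.zip")
test_path = os.path.join(os.getcwd(), "test_lc.csv.zip")
# h2o.import_file parses a file from a path; h2o.load_dataset is intended for datasets bundled with the h2o package
train = h2o.import_file(train_path)
test = h2o.import_file(test_path)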
Parse progress: |█████████████████████████████████████████████████████████| 100% Parse progress: |█████████████████████████████████████████████████████████| 100%
train.describe()
Rows:188434 Cols:124
id | member_id | loan_amnt | funded_amnt | funded_amnt_inv | term | int_rate | installment | grade | sub_grade | emp_title | emp_length | home_ownership | annual_inc | verification_status | issue_d | loan_status | pymnt_plan | url | desc | purpose | title | zip_code | addr_state | dti | delinq_2yrs | earliest_cr_line | inq_last_6mths | mths_since_last_delinq | mths_since_last_record | open_acc | pub_rec | revol_bal | revol_util | total_acc | initial_list_status | collections_12_mths_ex_med | mths_since_last_major_derog | policy_code | application_type | annual_inc_joint | dti_joint | verification_status_joint | acc_now_delinq | tot_coll_amt | tot_cur_bal | open_acc_6m | open_act_il | open_il_12m | open_il_24m | mths_since_rcnt_il | total_bal_il | il_util | open_rv_12m | open_rv_24m | max_bal_bc | all_util | total_rev_hi_lim | inq_fi | total_cu_tl | inq_last_12m | acc_open_past_24mths | avg_cur_bal | bc_open_to_buy | bc_util | chargeoff_within_12_mths | delinq_amnt | mo_sin_old_il_acct | mo_sin_old_rev_tl_op | mo_sin_rcnt_rev_tl_op | mo_sin_rcnt_tl | mort_acc | mths_since_recent_bc | mths_since_recent_bc_dlq | mths_since_recent_inq | mths_since_recent_revol_delinq | num_accts_ever_120_pd | num_actv_bc_tl | num_actv_rev_tl | num_bc_sats | num_bc_tl | num_il_tl | num_op_rev_tl | num_rev_accts | num_rev_tl_bal_gt_0 | num_sats | num_tl_120dpd_2m | num_tl_30dpd | num_tl_90g_dpd_24m | num_tl_op_past_12m | pct_tl_nvr_dlq | percent_bc_gt_75 | pub_rec_bankruptcies | tax_liens | tot_hi_cred_lim | total_bal_ex_mort | total_bc_limit | total_il_high_credit_limit | revol_bal_joint | sec_app_earliest_cr_line | sec_app_inq_last_6mths | sec_app_mort_acc | sec_app_open_acc | sec_app_revol_util | sec_app_open_act_il | sec_app_num_rev_accts | sec_app_chargeoff_within_12_mths | sec_app_collections_12_mths_ex_med | sec_app_mths_since_last_major_derog | hardship_flag | hardship_type | hardship_reason | hardship_status | deferral_term | hardship_amount | hardship_start_date | 
hardship_end_date | payment_plan_start_date | hardship_length | hardship_dpd | hardship_loan_status | orig_projected_additional_accrued_interest | hardship_payoff_balance_amount | hardship_last_payment_amount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
type | int | int | int | int | int | enum | real | real | enum | enum | enum | enum | enum | real | enum | time | enum | enum | int | string | enum | enum | enum | enum | real | int | time | int | int | int | int | int | int | real | int | enum | int | int | int | enum | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | real | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | real | real | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | int | enum | enum | enum | enum | int | real | time | time | time | int | int | enum | real | real | real |
mins | NaN | NaN | 1000.0 | 1000.0 | 950.0 | 0.06 | 23.36 | 3000.0 | 1388534400000.0 | NaN | NaN | 0.0 | 0.0 | -820540800000.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 1.0 | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 16.7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.0 | 1.47 | 1485907200000.0 | 1491004800000.0 | 1485907200000.0 | 3.0 | 0.0 | 4.41 | 174.15 | 0.04 | ||||||||||||||||||||
mean | 0.0 | 0.0 | 14884.780480168118 | 14884.780480168118 | 14879.984769203027 | 0.13768163813324627 | 443.02308527123586 | 74842.1021655859 | 1403772635222.944 | 0.0 | NaN | 18.038431917806744 | 0.3439559739749727 | 878010488536.0388 | 0.7575596760669507 | 33.40950190769873 | 70.73781512605042 | 11.671577316195588 | 0.2224598533173423 | 16517.317517008567 | 0.5562211189754319 | 26.019354256662876 | 0.015474914293598825 | 42.4452214452214 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0056200048823460734 | 280.02962841100884 | 139916.78260292622 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 30777.311727183012 | 0.0 | 0.0 | 0.0 | 4.403568358151916 | 13425.63404110787 | 8488.266061096774 | 64.59277420047215 | 0.010783616544784914 | 9.796506999798337 | 128.53483025519225 | 185.81975121262616 | 13.078101616481097 | 8.003645838861342 | 1.853216510820764 | 24.440105868864123 | 39.5963653177332 | 6.918806067907228 | 35.46866324059305 | 0.5053122048038038 | 3.686240275109585 | 5.803241453240905 | 4.646629589139977 | 8.54615939798552 | 8.57417451203075 | 8.277009456892086 | 15.304217922455612 | 5.767600326904917 | 11.622318689833035 | 0.0009551360519945328 | 0.0036511457592578833 | 0.09438848615430329 | 2.0082309986520546 | 94.24337699141348 | 50.68730479940775 | 0.13524098623390737 | 0.05543054862710553 | 170413.48133033328 | 48481.2246834436 | 20070.11836505092 | 39932.08656611851 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 114.21573863636364 | 1509994881818.182 | 1516242681818.182 | 1510936200000.0 | 3.0 | 14.59375 | 339.8909328358209 | 7947.003380681818 | 187.48076704545457 | ||||||||||||||||||||
maxs | NaN | NaN | 35000.0 | 35000.0 | 35000.0 | 0.2606 | 1409.99 | 7500000.0 | 1417392000000.0 | NaN | NaN | 39.99 | 22.0 | 1320105600000.0 | 6.0 | 188.0 | 121.0 | 84.0 | 63.0 | 2560703.0 | 8.923 | 156.0 | 20.0 | 188.0 | 1.0 | NaN | NaN | NaN | 3.0 | 9152545.0 | 3840795.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 9999999.0 | NaN | NaN | NaN | 53.0 | 497484.0 | 260250.0 | 255.2 | 7.0 | 65000.0 | 561.0 | 842.0 | 372.0 | 226.0 | 37.0 | 616.0 | 170.0 | 25.0 | 180.0 | 30.0 | 26.0 | 38.0 | 35.0 | 61.0 | 150.0 | 62.0 | 105.0 | 38.0 | 84.0 | 2.0 | 3.0 | 22.0 | 26.0 | 100.0 | 100.0 | 7.0 | 63.0 | 9999999.0 | 2688920.0 | 1090700.0 | 1027358.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.0 | 344.24 | 1556668800000.0 | 1561939200000.0 | 1556668800000.0 | 3.0 | 32.0 | 1032.72 | 20321.15 | 713.04 | ||||||||||||||||||||
sigma | -0.0 | -0.0 | 8444.529842237767 | 8444.529842237767 | 8441.74965734052 | 0.043236714795126696 | 245.55956150039466 | 55879.6254552511 | 8618925772.810982 | -0.0 | NaN | 8.023289119934871 | 0.9000809312845888 | 235685775429.60028 | 1.035364539099023 | 21.777363524527555 | 28.5075895180931 | 5.280407394680721 | 0.6053968655503703 | 21598.80009195033 | 0.23102402224846186 | 11.891471363727664 | 0.14101201867640875 | 20.880974652987707 | 0.0 | -0.0 | -0.0 | -0.0 | 0.07950349603856371 | 21174.68581402208 | 153006.06065658265 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | 38384.00007004573 | -0.0 | -0.0 | -0.0 | 2.8642976762774732 | 16026.702776462502 | 13412.411414410766 | 26.42401766142629 | 0.1173765114760668 | 565.4116442360041 | 51.34992108739378 | 93.00236919417999 | 16.134812986440643 | 8.759052551750663 | 2.1611275634408718 | 30.30743682243691 | 22.573879361083183 | 5.9354275780436705 | 22.304832497335703 | 1.2700994420156504 | 2.1527147426857067 | 3.1405449886489047 | 2.7200394282746365 | 4.812694785276978 | 7.3089541963331675 | 4.318223682539508 | 8.047168776251766 | 3.1221446203671164 | 5.278551769075766 | 0.03211042594205823 | 0.06423450317302083 | 0.49334840182534967 | 1.6084979574620704 | 8.481447276838344 | 34.90775600526837 | 0.3756866400997473 | 0.4123003808101904 | 172512.507823174 | 46113.3699393363 | 20243.507596068586 | 41490.84884582259 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | -0.0 | 0.0 | 77.0094025673706 | 14500662214.885572 | 14471630662.23209 | 14520502785.954859 | 0.0 | 9.499873480938898 | 240.1814929663525 | 4637.06552271656 | 147.17888898625247 | ||||||||||||||||||||
zeros | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60 | 149487 | 34 | 100647 | 222 | 3 | 2 | 155114 | 448 | 480 | 0 | 185745 | 37 | 0 | 0 | 0 | 0 | 187438 | 159941 | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 66 | 0 | 0 | 0 | 7733 | 31 | 3801 | 1595 | 186638 | 187742 | 2 | 0 | 2977 | 3084 | 72408 | 1245 | 97 | 15431 | 157 | 143588 | 3405 | 453 | 1839 | 330 | 5758 | 53 | 0 | 448 | 2 | 182006 | 187787 | 176622 | 32108 | 0 | 32367 | 164655 | 181918 | 3 | 68 | 2036 | 25148 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 68 | 0 | 0 | 0 | ||||||||||||||||||||
missing | 188434 | 188434 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10581 | 9589 | 0 | 0 | 0 | 0 | 0 | 0 | 188434 | 176217 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 92769 | 155114 | 0 | 0 | 0 | 101 | 0 | 0 | 0 | 135238 | 0 | 0 | 188434 | 188434 | 188434 | 0 | 0 | 0 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 0 | 188434 | 188434 | 188434 | 0 | 3 | 1943 | 2074 | 0 | 0 | 5748 | 0 | 0 | 0 | 0 | 1788 | 138691 | 17436 | 120718 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6261 | 0 | 0 | 0 | 0 | 2036 | 0 | 0 | 0 | 0 | 0 | 0 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 188434 | 0 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188082 | 188166 | 188082 | 188082 |
0 | nan | nan | 18700.0 | 18700.0 | 18700.0 | 60 months | 0.1629 | 457.64 | D | D2 | Assistant Manager | 4 years | MORTGAGE | 52000.0 | Not Verified | 2014-05-01 00:00:00 | Fully Paid | n | nan | credit_card | Credit card refinancing | 630xx | MO | 11.65 | 0.0 | 1999-08-01 00:00:00 | 5.0 | 59.0 | nan | 20.0 | 0.0 | 16920.0 | 0.502 | 37.0 | w | 0.0 | 59.0 | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 117999.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 33700.0 | nan | nan | nan | 6.0 | 6210.0 | 4113.0 | 80.4 | 0.0 | 0.0 | 177.0 | 123.0 | 1.0 | 1.0 | 1.0 | 8.0 | 59.0 | 1.0 | 59.0 | 1.0 | 5.0 | 6.0 | 9.0 | 10.0 | 17.0 | 16.0 | 19.0 | 6.0 | 20.0 | 0.0 | 0.0 | 0.0 | 2.0 | 91.7 | 50.0 | 0.0 | 0.0 | 141329.0 | 51831.0 | 21000.0 | 39404.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
1 | nan | nan | 20000.0 | 20000.0 | 20000.0 | 36 months | 0.0917 | 637.58 | B | B1 | Engineering | 4 years | RENT | 93000.0 | Source Verified | 2014-08-01 00:00:00 | Fully Paid | n | nan | debt_consolidation | Debt consolidation | 334xx | FL | 19.15 | 0.0 | 1996-09-01 00:00:00 | 0.0 | 43.0 | nan | 9.0 | 0.0 | 10597.0 | 0.609 | 24.0 | f | 1.0 | 43.0 | 1.0 | Individual | nan | nan | nan | 0.0 | 2305.0 | 61086.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 17400.0 | nan | nan | nan | 2.0 | 6787.0 | 2303.0 | 82.1 | 0.0 | 0.0 | 215.0 | 178.0 | 33.0 | 5.0 | 0.0 | 33.0 | nan | 9.0 | nan | 2.0 | 3.0 | 3.0 | 4.0 | 9.0 | 11.0 | 5.0 | 12.0 | 3.0 | 9.0 | 0.0 | 0.0 | 0.0 | 1.0 | 91.7 | 50.0 | 0.0 | 0.0 | 79108.0 | 61086.0 | 12900.0 | 61708.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
2 | nan | nan | 11000.0 | 11000.0 | 11000.0 | 36 months | 0.1099 | 360.08 | B | B2 | Teacher director | 10+ years | RENT | 30000.0 | Not Verified | 2014-02-01 00:00:00 | Fully Paid | n | nan | debt_consolidation | Debt consolidation | 010xx | MA | 27.84 | 0.0 | 1994-11-01 00:00:00 | 1.0 | nan | nan | 10.0 | 0.0 | 11523.0 | 0.546 | 21.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 90.0 | 17778.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 21100.0 | nan | nan | nan | 2.0 | 1778.0 | 2200.0 | 71.8 | 0.0 | 0.0 | 133.0 | 231.0 | 6.0 | 6.0 | 0.0 | 63.0 | nan | 6.0 | nan | 0.0 | 4.0 | 9.0 | 4.0 | 9.0 | 5.0 | 9.0 | 16.0 | 9.0 | 10.0 | 0.0 | 0.0 | 0.0 | 1.0 | 100.0 | 75.0 | 0.0 | 0.0 | 35076.0 | 17778.0 | 7800.0 | 13976.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
3 | nan | nan | 15000.0 | 15000.0 | 15000.0 | 60 months | 0.1561 | 361.67 | D | D1 | Accounting Assistant | < 1 year | RENT | 50000.0 | Verified | 2014-07-01 00:00:00 | Fully Paid | n | nan | credit_card | Credit card refinancing | 076xx | NJ | 20.91 | 0.0 | 1997-07-01 00:00:00 | 0.0 | nan | 94.0 | 24.0 | 1.0 | 7884.0 | 0.276 | 31.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 34486.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 28600.0 | nan | nan | nan | 14.0 | 1499.0 | 5285.0 | 57.0 | 0.0 | 0.0 | 148.0 | 203.0 | 3.0 | 1.0 | 0.0 | 10.0 | nan | 7.0 | nan | 0.0 | 3.0 | 7.0 | 4.0 | 6.0 | 11.0 | 16.0 | 20.0 | 7.0 | 24.0 | 0.0 | 0.0 | 0.0 | 11.0 | 100.0 | 25.0 | 1.0 | 0.0 | 88501.0 | 34486.0 | 12300.0 | 59901.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
4 | nan | nan | 6250.0 | 6250.0 | 6250.0 | 36 months | 0.0949 | 200.18 | B | B2 | Full Time Active Duty Military | 10+ years | MORTGAGE | 78000.0 | Source Verified | 2014-11-01 00:00:00 | Fully Paid | n | nan | debt_consolidation | Debt consolidation | 254xx | WV | 1.14 | 0.0 | 2000-12-01 00:00:00 | 0.0 | 55.0 | nan | 6.0 | 0.0 | 3132.0 | 0.344 | 14.0 | w | 0.0 | 55.0 | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 393615.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 9100.0 | nan | nan | nan | 2.0 | 78723.0 | 2968.0 | 51.3 | 0.0 | 0.0 | 167.0 | 117.0 | 12.0 | 12.0 | 4.0 | 35.0 | 55.0 | 20.0 | 55.0 | 3.0 | 1.0 | 1.0 | 2.0 | 5.0 | 2.0 | 4.0 | 8.0 | 1.0 | 6.0 | 0.0 | 0.0 | 0.0 | 1.0 | 71.4 | 50.0 | 0.0 | 0.0 | 454355.0 | 3132.0 | 6100.0 | 0.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
5 | nan | nan | 15000.0 | 15000.0 | 15000.0 | 36 months | 0.0712 | 463.98 | A | A3 | OWN | 60000.0 | Source Verified | 2014-10-01 00:00:00 | Fully Paid | n | nan | credit_card | Credit card refinancing | 301xx | GA | 8.42 | 0.0 | 1976-11-01 00:00:00 | 0.0 | 40.0 | 20.0 | 9.0 | 2.0 | 15165.0 | 0.442 | 11.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 15165.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 34300.0 | nan | nan | nan | 3.0 | 1685.0 | 14340.0 | 51.2 | 0.0 | 0.0 | nan | 454.0 | 11.0 | 11.0 | 0.0 | 11.0 | nan | 17.0 | 40.0 | 0.0 | 4.0 | 6.0 | 5.0 | 6.0 | 0.0 | 9.0 | 11.0 | 6.0 | 9.0 | 0.0 | 0.0 | 0.0 | 1.0 | 90.9 | 40.0 | 0.0 | 2.0 | 34300.0 | 15165.0 | 29400.0 | 0.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||||
6 | nan | nan | 35000.0 | 35000.0 | 35000.0 | 36 months | 0.1398 | 1195.88 | C | C3 | Teacher | 10+ years | MORTGAGE | 91886.0 | Verified | 2014-08-01 00:00:00 | Fully Paid | n | nan | debt_consolidation | Debt consolidation | 920xx | CA | 20.97 | 0.0 | 1994-07-01 00:00:00 | 2.0 | nan | nan | 10.0 | 0.0 | 38409.0 | 0.662 | 26.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 474161.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 58000.0 | nan | nan | nan | 3.0 | 52685.0 | 14206.0 | 61.1 | 0.0 | 0.0 | 138.0 | 240.0 | 11.0 | 5.0 | 8.0 | 71.0 | nan | 1.0 | nan | 0.0 | 3.0 | 7.0 | 3.0 | 7.0 | 6.0 | 8.0 | 12.0 | 7.0 | 10.0 | 0.0 | 0.0 | 0.0 | 3.0 | 100.0 | 66.7 | 0.0 | 0.0 | 505693.0 | 58911.0 | 36500.0 | 30694.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
7 | nan | nan | 4000.0 | 4000.0 | 3950.0 | 36 months | 0.0917 | 127.52 | B | B1 | Associate Professor | 5 years | MORTGAGE | 55000.0 | Source Verified | 2014-05-01 00:00:00 | Charged Off | n | nan | credit_card | Credit card refinancing | 275xx | NC | 22.81 | 1.0 | 1993-12-01 00:00:00 | 1.0 | 21.0 | nan | 15.0 | 0.0 | 22042.0 | 0.249 | 36.0 | f | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 43231.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 88400.0 | nan | nan | nan | 2.0 | 2882.0 | 56510.0 | 27.9 | 0.0 | 0.0 | 83.0 | 244.0 | 12.0 | 7.0 | 3.0 | 65.0 | 21.0 | 1.0 | 21.0 | 0.0 | 3.0 | 5.0 | 8.0 | 20.0 | 4.0 | 13.0 | 28.0 | 5.0 | 15.0 | 0.0 | 0.0 | 0.0 | 2.0 | 97.2 | 25.0 | 0.0 | 0.0 | 111174.0 | 43231.0 | 78400.0 | 17117.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
8 | nan | nan | 10000.0 | 10000.0 | 10000.0 | 36 months | 0.0649 | 306.45 | A | A2 | owner | 10+ years | OWN | 100000.0 | Source Verified | 2014-06-01 00:00:00 | Fully Paid | n | nan | home_improvement | Home improvement | 890xx | NV | 17.88 | 0.0 | 1991-02-01 00:00:00 | 0.0 | nan | nan | 6.0 | 0.0 | 5414.0 | 0.722 | 23.0 | w | 0.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 0.0 | 153711.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 7500.0 | nan | nan | nan | 2.0 | 25619.0 | 99.0 | 98.0 | 0.0 | 0.0 | 127.0 | 280.0 | 56.0 | 7.0 | 7.0 | 107.0 | nan | 7.0 | nan | 0.0 | 1.0 | 2.0 | 1.0 | 5.0 | 9.0 | 2.0 | 7.0 | 2.0 | 6.0 | 0.0 | 0.0 | 0.0 | 2.0 | 100.0 | 100.0 | 0.0 | 0.0 | 178016.0 | 83729.0 | 5000.0 | 95922.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan | ||||||||
9 | nan | nan | 15000.0 | 15000.0 | 15000.0 | 36 months | 0.1099 | 491.01 | B | B3 | Sr Business Analyst | 10+ years | MORTGAGE | 78200.0 | Not Verified | 2014-05-01 00:00:00 | Fully Paid | n | nan | home_improvement | Home improvement | 786xx | TX | 14.01 | 1.0 | 1995-09-01 00:00:00 | 1.0 | 18.0 | nan | 13.0 | 0.0 | 7559.0 | 0.411 | 28.0 | w | 1.0 | nan | 1.0 | Individual | nan | nan | nan | 0.0 | 735.0 | 270873.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 19600.0 | nan | nan | nan | 3.0 | 22573.0 | 7306.0 | 48.9 | 0.0 | 0.0 | 145.0 | 224.0 | 12.0 | 2.0 | 4.0 | 12.0 | nan | 2.0 | 18.0 | 0.0 | 4.0 | 6.0 | 5.0 | 8.0 | 8.0 | 10.0 | 15.0 | 6.0 | 13.0 | 0.0 | 0.0 | 0.0 | 3.0 | 96.2 | 40.0 | 0.0 | 0.0 | 318951.0 | 32615.0 | 14300.0 | 29351.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | N | nan | nan | nan | nan | nan | nan | nan |
from h2o.automl import H2OAutoML
# Identify predictors and response
x = train.columns
y = "loan_status"
x.remove(y)
# For binary classification, response should be a factor
train[y] = train[y].asfactor()
test[y] = test[y].asfactor()
# Run AutoML
aml = H2OAutoML(project_name='LP',
                max_models=1,           # train a single base model (demo only)
                balance_classes=True,   # over/under-sample the training data to balance class counts
                max_runtime_secs=3600,  # 1-hour cap (demo only); with too short a limit no model may finish training
                seed=1234)              # set a seed for reproducibility
aml.train(x=x, y=y, training_frame=train)
AutoML progress: |████████████████████████████████████████████████████████| 100%
lb = aml.leaderboard
lb.head(rows=lb.nrows) # Print all rows instead of default (10 rows)
model_id | auc | logloss | mean_per_class_error | rmse | mse |
---|---|---|---|---|---|
XGBoost_1_AutoML_20200922_122203 | 0.705681 | 0.4271 | 0.49588 | 0.366057 | 0.133997 |
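The leaderboard reports a `mean_per_class_error` close to 0.5 alongside an AUC of about 0.71. Mean per-class error averages the error rate of each class, so in a two-class problem a value near 0.5 is consistent with one class being misclassified most of the time even when overall accuracy looks acceptable. The following is a minimal sketch of that calculation; the confusion counts are made up for illustration and are not taken from this notebook's model.

```python
def mean_per_class_error(confusion):
    """Average the per-class error rates of a confusion matrix.

    `confusion[actual][predicted]` holds the number of rows with that
    actual label and that predicted label.
    """
    errors = []
    for actual, row in confusion.items():
        total = sum(row.values())
        wrong = total - row.get(actual, 0)
        errors.append(wrong / total)
    return sum(errors) / len(errors)


# Hypothetical counts, chosen only to illustrate an imbalanced outcome.
confusion = {
    "Charged Off": {"Charged Off": 10, "Fully Paid": 990},
    "Fully Paid": {"Charged Off": 20, "Fully Paid": 3980},
}

mpce = mean_per_class_error(confusion)
correct = sum(confusion[label][label] for label in confusion)
total_rows = sum(sum(row.values()) for row in confusion.values())
accuracy = correct / total_rows

print("mean per-class error:", round(mpce, 4))
print("overall accuracy:", round(accuracy, 4))
```

With these illustrative counts, almost every "Charged Off" row is missed, yet overall accuracy stays high because "Fully Paid" dominates the data.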
test_pc = aml.predict(test)
test_pc
xgboost prediction progress: |████████████████████████████████████████████| 100%
predict | Charged Off | Fully Paid |
---|---|---|
Fully Paid | 0.0300723 | 0.969928 |
Fully Paid | 0.038023 | 0.961977 |
Fully Paid | 0.0306047 | 0.969395 |
Fully Paid | 0.242732 | 0.757268 |
Fully Paid | 0.0589932 | 0.941007 |
Fully Paid | 0.133116 | 0.866884 |
Fully Paid | 0.34577 | 0.65423 |
Charged Off | 0.432134 | 0.567866 |
Fully Paid | 0.379319 | 0.620681 |
Fully Paid | 0.0814494 | 0.918551 |
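Note that the eighth row above is labeled "Charged Off" even though its "Fully Paid" probability (0.567866) is above 0.5. H2O binary classifiers usually assign the label by comparing the probability against a tuned threshold (max F1 by default, as far as the H2O documentation describes) rather than against 0.5. The sketch below mimics that rule in plain Python; the threshold of 0.6 is an assumption picked only because it lies between the probabilities of the rows labeled differently above, not the value the model actually uses.

```python
def label_from_probability(p_fully_paid, threshold):
    """Return the predicted label for a given P(Fully Paid) and threshold."""
    return "Fully Paid" if p_fully_paid >= threshold else "Charged Off"


# Hypothetical threshold: the real one is chosen by H2O and is not shown above.
THRESHOLD = 0.6

# P(Fully Paid) values copied from four rows of the prediction table.
probabilities = [0.969928, 0.65423, 0.567866, 0.620681]
labels = [label_from_probability(p, THRESHOLD) for p in probabilities]

for p, label in zip(probabilities, labels):
    print(p, "->", label)
```

If the service needs a different trade-off between missed defaults and rejected good loans, the same comparison can be applied to the returned probability column with a threshold of your choosing.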
%%writefile loan_prediction.py
import h2o
from bentoml import api, env, artifacts, BentoService
from bentoml.frameworks.h2o import H2oModelArtifact
from bentoml.adapters import DataframeInput
@env(
    pip_packages=['h2o==3.24.0.2', 'pandas'],
    conda_channels=['h2oai'],
    conda_dependencies=['h2o==3.24.0.2']
)
@artifacts([H2oModelArtifact('model')])
class LoanPrediction(BentoService):
    @api(input=DataframeInput(), batch=True)
    def predict(self, df):
        # Convert the incoming pandas DataFrame to an H2OFrame, treating 'NaN' strings as missing values
        h2o_frame = h2o.H2OFrame(df, na_strings=['NaN'])
        predictions = self.artifacts.model.predict(h2o_frame)
        # Convert the H2OFrame of predictions back to a pandas DataFrame for the response
        return predictions.as_data_frame()
Overwriting loan_prediction.py
# 1) import the custom BentoService defined above
from loan_prediction import LoanPrediction
# 2) `pack` it with required artifacts
bentoml_svc = LoanPrediction()
bentoml_svc.pack('model', aml.leader)
# 3) save your BentoService
saved_path = bentoml_svc.save()
[2020-09-22 12:39:14,554] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle. [2020-09-22 12:39:14,867] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed. [2020-09-22 12:39:14,870] WARNING - pip package requirement pandas already exist [2020-09-22 12:39:14,873] WARNING - pip package requirement h2o already exist [2020-09-22 12:39:15,844] INFO - Detected non-PyPI-released BentoML installed, copying local BentoML modulefiles to target saved bundle path..
warning: no previously-included files matching '*~' found anywhere in distribution warning: no previously-included files matching '*.pyo' found anywhere in distribution warning: no previously-included files matching '.git' found anywhere in distribution warning: no previously-included files matching '.ipynb_checkpoints' found anywhere in distribution warning: no previously-included files matching '__pycache__' found anywhere in distribution no previously-included directories found matching 'e2e_tests' no previously-included directories found matching 'tests' no previously-included directories found matching 'benchmark'
UPDATING BentoML-0.9.0rc0+3.gcebf2015/bentoml/_version.py set BentoML-0.9.0rc0+3.gcebf2015/bentoml/_version.py to '0.9.0.pre+3.gcebf2015' [2020-09-22 12:39:19,606] INFO - BentoService bundle 'LoanPrediction:20200922123915_EEBBD2' saved to: /Users/bozhaoyu/bentoml/repository/LoanPrediction/20200922123915_EEBBD2
To start a REST API model server with the BentoService saved above, use the bentoml serve command:
!bentoml serve LoanPrediction:latest
[2020-09-22 17:48:06,148] INFO - Getting latest version LoanPrediction:20200922123915_EEBBD2 [2020-09-22 17:48:06,148] INFO - Starting BentoML API server in development mode.. [2020-09-22 17:48:06,386] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle. [2020-09-22 17:48:06,406] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.9.0.pre, but loading from BentoML version 0.9.0.pre+3.gcebf2015 [2020-09-22 17:48:06,777] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed. Checking whether there is an H2O instance running at http://localhost:54321 . connected. Warning: Your H2O cluster version is too old (1 year, 5 months and 5 days)! Please download and install the latest version from http://h2o.ai/download/ -------------------------- --------------------------------------------------- H2O cluster uptime: 5 hours 27 mins H2O cluster timezone: America/Los_Angeles H2O data parsing timezone: UTC H2O cluster version: 3.24.0.2 H2O cluster version age: 1 year, 5 months and 5 days !!! 
H2O cluster name: H2O_from_python_bozhaoyu_392ekt H2O cluster total nodes: 1 H2O cluster free memory: 3.906 Gb H2O cluster total cores: 8 H2O cluster allowed cores: 8 H2O cluster status: locked, healthy H2O connection url: http://localhost:54321 H2O connection proxy: H2O internal security: False H2O API Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4 Python version: 3.7.3 final -------------------------- --------------------------------------------------- [2020-09-22 17:48:08,298] WARNING - pip package requirement pandas already exist [2020-09-22 17:48:08,298] WARNING - pip package requirement h2o already exist * Serving Flask app "LoanPrediction" (lazy loading) * Environment: production WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead. * Debug mode: off INFO:werkzeug: * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit) Parse progress: |█████████████████████████████████████████████████████████| 100% xgboost prediction progress: |████████████████████████████████████████████| 100% /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_loan_status': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'initial_list_status': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'mths_since_last_delinq': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'mths_since_last_record': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: 
Test/Validation dataset is missing column 'hardship_amount': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_start_date': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_end_date': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'payment_plan_start_date': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_dpd': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'orig_projected_additional_accrued_interest': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_payoff_balance_amount': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_last_payment_amount': substituting in a column of NaN warnings.warn(w) INFO:werkzeug:127.0.0.1 - - [22/Sep/2020 17:48:11] "POST /predict HTTP/1.1" 400 - ^C H2O session _sid_9aec closed.
If you are running this notebook from Google Colab, you can start the dev server with the --run-with-ngrok
option to gain access to the API through a public endpoint managed by ngrok:
!bentoml serve LoanPrediction:latest --run-with-ngrok
Open http://127.0.0.1:5000 in your browser to see more information about the REST API server. To send a prediction request from the command line, use curl:
curl -i \
--request POST \
--header "Content-Type: text/csv" \
--data @sample_data.csv \
localhost:5000/predict
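The same request can also be sent from Python. The snippet below is a minimal sketch using only the standard library; the helper names `build_predict_request` and `send_predict_request` are hypothetical, and the URL assumes the dev server started above is still listening on port 5000. The `requests` package imported earlier in this notebook would work equally well.

```python
import urllib.request


def build_predict_request(csv_text, url="http://localhost:5000/predict"):
    """Build (but do not send) a POST request carrying CSV rows.

    Mirrors the curl call above: the body is the raw CSV text and the
    Content-Type header is text/csv. The default URL is an assumption.
    """
    return urllib.request.Request(
        url,
        data=csv_text.encode("utf-8"),
        headers={"Content-Type": "text/csv"},
        method="POST",
    )


def send_predict_request(csv_text, url="http://localhost:5000/predict", timeout=30):
    """Send the request and return (HTTP status, response body as text)."""
    request = build_predict_request(csv_text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Usage, with the API server running:
#   with open("sample_data.csv") as f:
#       status, body = send_predict_request(f.read())
#   print(status, body)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without a running server.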
A common way to distribute this model API server for production deployment is via Docker containers, and BentoML provides a convenient way to build them.
Note that Docker is not available in Google Colab. To try out containerization, download this notebook and run it locally.
If you already have Docker configured, run the following command to produce a Docker image serving the LoanPrediction prediction service created above:
!bentoml containerize LoanPrediction:latest
[2020-09-22 17:52:45,149] INFO - Getting latest version LoanPrediction:20200922123915_EEBBD2 Found Bento: /Users/bozhaoyu/bentoml/repository/LoanPrediction/20200922123915_EEBBD2 [2020-09-22 17:52:45,190] WARNING - Using BentoML installed in `editable` model, the local BentoML repository including all code changes will be packaged together with saved bundle created, under the './bundled_pip_dependencies' directory of the saved bundle. [2020-09-22 17:52:45,208] WARNING - Saved BentoService bundle version mismatch: loading BentoService bundle create with BentoML version 0.9.0.pre, but loading from BentoML version 0.9.0.pre+3.gcebf2015 Tag not specified, using tag parsed from BentoService: 'loanprediction:20200922123915_EEBBD2' Building Docker image loanprediction:20200922123915_EEBBD2 from LoanPrediction:latest -we in here processed docker file (None, None) root in create archive /Users/bozhaoyu/bentoml/repository/LoanPrediction/20200922123915_EEBBD2 ['Dockerfile', 'LoanPrediction', 'LoanPrediction/__init__.py', 'LoanPrediction/__pycache__', 'LoanPrediction/__pycache__/loan_prediction.cpython-37.pyc', 'LoanPrediction/artifacts', 'LoanPrediction/artifacts/__init__.py', 'LoanPrediction/artifacts/model', 'LoanPrediction/bentoml.yml', 'LoanPrediction/loan_prediction.py', 'MANIFEST.in', 'README.md', 'bentoml-init.sh', 'bentoml.yml', 'bundled_pip_dependencies', 'bundled_pip_dependencies/BentoML-0.9.0rc0+3.gcebf2015.tar.gz', 'docker-entrypoint.sh', 'environment.yml', 'python_version', 'requirements.txt', 'setup.py'] about to build about to upgrade params check each param and update if use config proxy if buildargs if shmsize if labels if cache from if target if network_mode if squash if extra hosts is not None if platform is not None if isolcation is not None if context is not None setting auth {'Content-Type': 'application/tar'} \docker build <tempfile._TemporaryFileWrapper object at 0x7fa3e54b4da0> {'t': 'loanprediction:20200922123915_EEBBD2', 'remote': None, 'q': False, 
'nocache': False, 'rm': False, 'forcerm': False, 'pull': False, 'dockerfile': (None, None)} |docker response <Response [200]> context closes \print responses Step 1/15 : FROM bentoml/model-server:0.9.0.pre ---> a25066aa8b0e Step 2/15 : ARG EXTRA_PIP_INSTALL_ARGS= ---> Using cache ---> 315719b8980e Step 3/15 : ENV EXTRA_PIP_INSTALL_ARGS $EXTRA_PIP_INSTALL_ARGS ---> Using cache ---> a3b6c8107d94 Step 4/15 : COPY environment.yml requirements.txt setup.sh* bentoml-init.sh python_version* /bento/ ---> Using cache ---> 8a93eb1a85af Step 5/15 : WORKDIR /bento ---> Using cache ---> 22714b0bba7e Step 6/15 : RUN chmod +x /bento/bentoml-init.sh ---> Using cache ---> 44c32c282581 Step 7/15 : RUN if [ -f /bento/bentoml-init.sh ]; then bash -c /bento/bentoml-init.sh; fi ---> Running in dde5e5814405 |+++ dirname /bento/bentoml-init.sh ++ cd /bento ++ pwd -P + SAVED_BUNDLE_PATH=/bento + cd /bento + '[' -f ./setup.sh ']' + '[' -f ./python_version ']' ++ cat ./python_version + PY_VERSION_SAVED=3.7.3 + DESIRED_PY_VERSION=3.7 ++ python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")' \+ CURRENT_PY_VERSION=3.7 + [[ 3.7 == \3\.\7 ]] + echo 'Python Version in docker base image 3.7 matches requirement python=3.7. Skipping.' + command -v conda + echo 'Updating conda base environment with environment.yml' + conda env update -n base -f ./environment.yml Python Version in docker base image 3.7 matches requirement python=3.7. Skipping. Updating conda base environment with environment.yml \Collecting package metadata (repodata.json): ...working... \done Solving environment: ...working... 
Examining python=3.7: 0%| | 0/5 [00:00<?, ?it/s] Examining @/linux-64::__glibc==2.28=0: 40%|████ | 2/5 [00:00<00:01, 2.04it/s] Examining h2o==3.24.0.2: 40%|████ | 2/5 [00:00<00:01, 2.04it/s] Examining pip: 80%|████████ | 4/5 [00:01<00:00, 2.78it/s] Examining openjdk: 80%|████████ | 4/5 [00:10<00:00, 2.78it/s] Examining openjdk: 100%|██████████| 5/5 [00:10<00:00, 3.19s/it] Examining conflict for python h2o pip: 0%| | 0/5 [00:00<?, ?it/s] Examining conflict for python pip: 40%|████ | 2/5 [00:00<00:00, 7.88it/s] Examining conflict for python openjdk: 60%|██████ | 3/5 [00:00<00:00, 8.36it/s] Examining conflict for python openjdk h2o pip: 80%|████████ | 4/5 [00:00<00:00, 8.33it/s] Examining conflict for h2o pip: 80%|████████ | 4/5 [00:00<00:00, 8.33it/s] Examining conflict for h2o pip: 100%|██████████| 5/5 [00:00<00:00, 5.08it/s] Found conflicts! Looking for incompatible packages. This can take several minutes. Press CTRL-C to abort. failed Solving environment: ...working... Examining python=3.7: 0%| | 0/5 [00:00<?, ?it/s] Examining @/linux-64::__glibc==2.28=0: 40%|████ | 2/5 [00:01<00:01, 1.87it/s] Examining h2o==3.24.0.2: 40%|████ | 2/5 [00:01<00:01, 1.87it/s] Examining pip: 80%|████████ | 4/5 [00:01<00:00, 2.56it/s] Examining openjdk: 100%|██████████| 5/5 [00:11<00:00, 3.29s/it] Examining conflict for python h2o pip: 0%| | 0/5 [00:00<?, ?it/s] Examining conflict for python pip: 40%|████ | 2/5 [00:00<00:00, 7.38it/s] Examining conflict for python openjdk: 60%|██████ | 3/5 [00:00<00:00, 7.79it/s] Examining conflict for python openjdk h2o pip: 80%|████████ | 4/5 [00:00<00:00, 7.81it/s] Examining conflict for h2o pip: 80%|████████ | 4/5 [00:00<00:00, 7.81it/s] Examining conflict for h2o pip: 100%|██████████| 5/5 [00:00<00:00, 4.75it/s] Found conflicts! Looking for incompatible packages. This can take several minutes. Press CTRL-C to abort. 
failed UnsatisfiableError: The following specifications were found to be incompatible with the existing python installation in your environment: Specifications: - h2o==3.24.0.2 -> python[version='>=2.7,<2.8.0a0|>=3.5,<3.6.0a0|>=3.6,<3.7.0a0'] Your python: python=3.7 If python is on the left-most side of the chain, that's the version you've asked for. When python appears to the right, that indicates that the thing on the left is somehow not available for the python version you are constrained to. Note that conda will not change your python version to a different minor version unless you explicitly specify that. The following specifications were found to be incompatible with each other: Output in format: Requested package -> Available versions Package setuptools conflicts for: pip -> setuptools python=3.7 -> pip -> setuptools Package _libgcc_mutex conflicts for: python=3.7 -> libgcc-ng[version='>=7.5.0'] -> _libgcc_mutex[version='*|0.1',build='main|conda_forge'] openjdk -> libgcc-ng[version='>=7.5.0'] -> _libgcc_mutex[version='*|0.1',build='main|conda_forge'] Package certifi conflicts for: h2o==3.24.0.2 -> requests[version='>=2.10'] -> certifi[version='>=2017.4.17'] pip -> setuptools -> certifi[version='>=2016.09|>=2016.9.26'] Package libstdcxx-ng conflicts for: python=3.7 -> libstdcxx-ng[version='>=4.9|>=7.3.0|>=7.5.0|>=7.2.0'] h2o==3.24.0.2 -> python[version='>=3.6,<3.7.0a0'] -> libstdcxx-ng[version='>=4.9|>=7.3.0|>=7.5.0|>=7.2.0'] pip -> python[version='>=3'] -> libstdcxx-ng[version='>=4.9|>=7.3.0|>=7.5.0|>=7.2.0'] openjdk -> libstdcxx-ng[version='>=7.3.0|>=7.5.0'] Error: bentoml-cli containerize failed: The command '/bin/sh -c if [ -f /bento/bentoml-init.sh ]; then bash -c /bento/bentoml-init.sh; fi' returned a non-zero code: 1
!docker run -p 5000:5000 loanprediction:20200922123915_EEBBD2
bentoml.load is the API for loading a BentoML packaged model in Python:
import pandas as pd
loaded_bentoml_svc = bentoml.load(saved_path)
sample_data = pd.read_csv('sample_data.csv')
result = loaded_bentoml_svc.predict(sample_data)
print(result)
[2020-02-24 17:21:25,109] WARNING - BentoML local changes detected - Local BentoML repository including all code changes will be bundled together with the BentoService bundle. When used with docker, the base docker image will be default to same version as last PyPI release at version: 0.6.2. You can also force bentoml to use a specific version for deploying your BentoService bundle, by setting the config 'core/bentoml_deploy_version' to a pinned version or your custom BentoML on github, e.g.:'bentoml_deploy_version = git+https://github.com/{username}/bentoml.git@{branch}' [2020-02-24 17:21:25,121] WARNING - Saved BentoService bundle version mismatch: loading BentoServie bundle create with BentoML version 0.6.2, but loading from BentoML version 0.6.2+16.g7795c2f [2020-02-24 17:21:25,122] WARNING - Module `loan_prediction` already loaded, using existing imported module. Checking whether there is an H2O instance running at http://localhost:54321 . connected. Warning: Your H2O cluster version is too old (10 months and 7 days)! Please download and install the latest version from http://h2o.ai/download/
H2O cluster uptime: | 2 hours 30 mins |
H2O cluster timezone: | America/Los_Angeles |
H2O data parsing timezone: | UTC |
H2O cluster version: | 3.24.0.2 |
H2O cluster version age: | 10 months and 7 days !!! |
H2O cluster name: | H2O_from_python_bozhaoyu_7bamxr |
H2O cluster total nodes: | 1 |
H2O cluster free memory: | 3.805 Gb |
H2O cluster total cores: | 8 |
H2O cluster allowed cores: | 8 |
H2O cluster status: | locked, healthy |
H2O connection url: | http://localhost:54321 |
H2O connection proxy: | None |
H2O internal security: | False |
H2O API Extensions: | Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4 |
Python version: | 3.7.3 final |
[2020-02-24 17:21:25,414] WARNING - BentoML local changes detected - Local BentoML repository including all code changes will be bundled together with the BentoService bundle. When used with docker, the base docker image will be default to same version as last PyPI release at version: 0.6.2. You can also force bentoml to use a specific version for deploying your BentoService bundle, by setting the config 'core/bentoml_deploy_version' to a pinned version or your custom BentoML on github, e.g.:'bentoml_deploy_version = git+https://github.com/{username}/bentoml.git@{branch}' Parse progress: |█████████████████████████████████████████████████████████| 100% xgboost prediction progress: |████████████████████████████████████████████| 100% predict Charged Off Fully Paid 0 Charged Off 0.436739 0.563261 1 Fully Paid 0.056414 0.943586
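In the output above, row 0 is labelled Charged Off even though its Charged Off probability is below 0.5, which suggests the model applies its own decision threshold rather than 0.5. If you want to apply a different cutoff, the following is a minimal sketch. It assumes the result is a pandas DataFrame with the columns shown above, so that `result.to_dict("records")` yields rows like the ones hard-coded here; the helper `relabel` is hypothetical and not part of BentoML or H2O.

```python
def relabel(rows, threshold=0.5, positive="Charged Off", negative="Fully Paid"):
    """Label each row by comparing P(positive) against a caller-chosen threshold."""
    return [positive if row[positive] >= threshold else negative for row in rows]


# Rows copied from the printed prediction output above.
rows = [
    {"predict": "Charged Off", "Charged Off": 0.436739, "Fully Paid": 0.563261},
    {"predict": "Fully Paid", "Charged Off": 0.056414, "Fully Paid": 0.943586},
]

labels_at_half = relabel(rows)                 # cutoff of 0.5
labels_at_04 = relabel(rows, threshold=0.4)    # a more cautious cutoff
print(labels_at_half)
print(labels_at_04)
```

A lower threshold flags more loans as likely to be charged off, trading more false alarms for fewer missed defaults.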
The BentoML CLI supports loading and running a packaged model directly from the command line. With the DataframeInput adapter, the command can read input DataFrame data from a CLI argument or from a local CSV or JSON file:
!bentoml run LoanPrediction:latest predict --input-file sample_data.csv
[2020-02-24 17:30:05,013] INFO - Getting latest version LoanPrediction:20200224153935_977ED8 [2020-02-24 17:30:05,014] WARNING - BentoML local changes detected - Local BentoML repository including all code changes will be bundled together with the BentoService bundle. When used with docker, the base docker image will be default to same version as last PyPI release at version: 0.6.2. You can also force bentoml to use a specific version for deploying your BentoService bundle, by setting the config 'core/bentoml_deploy_version' to a pinned version or your custom BentoML on github, e.g.:'bentoml_deploy_version = git+https://github.com/{username}/bentoml.git@{branch}' [2020-02-24 17:30:05,028] WARNING - Saved BentoService bundle version mismatch: loading BentoServie bundle create with BentoML version 0.6.2, but loading from BentoML version 0.6.2+16.g7795c2f [2020-02-24 17:30:05,114] WARNING - BentoML local changes detected - Local BentoML repository including all code changes will be bundled together with the BentoService bundle. When used with docker, the base docker image will be default to same version as last PyPI release at version: 0.6.2. You can also force bentoml to use a specific version for deploying your BentoService bundle, by setting the config 'core/bentoml_deploy_version' to a pinned version or your custom BentoML on github, e.g.:'bentoml_deploy_version = git+https://github.com/{username}/bentoml.git@{branch}' Checking whether there is an H2O instance running at http://localhost:54321 . connected. Warning: Your H2O cluster version is too old (10 months and 7 days)! Please download and install the latest version from http://h2o.ai/download/ -------------------------- --------------------------------------------------- H2O cluster uptime: 2 hours 38 mins H2O cluster timezone: America/Los_Angeles H2O data parsing timezone: UTC H2O cluster version: 3.24.0.2 H2O cluster version age: 10 months and 7 days !!! 
H2O cluster name: H2O_from_python_bozhaoyu_7bamxr H2O cluster total nodes: 1 H2O cluster free memory: 3.805 Gb H2O cluster total cores: 8 H2O cluster allowed cores: 8 H2O cluster status: locked, healthy H2O connection url: http://localhost:54321 H2O connection proxy: H2O internal security: False H2O API Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4 Python version: 3.7.3 final -------------------------- --------------------------------------------------- [2020-02-24 17:30:06,948] WARNING - BentoML local changes detected - Local BentoML repository including all code changes will be bundled together with the BentoService bundle. When used with docker, the base docker image will be default to same version as last PyPI release at version: 0.6.2. You can also force bentoml to use a specific version for deploying your BentoService bundle, by setting the config 'core/bentoml_deploy_version' to a pinned version or your custom BentoML on github, e.g.:'bentoml_deploy_version = git+https://github.com/{username}/bentoml.git@{branch}' Parse progress: |█████████████████████████████████████████████████████████| 100% xgboost prediction progress: |████████████████████████████████████████████| 100% /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset column 'emp_title' has levels not trained on: [Sr, Project Coordinator] warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset column 'hardship_reason' has levels not trained on: [nan] warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset column 'hardship_loan_status' has levels not trained on: [nan] warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset column 'hardship_status' has levels not trained on: [nan] warnings.warn(w) 
/usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'initial_list_status': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'mths_since_last_delinq': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'mths_since_last_record': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_amount': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_start_date': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_end_date': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'payment_plan_start_date': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_dpd': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'orig_projected_additional_accrued_interest': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 
'hardship_payoff_balance_amount': substituting in a column of NaN warnings.warn(w) /usr/local/anaconda3/envs/dev-py3/lib/python3.7/site-packages/h2o/job.py:69: UserWarning: Test/Validation dataset is missing column 'hardship_last_payment_amount': substituting in a column of NaN warnings.warn(w) predict Charged Off Fully Paid 0 Charged Off 0.436739 0.563261 1 Fully Paid 0.056414 0.943586 H2O session _sid_80c4 closed.
If you are on a small team with limited engineering or DevOps resources, try out automated deployment with the BentoML CLI, which currently supports AWS Lambda, AWS SageMaker, and Azure Functions:
If the cloud platform you are working with is not on the list above, try out these step-by-step guides on manually deploying a BentoML packaged model to cloud platforms:
Lastly, if you have a DevOps or ML Engineering team that operates a Kubernetes or OpenShift cluster, use the following guides as references for implementing your deployment strategy: