Note: All figures are in $'000 unless otherwise stated.
Let's import some libraries first...
import pandas
from pandas.plotting import scatter_matrix
from sklearn import datasets
from sklearn import model_selection
from sklearn import linear_model
# models
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt
Load the past few years of relevant data.
dataset = pandas.read_csv("data/ccp-consumer-lending-full-year.csv")
print (dataset)
  period  revenue   npbt  gross_book_average  net_lending
0   FY14    19104  -3522             41465.5        49130
1   FY15    35826   1401             81343.0        51063
2   FY16    52418   8709            117278.5        55077
3   FY17    66374  17596            147714.5        46184
4   FY18    79336  23028            171786.0        52405
Let us create a linear regression model with the whole dataset.
array = dataset.values
X = array[:,3:5] # features = gross_book_average, net_lending
Y = array[:,2] # target = npbt (net profit before tax)
model = LinearRegression()
model.fit(X, Y) # train model
# the model's linear regression coefficients
print("Coefficients: \t%s" % model.coef_)
print("Intercept: \t%s" % model.intercept_)
print("\nThe equation would look like...")
print("p = %sb + %sl + %s" % (model.coef_[0], model.coef_[1], model.intercept_))
Coefficients: 	[ 0.2111212 -0.28462548]
Intercept: 	265.191305252

The equation would look like...
p = 0.211121197019b + -0.284625478565l + 265.191305252
Where
p = Net profit before tax (npbt)
b = Average gross loan book (gross_book_average)
l = Net lending for the period (net_lending)
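As a quick sanity check, the fitted equation can be evaluated by hand against the five historical rows. This is only a minimal sketch: it hard-codes the coefficients and the data printed above, and the helper name `npbt_from_equation` is made up for illustration rather than taken from the original notebook.

```python
# Coefficients copied from the model output above (all figures in $'000).
COEF_BOOK = 0.211121197019      # multiplies b, the average gross loan book
COEF_LENDING = -0.284625478565  # multiplies l, net lending for the period
INTERCEPT = 265.191305252


def npbt_from_equation(b, l):
    """Evaluate p = COEF_BOOK*b + COEF_LENDING*l + INTERCEPT (hypothetical helper)."""
    return COEF_BOOK * b + COEF_LENDING * l + INTERCEPT


# (period, actual npbt, gross_book_average, net_lending) from the dataset printed above
history = [
    ("FY14", -3522, 41465.5, 49130),
    ("FY15", 1401, 81343.0, 51063),
    ("FY16", 8709, 117278.5, 55077),
    ("FY17", 17596, 147714.5, 46184),
    ("FY18", 23028, 171786.0, 52405),
]

# Compare what the equation gives with what was actually reported each year.
for period, actual, b, l in history:
    fitted = npbt_from_equation(b, l)
    print("%s  actual=%8.0f  fitted=%8.0f  residual=%8.0f"
          % (period, actual, fitted, actual - fitted))
```

Keep in mind that the model has three fitted parameters and only five observations, so a close in-sample fit says little about how well it will forecast.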
Assumptions:
Gross loan book to end the year at $199.896m (long story on how I got to something so specific)
Average gross loan book will be $191.496m
Net lending will be $50m, at the upper end of the forecast range. Quoting a high number here will actually reduce NPBT, because the net_lending coefficient in the model is negative.
gross_book_average = 191496
net_lending = 50000
npbt = model.predict([[gross_book_average, net_lending]])[0]
print("NPBT = $%sm" % (npbt/1000))
print("NPAT = $%sm" % (npbt/1000 * 0.7)) # 0.7 = 1 - tax rate, i.e. assumes 30% tax
NPBT = $26.4627821213m
NPAT = $18.5239474849m
The NPAT estimate of about $18.5m sits inside the $17-19m range forecast by management, so our model is not crazy bad!
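The step from pre-tax to after-tax profit, and the comparison with guidance, can be spelled out as a small sketch. The 30% tax rate is an assumption inferred from the 0.7 multiplier used in the code above, and the variable names are illustrative only.

```python
# All figures in $'000.
TAX_RATE = 0.30  # assumption: implied by the 0.7 multiplier used in the notebook

npbt = 26462.7821213  # model prediction printed above
npat = npbt * (1 - TAX_RATE)

# Management's NPAT guidance range quoted in the text: $17m to $19m.
guidance_low, guidance_high = 17000, 19000
inside_guidance = guidance_low <= npat <= guidance_high

# The same range expressed before tax, under the same assumed tax rate.
npbt_low = guidance_low / (1 - TAX_RATE)
npbt_high = guidance_high / (1 - TAX_RATE)

print("NPAT = $%.2fm, inside guidance: %s" % (npat / 1000, inside_guidance))
print("Implied NPBT range = $%.2fm to $%.2fm" % (npbt_low / 1000, npbt_high / 1000))
```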
The more net lending the company completes, the lower its reported net profit, because it provisions for expected losses upfront. As a result, NPAT is understated for as long as the loan book keeps growing. So what happens to NPAT when the loan book stops growing?
net_lending = gross_book_average * 0.1734
print("\nNet Lending Assumption = %s\n" % net_lending)
npbt_zero_growth = model.predict([[gross_book_average, net_lending]])[0]
print("NPBT: $%sm" % (npbt_zero_growth / 1000))
print("NPAT: $%sm" % (npbt_zero_growth * 0.7 / 1000))
print("\nNPBT buffer: $%sm" % ((npbt_zero_growth - npbt) / 1000)) # pre-tax difference

Net Lending Assumption = 33205.4064

NPBT: $31.242951362m
NPAT: $21.8700659534m

NPBT buffer: $4.7801692407m
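Because the model is linear and the average gross loan book is held fixed, the buffer comes entirely from the net lending term. The sketch below reproduces it from the printed coefficient alone, then applies the same 0.7 after-tax factor used earlier. That factor is an assumption carried over from the code above, and the variable names are illustrative.

```python
# All figures in $'000.
COEF_LENDING = -0.284625478565  # net_lending coefficient from the model output
AFTER_TAX_FACTOR = 0.7          # assumption: same multiplier the notebook uses for NPAT

gross_book_average = 191496
net_lending_base = 50000                               # growth scenario
net_lending_zero_growth = gross_book_average * 0.1734  # zero-growth scenario from the notebook

# Only the net lending input changes, so the difference in predicted NPBT
# is the coefficient times the change in net lending.
npbt_buffer = COEF_LENDING * (net_lending_zero_growth - net_lending_base)
npat_buffer = npbt_buffer * AFTER_TAX_FACTOR

print("NPBT buffer: $%.3fm" % (npbt_buffer / 1000))
print("NPAT buffer: $%.3fm" % (npat_buffer / 1000))
```

The pre-tax figure matches the buffer printed above, which is the difference between two NPBT predictions. The after-tax equivalent is smaller by the assumed tax factor.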