%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import numpy as np
import pandas as pd
from lifelines.fitters.piecewise_exponential_regression_fitter import PiecewiseExponentialRegressionFitter
from lifelines import *
from lifelines.datasets import load_regression_dataset
from lifelines.generate_datasets import piecewise_exponential_survival_data
# this code can be skipped
# generating piecewise exponential data to look like a monthly churn curve.
N, d = 2000, 2
# some numbers take from http://statwonk.com/parametric-survival.html
breakpoints = (1, 31, 34, 62, 65, 93, 96)
betas = np.array(
[
[1.0, -0.2, np.log(15)],
[5.0, -0.4, np.log(333)],
[9.0, -0.6, np.log(18)],
[5.0, -0.8, np.log(500)],
[2.0, -1.0, np.log(20)],
[1.0, -1.2, np.log(500)],
[1.0, -1.4, np.log(20)],
[1.0, -3.6, np.log(250)],
]
)
X = 0.1 * np.random.exponential(size=(N, d))
X = np.c_[X, np.ones(N)]
T = np.empty(N)
for i in range(N):
lambdas = np.exp(-betas.dot(X[i, :]))
T[i] = piecewise_exponential_survival_data(1, breakpoints, lambdas)[0]
T_censor = np.minimum(T.mean() * np.random.exponential(size=N), 110) # 110 is the end of observation, eg. current time.
df = pd.DataFrame(X[:, :-1], columns=['var1', 'var2'])
df["T"] = np.minimum(T, T_censor)
df["E"] = T <= T_censor
df.head()
kmf = KaplanMeierFitter().fit(df['T'], df['E'])
kmf.plot(figsize=(11,6))
To borrow a term from finance, we clearly have different regimes that a customer goes through: periods of low churn and periods of high churn, both of which are predictable.
Let's fit a piecewise hazard model to this curve. Since we have baseline information, we can fit a regression model. For simplicity, let's assume that a customer's hazard is constant in each period, however it varies over each customer (heterogenity in customers).
Our hazard model looks like¹: $$ h(t\;|\;x) = \begin{cases} \lambda_0(x)^{-1}, & t \le \tau_0 \\ \lambda_1(x)^{-1} & \tau_0 < t \le \tau_1 \\ \lambda_2(x)^{-1} & \tau_1 < t \le \tau_2 \\ ... \end{cases} $$
and $\lambda_i(x) = \exp(\mathbf{\beta}_i x^T), \;\; \mathbf{\beta}_i = (\beta_{i,1}, \beta_{i,2}, ...)$. That is, each period has a hazard rate, $\lambda_i$ the is the exponential of a linear model. The parameters of each linear model are unique to that period - different periods have different parameters (later we will generalize this).
Why do I want a model like this? Well, it offers lots of flexibilty (at the cost of efficiency though), but importantly I can see:
¹ I specifiy the reciprocal because that follows lifelines convention for exponential and weibull hazards. In practice, it means the interpretation of the sign is possibly different.
pew = PiecewiseExponentialRegressionFitter(
breakpoints=breakpoints)\
.fit(df, "T", "E")
pew.print_summary()
fig, ax = plt.subplots(figsize=(10, 10))
pew.plot(ax=ax)
If we suspect there is some parameter sharing between pieces, or we want to regularize (and hence share information) between pieces, we can include a penalizer which penalizes the variance of the estimates per covariate.
Specifically, our penalized log-likelihood, $PLL$, looks like:
$$ PLL = LL - \alpha \sum_j \hat{\sigma}_j^2 $$where $\hat{\sigma}_j$ is the standard deviation of $\beta_{i, j}$ over all periods $i$. This acts as a regularizer and much like a multilevel component in Bayesian statistics.
Below we examine some cases of $\alpha$.
Note: we do not penalize the intercept, currently. This is a modellers decision, but I think it's better not too.
# Extreme case, note that all the covariates' parameters are almost identical.
pew = PiecewiseExponentialRegressionFitter(
breakpoints=breakpoints,
penalizer=20.0)\
.fit(df, "T", "E")
fig, ax = plt.subplots(figsize=(10, 10))
pew.plot(ax=ax)
# less extreme case
pew = PiecewiseExponentialRegressionFitter(
breakpoints=breakpoints,
penalizer=.25)\
.fit(df, "T", "E", timeline=np.linspace(0, 130, 200))
fig, ax = plt.subplots(figsize=(10, 10))
pew.plot(ax=ax, fmt="s", label="small penalty on variance")
# compare this to the no penalizer case
pew_no_penalty = PiecewiseExponentialRegressionFitter(
breakpoints=breakpoints,
penalizer=0)\
.fit(df, "T", "E", timeline=np.linspace(0, 130, 200))
pew_no_penalty.plot(ax=ax, c="r", fmt="o", label="no penalty on variance")
plt.legend()
# Some prediction methods
pew.predict_survival_function(df.loc[0:5]).plot(figsize=(10, 5))
pew.predict_cumulative_hazard(df.loc[0:5]).plot(figsize=(10, 5))
pew.predict_median(df.loc[0:5])
# hazard
pew.predict_cumulative_hazard(df.loc[0:5]).diff().plot(figsize=(10, 5))
pew.print_summary()
from lifelines import WeibullAFTFitter
wf = WeibullAFTFitter().fit(df, 'T', 'E')