This notebook contains code and comments from Section 4.5 of the book Ensemble Methods for Machine Learning. Please see the book for additional details on this topic. This notebook and code are released under the MIT license.
We introduce a second boosting algorithm called LogitBoost. The development of LogitBoost was motivated by the desire to bring loss functions from established classification models, such as logistic regression, into the AdaBoost framework. In this manner, the general boosting framework can be applied to specific classification settings to train boosted ensembles with properties similar to those of the underlying classifiers.
Under the hood, AdaBoost optimizes the exponential loss. LogitBoost, on the other hand, optimizes the logistic loss, which logistic regression also uses. The figure below compares the two loss functions.
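Concretely, writing the margin of a training example as m = y * f(x), with labels y in {-1, 1} and ensemble prediction f(x), the two curves plotted below are the exponential loss exp(-m) and the logistic loss log2(1 + exp(-m)); using the base-2 logarithm scales the logistic loss so that, like the exponential loss, it equals 1 at a margin of 0, which matches the plotting code that follows. Both losses are smooth upper bounds on the 0-1 loss, but for badly misclassified examples (large negative margins) the logistic loss grows only linearly, while the exponential loss grows exponentially.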
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(5, 4))
x = np.linspace(-2.0, 2.0, num=500)
ax.plot(x, np.exp(-x), linewidth=3, marker=None)
ax.plot(x, np.log(1 + np.exp(-x)) / np.log(2), linewidth=4, linestyle='--', marker=None)
ax.plot(x, (-x >= 0).astype(float), linewidth=3, linestyle=':', c='k', marker=None)
ax.text(-2, 1.1, 'Misclassified')
ax.text(1, 1, 'Correctly\nclassified')
ax.set_xlabel('Extent of mis/correct classification (margin)', fontsize=12)
ax.set_ylabel('Loss', fontsize=12)
fig.legend(['Exponential loss', 'Logistic loss', 'Exact 0-1 loss'], fontsize=12)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
fig.tight_layout()
# plt.savefig('./figures/CH04_F16_Kunapuli.png', format='png', dpi=300, bbox_inches='tight', pad_inches=0)
# plt.savefig('./figures/CH04_F16_Kunapuli.pdf', format='pdf', dpi=300, bbox_inches='tight', pad_inches=0)
The LogitBoost algorithm performs the following steps within each iteration; the probability P(y_i = 1 | x_i) is abbreviated p_i:
1. Compute the working response z_i = (y_i - p_i) / (p_i * (1 - p_i)) for every training example.
2. Compute the example weights D_i = p_i * (1 - p_i).
3. Fit a weak regression tree h_t to the working responses z, using D as sample weights.
4. Add h_t to the ensemble and recompute the margin F(x_i), the sum of the predictions of all weak learners so far.
5. Update the probabilities p_i = expit(F(x_i)), that is, the sigmoid of the margin.
Listing 4.5: LogitBoost for classification
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import accuracy_score
from scipy.special import expit
def fit_logitboosting(X, y, n_estimators=10):
    """Train a LogitBoost ensemble of regression stumps; y must contain 0/1 labels."""
    n_samples, n_features = X.shape
    D = np.ones((n_samples, )) / n_samples    # Initialize example weights uniformly
    p = np.full((n_samples, ), 0.5)           # Initialize P(y=1|x) to 0.5 for every example
    estimators = []                           # Initialize an empty ensemble

    for t in range(n_estimators):
        z = (y - p) / (p * (1 - p))           # Compute the working responses
        D = p * (1 - p)                       # Compute the example weights

        h = DecisionTreeRegressor(max_depth=1)
        h.fit(X, z, sample_weight=D)          # Fit a weak regression stump to the working responses
        estimators.append(h)

        if t == 0:
            margin = np.array([h.predict(X)
                               for h in estimators]).reshape(-1, )
        else:
            margin = np.sum(np.array([h.predict(X)
                                      for h in estimators]), axis=0)
        p = expit(margin)                     # Update the probabilities as the sigmoid of the margin

    return estimators
The predict_boosting function described in Listing 4.2 can be used to make predictions with LogitBoost ensembles as well. However, LogitBoost requires training labels in 0/1 form, whereas AdaBoost requires them in -1/1 form. Thus, we modify that function slightly so that it returns 0/1 labels.
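If a dataset comes with -1/1 labels instead, a one-line conversion (shown here as an aside; it is not one of the book's listings, and the label values are hypothetical) puts them into the 0/1 form that LogitBoost's training function expects:
# Convert -1/1 labels to the 0/1 labels LogitBoost expects (hypothetical example values)
y_pm1 = np.array([-1, 1, 1, -1])
y_01 = (y_pm1 + 1) // 2    # array([0, 1, 1, 0])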
Listing 4.6: LogitBoost for prediction
def predict_logit_boosting(X, estimators):
    """Predict 0/1 labels with a LogitBoost ensemble."""
    pred = np.zeros((X.shape[0], ))    # Accumulate the margin over all weak learners
    for h in estimators:
        pred += h.predict(X)
    y = (np.sign(pred) + 1) / 2        # Map the sign of the margin (-1/1) to 0/1 labels
    return y
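As a quick sanity check (not one of the book's listings; the dataset and settings here are just illustrative), we can make sure the two functions above work together by training a small ensemble and measuring its training accuracy:
# Illustrative smoke test of fit_logitboosting and predict_logit_boosting
from sklearn.datasets import make_moons
from sklearn.metrics import accuracy_score

Xtrn, ytrn = make_moons(n_samples=200, noise=0.125, random_state=13)    # labels are already 0/1
ens = fit_logitboosting(Xtrn, ytrn, n_estimators=20)
ypred = predict_logit_boosting(Xtrn, ens)
print('Training accuracy: {0:.2f}'.format(accuracy_score(ytrn, ypred)))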
As with AdaBoost, we can visualize how the ensemble trained by LogitBoost evolves over several iterations.
from sklearn.datasets import make_moons
from sklearn.metrics import accuracy_score
X, y = make_moons(n_samples=200, noise=0.125, random_state=13)
from plot_utils import plot_2d_data, plot_2d_classifier
n_samples, n_features = X.shape
n_estimators = 20
p = np.full((n_samples,), 0.5) # Initialize the prediction probabilities
estimators = [] # Initialize an empty ensemble
fig, ax = plt.subplots(nrows=2, ncols=3, figsize=(12, 8))
ax_index = -1
for t in range(n_estimators):
    z = (y - p) / (p * (1 - p))    # Compute the working responses
    D = p * (1 - p)                # Compute the example weights

    h = DecisionTreeRegressor(max_depth=1)
    h.fit(X, z, sample_weight=D)   # Train a weak learner using sample weights
    estimators.append(h)

    if t == 0:
        margin = np.array([h.predict(X) for h in estimators]).reshape(-1, )
    else:
        margin = np.sum(np.array([h.predict(X) for h in estimators]), axis=0)
    p = expit(margin)              # Update the probabilities as the sigmoid of the margin

    # -- Plot the ensemble
    if t in [0, 1, 4, 9, 14, 19]:
        ax_index += 1
        r, c = np.divmod(ax_index, 3)

        ypred = predict_logit_boosting(X, estimators)
        err = (1 - accuracy_score(y, ypred)) * 100
        title = 'Iteration {0}: err = {1:4.2f}%'.format(t + 1, err)

        plot_2d_classifier(ax[r, c], X, y,
                           predict_function=predict_logit_boosting, predict_args=estimators,
                           alpha=0.3, xlabel=None, ylabel=None, s=80,
                           title=title, colormap='Blues')
        ax[r, c].set_xticks([])
        ax[r, c].set_yticks([])
fig.tight_layout()
# plt.savefig('./figures/CH04_F17_Kunapuli.png', format='png', dpi=300, bbox_inches='tight', pad_inches=0)
# plt.savefig('./figures/CH04_F17_Kunapuli.pdf', format='pdf', dpi=300, bbox_inches='tight', pad_inches=0)
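As an optional final check (again, not part of the book's code; the split and ensemble size below are illustrative choices), we can hold out a portion of the data and see how well a LogitBoost ensemble trained on the rest classifies unseen examples:
# Illustrative held-out evaluation of LogitBoost
from sklearn.model_selection import train_test_split

Xtrn, Xtst, ytrn, ytst = train_test_split(X, y, test_size=0.25, random_state=13)
ens = fit_logitboosting(Xtrn, ytrn, n_estimators=20)
tst_acc = accuracy_score(ytst, predict_logit_boosting(Xtst, ens))
print('Held-out test accuracy: {0:.2f}'.format(tst_acc))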