🤖⚡ scikit-learn tip #24 (video)

Two new functions in scikit-learn 0.21 for visualizing decision trees:

plot_tree: uses Matplotlib (not Graphviz!)
export_text: doesn't require any external libraries

See examples 👇

In [1]:

import pandas as pd
# Kaggle Titanic training data
df = pd.read_csv('http://bit.ly/kaggletrain')
# encode Sex as a number so the tree can split on it
df['Sex'] = df['Sex'].map({'male':0, 'female':1})

In [2]:

features = ['Pclass', 'Fare', 'Sex']
X = df[features]
y = df['Survived']

In [3]:

# display names for Survived = 0 and Survived = 1, in that order
classes = ['Deceased', 'Survived']

In [4]:

from sklearn.tree import DecisionTreeClassifier
dt = DecisionTreeClassifier(max_depth=2, random_state=0)
dt.fit(X, y);

In [5]:

import matplotlib.pyplot as plt
from sklearn.tree import plot_tree, export_text  # both are new in 0.21

In [6]:

plt.figure(figsize=(8, 6))
plot_tree(dt, feature_names=features, class_names=classes, filled=True);
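
The cell above draws the tree inline in a notebook. As a rough sketch of using plot_tree outside a notebook, the snippet below fits a tree on a tiny made-up dataset (not the Titanic data) and saves the plot to a file. The names X_demo, y_demo, demo_tree and the file name tree_demo.png are assumptions chosen for illustration, and the Agg backend is selected only so that no display is needed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Tiny made-up dataset (illustrative only): columns are Pclass, Fare, Sex
X_demo = [
    [3, 7.25, 0],
    [1, 71.28, 1],
    [3, 7.92, 1],
    [1, 53.10, 1],
    [3, 8.05, 0],
    [2, 13.00, 0],
]
y_demo = [0, 1, 1, 1, 0, 0]

demo_tree = DecisionTreeClassifier(max_depth=2, random_state=0)
demo_tree.fit(X_demo, y_demo)

# Draw onto an explicit Axes so the figure can be saved and closed afterwards
fig, ax = plt.subplots(figsize=(8, 6))
annotations = plot_tree(
    demo_tree,
    feature_names=['Pclass', 'Fare', 'Sex'],
    class_names=['Deceased', 'Survived'],
    filled=True,
    ax=ax,
)
fig.savefig("tree_demo.png", dpi=150, bbox_inches="tight")  # hypothetical file name
plt.close(fig)
```

plot_tree returns the list of Matplotlib annotation objects it drew, which is why the calls in a notebook cell often end with a semicolon to suppress that output.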

In [7]:

print(export_text(dt, feature_names=features, show_weights=True))

|--- Sex <= 0.50
|   |--- Fare <= 26.27
|   |   |--- weights: [361.00, 54.00] class: 0
|   |--- Fare >  26.27
|   |   |--- weights: [107.00, 55.00] class: 0
|--- Sex >  0.50
|   |--- Pclass <= 2.50
|   |   |--- weights: [9.00, 161.00] class: 1
|   |--- Pclass >  2.50
|   |   |--- weights: [72.00, 72.00] class: 0
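
The weights in each leaf are the counts of training samples per class, in the order [Deceased, Survived]. As a small sketch of how to read them, the snippet below copies the four leaf weights from the output above and turns them into survival rates. The leaf labels are written by hand, assuming the male = 0 / female = 1 encoding from the first cell.

```python
# Leaf weights copied from the export_text output: (deceased, survived)
leaves = {
    "male, Fare <= 26.27": (361, 54),
    "male, Fare > 26.27": (107, 55),
    "female, Pclass <= 2.50": (9, 161),
    "female, Pclass > 2.50": (72, 72),
}

survival_rates = {}
for leaf, (deceased, survived) in leaves.items():
    total = deceased + survived
    survival_rates[leaf] = survived / total
    print(f"{leaf:<24} n={total:>3}  survived={survival_rates[leaf]:.1%}")

# Sanity check: the leaf counts should add up to the number of training rows
total_passengers = sum(d + s for d, s in leaves.values())
print("total passengers:", total_passengers)
```

The rates come out to roughly 13%, 34%, 95% and 50%. In the last leaf the two classes tie at 72 samples each, and the printed class is 0.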


Want more tips? View all tips on GitHub or sign up to receive 2 tips by email every week 💌