This notebook shows how to compute adversarial examples on decision trees (as described by Papernot et al. in https://arxiv.org/abs/1605.07277). Due to the structure of the decision tree, an adversarial example can be computed without any explicit gradients, only by traversing the learned tree structure.
Consider the following simple decision tree for four-dimensional data, where we go to the left if a condition is true:
                F1<3
              /      \
          F2<5        F2>2
         /    \      /    \
     F4>3      C1  F3<1    C3*
    /    \         /   \
  C1      C2     C3     C1
Given the sample [4,4,1,1], the tree outputs C3 (as indicated by the star). To misclassify the sample, we walk one node up and explore the subtree on the left. We find a leaf outputting C1 and change the two features needed to reach it, obtaining [4,1.9,0.9,1]. In this implementation, we change only the features whose values do not satisfy the conditions along the new path, and the offset from each threshold is specified in advance.
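A fitted scikit-learn tree exposes exactly this structure through its tree_ attribute (the arrays children_left, children_right, feature and threshold), which is what the attack traverses. As a rough sketch of the idea (not ART's actual implementation; the helper name is made up for illustration, and note that scikit-learn splits on <= rather than <), the decision path of a single sample can be recovered like this:

def decision_path_nodes(clf, x):
    """Walk a fitted DecisionTreeClassifier for a single sample x and return
    the list of visited node ids; the last entry is the leaf deciding the class."""
    tree = clf.tree_
    node, path = 0, []
    while tree.children_left[node] != tree.children_right[node]:  # internal node
        path.append(node)
        if x[tree.feature[node]] <= tree.threshold[node]:  # condition true: go left
            node = tree.children_left[node]
        else:                                              # condition false: go right
            node = tree.children_right[node]
    path.append(node)  # leaf node
    return path

Once the classifier clf below has been fitted, decision_path_nodes(clf, X[0]) returns the node ids visited by the first digit; the attack searches the siblings of the nodes on this path for a leaf of another class.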
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_digits
from matplotlib import pyplot as plt
import numpy as np
from art.attacks.evasion import DecisionTreeAttack
from art.estimators.classification import SklearnClassifier
# Load the digits data and fit a decision tree classifier
digits = load_digits()
X = digits.data
y = digits.target
clf = DecisionTreeClassifier()
clf.fit(X, y)

# Wrap the fitted tree in an ART scikit-learn classifier
clf_art = SklearnClassifier(clf)

# Predict the first 14 samples and display the first digit
print(clf.predict(X[:14]))
plt.imshow(X[0].reshape(8, 8))
plt.colorbar()
[0 1 2 3 4 5 6 7 8 9 0 1 2 3]
<matplotlib.colorbar.Colorbar at 0x7fa24585ca10>
We now craft adversarial examples, print their predicted classes and plot one of them. The perturbation is very small: often only one or two features are changed.
attack = DecisionTreeAttack(clf_art)
adv = attack.generate(X[:14])
print(clf.predict(adv))
plt.imshow(adv[0].reshape(8,8))
# plt.imshow((X[0] - adv[0]).reshape(8, 8))  # use this to plot the difference instead
Decision tree attack: 100%|██████████| 14/14 [00:00<00:00, 1546.08it/s]
[6 4 4 6 6 4 1 2 4 4 6 4 6 4]
<matplotlib.image.AxesImage at 0x7fa24570fa10>
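As a quick check (this snippet is an addition for illustration, not part of the attack itself), we can count how many of the 64 pixel features were actually changed in each adversarial sample and how large the biggest change is:

changed = np.sum(adv != X[:14], axis=1)   # number of modified features per sample
print(changed)
print(np.abs(adv - X[:14]).max())         # largest absolute change to any single feature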
The change may be larger if we specify which class each sample should be (mis-)classified as. To do this, we pass a target label for each attacked sample.
adv = attack.generate(X[:14], np.array([6, 6, 7, 7, 8, 8, 9, 9, 1, 1, 2, 2, 3, 3]))
print(clf.predict(adv))
plt.imshow(adv[0].reshape(8,8))
Decision tree attack: 100%|██████████| 14/14 [00:00<00:00, 1073.48it/s]
[6 6 7 7 8 8 9 9 1 1 2 2 3 3]
<matplotlib.image.AxesImage at 0x7fa245684f50>
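We can also confirm that each sample was pushed into its requested class (again a small check added for illustration):

target = np.array([6, 6, 7, 7, 8, 8, 9, 9, 1, 1, 2, 2, 3, 3])
print(np.mean(clf.predict(adv) == target))  # fraction of samples classified as their target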
Finally, the attack has an offset parameter which specifies how far from the learned threshold of the tree the new feature value is placed. The default value is very small (0.001), but it can be set larger when desired. Setting it to a very large value may, however, yield adversarial examples outside the range of normal feature values!
attack = DecisionTreeAttack(clf_art, offset=20.0)
adv = attack.generate(X[:14])
print(clf.predict(adv))
plt.imshow(adv[0].reshape(8,8))
plt.colorbar()
Decision tree attack: 100%|██████████| 14/14 [00:00<00:00, 1586.65it/s]
[6 4 4 4 6 4 1 2 4 4 6 4 4 4]
<matplotlib.colorbar.Colorbar at 0x7fa24562fd90>
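The digits features lie between 0 and 16, so a quick comparison of the value ranges (added here for illustration) shows whether the large offset produced out-of-range pixel values:

print(X.min(), X.max())      # original pixel range for the digits data (0.0 to 16.0)
print(adv.min(), adv.max())  # adversarial range; with offset=20.0 values can fall well outside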