Classification of handwritten digits using an SVM

This example demonstrates the application of an SVM classifier to the recognition of handwritten digits between 0 and 9. Each handwritten digit is represented as an $8 \times 8$ greyscale image with 4-bit depth. An image displaying digit $i$ is labeled with the class index $i$, where $i \in \lbrace 0,1,2,\ldots,9\rbrace$. The entire dataset contains 1797 labeled images and is often used as a benchmark for evaluating and comparing machine learning algorithms. The dataset is available from different sources; e.g., it is included in the scikit-learn datasets module.

In [1]:
import numpy as np
from matplotlib import pyplot as plt
from sklearn import datasets, svm, metrics

The image dataset is loaded from the scikit-learn datasets module. The load_digits() function returns a so-called Bunch, which contains two numpy arrays: the images and the corresponding labels:

In [2]:
digits = datasets.load_digits()
n_samples = len(digits.images)
print type(digits)
print type(digits.images)
print type(digits.target)
print "Number of labeled images: ",n_samples
<class 'sklearn.datasets.base.Bunch'>
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
Number of labeled images:  1797

In order to understand the representation of the digits, the first 4 images are displayed in a matplotlib figure. Moreover, the contents of the first image are printed. Each image is an $8 \times 8$ numpy array with integer entries between $0$ (white) and $15$ (black).

In [3]:
plt.figure(figsize=(12, 10))
NIMAGES = 4
for index in range(NIMAGES):
    plt.subplot(1, NIMAGES, index+1)
    plt.imshow(digits.images[index], cmap=plt.cm.gray_r)
    plt.title('Training sample of class: %i' % digits.target[index])
print digits.images[0]  # raw pixel values of the first 8x8 image
[[  0.   0.   5.  13.   9.   1.   0.   0.]
 [  0.   0.  13.  15.  10.  15.   5.   0.]
 [  0.   3.  15.   2.   0.  11.   8.   0.]
 [  0.   4.  12.   0.   0.   8.   8.   0.]
 [  0.   5.   8.   0.   0.   9.   8.   0.]
 [  0.   4.  11.   0.   1.  12.   7.   0.]
 [  0.   2.  14.   5.  10.  12.   0.   0.]
 [  0.   0.   6.  13.  10.   0.   0.   0.]]

Each image is represented as a 2-dimensional array. Since all scikit-learn algorithms require that a single sample is represented as a one-dimensional feature vector (typically one row within an array of training samples), the 2-dimensional images must be flattened. After the flattening, the array of images and the array of corresponding labels are randomly permuted. This step ensures that the labels in the training set are approximately equally distributed, i.e. for each class roughly the same number of training elements is available. The entire set of labeled images is then partitioned such that one third is used for training and the rest for testing.

In [4]:
flatImages = digits.images.reshape((n_samples, -1))  # flatten each 8x8 image to a 64-dimensional row vector
randperm = np.random.permutation(n_samples)
flatImages = flatImages[randperm,:]
targets = digits.target[randperm]
trainSamples = n_samples // 3  # integer division: one third of the samples for training
plt.figure(figsize=(12, 10))
plt.hist(targets[:trainSamples])
plt.title("Distribution of labels in training partition")
plt.xticks(np.arange(10))
plt.xlabel("Class Label")
plt.ylabel("Frequency of Label")
Out[4]:
<matplotlib.text.Text at 0xe1c0a50>

An SVM classifier with an RBF kernel, $C=1$ and $\gamma=0.001$ is trained on the training partition. The trained model is then applied to predict the class labels of the test partition.

In [5]:
classifier = svm.SVC(kernel='rbf', C=1, gamma=0.001, coef0=1)  # coef0 has no effect for the RBF kernel
classifier.fit(flatImages[:trainSamples], targets[:trainSamples])
expected = targets[trainSamples:]
predicted = classifier.predict(flatImages[trainSamples:])
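
The hyperparameters $C=1$ and $\gamma=0.001$ are simply fixed in this example. A common alternative is to select them by a cross-validated grid search over the training partition. A minimal sketch (the GridSearchCV import path depends on the scikit-learn version: sklearn.model_selection in recent releases, sklearn.grid_search in older ones; the parameter grid below is only illustrative):

# Sketch: cross-validated search for C and gamma on the training partition
from sklearn.model_selection import GridSearchCV

param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.0001, 0.001, 0.01]}
search = GridSearchCV(svm.SVC(kernel='rbf'), param_grid, cv=5)
search.fit(flatImages[:trainSamples], targets[:trainSamples])
print search.best_params_  # best (C, gamma) combination found by cross-validation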

Functions from the scikit-learn metrics module are applied to evaluate the classifier.

  • precision determines what fraction of the recognized class-$i$ instances are actually class-$i$ instances: $$Precision_i=\frac{TruePositive_i}{TruePositive_i + FalsePositive_i}.$$
  • recall determines what fraction of the class-$i$ inputs are recognized as class-$i$ instances: $$Recall_i=\frac{TruePositive_i}{TruePositive_i + FalseNegative_i}.$$
  • f1-score is the harmonic mean of precision and recall: $$F1\text{-}Score_i = 2\,\frac{Precision_i \cdot Recall_i}{Precision_i+Recall_i}.$$
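
These per-class values can also be computed directly with functions from the metrics module. A minimal sketch for the predictions above (passing average=None returns one score per class instead of an average):

# Sketch: per-class precision, recall and f1-score as numpy arrays
print metrics.precision_score(expected, predicted, average=None)
print metrics.recall_score(expected, predicted, average=None)
print metrics.f1_score(expected, predicted, average=None)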

Moreover, the confusion matrix is printed. The entry in row $i$, column $j$ of the confusion matrix is the number of class-$i$ instances that have been recognized as class-$j$ instances.

Question: How to determine precision and recall from the confusion matrix?

In [6]:
print "Classification report for classifier %s:\n%s\n" % (
    classifier, metrics.classification_report(expected, predicted))
print "Confusion matrix:\n%s" % metrics.confusion_matrix(expected, predicted)
Classification report for classifier SVC(C=1, cache_size=200, class_weight=None, coef0=1, degree=3, gamma=0.001,
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False):
             precision    recall  f1-score   support

          0       1.00      0.99      1.00       119
          1       0.96      1.00      0.98       121
          2       0.98      1.00      0.99       118
          3       1.00      0.93      0.96       120
          4       0.99      0.98      0.99       126
          5       0.97      0.98      0.97       119
          6       1.00      0.98      0.99       130
          7       1.00      1.00      1.00       117
          8       0.93      0.95      0.94       115
          9       0.96      0.97      0.97       113

avg / total       0.98      0.98      0.98      1198


Confusion matrix:
[[118   0   0   0   1   0   0   0   0   0]
 [  0 121   0   0   0   0   0   0   0   0]
 [  0   0 118   0   0   0   0   0   0   0]
 [  0   0   2 111   0   2   0   0   4   1]
 [  0   0   0   0 124   0   0   0   2   0]
 [  0   0   0   0   0 117   0   0   0   2]
 [  0   1   0   0   0   0 128   0   1   0]
 [  0   0   0   0   0   0   0 117   0   0]
 [  0   4   1   0   0   0   0   0 109   1]
 [  0   0   0   0   0   2   0   0   1 110]]
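
Regarding the question above: row $i$ of the confusion matrix collects all true class-$i$ instances, while column $j$ collects all instances predicted as class $j$. Hence $Recall_i$ is the $i$-th diagonal entry divided by the $i$-th row sum, and $Precision_i$ is the $i$-th diagonal entry divided by the $i$-th column sum. A minimal numpy sketch:

# Sketch: precision and recall per class, derived from the confusion matrix
cm = metrics.confusion_matrix(expected, predicted).astype(float)
print cm.diagonal() / cm.sum(axis=0)  # precision per class (column sums = number of class-i predictions)
print cm.diagonal() / cm.sum(axis=1)  # recall per class (row sums = number of true class-i instances)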

Finally, the first 4 images of the test partition are plotted together with their predicted class labels.

In [7]:
plt.figure(figsize=(12, 10))
for index in range(NIMAGES):
    plt.subplot(1, NIMAGES, index+1)
    predImage = flatImages[trainSamples+index,:].reshape((8,8))  # undo the flattening for display
    plt.imshow(predImage, cmap=plt.cm.gray_r)
    plt.title('Predicted as class: %i' % predicted[index])
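
Often the misclassified test samples are more instructive than the correctly classified ones. The following sketch (reusing the arrays defined in the cells above) plots up to NIMAGES misclassified test images together with their true and predicted labels:

# Sketch: plot up to NIMAGES misclassified test images
errorIndices = np.nonzero(predicted != expected)[0]
plt.figure(figsize=(12, 10))
for plotPos, errIndex in enumerate(errorIndices[:NIMAGES]):
    plt.subplot(1, NIMAGES, plotPos+1)
    errImage = flatImages[trainSamples+errIndex,:].reshape((8,8))
    plt.imshow(errImage, cmap=plt.cm.gray_r)
    plt.title('True: %i / Predicted: %i' % (expected[errIndex], predicted[errIndex]))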