This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.

# 8.3. Learning to recognize handwritten digits with a K-nearest neighbors classifier¶

1. Let's do the traditional imports.
In [ ]:
import numpy as np
import sklearn
import sklearn.datasets as ds
import sklearn.cross_validation as cv
import sklearn.neighbors as nb
import matplotlib.pyplot as plt
%matplotlib inline

1. Let's load the digits dataset, part of the datasets module of scikit-learn. This dataset contains hand-written digits that have been manually labeled.
In [ ]:
digits = ds.load_digits()
X = digits.data
y = digits.target
print((X.min(), X.max()))
print(X.shape)


In the matrix X, each row contains the $8 \times 8=64$ pixels (in grayscale, values between 0 and 16). The pixels are ordered according to the row-major order.

1. Let's display some of the images.
In [ ]:
nrows, ncols = 2, 5
plt.figure(figsize=(6,3));
plt.gray()
for i in range(ncols * nrows):
ax = plt.subplot(nrows, ncols, i + 1)
ax.matshow(digits.images[i,...])
plt.xticks([]); plt.yticks([]);
plt.title(digits.target[i]);

1. Now, let's fit a K-nearest neighbors classifier on the data.
In [ ]:
(X_train, X_test,
y_train, y_test) = cv.train_test_split(X, y, test_size=.25)

In [ ]:
knc = nb.KNeighborsClassifier()

In [ ]:
knc.fit(X_train, y_train);

1. Let's evaluate the score of the trained classifier on the test dataset.
In [ ]:
knc.score(X_test, y_test)

1. Now, let's see if our classifier can recognize a "hand-written" digit!
In [ ]:
# Let's draw a 1.
one = np.zeros((8, 8))
one[1:-1, 4] = 16  # The image values are in [0, 16].
one[2, 3] = 16

In [ ]:
plt.figure(figsize=(2,2));
plt.imshow(one, interpolation='none');
plt.grid(False);
plt.xticks(); plt.yticks();
plt.title("One");

In [ ]:
knc.predict(one.ravel())


You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).

IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).