Unsupervised learning algorithms have often tackled image modeling by first modeling small image patches. This tutorial walks through the process of learning features from raw image patches, and then introduces a whitening algorithm that is often used to pre-process image patches before learning. Whitening is biologically plausible, gives rise to physiologically normal first-layer features, and typically yields better supervised performance.
The whitening algorithm introduced here is borrowed from Adam Coates's MATLAB sample code accompanying his ICML 2011 paper with Andrew Ng, The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization.
Whitening is often not discussed in detail in papers, but it can be crucial to reproducing state-of-the-art performance. The Stanford UFLDL Tutorial has a nice section on whitening.
There are several techniques for whitening; the one used here combines a technique used in recent papers from Andrew Ng's group at Stanford with techniques developed by Nicolas Pinto at MIT.
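As a preview of what such a whitening step computes, here is a minimal ZCA whitening sketch in NumPy (the function name `zca_whiten` and the regularization constant `eps` are illustrative assumptions, not values taken from Coates's code):

```python
import numpy as np

def zca_whiten(X, eps=0.01):
    """ZCA-whiten the rows of X (shape: n_examples x n_features).

    Center the data, diagonalize its covariance, rescale each
    principal component by 1 / sqrt(eigenvalue + eps), and rotate
    back so the whitened features stay aligned with the original
    pixel space (this is what distinguishes ZCA from plain PCA
    whitening).
    """
    X = X - X.mean(axis=0)
    cov = np.dot(X.T, X) / len(X)
    eigvals, eigvecs = np.linalg.eigh(cov)
    W = eigvecs.dot(np.diag(1.0 / np.sqrt(eigvals + eps))).dot(eigvecs.T)
    return np.dot(X, W)
```

With `eps` near zero the whitened data has approximately identity covariance; in practice a larger `eps` is kept as a regularizer so that low-variance (mostly noise) directions are not amplified.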
%pylab inline
rcParams['axes.grid'] = False
rcParams['figure.subplot.wspace'] = 0
rcParams['figure.subplot.hspace'] = 0
import numpy as np
import autodiff
import util
from skdata import cifar10
nexamples = 10000
npatches = 10000
patchH = 8
patchW = 8
img_shape = (8, 8, 3)
dtype = 'float32'
data_view = cifar10.view.OfficialImageClassification(x_dtype=dtype,
                                                     n_train=nexamples)
x = data_view.train.x[:nexamples]
x_test = data_view.test.x[:nexamples]
y = data_view.train.y[:nexamples]
y_test = data_view.test.y[:nexamples]
print x.shape, x_test.shape, y.shape, y_test.shape
print x.min(), x.max(), np.unique(y), np.unique(y_test)
Populating the interactive namespace from numpy and matplotlib
WARNING: pylab import has clobbered these variables: ['dtype']
`%pylab --no-import-all` prevents importing * from pylab and numpy
(10000, 32, 32, 3) (10000, 32, 32, 3) (10000,) (10000,)
0.0 1.0 [0 1 2 3 4 5 6 7 8 9] [0 1 2 3 4 5 6 7 8 9]
## random patch extractor
def random_patches(images, N, R, C):
    """
    Return a stack of N uniformly drawn image patches,
    as an array of shape (N, R, C, channels).

    images: 4-tensor of shape (n_imgs, rows, cols, channels)
    N - number of patches to extract
    R - rows per patch
    C - cols per patch
    """
    rng = np.random  # module-level RNG; call np.random.seed() for reproducibility
    n_imgs, n_rows, n_cols, n_colors = images.shape
    rval = np.empty((N, R, C, n_colors), dtype=images.dtype)
    selected_imgs = rng.randint(n_imgs, size=N)
    selected_rows = rng.randint(n_rows - R, size=N)
    selected_cols = rng.randint(n_cols - C, size=N)
    for i_rval, i_img, i_row, i_col in zip(rval, selected_imgs, selected_rows, selected_cols):
        # i_rval is a view into rval, so this assignment fills rval in place
        i_rval[:] = images[i_img, i_row:i_row + R, i_col:i_col + C, :]
    return rval
def show_filters(imgs, img_shape, layout):
    """Display imgs in an (nrows, ncols) grid of subplots (img_shape is currently unused)."""
    nrows, ncols = layout
    fig, axes = subplots(*layout, figsize=(1.2 * ncols, 1.2 * nrows))
    axes = axes.flatten()
    for i, img in enumerate(imgs):
        axes[i].axis('off')
        axes[i].imshow(img)
x_patches = random_patches(x, npatches, patchH, patchW)
x_test_patches = random_patches(x_test, npatches, patchH, patchW)
x_feats = x_patches.reshape(npatches, -1)
x_test_feats = x_test_patches.reshape(npatches, -1)
print x_feats.shape, x_test_patches.shape
show_filters(x_patches[:100], (8, 8, 3), (10, 10))
(10000, 192) (10000, 8, 8, 3)