Facial Reconstruction

Description: Given a training set of 12,000 64x64 images of faces (stored as separate 64x32 left and right halves) and a test set of 1,233 64x32 images containing only the left side of a face, reconstruct the right sides of the test images. The reconstruction uses linear least squares regression.

In [1]:
# The libraries that will be used

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import os
import random 
import scipy.io
import urllib

Getting the Data

First, a new folder for the image data called 'facial_reconstruction' is made in the current working directory. This new directory is made the current working directory and the image data is downloaded to it. The image data is in MATLAB matrix format, so it must be converted to numpy objects.

In [2]:
os.mkdir(os.getcwd() + '/facial_reconstruction')    # make a directory called facial_reconstruction
os.chdir('facial_reconstruction')                   # set working directory to the new directory
print 'Image data files will be downloaded to {}'.format(os.getcwd())   # show location data is downloading to
Image data files will be downloaded to /home/neil/Facial-Reconstruction/facial_reconstruction
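Note that os.mkdir raises an error if the folder already exists, so the cell above fails when it is run a second time. A minimal sketch of a re-runnable alternative follows; the helper name ensure_dir is hypothetical and not part of the original notebook.

```python
import os

def ensure_dir(path):
    """Create `path` only if it is missing and return its absolute path."""
    if not os.path.isdir(path):   # skip creation when the folder is already there
        os.makedirs(path)
    return os.path.abspath(path)
```

Calling ensure_dir('facial_reconstruction') before os.chdir would then be safe to repeat.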
In [3]:
data_location = "http://www.sci.ccny.cuny.edu/~szlam/2013-fall-366/"    # url of image data
image_stack_names = ['train_data_left',                                 # names of the data matrices
                     'train_data_right',
                     'test_data_left']
img_stacks = []  # this will hold the downloaded image data as numpy matrices   
for index, name in enumerate(image_stack_names):                   
    file_name = name + '.mat'                                      # creates filename used to get and name the data
    urllib.urlretrieve(data_location + file_name, file_name)       # download the file to the facial_reconstruction folder
    print 'File {} of 3 has finished downloading'.format(index+1)  # send a message that the image data downloaded successfully
    img_stacks.append(scipy.io.loadmat(name)[name])                # convert matlab data to numpy data and stores it in a list
print 'The contents of {} are {}'.format(os.getcwd(), os.listdir(os.getcwd())) # make sure the data downloaded and is in the folder
File 1 of 3 has finished downloading
File 2 of 3 has finished downloading
File 3 of 3 has finished downloading
The contents of /home/neil/Facial-Reconstruction/facial_reconstruction are ['train_data_left.mat', 'test_data_left.mat', 'train_data_right.mat']

Reshaping the Data

The data is checked by viewing the first image in the training and test stacks. After confirming that each stack is a 3-dimensional array of images, each stack is converted to a 2-dimensional matrix in which every row holds one image in vector form, and the result is checked again.

In [4]:

print 'Viewing first complete image from the 2 stacks of {} {}x{} training images'.format(
Viewing first complete image from the 2 stacks of 12000 64x32 training images
In [5]:

print 'Viewing first image from the stack of {} {}x{} test images'.format(img_stacks[2].shape[2],
Viewing first image from the stack of 1233 64x32 test images
In [6]:
# To train the filter I need to take the 3-d stacks of images and turn them into 2-d matrices 
# Each image will be turned into a vector and stored as a row in the corresponding matrix

def convert_to_rows(stack):
    '''Takes a stack of images and creates a numpy matrix where every row is an image'''
    size = stack.shape[2]           # get number of pictures in stacks
    images = np.empty((size, 2048)) # create empty matrix to hold image vectors
    for i in range(size):           # loop through each image in the stack
        images[i] = stack[:,:,i].reshape(2048,)     # reshape image to a vector and store it in the matrix
    return images

img_rows = []                                # this will hold our converted stacks of images                            
for stack in img_stacks:                     # loop through the 3 stacks of images
    img_rows.append(convert_to_rows(stack))  # convert the stacked images into vector images 
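The same reshaping can be done without an explicit loop or a hard-coded 2048. The sketch below is an alternative under the assumption that every stack has shape (height, width, number of images), as the shapes printed above suggest; the function name convert_to_rows_vectorized and the toy stack are illustrative only.

```python
import numpy as np

def convert_to_rows_vectorized(stack):
    """Return an (n_images, height*width) matrix from a (height, width, n_images) stack."""
    height, width, n_images = stack.shape
    # Move the image axis to the front, then flatten each image in row-major order
    return np.transpose(stack, (2, 0, 1)).reshape(n_images, height * width)

# Toy stack: 5 images of 4x3 pixels, standing in for the real 64x32 images
demo_stack = np.arange(4 * 3 * 5, dtype=float).reshape(4, 3, 5)
demo_rows = convert_to_rows_vectorized(demo_stack)
```

Each row of demo_rows matches what the loop version would store, so reshaping a row back to (height, width) recovers the original image.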
In [7]:
# Let's make sure that the stacks of images were converted correctly and that each row is indeed an image

                    img_rows[0][0].reshape(64, 32), 
                    img_rows[1][0].reshape(64, 32),

print 'Viewing first rows/images from the 2 {}x{} training matrices'.format(img_rows[0].shape[0],
Viewing first rows/images from the 2 12000x2048 training matrices
In [8]:
# Now check the first row/image in the test matrix

plt.imshow(img_rows[2][0].reshape(64, 32))

print 'Viewing first row/image from the {}x{} test matrix'.format(img_rows[2].shape[0],
Viewing first row/image from the 1233x2048 test matrix

Regression and Reconstruction

Next, a 2,048 x 2,048 weight matrix w is found that minimizes the least squares error between L*w and R, where L and R are 12,000 x 2,048 matrices containing the left and right halves of the training images, respectively. The 1,233 x 2,048 test matrix is then multiplied by w to reconstruct the right halves of the test images, so that test * w = reconstructed. Finding this reconstructed matrix completes the goal of this project.

In [9]:
left, right, test = img_rows                # unpack our images matrices to convenient names   
weights = np.linalg.lstsq(left, right)[0]   # use least squares regression to find weight matrix that reconstructs partial facial images
reconstructed = np.dot(test, weights)       # then reconstruct the other half of the test set images 
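To make the shapes concrete, here is a small sketch of the same regression on synthetic data. All names and sizes (toy_left, true_w, 50 training rows, 8 pixels) are made up for illustration, and the right halves are generated exactly from the left halves, which real faces will not satisfy.

```python
import numpy as np

rng = np.random.RandomState(0)          # fixed seed so the sketch is repeatable
n_train, n_test, n_pixels = 50, 5, 8    # toy stand-ins for 12,000 / 1,233 / 2,048

true_w = rng.randn(n_pixels, n_pixels)  # hypothetical "true" left-to-right mapping
toy_left = rng.randn(n_train, n_pixels)
toy_right = toy_left.dot(true_w)        # right halves built exactly from left halves
toy_test = rng.randn(n_test, n_pixels)

# Solve min_w ||toy_left.dot(w) - toy_right||^2; each column of w predicts one output pixel
toy_w = np.linalg.lstsq(toy_left, toy_right, rcond=None)[0]
toy_recon = toy_test.dot(toy_w)         # reconstructed right halves, one row per test image
```

Because the toy data is noise-free and toy_left has full column rank, toy_w should match true_w up to floating point error. On the real images the fit is only approximate.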


Because this is image data, we can look at the reconstructed images to test our results. Using the view function, any image from any stack can be viewed. The train and recon functions show a random training or reconstructed image in its entirety.

In [10]:
# And finally let's create a function to view our reconstructed images, as well as images from the test and training sets

def view(stack, image):  # function to view an image
    ''' View a single image from a chosen stack of images
    Keyword arguments: 
        stack -- (str) this should be either 'train', 'test', or 'recon'                 
        image -- (int) this is the index number of the image in the stack,
                  12,000 images in the training set, 1,233 in the test set
    '''
    if stack == 'train':
    elif stack == 'test':
    elif stack == 'recon':
def train():
    """Use this function to view random training images"""
    img_no = random.randint(0, 11999)    # randint includes both endpoints, so valid indices are 0..11999
    print 'training image number:', img_no
    view('train', img_no)                # show the same image whose number was printed
def recon():
    """Use this function to view random reconstructed images"""
    img_no = random.randint(0, 1232)     # valid indices for the 1,233 test images are 0..1232
    print 'reconstructed image number:', img_no
    view('recon', img_no)                # show the same image whose number was printed
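The branches of view are not shown above, so the following is only a guess at the core step: putting a left half and a right half side by side to form a full 64x64 face. The helper name join_halves is hypothetical, and it assumes each half is stored as a flattened 64x32 row, as in the matrices built earlier.

```python
import numpy as np

def join_halves(left_row, right_row, height=64, half_width=32):
    """Rebuild a full face image from two flattened half images."""
    left_img = np.asarray(left_row).reshape(height, half_width)
    right_img = np.asarray(right_row).reshape(height, half_width)
    # Place the left half in the first columns and the right half after it
    return np.hstack((left_img, right_img))
```

With the matrices from the regression cell, something like plt.imshow(join_halves(test[i], reconstructed[i])) would display test image i next to its reconstructed right side.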
In [11]:
training image number: 1373
In [12]:
training image number: 2488
In [13]:
training image number: 10587
In [14]:
reconstructed image number: 629
In [15]:
reconstructed image number: 662
In [16]:
reconstructed image number: 622
In [17]:
reconstructed image number: 840
In [18]:
reconstructed image number: 264
In [19]:
reconstructed image number: 685
In [20]:
reconstructed image number: 577
In [21]:
reconstructed image number: 965
In [22]:
reconstructed image number: 17