Part 1 of 4 - Exploratory Data Analysis

| Final Capstone Project for the Diploma Program in Data Science | BrainStation Vancouver |

| Arash Tavassoli | May-June 2019 |


This is the first notebook in a series of four:

  • Part 1 - Exploratory Data Analysis

  • Part 2 - Data Preprocessing

  • Part 3 - Model Training and Analysis

  • Part 4 - Real-Time Facial Expression Recognition

What to expect in this notebook:

  1. A brief introduction to the project
  2. Exploratory data analysis on the dataset and initial filtering of the data
  3. Defining the expected model accuracy (Bayes error and human error)

1. Introduction:

Facial Expression Recognition (FER) is an image classification problem within the wider field of Computer Vision, one with significant academic and commercial potential.

While machine learning systems can be trained to recognize emotional expressions from images of human faces with good accuracy, the task remains challenging for a variety of reasons, including (but not limited to) the scarcity of publicly available, high-quality labelled images, the subtle differences between some facial expressions (e.g. sad vs. neutral) and the computationally expensive training process.

In this project, a machine learning system is developed to recognize emotional expressions (e.g. Happiness, Sadness, Anger or Surprise) from images of human faces, using a Convolutional Neural Network (CNN) built with TensorFlow (Keras) and OpenCV in Python.
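
The network itself is built and tuned in Part 3 of this series, but as a rough sketch of the kind of model involved (the input shape, layer sizes and number of classes below are illustrative placeholders, not the final design), a minimal Keras CNN could look like this:

from tensorflow.keras import layers, models

# Illustrative only: a small CNN that maps a grayscale face crop to a
# probability over a handful of expression classes. The real architecture
# is developed in Part 3.
num_classes = 5  # e.g. the five expression classes selected later in this notebook

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(96, 96, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])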

1.1. Motivation

This project was completed as the capstone for my 12-week, immersive, full-time Diploma program in Data Science at BrainStation Vancouver, but it stems from my long-standing interest in understanding how computers detect faces and process such images. It was also motivated by the challenge of working with large datasets and a computationally expensive preprocessing and machine learning training process.

1.2. The Project at a Glance

Below is a brief summary of the steps taken in this project, which are discussed in more detail in the four Jupyter notebooks listed above:

1.3. Data Source

The main dataset for this project is the AffectNet dataset, a relatively new dataset of facial expressions in the wild. It contains over 1 million facial images collected from the Internet by querying major search engines with 1,250 emotion-related keywords in six different languages. About half of the retrieved images (~440K) are manually annotated and the rest are labelled automatically using a trained model. AffectNet was gathered by a research group at the University of Denver and was generously shared with me for research use in this project. The size of the database is over 120 GB, which represents a challenge when it comes to preprocessing and model training.

Citation: Ali Mollahosseini, Behzad Hasani, and Mohammad H. Mahoor, “AffectNet: A New Database for Facial Expression, Valence, and Arousal Computation in the Wild”, IEEE Transactions on Affective Computing, 2017

Two other datasets, the Extended Cohn-Kanade Dataset (CK+) and the FEI Face Database, were used in the early stages of model development but are not presented in this notebook.

1.4. Tech Stack

Python (Jupyter Notebook for local computations and Google Colab for training the model on Google GPUs), OpenCV, Keras and TensorFlow.


2. Exploratory Data Analysis:

As the very first step, we look at the AffectNet dataset to get a better understanding of the available data, its limitations and expected challenges.

Let's start by importing the libraries that we'll need in this part:

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import cv2
import seaborn as sns
from IPython.display import clear_output
from sklearn.metrics import confusion_matrix

The AffectNet dataset comes with three file lists that give the sub-directory filepath and associated metadata (including the class label) for each image: two separate lists for the manually annotated images (split into training and validation) and one list for the remaining automatically annotated images:

In [2]:
# Root folder path for where the raw images and file lists are saved
root_dir = '/Volumes/Arash External Drive/AffectNet Data'

# Importing the list of filepath and image metadata 
# (with added columns to distinguish between manually/auto annotated images)
file_list_manual1 = pd.read_csv(root_dir + '/Manually_Annotated_file_lists/training.csv')
file_list_manual1['annotation'] = 'manual'

file_list_manual2 = pd.read_csv(root_dir + '/Manually_Annotated_file_lists/validation.csv')
file_list_manual2['annotation'] = 'manual'

file_list_auto = pd.read_csv(root_dir + '/Automatically_annotated_file_list/automatically_annotated.csv')
file_list_auto['annotation'] = 'auto'

# Concatenating all lists into one master file_list
file_list = pd.concat([file_list_manual1, file_list_manual2, file_list_auto]).reset_index(drop = True)

file_list.head(3)
Out[2]:
| | subDirectory_filePath | face_x | face_y | face_width | face_height | facial_landmarks | expression | valence | arousal | annotation |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 689/737db2483489148d783ef278f43f486c0a97e140fc... | 134 | 134 | 899 | 899 | 181.64;530.91;188.32;627.82;195.1;723.37;205.2... | 1 | 0.785714 | -0.055556 | manual |
| 1 | 392/c4db2f9b7e4b422d14b6e038f0cdc3ecee239b5532... | 20 | 20 | 137 | 137 | 28.82;77.52;29.12;93.25;31.04;108.51;33.03;123... | 0 | -0.017253 | 0.004313 | manual |
| 2 | 468/21772b68dc8c2a11678c8739eca33adb6ccc658600... | 11 | 11 | 176 | 176 | 30.52;87.33;32.55;106.43;36.94;125.81;43.06;14... | 0 | 0.174603 | 0.007937 | manual |

During the data preprocessing phase, one of the images was found to be corrupted, so it needs to be removed from the master list:

In [3]:
path_to_corrupted_file = '103/29a31ebf1567693f4644c8ba3476ca9a72ee07fe67a5860d98707a0a.jpg'
index = (file_list[file_list['subDirectory_filePath'] == path_to_corrupted_file]).index

file_list = file_list.drop(index)
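
As an aside, one simple way such a corrupted file can be spotted is that cv2.imread returns None for any image it cannot decode. A minimal sketch of that check (illustrative only, and limited here to a small sample of the manually annotated files) might look like this:

# Illustrative sketch (not part of the original pipeline): flag files that
# OpenCV cannot decode. The directory matches the one used for the
# manually annotated images later in this notebook.
image_dir = root_dir + '/Manually_Annotated_compressed/Manually Annotated Images/'

for path in file_list[file_list['annotation'] == 'manual']['subDirectory_filePath'].head(100):
    if cv2.imread(image_dir + path) is None:
        print('Unreadable image:', path)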

Next, we define the expression class names and get a summary of data availability for each class:

In [3]:
# Defining expression names
expression_summary = pd.DataFrame([[0, 'Neutral'], [1, 'Happiness'], [2, 'Sadness'], [3, 'Surprise'], 
                                   [4, 'Fear'], [5, 'Disgust'], [6, 'Anger'], [7, 'Contempt'], 
                                   [8, 'None'], [9, 'Uncertain'], [10, 'No-Face']], 
                                  columns = ['Expression Code', 'Expression Name'])

# Adding counts for each class
expression_summary['Count (Total)'] = file_list.groupby(['expression']).size()
expression_summary['Count (Manually-Annotated)'] = file_list[file_list['annotation'] == 'manual'].groupby(['expression']).size()
expression_summary['Count (Auto-Annotated)'] = file_list[file_list['annotation'] == 'auto'].groupby(['expression']).size()

# Sorting based on Count (Total) columns
expression_summary = expression_summary.sort_values('Count (Total)', ascending=False).reset_index(drop=True)

display(expression_summary)
| | Expression Code | Expression Name | Count (Total) | Count (Manually-Annotated) | Count (Auto-Annotated) |
|---|---|---|---|---|---|
| 0 | 1 | Happiness | 381150 | 134915 | 246235 |
| 1 | 0 | Neutral | 218516 | 75374 | 143142 |
| 2 | 10 | No-Face | 153432 | 82915 | 70517 |
| 3 | 6 | Anger | 53382 | 25382 | 28000 |
| 4 | 2 | Sadness | 46813 | 25959 | 20854 |
| 5 | 8 | None | 40847 | 33588 | 7259 |
| 6 | 3 | Surprise | 32052 | 14590 | 17462 |
| 7 | 9 | Uncertain | 13592 | 12145 | 1447 |
| 8 | 4 | Fear | 10677 | 6878 | 3799 |
| 9 | 5 | Disgust | 5193 | 4303 | 890 |
| 10 | 7 | Contempt | 4252 | 4250 | 2 |

Before proceeding, let's have a look at a sample image from each of the main classes:

In [5]:
# A function to load and show sample images from the manually annotated ones:
def image_viewer(index):
    file_dir = root_dir + '/Manually_Annotated_compressed/Manually Annotated Images/'
    
    sample_image = cv2.imread(file_dir + file_list['subDirectory_filePath'].iloc[index])
    expression_code = file_list['expression'].iloc[index]
    expression_name = expression_summary[expression_summary['Expression Code'] == expression_code]['Expression Name'].values[0]
    
    plt.imshow(cv2.cvtColor(sample_image, cv2.COLOR_BGR2RGB))
    plt.title(expression_name, fontsize=14);
In [7]:
# Examples of Expressions:
plt.figure(figsize = (16,10))
gridspec.GridSpec(2,3)

plt.subplot2grid((2,3), (0,0))
image_viewer(85)

plt.subplot2grid((2,3), (0,1))
image_viewer(1959)

plt.subplot2grid((2,3), (0,2))
image_viewer(3638)

plt.subplot2grid((2,3), (1,0))
image_viewer(3666)

plt.subplot2grid((2,3), (1,1))
image_viewer(8244)

plt.subplot2grid((2,3), (1,2))
image_viewer(505)

Next, let's visualize the total number of images in each class:

In [7]:
x = expression_summary['Expression Name']
y = expression_summary['Count (Total)']

# Marking the 'No-Face', 'None', 'Uncertain' classes as orange
custom_colors = ['orangered' if (i in ['No-Face', 'None', 'Uncertain']) else 'seagreen' for i in x]

plt.figure(figsize = (24,5))
sns.barplot(x, y, palette=custom_colors)
sns.despine(offset=10, trim=False)
plt.title('Number of Examples for each Expression', fontsize=22, pad = 30);
plt.xlabel('Expression Name', fontsize=18, labelpad=25)
plt.ylabel('Total Number of Images', fontsize=18, labelpad=25)
plt.xticks(fontsize=17)
plt.yticks(fontsize=17)

for i, v in enumerate(y):
    plt.text(i, v + 9000, "{:,}".format(v), color='black', ha='center', fontsize=18);

It is clear from this graph that we are dealing with a considerable class imbalance in this dataset. A large portion of the data falls in the 'Happiness' and 'Neutral' classes, while the minority classes 'Fear', 'Disgust' and 'Contempt' are represented by far fewer images.
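
To put a rough number on this imbalance, the ratio between the largest and smallest emotion classes (roughly 90:1 between 'Happiness' and 'Contempt') can be computed directly from the summary table above; a minimal sketch:

# Illustrative check: ratio between the most and least represented emotion
# classes, ignoring the non-emotion labels ('No-Face', 'None', 'Uncertain').
emotion_rows = expression_summary[~expression_summary['Expression Name']
                                  .isin(['No-Face', 'None', 'Uncertain'])]
counts = emotion_rows['Count (Total)']

print(f"Largest class:   {counts.max():,}")
print(f"Smallest class:  {counts.min():,}")
print(f"Imbalance ratio: {counts.max() / counts.min():.0f} : 1")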

Another point to note is that the 'No-Face', 'None' and 'Uncertain' classes add no meaningful value to the model, and can therefore be excluded from the dataset.

For the scope of this project, we will exclude the three minority classes 'Fear', 'Disgust' and 'Contempt', as well as the 'No-Face', 'None' and 'Uncertain' classes. This narrows our focus to the following remaining categories:

In [16]:
expression_list = ['Happiness', 'Neutral', 'Anger', 'Sadness', 'Surprise']
expression_summary = expression_summary[expression_summary['Expression Name'].isin(expression_list)]

# Saving as one csv to be used in next parts:
expression_summary.to_csv('data/expression_summary.csv')

display(expression_summary)
| | Expression Code | Expression Name | Count (Total) | Count (Manually-Annotated) | Count (Auto-Annotated) |
|---|---|---|---|---|---|
| 0 | 1 | Happiness | 381150 | 134915 | 246235 |
| 1 | 0 | Neutral | 218516 | 75374 | 143142 |
| 3 | 6 | Anger | 53382 | 25382 | 28000 |
| 4 | 2 | Sadness | 46813 | 25959 | 20854 |
| 6 | 3 | Surprise | 32052 | 14590 | 17462 |
In [9]:
x = expression_summary['Expression Name']
y = expression_summary['Count (Total)']

plt.figure(figsize = (24,5))
sns.barplot(x, y, color='seagreen')
sns.despine(offset=10, trim=False)

plt.title('Number of Examples for each Expression', fontsize=22, pad = 30);
plt.xlabel('Expression Name', fontsize=18, labelpad=25)
plt.ylabel('Total Number of Images', fontsize=18, labelpad=25)
plt.xticks(fontsize=17)
plt.yticks(fontsize=17)

for i, v in enumerate(y):
    plt.text(i, v + 9000, "{:,}".format(v), color='black', ha='center', fontsize=18);