Celebs vs. Mortals: Facial Recognition in Python

After watching way too much Game of Thrones and Ink Master, I started wondering what it is that makes a celebrity stand out from the rest of us. Are there features, especially facial features, that can distinguish celebrities from common folk?


I put this question out of my mind until recently when, while looking for interesting applications of data mining Twitter, I stumbled on AlchemyAPI, an IBM company. AlchemyAPI provides a high-level interface to deep-learning tools. They offer a suite of natural language processing tools: for example, the ability to have a computer read a news article, extract the relevant people and ideas, and decide whether they are portrayed with favorable sentiment. Try out the web demo; it's really cool. They also offer computer vision tools, including one of particular interest to me: an API for facial recognition and extraction. They provide 9 SDKs, including one for Python. Python also happens to be a great language for data and image analysis, with libraries like scikit-image, IPython, and matplotlib, and there's a lot of cool stuff to be done with faces.



This demo will compare age and gender predictions of faces, automatically extracted from web images, between me, my friends, and some of our favorite celebrities. In regard to facial recognition and analysis, AlchemyAPI does all of the hard work. I can then simply use some scikit-image and matplotlib Py-Fu to visualize the images, and rank them by age and gender scores.


If you want to try this for yourself, you'll need the following:

  1. AlchemyAPI Python SDK (Must use working dev branch)
  2. scikit-image
  3. pyparty (for some multiplot utilities)

Loading some people

I used Facebook to track down the URLs of some pictures of my friends, with unobstructed views of their faces and seemingly good image quality. Then I grabbed a bunch of individual and group pics of celebrities.


The links to my friends' pictures are hidden for privacy...

In [1]:
%pylab inline
from __future__ import division
import skimage.io as skio

# Change matplotlib label fontsize
from matplotlib import rcParams
rcParams['font.size'] = 15

BUDDIES = dict(ME = 'https://fbcdn-sphotos-g-a.akamaihd.net/hphotos-ak-xap1/v/t1.0-9/970276_643330279472_1364386421_n.jpg?oh=ef0a6a5758cee5ddccd1eb951937916f&oe=557FFA87&__gda__=1434357290_1745bc5628ade9870d78c55c661b4046',
               LOVELY_FIANCE ='<URL HIDDEN>',
               CLAIRE ='<URL HIDDEN>', 
               ...
               )

CELEBS = dict(GOT_GALS = 'http://media4.popsugar-assets.com/files/2013/07/04/088/n/4981324/cdf19997323733ea_Main.xxxlarge/i/Pictures-Women-Game-Thrones-Emilia-Clarke.jpg',
              GOT_GUYS = 'http://media1.popsugar-assets.com/files/2014/06/13/959/n/1922283/462854365d4d1e3f_GoT-Cover.xxxlarge/i/Hottest-Guys-Game-Thrones.jpg',
              INKMASTER = 'http://www.slangstrong.com/wp-content/uploads/2012/11/Ink-Masters.jpg',
              FASTFIVEGUYS = 'http://thatsenuff.com/wp-content/uploads/2011/04/0429-rather-obvi-credit.jpg',
              MROD = 'http://d1oi7t5trwfj5d.cloudfront.net/81/3e/1e704c3c4232b7d10eaf67f8d260/michelle-rodriguez.jpg',
              EVA = 'http://media1.popsugar-assets.com/files/2013/01/02/3/192/1922398/d8a83663d4d95316_evamendes.xxxlarge_2.jpg',
              RYGOSLING = 'http://blogs.psychcentral.com/life-goals/files/2015/01/ryan-gosling.jpg',
              THE_CLOONE = 'http://img2-2.timeinc.net/people/i/2014/sandbox/news/140210/george-clooney-600x450.jpg',
              REAL_HOUSEWIVES = 'http://media.silive.com/entertainment_impact_tvfilm/photo/real-housewives-of-new-jerseyjpg-87e95765ec41dcc8.jpg'
              )
              
US_AND_THEM = dict(BUDDIES.items() + CELEBS.items())
              

def showimage(url_or_array, *args, **kwargs):
    """ Displays image; removes x/y labels"""
    ax = kwargs.pop('ax', pylab.gca())
    if isinstance(url_or_array, basestring):
        out = ax.imshow(skio.imread(url_or_array), *args, **kwargs)
    else:
        out = ax.imshow(url_or_array, *args, **kwargs)
    # Hide x and y axis ticks/labels
    ax.get_xaxis().set_ticks([])
    ax.get_yaxis().set_ticks([])
    return out
Populating the interactive namespace from numpy and matplotlib

So, for example, my not-so-flattering picture looks like:

In [2]:
showimage(US_AND_THEM['ME'])
plt.title('6th St. Austin');

And for Ink Master:

In [3]:
showimage(CELEBS['INKMASTER'])
plt.title("OMG DAVE NAVARRO!");

And so on:

In [4]:
showimage(CELEBS['GOT_GALS']);

Face Finding

Next, I have to load Alchemy's Python SDK and then I can pass these images into the face recognition features.

In [7]:
import os 
os.chdir(os.path.expanduser('~/Desktop/alchemyapi_python/'))

from alchemyapi import AlchemyAPI
api = AlchemyAPI() #<-- Must Instantiate

We can use AlchemyAPI.faceTagging to find one or more faces in an image. For example, the three ink masters:

In [9]:
api.faceTagging('url', CELEBS['INKMASTER'])
Out[9]:
{u'imageFaces': [{u'age': {u'ageRange': u'35-44', u'score': u'0.389988'},
   u'gender': {u'gender': u'MALE', u'score': u'0.970688'},
   u'height': u'61',
   u'identity': {u'disambiguated': {u'dbpedia': u'http://dbpedia.org/resource/Dave_Navarro',
     u'freebase': u'http://rdf.freebase.com/ns/m.01lz4tf',
     u'name': u'Dave Navarro',
     u'subType': [u'Person',
      u'Composer',
      u'MusicalArtist',
      u'Celebrity',
      u'FilmMusicContributor',
      u'Guitarist',
      u'Lyricist',
      u'MusicalGroupMember',
      u'TVProducer',
      u'TVActor'],
     u'website': u'http://www.6767.com/',
     u'yago': u'http://yago-knowledge.org/resource/Dave_Navarro'},
    u'name': u'Dave Navarro',
    u'score': u'0.970688'},
   u'positionX': u'244',
   u'positionY': u'77',
   u'width': u'61'},
  {u'age': {u'ageRange': u'35-44', u'score': u'0.478642'},
   u'gender': {u'gender': u'MALE', u'score': u'0.995033'},
   u'height': u'72',
   u'positionX': u'445',
   u'positionY': u'80',
   u'width': u'72'},
  {u'age': {u'ageRange': u'45-54', u'score': u'0.388825'},
   u'gender': {u'gender': u'MALE', u'score': u'0.993307'},
   u'height': u'58',
   u'positionX': u'73',
   u'positionY': u'102',
   u'width': u'58'}],
 u'status': u'OK',
 u'totalTransactions': u'4',
 u'url': u'http://www.slangstrong.com/wp-content/uploads/2012/11/Ink-Masters.jpg',
 u'usage': u'By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html'}
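
Note that every value in the response is a string, including the scores and pixel coordinates, so they need converting before any numeric comparison. As a minimal sketch, assuming a response shaped like the one above (the summarize_faces helper and the trimmed response dict are illustrative and not part of the SDK):

```python
def summarize_faces(response):
    """Return one (gender, gender_score, age_range, age_score) tuple per face."""
    summary = []
    # A response without detected faces is assumed to lack or empty 'imageFaces'
    for tag in response.get('imageFaces', []):
        summary.append((tag['gender']['gender'],
                        float(tag['gender']['score']),
                        tag['age']['ageRange'],
                        float(tag['age']['score'])))
    return summary

# Trimmed copy of two of the faces from the output above
response = {'status': 'OK', 'imageFaces': [
    {'age': {'ageRange': '35-44', 'score': '0.389988'},
     'gender': {'gender': 'MALE', 'score': '0.970688'},
     'positionX': '244', 'positionY': '77', 'width': '61', 'height': '61'},
    {'age': {'ageRange': '45-54', 'score': '0.388825'},
     'gender': {'gender': 'MALE', 'score': '0.993307'},
     'positionX': '73', 'positionY': '102', 'width': '58', 'height': '58'}]}

faces = summarize_faces(response)
```

The identity block is left out here because, as the output shows, it only appears for faces the service recognizes.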

Store the Faces Pythonically

Rather than work with these JSON streams, I'll make a class (a namedtuple, specifically) to store a face. Some of the metadata stored on the face will include:

 - Pixels in original image where face is
 - Reference to original image
 - Predicted gender and confidence score
 - Predicted ageRange and confidence score

The code below defines such a class. If you're not a programmer, don't worry about this next cell too much.

In [10]:
from collections import namedtuple

# Custom named tuple class, custom printout
Face = namedtuple('Face', ['face', 'index', 'gender', 'genderscore', 'agerange', 'agescore'], verbose=False)
def newrep(obj):
    return 'FACE_{o.index} ({o.gender} {o.genderscore}, {o.agerange} {o.agescore})'.format(o=obj)
Face.__repr__ = newrep

def cutface(image, facetag):
    """ Given original image and JSON face tag, crop out the face. """
    
    def _parseFace(attr):
        return int(facetag[attr])

    X, Y, WIDTH, HEIGHT = _parseFace('positionX'), _parseFace('positionY'), _parseFace('width'), _parseFace('height')
    return image[Y:Y+HEIGHT, X:X+WIDTH]

def mapfaces(facedict):
    """ Takes a dictionary of name:url (see BUDDIES above) and returns name:Face,
    where FACE is the python class for storing image face and metadata.
    """

    out = {}
    for name, imageurl in facedict.items():
        try:
            faces = api.faceTagging('url', imageurl)['imageFaces']
        except Exception as exc:
            print "FAILED ON IMAGE: %s with exception:\n%s" % (name, exc.message)
            continue
        image = skio.imread(imageurl)
        out[name] = []
    
        # Iterate over faces, store
        for (idx, facetag) in enumerate(faces):
            faceregion = cutface(image, facetag)
            gender = facetag['gender']['gender']
            genderscore = float(facetag['gender']['score'])
            agerange = facetag['age']['ageRange']
            agescore = float(facetag['age']['score'])
            if agerange == '<18':
                agerange = '0-18' #<-- For sorting later on
            out[name].append(Face(faceregion, idx, gender, genderscore, agerange, agescore))
    return out
    
# THIS ACTUALLY DOES THE MAPPING
US_AND_THEM = mapfaces(US_AND_THEM)

Now the face regions of each image, along with the important metadata, are stored in lists of Face objects. These lists live in the US_AND_THEM dictionary, keyed by name. We can access the cropped face image through the face attribute, and some images have multiple faces.
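
The crop itself is plain NumPy slicing. Images loaded by scikit-image are arrays indexed as [row, column], so the vertical coordinate comes first. Here is a small sketch on a synthetic array (the crop helper and the blank stand-in image are made up for illustration; the coordinates are Dave Navarro's from the output above):

```python
import numpy as np

def crop(image, x, y, width, height):
    """Return the width-by-height region whose top-left corner is (x, y)."""
    # Rows correspond to Y and columns to X, hence the order of the slices
    return image[y:y + height, x:x + width]

# Stand-in for an RGB image: 300 rows (height), 600 columns (width)
image = np.zeros((300, 600, 3), dtype=np.uint8)
face = crop(image, x=244, y=77, width=61, height=61)
face_shape = face.shape  # (61, 61, 3)
```

The slice is a view into the original array, so modifying the crop in place would also modify the source image. Call .copy() on the result if you need independent data.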

In [11]:
f, axes = plt.subplots(1,3, figsize=(4,2))

for idx, celeb in enumerate(US_AND_THEM['INKMASTER']):
    showimage(celeb.face, ax=axes[idx])

f.suptitle('The Faces of Ink Mastery');

Analytics

Alchemy's facial recognition predicts age range and gender, and provides a confidence score for each prediction. For example, for a face predicted to be male, how confident is the algorithm in that prediction? Let's arrange the faces by increasing confidence in the gender prediction...


To do so, first subdivide the list of faces by gender. We'll also need the multi_axes function from my pyparty library, which makes it easier to plot lots of faces by having Python automatically figure out the axes sizes and counts (you will need pyparty to reproduce this).

In [12]:
from pyparty.utils import multi_axes

ALL = []
MEN = []
WOMEN = []
for faces in US_AND_THEM.values():
    for face in faces:
        ALL.append(face)
        if face.gender == 'MALE':
            MEN.append(face)
        else:
            WOMEN.append(face)
            
def multiface_plot(faces, title=''):
    """ From list of faces, plots each face, maintaining sort order
    and figures out sizing/number of subaxes to create automatically.
    """
    def rint(x): 
        return int(round(x))
   
    num = len(faces)
    
    # Ad-hoc stuff to ensure images sized right
    size=(5+rint(num/2), 2.5+rint(num/2))
    axes = multi_axes(num, figsize=size)[0]

    # Plot faces in the order given (callers sort the list beforehand)
    for (idx, face) in enumerate(faces):
        try:
            ax = axes[idx]
        except TypeError:
            ax = axes #<--- IF length 1
        showimage(face.face, ax=ax)
        plt.tight_layout()
    f=plt.gcf()
    f.suptitle(title,
               fontsize=16,
               y=1.02)
    return f, axes
In [13]:
MALECERTAINTY = sorted(MEN, key=lambda x: x.genderscore)
multiface_plot(MALECERTAINTY, title='Increasing Confidence that this is a man -->');
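
The same pattern should carry over to age. The age ranges are strings such as '35-44', so a sort key has to pull out a number first. This is a minimal sketch, assuming every range has the 'low-high' form (which is why '<18' was rewritten to '0-18' earlier); the age_key helper, the MiniFace stand-in, and the sample values are all hypothetical:

```python
from collections import namedtuple

# Stripped-down stand-in for the Face class defined earlier
MiniFace = namedtuple('MiniFace', ['index', 'agerange', 'agescore'])

def age_key(face):
    """Sort by the lower bound of the age range, then by confidence."""
    low = int(face.agerange.split('-')[0])
    return (low, face.agescore)

# Made-up faces, not real API results
sample = [MiniFace(0, '35-44', 0.48),
          MiniFace(1, '0-18', 0.61),
          MiniFace(2, '35-44', 0.39),
          MiniFace(3, '18-24', 0.55)]

by_age = sorted(sample, key=age_key)
order = [f.index for f in by_age]  # [1, 3, 2, 0]
```

Any range that does not follow the 'low-high' form would need its own handling before this key could be used.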