Basic background

Here we provide some basic background information on the use of so-called Gabor filters to extract a rich collection of pertinent visual features from images. While a detailed introduction is out of the scope of this lesson, here we shall try to communicate just the core ideas.

In words, a Gabor filter is a sinusoid modulated by a Gaussian function. The basic formulation is

\begin{align*} (\text{Filter output}) = (\text{Sinusoid carrier}) \times (\text{Gaussian envelope}). \end{align*}

There are 1D, 2D, and 3D versions common in the literature. Some great images of the graphs of key quantities related to Gabor filters are given by Jones et al. (1987). Gabor filters have been applied in tasks such as modeling the receptive field (DeAngelis et al., 1993), the identification of textures (via scikit-image docs, "Gabor filter banks for texture classification"), facial recognition (via Kyutech BIS Lab), and much more. Note that the following points are of critical importance in practice:

  • Modifying the frequency parameter of the sinusoid affects the resolution of the filter.
  • Modifying the radius of the Gaussian impacts how locally concentrated (in space or time) the filter's impact is.
  • The simplest filter is temporal, more elaborate filters are spatial or spatio-temporal.

There is more than one way to "apply" a filter to a signal, but it is likely that the most common approach is to take the convolution of the two signals (since both the input and the filter are functions of the same domain). Recall that the convolution of two arbitrary signals $f$ and $g$, denoted $f \ast g$, is defined in the 1D case by

\begin{align*} (f \ast g)(t) = \int_{\mathbb{R}} f(t)g(t-\tau) \, d\tau \end{align*}

and analogously in the 2D case by

\begin{align*} (f \ast g)(x,y) = \int_{\mathbb{R}^{2}} f(x,y)g(x-\alpha,y-\beta) \, d\alpha\,d\beta. \end{align*}

Since the convolution is itself a function of the same domain as the signals from which it is comprised, it too is a signal. When $f$ is the input signal, and $g$ is our filter, then $f \ast g$ shall be the final filter response.

One-dimensional Gabor filters

In order to implement this ourselves, however, these illustrations are not enough. A slightly more formal representation is required. Let's start with the one-dimensional case, a temporal Gabor filter, written

\begin{align*} G(t) = f(t) \, s(t) \end{align*}

where

\begin{align*} s(t) & = \exp \left(i (2\pi ut + \phi) \right)\\ & = \cos(2\pi ut + \phi) + i\sin(2\pi ut + \phi)\\ f(t) & = A \exp\left( -\frac{t^2}{\sigma^2} \right). \end{align*}

The parameters are as follows:

  • $u$: frequency, how many cycles the sinusoid makes per unit of time. For example, if $u=1$, the period is one unit of time (seconds, minutes, days, etc.). If $u=2$, then the period is 1/2 a unit of time, and since it makes two cycles per unit of time. If $u=1/2$, conversely, then it takes two units of time to complete a single cycle, and so on.
  • $A$: amplitude, the maximum magnitude of the filter.
  • $\phi$: phase, shifts where (on time domain) the functions is takes large +/- values.
  • $\sigma$: controls the breadth of the Gaussian function. Smaller $\sigma > 0$ implies that the function rapidly decreases as $t$ increases (in either +/- direction).

Enough equations. Let's define our own functions and visualize them to get a better grasp on things.

In [1]:
import math
import numpy as np


def G_carrier_real(t, freq, phase):
    '''
    Real part of the carrier.
    '''
    topass = 2 * math.pi * freq * t + phase
    out = np.cos(topass)
    return out

def G_carrier_imag(t, freq, phase):
    '''
    Imaginary part of the carrier.
    '''
    topass = 2 * math.pi * freq * t + phase
    out = np.sin(topass)
    return out


def G_envelope(t, amp, sdev):
    '''
    The impact of the filter is controlled by a Gaussian function.
    '''
    out = amp * np.exp( (-(t/sdev)**2) )
    return out

The filter itself is specified by a handful of parameters, as below.

In [2]:
myparas = {"amp": 1/(2*math.pi),
           "sdev": 1,
           "freq": 1,
           "phase": 0}

These parameters are then used to specify a customized filter.

In [3]:
def G_fil_real(t, paras):
    '''
    Custom-built filter response (real part).
    Assumes that t is an array of temporal inputs.
    '''
    carrier = G_carrier_real(t=t, freq=paras["freq"], phase=paras["phase"])
    envelope = G_envelope(t=t, amp=paras["amp"], sdev=paras["sdev"])
    out = carrier * envelope
    return out

def G_fil_imag(t, paras):
    '''
    Custom-built filter response (imaginary part).
    Assumes that t is an array of temporal inputs.
    '''
    carrier = G_carrier_imag(t=t, freq=paras["freq"], phase=paras["phase"])
    envelope = G_envelope(t=t, amp=paras["amp"], sdev=paras["sdev"])
    out = carrier * envelope
    return out

Note that the $(\text{carrier}) \times (\text{envelope})$ form is clear in the above definitions. Let's visualize each part of this.

In [5]:
import matplotlib.pyplot as plt

myfig = plt.figure(figsize=(16,4))
t_inputs = np.linspace(-3,3,500)

ax_carrier = myfig.add_subplot(1,3,1)
plt.title("Sinusoidal carrier")
plt.xlabel("Time (s)")
ax_envelope = myfig.add_subplot(1,3,2)
plt.title("Gaussian envelope")
plt.xlabel("Time (s)")
ax_filter = myfig.add_subplot(1,3,3)
plt.title("Gabor filter (1D)")
plt.xlabel("Time (s)")

ax_carrier.plot(t_inputs, G_carrier_real(t=t_inputs, freq=myparas["freq"], phase=myparas["phase"]))
ax_carrier.plot(t_inputs, G_carrier_imag(t=t_inputs, freq=myparas["freq"], phase=myparas["phase"]))

ax_envelope.plot(t_inputs, G_envelope(t=t_inputs, amp=myparas["amp"], sdev=myparas["sdev"]))

ax_filter.plot(t_inputs, G_fil_real(t=t_inputs, paras=myparas))
ax_filter.plot(t_inputs, G_fil_imag(t=t_inputs, paras=myparas))

plt.show()

Exercises:

  1. Look at the freq parameter and the left-most plot above. Are the number of cycles per unit of time what you expect? Modify freq to values such as 1, 4, 0.5, and 0.25, observing what happens each time.

  2. Change sdev to values such as 0.25 and 2, observing the impact. If necessary, adjust the range of t_inputs to get a better view.

  3. Modify the phase parameter of just the imaginary part, such that it coincides with the real part (they are out of phase by $90^{\circ}$, or $\pi/2$ radians).


Application of 1D filter

Now that we've looked at this filter, let's consider what it means to apply this filter. Let's construct an artificial signal to explore this.

In [6]:
para_HIGHFREQ = 5
para_LOWFREQ = 0.25

def my_signal(t):
    
    highfreq = 0.25 * np.sin((2*math.pi*para_HIGHFREQ*t))
    lowfreq = 2 * np.sin((2*math.pi*para_LOWFREQ*t))
    
    cond = (np.abs(t) <= 5)
    signal = highfreq + lowfreq
    out = np.select([cond], [signal])
    
    return out


myfig = plt.figure(figsize=(10,10))
t_inputs = np.linspace(-10, 10, 500)

plt.plot(t_inputs, my_signal(t=t_inputs))
plt.title("Some artificial signal")
plt.xlabel("Time (s)")
plt.show()

Now, to "apply" a filter typically means to take the convolution of a filter and some input signal. This can be interpreted as a new signal (a function of time), and the most natural filter response.

In [7]:
from scipy import signal

t_inputs = np.linspace(-10, 10, 1000)
sig_values = my_signal(t=t_inputs)

myparas = {"amp": 1/(2*math.pi),
           "sdev": 1,
           "freq": 0.25,
           "phase": 0}
fil_values_real = G_fil_real(t=t_inputs, paras=myparas)
fil_values_imag = G_fil_imag(t=t_inputs, paras=myparas)
fil_response_real = signal.convolve(sig_values, fil_values_real, mode="same")
fil_response_imag = signal.convolve(sig_values, fil_values_imag, mode="same")

myfig = plt.figure(figsize=(18,4))

ax_signal = myfig.add_subplot(1,3,1)
plt.title("Artificial signal")
plt.xlabel("Time (s)")
ax_filter = myfig.add_subplot(1,3,2)
plt.title("Gabor filter")
plt.xlabel("Time (s)")
ax_response = myfig.add_subplot(1,3,3)
plt.title("Filter response")
plt.xlabel("Time (s)")

ax_signal.plot(t_inputs, sig_values)

ax_filter.plot(t_inputs, fil_values_real)
ax_filter.plot(t_inputs, fil_values_imag)

ax_response.plot(t_inputs, fil_response_real)
ax_response.plot(t_inputs, fil_response_imag)

plt.show()

Exercises:

  1. Using the above code, try numerous different values for the freq parameter stored in myparas, ranging between para_LOWFREQ and para_HIGHFREQ. Note how the filter better "picks up" the low-frequency and high-frequency components of our artificial signal as its freq parameter moves closer to para_LOWFREQ and para_HIGHFREQ respectively. Prepare and save some labeled plots that illustrate this.

  2. Play with the phase and standard deviation parameters as well, and describe how their modification impacts the effect of the filter.

  3. Modify the my_signal function such that it has low/med/high frequency components (linear combination of three periodic functions with distinct angular frequencies), and confirm that constructing a Gabor filter with the appropriate freq parameter for each can indeed extract each of the desired underlying component functions.

The key take-away from this simple one-dimensional exercise is that by preparing different filters specified by different parameters, we can extract different features of the underlying signal. This basic principles extends directly to the multidimensional setting we will consider next.


Two-dimensional Gabor filters

Now that we have grasped the basic idea of what a Gabor filter (and many other filters, in fact) can do, let us consider a more general version of the filter. The basic form, we repeat from above, is

\begin{align*} (\text{Filter output}) = (\text{Sinusoid carrier}) \times (\text{Gaussian envelope}) \end{align*}

just as seen in our concrete look at the one-dimensional version. For 2D however, the domain extends from "the line" to "the plane", and instead of passing $t \in \mathbb{R}$ to the filter, we pass $(x,y) \in \mathbb{R}^{2}$, namely spatial coordinates. Formally, we update our notation as

\begin{align*} G(x,y) = f(x,y) \, s(x,y) \end{align*}

where

\begin{align*} s(x,y) & = \exp \left(i (2\pi (ux + vy) + \phi) \right)\\ & = \cos(2\pi (ux + vy) + \phi) + i\sin(2\pi (ux + vy) + \phi)\\ f(x,y) & = A \exp\left( -\frac{x^2 + y^2}{\sigma^2} \right). \end{align*}

As a convention we follow, let the angular frequencies $u$ and $v$ be specified in polar coordinates, as

\begin{align*} u & = \omega \cos(\theta) \\ v & = \omega \sin(\theta) \end{align*}

where $\omega$ (spatial frequency) and $\theta$ (filter orientation) are new parameters here. An explanation of the parameters in the filter definition is as follows:

  • $u$ and $v$: frequency, typically in the "horizontal" and "vertical" directions of an image, namely an array of pixels identified with discrete positions, in which case the frequency parameter specifies the number of cycles the sinusoid makes per pixel.
  • $A$: amplitude, as above.
  • $\phi$: phase, as above.
  • $\sigma$: as above; note that our formulation implies a circular Gaussian. An elliptical form can be constructed by replacing the summands $(x/\sigma)^2$ and $(y/\sigma)^2$ by say $(x/\sigma_{X})^2$ and $(y/\sigma_{Y})^2$ where $\sigma_{X}$ and $\sigma_{Y}$ determine the size of the major/minor axes.

That's enough for the equations here as well. Let's define our own functions and visualize them to get a better grasp on things here.

In [1]:
import numpy as np
import math
import matplotlib.pyplot as plt


def G2_carrier_real(x, y, freqx, freqy, phase):
    '''
    Real part of the 2-D Gabor carrier.
    '''
    topass = 2 * math.pi * (freqx*x + freqy*y) + phase
    out = np.cos(topass)
    return out


def G2_carrier_imag(x, y, freqx, freqy, phase):
    '''
    Imaginary part of the 2-D Gabor carrier.
    '''
    topass = 2 * math.pi * (freqx*x + freqy*y) + phase
    out = np.sin(topass)
    return out


def G2_envelope(x, y, amp, sdev):
    '''
    Gaussian envelope for a 2-D Gabor filter.
    We assume that it is circular (same rate of decrease in x/y directions).
    '''
    out = amp * np.exp(-(x**2+y**2)/(sdev**2))
    return out

As with the one-dimensional case, the filter can be specified by a small handful of parameters.

In [5]:
PIX_W = 128 # image width, in pixels
PIX_H = 128 # image height, in pixels
max_cycles = 4 # the maximum cycles per image.
myparas = {"freqs": max_cycles/max(PIX_W,PIX_H), # cycles per pixel
           "dir": math.pi/2, # orientation
           "amp": 1,
           "sdev": max(PIX_W,PIX_H)/5,
           "phase": 0}

Note here freqs refers to spatial frequency $\omega$ as introduced above. These parameters are then used to build a customized filter.

In [6]:
def G2_fil_real(x, y, paras):
    '''
    Custom-built filter response (real part).
    '''
    # Spatial frequency in polar coordinates.
    u = paras["freqs"] * math.cos(paras["dir"])
    v = paras["freqs"] * math.sin(paras["dir"])
    # Computations.
    carrier = G2_carrier_real(x=x, y=y, freqx=u, freqy=v, phase=paras["phase"])
    envelope = G2_envelope(x=x, y=y, amp=paras["amp"], sdev=paras["sdev"])
    out = carrier * envelope
    return out

def G2_fil_imag(x, y, paras):
    '''
    Custom-built filter response (imaginary part).
    '''
    # Spatial frequency in polar coordinates.
    u = paras["freqs"] * math.cos(paras["dir"])
    v = paras["freqs"] * math.sin(paras["dir"])
    # Computations.
    carrier = G2_carrier_imag(x=x, y=y, freqx=u, freqy=v, phase=paras["phase"])
    envelope = G2_envelope(x=x, y=y, amp=paras["amp"], sdev=paras["sdev"])
    out = carrier * envelope
    return out

Moving ahead, let's look at the graphs of these functions as a simple example to start with.

In [7]:
myfig = plt.figure(figsize=(18,4))

y0 = math.floor(PIX_H/2)
x0 = math.floor(PIX_W/2)
y_inputs, x_inputs = np.mgrid[-y0:(y0+1), -x0:(x0+1)]

# Store pixel values (of envelope, carrier, filter) for plotting via imshow.
out_envelope = G2_envelope(x=x_inputs, y=y_inputs,
                             amp=myparas["amp"],
                             sdev=myparas["sdev"])
out_carrier = G2_carrier_real(x=x_inputs,
                              y=y_inputs,
                              freqx=myparas["freqs"]*math.cos(myparas["dir"]),
                              freqy=myparas["freqs"]*math.sin(myparas["dir"]),
                              phase=myparas["phase"])
out_filter = G2_fil_real(x=x_inputs, y=y_inputs, paras=myparas)


ax_carrier = myfig.add_subplot(1,3,1)
plt.title("Sinusoidal carrier")
topass = ax_carrier.imshow(out_carrier, cmap=plt.cm.BuPu_r)
plt.colorbar(topass)

ax_envelope = myfig.add_subplot(1,3,2)
plt.title("Gaussian envelope")
topass = ax_envelope.imshow(out_envelope, cmap=plt.cm.BuPu_r)
plt.colorbar(topass)

ax_filter = myfig.add_subplot(1,3,3)
plt.title("Gabor filter (2D)")
topass = ax_filter.imshow(out_filter, cmap=plt.cm.BuPu_r)
plt.colorbar(topass)

plt.show()

The 2D spatial filter implemented above is a direct extension of the 1D temporal filter implemented previously, and has completely analogous parameters, as one can confirm in the exercises below.

Exercises:

  1. Modify the dir parameter (corresponding to orientation parameter $\theta$), setting it to zero, $\pi/4$, $\pi/2$, and so on. Describe the qualitative changes in the filter output.

  2. Modify the freqs parameter and note the number of cycles over the full image. Set dir and freqs such that the sinusoid only depends on the horizontal position (i.e., the leftmost plot looks like alternating vertical bars), and the number of cycles in the horizontal direction over the full image is five. Repeat, but with 10 cycles. Repeat once again for 25 cycles.

  3. Replace the real part (carrier_real and G2_fil_real) with the imaginary part in the above code. Qualitatively, what change has taken place with respect to the effect of the filter?

  4. Modify the standard deviation parameter such that its visible radius is approximately half the width/height of the domain pictured. Repeat, making it approximately a quarter. Again, one eighth.


Application of 2D filter

Now it comes time to apply the filters to signals, now defined on the plane rather than the line. In this sense, the pixel values of an image are considered to be the values taken by a signal defined on the plane (thus we call them "2D").

We begin with reading and writing images from file. A standard library for doing this is imageio (http://imageio.github.io/). Assuming all the image files have been saved locally in advance, the following example shows how easy it is to visualize the content of image files using objects from this library.

In [12]:
import imageio

# Read images from file.
im_cat = imageio.imread("img/chelsea.png")
im_boy = imageio.imread("img/bishop.png")
im_parrots = imageio.imread("img/parrots.png")

# Inspect (and plot) the images to ensure they have been read as we expect.
print("Shape:", im_cat.shape, "Type:", type(im_cat), "First value (r,g,b):", im_cat[0,0,:])
print("Shape:", im_boy.shape, "Type:", type(im_boy), "First value (r,g,b):", im_boy[0,0,:])
print("Shape:", im_parrots.shape, "Type:", type(im_parrots), "First value (r,g,b):", im_parrots[0,0,:])

myfig = plt.figure(figsize=(18,4))
ax_cat = myfig.add_subplot(1,3,1)
ax_boy = myfig.add_subplot(1,3,2)
ax_parrots = myfig.add_subplot(1,3,3)
ax_cat.imshow(im_cat)
ax_boy.imshow(im_boy)
ax_parrots.imshow(im_parrots)

plt.show()
Shape: (300, 451, 3) Type: <class 'imageio.core.util.Image'> First value (r,g,b): [143 120 104]
Shape: (128, 128, 3) Type: <class 'imageio.core.util.Image'> First value (r,g,b): [180 116  33]
Shape: (633, 801, 3) Type: <class 'imageio.core.util.Image'> First value (r,g,b): [168 165 160]

Note that the imageio library provides its own image objects (not typical ndarray objects), which are plotted as we would hope (all three channels are used).

Let's take a look at individual channels from a single image.

In [13]:
im = imageio.imread("img/chelsea.png")

myfig = plt.figure(figsize=(18,4))
ax_R = myfig.add_subplot(1,3,1)
plt.title("Red channel")
ax_G = myfig.add_subplot(1,3,2)
plt.title("Green channel")
ax_B = myfig.add_subplot(1,3,3)
plt.title("Blue channel")

ax_R.imshow(im[:,:,0], plt.get_cmap('gray'))
ax_G.imshow(im[:,:,1], cmap=plt.get_cmap('gray'))
ax_B.imshow(im[:,:,2], cmap=plt.get_cmap('gray'))
plt.show()

Or to emphasize the colours a bit more, we can change the colour maps.

In [14]:
myfig = plt.figure(figsize=(18,4))
ax_R = myfig.add_subplot(1,3,1)
plt.title("Red channel")
ax_G = myfig.add_subplot(1,3,2)
plt.title("Green channel")
ax_B = myfig.add_subplot(1,3,3)
plt.title("Blue channel")

ax_R.imshow(im[:,:,0], cmap=plt.cm.Reds)
ax_G.imshow(im[:,:,1], cmap=plt.cm.Greens)
ax_B.imshow(im[:,:,2], cmap=plt.cm.Blues)
plt.show()