Non Local Means

Important: Please read the installation page for details about how to install the toolboxes. $\newcommand{\dotp}[2]{\langle #1, #2 \rangle}$ $\newcommand{\enscond}[2]{\lbrace #1, #2 \rbrace}$ $\newcommand{\pd}[2]{ \frac{ \partial #1}{\partial #2} }$ $\newcommand{\umin}[1]{\underset{#1}{\min}\;}$ $\newcommand{\umax}[1]{\underset{#1}{\max}\;}$ $\newcommand{\umin}[1]{\underset{#1}{\min}\;}$ $\newcommand{\uargmin}[1]{\underset{#1}{argmin}\;}$ $\newcommand{\norm}[1]{\|#1\|}$ $\newcommand{\abs}[1]{\left|#1\right|}$ $\newcommand{\choice}[1]{ \left\{ \begin{array}{l} #1 \end{array} \right. }$ $\newcommand{\pa}[1]{\left(#1\right)}$ $\newcommand{\diag}[1]{{diag}\left( #1 \right)}$ $\newcommand{\qandq}{\quad\text{and}\quad}$ $\newcommand{\qwhereq}{\quad\text{where}\quad}$ $\newcommand{\qifq}{ \quad \text{if} \quad }$ $\newcommand{\qarrq}{ \quad \Longrightarrow \quad }$ $\newcommand{\ZZ}{\mathbb{Z}}$ $\newcommand{\CC}{\mathbb{C}}$ $\newcommand{\RR}{\mathbb{R}}$ $\newcommand{\EE}{\mathbb{E}}$ $\newcommand{\Zz}{\mathcal{Z}}$ $\newcommand{\Ww}{\mathcal{W}}$ $\newcommand{\Vv}{\mathcal{V}}$ $\newcommand{\Nn}{\mathcal{N}}$ $\newcommand{\NN}{\mathcal{N}}$ $\newcommand{\Hh}{\mathcal{H}}$ $\newcommand{\Bb}{\mathcal{B}}$ $\newcommand{\Ee}{\mathcal{E}}$ $\newcommand{\Cc}{\mathcal{C}}$ $\newcommand{\Gg}{\mathcal{G}}$ $\newcommand{\Ss}{\mathcal{S}}$ $\newcommand{\Pp}{\mathcal{P}}$ $\newcommand{\Ff}{\mathcal{F}}$ $\newcommand{\Xx}{\mathcal{X}}$ $\newcommand{\Mm}{\mathcal{M}}$ $\newcommand{\Ii}{\mathcal{I}}$ $\newcommand{\Dd}{\mathcal{D}}$ $\newcommand{\Ll}{\mathcal{L}}$ $\newcommand{\Tt}{\mathcal{T}}$ $\newcommand{\si}{\sigma}$ $\newcommand{\al}{\alpha}$ $\newcommand{\la}{\lambda}$ $\newcommand{\ga}{\gamma}$ $\newcommand{\Ga}{\Gamma}$ $\newcommand{\La}{\Lambda}$ $\newcommand{\si}{\sigma}$ $\newcommand{\Si}{\Sigma}$ $\newcommand{\be}{\beta}$ $\newcommand{\de}{\delta}$ $\newcommand{\De}{\Delta}$ $\newcommand{\phi}{\varphi}$ $\newcommand{\th}{\theta}$ $\newcommand{\om}{\omega}$ $\newcommand{\Om}{\Omega}$

This numerical tour study image denoising using non-local means. This algorithm has been introduced for denoising purposes in BuaCoMoA05

In [1]:
from __future__ import division

import numpy as np
import scipy as scp
import pylab as pyl
import matplotlib.pyplot as plt

from nt_toolbox.general import *
from nt_toolbox.signal import *

%matplotlib inline
%load_ext autoreload
%autoreload 2

Patches in Images

This numerical tour is dedicated to the study of the structure of patches in images.

Size $N = n \times n$ of the image.

In [2]:
n = 128

We load a noisy image $f_0\in \RR^N$.

In [3]:
c = [100,200]
f0 = load_image("nt_toolbox/data/lena.bmp")
f0 = rescale(f0[c[0]-n//2:c[0]+n//2, c[1]-n//2:c[1]+n//2])

Display $f_0$.

In [4]:
plt.figure(figsize = (5,5))
imageplot(f0)

Noise level $\si$.

In [5]:
sigma = .04

Generate a noisy image $f=f_0+\epsilon$ where $\epsilon \times \Nn(0,\si^2\text{Id}_N)$.

In [6]:
from numpy import random
f = f0 + sigma*random.standard_normal((n,n))

Display $f$.

In [7]:
plt.figure(figsize = (5,5))
imageplot(clamp(f))

We denote $w$ to be the half width of the patches, and $w_1=2w+1$ the full width.

In [8]:
w = 3
w1 = 2*w + 1

We set up large $(n,n,w_1,w_1)$ matrices to index the the X and Y position of the pixel to extract.

Location of pixels to extract.

In [9]:
[X,Y,dX,dY] = np.meshgrid(np.arange(1,n+1),np.arange(1,n+1),np.arange(-w,w+1),np.arange(-w,w+1))
X = X + dX
Y = Y + dY

We handle boundary condition by reflexion

In [10]:
X[X < 1] = 2-X[X < 1] 
Y[Y < 1] = 2-Y[Y < 1]
X[X > n] = 2*n-X[X > n]
Y[Y > n] = 2*n-Y[Y > n]

Patch extractor operator.

In [11]:
I = (X-1) + (Y-1)*n
for i in range(n//w):
    for j in range(n//w):
        I[i,j] = np.transpose(I[i,j])
        
patch = lambda f: np.ravel(f)[I]

Define the patch matrix $P$ of size $(n,n,w_1,w_1)$. Each $P(i,j,:,:)$ represent an $(w_1,w_1)$ patch extracted around pixel $(i,j)$ in the image.

In [12]:
P = patch(f)

Display some example of patches.

In [13]:
from numpy import random

plt.figure(figsize = (5,5))

for i in range(16):
    x = random.randint(n)
    y = random.randint(n)
    imageplot(P[x, y], '', [4, 4, i+1])

Dimensionality Reduction with PCA

Since NL-means type algorithms require the computation of many distances between patches, it is advantagous to reduce the dimensionality of the patch while keeping as much as possible of information.

Target dimensionality $d$.

In [14]:
d = 25

A linear dimensionality reduction is obtained by Principal Component Analysis (PCA) that projects the data on a small number of leading direction of the covariance matrix of the patches.

Turn the patch matrix into an $(w_1*w_1,n*n)$ array, so that each $P(:,i)$ is a $w_1*w_1$ vector representing a patch.

In [15]:
resh = lambda P: np.transpose((np.reshape(P, (n*n,w1*w1), order="F")))

Operator to remove the mean of the patches to each patch.

In [16]:
remove_mean = lambda Q: Q - np.tile(np.mean(Q,0),(w1*w1,1))

Compute the mean and the covariance of the points cloud representing the patches.

In [17]:
P1 = remove_mean(resh(P))
C = np.dot(P1,np.transpose(P1))

Extract the eigenvectors, sorted by decreasing amplitude.

In [18]:
from numpy import linalg

[D,V] = linalg.eig(C)
D = np.sort(D)[::-1]
I = np.argsort(D)[::-1]
V = V[I,:]

Display the decaying amplitude of the eigenvalues.

In [19]:
plt.plot(D, linewidth = 2)
plt.ylim(0,max(D))
plt.show()

Display the leading eigenvectors - they look like Fourier modes.

In [20]:
plt.figure(figsize = (5,5))
for i in range(16):
    imageplot(abs(np.reshape(V[:,i], (w1,w1))), '', [4, 4, i+1])

Patch dimensionality reduction operator.

In [21]:
iresh = lambda Q: np.reshape(np.transpose(Q),(n,n,d),order="F")
descriptor = lambda f: iresh(np.dot(np.transpose(V[: ,:d]),remove_mean(resh(P))))

Each $H(i,j,:)$ is a $d$-dimensional descriptor of a patch.

In [22]:
H = descriptor(f)

Non-local Filter

NL-means applies, an adaptive averaging kernel is computed from patch distances to each pixel location.

We denote $H_{i} \in \RR^d$ the descriptor at pixel $i$. We define the distance matrix $$ D_{i,j} = \frac{1}{w_1^2}\norm{H_i-H_j}^2. $$

Operator to compute the distances $(D_{i,j})_j$ between the patch around $i=(i_1,i_2)$ and all the other ones.

In [23]:
distance = lambda i: np.sum((H - np.tile(H[i[0],i[1],:], (n,n,1)))**2, 2)/(w1*w1)

The non-local mean filter computes a denoised image $\tilde f$ as :

$$ \tilde f_i = \sum_j K_{i,j} f_j $$ where the weights $K$ are computed as : $$ K_{i,j} = \frac{ \tilde K_{i,j} }{ \sum_{j'} \tilde K_{i,j'} } \qandq \tilde K_{i,j} = e^{-\frac{D_{i,j}}{2\tau^2}} . $$

The width $\tau$ of the Gaussian is very important and should be adapted to match the noise level.

Compute and normalize the weight.

In [24]:
normalize = lambda K: K/np.sum(K)
kernel = lambda i,tau: normalize(np.exp(-distance(i)/(2*tau**2)))

Compute a typical example of kernel for some pixel position $(x,y)$.

In [25]:
tau = .05
i = [83,72]
D = distance(i)
K = kernel(i, tau)

Display the squared distance and the kernel.

In [26]:
plt.figure(figsize = (10,10))
imageplot(D, 'D', [1, 2, 1])
imageplot(K, 'K', [1, 2, 2])

Localizing the Non-local Means

We set a "locality constant" $q$ that set the maximum distance between patches to compare. This allows to speed up computation, and makes NL-means type methods semi-global (to avoid searching in all the image).

In [27]:
q = 14

Using this locality constant, we compute the distance between patches only within a window. Once again, one should be careful about boundary conditions.

In [28]:
selection = lambda i: np.array((clamp(np.arange(i[0]-q,i[0] + q + 1), 0, n-1), clamp(np.arange(i[1]-q,i[1] + q + 1), 0, n-1)))

Compute distance and kernel only within the window.

In [29]:
def distance_0(i,sel): 
    H1 = (H[sel[0],:,:])
    H2 = (H1[:,sel[1],:])
    return np.sum((H2 - np.tile(H[i[0],i[1],:],(len(sel[0]),len(sel[1]),1)))**2,2)/w1*w1

distance = lambda i: distance_0(i, selection(i))
kernel = lambda i, tau: normalize(np.exp(-distance(i)/ (2*tau**2)))

Compute a typical example of kernel for some pixel position $(x,y)$.

In [30]:
D = distance(i)
K = kernel(i, tau)

Display the squared distance and the kernel.

In [31]:
plt.figure(figsize = (10,10))

imageplot(D, 'D', [1, 2, 1])
imageplot(K, 'K', [1, 2, 2])

The NL-filtered value at pixel $(x,y)$ is obtained by averaging the values of $f$ with the weight $K$.

In [32]:
def NLval_0(K,sel): 
    f_temp = f[sel[0],:]
    return np.sum(K*f_temp[:, sel[1]])

NLval = lambda i, tau: NLval_0(kernel(i, tau), selection(i))

We apply the filter to each pixel location to perform the NL-means algorithm.

In [33]:
[Y, X] = np.meshgrid(np.arange(0,n),np.arange(0,n))

def arrayfun(f,X,Y):
    n = len(X)
    p = len(Y)
    R = np.zeros([n,p])
    for k in range(n):
        for l in range(p):
            R[k,l] = f(k,l)
    return R

NLmeans = lambda tau: arrayfun(lambda i1, i2: NLval([i1,i2], tau), X, Y)

Display the result for some value of $\tau$.

In [34]:
tau = .03

plt.figure(figsize = (5,5))
imageplot(NLmeans(tau))

Exercise 1

Compute the denoising result for several values of $\tau$ in order to determine the optimal denoising that minimizes $\norm{\tilde f - f_0}$.

In [35]:
run -i nt_solutions/denoisingadv_6_nl_means/exo1
In [36]:
## Insert your code here.

Display the best result.

In [37]:
plt.figure(figsize = (5,5))
imageplot(clamp(fNL))

Exercise 2

Explore the influence of the $q$ and $w$ parameters.

In [38]:
run -i nt_solutions/denoisingadv_6_nl_means/exo2
In [39]:
## Insert your code here.

Bibliography