We can create a multimodal multivariate gaussian using MultimodalGaussianLogPDF. By default, this has the distribution
$$ p(\boldsymbol{x}) \propto \mathcal{N}\left(\boldsymbol{x}\;\lvert\;\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_1\right) + \mathcal{N}\left(\boldsymbol{x}\;\lvert\;\boldsymbol{\mu}_2, \boldsymbol{\Sigma}_2\right),$$ where $\boldsymbol{\mu}_1 = (0,0)$ and $\boldsymbol{\mu}_2 = (10,10)$, and $\boldsymbol{\Sigma}_1$ and $\boldsymbol{\Sigma}_2$ are identity covariance matrices.
Plotting this pdf:
import pints
import pints.toy
import numpy as np
import matplotlib.pyplot as plt
# Create the default two-mode log pdf (modes at (0, 0) and (10, 10))
log_pdf = pints.toy.MultimodalGaussianLogPDF()

# Contour plot of the pdf, evaluated on a regular grid.
# (Removed an unused `levels` array and a dead `Z = np.zeros(...)` that
# was immediately overwritten.)
num_points = 100  # grid resolution per axis
x = np.linspace(-5, 15, num_points)
y = np.linspace(-5, 15, num_points)
X, Y = np.meshgrid(x, y)
# Exponentiate the log pdf to plot the (un-normalised) density
Z = np.exp([[log_pdf([i, j]) for i in x] for j in y])
plt.contour(X, Y, Z)
plt.xlabel('x')
plt.ylabel('y')
plt.show()
Use adaptive covariance MCMC to sample from this (un-normalised) pdf.
# Number of chains to run (previously defined after the controller and
# never used; the value 4 was hard-coded in two places)
num_chains = 4
# Create an adaptive covariance MCMC routine, starting the chains
# between the two modes
x0 = np.random.uniform([2, 2], [8, 8], size=(num_chains, 2))
mcmc = pints.MCMCController(
    log_pdf, num_chains, x0, method=pints.AdaptiveCovarianceMCMC)
# Set maximum number of iterations
mcmc.set_max_iterations(4000)
# Disable logging to screen
mcmc.set_log_to_screen(False)
# Run!
print('Running...')
chains = mcmc.run()
print('Done!')
# Discard warm-up
chains = [chain[1000:] for chain in chains]
Running... Done!
Scatter plot of the samples. Adaptive covariance MCMC does ok if we start the chains between the modes.
# Pool the post-warm-up samples from every chain into one array
stacked = np.vstack(chains)
# Overlay the samples on the pdf contours
plt.contour(X, Y, Z, colors='k', alpha=0.5)
plt.scatter(stacked[:, 0], stacked[:, 1], marker='.', alpha=0.2)
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(-5, 15)
plt.ylim(-5, 15)
plt.show()
And KL divergence by mode is good here.
# Report per-mode KL divergence of the pooled samples (lower is better)
print('KL divergence by mode:', log_pdf.kl_divergence(stacked))
KL divergence by mode: [ 0.00746156 0.00312236]
But if we start the chains at one of the modes, it can fail to find the other:
# Number of chains to run (previously defined after the controller and
# never used; the value 4 was hard-coded in two places)
num_chains = 4
# Create an adaptive covariance MCMC routine, this time starting all
# chains close to the mode at (0, 0)
x0 = np.random.uniform([-0.1, -0.1], [0.1, 0.1], size=(num_chains, 2))
mcmc = pints.MCMCController(
    log_pdf, num_chains, x0, method=pints.AdaptiveCovarianceMCMC)
# Set maximum number of iterations
mcmc.set_max_iterations(4000)
# Disable logging to screen
mcmc.set_log_to_screen(False)
# Run!
print('Running...')
chains = mcmc.run()
print('Done!')
# Discard warm-up
chains = [chain[1000:] for chain in chains]
Running... Done!
# Pool the post-warm-up samples from every chain into one array
stacked = np.vstack(chains)
# Overlay the samples on the pdf contours
plt.contour(X, Y, Z, colors='k', alpha=0.5)
plt.scatter(stacked[:, 0], stacked[:, 1], marker='.', alpha=0.2)
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(-5, 15)
plt.ylim(-5, 15)
plt.show()
Now the KL divergence for one mode is good, but for the other it is terrible.
# Report per-mode KL divergence of the pooled samples (lower is better)
print('KL divergence by mode:', log_pdf.kl_divergence(stacked))
KL divergence by mode: [ 0.00681348 0.00906176]
This distribution can also be used to generate modes in other configurations and other dimensions. For example, we can generate three modes in two dimensions then independently sample from them.
# Three modes in two dimensions (default covariances)
log_pdf = pints.toy.MultimodalGaussianLogPDF(modes=[[0, 0], [5, 10], [10, 0]])
# Draw independent samples directly from the distribution
samples = log_pdf.sample(100)

# Contour plot of the pdf, evaluated on a regular grid.
# (Removed an unused `levels` array and a dead `Z = np.zeros(...)` that
# was immediately overwritten.)
num_points = 100  # grid resolution per axis
x = np.linspace(-5, 15, num_points)
y = np.linspace(-5, 15, num_points)
X, Y = np.meshgrid(x, y)
Z = np.exp([[log_pdf([i, j]) for i in x] for j in y])
plt.contour(X, Y, Z)
plt.scatter(samples[:, 0], samples[:, 1])
plt.xlabel('x')
plt.ylabel('y')
plt.show()
Or specify different covariance matrices for each mode.
# Per-mode covariance matrices: isotropic, positively correlated,
# and negatively correlated
covariances = [[[1, 0], [0, 1]],
               [[2, 0.8], [0.8, 3]],
               [[1, -0.5], [-0.5, 1]]]
log_pdf = pints.toy.MultimodalGaussianLogPDF(
    modes=[[0, 0], [5, 10], [10, 0]], covariances=covariances)
samples = log_pdf.sample(100)

# Contour plot of the pdf, evaluated on a regular grid.
# (Removed a dead `Z = np.zeros(...)` that was immediately overwritten.)
num_points = 100  # grid resolution per axis
x = np.linspace(-5, 15, num_points)
y = np.linspace(-5, 15, num_points)
X, Y = np.meshgrid(x, y)
Z = np.exp([[log_pdf([i, j]) for i in x] for j in y])
plt.contour(X, Y, Z)
plt.scatter(samples[:, 0], samples[:, 1])
plt.xlim(-5, 15)
plt.ylim(-5, 15)
plt.xlabel('x')
plt.ylabel('y')
plt.show()
Or make a multimodal distribution in one dimension.
# A three-mode mixture in one dimension
log_pdf = pints.toy.MultimodalGaussianLogPDF(modes=[[0], [5], [10]])
x = np.linspace(-5, 15, num_points)
prob = np.exp([log_pdf([i]) for i in x])
samples = log_pdf.sample(1000)
# The pdf is proportional to a *sum* of three unit normals, so divide by
# the number of modes (3 — the original divided by 2) to get a normalised
# density that matches the density-scaled histogram
plt.plot(x, prob / 3)
plt.hist(samples, 40, density=True)
plt.xlabel('x')
plt.show()
Or a multimodal distribution in 5 dimensions (plotting the first two dimensions only).
# Three modes in five dimensions, at (0,...,0), (5,...,5) and (10,...,10)
log_pdf = pints.toy.MultimodalGaussianLogPDF(
    modes=[np.repeat(0, 5), np.repeat(5, 5), np.repeat(10, 5)])
samples = log_pdf.sample(100)
# Show only the first two coordinates of each sample
plt.scatter(samples[:, 0], samples[:, 1])
plt.xlabel('x')
plt.ylabel('y')
plt.show()
Now use Hamiltonian Monte Carlo to sample from this distribution.
# Number of chains to run (previously defined after the controller and
# never used; the value 4 was hard-coded in two places)
num_chains = 4
# Create a Hamiltonian Monte Carlo routine (the old comment wrongly said
# "adaptive covariance"), starting the chains between the modes
x0 = np.random.uniform([2, 2, 2, 2, 2], [8, 8, 8, 8, 8], size=(num_chains, 5))
sigma0 = [1, 1, 1, 1, 1]
# Use MCMCController for consistency with the cells above
# (MCMCSampling is a deprecated alias)
mcmc = pints.MCMCController(
    log_pdf, num_chains, x0, method=pints.HamiltonianMCMC, sigma0=sigma0)
# Set maximum number of iterations
mcmc.set_max_iterations(500)
# Logging is deliberately left enabled here to show progress;
# uncomment to silence it
# mcmc.set_log_to_screen(False)
# Run!
print('Running...')
chains = mcmc.run()
print('Done!')
# Discard warm-up
chains = [chain[250:] for chain in chains]
Running... Using Hamiltonian MCMC Generating 4 chains. Running in sequential mode. Iter. Eval. Accept. Accept. Accept. Accept. Time m:s 0 4 0 0 0 0 0:00.0 1 84 0.333 0.333 0.333 0.333 0:00.0 2 164 0.5 0.5 0.5 0.5 0:00.1 3 244 0.6 0.6 0.6 0.6 0:00.1 20 1604 0.909 0.909 0.909 0.909 0:00.7 40 3204 0.952381 0.952381 0.952381 0.952381 0:01.3 60 4804 0.968 0.968 0.968 0.968 0:02.0 80 6404 0.976 0.976 0.976 0.976 0:02.7 100 8004 0.98 0.98 0.98 0.98 0:03.3 120 9604 0.984 0.984 0.984 0.984 0:04.0 140 11204 0.986 0.986 0.986 0.986 0:04.7 160 12804 0.988 0.988 0.988 0.988 0:05.3 180 14404 0.989011 0.989011 0.989011 0.989011 0:06.0 200 16004 0.990099 0.990099 0.990099 0.990099 0:06.6 220 17604 0.990991 0.990991 0.990991 0.990991 0:07.3 240 19204 0.992 0.992 0.992 0.992 0:08.0 260 20804 0.989 0.992 0.992 0.992 0:08.6 280 22404 0.989 0.993 0.993 0.993 0:09.3 300 24004 0.99 0.993 0.993 0.993 0:10.0 320 25604 0.988 0.994 0.994 0.994 0:10.6 340 27204 0.988 0.994152 0.994152 0.994152 0:11.3 360 28804 0.989 0.994 0.994 0.994 0:12.0 380 30404 0.99 0.995 0.995 0.995 0:12.6 400 32004 0.99 0.995 0.995 0.995 0:13.3 420 33604 0.991 0.995 0.995 0.995 0:13.9 440 35204 0.991 0.995 0.995 0.995 0:14.6 460 36804 0.991342 0.995671 0.995671 0.995671 0:15.3 480 38404 0.992 0.996 0.996 0.996 0:15.9 500 39924 0.992016 0.996008 0.996008 0.996008 0:16.5 Halting: Maximum number of iterations (500) reached. Done!
Plot samples in two dimensions from a single chain. Hamiltonian Monte Carlo gets stuck on a mode and remains there!
# Trace of the first two dimensions of a single chain (chain 0).
# The original plotted dimension 2 of chain 0 against dimension 2 of
# chain 1, which does not match "two dimensions from a single chain".
plt.plot(chains[0][:, 0], chains[0][:, 1])
plt.xlim(-5, 15)
plt.ylim(-5, 15)
plt.xlabel('x')
plt.ylabel('y')
plt.show()