Emukit Bayesian Optimization Benchmark

This notebook uses the emukit.benchmarking package to compare two Bayesian optimization methods against each other, using the Branin test function.

In [1]:
import emukit
import numpy as np

Set up test function

We use the Branin function, which is already included in Emukit. Both the function and the appropriate input domain are ready-made for us.

In [2]:
from emukit.test_functions.branin import branin_function
branin_fcn, parameter_space = branin_function()
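
For reference, the Branin function itself is short enough to write out by hand. The sketch below is not Emukit's implementation: it assumes the commonly used constants and the usual domain (x1 in [-5, 10], x2 in [0, 15]), and it takes two scalars for brevity rather than an array of points. The names `branin`, `minimisers` and `minimum_values` are hypothetical.

```python
import math


def branin(x1, x2):
    """Branin function with the commonly used constants (an assumption here)."""
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s


# The three global minimisers on the usual domain
minimisers = [(-math.pi, 12.275), (math.pi, 2.275), (3.0 * math.pi, 2.475)]
minimum_values = [branin(x1, x2) for x1, x2 in minimisers]
```

All three points give the same value, 10 / (8π) ≈ 0.397887, which is the target that an optimizer on this function is trying to reach.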

Set up methods to test

We compare Bayesian optimization using different models. Both methods collect points sequentially, one at a time, and use the expected improvement acquisition function. The models we test are:

  • A Gaussian process with Matern52 covariance function
  • Random forest using the pyrfr package
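
As a reminder of what the expected improvement acquisition computes, here is a minimal sketch of its textbook closed form for minimisation, assuming the model's prediction at a point is Gaussian with a given mean and standard deviation. This is not Emukit's implementation, and the function name `expected_improvement` and its arguments are hypothetical.

```python
import math


def expected_improvement(mean, std, best_observed):
    """Closed-form expected improvement for minimisation under a Gaussian prediction."""
    if std <= 0.0:
        # No predictive uncertainty: the improvement is deterministic
        return max(best_observed - mean, 0.0)
    u = (best_observed - mean) / std
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return std * (u * cdf + pdf)


# A point predicted to equal the best value so far still has some expected improvement,
# while a point predicted to be much worse has almost none
ei_at_best = expected_improvement(mean=1.0, std=1.0, best_observed=1.0)
ei_far_worse = expected_improvement(mean=10.0, std=1.0, best_observed=1.0)
```

The acquisition rewards both a low predicted mean and a high predictive uncertainty, which is why the choice of model (Gaussian process or random forest) changes which points get selected.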

We wrap each method in a lambda function with a consistent interface: given an initial data set (x, y), it returns an instance of a loop.

In [3]:
from emukit.examples.enums import ModelType, AcquisitionType
from emukit.examples.optimization_loops import create_bayesian_optimization_loop
from emukit.examples.single_objective_bayesian_optimization import GPBayesianOptimization

loops = [
    ('Random Forest', lambda x, y: create_bayesian_optimization_loop(x, y, parameter_space, AcquisitionType.EI,
                                                                     ModelType.RandomForest)),
    ('Gaussian Process', lambda x, y: GPBayesianOptimization(parameter_space.parameters, x, y,
                                                             acquisition_type=AcquisitionType.EI, noiseless=True))
]

Run benchmark

A total of 30 initial data sets are generated, each containing 5 observations sampled at random from the input domain. For every initial data set, each method is run for 50 optimization iterations. The Gaussian process model has its hyper-parameters optimized after each function observation, whereas the random forest uses fixed hyper-parameters.

In [4]:
from emukit.benchmarking.benchmarker import Benchmarker
from emukit.benchmarking.metrics import MinimumObservedValueMetric, TimeMetric
n_repeats = 30
n_initial_data = 5
n_iterations = 50

metrics = [MinimumObservedValueMetric(), TimeMetric()]

benchmarkers = Benchmarker(loops, branin_fcn, parameter_space, metrics=metrics)
benchmark_results = benchmarkers.run_benchmark(n_iterations=n_iterations, n_initial_data=n_initial_data,
                                               n_repeats=n_repeats)
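
The protocol described above (several random initial data sets, every method started from the same data, a metric recorded after each iteration) can be sketched without Emukit. The code below is only an illustration under simplifying assumptions: a toy one-dimensional objective replaces Branin, and random search and a simple local search stand in for the two Bayesian optimization loops. All names (`objective`, `random_search`, `local_search`, `run_benchmark`) and the small settings are hypothetical.

```python
import random


def objective(x):
    # Toy 1-D objective standing in for the test function
    return (x - 0.3) ** 2


def random_search(rng, xs, ys):
    # Propose a point uniformly in [0, 1], ignoring the data collected so far
    return rng.uniform(0.0, 1.0)


def local_search(rng, xs, ys):
    # Perturb the best point found so far and clip to [0, 1]
    best_x = xs[ys.index(min(ys))]
    return min(1.0, max(0.0, best_x + rng.gauss(0.0, 0.1)))


def run_benchmark(methods, n_repeats, n_initial_data, n_iterations, seed=0):
    rng = random.Random(seed)
    results = {name: [] for name, _ in methods}
    for repeat in range(n_repeats):
        # One initial data set per repeat, shared by every method
        x_init = [rng.uniform(0.0, 1.0) for _ in range(n_initial_data)]
        for name, propose in methods:
            xs = list(x_init)
            ys = [objective(x) for x in xs]
            trace = []
            for iteration in range(n_iterations):
                x_new = propose(rng, xs, ys)
                xs.append(x_new)
                ys.append(objective(x_new))
                # Metric: minimum value observed so far
                trace.append(min(ys))
            results[name].append(trace)
    return results


methods = [("Random search", random_search), ("Local search", local_search)]
results = run_benchmark(methods, n_repeats=3, n_initial_data=5, n_iterations=10)
```

Sharing the initial data set between methods within a repeat means that differences in the recorded traces come from the methods themselves rather than from a lucky starting sample.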

Plot results

Plot the results of the methods against each other. The plot shows the mean and standard deviation, across repeats, of the lowest value observed up to each iteration.

In [6]:
from emukit.benchmarking.benchmark_plot import BenchmarkPlot
colours = ['m', 'c']
line_styles = ['-', '--']

metrics_to_plot = ['minimum_observed_value']
plots = BenchmarkPlot(benchmark_results, loop_colours=colours, loop_line_styles=line_styles,
                      metrics_to_plot=metrics_to_plot)
plots.make_plot()
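
The quantity on the y-axis can be computed by hand in two steps: take a running minimum within each repeat, then aggregate across repeats at each iteration. The sketch below uses made-up observations for three repeats of a single method, and it assumes the population standard deviation; which convention Emukit's plotting code uses is not stated here.

```python
from itertools import accumulate
from statistics import mean, pstdev

# Hypothetical objective values observed in three repeats of one method
observations = [
    [5.0, 3.0, 4.0, 1.0],
    [2.0, 6.0, 1.5, 1.5],
    [4.0, 4.0, 0.5, 3.0],
]

# Lowest value observed up to each iteration, computed per repeat
running_min = [list(accumulate(run, min)) for run in observations]

# Mean and standard deviation across repeats at each iteration
per_iteration = list(zip(*running_min))
mean_curve = [mean(values) for values in per_iteration]
std_curve = [pstdev(values) for values in per_iteration]
```

Because each repeat's curve is a running minimum, the mean curve can never increase from one iteration to the next, even when individual observations do.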

Plot results against time

The TimeMetric object above recorded the time taken to complete each iteration of the loops. Here we plot the minimum observed value against the time taken.

In [7]:
# Plot against time
plots = BenchmarkPlot(benchmark_results, loop_colours=colours, loop_line_styles=line_styles, x_axis='time',
                      metrics_to_plot=metrics_to_plot)
plots.make_plot()
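
To make the idea of a time axis concrete, the sketch below records wall-clock time alongside the best value found so far. It does not reproduce how TimeMetric works internally: it assumes cumulative time since the start of the run, and the helper `timed_run` and the toy step function are hypothetical.

```python
import time


def timed_run(step, n_iterations):
    """Call `step` repeatedly, recording cumulative wall-clock time after each call."""
    elapsed = []
    values = []
    start = time.perf_counter()
    for i in range(n_iterations):
        values.append(step(i))
        elapsed.append(time.perf_counter() - start)
    return elapsed, values


# Hypothetical step that returns a decreasing objective value
elapsed, values = timed_run(lambda i: 1.0 / (i + 1), n_iterations=5)

# Pair each time stamp with the minimum value observed up to that point
best_so_far = [min(values[: i + 1]) for i in range(len(values))]
time_vs_best = list(zip(elapsed, best_so_far))
```

Plotting against time rather than iteration count matters when methods differ in cost per iteration, for example when one model re-optimizes its hyper-parameters after every observation and the other does not.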


We have shown how to use Emukit to benchmark different Bayesian optimization methods against each other. This methodology can easily be extended to more loops using different models and acquisition functions.