Tutorial for Hyperactive

This is a tutorial that introduces the basic functionality of Hyperactive and shows some interesting applications. It also gives an introduction to several optimization techniques. Hyperactive is a package that can optimize any Python function and collect its search data.

You can learn more about Hyperactive on GitHub.


In [1]:
# monkey-patch warnings.warn to silence library warnings in this notebook
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn

import time
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from sklearn.preprocessing import Normalizer, MinMaxScaler
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neural_network import MLPClassifier
from sklearn.gaussian_process.kernels import Matern, WhiteKernel, RBF, ConstantKernel

from sklearn.tree import DecisionTreeRegressor
from sklearn.datasets import load_boston, load_iris

from hyperactive import Hyperactive, BayesianOptimizer, HillClimbingOptimizer
from gradient_free_objective_functions.visualize import plotly_surface, plotly_heatmap

color_scale = px.colors.sequential.Jet
In [2]:
def _create_grid(objective_function, search_space):
    # wrap the dict-based objective function so it can be called on numpy arrays
    def objective_function_np(*args):
        para = {}
        for arg, key in zip(args, search_space.keys()):
            para[key] = arg

        return objective_function(para)

    # evaluate the objective function on the 2D grid spanned by the search space
    (x_all, y_all) = search_space.values()
    xi, yi = np.meshgrid(x_all, y_all)
    zi = objective_function_np(xi, yi)

    return xi, yi, zi

from plotly.subplots import make_subplots

def compare_objective_functions(objective_function1, objective_function2):
    search_space_plot = {
        "x": list(np.arange(-5, 5, 0.2)),
        "y": list(np.arange(-5, 5, 0.2)),
    }

    xi_c, yi_c, zi_c = _create_grid(objective_function1, search_space_plot)
    xi_a, yi_a, zi_a = _create_grid(objective_function2, search_space_plot)

    fig1 = go.Surface(x=xi_c, y=yi_c, z=zi_c, colorscale=color_scale)
    fig2 = go.Surface(x=xi_a, y=yi_a, z=zi_a, colorscale=color_scale)

    fig = make_subplots(rows=1, cols=2,
                        specs=[[{'is_3d': True}, {'is_3d': True}]],
                        subplot_titles=['Convex Function', 'Non-convex Function'],
                        )

    fig.add_trace(fig1, 1, 1)
    fig.add_trace(fig2, 1, 2)
    fig.update_layout(title_text="Objective Function Surface")
    fig.show()
In [3]:
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD
from keras.utils import np_utils
from tensorflow import keras

import tensorflow as tf

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
config.log_device_placement = True

sess = tf.compat.v1.Session(config=config)
tf.compat.v1.keras.backend.set_session(sess)
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device

In [4]:
# load dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

img_width = 28
img_height = 28

x_train = x_train.astype("float32")
x_train /= 255.0
x_test = x_test.astype("float32")
x_test /= 255.0

# reshape input data
x_train = x_train.reshape(x_train.shape[0], img_width, img_height, 1)
x_test = x_test.reshape(x_test.shape[0], img_width, img_height, 1)

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]


data = load_boston()
X_boston, y_boston = data.data, data.target

data = load_iris()
X_iris, y_iris = data.data, data.target

Introduction

There are two things you need to define before starting your first optimization run:

- the objective function: 
    Contains some kind of model. It always returns a score that will be maximized during the optimization run
- a search space: 
    Defines the parameter space in which the optimizer searches for the best parameter set

In this notebook you will see several different examples of objective functions.

In [5]:
def objective_function(para):
    loss = para["x"]*para["x"]
    # -x*x is an inverted parabola 
    return -loss

# We have only one dimension here
search_space = {
    "x": list(np.arange(-5, 5, 0.01)),
}

In the next step we will start the optimization run.

You only need the objective_function, the search_space and the number of iterations. Each iteration evaluates the objective function, which generates a score that the optimization algorithm uses to determine which position in the search space to evaluate next. All of these calculations are done by Hyperactive in the background. You will receive the results of the optimization run when all iterations are done.

In [6]:
hyper_0 = Hyperactive(verbosity=False)
hyper_0.add_search(objective_function, search_space, n_iter=70, initialize={"random": 2, "vertices": 2})
hyper_0.run()

search_data_0 = hyper_0.results(objective_function)
search_data_0[["x", "score"]]
Out[6]:
x score
0 -2.56 -6.5536
1 1.93 -3.7249
2 -5.00 -25.0000
3 4.99 -24.9001
4 4.98 -24.8004
... ... ...
65 1.21 -1.4641
66 2.51 -6.3001
67 2.35 -5.5225
68 -2.36 -5.5696
69 -0.65 -0.4225

70 rows × 2 columns

In the table above you can see the 70 iterations performed during the run. This is called the search data. Each row contains the parameter x and the corresponding score. As we previously discussed, the optimization algorithm determines which position to select next based on the score of the evaluated objective function.
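If you are only interested in the best parameter set and score instead of the full search data, you can query them directly from the Hyperactive object. The following is a minimal sketch, assuming the best_para and best_score accessors available in recent Hyperactive versions:

# hedged sketch: extract the best result of the finished run
# (assumes Hyperactive's best_para/best_score accessors)
best_para_0 = hyper_0.best_para(objective_function)    # e.g. {"x": ...}
best_score_0 = hyper_0.best_score(objective_function)  # highest score found

print("best parameter set:", best_para_0)
print("best score:", best_score_0)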

When Hyperactive starts the optimization, the first iterations are initializations from the initialize-dictionary. In the example above there are 4 initializations (2 random and 2 vertices). They determine the initial positions in the search space that are used to evaluate the objective function. As you can see in the search data, the 2nd and 3rd iterations are the vertices (edge points) of the search space, while the 0th and 1st rows are randomly selected positions. After those few initialization steps the optimization algorithm selects the next positions in the search space based on the score of the previous position(s).
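The initialize-dictionary can be adjusted when adding a search. The following is a minimal sketch of other initialization strategies, assuming the "grid" and "warm_start" keys in addition to the "random" and "vertices" keys used above (the exact keys accepted may differ between Hyperactive versions):

# hedged sketch: a different initialize-dictionary
# ("grid" and "warm_start" keys are assumed to be supported)
initialize = {
    "grid": 4,       # evenly spread starting positions
    "random": 2,     # randomly selected starting positions
    "vertices": 2,   # corner points of the search space
    # start from a position taken directly from the search space
    "warm_start": [{"x": search_space["x"][len(search_space["x"]) // 2]}],
}

hyper_init = Hyperactive(verbosity=False)
hyper_init.add_search(objective_function, search_space, n_iter=20, initialize=initialize)
hyper_init.run()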

The default optimization algorithm is random search. You can see the random pattern in the last few iterations of the search data, and it also becomes visible if we plot the search data:

Random Search Optimizer

The random search optimizer is a very simple algorithm. It randomly selects a position in each iteration without adapting to the optimization problem (no exploitation). On the other hand, it is very useful to initially explore the search space or to find new regions with optima (exploration).
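Since random search is the default, it was used implicitly in the run above. It can also be passed explicitly via the optimizer argument of add_search. The following sketch assumes a RandomSearchOptimizer class that can be imported like the BayesianOptimizer and HillClimbingOptimizer imported at the top of this notebook; treat the import path as an assumption:

# hedged sketch: selecting the optimizer explicitly instead of using the default.
# The RandomSearchOptimizer import is an assumption based on the other
# optimizer imports at the top of this notebook.
from hyperactive import RandomSearchOptimizer

hyper_rnd = Hyperactive(verbosity=False)
hyper_rnd.add_search(
    objective_function,
    search_space,
    n_iter=70,
    optimizer=RandomSearchOptimizer(),
)
hyper_rnd.run()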

The following two gifs show how random search explores the search space for two different objective functions:

In [7]:
fig = px.scatter(search_data_0, x="x", y="score")
fig.show()