Organize your machine learning experiments with ScalarStop

What is ScalarStop?

ScalarStop helps you train machine learning models by:

  • creating a system to uniquely name datasets, model architectures, trained models, and their hyperparameters.
  • saving and loading datasets and models to/from the filesystem in a consistent way.
  • recording dataset and model names, hyperparameters, and training metrics to a SQLite or PostgreSQL database.

Installing ScalarStop

ScalarStop is available on PyPI. You can install it from the command line using::

pip3 install scalarstop

Getting started

First, we will organize your training, validation, and test sets with subclasses of the DataBlob class.

Second, we will describe the architecture of your machine learning models with subclasses of the ModelTemplate class.

Third, we'll create a Model subclass instance that initializes a model with a ModelTemplate and trains it on a DataBlob's training and validation sets.

Finally, we will save the hyperparameters and training metrics from many DataBlobs, ModelTemplates, and Models into a SQLite or PostgreSQL database using the TrainStore client.

But first, let's import the modules we'll need for this demo.

In [1]:
import os

import scalarstop as sp

import tensorflow as tf

DataBlob: Keeping your training dataset organized

The first step to training machine learning models with ScalarStop is to wrap your dataset in a DataBlob.

A DataBlob is a set of three tf.data.Dataset pipelines representing your training, validation, and test sets.

When you create a DataBlob, variables that affect the creation of the tf.data.Dataset pipelines are stored in a nested Python dataclass named Hyperparams. Only store simple JSON-serializable types in the Hyperparams dataclass.

Creating a new DataBlob with Hyperparams looks roughly like this:

from typing import List, Dict
import scalarstop as sp

class my_datablob_group_name(sp.DataBlob):

    @sp.dataclass
    class Hyperparams(sp.HyperparamsType):
        a: int
        b: str
        c: Dict[str, float]
        d: List[int]

    # ... more setup below ...

Then, we define three methods on our DataBlob subclass:

  • set_training()
  • set_validation()
  • set_test()

Each of them must return a new tf.data.Dataset pipeline with data samples and labels zipped together. Typically that looks like:

# Create a tf.data.Dataset for your training samples.
samples = tf.data.Dataset.from_tensor_slices([1, 2, 3])

# And another tf.data.Dataset for your training labels.
labels = tf.data.Dataset.from_tensor_slices([0, 1, 0])

# And zip them together.
tf.data.Dataset.zip((samples, labels))

Do not apply any batching at this stage. We will do that later.
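
For reference, batching is applied later on the DataBlob instance itself rather than inside the set_*() methods. A minimal sketch, assuming my_datablob is an already-created DataBlob instance:

# Hypothetical instance of a DataBlob subclass; .batch() returns a new
# DataBlob whose training, validation, and test pipelines are batched.
batched_datablob = my_datablob.batch(32)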

Now we'll create a DataBlob that contains the Fashion MNIST dataset.

In [3]:
class fashion_mnist_v1(sp.DataBlob):

    @sp.dataclass
    class Hyperparams(sp.HyperparamsType):
        num_training_samples: int
    
    def __init__(self, hyperparams):
        """
        You only need to override __init__ if you want to validate
        your hyperparameters or add arguments that are not hyperparameters.

        One example of a non-hyperparameter argument would be a
        database connection URL.
        """
        if hyperparams["num_training_samples"] > 50_000:
            raise ValueError("num_training_samples should be <= 50_000")
        super().__init__(hyperparams=hyperparams)
        (self._train_images, self._train_labels), \
            (self._test_images, self._test_labels) = \
            tf.keras.datasets.fashion_mnist.load_data()

    def set_training(self) -> tf.data.Dataset:
        """The training set."""
        samples = tf.data.Dataset.from_tensor_slices(
            self._train_images[:self.hyperparams.num_training_samples]
        )
        labels = tf.data.Dataset.from_tensor_slices(
            self._train_labels[:self.hyperparams.num_training_samples]
        )
        return tf.data.Dataset.zip((samples, labels))

    def set_validation(self) -> tf.data.Dataset:
        """
        The validation set.

        In this example, the validation set does not change with the
        hyperparameters. This allows us to compare results with
        different training sets to the same validation set.

        However, if your hyperparameters specify how to engineer
        features, then you might want the validation set and
        training set to rely on the same hyperparameters.
        """
        samples = tf.data.Dataset.from_tensor_slices(
            self._train_images[50_000:]
        )
        labels = tf.data.Dataset.from_tensor_slices(
            self._train_labels[50_000:]
        )
        return tf.data.Dataset.zip((samples, labels))

    def set_test(self) -> tf.data.Dataset:
        """The test set. Used to evaluate models but not train them."""
        samples = tf.data.Dataset.from_tensor_slices(
            self._test_images
        )
        labels = tf.data.Dataset.from_tensor_slices(
            self._test_labels
        )
        return tf.data.Dataset.zip((samples, labels))

Here we create a DataBlob instance with a dictionary to set our Hyperparams.

The DataBlob name is computed by hashing your DataBlob subclass name together with the names and values of your Hyperparams.

In [4]:
datablob1 = fashion_mnist_v1(hyperparams=dict(num_training_samples=10))
datablob1.name
Out[4]:
'fashion_mnist_v1-p166sf7xz19hg8n3mj8f93m8'

The DataBlob group name is by default the DataBlob subclass name.

In [5]:
datablob1.group_name
Out[5]:
'fashion_mnist_v1'
In [6]:
print(datablob1.hyperparams)
fashion_mnist_v1.Hyperparams(num_training_samples=10)

Now we create another DataBlob instance with a different value for Hyperparams.

Note that it has a different automatically-generated name, but it'll have the same group_name.

In [7]:
datablob2 = fashion_mnist_v1(hyperparams=dict(num_training_samples=50))
datablob2.name, datablob2.group_name
Out[7]:
('fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza', 'fashion_mnist_v1')
Each DataBlob exposes its three pipelines through the training, validation, and test properties.

In [8]:
datablob1.training.take(1)
Out[8]:
<TakeDataset shapes: ((28, 28), ()), types: (tf.uint8, tf.uint8)>

We can save a DataBlob to the filesystem and load it back later.

In [9]:
os.makedirs("datablobs_directory", exist_ok=True)
datablob1.save("datablobs_directory")
Out[9]:
<sp.DataBlob fashion_mnist_v1-p166sf7xz19hg8n3mj8f93m8>

Here, we use the classmethod from_filesystem() to calculate the exact path of our saved DataBlob using a copy of the DataBlob's hyperparameters.

In [10]:
loaded_datablob1 = fashion_mnist_v1.from_filesystem(
    hyperparams=dict(num_training_samples=10),
    datablobs_directory="datablobs_directory",
)
loaded_datablob1
Out[10]:
<sp.DataBlob fashion_mnist_v1-p166sf7xz19hg8n3mj8f93m8>

Alternatively, if we know the exact directory path of our saved DataBlob, we can load it with from_exact_path().

In [11]:
loaded_datablob2 = fashion_mnist_v1.from_exact_path(
    os.path.join("datablobs_directory", datablob1.name)
)
loaded_datablob2
Out[11]:
<sp.DataBlob fashion_mnist_v1-p166sf7xz19hg8n3mj8f93m8>

ModelTemplate: Parameterizing your model creation

The ModelTemplate is the same concept as the DataBlob, but instead of creating three tf.data.Dataset pipelines, the ModelTemplate creates a machine learning framework model object.

Here is an example of creating a Keras model. Building and compiling the model is parameterized by values in the Hyperparams dataclass.

In [12]:
class small_dense_10_way_classifier_v1(sp.ModelTemplate):

    @sp.dataclass
    class Hyperparams(sp.HyperparamsType):
        hidden_units: int
        optimizer: str = "adam"

    def new_model(self):
        model = tf.keras.Sequential(
            layers=[
                tf.keras.layers.Flatten(input_shape=(28, 28)),
                tf.keras.layers.Dense(
                    units=self.hyperparams.hidden_units,
                    activation="relu",
                ),
                tf.keras.layers.Dense(units=10)
            ],
            name=self.name,
        )
        model.compile(
            optimizer=self.hyperparams.optimizer,
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            metrics=["accuracy"],
        )
        return model

Once again, the ModelTemplate has a unique name, generated by hashing your subclass name and the Hyperparams.

In [13]:
model_template = small_dense_10_way_classifier_v1(hyperparams=dict(hidden_units=3))
model_template.name
Out[13]:
'small_dense_10_way_classifier_v1-uptyfbjofo7rqv8antxrwhjs'
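
If you want to inspect the compiled Keras model that a ModelTemplate produces, you can call its new_model() method yourself. A quick sketch; the summary() call here is ordinary Keras, not part of ScalarStop:

# Build and compile a fresh Keras model from the template's Hyperparams.
keras_model = model_template.new_model()
# Print the layer-by-layer architecture (standard Keras).
keras_model.summary()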

Model: Combine your ModelTemplate with your DataBlob

DataBlobs and ModelTemplates are not very useful until you bring them together with a Model.

A Model is an object created by pairing together a ModelTemplate instance and a DataBlob instance, for the purpose of training the machine learning model created by the ModelTemplate on the DataBlob's training and validation sets.

Make sure to batch your DataBlob before using it.

In [14]:
datablob = datablob2.batch(2)

model = sp.KerasModel(
    datablob=datablob,
    model_template=model_template,
)

Once again, the Model has a unique name. But this time it is just a concatenation of the DataBlob and ModelTemplate names.

In [15]:
model.name
Out[15]:
'mt_small_dense_10_way_classifier_v1-uptyfbjofo7rqv8antxrwhjs__d_fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza'
In [16]:
model.fit(final_epoch=2, verbose=1)
Epoch 1/2
25/25 [==============================] - 3s 115ms/step - loss: 39.2720 - accuracy: 0.3199 - val_loss: 2.6354 - val_accuracy: 0.1039
Epoch 2/2
25/25 [==============================] - 2s 99ms/step - loss: 2.3040 - accuracy: 0.1014 - val_loss: 2.5192 - val_accuracy: 0.1040
Out[16]:
{'loss': [23.980653762817383, 2.3024940490722656],
 'accuracy': [0.18000000715255737, 0.11999999731779099],
 'val_loss': [2.635443925857544, 2.5192012786865234],
 'val_accuracy': [0.1039000004529953, 0.10400000214576721]}

In ScalarStop, training a machine learning model is an idempotent operation. Instead of saying, "Train for $n$ more epochs," we say, "Train until the model has been trained for $n$ epochs total."

If we call model.fit() again with final_epoch still set to 2, we get the same metrics back, but no additional training happens.

In [17]:
model.fit(final_epoch=2, verbose=1)
Out[17]:
{'loss': [23.980653762817383, 2.3024940490722656],
 'accuracy': [0.18000000715255737, 0.11999999731779099],
 'val_loss': [2.635443925857544, 2.5192012786865234],
 'val_accuracy': [0.1039000004529953, 0.10400000214576721]}

Training a ScalarStop Model is idempotent because Models keep track of how many epochs they have been trained for and the training metrics they have generated (e.g. loss, accuracy, etc.). This information is saved to the filesystem when you call model.save() and is loaded back from disk when you create a new Model object with Model.from_filesystem() or Model.from_filesystem_or_new().

In [18]:
os.makedirs("models_directory", exist_ok=True)

model.save("models_directory")

os.listdir("models_directory")
INFO:tensorflow:Assets written to: models_directory/mt_small_dense_10_way_classifier_v1-uptyfbjofo7rqv8antxrwhjs__d_fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza/assets
Out[18]:
['mt_small_dense_10_way_classifier_v1-uptyfbjofo7rqv8antxrwhjs__d_fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza']

Here we load the model back, calculating its exact path from the hyperparameters of both the DataBlob and the ModelTemplate.

In [19]:
model2 = sp.KerasModel.from_filesystem(
    datablob=datablob,
    model_template=model_template,
    models_directory="models_directory",
)
print(model2.name)
model2.history
mt_small_dense_10_way_classifier_v1-uptyfbjofo7rqv8antxrwhjs__d_fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza
Out[19]:
{'accuracy': [0.18000000715255737, 0.11999999731779099],
 'loss': [23.980653762817383, 2.3024940490722656],
 'val_accuracy': [0.1039000004529953, 0.10400000214576721],
 'val_loss': [2.635443925857544, 2.5192012786865234]}
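
Above, we mentioned Model.from_filesystem_or_new(). As a minimal sketch (assuming it accepts the same keyword arguments as from_filesystem()), it loads the saved Model when one exists and otherwise returns a fresh, untrained Model:

# Load the saved Model if present in models_directory;
# otherwise construct a new, untrained Model.
model3 = sp.KerasModel.from_filesystem_or_new(
    datablob=datablob,
    model_template=model_template,
    models_directory="models_directory",
)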

If you provide models_directory as an argument to fit(), ScalarStop will save the model to the filesystem after every epoch.

In [20]:
_ = model2.fit(final_epoch=5, verbose=1, models_directory="models_directory")
Epoch 3/5
25/25 [==============================] - 3s 103ms/step - loss: 2.3016 - accuracy: 0.1200 - val_loss: 2.5192 - val_accuracy: 0.1040
INFO:tensorflow:Assets written to: models_directory/mt_small_dense_10_way_classifier_v1-uptyfbjofo7rqv8antxrwhjs__d_fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza/assets
Epoch 4/5
25/25 [==============================] - 2s 88ms/step - loss: 2.2994 - accuracy: 0.1000 - val_loss: 2.5192 - val_accuracy: 0.1060
INFO:tensorflow:Assets written to: models_directory/mt_small_dense_10_way_classifier_v1-uptyfbjofo7rqv8antxrwhjs__d_fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza/assets
Epoch 5/5
25/25 [==============================] - 2s 90ms/step - loss: 2.2980 - accuracy: 0.1600 - val_loss: 2.5192 - val_accuracy: 0.1060
INFO:tensorflow:Assets written to: models_directory/mt_small_dense_10_way_classifier_v1-uptyfbjofo7rqv8antxrwhjs__d_fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza/assets

Once again, ScalarStop saves the model's training history alongside the model's weights, but this is not very convenient if you want to do large-scale analysis on the training metrics of many models at once.

A better way of storing the training metrics is to use the TrainStore.


TrainStore: Save and query your machine learning metrics in a database

The TrainStore is a client that saves hyperparameters and training metrics to a SQLite or PostgreSQL database. Let's create a new TrainStore instance that will save data to a file named train_store.sqlite3.

In [21]:
train_store = sp.TrainStore.from_filesystem(filename="train_store.sqlite3")
train_store
Out[21]:
<sp.TrainStore sqlite:///train_store.sqlite3>

The TrainStore is also available as a Python context manager.

with sp.TrainStore.from_filesystem(filename="train_store.sqlite3") as train_store:
    ...  # use the TrainStore here

# When the with block exits, the TrainStore's database connection is closed for you.

We don't use it that way in this example because we want to use the TrainStore across multiple Jupyter notebook cells.

And if we want to connect to a PostgreSQL database, the syntax looks like:

connection_string = "postgresql://username:[email protected]:port/database"
with sp.TrainStore(connection_string=connection_string) as train_store:
    # ...

The TrainStore automatically saves your DataBlob and ModelTemplate names, group names, and hyperparameters to the database. When you train a Model, the TrainStore also persists the model name and the per-epoch training metrics.

All of this happens automatically if you pass the TrainStore instance to Model.fit().

In [22]:
_ = model.fit(final_epoch=5, train_store=train_store)
Epoch 3/5
25/25 [==============================] - 2s 87ms/step - loss: 2.3012 - accuracy: 0.0800 - val_loss: 2.5129 - val_accuracy: 0.1039
Epoch 4/5
25/25 [==============================] - 2s 88ms/step - loss: 2.2999 - accuracy: 0.0800 - val_loss: 2.5124 - val_accuracy: 0.0959
Epoch 5/5
25/25 [==============================] - 2s 90ms/step - loss: 2.2985 - accuracy: 0.0600 - val_loss: 2.5124 - val_accuracy: 0.0959

Once you have some information in the TrainStore, you can query it for information and receive results as a Pandas DataFrame.

First, let's list the DataBlobs that we have saved:

In [23]:
train_store.list_datablobs()
Out[23]:
name group_name hyperparams last_modified
0 fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza fashion_mnist_v1 {'num_training_samples': 50} 2021-04-11 20:40:11.891137

...and the ModelTemplates that we have saved:

In [24]:
train_store.list_model_templates()
Out[24]:
name group_name hyperparams last_modified
0 small_dense_10_way_classifier_v1-uptyfbjofo7rq... small_dense_10_way_classifier_v1 {'hidden_units': 3, 'optimizer': 'adam'} 2021-04-11 20:40:11.892808

...and the models that we have trained:

In [25]:
train_store.list_models()
Out[25]:
model_name model_class_name model_last_modified datablob_name datablob_group_name model_template_name model_template_group_name dbh__num_training_samples mth__hidden_units mth__optimizer
0 mt_small_dense_10_way_classifier_v1-uptyfbjofo... KerasModel 2021-04-11 20:40:11.894643 fashion_mnist_v1-3wzktz1cmz86vs1r7rbmdoza fashion_mnist_v1 small_dense_10_way_classifier_v1-uptyfbjofo7rq... small_dense_10_way_classifier_v1 50 3 adam

...and this is how we query for the training history for a given model:

In [26]:
train_store.list_model_epochs(model_name=model.name)
Out[26]:
epoch_num model_name last_modified metric__loss metric__accuracy metric__val_loss metric__val_accuracy
0 3 mt_small_dense_10_way_classifier_v1-uptyfbjofo... 2021-04-11 20:40:13.981798 2.301222 0.08 2.512903 0.1039
1 4 mt_small_dense_10_way_classifier_v1-uptyfbjofo... 2021-04-11 20:40:16.110798 2.299875 0.08 2.512378 0.0959
2 5 mt_small_dense_10_way_classifier_v1-uptyfbjofo... 2021-04-11 20:40:18.271750 2.298548 0.06 2.512365 0.0959
In [27]:
model_template_2 = small_dense_10_way_classifier_v1(hyperparams=dict(hidden_units=5))
In [28]:
model_2 = sp.KerasModel(datablob=datablob, model_template=model_template_2)
In [29]:
_ = model_2.fit(final_epoch=10, train_store=train_store)
Epoch 1/10
25/25 [==============================] - 2s 91ms/step - loss: 28.2529 - accuracy: 0.1749 - val_loss: 2.5577 - val_accuracy: 0.1217
Epoch 2/10
25/25 [==============================] - 2s 92ms/step - loss: 2.8903 - accuracy: 0.2733 - val_loss: 2.4093 - val_accuracy: 0.1032
Epoch 3/10
25/25 [==============================] - 2s 89ms/step - loss: 2.2542 - accuracy: 0.2733 - val_loss: 2.3985 - val_accuracy: 0.1081
Epoch 4/10
25/25 [==============================] - 2s 95ms/step - loss: 2.2393 - accuracy: 0.2733 - val_loss: 2.3969 - val_accuracy: 0.1096
Epoch 5/10
25/25 [==============================] - 2s 89ms/step - loss: 2.2373 - accuracy: 0.2733 - val_loss: 2.3966 - val_accuracy: 0.1097
Epoch 6/10
25/25 [==============================] - 2s 89ms/step - loss: 2.2353 - accuracy: 0.2733 - val_loss: 2.3966 - val_accuracy: 0.1097
Epoch 7/10
25/25 [==============================] - 2s 88ms/step - loss: 2.2333 - accuracy: 0.2733 - val_loss: 2.3966 - val_accuracy: 0.1097
Epoch 8/10
25/25 [==============================] - 2s 93ms/step - loss: 2.2313 - accuracy: 0.2733 - val_loss: 2.3966 - val_accuracy: 0.1097
Epoch 9/10
25/25 [==============================] - 2s 91ms/step - loss: 2.2293 - accuracy: 0.2733 - val_loss: 2.3967 - val_accuracy: 0.1097
Epoch 10/10
25/25 [==============================] - 2s 88ms/step - loss: 2.2273 - accuracy: 0.2733 - val_loss: 2.3967 - val_accuracy: 0.1097
In [30]:
train_store.list_model_epochs(model_name=model_2.name)
Out[30]:
epoch_num model_name last_modified metric__loss metric__accuracy metric__val_loss metric__val_accuracy
0 1 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:20.782506 13.858220 0.14 2.557676 0.1217
1 2 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:22.989173 2.634313 0.18 2.409259 0.1032
2 3 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:25.121682 2.264078 0.18 2.398494 0.1081
3 4 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:27.412649 2.252162 0.18 2.396904 0.1096
4 5 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:29.547414 2.250889 0.18 2.396623 0.1097
5 6 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:31.679167 2.249614 0.18 2.396614 0.1097
6 7 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:33.794024 2.248346 0.18 2.396612 0.1097
7 8 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:36.031745 2.247089 0.18 2.396631 0.1097
8 9 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:38.216244 2.245847 0.18 2.396651 0.1097
9 10 mt_small_dense_10_way_classifier_v1-axos7t2rck... 2021-04-11 20:40:40.344820 2.244625 0.18 2.396712 0.1097
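
Because these queries return ordinary Pandas DataFrames, you can analyze training metrics with regular Pandas operations. A small sketch, using the list_model_epochs() columns shown above, that picks the epoch with the best validation accuracy:

# Find the epoch with the highest validation accuracy for the second model.
epochs = train_store.list_model_epochs(model_name=model_2.name)
best = epochs.sort_values("metric__val_accuracy", ascending=False).head(1)
print(best[["epoch_num", "metric__val_accuracy"]])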