Augment Segmentation Maps

Segmentation maps are 2D arrays in which every spatial position is assigned to exactly one class. They are represented in imgaug using imgaug.augmentables.segmaps.SegmentationMapOnImage. The class is instantiated as SegmentationMapOnImage(arr, shape, [nb_classes]). arr contains the 2D segmentation map, shape is the shape of the corresponding image and nb_classes is the number of unique classes that can appear in the map. nb_classes is expected to be removed in the future, but currently still has to be provided if arr has an int/uint dtype (the most common scenario).

Currently, SegmentationMapOnImage represents segmentation maps internally as (H,W,C) arrays, where C is the number of channels. Each component is a float between 0.0 and 1.0. The class vector associated with each spatial location can be viewed as a one-hot-vector, though its components can sum to more than 1. When converting this representation to an (H,W) integer array, only the id of the class with largest value is kept per location (i.e. argmax). However, if that value is below a small threshold (default: 0.01), a background class will instead be used. (Note: This internal representation will change in v0.3.0 to a simple integer array.)
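The argmax-plus-threshold conversion described above can be sketched in plain NumPy. This is a conceptual illustration, not imgaug's actual implementation; the variable names are made up:

```python
import numpy as np

background_class_id = 0
background_threshold = 0.01  # the default mentioned above

# toy internal representation: 2x2 pixels, 3 class channels,
# components are floats in [0.0, 1.0] and may sum to more than 1
arr = np.float32([
    [[0.9, 0.1, 0.0], [0.0, 0.8, 0.3]],
    [[0.0, 0.0, 0.0], [0.2, 0.2, 0.9]],
])

# argmax over the channel axis keeps the id of the strongest class per pixel
class_ids = arr.argmax(axis=2)

# pixels whose strongest class is below the threshold become background
class_ids[arr.max(axis=2) < background_threshold] = background_class_id

print(class_ids)
```

Here the all-zero class vector at position (1, 0) falls below the threshold and is mapped to the background class.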

Noteworthy attributes of SegmentationMapOnImage are .shape (shape of the corresponding image) and .arr (internal segmentation map representation).

Noteworthy methods of SegmentationMapOnImage are:

  • get_arr_int([background_threshold], [background_class_id]): Converts the internal representation of the segmentation map to an integer dtype and returns it.
  • draw([size], [background_threshold], [background_class_id], [colors], [return_foreground_mask]): Converts the segmentation map to an RGB image.
  • draw_on_image(image, [alpha], [resize], [size], [background_threshold], [background_class_id], [colors], [draw_background]): Converts the segmentation map to an RGB image and blends it with a provided image.
  • pad([top], [right], [bottom], [left], [mode], [cval]): Pad the segmentation map on its sides. Note that this currently pads the internal segmentation map, i.e. cval should not be a class id.
  • pad_to_aspect_ratio(aspect_ratio, [mode], [cval], [return_pad_amounts]): Pad the segmentation map to an aspect ratio (width/height). Note that this currently pads the internal segmentation map, i.e. cval should not be a class id.
  • resize(sizes, [interpolation]): Resize the segmentation map to a provided size. As the internal representation is a float, the interpolation can differ from nearest neighbour.
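The note that cval should not be a class id can be illustrated with a NumPy sketch (again conceptual, not imgaug's code): padding the internal float array with 0.0 produces all-zero class vectors at the border, which the background-threshold rule described earlier then maps to the background class.

```python
import numpy as np

# internal float map: 2x2 pixels, 3 class channels, every pixel is class 2
arr = np.zeros((2, 2, 3), dtype=np.float32)
arr[..., 2] = 1.0

# pad the internal representation with cval=0.0 (a float, not a class id)
padded = np.pad(arr, ((1, 1), (1, 1), (0, 0)),
                mode="constant", constant_values=0.0)

# convert back to class ids: the all-zero border vectors fall below the
# threshold and therefore become background (id 0)
class_ids = padded.argmax(axis=2)
class_ids[padded.max(axis=2) < 0.01] = 0

print(class_ids)
```

The padded border ends up as background, while the original 2x2 area keeps class 2.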

To augment segmentation maps, use augment_segmentation_maps(), which is offered by all augmenters. It expects a SegmentationMapOnImage or list of SegmentationMapOnImage.

For more details, see the API: imgaug.SegmentationMapOnImage, imgaug.augmenters.meta.Augmenter.augment_segmentation_maps().

For drawing routines, SegmentationMapOnImage uses a predefined set of colors, currently stored in the constant SegmentationMapOnImage.DEFAULT_SEGMENT_COLORS. They will likely be replaced by a matplotlib colormap in the future, so modify them only with caution.

Important: imgaug's segmentation map augmentation is geared towards ground truth outputs. As such, only augmentation techniques that change the image geometry are applied to segmentation maps, even when other techniques are part of the pipeline. Examples are horizontal flips or affine transformations. To also apply non-geometric augmentation techniques, feed the segmentation map array through augmenter.augment_images() instead.
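The reason for the geometry-only rule is that the image and its map must stay spatially aligned. A minimal NumPy sketch of that invariant, using a horizontal flip as the geometric augmentation (toy data, not imgaug code):

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

# aligned segmentation map: class 1 on the right half, background on the left
segmap = np.zeros((4, 6), dtype=np.int32)
segmap[:, 3:] = 1

# a geometric augmentation (here: horizontal flip) must be applied to the
# image AND the map; a photometric one (e.g. brightness) only to the image
image_aug = np.fliplr(image)
segmap_aug = np.fliplr(segmap)

# class 1 is now on the left half, still aligned with the flipped image
print(segmap_aug[0])
```

If the flip were applied to only one of the two, the map would label the wrong pixels.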

Creating an Example Segmentation Map from Polygons Given as Points

The goal of our first example is to load an image, create a segmentation map and augment both of them. Let's first load and visualize our example image:

In [1]:
import imageio
import imgaug as ia
%matplotlib inline

image = imageio.imread("")  # path/URL of the example image (omitted here)
image = ia.imresize_single_image(image, 0.15)
print(image.shape)
ia.imshow(image)
(319, 479, 3)

Now we need a segmentation map for that image. We will create two classes, one for the tree (bottom) and one for the chipmunk (center). Everything else will be background. Both classes will be created as polygons and then drawn on a segmentation map array. First, we define the four corner points of the tree polygon:

In [2]:
import numpy as np
from imgaug.augmentables.kps import KeypointsOnImage

tree_kps_xy = np.float32([
    [0, 300],  # left top of the tree
    [image.shape[1]-1, 230],  # right top
    [image.shape[1]-1, image.shape[0]-1],  # right bottom
    [0, image.shape[0]-1]  # left bottom
])

# visualize
kpsoi_tree = KeypointsOnImage.from_xy_array(tree_kps_xy, shape=image.shape)
ia.imshow(kpsoi_tree.draw_on_image(image, size=13))

Now we have to create the chipmunk polygon. That one requires significantly more corner points, but the underlying method is the same:

In [3]:
chipmunk_kps_xy = np.float32([
    [200, 50],  # left ear, top (from camera perspective)
    [220, 70],
    [260, 70],
    [280, 50],  # right ear, top
    [290, 80],
    [285, 110],
    [310, 140],
    [330, 175], # right of cheek
    [310, 260], # right of right paw
    [175, 275], # left of left paw
    [170, 220],
    [150, 200],
    [150, 170], # left of cheek
    [160, 150],
    [186, 120], # left of eye
    [185, 70]
])

# visualize
kpsoi_chipmunk = KeypointsOnImage.from_xy_array(chipmunk_kps_xy, shape=image.shape)
ia.imshow(kpsoi_chipmunk.draw_on_image(image, size=7))

In the next step, we convert both sets of corner points to instances of imgaug.augmentables.polys.Polygon:

In [4]:
from imgaug.augmentables.polys import Polygon

# create polygons
poly_tree = Polygon(kpsoi_tree.keypoints)
poly_chipmunk = Polygon(kpsoi_chipmunk.keypoints)

# visualize polygons
ia.imshow(np.hstack([
    poly_tree.draw_on_image(image),
    poly_chipmunk.draw_on_image(image)
]))
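The next step in the tutorial is to draw these polygons onto a segmentation map array. The underlying idea, rasterizing a polygon into an integer class map, can be sketched without imgaug using an even-odd ray-casting test in NumPy. `rasterize_polygon` is a hypothetical helper written for illustration; imgaug uses its own drawing routines:

```python
import numpy as np

def rasterize_polygon(points_xy, height, width, class_id, out=None):
    """Fill a simple polygon into an integer map via even-odd ray casting."""
    if out is None:
        out = np.zeros((height, width), dtype=np.int32)
    # pixel centers at (x + 0.5, y + 0.5)
    xs, ys = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    inside = np.zeros((height, width), dtype=bool)
    n = len(points_xy)
    for i in range(n):
        x0, y0 = points_xy[i]
        x1, y1 = points_xy[(i + 1) % n]
        if y0 == y1:
            continue  # horizontal edge never crosses the horizontal ray
        # does a ray cast to the right of each pixel center cross this edge?
        crosses = ((y0 > ys) != (y1 > ys)) & (
            xs < (x1 - x0) * (ys - y0) / (y1 - y0) + x0
        )
        inside ^= crosses  # even-odd rule: toggle on each crossing
    out[inside] = class_id
    return out

# toy example: a 4x4 square of class 1 inside an 8x8 map
segmap = rasterize_polygon([(2, 2), (6, 2), (6, 6), (2, 6)], 8, 8, class_id=1)
print(segmap.sum())  # 16 pixels of class 1
```

Calling the helper once per polygon (tree first, then chipmunk) would write class ids into a shared map, with later polygons overwriting earlier ones where they overlap.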