Combined Color Semantics and Deep Learning for the Automatic Detection of Dolphin Dorsal Fins

Gianvito Losapio
Mediterranean Machine Learning Summer School 2021
Please refer to the publication:
Renò, V.; Losapio, G.; Forenza, F.; Politi, T.; Stella, E.; Fanizza, C.; Hartman, K.; Carlucci, R.; Dimauro, G.; Maglietta, R. Combined Color Semantics and Deep Learning for the Automatic Detection of Dolphin Dorsal Fins, Electronics 2020, 9, 758.
% Assert the working directory contains all the files
% extracted from the archive ccd_demo.zip
pwd
ans = 'C:\Users\gvlos\Desktop\poster\demo'
% Otherwise set it manually
cd C:\Users\gvlos\Desktop\poster\demo
addpath(genpath('matlab_files'));

Demo

We will go through a simple demo of the algorithm presented in the paper.
Let's import and visualize some images
path = '.\images';
imds = imageDatastore(path,'FileExtensions','.jpg');
im_cell = load_images(imds);
total_images = length(imds.Files);
montage(im_cell)
title('Sample images', 'FontSize',18)
Now the function fin_detection can be used to detect dorsal fins in the images and to inspect every single stage of the procedure.
out_cell = fin_detection(im_cell);
It takes as input a cell array called im_cell containing RGB images (size 1xtotal_images) and returns another cell array called out_cell of size 5xtotal_images.
Each row of out_cell contains the outputs of a single step of the algorithm.
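For instance, a single stage can be pulled out of out_cell and displayed on its own. A minimal sketch (the row indices used here are an assumption inferred from the montage below, where rows 1-4 appear to be preprocessing stages, not part of a documented interface):

```matlab
% Inspect one preprocessing stage for one image.
% NOTE: the meaning of each row index is an assumption in this sketch.
k = 1;       % image index (column of out_cell)
stage = 4;   % e.g. the final binary mask (assumed row)
figure
imshow(out_cell{stage, k})
title(sprintf('Stage %d of image %d', stage, k))
```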
Let's analyze the results.
First, have a look at the steps of the image preprocessing algorithm used for region proposals. (Tip: click on the button appearing at the top right corner of the figure to enlarge it)
montage([im_cell;out_cell(1:4,:)],'Size', [total_images 5], 'BackgroundColor','white')
title('Image preprocessing', 'FontSize',18)
Notice that all five color models mentioned in the paper are employed here. Notice also that the morphological operations help reduce noise between the first binary mask and the final one.
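As a rough sketch of the kind of morphological cleanup involved (the structuring-element shape and radius here are illustrative assumptions, not the values used in the paper):

```matlab
% Illustrative morphological cleanup of a noisy binary mask.
% The disk radius is an arbitrary choice for demonstration.
bw = rand(200, 300) > 0.7;              % noisy stand-in for a color-based mask
se = strel('disk', 3);
bwClean = imclose(imopen(bw, se), se);  % opening removes specks, closing fills gaps
montage({bw, bwClean}, 'BackgroundColor', 'white')
title('Before and after morphological cleanup')
```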
The final predictions can be visualized with the function show_results.
show_results(out_cell)
We have 3 possible outcomes:

3D Polyhedra

Now let's familiarize ourselves with the creation and manipulation of the 3D polyhedra used to binarize images.
First, let's apply the function sampleMask provided as an example.
im = imresize(im_cell{1,5},[800,1200]);
[~,maskedRGBImage] = sampleMask(im);
montage({im,maskedRGBImage}, 'BackgroundColor','white')
title('Sample Mask', "FontSize",18)
We would like to create a 3D polyhedron from the list of color triplets that remain in the masked image (only the gray tones of the fin).
For this purpose, MATLAB alphaShape objects turn out to be really handy: you just pass an array of 3D points (the color triplets in our case) to get a polyhedron that encloses them.
The custom function plot_polyhedron can be used to inspect the resulting point cloud and the corresponding colors in the CIE L*a*b* space.
triplets = get_triplets(maskedRGBImage);
shp = alphaShape(triplets);
plot_polyhedron(shp)
Now you can try it yourself! The following command launches the Color Thresholder app provided by MATLAB.
Choose L*a*b* as the color space, play with the filters (histograms or color cloud) to extract the dorsal fin, and then export the variable maskedRGBImage1 through the Export button (top right).
colorThresholder(im)
Have a look at the new polyhedron you get:
triplets = get_triplets(maskedRGBImage1);
shp = alphaShape(triplets);
plot_polyhedron(shp)

Convolutional neural network

Following is the code used to generate the CNN architecture in the image.
inputSize = [224 224 3];
numClasses = 2;
layers = [
    imageInputLayer(inputSize)
    convolution2dLayer(3,8,'Padding','same','Name','CONV1-8')
    reluLayer
    convolution2dLayer(3,8,'Padding','same','Name','CONV2-8')
    reluLayer
    maxPooling2dLayer(2,'Stride',2,'Name','MAXPOOL1')
    convolution2dLayer(3,16,'Padding','same','Name','CONV3-16')
    reluLayer
    convolution2dLayer(3,16,'Padding','same','Name','CONV4-16')
    reluLayer
    maxPooling2dLayer(2,'Stride',2,'Name','MAXPOOL2')
    convolution2dLayer(3,32,'Padding','same','Name','CONV5-32')
    reluLayer
    convolution2dLayer(3,32,'Padding','same','Name','CONV6-32')
    reluLayer
    maxPooling2dLayer(2,'Stride',2,'Name','MAXPOOL3')
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
Let's inspect each layer through the function analyzeNetwork provided by MATLAB:
analyzeNetwork(layers)
After creating the network, you can define your dataset with an augmentedImageDatastore object, set the training options with a trainingOptions object, and finally call the function trainNetwork to train your network.
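A minimal sketch of that training pipeline might look as follows (the dataset folder name, augmentation settings, and solver options are illustrative assumptions, not the values used in the paper):

```matlab
% Hypothetical labeled dataset: one subfolder per class under 'dataset/'
imdsTrain = imageDatastore('dataset', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% Resize images on the fly to match the network input size
augTrain = augmentedImageDatastore(inputSize(1:2), imdsTrain);

% Illustrative solver settings
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs', 20, ...
    'MiniBatchSize', 32, ...
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress');

net = trainNetwork(augTrain, layers, options);
```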

Final Comments

That's it! I hope you enjoyed the notebook!
For any offline comment, question, or curiosity, just drop me a line:
gianvito97losapio@gmail.com