Combined Color Semantics and Deep Learning for the Automatic Detection of Dolphin Dorsal Fins

Gianvito Losapio
Mediterranean Machine Learning Summer School 2021
Please refer to the publication:
Renò, V.; Losapio, G.; Forenza, F.; Politi, T.; Stella, E.; Fanizza, C.; Hartman, K.; Carlucci, R.; Dimauro, G.; Maglietta, R. Combined Color Semantics and Deep Learning for the Automatic Detection of Dolphin Dorsal Fins, Electronics 2020, 9, 758.
% Assert the working directory contains all the files
% extracted from the archive ccd_demo.zip
pwd
ans = 'C:\Users\gvlos\Desktop\poster\demo'
% Otherwise set it manually
cd C:\Users\gvlos\Desktop\poster\demo
addpath(genpath('matlab_files'));

Demo

We will go through a simple demo of the algorithm presented in the paper.
Let's import and visualize some images
path = '.\images';
imds = imageDatastore(path,'FileExtensions','.jpg');
im_cell = load_images(imds);
total_images = length(imds.Files);
montage(im_cell)
title('Sample images', 'FontSize',18)
Now the function fin_detection can be used to detect dorsal fins in the images and to inspect every single stage of the procedure.
out_cell = fin_detection(im_cell);
It takes as input a cell array called im_cell containing RGB images (size 1xtotal_images) and returns another cell array called out_cell of size 5xtotal_images.
Each row of out_cell contains the outputs of a single step of the algorithm.
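For instance, a single stage can be pulled out of out_cell and displayed on its own. A minimal sketch (the row indices used here are an assumption inferred from the montage below, where rows 1-4 appear to be preprocessing stages, not part of a documented interface):

```matlab
% Inspect one preprocessing stage for one image.
% NOTE: the meaning of each row index is an assumption in this sketch.
k = 1;       % image index (column of out_cell)
stage = 4;   % e.g. the final binary mask (assumed row)
figure
imshow(out_cell{stage, k})
title(sprintf('Stage %d of image %d', stage, k))
```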
Let's analyze the results.
First, have a look at the steps of the image preprocessing algorithm used for region proposals. (Tip: click on the button appearing at the top right corner of the figure to enlarge it)
montage([im_cell;out_cell(1:4,:)],'Size', [total_images 5], 'BackgroundColor','white')
title('Image preprocessing', 'FontSize',18)
Notice that all five color models mentioned in the paper are employed here. Notice also that the morphological operations help reduce noise between the first binary mask and the final one.
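As a rough sketch of the kind of morphological cleanup involved (the structuring-element shape and radius here are illustrative assumptions, not the values used in the paper):

```matlab
% Illustrative morphological cleanup of a noisy binary mask.
% The disk radius is an arbitrary choice for demonstration.
bw = rand(200, 300) > 0.7;              % noisy stand-in for a color-based mask
se = strel('disk', 3);
bwClean = imclose(imopen(bw, se), se);  % opening removes specks, closing fills gaps
montage({bw, bwClean}, 'BackgroundColor', 'white')
title('Before and after morphological cleanup')
```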
The final predictions can be visualized with the function show_results.
show_results(out_cell)
We have 3 possible outcomes:

3D Polyhedra

Now let's familiarize ourselves with the creation and manipulation of the 3D polyhedra used to binarize images.
First, let's apply the function sampleMask provided as an example.
im = imresize(im_cell{1,5},[800,1200]);
[~,maskedRGBImage] = sampleMask(im);
montage({im,maskedRGBImage}, 'BackgroundColor','white')
title('Sample Mask', "FontSize",18)
We would like to create a 3D polyhedron from the list of color triplets that remain in the masked image (only the gray tones of the fin).
For this purpose, MATLAB alphaShape objects turn out to be really handy: you just pass an array of 3D points (the color triplets in our case) to get a polyhedron that encloses them.
The custom function plot_polyhedron can be used to inspect the resulting point cloud and the corresponding colors in the CIE L*a*b* space.
triplets = get_triplets(maskedRGBImage);
shp = alphaShape(triplets);
plot_polyhedron(shp)
Now you can try it yourself! The following command launches the Color Thresholder app provided by MATLAB.
Choose L*a*b* as the color space, play with the filters (histograms or color cloud) to extract the dorsal fin, and then export the variable maskedRGBImage1 through the Export button (top right).
colorThresholder(im)
Have a look at the new polyhedron you get:
triplets = get_triplets(maskedRGBImage1);
shp = alphaShape(triplets);
plot_polyhedron(shp)

Convolutional neural network

Following is the code used to generate the CNN architecture in the image.
inputSize = [224 224 3];
numClasses = 2;
layers = [
    imageInputLayer(inputSize)
    convolution2dLayer(3,8,'Padding','same','Name','CONV1-8')
    reluLayer
    convolution2dLayer(3,8,'Padding','same','Name','CONV2-8')
    reluLayer
    maxPooling2dLayer(2,'Stride',2,'Name','MAXPOOL1')
    convolution2dLayer(3,16,'Padding','same','Name','CONV3-16')
    reluLayer
    convolution2dLayer(3,16,'Padding','same','Name','CONV4-16')
    reluLayer
    maxPooling2dLayer(2,'Stride',2,'Name','MAXPOOL2')
    convolution2dLayer(3,32,'Padding','same','Name','CONV5-32')
    reluLayer
    convolution2dLayer(3,32,'Padding','same','Name','CONV6-32')
    reluLayer
    maxPooling2dLayer(2,'Stride',2,'Name','MAXPOOL3')
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
Let's inspect each layer through the function analyzeNetwork provided by MATLAB:
analyzeNetwork(layers)
After creating the network, you can define your dataset with an augmentedImageDatastore object, set the training options with a trainingOptions object, and finally call the function trainNetwork to train your network.
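A minimal sketch of that training pipeline might look as follows (the dataset folder name, augmentation settings, and solver options are illustrative assumptions, not the values used in the paper):

```matlab
% Hypothetical labeled dataset: one subfolder per class under 'dataset/'
imdsTrain = imageDatastore('dataset', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% Resize images on the fly to match the network input size
augTrain = augmentedImageDatastore(inputSize(1:2), imdsTrain);

% Illustrative solver settings
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs', 20, ...
    'MiniBatchSize', 32, ...
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress');

net = trainNetwork(augTrain, layers, options);
```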

Final Comments

That's it! I hope you enjoyed the notebook!
For any offline comment, question, or curiosity, just drop me a line:
gianvito97losapio@gmail.com