cknowledge.org/ai: Crowdsourcing benchmarking and optimisation of AI¶

A suite of open-source tools for collecting knowledge on optimising AI:

[PUBLIC] Benchmarking Caffe with OpenBLAS on Samsung Chromebook 2¶

Table of Contents¶

Overview
See the code [for developers]
Get the data [for developers]
See the tables
All data
All execution time data
Mean execution time per batch
Mean execution time per image
Mean number of images per second
See the graphs
Mean number of images per second

Overview¶

We study the execution time of inference (forward propagation):

on the Samsung Chromebook 2 platform:
- [CPU] quad-core ARM Cortex-A15 @ 1900 MHz;
- [CPU] quad-core ARM Cortex-A7 @ 1300 MHz (not used);
- [GPU] quad-core ARM Mali-T628 @ 600 MHz (not used);
- [GPU] dual-core ARM Mali-T628 @ 600 MHz (not used);
- [GPU] OpenCL driver 6.0 (r6p0-02rel0.b77b627bc37583eeaa34bbee29868088);
- [GPU] OpenCL standard 1.1;
- [RAM] 2 GB;
- Gentoo Linux over ChromeOS:

$ cat /etc/lsb-release
CHROMEOS_AUSERVER=https://tools.google.com/service/update2
CHROMEOS_BOARD_APPID={24E2E4F7-F92C-6115-3E26-02C7EAA02946}
CHROMEOS_CANARY_APPID={90F229CE-83E2-4FAF-8479-E368A34938B1}
CHROMEOS_DEVSERVER=
CHROMEOS_RELEASE_APPID={24E2E4F7-F92C-6115-3E26-02C7EAA02946}
CHROMEOS_RELEASE_BOARD=peach_pit-signed-mp-v3keys
CHROMEOS_RELEASE_BRANCH_NUMBER=69
CHROMEOS_RELEASE_BUILDER_PATH=peach_pit-release/R58-9334.69.0
CHROMEOS_RELEASE_BUILD_NUMBER=9334
CHROMEOS_RELEASE_BUILD_TYPE=Official Build
CHROMEOS_RELEASE_CHROME_MILESTONE=58
CHROMEOS_RELEASE_DESCRIPTION=9334.69.0 (Official Build) stable-channel peach_pit 
CHROMEOS_RELEASE_NAME=Chrome OS
CHROMEOS_RELEASE_PATCH_NUMBER=0
CHROMEOS_RELEASE_TRACK=stable-channel
CHROMEOS_RELEASE_VERSION=9334.69.0
DEVICETYPE=CHROMEBOOK
GOOGLE_RELEASE=9334.69.0
$ uname -a
Linux localhost 3.8.11 #1 SMP Wed May 10 18:37:16 PDT 2017 armv7l ARMv7 Processor rev 3 (v7l) SAMSUNG EXYNOS5 (Flattened Device Tree) GNU/Linux

using 3 CNN models (net architecture + weights):
- AlexNet;
- GoogleNet;
- SqueezeNet 1.1;
using 1 library version:
- [CPU] OpenBLAS 0.2.19;
with the number of threads varying from 1 to 4;
with the batch size varying from 1 to 4.
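
The size of the experimental space described above can be checked with a short sketch. The variable names are illustrative and not part of the notebook; the count of three repetitions per configuration is taken from the `repetition_id` values (0 to 2) in the tables of this notebook.

```python
from itertools import product

# Values taken from the overview; names are illustrative only.
models = ['bvlc-alexnet', 'bvlc-googlenet', 'deepscale-squeezenet-1.1']
num_threads_range = range(1, 5)   # 1 to 4 OpenBLAS threads
batch_size_range = range(1, 5)    # batch sizes 1 to 4
num_repetitions = 3               # repetition_id 0..2 in the tables

# Every (model, num_threads, batch_size) combination is one configuration.
configurations = list(product(models, num_threads_range, batch_size_range))
num_measurements = len(configurations) * num_repetitions

print(len(configurations))  # 48
print(num_measurements)     # 144
```

The 144 measurements correspond to the 144 rows of the "All data" table.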

Data wrangling code¶

NB: Please ignore this section if you are not interested in re-running or modifying this notebook.

Includes¶

Standard¶

In [1]:

import os
import sys
import json
import re

Scientific¶

If some of the scientific packages are missing, please install them using:

# pip install jupyter pandas numpy matplotlib

In [2]:

import IPython as ip
import pandas as pd
import numpy as np
import matplotlib as mp

In [3]:

print ('IPython version: %s' % ip.__version__)
print ('Pandas version: %s' % pd.__version__)
print ('NumPy version: %s' % np.__version__)
print ('Matplotlib version: %s' % mp.__version__)

IPython version: 4.1.1
Pandas version: 0.19.1
NumPy version: 1.11.2
Matplotlib version: 1.5.3

In [4]:

from IPython.display import Image
from IPython.core.display import HTML

from IPython.display import display
def display_in_full(df):
    pd.options.display.max_columns = len(df.columns)
    pd.options.display.max_rows = len(df.index)
    display(df)

In [5]:

import matplotlib.pyplot as plt; plt.style.use('classic')
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
%matplotlib inline

In [6]:

default_title = 'Caffe with OpenBLAS on Samsung Chromebook 2'
default_ylabel = 'Execution time (ms)'
default_colormap = cm.autumn
default_fontsize = 16
default_figsize = [16, 16]
default_dpi = 200

In [7]:

if mp.__version__[0]=='2': mp.style.use('classic')
mp.rcParams['figure.figsize'] = default_figsize
mp.rcParams['figure.dpi'] = default_dpi
mp.rcParams['font.size'] = default_fontsize
mp.rcParams['legend.fontsize'] = 'medium'

Collective Knowledge¶

If CK is not installed, please install it using:

# pip install ck

In [8]:

import ck.kernel as ck
print ('CK version: %s' % ck.__version__)

CK version: 1.9.1.1

Access the experimental data¶

In [9]:

def get_experimental_results(repo_uoa, tags, time_ms = 'time_fw_ms'):
    module_uoa = 'experiment'
    r = ck.access({'action':'search', 'repo_uoa':repo_uoa, 'module_uoa':module_uoa, 'tags':tags})
    if r['return']>0:
        print ("Error: %s" % r['error'])
        exit(1)
    experiments = r['lst']
    
    dfs = []
    for experiment in experiments:
        data_uoa = experiment['data_uoa']
        r = ck.access({'action':'list_points', 'repo_uoa':repo_uoa, 'module_uoa':module_uoa, 'data_uoa':data_uoa})
        if r['return']>0:
            print ("Error: %s" % r['error'])
            exit(1)

        # Get (lib_tag, model_tag) from a list of tags that should be available in r['dict']['tags'].
        # Tags include 2 of the 3 irrelevant tags, a model tag and a lib tag.
        # NB: Since it's easier to list all model tags than all lib tags, the latter list is not explicitly specified.
        tags = r['dict']['tags']
        irrelevant_tags = [ 'explore-batch-size-openblas-threads', 'caffe-time', 'samsung-chromebook2' ]
        model_tags = [ 'bvlc-alexnet', 'bvlc-googlenet', 'deepscale-squeezenet-1.1' ]
        lib_model_tags = [ tag for tag in tags if tag not in irrelevant_tags ]
        model_tags = [ tag for tag in lib_model_tags if tag in model_tags ]
        lib_tags = [ tag for tag in lib_model_tags if tag not in model_tags ]
        if len(lib_tags)==1 and len(model_tags)==1:
            (lib, model) = (lib_tags[0], model_tags[0])
        else:
            continue
        
        for point in r['points']:
            with open(os.path.join(r['path'], 'ckp-%s.0001.json' % point)) as point_file:
                point_data_raw = json.load(point_file)
            characteristics_list = point_data_raw['characteristics_list']
            num_repetitions = len(characteristics_list)            
            # Obtain column data.
            data = [
                {
                    # features
                    'platform' : point_data_raw['features']['platform']['platform']['model'],
                    # choices
                    'lib' : lib,
                    'model' : model,
                    'batch_size' : np.int64(point_data_raw['choices']['env'].get('CK_CAFFE_BATCH_SIZE',-1)),
                    'num_threads' : np.int64(point_data_raw['choices']['env'].get('OPENBLAS_NUM_THREADS',-1)),
                    # statistical repetition
                    'repetition_id': repetition_id,
                    # runtime characteristics
                    'time (ms)'   : characteristics['run'].get(time_ms,+1e9), # "positive infinity"
                    'per layer info' : characteristics['run'].get('per_layer_info',[]),
                    'success?'    : characteristics['run'].get('run_success','n/a')
                }
                for (repetition_id, characteristics) in zip(range(num_repetitions), characteristics_list) 
            ]
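            # Deal with missing column data (resulting from failed runs):
            # replicate the single record once per expected repetition,
            # giving each copy its own repetition_id so the index stays unique.
            if len(data)==1:
                repetitions = point_data_raw['features'].get('statistical_repetitions',1)
                data = [ dict(data[0], repetition_id=i) for i in range(repetitions) ]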
            # Construct a DataFrame.
            df = pd.DataFrame(data)
            # Set columns and index names.
            df.columns.name = 'run characteristic'
            df.index.name = 'index'
            df = df.set_index([ 'platform', 'lib', 'model', 'num_threads', 'batch_size', 'repetition_id' ])
            # Append to the list of similarly constructed DataFrames.
            dfs.append(df)
    # Concatenate all constructed DataFrames (i.e. stack on top of each other).
    result = pd.concat(dfs)
    # Sort lexicographically by all index levels (sortlevel() is deprecated in later pandas versions).
    return result.sort_index()
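
The tag-splitting step inside `get_experimental_results` may be easier to follow in isolation. The sketch below is a standalone approximation of that logic, not the notebook's code: `split_tags` is a hypothetical helper, and `'openblas-0.2.19'` is a made-up lib tag used only for illustration.

```python
def split_tags(tags, irrelevant_tags, known_model_tags):
    """Return (lib, model) if exactly one lib tag and one model tag remain, else None."""
    relevant = [tag for tag in tags if tag not in irrelevant_tags]
    model_tags = [tag for tag in relevant if tag in known_model_tags]
    # Anything relevant that is not a known model tag is treated as a lib tag.
    lib_tags = [tag for tag in relevant if tag not in known_model_tags]
    if len(lib_tags) == 1 and len(model_tags) == 1:
        return (lib_tags[0], model_tags[0])
    return None

irrelevant = ['explore-batch-size-openblas-threads', 'caffe-time', 'samsung-chromebook2']
models = ['bvlc-alexnet', 'bvlc-googlenet', 'deepscale-squeezenet-1.1']

# Hypothetical experiment entry: two irrelevant tags, one model tag, one lib tag.
example_tags = ['caffe-time', 'samsung-chromebook2', 'bvlc-alexnet', 'openblas-0.2.19']
print(split_tags(example_tags, irrelevant, models))
```

Entries for which the helper returns `None` correspond to the experiments that the notebook skips with `continue`.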

Plot images per second against the batch size and the number of threads¶

In [10]:

def plot_trisurf(df_model, x_col, y_col, z_col, x_label=None, y_label=None, z_label=None, title=None):
    x = df_model[x_col]
    y = df_model[y_col]
    z = df_model[z_col]
    
    if x_label is None: x_label = x_col
    if y_label is None: y_label = y_col
    if z_label is None: z_label = z_col
        
    x_ticks = x.unique()
    y_ticks = y.unique()
    
    fig = plt.figure(figsize=(24, 12), dpi=600)
    ax = fig.add_subplot(111, projection='3d')
    trisurf = ax.plot_trisurf(x, y, z, cmap=cm.autumn_r, linewidth=0.2, antialiased=True, shade=True)
    ax.set_xlabel(x_label); ax.set_xticks(x_ticks); ax.set_xlim3d(x_ticks.max(), x_ticks.min())
    ax.set_ylabel(y_label); ax.set_yticks(y_ticks); ax.set_ylim3d(y_ticks.min(), y_ticks.max())
    ax.set_zlabel(z_label); ax.set_zlim3d(z.min(), z.max())
    ax.set_title(title, fontsize=20)
    fig.colorbar(trisurf, shrink=0.5, aspect=10)
    return fig

Get the experimental data¶

NB: Please ignore this section if you are not interested in re-running or modifying this notebook.

The Caffe experimental data was collected on the experimental platform (after installing all Caffe libraries and models of interest) as follows:

$ cd `ck find ck-caffe:script:explore-batch-size-openblas-threads`
$ python explore-batch-size-openblas-threads-benchmarking.py

The data can be downloaded from GitHub via CK as follows:

$ ck pull repo:ck-caffe-samsung-chromebook2 --url=https://github.com/dividiti/ck-caffe-samsung-chromebook2

Tables¶

All data¶

In [11]:

df_all = get_experimental_results(
    repo_uoa='ck-caffe-samsung-chromebook2',
    tags='explore-batch-size-openblas-threads') \
    .reset_index(['platform', 'lib'], drop=True)
display_in_full(df_all)

			run characteristic	per layer info	success?	time (ms)
model	num_threads	batch_size	repetition_id
bvlc-alexnet	1	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	694.353
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	696.456
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	696.375
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1167.060
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1162.620
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1156.160
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1674.100
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1672.910
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1666.920
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1996.090
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2005.840
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2004.880
	2	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	524.983
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	522.722
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	522.799
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	799.439
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	795.181
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	796.874
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1121.040
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1115.980
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1118.950
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1249.540
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1255.670
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1250.690
	3	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	475.538
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	472.089
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	473.300
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	694.085
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	697.827
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	695.824
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	982.992
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	960.929
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	964.487
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1052.490
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1055.100
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1052.270
	4	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	487.265
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	477.739
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	486.216
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	703.497
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	677.439
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	670.514
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	903.934
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	902.611
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	909.762
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	966.699
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	965.654
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	963.569
bvlc-googlenet	1	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1335.570
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1336.710
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1338.930
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2676.170
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2661.660
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2671.690
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	4020.280
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	4000.710
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	3985.830
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	5297.440
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	5315.000
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	5320.740
	2	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	922.396
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	908.155
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	919.381
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1794.620
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1798.210
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1797.320
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2697.390
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2684.180
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2718.890
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	3595.340
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	3595.200
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	3608.520
	3	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	787.989
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	790.105
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	781.814
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1565.810
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1554.960
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1553.640
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2317.810
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2345.490
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2324.120
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	3146.610
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	3146.140
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	3094.600
	4	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	718.714
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	772.498
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	709.756
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1390.610
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1403.510
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1391.730
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2076.340
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2112.780
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2081.190
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2835.490
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2825.630
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	2779.020
deepscale-squeezenet-1.1	1	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	304.267
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	314.208
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	316.030
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	613.367
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	614.067
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	614.589
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	917.684
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	921.206
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	915.081
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1217.780
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1216.740
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	1224.210
	2	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	197.077
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	195.849
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	197.484
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	397.776
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	396.322
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	394.886
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	594.631
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	600.282
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	595.269
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	787.323
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	785.526
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	788.686
	3	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	170.664
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	170.538
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	171.891
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	341.801
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	389.126
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	340.976
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	518.305
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	513.175
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	511.812
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	705.115
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	682.494
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	717.512
	4	1	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	149.971
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	150.513
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	150.944
		2	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	304.959
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	304.886
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	312.029
		3	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	466.609
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	456.494
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	461.580
		4	0	[{u'index': 0, u'direction': u'forward', u'tim...	yes	617.098
			1	[{u'index': 0, u'direction': u'forward', u'tim...	yes	609.183
			2	[{u'index': 0, u'direction': u'forward', u'tim...	yes	688.131

All execution time data¶

In [12]:

df_time = df_all['time (ms)'].unstack(df_all.index.names[:-1])
display_in_full(df_time)

model	bvlc-alexnet																bvlc-googlenet																deepscale-squeezenet-1.1
num_threads	1				2				3				4				1				2				3				4				1				2				3				4
batch_size	1	2	3	4	1	2	3	4	1	2	3	4	1	2	3	4	1	2	3	4	1	2	3	4	1	2	3	4	1	2	3	4	1	2	3	4	1	2	3	4	1	2	3	4	1	2	3	4
repetition_id
0	694.353	1167.06	1674.10	1996.09	524.983	799.439	1121.04	1249.54	475.538	694.085	982.992	1052.49	487.265	703.497	903.934	966.699	1335.57	2676.17	4020.28	5297.44	922.396	1794.62	2697.39	3595.34	787.989	1565.81	2317.81	3146.61	718.714	1390.61	2076.34	2835.49	304.267	613.367	917.684	1217.78	197.077	397.776	594.631	787.323	170.664	341.801	518.305	705.115	149.971	304.959	466.609	617.098
1	696.456	1162.62	1672.91	2005.84	522.722	795.181	1115.98	1255.67	472.089	697.827	960.929	1055.10	477.739	677.439	902.611	965.654	1336.71	2661.66	4000.71	5315.00	908.155	1798.21	2684.18	3595.20	790.105	1554.96	2345.49	3146.14	772.498	1403.51	2112.78	2825.63	314.208	614.067	921.206	1216.74	195.849	396.322	600.282	785.526	170.538	389.126	513.175	682.494	150.513	304.886	456.494	609.183
2	696.375	1156.16	1666.92	2004.88	522.799	796.874	1118.95	1250.69	473.300	695.824	964.487	1052.27	486.216	670.514	909.762	963.569	1338.93	2671.69	3985.83	5320.74	919.381	1797.32	2718.89	3608.52	781.814	1553.64	2324.12	3094.60	709.756	1391.73	2081.19	2779.02	316.030	614.589	915.081	1224.21	197.484	394.886	595.269	788.686	171.891	340.976	511.812	717.512	150.944	312.029	461.580	688.131
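
The reshaping performed by `unstack` can be pictured as moving the configuration levels of the index (`model`, `num_threads`, `batch_size`) into the columns, leaving one row per `repetition_id`. The following dependency-free sketch mimics that pivot for a few rows copied from the "All data" table; it only illustrates the idea and is not how pandas implements it.

```python
from collections import defaultdict

# Rows copied from the "All data" table:
# (model, num_threads, batch_size, repetition_id, time_ms)
rows = [
    ('bvlc-alexnet', 1, 1, 0, 694.353),
    ('bvlc-alexnet', 1, 1, 1, 696.456),
    ('bvlc-alexnet', 1, 1, 2, 696.375),
    ('bvlc-alexnet', 1, 2, 0, 1167.060),
    ('bvlc-alexnet', 1, 2, 1, 1162.620),
    ('bvlc-alexnet', 1, 2, 2, 1156.160),
]

# wide[repetition_id][(model, num_threads, batch_size)] = time_ms
wide = defaultdict(dict)
for model, num_threads, batch_size, repetition_id, time_ms in rows:
    wide[repetition_id][(model, num_threads, batch_size)] = time_ms

for repetition_id in sorted(wide):
    print(repetition_id, wide[repetition_id])
```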

Mean execution time per batch¶

In [13]:

df_mean_time_per_batch = df_time.describe().loc['mean'].unstack(level='batch_size')
display_in_full(df_mean_time_per_batch)

	batch_size	1	2	3	4
model	num_threads
bvlc-alexnet	1	695.728000	1161.946667	1671.310000	2002.270000
	2	523.501333	797.164667	1118.656667	1251.966667
	3	473.642333	695.912000	969.469333	1053.286667
	4	483.740000	683.816667	905.435667	965.307333
bvlc-googlenet	1	1337.070000	2669.840000	4002.273333	5311.060000
	2	916.644000	1796.716667	2700.153333	3599.686667
	3	786.636000	1558.136667	2329.140000	3129.116667
	4	733.656000	1395.283333	2090.103333	2813.380000
deepscale-squeezenet-1.1	1	311.501667	614.007667	917.990333	1219.576667
	2	196.803333	396.328000	596.727333	787.178333
	3	171.031000	357.301000	514.430667	701.707000
	4	150.476000	307.291333	461.561000	638.137333
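
As a worked check, the mean for one configuration can be recomputed by hand from the three repetitions in the execution time table. The sketch below also computes the median for one SqueezeNet configuration whose third repetition is noticeably slower, which suggests that the mean can be sensitive to a single slow run. The helper functions are illustrative and not part of the notebook.

```python
def mean(values):
    return sum(values) / float(len(values))

def median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2.0

# bvlc-alexnet, 1 thread, batch size 1 (repetitions 0..2).
alexnet_ms = [694.353, 696.456, 696.375]
# deepscale-squeezenet-1.1, 4 threads, batch size 4 (repetitions 0..2).
squeezenet_ms = [617.098, 609.183, 688.131]

print(round(mean(alexnet_ms), 3))     # 695.728
print(round(mean(squeezenet_ms), 3))  # 638.137
print(median(squeezenet_ms))          # 617.098
```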

Mean execution time per image¶

In [14]:

batch_sizes = df_mean_time_per_batch.columns.tolist()
df_mean_time_per_image = df_mean_time_per_batch / batch_sizes
display_in_full(df_mean_time_per_image)

	batch_size	1	2	3	4
model	num_threads
bvlc-alexnet	1	695.728000	580.973333	557.103333	500.567500
	2	523.501333	398.582333	372.885556	312.991667
	3	473.642333	347.956000	323.156444	263.321667
	4	483.740000	341.908333	301.811889	241.326833
bvlc-googlenet	1	1337.070000	1334.920000	1334.091111	1327.765000
	2	916.644000	898.358333	900.051111	899.921667
	3	786.636000	779.068333	776.380000	782.279167
	4	733.656000	697.641667	696.701111	703.345000
deepscale-squeezenet-1.1	1	311.501667	307.003833	305.996778	304.894167
	2	196.803333	198.164000	198.909111	196.794583
	3	171.031000	178.650500	171.476889	175.426750
	4	150.476000	153.645667	153.853667	159.534333
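
The per-image figures are the per-batch means divided by the batch size. A minimal sketch for the first row (bvlc-alexnet, 1 thread), with the per-batch values copied from the previous table; the printed values should agree with that row to two decimals (695.73, 580.97, 557.10, 500.57).

```python
batch_sizes = [1, 2, 3, 4]
# Mean execution time per batch (ms) for bvlc-alexnet with 1 thread.
mean_time_per_batch_ms = [695.728000, 1161.946667, 1671.310000, 2002.270000]

# Divide each per-batch mean by its batch size.
mean_time_per_image_ms = [t / b for t, b in zip(mean_time_per_batch_ms, batch_sizes)]

for b, t in zip(batch_sizes, mean_time_per_image_ms):
    print('batch size %d: %.2f ms per image' % (b, t))
```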

Mean images per second¶

In [15]:

df_mean_seconds_per_image = 1e-3 * df_mean_time_per_image
df_mean_images_per_second = 1 / df_mean_seconds_per_image
display_in_full(df_mean_images_per_second)

	batch_size	1	2	3	4
model	num_threads
bvlc-alexnet	1	1.437343	1.721249	1.794999	1.997733
	2	1.910215	2.508892	2.681788	3.194973
	3	2.111298	2.873927	3.094476	3.797637
	4	2.067226	2.924761	3.313322	4.143758
bvlc-googlenet	1	0.747904	0.749109	0.749574	0.753145
	2	1.090936	1.113142	1.111048	1.111208
	3	1.271236	1.283584	1.288029	1.278316
	4	1.363037	1.433401	1.435336	1.421777
deepscale-squeezenet-1.1	1	3.210256	3.257288	3.268008	3.279827
	2	5.081215	5.046325	5.027422	5.081441
	3	5.846893	5.597521	5.831690	5.700385
	4	6.645578	6.508482	6.499683	6.268243
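
The conversion from milliseconds per image to images per second is a reciprocal after rescaling to seconds. The sketch below applies it to two bvlc-alexnet entries (batch size 4, with 1 and 4 threads) and derives the resulting speedup from threading. The helper name is illustrative, and the speedup is computed here for illustration only; it is not reported in the notebook.

```python
def images_per_second(time_per_image_ms):
    # Convert ms to seconds, then take the reciprocal.
    return 1.0 / (1e-3 * time_per_image_ms)

# Mean time per image (ms) for bvlc-alexnet at batch size 4.
alexnet_1_thread = images_per_second(500.567500)
alexnet_4_threads = images_per_second(241.326833)

print(round(alexnet_1_thread, 3))                      # 1.998
print(round(alexnet_4_threads, 3))                     # 4.144
print(round(alexnet_4_threads / alexnet_1_thread, 2))  # 2.07
```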

Graphs¶

In [17]:

models = df_mean_images_per_second.index.get_level_values('model').unique()
for model in models:
    # NB: the values are images per second, so name the column accordingly.
    df_model = df_mean_images_per_second \
        .loc[model] \
        .unstack() \
        .reset_index() \
        .rename(columns={0 : 'images per second'}) \
        .dropna() \
        .sort_values(by='batch_size', ascending=False)
    fig = plot_trisurf(df_model, title=model,
                 x_col='num_threads', x_label='Number of threads',
                 y_col='batch_size', y_label='Batch size',
                 z_col='images per second', z_label='Images per second')