[PUBLIC] MLPerf Inference - Object Detection - TFLite¶

Table of Contents¶

Platforms
xavier
rpi4coral
Installation
Install system-wide prerequisites
1. Ubuntu 20.04
Install CK
Detect Python
Detect GCC
Install Python dependencies
Download the COCO 2017 validation dataset
Preprocess the COCO 2017 validation dataset
1. For SSD-MobileNet
- Preprocess using OpenCV
- Preprocess using Pillow
Install CMake
Install TFLite
Install the TFLite model
Install the TF Model Garden
Install the Python COCO API
Test (without LoadGen)
1. Quick
- Preprocessed using OpenCV
- Preprocessed using Pillow
1. Full
- Preprocessed using OpenCV
- Preprocessed using Pillow
Install LoadGen
Benchmarking accuracy
Single Stream
Offline
Benchmarking performance
Single Stream
xavier
rpi4coral
Offline
xavier
rpi4coral

Platforms¶

`xavier` (NVIDIA Jetson AGX Xavier)¶

arjun@xavier:~$ uname -a
Linux xavier 4.9.201-tegra #1 SMP PREEMPT Fri Jan 15 14:54:23 PST 2021 aarch64 aarch64 aarch64 GNU/Linux

arjun@xavier:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"

arjun@xavier:~$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           4
Vendor ID:           Nvidia
Model:               0
Model name:          ARMv8 Processor rev 0 (v8l)
Stepping:            0x0
CPU max MHz:         2265.6001
CPU min MHz:         115.2000
BogoMIPS:            62.50
L1d cache:           64K
L1i cache:           128K
L2 cache:            2048K
L3 cache:            4096K
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp

arjun@xavier:~$ sudo /usr/sbin/nvpmodel -m 0
arjun@xavier:~$ sudo /usr/sbin/nvpmodel -d cool
arjun@xavier:~$ sudo /usr/bin/jetson_clocks --store
arjun@xavier:~$ sudo /usr/bin/jetson_clocks
arjun@xavier:~$ sudo /usr/bin/jetson_clocks --show
SOC family:tegra194  Machine:Jetson-AGX
Online CPUs: 0-7
cpu0: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu1: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu2: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu3: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu4: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu5: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu6: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
cpu7: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=0 c6=0
GPU MinFreq=1377000000 MaxFreq=1377000000 CurrentFreq=1377000000
EMC MinFreq=204000000 MaxFreq=2133000000 CurrentFreq=2133000000 FreqOverride=1
Fan: PWM=77
NV Power Mode: MAXN
arjun@xavier:~$ sudo /usr/bin/jetson_clocks --restore
arjun@xavier:~$ sudo /usr/bin/jetson_clocks --show
SOC family:tegra194  Machine:Jetson-AGX
Online CPUs: 0-7
cpu0: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=2265600 CurrentFreq=2265600 IdleStates: C1=1 c6=1
cpu1: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=2265600 CurrentFreq=1958400 IdleStates: C1=1 c6=1
cpu2: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=2265600 CurrentFreq=1958400 IdleStates: C1=1 c6=1
cpu3: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=2265600 CurrentFreq=1420800 IdleStates: C1=1 c6=1
cpu4: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=2265600 CurrentFreq=1804800 IdleStates: C1=1 c6=1
cpu5: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=2265600 CurrentFreq=1267200 IdleStates: C1=1 c6=1
cpu6: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=2265600 CurrentFreq=1497600 IdleStates: C1=1 c6=1
cpu7: Online=1 Governor=schedutil MinFreq=1190400 MaxFreq=2265600 CurrentFreq=2188800 IdleStates: C1=1 c6=1
GPU MinFreq=318750000 MaxFreq=1377000000 CurrentFreq=318750000
EMC MinFreq=204000000 MaxFreq=2133000000 CurrentFreq=408000000 FreqOverride=0
Fan: PWM=0
NV Power Mode: MAXN
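
In the first `jetson_clocks --show` listing above, every CPU core and the GPU report `MinFreq` equal to `MaxFreq`; after `--restore` they no longer do. The following is a minimal Python sketch for checking this programmatically, assuming the output format shown above. The helper name `pinned_clocks` is hypothetical and not part of any NVIDIA tool. The EMC line is deliberately skipped, because in the listing above it reports `FreqOverride` rather than equal minimum and maximum frequencies.

```python
import re

# Matches lines such as:
#   cpu0: Online=1 Governor=schedutil MinFreq=2265600 MaxFreq=2265600 ...
#   GPU MinFreq=1377000000 MaxFreq=1377000000 CurrentFreq=1377000000
_CLOCK_LINE = re.compile(r"^(cpu\d+|GPU)\b.*?MinFreq=(\d+)\s+MaxFreq=(\d+)")


def pinned_clocks(show_output):
    """Map each CPU/GPU clock to True if MinFreq == MaxFreq (hypothetical helper)."""
    result = {}
    for line in show_output.splitlines():
        match = _CLOCK_LINE.match(line.strip())
        if match:
            name, min_freq, max_freq = match.groups()
            result[name] = int(min_freq) == int(max_freq)
    return result


def all_pinned(show_output):
    """True only if at least one clock was found and all of them are pinned."""
    clocks = pinned_clocks(show_output)
    return bool(clocks) and all(clocks.values())
```

To use it, pass the captured text of `sudo /usr/bin/jetson_clocks --show` to `all_pinned` before starting a performance run.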

`rpi4coral` (Raspberry Pi 4)¶

arjun@rpi4coral:~$ uname -a
Linux rpi4coral 5.4.0-1028-raspi #31-Ubuntu SMP PREEMPT Wed Jan 20 11:30:45 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux

arjun@rpi4coral:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.2 LTS"

arjun@rpi4coral:~$ lscpu
Architecture:                    aarch64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
Vendor ID:                       ARM
Model:                           3
Model name:                      Cortex-A72
Stepping:                        r0p3
CPU max MHz:                     1500.0000
CPU min MHz:                     600.0000
BogoMIPS:                        108.00
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Vulnerable
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm crc32 cpuid

Installation¶

Install system-wide prerequisites¶

NB: Run the commands below with sudo or as the superuser, as appropriate for your Linux system.

Ubuntu 20.04 or similar¶

$ sudo apt update -y
$ sudo apt install -y apt-utils
$ sudo apt upgrade -y
$ sudo apt install -y\
 python3 python3-pip gcc g++\
 autoconf autogen libtool make cmake patch\
 git curl wget zip libz-dev libssl-dev vim
$ sudo apt clean
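
After installing the packages above, it can be useful to confirm that the main command-line tools are on `PATH`. The following is a small Python sketch. The tool list is an assumption about which executables those packages provide (for example, that `python3-pip` provides `pip3`), and `missing_tools` is a hypothetical helper name.

```python
import shutil

# Assumed executables provided by the packages installed above.
REQUIRED_TOOLS = ["python3", "pip3", "gcc", "g++", "make", "cmake", "patch",
                  "git", "curl", "wget", "zip"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the function can be tested without touching
    the real system.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("Missing tools:", ", ".join(missing) if missing else "none")
```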

CentOS 8 or similar¶

TODO

Install Collective Knowledge (CK)¶

anton@xavier:~$ export CK_PYTHON=/usr/bin/python3
anton@xavier:~$ $CK_PYTHON -m pip install --ignore-installed pip setuptools testresources --user
anton@xavier:~$ $CK_PYTHON -m pip install ck
anton@xavier:~$ echo 'export PATH=$HOME/.local/bin:$PATH' >> $HOME/.bashrc
anton@xavier:~$ source $HOME/.bashrc
anton@xavier:~$ ck version
V1.55.2

Install CK repositories¶

Install public repositories¶

anton@xavier:~$ ck pull repo --url=https://github.com/krai/ck-mlperf

Use generic Linux settings with dummy frequency setting scripts¶

anton@xavier:~$ ck detect platform.os --platform_init_uoa=generic-linux-dummy

OS CK UOA:            linux-64 (4258b5fe54828a50)

OS name:              Ubuntu 18.04.5 LTS
Short OS name:        Linux 4.9.201
Long OS name:         Linux-4.9.201-tegra-aarch64-with-Ubuntu-18.04-bionic
OS bits:              64
OS ABI:               aarch64

Platform init UOA:    -

Detect (system) Python¶

anton@xavier:~$ export CK_PYTHON=/usr/bin/python3
anton@xavier:~$ ck detect soft:compiler.python --full_path=$CK_PYTHON
anton@xavier:~$ ck show env --tags=compiler,python
Env UID:         Target OS: Bits: Name:  Version: Tags:

cae4f0c2690e6b80   linux-64    64 python 3.6.9    64bits,compiler,host-os-linux-64,lang-python,python,target-os-linux-64,v3,v3.6,v3.6.9

NB: CK can normally detect available Python interpreters automatically, but we specify the path explicitly here to be safe.

Detect (system) GCC¶

anton@xavier:~$ export CK_CC=/usr/bin/gcc
anton@xavier:~$ ck detect soft:compiler.gcc --full_path=$CK_CC
anton@xavier:~$ ck show env --tags=compiler,gcc
Env UID:         Target OS: Bits: Name:          Version: Tags:

d57004dc9b28d525   linux-64    64 GNU C compiler 7.5.0    64bits,compiler,gcc,host-os-linux-64,lang-c,lang-cpp,target-os-linux-64,v7,v7.5,v7.5.0

NB: CK can normally detect compilers automatically, but we specify the path explicitly here to be safe.

Install Python dependencies (in userspace)¶

Install implicit dependencies via pip¶

NB: These dependencies are implicit, i.e. CK will not try to satisfy them. If they are not installed, however, the workflow will fail.

$ export CK_PYTHON=/usr/bin/python3
$ $CK_PYTHON -m pip install --user --upgrade \
  wheel
Successfully installed...

Install explicit dependencies via CK (also via `pip`, but register with CK at the same time)¶

NB: These dependencies are explicit, i.e. CK will try to satisfy them automatically.

You can still install them explicitly as follows:

anton@xavier:~$ ck install package --tags=python-package,numpy
anton@xavier:~$ ck install package --tags=python-package,pillow
anton@xavier:~$ ck install package --tags=python-package,matplotlib
anton@xavier:~$ ck install package --tags=python-package,opencv-python-headless
anton@xavier:~$ ck install package --tags=python-package,absl
anton@xavier:~$ ck install package --tags=python-package,cython

anton@xavier:~$ ck show env --tags=python-package
Env UID:         Target OS: Bits: Name:                                                  Version:    Tags:

61046178de6ea4f9   linux-64    64 Python Pillow library                                  8.1.0       64bits,PIL,host-os-linux-64,lib,needs-python,needs-python-3.6.9,pillow,python-package,target-os-linux-64,v8,v8.1,v8.1.0,vmaster
ea795b39db83ac6d   linux-64    64 Python OpenCV library (OpenCV without contribs or GUI) 4.5.1.48    64bits,cv2,headless,host-os-linux-64,lib,needs-python,needs-python-3.6.9,opencv,opencv-python-headless,python-package,target-os-linux-64,v4,v4.5,v4.5.1,v4.5.1.48,without-contribs,without-gui
34fc9a86b613bd92   linux-64    64 Python NumPy library                                   1.19.5      64bits,host-os-linux-64,lib,needs-python,needs-python-3.6.9,numpy,python-package,target-os-linux-64,v1,v1.19,v1.19.5,vmaster
a39e2fe603c41d98   linux-64    64 Python Matplotlib library                              3.3.4       64bits,host-os-linux-64,lib,matplotlib,needs-python,needs-python-3.6.9,python-package,target-os-linux-64,v3,v3.3,v3.3.4,vmaster
ea921a2ee978fe56   linux-64    64 Python Abseil library                                  unversioned 64bits,absl,absl-py,host-os-linux-64,lib,needs-python,needs-python-3.6.9,python-package,target-os-linux-64,v0,vmaster
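
The `ck show env` listings in this notebook share one column layout, so they can be turned into records for scripted checks. The following Python sketch assumes exactly the layout shown above: a 16-character hexadecimal UID, target OS, bits, a name that may contain spaces, then a version and a comma-separated tag list with no spaces in them. The function name `parse_env_table` is hypothetical and not part of CK.

```python
import re

_UID = re.compile(r"^[0-9a-f]{16}$")


def parse_env_table(text):
    """Parse rows printed by `ck show env` into dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 6 or not _UID.match(parts[0]):
            continue  # header, blank line, or anything unexpected
        rows.append({
            "uid": parts[0],
            "target_os": parts[1],
            "bits": int(parts[2]),
            # The name is whatever lies between the fixed leading columns
            # and the trailing version and tags columns.
            "name": " ".join(parts[3:-2]),
            "version": parts[-2],
            "tags": parts[-1].split(","),
        })
    return rows
```

For example, `{row["name"]: row["version"] for row in parse_env_table(output)}` gives a quick name-to-version map of the registered environments.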

Download the dataset¶

NB: The COCO 2017 validation dataset (5,000 images) takes ~1.6G. Use --ask to confirm the destination directory.

anton@xavier:~$ ck install package --tags=dataset,coco,val,2017 --ask
anton@xavier:~$ du -hs $(ck locate env --tags=dataset,coco,val,2017)
1.6G    /datasets/dataset-coco-2017-val

NB: To save disk space, you can delete the training annotations after the installation:

anton@xavier:~$ du -hsc $(ck locate env --tags=dataset,coco,val,2017)/annotations/*train2017.json
88M     /datasets/dataset-coco-2017-val/annotations/captions_train2017.json
449M    /datasets/dataset-coco-2017-val/annotations/instances_train2017.json
228M    /datasets/dataset-coco-2017-val/annotations/person_keypoints_train2017.json
764M    total
anton@xavier:~$ rm $(ck locate env --tags=dataset,coco,val,2017)/annotations/*train2017.json
anton@xavier:~$ du -hs $(ck locate env --tags=dataset,coco,val,2017)
839M    /datasets/dataset-coco-2017-val

Preprocess the dataset¶

For SSD-MobileNet-v1¶

SSD-MobileNet-v1 requires input images resized to 300x300 pixels.

NB: As the COCO 2017 validation dataset preprocessed to 300x300 takes ~1.3G, you may want to use the --ask flag to confirm the destination directory interactively.
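
The ~1.3G figure is consistent with a simple back-of-the-envelope estimate. The sketch below assumes that each preprocessed image is stored as uncompressed 8-bit RGB (one byte per channel value), which this notebook does not state explicitly; `raw_dataset_bytes` is a hypothetical helper.

```python
def raw_dataset_bytes(num_images=5000, side=300, channels=3, bytes_per_value=1):
    """Size of a dataset of square images stored as raw, uncompressed values."""
    return num_images * side * side * channels * bytes_per_value


size = raw_dataset_bytes()
print(size)                     # 1350000000 bytes
print(round(size / 2**30, 2))   # 1.26 (GiB)
```

Under that assumption, the 5,000 images occupy about 1.26 GiB, in line with the ~1.3G quoted above.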

Preprocess using OpenCV¶

anton@xavier:~$ ck install package --tags=dataset,coco.2017,preprocessed,using-opencv,full,side.300 --ask
anton@xavier:~$ du -hs $(ck locate env --tags=dataset,coco.2017,preprocessed,using-opencv,full,side.300)
1.3G    /home/anton/CK_TOOLS/dataset-object-detection-preprocessed-using-opencv-coco.2017-full-side.300

Preprocess using Pillow¶

anton@xavier:~$ ck install package --tags=dataset,coco.2017,preprocessed,using-pillow,full,side.300 --ask
anton@xavier:~$ du -hs $(ck locate env --tags=dataset,coco.2017,preprocessed,using-pillow,full,side.300)
1.3G    /home/anton/CK_TOOLS/dataset-object-detection-preprocessed-using-pillow-coco.2017-full-side.300

Detect (system) CMake or install CMake from source¶

Detect¶

Try to detect CMake on your system:

anton@xavier:~$ ck detect soft --tags=tool,cmake
anton@xavier:~$ ck show env --tags=tool,cmake
Env UID:         Target OS: Bits: Name: Version: Tags:

0305242ae838cf05   linux-64    64 cmake 3.17.3   64bits,cmake,host-os-linux-64,target-os-linux-64,tool,v3,v3.17,v3.17.3

Install¶

If detection fails or if the detected CMake version is older than 3.16, try to install CMake from source:

anton@xavier:~$ ck install package --tags=tool,cmake,from.source
anton@xavier:~$ ck show env --tags=tool,cmake,from.source
Env UID:         Target OS: Bits: Name: Version: Tags:

49c8514914650815   linux-64    64 cmake 3.19.5   64bits,cmake,compiled,compiled-by-gcc,compiled-by-gcc-7.5.0,from.source,host-os-linux-64,source,target-os-linux-64,tool,v3,v3.19,v3.19.5
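
Checking for a minimum version of 3.16 is easy to get wrong with plain string comparison (as strings, "3.9" sorts after "3.16"). The following Python sketch compares versions numerically; the helper names are hypothetical, and the input can be any text that contains a dotted version, such as the `Version:` column above.

```python
import re


def parse_version(text):
    """Extract the first dotted version number from text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def cmake_is_recent_enough(version_text, minimum=(3, 16)):
    """True if the version found in version_text is at least `minimum`."""
    return parse_version(version_text) >= minimum
```

With the versions listed above, both `cmake_is_recent_enough("3.17.3")` and `cmake_is_recent_enough("3.19.5")` return `True`.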

Install the TFLite library (inference engine)¶

Generic¶

anton@xavier:~$ ck install package --tags=lib,tflite,via-cmake

`xavier`¶

arjun@xavier:~$ ck install package --tags=lib,via-cmake,with.ruy

Install the SSD-MobileNet-v1 TFLite model¶

anton@xavier:~$ ck install package --tags=model,tflite,ssd-mobilenet --no_tags=edgetpu

Install the TensorFlow Model Garden (API)¶

anton@xavier:~$ ck install package --tags=tensorflowmodel,api

Install the Python COCO API¶

anton@xavier:~$ ck install package --tags=tool,coco,api

Test (without LoadGen)¶

Quick¶

Preprocessed using OpenCV¶

anton@xavier:~$ ck benchmark program:object-detection-tflite \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-opencv,side.300 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.lib-tflite=via-cmake \
--env.CK_BATCH_COUNT=50 --env.CK_BATCH_SIZE=1
...
Evaluate metrics as coco ...
loading annotations into memory...
Done (t=0.91s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.41s).
Accumulating evaluation results...
DONE (t=0.59s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.256
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.409
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.272
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.026
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.226
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.610
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.223
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.268
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.270
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.030
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.228
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.644

Summary:
-------------------------------
All images loaded in 0.268237s
Average image load time: 0.005365s
All images detected in 1.657856s
Average detection time: 0.031017s
Total NMS time: 0.062935s
Average NMS time: 0.001259s
mAP: 0.25566726658312644
Recall: 0.2696417278289413
--------------------------------

Preprocessed using Pillow¶

anton@xavier:~$ ck benchmark program:object-detection-tflite \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-pillow,side.300 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.lib-tflite=via-cmake \
--env.CK_BATCH_COUNT=50 --env.CK_BATCH_SIZE=1
...
Evaluate metrics as coco ...
loading annotations into memory...
Done (t=0.89s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.31s).
Accumulating evaluation results...
DONE (t=0.46s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.228
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.359
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.236
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.191
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.560
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.205
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.243
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.243
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.026
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.196
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.596

Summary:
-------------------------------
All images loaded in 0.062045s
Average image load time: 0.001241s
All images detected in 1.728882s
Average detection time: 0.032712s
Total NMS time: 0.060354s
Average NMS time: 0.001207s
mAP: 0.2275409744679146
Recall: 0.24346685802358603
--------------------------------

Full¶

Preprocessed using OpenCV¶

anton@xavier:~$ ck benchmark program:object-detection-tflite \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-opencv,side.300 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.lib-tflite=via-cmake \
--env.CK_BATCH_COUNT=5000 --env.CK_BATCH_SIZE=1
...
Evaluate metrics as coco ...
loading annotations into memory...
Done (t=0.88s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.35s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=28.90s).
Accumulating evaluation results...
DONE (t=4.60s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.231
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.350
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.254
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.167
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.529
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.209
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.263
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.264
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.191
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.604

Summary:
-------------------------------
All images loaded in 46.736076s
Average image load time: 0.009347s
All images detected in 138.532166s
Average detection time: 0.026915s
Total NMS time: 3.927624s
Average NMS time: 0.000786s
mAP: 0.23129300490236354
Recall: 0.263527133917118
--------------------------------

Preprocessed using Pillow¶

anton@xavier:~$ ck benchmark program:object-detection-tflite \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-pillow,side.300 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.lib-tflite=via-cmake \
--env.CK_BATCH_COUNT=5000 --env.CK_BATCH_SIZE=1
...
Evaluate metrics as coco ...
loading annotations into memory...
Done (t=0.89s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.14s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=27.72s).
Accumulating evaluation results...
DONE (t=4.25s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.223
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.341
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.247
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.015
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.160
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.515
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.203
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.255
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.255
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.019
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.182
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.593

Summary:
-------------------------------
All images loaded in 5.572460s
Average image load time: 0.001114s
All images detected in 138.749115s
Average detection time: 0.026947s
Total NMS time: 3.966329s
Average NMS time: 0.000793s
mAP: 0.22349681099305302
Recall: 0.2550505369422975
--------------------------------
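
The `Summary` blocks above can be parsed to compare runs. The following Python sketch assumes the `key: value` line format shown above (lines without a colon, such as "All images loaded in ...", are ignored). The function names are hypothetical, and the throughput figure is only a rough estimate based on the average detection time, ignoring image loading and NMS.

```python
import re

# Matches lines such as "Average detection time: 0.026947s" or "mAP: 0.2234".
_SUMMARY_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?):\s*(\d+(?:\.\d+)?)s?\s*$")


def parse_summary(text):
    """Collect `key: value` lines from a Summary block into a dict of floats."""
    summary = {}
    for line in text.splitlines():
        match = _SUMMARY_LINE.match(line)
        if match:
            summary[match.group(1)] = float(match.group(2))
    return summary


def images_per_second(summary):
    """Rough throughput estimate from the average detection time alone."""
    return 1.0 / summary["Average detection time"]
```

For the two full runs above, the reported mAP values (0.2313 with OpenCV, 0.2235 with Pillow) differ by roughly 0.8 percentage points, while the average detection times are nearly identical.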

Install the MLPerf Inference repo and build LoadGen¶

anton@xavier:~$ ck install package --tags=mlperf,inference,source
anton@xavier:~$ ck install package --tags=python-package,mlperf,loadgen

Benchmarking accuracy¶

Single Stream¶

50 samples¶

Preprocessed using OpenCV¶

anton@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-opencv,side.300 \
--dep_add_tags.library=tflite,via-cmake,v2.4 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.python=v3.7 --dep_add_tags.tool-coco=needs-python-3.7.5 \
--env.CK_LOADGEN_MODE=AccuracyOnly --env.CK_LOADGEN_SCENARIO=SingleStream \
--env.CK_LOADGEN_DATASET_SIZE=50 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf.object-detection.ssd-mobilenet.tflite.accuracy.using-opencv.50 \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy,using-opencv,50
...
Graph file: /home/anton/CK_TOOLS/model-tflite-mlperf-ssd-mobilenet-downloaded-from-zenodo/detect_regular_nms.tflite
Image dir: /home/anton/CK_TOOLS/dataset-object-detection-preprocessed-using-opencv-coco.2017-full-side.300
Image list: original_dimensions.txt
Image size: 300*300
Image channels: 3
Result dir: detections
Batch count: 1
Batch size: 1
Normalize: 1
Subtract mean: 0
Image count in file: 50
Graph file: /home/anton/CK_TOOLS/model-tflite-mlperf-ssd-mobilenet-downloaded-from-zenodo/detect_regular_nms.tflite
Image dir: /home/anton/CK_TOOLS/dataset-object-detection-preprocessed-using-opencv-coco.2017-full-side.300
Image list: original_dimensions.txt
Image size: 300*300
Image channels: 3
Result dir: detections
Batch count: 1
Batch size: 1
Normalize: 1
Subtract mean: 0
Image count in file: 50

Loading graph...
Loaded model /home/anton/CK_TOOLS/model-tflite-mlperf-ssd-mobilenet-downloaded-from-zenodo/detect_regular_nms.tflite
resolved reporter

Number of threads: 8
tensors size: 184
nodes size: 64
number of inputs: 1
number of outputs: 4
input(0) name: normalized_input_image_tensor
...
Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/anton/CK_TOOLS/mlperf-inference-r1.0/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: SingleStream
LoadGen Mode: AccuracyOnly
CBllllllllllllllllllllllllllllllllllllllllllllllllll
QpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQp
U
  (post processing via CK (/home/anton/CK/ck-mlperf/script/object-detection, loadgen_postprocess)


--------------------------------
loading annotations into memory...
Done (t=0.88s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.02s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.29s).
Accumulating evaluation results...
DONE (t=0.50s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.256
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.409
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.272
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.026
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.226
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.610
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.223
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.268
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.270
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.030
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.228
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.644
mAP=25.554%

--------------------------------

Preprocessed using Pillow¶

anton@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-pillow,side.300 \
--dep_add_tags.library=tflite,via-cmake,v2.4 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.python=v3.7 --dep_add_tags.tool-coco=needs-python-3.7.5 \
--env.CK_LOADGEN_MODE=AccuracyOnly --env.CK_LOADGEN_SCENARIO=SingleStream \
--env.CK_LOADGEN_DATASET_SIZE=50 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf.object-detection.ssd-mobilenet.tflite.accuracy.using-pillow.50 \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy,using-pillow,50
...
Graph file: /home/anton/CK_TOOLS/model-tflite-mlperf-ssd-mobilenet-downloaded-from-zenodo/detect_regular_nms.tflite
Image dir: /home/anton/CK_TOOLS/dataset-object-detection-preprocessed-using-pillow-coco.2017-full-side.300
Image list: original_dimensions.txt
Image size: 300*300
Image channels: 3
Result dir: detections
Batch count: 1
Batch size: 1
Normalize: 1
Subtract mean: 0
Image count in file: 50
Graph file: /home/anton/CK_TOOLS/model-tflite-mlperf-ssd-mobilenet-downloaded-from-zenodo/detect_regular_nms.tflite
Image dir: /home/anton/CK_TOOLS/dataset-object-detection-preprocessed-using-pillow-coco.2017-full-side.300
Image list: original_dimensions.txt
Image size: 300*300
Image channels: 3
Result dir: detections
Batch count: 1
Batch size: 1
Normalize: 1
Subtract mean: 0
Image count in file: 50

Loading graph...
Loaded model /home/anton/CK_TOOLS/model-tflite-mlperf-ssd-mobilenet-downloaded-from-zenodo/detect_regular_nms.tflite
resolved reporter

Number of threads: 8
tensors size: 184
nodes size: 64
number of inputs: 1
number of outputs: 4
input(0) name: normalized_input_image_tensor
...
Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/anton/CK_TOOLS/mlperf-inference-r1.0/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: SingleStream
LoadGen Mode: AccuracyOnly
CBllllllllllllllllllllllllllllllllllllllllllllllllll
QpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQp
U
  (post processing via CK (/home/anton/CK/ck-mlperf/script/object-detection, loadgen_postprocess)


--------------------------------
loading annotations into memory...
Done (t=0.89s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.02s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.25s).
Accumulating evaluation results...
DONE (t=0.44s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.228
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.359
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.236
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.191
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.560
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.205
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.243
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.243
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.026
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.196
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.596
mAP=22.754%

--------------------------------
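
In the accuracy commands above, the `--record_uoa` value and the `--tags` value are built from the same parts, joined by dots and by commas respectively. Generating both from one list keeps them consistent across runs. The following is a Python sketch that uses only the flags appearing in the Single Stream commands above; the function name `accuracy_benchmark_args` is hypothetical and not part of CK.

```python
import shlex


def accuracy_benchmark_args(preprocessing, dataset_size):
    """Build the argument list for a Single Stream accuracy run.

    `preprocessing` is "opencv" or "pillow"; `dataset_size` is the number
    of samples (for example 50 or 500).
    """
    if preprocessing not in ("opencv", "pillow"):
        raise ValueError("preprocessing must be 'opencv' or 'pillow'")
    # Shared by the record name (dot-separated) and the tags (comma-separated).
    parts = ["mlperf", "object-detection", "ssd-mobilenet", "tflite",
             "accuracy", f"using-{preprocessing}", str(dataset_size)]
    return [
        "ck", "benchmark", "program:object-detection-tflite-loadgen",
        "--speed", "--repetitions=1",
        "--skip_print_timers", "--skip_stat_analysis",
        f"--dep_add_tags.dataset=preprocessed,using-{preprocessing},side.300",
        "--dep_add_tags.library=tflite,via-cmake,v2.4",
        "--dep_add_tags.weights=ssd-mobilenet",
        "--dep_add_tags.python=v3.7",
        "--dep_add_tags.tool-coco=needs-python-3.7.5",
        "--env.CK_LOADGEN_MODE=AccuracyOnly",
        "--env.CK_LOADGEN_SCENARIO=SingleStream",
        f"--env.CK_LOADGEN_DATASET_SIZE={dataset_size}",
        "--env.CK_LOADGEN_BUFFER_SIZE=1024",
        "--record", "--record_repo=local", "--process_multi_keys",
        "--record_uoa=" + ".".join(parts),
        "--tags=" + ",".join(parts),
    ]


if __name__ == "__main__":
    # Print a shell-ready command line for inspection.
    print(" ".join(shlex.quote(arg) for arg in accuracy_benchmark_args("pillow", 50)))
```

The returned list can be passed to `subprocess.run(args, check=True)` on a machine where CK is installed.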

500 samples¶

Preprocessed using OpenCV¶

anton@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-opencv,side.300 \
--dep_add_tags.library=tflite,via-cmake,v2.4 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.python=v3.7 --dep_add_tags.tool-coco=needs-python-3.7.5 \
--env.CK_LOADGEN_MODE=AccuracyOnly --env.CK_LOADGEN_SCENARIO=SingleStream \
--env.CK_LOADGEN_DATASET_SIZE=500 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf.object-detection.ssd-mobilenet.tflite.accuracy.using-opencv.500 \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy,using-opencv,500
...
Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/anton/CK_TOOLS/mlperf-inference-r1.0/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: SingleStream
LoadGen Mode: AccuracyOnly
CBllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll
QpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQppQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQppQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQp
U
  (post processing via CK (/home/anton/CK/ck-mlperf/script/object-detection, loadgen_postprocess)

--------------------------------
loading annotations into memory...
Done (t=0.87s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.03s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=2.46s).
Accumulating evaluation results...
DONE (t=1.18s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.254
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.377
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.280
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.020
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.181
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.543
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.228
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.281
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.281
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.190
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.585
mAP=25.382%

--------------------------------
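
The COCO summary above can be turned into numbers for comparing runs. The following is a minimal sketch, not part of the CK workflow: the function name `parse_coco_summary` and the regular expression are assumptions written against the output format shown here, and the sample text is a subset of the lines above.

```python
import re

# A few lines copied from the summary printed above.
SUMMARY = """\
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.254
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.377
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.281
mAP=25.382%
"""

# Hypothetical pattern matching one "Average Precision/Recall" line.
LINE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ IoU=(\S+)\s*"
    r"\| area=\s*(\w+) \| maxDets=\s*(\d+) \] = ([\d.]+)"
)


def parse_coco_summary(text):
    """Return ({(kind, iou, area, max_dets): value}, mAP in percent or None)."""
    metrics = {}
    for match in LINE.finditer(text):
        kind, iou, area, max_dets, value = match.groups()
        metrics[(kind, iou, area, int(max_dets))] = float(value)
    map_match = re.search(r"mAP=([\d.]+)%", text)
    map_percent = float(map_match.group(1)) if map_match else None
    return metrics, map_percent


metrics, map_percent = parse_coco_summary(SUMMARY)
print(metrics[("AP", "0.50:0.95", "all", 100)], map_percent)
```

The final `mAP` line is the first AP entry (IoU=0.50:0.95, all areas, maxDets=100) expressed as a percentage with more digits.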

Preprocessed using Pillow¶

arjun@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-pillow,side.300 \
--dep_add_tags.library=tflite,via-cmake,v2.4 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.python=v3.7 --dep_add_tags.tool-coco=needs-python-3.7.5 \
--env.CK_LOADGEN_MODE=AccuracyOnly --env.CK_LOADGEN_SCENARIO=SingleStream \
--env.CK_LOADGEN_DATASET_SIZE=500 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf.object-detection.ssd-mobilenet.tflite.accuracy.using-pillow.500 \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy,using-pillow,500
...
Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/anton/CK_TOOLS/mlperf-inference-r1.0/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: SingleStream
LoadGen Mode: AccuracyOnly
CBllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll
QpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQppQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQppQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQp
U
  (post processing via CK (/home/anton/CK/ck-mlperf/script/object-detection, loadgen_postprocess)

...
--------------------------------
loading annotations into memory...
Done (t=0.87s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.02s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=2.33s).
Accumulating evaluation results...
DONE (t=1.15s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.245
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.362
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.268
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.014
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.177
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.517
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.223
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.271
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.272
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.016
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.186
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.561
mAP=24.475%

--------------------------------

5000 samples¶

Preprocessed using OpenCV¶

anton@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-opencv,side.300 \
--dep_add_tags.library=tflite,via-cmake,v2.4 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.python=v3.7 --dep_add_tags.tool-coco=needs-python-3.7.5 \
--env.CK_LOADGEN_MODE=AccuracyOnly --env.CK_LOADGEN_SCENARIO=SingleStream \
--env.CK_LOADGEN_DATASET_SIZE=5000 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf.object-detection.ssd-mobilenet.tflite.accuracy.using-opencv.5000 \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy,using-opencv,5000
...
Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/anton/CK_TOOLS/mlperf-inference-r1.0/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: SingleStream
LoadGen Mode: AccuracyOnly
CBl...
...
U
  (post processing via CK (/home/anton/CK/ck-mlperf/script/object-detection, loadgen_postprocess)

...
--------------------------------
loading annotations into memory...
Done (t=0.89s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.23s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=24.71s).
Accumulating evaluation results...
DONE (t=4.14s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.231
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.350
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.254
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.167
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.529
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.209
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.263
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.264
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.191
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.604
mAP=23.137%

--------------------------------

Preprocessed using Pillow¶

anton@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-pillow,side.300 \
--dep_add_tags.library=tflite,via-cmake,v2.4 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.python=v3.7 --dep_add_tags.tool-coco=needs-python-3.7.5 \
--env.CK_LOADGEN_MODE=AccuracyOnly --env.CK_LOADGEN_SCENARIO=SingleStream \
--env.CK_LOADGEN_DATASET_SIZE=5000 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf.object-detection.ssd-mobilenet.tflite.accuracy.using-pillow.5000 \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy,using-pillow,5000
...
Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/anton/CK_TOOLS/mlperf-inference-r1.0/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: SingleStream
LoadGen Mode: AccuracyOnly
CBl
...
U
  (post processing via CK (/home/anton/CK/ck-mlperf/script/object-detection, loadgen_postprocess)


--------------------------------
--------------------------------
loading annotations into memory...
Done (t=0.89s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.21s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=23.94s).
Accumulating evaluation results...
DONE (t=3.82s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.224
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.341
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.247
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.015
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.160
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.515
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.203
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.255
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.255
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.019
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.182
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.593
mAP=22.350%

--------------------------------

Offline¶

50 samples¶

Preprocessed using OpenCV¶

anton@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-opencv,side.300 \
--dep_add_tags.library=tflite,via-cmake,v2.4 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.python=v3.7 --dep_add_tags.tool-coco=needs-python-3.7.5 \
--env.CK_LOADGEN_MODE=AccuracyOnly --env.CK_LOADGEN_SCENARIO=Offline \
--env.CK_LOADGEN_DATASET_SIZE=50 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf.object-detection.ssd-mobilenet.tflite.accuracy.using-opencv.50.offline \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy,using-opencv,50,offline
...
Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/anton/CK_TOOLS/mlperf-inference-r1.0/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: Offline
LoadGen Mode: AccuracyOnly
CBllllllllllllllllllllllllllllllllllllllllllllllllll
Qpppppppppppppppppppppppppppppppppppppppppppppppppp
U
  (post processing via CK (/home/anton/CK/ck-mlperf/script/object-detection, loadgen_postprocess)


--------------------------------
loading annotations into memory...
Done (t=0.97s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.02s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.25s).
Accumulating evaluation results...
DONE (t=0.44s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.256
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.409
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.272
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.026
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.226
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.610
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.223
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.268
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.270
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.030
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.228
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.644
mAP=25.554%

--------------------------------

500 samples¶

Preprocessed using OpenCV¶

anton@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-opencv,side.300 \
--dep_add_tags.library=tflite,via-cmake,v2.4 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.python=v3.7 --dep_add_tags.tool-coco=needs-python-3.7.5 \
--env.CK_LOADGEN_MODE=AccuracyOnly --env.CK_LOADGEN_SCENARIO=Offline \
--env.CK_LOADGEN_DATASET_SIZE=500 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf.object-detection.ssd-mobilenet.tflite.accuracy.using-opencv.500.offline \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy,using-opencv,500,offline
...
Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/anton/CK_TOOLS/mlperf-inference-r1.0/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: Offline
LoadGen Mode: AccuracyOnly
CBllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll
Qpppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppp
U
  (post processing via CK (/home/anton/CK/ck-mlperf/script/object-detection, loadgen_postprocess)


--------------------------------
loading annotations into memory...
Done (t=0.89s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.03s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=2.42s).
Accumulating evaluation results...
DONE (t=1.12s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.254
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.377
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.280
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.020
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.181
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.543
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.228
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.281
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.281
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.190
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.585
mAP=25.382%

--------------------------------

5000 samples¶

Preprocessed using OpenCV¶

anton@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--speed --repetitions=1 --skip_print_timers --skip_stat_analysis \
--dep_add_tags.dataset=preprocessed,using-opencv,side.300 \
--dep_add_tags.library=tflite,via-cmake,v2.4 \
--dep_add_tags.weights=ssd-mobilenet \
--dep_add_tags.python=v3.7 --dep_add_tags.tool-coco=needs-python-3.7.5 \
--env.CK_LOADGEN_MODE=AccuracyOnly --env.CK_LOADGEN_SCENARIO=Offline \
--env.CK_LOADGEN_DATASET_SIZE=5000 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf.object-detection.ssd-mobilenet.tflite.accuracy.using-opencv.5000.offline \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,accuracy,using-opencv,5000,offline
...
Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/anton/CK_TOOLS/mlperf-inference-r1.0/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: Offline
LoadGen Mode: AccuracyOnly
CBl...
...
U
  (post processing via CK (/home/anton/CK/ck-mlperf/script/object-detection, loadgen_postprocess)


--------------------------------
loading annotations into memory...
Done (t=0.90s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.22s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=24.21s).
Accumulating evaluation results...
DONE (t=3.96s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.231
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.350
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.254
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.167
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.529
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.209
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.263
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.264
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.191
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.604
mAP=23.137%

--------------------------------

Benchmarking performance¶

Single Stream¶

A valid SingleStream performance run must reach a) the minimum duration of 600 seconds (NB: increased from 60 seconds for v1.0), and b) the minimum of 1,024 queries. Increasing the expected SingleStream target latency in user.conf from 10 milliseconds to above ~60 milliseconds decreases the number of queries that LoadGen issues from 6,000 (actually, 12,000 to account for variability) to 1,024. These figures correspond to the earlier 60-second minimum duration: 60 s / 10 ms = 6,000 queries, and 60 s / 1,024 queries ≈ 59 ms. Note that it does not matter whether the expected latency is, say, 100 ms or 1000 ms, as long as it is above ~60 ms.
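
The query-count arithmetic can be sketched in a few lines. This is a simplified model, assuming LoadGen needs enough queries to fill the minimum duration at the target latency, with the minimum query count as a floor. The function name `expected_query_count` is hypothetical, the default of 60 seconds matches `min_duration (ms): 60000` in the summary log in this section, and the doubling "to account for variability" is left out.

```python
import math


def expected_query_count(target_latency_ms, min_duration_s=60, min_query_count=1024):
    """Queries needed to fill min_duration_s at the target latency,
    never fewer than min_query_count (simplified model, see assumptions)."""
    queries_for_duration = math.ceil(min_duration_s * 1000 / target_latency_ms)
    return max(min_query_count, queries_for_duration)


for latency_ms in (10, 60, 100, 1000):
    print(f"{latency_ms:5d} ms -> {expected_query_count(latency_ms)} queries")
```

Under this model, any target latency above 60 s / 1,024 (about 59 ms) hits the 1,024-query floor, which is why 100 ms and 1000 ms give the same count.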

`xavier`¶

arjun@xavier:~$ ck benchmark program:object-detection-tflite-loadgen \
--repetitions=1 --skip_print_timers --skip_stat_analysis \
--env.CK_METRIC_TYPE=COCO \
--env.CK_LOADGEN_MODE=PerformanceOnly --env.CK_LOADGEN_SCENARIO=SingleStream \
--env.CK_LOADGEN_DATASET_SIZE=500 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf-object-detection-ssd-mobilenet-tflite-performance \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,performance

...

Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/arjun/CK-TOOLS/mlperf-inference-r0.7/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: SingleStream
LoadGen Mode: PerformanceOnly
CBllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll
QpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQp
QpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQpQp

------------------------------------------------------------
|            LATENCIES (in nanoseconds and fps)            |
------------------------------------------------------------
Number of queries run: 1843
Min latency:                      26793327ns  (37.3227 fps)
Median latency:                   31173978ns  (32.078 fps)
Average latency:                  32579448ns  (30.6942 fps)
90 percentile latency:            36719640ns  (27.2334 fps)
Max latency:                      57667165ns  (17.3409 fps)
------------------------------------------------------------
U
  (post processing via CK (/home/arjun/CK/ck-mlperf/script/object-detection, loadgen_postprocess)


--------------------------------
--------------------------------


  (reading fine grain timers from tmp-ck-timer.json ...)


Execution time: 0.000 sec.
***************************************************************************************

arjun@xavier:~$ cat $(ck find program:object-detection-tflite-loadgen)/tmp/mlperf_log_summary.txt
================================================
MLPerf Results Summary
================================================
SUT name : TFLite_SUT
Scenario : Single Stream
Mode     : Performance
90th percentile latency (ns) : 36719640
Result is : VALID
  Min duration satisfied : Yes
  Min queries satisfied : Yes

================================================
Additional Stats
================================================
QPS w/ loadgen overhead         : 30.68
QPS w/o loadgen overhead        : 30.69

Min latency (ns)                : 26793327
Max latency (ns)                : 57667165
Mean latency (ns)               : 32579448
50.00 percentile latency (ns)   : 31173978
90.00 percentile latency (ns)   : 36719640
95.00 percentile latency (ns)   : 39314159
97.00 percentile latency (ns)   : 41228743
99.00 percentile latency (ns)   : 44026890
99.90 percentile latency (ns)   : 54121338

================================================
Test Parameters Used
================================================
samples_per_query : 1
target_qps : 100
target_latency (ns): 0
max_async_queries : 1
min_duration (ms): 60000
max_duration (ms): 0
min_query_count : 1024
max_query_count : 0
qsl_rng_seed : 12786827339337101903
sample_index_rng_seed : 12640797754436136668
schedule_rng_seed : 3135815929913719677
accuracy_log_rng_seed : 0
accuracy_log_probability : 0
accuracy_log_sampling_target : 0
print_timestamps : false
performance_issue_unique : false
performance_issue_same : false
performance_issue_same_index : 0
performance_sample_count : 256

No warnings encountered during test.

No errors encountered during test.
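
To check a run from a script, the `key : value` lines of `mlperf_log_summary.txt` can be read into a dictionary. The following is a minimal sketch written against the layout shown above, not a LoadGen API: the function name `parse_summary` is hypothetical, and the sample text is a subset of the summary.

```python
def parse_summary(text):
    """Collect 'key : value' lines into a dict, skipping banners and blanks."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# A few lines copied from the summary above.
SAMPLE = """\
================================================
MLPerf Results Summary
================================================
Scenario : Single Stream
90th percentile latency (ns) : 36719640
Result is : VALID
  Min duration satisfied : Yes
  Min queries satisfied : Yes
"""

fields = parse_summary(SAMPLE)
p90_ns = int(fields["90th percentile latency (ns)"])
p90_fps = 1e9 / p90_ns  # nanoseconds per query -> queries per second
print(fields["Result is"], f"{p90_fps:.4f} fps")
```

The conversion `1e9 / latency_ns` reproduces the fps figures printed next to each latency in the LATENCIES table.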

`rpi4coral`¶

Offline¶

A valid Offline performance run must reach a) the minimum duration of 60 seconds (NB: increased to 600 seconds for v1.0), and b) the minimum of 24,576 samples.

`rpi4coral`¶

arjun@rpi4coral:~$ ck benchmark program:object-detection-tflite-loadgen \
--repetitions=1 --skip_print_timers --skip_stat_analysis \
--env.CK_METRIC_TYPE=COCO \
--env.CK_LOADGEN_MODE=PerformanceOnly --env.CK_LOADGEN_SCENARIO=Offline \
--env.CK_LOADGEN_DATASET_SIZE=5 --env.CK_LOADGEN_BUFFER_SIZE=1024 \
--env.CK_LOADGEN_TARGET_QPS=1 \
--record --record_repo=local --process_multi_keys \
--record_uoa=mlperf-object-detection-ssd-mobilenet-tflite-performance \
--tags=mlperf,object-detection,ssd-mobilenet,tflite,performance

  (run ...)
executing code ...
Graph file: /home/arjun/CK-TOOLS/model-tflite-mlperf-ssd-mobilenet-downloaded-from-zenodo/detect_regular_nms.tflite
Image dir: /home/arjun/CK-TOOLS/dataset-object-detection-preprocessed-using-pillow-coco.2017-first.20-side.300
Image list: original_dimensions.txt
Image size: 300*300
Image channels: 3
Result dir: detections
Batch count: 1
Batch size: 1
Normalize: 1
Subtract mean: 0
Image count in file: 5
Graph file: /home/arjun/CK-TOOLS/model-tflite-mlperf-ssd-mobilenet-downloaded-from-zenodo/detect_regular_nms.tflite
Image dir: /home/arjun/CK-TOOLS/dataset-object-detection-preprocessed-using-pillow-coco.2017-first.20-side.300
Image list: original_dimensions.txt
Image size: 300*300
Image channels: 3
Result dir: detections
Batch count: 1
Batch size: 1
Normalize: 1
Subtract mean: 0
Image count in file: 5

Loading graph...
Loaded model /home/arjun/CK-TOOLS/model-tflite-mlperf-ssd-mobilenet-downloaded-from-zenodo/detect_regular_nms.tflite
resolved reporter

Number of threads: 4
tensors size: 213
nodes size: 64
number of inputs: 1
number of outputs: 4
input(0) name: normalized_input_image_tensor

Input tensor dimensions (NHWC): 1*300*300*3
Detection boxes tensor dimensions: 1*100*4
Detection classes tensor dimensions: 1*100
Detection scores tensor dimensions: 1*100
Number of detections tensor dimensions: 1*1
Path to mlperf.conf : /home/arjun/CK-TOOLS/mlperf-inference-r0.7/inference/mlperf.conf
Path to user.conf : user.conf
Model Name: ssd-mobilenet
LoadGen Scenario: Offline
LoadGen Mode: PerformanceOnly
CBlllll
Qpppppppppppppppppppppppppppppppp......

------------------------------------------------------------
|            LATENCIES (in nanoseconds and fps)            |
------------------------------------------------------------
Number of queries run: 24576
Min latency:                      5537079183743ns  (0.000180601 fps)
Median latency:                   5537079183743ns  (0.000180601 fps)
Average latency:                  5537079183743ns  (0.000180601 fps)
90 percentile latency:            5537079183743ns  (0.000180601 fps)
Max latency:                      5537079183743ns  (0.000180601 fps)
------------------------------------------------------------
U
  (post processing via CK (/home/arjun/CK/ck-mlperf/script/object-detection, loadgen_postprocess)



--------------------------------
--------------------------------
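
The fps column in the Offline table above is simply the reciprocal of the reported latency, not the throughput. A rough throughput estimate can be sketched as below, assuming all 24,576 samples were issued together and completed within the single reported latency (every row shows the same value). The function name `offline_throughput` is hypothetical.

```python
SAMPLES = 24576               # "Number of queries run" in the log above
ELAPSED_NS = 5537079183743    # latency reported on every row of the table


def offline_throughput(samples, elapsed_ns):
    """Samples per second, assuming all samples finished within elapsed_ns."""
    return samples / (elapsed_ns / 1e9)


per_sample_fps = 1e9 / ELAPSED_NS  # what the table prints as "fps"
print(f"table fps:  {per_sample_fps:.9f}")
print(f"throughput: {offline_throughput(SAMPLES, ELAPSED_NS):.2f} samples/s")
```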
