%matplotlib inline
%reload_ext autoreload
%autoreload 2
from fastai.conv_learner import *
from fastai.dataset import *
from pathlib import Path
import json
from PIL import ImageDraw, ImageFont
from matplotlib import patches, patheffects
# torch.cuda.set_device(0)
We will be looking at the Pascal VOC dataset. It's quite slow, so you may prefer to download from this mirror. There are two different competition/research datasets, from 2007 and 2012. We'll be using the 2007 version. You can use the larger 2012 for better results, or even combine them (but be careful to avoid data leakage between the validation sets if you do this).
Unlike previous lessons, we are using the Python 3 standard library pathlib for our paths and file access. Note that it returns an OS-specific class (on Linux, PosixPath) so your output may look a little different. Most libraries that take paths as input can take a pathlib object - although some (like cv2) can't, in which case you can use str() to convert it to a string.
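For instance, here's a quick sketch of the pathlib operations we'll rely on below (the path is just for illustration):

from pathlib import Path

p = Path('data/pascal') / 'pascal_train2007.json'  # '/' joins path components
print(p.suffix, p.stem)  # .json pascal_train2007
print(str(p))            # a plain string, for libraries like cv2 that can't take Path objects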
!pwd
/home/ubuntu/fastai/courses/dl2
!ln -s ~/data data
!ls -la data/
total 837144
drwxrwxr-x  4 ubuntu ubuntu      4096 May 20 05:28 .
drwxr-xr-x 24 ubuntu ubuntu      4096 May 20 15:01 ..
drwxrwxr-x  8 ubuntu ubuntu      4096 May 13 13:09 dogscats
-rw-rw-r--  1 ubuntu ubuntu 857214334 Apr  1  2017 dogscats.zip
drwxrwxr-x  2 ubuntu ubuntu      4096 May 20 06:02 spellbee
%cd data
/home/ubuntu/data
%mkdir pascal
%cd pascal/
/home/ubuntu/data/pascal
!aria2c --file-allocation=none -c -x 5 -s 5 http://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
[#8be4bd 431MiB/438MiB(98%) CN:1 DL:34MiB]
06/15 21:53:04 [NOTICE] Download complete: /home/ubuntu/data/VOCtrainval_06-Nov-2007.tar

Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
8be4bd|OK  |    33MiB/s|/home/ubuntu/data/VOCtrainval_06-Nov-2007.tar

Status Legend:
(OK):download completed.
!aria2c --file-allocation=none -c -x 5 -s 5 https://storage.googleapis.com/coco-dataset/external/PASCAL_VOC.zip
06/15 21:54:27 [NOTICE] Download complete: /home/ubuntu/data/PASCAL_VOC.zip

Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
1350c8|OK  |   8.4MiB/s|/home/ubuntu/data/PASCAL_VOC.zip

Status Legend:
(OK):download completed.
!tar -xf VOCtrainval_06-Nov-2007.tar
!unzip PASCAL_VOC.zip
Archive:  PASCAL_VOC.zip
   creating: PASCAL_VOC/
  inflating: PASCAL_VOC/pascal_test2007.json
  inflating: PASCAL_VOC/pascal_train2007.json
  inflating: PASCAL_VOC/pascal_train2012.json
  inflating: PASCAL_VOC/pascal_val2007.json
  inflating: PASCAL_VOC/pascal_val2012.json
%mv PASCAL_VOC/*.json .
%rmdir PASCAL_VOC
%ls -la
total 462072
drwxrwxr-x 3 ubuntu ubuntu      4096 Jun 15 22:01 ./
drwxrwxr-x 5 ubuntu ubuntu      4096 Jun 15 21:56 ../
-rw-r--r-- 1 ubuntu ubuntu   2584743 Jul  7  2015 pascal_test2007.json
-rw-r--r-- 1 ubuntu ubuntu   1346236 Aug 19  2015 pascal_train2007.json
-rw-r--r-- 1 ubuntu ubuntu   2912167 Aug 19  2015 pascal_train2012.json
-rw-r--r-- 1 ubuntu ubuntu   1342257 Jul  7  2015 pascal_val2007.json
-rw-r--r-- 1 ubuntu ubuntu   2922699 Aug 19  2015 pascal_val2012.json
-rw-rw-r-- 1 ubuntu ubuntu   1998182 Jun 15 21:54 PASCAL_VOC.zip
drwxrwxr-x 3 ubuntu ubuntu      4096 Nov  6  2007 VOCdevkit/
-rw-rw-r-- 1 ubuntu ubuntu 460032000 Jun 15 21:53 VOCtrainval_06-Nov-2007.tar
%cd ~/fastai/courses/dl2
/home/ubuntu/fastai/courses/dl2
PATH = Path('data/pascal')
list(PATH.iterdir())
[PosixPath('data/pascal/pascal_train2012.json'), PosixPath('data/pascal/VOCtrainval_06-Nov-2007.tar'), PosixPath('data/pascal/pascal_train2007.json'), PosixPath('data/pascal/models'), PosixPath('data/pascal/VOCdevkit'), PosixPath('data/pascal/pascal_val2007.json'), PosixPath('data/pascal/pascal_test2007.json'), PosixPath('data/pascal/pascal_val2012.json'), PosixPath('data/pascal/PASCAL_VOC.zip'), PosixPath('data/pascal/tmp')]
As well as the images, there are also annotations - bounding boxes showing where each object is. These were hand labeled. The original annotations were in XML, which is a little hard to work with nowadays, so we use the more recent JSON version which you can download from this link.
You can see here how pathlib includes the ability to open files (amongst many other capabilities).
trn_j = json.load( (PATH / 'pascal_train2007.json').open() )
trn_j.keys()
dict_keys(['images', 'type', 'annotations', 'categories'])
IMAGES, ANNOTATIONS, CATEGORIES = ['images', 'annotations', 'categories']
trn_j[IMAGES][:5]
[{'file_name': '000012.jpg', 'height': 333, 'width': 500, 'id': 12}, {'file_name': '000017.jpg', 'height': 364, 'width': 480, 'id': 17}, {'file_name': '000023.jpg', 'height': 500, 'width': 334, 'id': 23}, {'file_name': '000026.jpg', 'height': 333, 'width': 500, 'id': 26}, {'file_name': '000032.jpg', 'height': 281, 'width': 500, 'id': 32}]
trn_j[ANNOTATIONS][:2]
[{'segmentation': [[155, 96, 155, 270, 351, 270, 351, 96]], 'area': 34104, 'iscrowd': 0, 'image_id': 12, 'bbox': [155, 96, 196, 174], 'category_id': 7, 'id': 1, 'ignore': 0}, {'segmentation': [[184, 61, 184, 199, 279, 199, 279, 61]], 'area': 13110, 'iscrowd': 0, 'image_id': 17, 'bbox': [184, 61, 95, 138], 'category_id': 15, 'id': 2, 'ignore': 0}]
trn_j[CATEGORIES][:8]
[{'supercategory': 'none', 'id': 1, 'name': 'aeroplane'}, {'supercategory': 'none', 'id': 2, 'name': 'bicycle'}, {'supercategory': 'none', 'id': 3, 'name': 'bird'}, {'supercategory': 'none', 'id': 4, 'name': 'boat'}, {'supercategory': 'none', 'id': 5, 'name': 'bottle'}, {'supercategory': 'none', 'id': 6, 'name': 'bus'}, {'supercategory': 'none', 'id': 7, 'name': 'car'}, {'supercategory': 'none', 'id': 8, 'name': 'cat'}]
It's helpful to use constants instead of strings, since we get tab-completion and don't mistype.
FILE_NAME, ID, IMG_ID, CAT_ID, BBOX = 'file_name', 'id', 'image_id', 'category_id', 'bbox'
cats = { o[ID]:o["name"] for o in trn_j[CATEGORIES] }
trn_fns = { o[ID]:o[FILE_NAME] for o in trn_j[IMAGES] }
trn_ids = { o[ID] for o in trn_j[IMAGES] }
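As a tiny illustration of the payoff of those constants (toy dict, hypothetical typo):

ann = {'category_id': 7, 'image_id': 12}
print(ann[CAT_ID])          # editors tab-complete CAT_ID, and a typo in it raises NameError immediately
# print(ann['catgory_id'])  # a typo inside a raw string only fails at runtime, with a KeyError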
list( (PATH / 'VOCdevkit/VOC2007').iterdir() )
[PosixPath('data/pascal/VOCdevkit/VOC2007/JPEGImages'), PosixPath('data/pascal/VOCdevkit/VOC2007/SegmentationClass'), PosixPath('data/pascal/VOCdevkit/VOC2007/Annotations'), PosixPath('data/pascal/VOCdevkit/VOC2007/SegmentationObject'), PosixPath('data/pascal/VOCdevkit/VOC2007/ImageSets')]
JPEGS = 'VOCdevkit/VOC2007/JPEGImages'
IMG_PATH = PATH / JPEGS
list( IMG_PATH.iterdir() )[:5]
[PosixPath('data/pascal/VOCdevkit/VOC2007/JPEGImages/001688.jpg'), PosixPath('data/pascal/VOCdevkit/VOC2007/JPEGImages/007189.jpg'), PosixPath('data/pascal/VOCdevkit/VOC2007/JPEGImages/003408.jpg'), PosixPath('data/pascal/VOCdevkit/VOC2007/JPEGImages/001604.jpg'), PosixPath('data/pascal/VOCdevkit/VOC2007/JPEGImages/000729.jpg')]
Each image has a unique ID.
im0_d = trn_j[IMAGES][0]
im0_d
{'file_name': '000012.jpg', 'height': 333, 'width': 500, 'id': 12}
im0_d[FILE_NAME], im0_d[ID]
('000012.jpg', 12)
A defaultdict is useful any time you want to have a default dictionary entry for new keys. Here we create a dict from image IDs to a list of annotations (tuple of bounding box and class id).
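A minimal sketch of the behaviour we rely on here (keys spring into existence with an empty list):

import collections
import numpy as np

anno = collections.defaultdict(list)  # same effect as defaultdict(lambda: [])
anno[12].append((np.array([96, 155, 269, 350]), 7))  # key 12 is created automatically
print(anno[12])
print(anno[99])  # note: merely reading a missing key also inserts (and returns) an empty list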
We convert VOC's height/width into top-left/bottom-right, and switch x/y coords to be consistent with numpy.
def hw_bb(bb):
    # VOC bbox format: [x (left column), y (top row), width, height]
    #              ix:   0               1            2      3
    # Example, bb = [155, 96, 196, 174]
    return np.array([ bb[1], bb[0], bb[3] + bb[1] - 1, bb[2] + bb[0] - 1 ])
bb = [155, 96, 196, 174]
bb[1], bb[0], bb[3] + bb[1] - 1, bb[2] + bb[0] - 1
(96, 155, 269, 350)
trn_anno = collections.defaultdict(lambda:[])
for o in trn_j[ANNOTATIONS]:
if not o['ignore']:
bb = o[BBOX] # one bbox. looks like '[155, 96, 196, 174]'.
bb = hw_bb(bb)
trn_anno[o[IMG_ID]].append( (bb, o[CAT_ID]) )
len(trn_anno)
2501
# Test getting the first element from dict_values
list(trn_anno.values())[0]
[(array([ 96, 155, 269, 350]), 7)]
print(im0_d[ID])
im_a = trn_anno[im0_d[ID]]
im_a
12
[(array([ 96, 155, 269, 350]), 7)]
im0_a = im_a[0] # get first item (first bbox) from list. note: possible to have more than one bbox per image.
im0_a
(array([ 96, 155, 269, 350]), 7)
cats[7]
'car'
trn_anno[17]
[(array([ 61, 184, 198, 278]), 15), (array([ 77, 89, 335, 402]), 13)]
cats[15], cats[13]
('person', 'horse')
Some libs take VOC format bounding boxes, so this lets us convert back when required:
bb_voc = [155, 96, 196, 174]
bb_fastai = hw_bb(bb_voc)
bb_fastai
array([ 96, 155, 269, 350])
def bb_hw(a):
return np.array( [ a[1], a[0], a[3] - a[1] + 1, a[2] - a[0] + 1 ] )
f'expected: {bb_voc}, actual: {bb_hw(bb_fastai)}'
'expected: [155, 96, 196, 174], actual: [155 96 196 174]'
You can use Visual Studio Code (vscode - an open source editor that comes with recent versions of Anaconda, or can be installed separately), or most editors and IDEs, to find out all about the open_image function.
im = open_image(IMG_PATH / im0_d[FILE_NAME])
Matplotlib's plt.subplots
is a really useful wrapper for creating plots, regardless of whether you have more than one subplot. Note that Matplotlib has an optional object-oriented API which I think is much easier to understand and use (although few examples online use it!)
def show_img(im, figsize=None, ax=None):
if not ax:
fig, ax = plt.subplots(figsize=figsize)
ax.imshow(im)
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
return ax
A simple but rarely used trick to make text visible regardless of background is to use white text with a black outline, or vice versa. Here's how to do it in matplotlib.
def draw_outline(o, lw):
o.set_path_effects( [patheffects.Stroke(linewidth=lw, foreground='black'),
patheffects.Normal()] )
Note that * in argument lists is the splat operator. In this case it's a little shortcut compared to writing out b[-2], b[-1].
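A tiny illustration with hypothetical values, before we use it in draw_rect below:

b = [96, 155, 174, 195]  # hypothetical [y, x, height, width] box

def rect(xy, w, h):
    return (xy, w, h)

print(rect(b[:2], *b[-2:]))  # *b[-2:] expands into the two positional args w and h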
def draw_rect(ax, b):
patch = ax.add_patch(patches.Rectangle(b[:2], *b[-2:], fill=False, edgecolor='white', lw=2))
draw_outline(patch, 4)
def draw_text(ax, xy, txt, sz=14):
text = ax.text(*xy, txt, verticalalignment='top', color='white', fontsize=sz, weight='bold')
draw_outline(text, 1)
ax = show_img(im)
b = bb_hw(im0_a[0]) # convert bbox back to VOC format
draw_rect(ax, b)
draw_text(ax, b[:2], cats[im0_a[1]])
def draw_im(im, ann):
# im is image, ann is annotations
ax = show_img(im, figsize=(16, 8))
for b, c in ann:
# b is bbox, c is class id
b = bb_hw(b)
draw_rect(ax, b)
draw_text(ax, b[:2], cats[c], sz=16)
def draw_idx(i):
# i is image id
im_a = trn_anno[i] # training annotations
im = open_image(IMG_PATH / trn_fns[i]) # trn_fns is training image file names
print(im.shape)
draw_im(im, im_a) # im_a is an element of annotation
draw_idx(17) # image id is 17
(364, 480, 3)
A lambda function is simply a way to define an anonymous function inline. Here we use it to describe how to sort the annotation for each image - by bounding box size (descending).
def get_lrg(b):
    if not b:
        raise Exception()
    # x is a tuple, e.g.: (array([96, 155, 269, 350]), 16)
    # x[0] is the bbox as a numpy array, e.g.: [96 155 269 350]
    # x[0][:2] is the top-left corner, e.g.: [96 155]
    # x[0][-2:] is the bottom-right corner, e.g.: [269 350]
    # np.product(x[0][-2:] - x[0][:2]) is the bbox area (a scalar), e.g.: 33735
    b = sorted(b, key=lambda x: np.product(x[0][-2:] - x[0][:2]), reverse=True)
    return b[0]  # the first element of the sorted list is the largest bbox for the image
# Debugging code
np_prod = np.product(np.array([269, 350]) - np.array([96, 155]))
minus_mul = (269 - 96) * (350 - 155)  # bbox area: height x width
print(np_prod)
assert np_prod == minus_mul
33735
# for k, v in trn_anno.items():
# print(f"k: {k}, v: {v}")
# a is an image id (int), b is a list of (bbox numpy array, class id) tuples
trn_lrg_anno = { a: get_lrg(b) for a, b in trn_anno.items() if (a != 0 and a != 1) }
trn_lrg_anno[23]
(array([ 1, 2, 461, 242]), 15)
Now we have a dictionary from image id to a single bounding box - the largest for that image.
def draw_largest_bbox(img_id):
    b, c = trn_lrg_anno[img_id]  # trn_lrg_anno[img_id] is a (bbox, class id) tuple; destructure it
    print(f'### DEBUG ### bbox: {b.tolist()}, class id: {c}')  # print the numpy array via its tolist method
    b = bb_hw(b)  # convert the fastai bbox back to VOC format
    ax = show_img(open_image(IMG_PATH / trn_fns[img_id]), figsize=(5, 10))
    draw_rect(ax, b)
    draw_text(ax, b[:2], cats[c], sz=16)
img_id = 695
draw_largest_bbox(img_id)
### DEBUG ### bbox: [125, 108, 365, 414], class id: 13
(PATH / 'tmp').mkdir(exist_ok=True)
CSV = PATH / 'tmp/lrg.csv'
Often it's easiest to simply create a CSV of the data you want to model, rather than trying to create a custom dataset. Here we use Pandas to help us create a CSV of the image filename and class.
df = pd.DataFrame({ 'fn': [trn_fns[o] for o in trn_ids],
'cat': [cats[trn_lrg_anno[o][1]] for o in trn_ids] }, columns=['fn', 'cat'])
df.to_csv(CSV, index=False)
f_model = resnet34
sz = 224
bs = 64
From here it's just like Dogs vs Cats!
tfms = tfms_from_model(f_model, sz, aug_tfms=transforms_side_on, crop_type=CropType.NO)
md = ImageClassifierData.from_csv(PATH, JPEGS, CSV, tfms=tfms, bs=bs)
x, y = next(iter(md.val_dl))
show_img(md.val_ds.denorm(to_np(x))[0])
<matplotlib.axes._subplots.AxesSubplot at 0x7f64b9d0ba58>
learn = ConvLearner.pretrained(f_model, md, metrics=[accuracy])
learn.opt_fn = optim.Adam
lrf = learn.lr_find(1e-5, 100)
78%|███████▊ | 25/32 [00:11<00:03, 2.09it/s, loss=13.1]
When your LR finder graph looks like this, you can ask for more points on each end:
learn.sched.plot()
learn.sched.plot(n_skip=5, n_skip_end=1)
lr = 2e-2
learn.fit(lr, 1, cycle_len=1)
epoch  trn_loss  val_loss  accuracy
0      1.259122  0.782595  0.78
[array([0.78259]), 0.779999997138977]
lrs = np.array([lr/1000, lr/100, lr])
learn.freeze_to(-2)
lrf = learn.lr_find(lrs/1000)
learn.sched.plot(1)
84%|████████▍ | 27/32 [00:17<00:03, 1.52it/s, loss=5.12]
learn.fit(lrs/5, 1, cycle_len=1)
epoch  trn_loss  val_loss  accuracy
0      0.789873  0.674313  0.788
[array([0.67431]), 0.7879999985694885]
learn.unfreeze()
Accuracy isn't improving much - since many images have multiple different objects, it's going to be impossible to be that accurate.
learn.fit(lrs/5, 1, cycle_len=2)
HBox(children=(IntProgress(value=0, description='Epoch', max=2), HTML(value='')))
epoch  trn_loss  val_loss  accuracy
0      0.600366  0.672303  0.794
1      0.444746  0.691367  0.786
[array([0.69137]), 0.786]
learn.save('clas_one')
learn.load('clas_one')
x, y = next(iter(md.val_dl))
probs = F.softmax(predict_batch(learn.model, x), -1)
x, preds = to_np(x), to_np(probs)
preds = np.argmax(preds, -1)
You can use the Python debugger pdb to step through code.

  * pdb.set_trace() to set a breakpoint
  * %debug magic to trace an error

Commands you need to know:

  * s / n / c - step into / step over (next) / continue
  * u / d - move up / down the call stack
  * p - print a variable
  * l - list the source around the current line
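A sketch of where set_trace typically goes (hypothetical function):

import pdb

def buggy_sum(xs):
    total = 0
    for x in xs:
        pdb.set_trace()  # execution pauses here; try p x, p total, then n or c
        total += x
    return total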
fig, axes = plt.subplots(3, 4, figsize=(12, 8))
for i, ax in enumerate(axes.flat):
ima = md.val_ds.denorm(x)[i]
b = md.classes[preds[i]]
ax = show_img(ima, ax=ax)
draw_text(ax, (0, 0), b)
plt.tight_layout()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
It's doing a pretty good job of classifying the largest object!
Now we'll try to find the bounding box of the largest object. This is simply a regression with 4 outputs. So we can use a CSV with multiple 'labels'.
BB_CSV = PATH / 'tmp/bb.csv'
bb = np.array([ trn_lrg_anno[o][0] for o in trn_ids ])
bbs = [' '.join( str(p) for p in o ) for o in bb]
df = pd.DataFrame({
'fn': [ trn_fns[o] for o in trn_ids ],
'bbox': bbs
}, columns=['fn', 'bbox'])
df.to_csv(BB_CSV, index=False)
BB_CSV.open().readlines()[:5] # read up to 5 lines
['fn,bbox\n', '008197.jpg,186 450 226 496\n', '008199.jpg,84 363 374 498\n', '008202.jpg,110 190 371 457\n', '008203.jpg,187 37 359 303\n']
======================================== START OF ASIDE ========================================

The following Pandas data processing pipeline was based on this code snippet, thanks to Phani Srikanth (@binga).

With Pandas, we can do this much more simply than with Python's collections.defaultdict, and quickly get the bounding boxes into fastai CSV format, ready for bounding box regression. The more you get to know Pandas, the more often you realize it's a good way to solve lots of different problems.
with open(PATH / 'pascal_train2007.json') as i:
d = json.load(i)
print(d.keys())
categories = pd.DataFrame(d[CATEGORIES])
annotations = pd.DataFrame(d[ANNOTATIONS])
images = pd.DataFrame(d[IMAGES])
dict_keys(['images', 'type', 'annotations', 'categories'])
images.head()
file_name | height | id | width | |
---|---|---|---|---|
0 | 000012.jpg | 333 | 12 | 500 |
1 | 000017.jpg | 364 | 17 | 480 |
2 | 000023.jpg | 500 | 23 | 334 |
3 | 000026.jpg | 333 | 26 | 500 |
4 | 000032.jpg | 281 | 32 | 500 |
categories.head()
id | name | supercategory | |
---|---|---|---|
0 | 1 | aeroplane | none |
1 | 2 | bicycle | none |
2 | 3 | bird | none |
3 | 4 | boat | none |
4 | 5 | bottle | none |
annotations.head()
area | bbox | category_id | id | ignore | image_id | iscrowd | segmentation | |
---|---|---|---|---|---|---|---|---|
0 | 34104 | [155, 96, 196, 174] | 7 | 1 | 0 | 12 | 0 | [[155, 96, 155, 270, 351, 270, 351, 96]] |
1 | 13110 | [184, 61, 95, 138] | 15 | 2 | 0 | 17 | 0 | [[184, 61, 184, 199, 279, 199, 279, 61]] |
2 | 81326 | [89, 77, 314, 259] | 13 | 3 | 0 | 17 | 0 | [[89, 77, 89, 336, 403, 336, 403, 77]] |
3 | 64227 | [8, 229, 237, 271] | 2 | 4 | 0 | 23 | 0 | [[8, 229, 8, 500, 245, 500, 245, 229]] |
4 | 29505 | [229, 219, 105, 281] | 2 | 5 | 0 | 23 | 0 | [[229, 219, 229, 500, 334, 500, 334, 219]] |
data = (
annotations
.merge(categories, how='left', left_on=CAT_ID, right_on=ID)
.merge(images, how='left', left_on=IMG_ID, right_on=ID)
)
data.head()
area | bbox | category_id | id_x | ignore | image_id | iscrowd | segmentation | id_y | name | supercategory | file_name | height | id | width | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 34104 | [155, 96, 196, 174] | 7 | 1 | 0 | 12 | 0 | [[155, 96, 155, 270, 351, 270, 351, 96]] | 7 | car | none | 000012.jpg | 333 | 12 | 500 |
1 | 13110 | [184, 61, 95, 138] | 15 | 2 | 0 | 17 | 0 | [[184, 61, 184, 199, 279, 199, 279, 61]] | 15 | person | none | 000017.jpg | 364 | 17 | 480 |
2 | 81326 | [89, 77, 314, 259] | 13 | 3 | 0 | 17 | 0 | [[89, 77, 89, 336, 403, 336, 403, 77]] | 13 | horse | none | 000017.jpg | 364 | 17 | 480 |
3 | 64227 | [8, 229, 237, 271] | 2 | 4 | 0 | 23 | 0 | [[8, 229, 8, 500, 245, 500, 245, 229]] | 2 | bicycle | none | 000023.jpg | 500 | 23 | 334 |
4 | 29505 | [229, 219, 105, 281] | 2 | 5 | 0 | 23 | 0 | [[229, 219, 229, 500, 334, 500, 334, 219]] | 2 | bicycle | none | 000023.jpg | 500 | 23 | 334 |
largest_bbox = data.pivot_table(index=FILE_NAME, values='area', aggfunc=max).reset_index()
largest_bbox = largest_bbox.merge(data[['area', BBOX, IMG_ID, FILE_NAME, 'name']], how='left')
largest_bbox.head()
file_name | area | bbox | image_id | name | |
---|---|---|---|---|---|
0 | 000012.jpg | 34104 | [155, 96, 196, 174] | 12 | car |
1 | 000017.jpg | 81326 | [89, 77, 314, 259] | 17 | horse |
2 | 000023.jpg | 111101 | [2, 1, 241, 461] | 23 | person |
3 | 000026.jpg | 21824 | [89, 124, 248, 88] | 26 | car |
4 | 000032.jpg | 28832 | [103, 77, 272, 106] | 32 | aeroplane |
# Pandas version of hw_bb (converts a VOC format bbox to fastai format)
def bb_hw_pandas(x):
    # Example, x = [155, 96, 196, 174]
    return [x[1], x[0], x[3] + x[1] - 1, x[2] + x[0] - 1]
# format bbox list as string. convert values to string.
largest_bbox['bbox_new'] = largest_bbox[BBOX].apply(lambda x: bb_hw_pandas(x))
largest_bbox['bbox_str'] = largest_bbox['bbox_new'].apply(lambda x: ' '.join(str(y) for y in x))
largest_bbox.head()
file_name | area | bbox | image_id | name | bbox_new | bbox_str | |
---|---|---|---|---|---|---|---|
0 | 000012.jpg | 34104 | [155, 96, 196, 174] | 12 | car | [96, 155, 269, 350] | 96 155 269 350 |
1 | 000017.jpg | 81326 | [89, 77, 314, 259] | 17 | horse | [77, 89, 335, 402] | 77 89 335 402 |
2 | 000023.jpg | 111101 | [2, 1, 241, 461] | 23 | person | [1, 2, 461, 242] | 1 2 461 242 |
3 | 000026.jpg | 21824 | [89, 124, 248, 88] | 26 | car | [124, 89, 211, 336] | 124 89 211 336 |
4 | 000032.jpg | 28832 | [103, 77, 272, 106] | 32 | aeroplane | [77, 103, 182, 374] | 77 103 182 374 |
largest_bbox[[FILE_NAME, 'bbox_str']].to_csv(BB_CSV, index=False)
!head -n 10 {BB_CSV}
file_name,bbox_str
000012.jpg,96 155 269 350
000017.jpg,77 89 335 402
000023.jpg,1 2 461 242
000026.jpg,124 89 211 336
000032.jpg,77 103 182 374
000033.jpg,106 8 262 498
000034.jpg,166 115 399 359
000035.jpg,97 217 317 464
000036.jpg,78 26 343 318
BB_CSV.open().readlines()[:5] # read up to 5 lines
['file_name,bbox_str\n', '000012.jpg,96 155 269 350\n', '000017.jpg,77 89 335 402\n', '000023.jpg,1 2 461 242\n', '000026.jpg,124 89 211 336\n']
======================================== END OF ASIDE ========================================
f_model = resnet34
sz = 224
bs = 64
Set continuous=True to tell fastai this is a regression problem, which means it won't one-hot encode the labels, and will use MSE as the default crit.

Note that we have to tell the transforms constructor that our labels are coordinates, so that it can handle the transforms correctly.

Also, we use CropType.NO because we want to 'squish' the rectangular images into squares, rather than center cropping, so that we don't accidentally crop out some of the objects. (This is less of an issue in something like ImageNet, where there is a single object to classify, and it's generally large and centrally located.)
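As a refresher, here is the difference between MSE (the default crit here) and the L1 loss we'll switch to shortly, on the same toy errors:

import numpy as np

pred   = np.array([ 96., 155., 269., 350.])
target = np.array([100., 150., 260., 360.])
err = pred - target
print('MSE:', np.mean(err ** 2))     # squaring punishes large coordinate errors much more
print('L1 :', np.mean(np.abs(err)))  # absolute error treats all errors linearly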
augs = [RandomFlip(),
RandomRotate(30),
RandomLighting(0.1,0.1)]
tfms = tfms_from_model(f_model, sz, crop_type=CropType.NO, aug_tfms=augs)
md = ImageClassifierData.from_csv(PATH, JPEGS, BB_CSV, tfms=tfms, continuous=True, bs=4)
idx = 3
fig, axes = plt.subplots(3, 3, figsize=(9, 9))
for i, ax in enumerate(axes.flat):
x, y = next(iter(md.aug_dl))
ima = md.val_ds.denorm(to_np(x))[idx]
b = bb_hw(to_np(y[idx]))
print('b:', b)
show_img(ima, ax=ax)
draw_rect(ax, b)
b: [ 1. 89. 499. 192.]  (printed 9 times, once per augmented image - the bbox never changes)
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
augs = [RandomFlip(tfm_y=TfmType.COORD),
RandomRotate(30, tfm_y=TfmType.COORD),
RandomLighting(0.1,0.1, tfm_y=TfmType.COORD)]
tfms = tfms_from_model(f_model, sz, crop_type=CropType.NO, aug_tfms=augs, tfm_y=TfmType.COORD)
md = ImageClassifierData.from_csv(PATH, JPEGS, BB_CSV, tfms=tfms, continuous=True, bs=4)
idx = 3
fig, axes = plt.subplots(3, 3, figsize=(9, 9))
for i, ax in enumerate(axes.flat):
x, y = next(iter(md.aug_dl))
ima = md.val_ds.denorm(to_np(x))[idx]
b = bb_hw(to_np(y[idx]))
print(b)
show_img(ima, ax=ax)
draw_rect(ax, b)
[  1.  60. 221. 125.]
[  0.  12. 224. 211.]
[  0.   9. 224. 214.]
[  0.  21. 224. 202.]
[  0.   0. 224. 223.]
[  0.  55. 224. 135.]
[  0.  15. 224. 208.]
[  0.  31. 224. 182.]
[  0.  53. 224. 139.]
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Note
You may notice that sometimes the box looks odd, like the middle one in the bottom row. This is a constraint of the information we have: if the object occupied the corners of the original bounding box, the new bounding box needs to be bigger after the image rotates. So be careful not to use too high a rotation with bounding boxes, because there is not enough information for them to stay accurate. If we were doing polygons or segmentations, we would not have this problem.
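To see why the box has to grow, here's a pure-numpy sketch (illustrative only): rotate the four corners of a w x h box and take their axis-aligned extent.

import numpy as np

def rotated_extent(w, h, deg):
    # Axis-aligned extent of a w x h rectangle rotated by deg degrees.
    t = np.deg2rad(deg)
    corners = np.array([[0, 0], [w, 0], [0, h], [w, h]], dtype=float)
    rot = np.array([[np.cos(t), -np.sin(t)],
                    [np.sin(t),  np.cos(t)]])
    pts = corners @ rot.T
    return pts.max(axis=0) - pts.min(axis=0)

print(rotated_extent(200, 100, 30))  # ~[223.2, 186.6]: wider and taller than the original box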
tfm_y = TfmType.COORD
augs = [RandomFlip(tfm_y=tfm_y),
RandomRotate(3, p=0.5, tfm_y=tfm_y),
RandomLighting(0.05,0.05, tfm_y=tfm_y)]
tfms = tfms_from_model(f_model, sz, crop_type=CropType.NO, tfm_y=tfm_y, aug_tfms=augs)
md = ImageClassifierData.from_csv(PATH, JPEGS, BB_CSV, tfms=tfms, bs=bs, continuous=True)
fastai lets you use a custom_head to add your own module on top of a convnet, instead of the adaptive pooling and fully connected net which is added by default. In this case, we don't want to do any pooling, since we need to know the activations of each grid cell.

The final layer has 4 activations, one per bounding box coordinate. Our target is continuous, not categorical, so the MSE loss function used does not apply any sigmoid or softmax to the module outputs.
512*7*7
25088
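Flatten here is fastai's helper module; a minimal equivalent sketch (my assumption about its behaviour, consistent with the 25088 input size of the linear layer):

import torch.nn as nn

class Flatten(nn.Module):
    # Collapse everything after the batch dim: (N, 512, 7, 7) -> (N, 25088)
    def forward(self, x):
        return x.view(x.size(0), -1)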
head_reg4 = nn.Sequential(Flatten(), nn.Linear(512*7*7, 4))
learn = ConvLearner.pretrained(f_model, md, custom_head=head_reg4)
learn.opt_fn = optim.Adam
learn.crit = nn.L1Loss()
learn.summary()
OrderedDict([('Conv2d-1', OrderedDict([('input_shape', [-1, 3, 224, 224]), ('output_shape', [-1, 64, 112, 112]), ('trainable', False), ('nb_params', 9408)])), ('BatchNorm2d-2', OrderedDict([('input_shape', [-1, 64, 112, 112]), ('output_shape', [-1, 64, 112, 112]), ('trainable', False), ('nb_params', 128)])), ('ReLU-3', OrderedDict([('input_shape', [-1, 64, 112, 112]), ('output_shape', [-1, 64, 112, 112]), ('nb_params', 0)])), ('MaxPool2d-4', OrderedDict([('input_shape', [-1, 64, 112, 112]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('Conv2d-5', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 36864)])), ('BatchNorm2d-6', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 128)])), ('ReLU-7', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('Conv2d-8', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 36864)])), ('BatchNorm2d-9', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 128)])), ('ReLU-10', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('BasicBlock-11', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('Conv2d-12', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 36864)])), ('BatchNorm2d-13', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 128)])), ('ReLU-14', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('Conv2d-15', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 36864)])), ('BatchNorm2d-16', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 128)])), ('ReLU-17', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('BasicBlock-18', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('Conv2d-19', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 36864)])), ('BatchNorm2d-20', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 128)])), ('ReLU-21', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('Conv2d-22', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 36864)])), ('BatchNorm2d-23', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('trainable', False), ('nb_params', 128)])), ('ReLU-24', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('BasicBlock-25', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 64, 56, 56]), ('nb_params', 0)])), ('Conv2d-26', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 73728)])), ('BatchNorm2d-27', 
OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 256)])), ('ReLU-28', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('Conv2d-29', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 147456)])), ('BatchNorm2d-30', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 256)])), ('Conv2d-31', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 8192)])), ('BatchNorm2d-32', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 256)])), ('ReLU-33', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('BasicBlock-34', OrderedDict([('input_shape', [-1, 64, 56, 56]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('Conv2d-35', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 147456)])), ('BatchNorm2d-36', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 256)])), ('ReLU-37', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('Conv2d-38', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 147456)])), ('BatchNorm2d-39', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 256)])), ('ReLU-40', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('BasicBlock-41', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('Conv2d-42', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 147456)])), ('BatchNorm2d-43', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 256)])), ('ReLU-44', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('Conv2d-45', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 147456)])), ('BatchNorm2d-46', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 256)])), ('ReLU-47', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('BasicBlock-48', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('Conv2d-49', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 147456)])), ('BatchNorm2d-50', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 256)])), ('ReLU-51', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('Conv2d-52', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), 
('nb_params', 147456)])), ('BatchNorm2d-53', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('trainable', False), ('nb_params', 256)])), ('ReLU-54', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('BasicBlock-55', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 128, 28, 28]), ('nb_params', 0)])), ('Conv2d-56', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 294912)])), ('BatchNorm2d-57', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-58', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-59', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-60', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('Conv2d-61', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 32768)])), ('BatchNorm2d-62', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-63', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('BasicBlock-64', OrderedDict([('input_shape', [-1, 128, 28, 28]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-65', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-66', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-67', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-68', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-69', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-70', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('BasicBlock-71', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-72', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-73', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-74', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-75', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-76', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-77', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('BasicBlock-78', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', 
[-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-79', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-80', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-81', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-82', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-83', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-84', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('BasicBlock-85', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-86', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-87', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-88', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-89', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-90', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-91', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('BasicBlock-92', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-93', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-94', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-95', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-96', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 589824)])), ('BatchNorm2d-97', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('trainable', False), ('nb_params', 512)])), ('ReLU-98', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('BasicBlock-99', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 256, 14, 14]), ('nb_params', 0)])), ('Conv2d-100', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 1179648)])), ('BatchNorm2d-101', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 1024)])), ('ReLU-102', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('nb_params', 0)])), ('Conv2d-103', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 2359296)])), ('BatchNorm2d-104', OrderedDict([('input_shape', [-1, 512, 7, 7]), 
('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 1024)])), ('Conv2d-105', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 131072)])), ('BatchNorm2d-106', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 1024)])), ('ReLU-107', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('nb_params', 0)])), ('BasicBlock-108', OrderedDict([('input_shape', [-1, 256, 14, 14]), ('output_shape', [-1, 512, 7, 7]), ('nb_params', 0)])), ('Conv2d-109', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 2359296)])), ('BatchNorm2d-110', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 1024)])), ('ReLU-111', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('nb_params', 0)])), ('Conv2d-112', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 2359296)])), ('BatchNorm2d-113', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 1024)])), ('ReLU-114', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('nb_params', 0)])), ('BasicBlock-115', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('nb_params', 0)])), ('Conv2d-116', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 2359296)])), ('BatchNorm2d-117', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 1024)])), ('ReLU-118', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('nb_params', 0)])), ('Conv2d-119', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 2359296)])), ('BatchNorm2d-120', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('trainable', False), ('nb_params', 1024)])), ('ReLU-121', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('nb_params', 0)])), ('BasicBlock-122', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 512, 7, 7]), ('nb_params', 0)])), ('Flatten-123', OrderedDict([('input_shape', [-1, 512, 7, 7]), ('output_shape', [-1, 25088]), ('nb_params', 0)])), ('Linear-124', OrderedDict([('input_shape', [-1, 25088]), ('output_shape', [-1, 4]), ('trainable', True), ('nb_params', 100356)]))])
learn.lr_find(1e-5, 100)
learn.sched.plot(5)
78%|███████▊ | 25/32 [00:11<00:03, 2.14it/s, loss=529]
lr = 2e-3
learn.fit(lr, 2, cycle_len=1, cycle_mult=2)
epoch  trn_loss   val_loss
0      48.960351  35.755788
1      37.135304  29.60765
2      31.466736  29.009163
[array([29.00916])]
lrs = np.array([lr/100, lr/10, lr])
learn.freeze_to(-2)
lrf = learn.lr_find(lrs/1000)
learn.sched.plot(1)
epoch  trn_loss  val_loss
0      82.31227  1.4744848065204166e+17
learn.fit(lrs, 2, cycle_len=1, cycle_mult=2)
epoch  trn_loss   val_loss
0      25.858838  25.091344
1      22.565964  22.855172
2      19.391733  21.236308
[array([21.23631])]
learn.freeze_to(-3)
learn.fit(lrs, 1, cycle_len=2)
epoch  trn_loss   val_loss
0      18.009395  21.977178
1      16.113632  20.927288
[array([20.92729])]
learn.save('reg4')
learn.load('reg4')
x, y = next(iter(md.val_dl))
learn.model.eval()
preds = to_np(learn.model(VV(x)))
fig, axes = plt.subplots(3, 4, figsize=(12, 8))
for i, ax in enumerate(axes.flat):
ima = md.val_ds.denorm(to_np(x))[i]
b = bb_hw(preds[i])
ax = show_img(ima, ax=ax)
draw_rect(ax, b)
plt.tight_layout()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
f_model=resnet34
sz=224
bs=64
val_idxs = get_cv_idxs(len(trn_fns))
======================================== Start Debugging - CSV data ========================================
CSV_FILES = PATH / 'tmp'
!ls {CSV_FILES}
bb.csv lrg.csv
CSV of the bounding box of the largest object. This is simply a regression with 4 outputs (predicted values). So we can use a CSV with multiple 'labels'.
!head -n 10 {CSV_FILES}/bb.csv
fn,bbox
008197.jpg,186 450 226 496
008199.jpg,84 363 374 498
008202.jpg,110 190 371 457
008203.jpg,187 37 359 303
000012.jpg,96 155 269 350
008204.jpg,144 142 335 265
000017.jpg,77 89 335 402
008211.jpg,181 77 499 281
008213.jpg,125 291 166 330
CSV of the image filename and the class of the largest object (from annotations JSON).
!head -n 10 {CSV_FILES}/lrg.csv
fn,cat
008197.jpg,car
008199.jpg,person
008202.jpg,cow
008203.jpg,sofa
000012.jpg,car
008204.jpg,person
000017.jpg,horse
008211.jpg,person
008213.jpg,chair
======================================== End Debugging - CSV data ========================================
tfms = tfms_from_model(f_model, sz, crop_type=CropType.NO, tfm_y=TfmType.COORD, aug_tfms=augs)
# Model data for bounding box of the largest object.
md = ImageClassifierData.from_csv(PATH, JPEGS, BB_CSV, tfms=tfms,
bs=bs, continuous=True, val_idxs=val_idxs)
# Model data for classification of the largest object.
md2 = ImageClassifierData.from_csv(PATH, JPEGS, CSV, tfms=tfms_from_model(f_model, sz))
A dataset can be anything with __len__ and __getitem__. Here's a dataset that adds a 2nd label to an existing dataset:
class ConcatLblDataset(Dataset):
"""
A dataset that adds a second label to an existing dataset.
"""
def __init__(self, ds, y2):
self.ds, self.y2 = ds, y2
def __len__(self):
return len(self.ds)
def __getitem__(self, i):
x, y = self.ds[i]
return (x, (y, self.y2[i]))
We'll use it to add the classes to the bounding box labels.
trn_ds2 = ConcatLblDataset(md.trn_ds, md2.trn_y)
val_ds2 = ConcatLblDataset(md.val_ds, md2.val_y)
# Grab the two labels (bounding box & class) from a record in the validation dataset.
val_ds2[0][1]  # record at index 0; the labels are at index 1 (the input image x is at index 0, which we skip)
(array([ 0., 1., 223., 178.], dtype=float32), 14)
We can replace the dataloaders' datasets with these new ones.
md.trn_dl.dataset = trn_ds2
md.val_dl.dataset = val_ds2
We have to denormalize the images from the dataloader before they can be plotted.
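Denormalization just inverts the channel-wise normalization applied at load time. A sketch of the idea, assuming the ImageNet statistics that the resnet34 transforms use:

import numpy as np

imagenet_mean = np.array([0.485, 0.456, 0.406])
imagenet_std  = np.array([0.229, 0.224, 0.225])

def denorm_sketch(batch):
    # batch: (N, H, W, 3) normalized images -> roughly [0, 1] pixel values
    return batch * imagenet_std + imagenet_mean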
idx = 9
x, y = next(iter(md.val_dl)) # x is image array, y is labels
# Debug y variable
print(f'type of y: {type(y)}, y length: {len(y)}')
print(y[0].size()) # bounding box top-left coord & bottom-right coord values
print(y[1].size()) # object category (class)
type of y: <class 'list'>, y length: 2
torch.Size([64, 4])
torch.Size([64])
# y[0] returns 64 sets of bounding box labels.
# Here we only grab the first 2 images' bounding boxes. The returned data type is a PyTorch FloatTensor on the GPU.
print(y[0][:2])
# Grab the first 2 images' object classes. The returned data type is a PyTorch LongTensor on the GPU.
print(y[1][:2])
  0    1  223  178
  7  123  186  194
[torch.cuda.FloatTensor of size 2x4 (GPU 0)]

 14
  3
[torch.cuda.LongTensor of size 2 (GPU 0)]
# Debug x data from GPU
x.size() # batch of 64 images, each image with 3 channels and size of 224x224
torch.Size([64, 3, 224, 224])
# Debug x data from CPU
to_np(x).shape
(64, 3, 224, 224)
ima = md.val_ds.ds.denorm(to_np(x))[idx] # reverse the normalization done to a batch of images.
b = bb_hw(to_np(y[0][idx]))
b
array([134., 148., 36., 48.])
ax = show_img(ima)
draw_rect(ax, b)
draw_text(ax, b[:2], md2.classes[y[1][idx]])
We need one output activation for each class (for its probability) plus one for each bounding box coordinate. We'll use an extra linear layer this time, plus some dropout, to help us train a more flexible model.
head_reg4 = nn.Sequential(
Flatten(),
nn.ReLU(),
nn.Dropout(0.5),
nn.Linear(25088, 256),
nn.ReLU(),
nn.BatchNorm1d(256),
nn.Dropout(0.5),
nn.Linear(256, 4 + len(cats))
)
models = ConvnetBuilder(f_model, 0, 0, 0, custom_head=head_reg4)
learn = ConvLearner(md, models)
learn.opt_fn = optim.Adam
# DEBUG: what's inside cats
print(type(cats))
print(len(cats))
print('%s, %s' % (cats[1], cats[2]))
<class 'dict'>
20
aeroplane, bicycle
Code comments:

  * input: activations.
  * target: ground truth.
  * bb_t, c_t = target: our custom dataset returns a tuple containing the bounding box coordinates and the classes. This assignment destructures them.
  * bb_i, c_i = input[:, :4], input[:, 4:]: the first : is the batch dimension, e.g. 64 (for 64 images).
  * bb_i = F.sigmoid(bb_i) * 224: we know our image is 224 by 224. Sigmoid forces the value to be between 0 and 1, and multiplying by 224 puts it in the range our bounding box coordinates have to be.

def detn_loss(input, target):
"""
Loss function for the position and class of the largest object in the image.
"""
bb_t, c_t = target
# bb_i: the 4 values for the bbox
# c_i: the 20 classes `len(cats)`
bb_i, c_i = input[:, :4], input[:, 4:]
bb_i = F.sigmoid(bb_i) * 224 # scale bbox values to stay between 0 and 224 (224 is the max img width or height)
bb_l = F.l1_loss(bb_i, bb_t) # bbox loss
clas_l = F.cross_entropy(c_i, c_t) # object class loss
# I looked at these quantities separately first then picked a multiplier
# to make them approximately equal
return bb_l + clas_l * 20
def detn_l1(input, target):
    """
    L1 loss for the first 4 activations (the bounding box coordinates).
    L1 loss is like mean squared error, but instead of the sum of squared errors it uses the sum of absolute values.
    """
bb_t, _ = target
bb_i = input[:, :4]
bb_i = F.sigmoid(bb_i) * 224
return F.l1_loss(V(bb_i), V(bb_t)).data
def detn_acc(input, target):
"""
Accuracy
"""
_, c_t = target
c_i = input[:, 4:]
return accuracy(c_i, c_t)
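A quick shape sanity-check for detn_loss with made-up tensors (batch of 64, the 20 Pascal classes; with fastai 0.7-era PyTorch you'd wrap these in V() first):

import torch

inp  = torch.randn(64, 4 + 20)      # 4 bbox activations + 20 class scores per image
bb_t = torch.rand(64, 4) * 224      # fake bbox targets in pixel space
c_t  = torch.randint(0, 20, (64,))  # fake class targets
print(detn_loss(inp, (bb_t, c_t)))  # a single scalar: bb_l + 20 * clas_l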
learn.crit = detn_loss
learn.metrics = [detn_acc, detn_l1]
# With the metrics defined, we find the learning rate
learn.lr_find()
learn.sched.plot()
97%|█████████▋| 31/32 [00:13<00:00, 2.32it/s, loss=478]
lr = 1e-2
learn.fit(lr, 1, cycle_len=3, use_clr=(32, 5))
epoch  trn_loss   val_loss   detn_acc  detn_l1
0      71.055205  48.157942  0.754     33.202651
1      51.411235  39.722549  0.776     26.363626
2      42.721873  38.36225   0.786     25.658993
[array([38.36225]), 0.7860000019073486, 25.65899333190918]
learn.save('reg1_0')
learn.freeze_to(-2)
lrs = np.array([lr/100, lr/10, lr])
learn.lr_find(lrs/1000)
learn.sched.plot(0)
91%|█████████ | 29/32 [00:19<00:02, 1.47it/s, loss=331]
learn.fit(lrs/5, 1, cycle_len=5, use_clr=(32, 10))
epoch  trn_loss   val_loss   detn_acc  detn_l1
0      36.650519  37.198765  0.768     23.865814
1      30.822986  36.280846  0.776     22.743629
2      26.792856  35.199342  0.756     21.564384
3      23.786961  33.644777  0.794     20.626075
4      21.58091   33.194585  0.788     20.520627
[array([33.19459]), 0.788, 20.52062666320801]
learn.save('reg1_1')
learn.load('reg1_1')
learn.unfreeze()
learn.fit(lrs/10, 1, cycle_len=10, use_clr=(32, 10))
epoch  trn_loss   val_loss   detn_acc  detn_l1
0      19.133272  33.833656  0.804     20.774298
1      18.754909  35.271939  0.77      20.572007
2      17.824877  35.099138  0.776     20.494296
3      16.8321    33.782667  0.792     20.139132
4      15.968     33.525141  0.788     19.848904
5      15.356815  33.827995  0.782     19.483242
6      14.589975  33.49683   0.778     19.531291
7      13.811117  33.022376  0.794     19.462907
8      13.238251  33.300647  0.794     19.423868
9      12.613972  33.260653  0.788     19.346758
[array([33.26065]), 0.7880000019073486, 19.34675830078125]
learn.save('reg1')
learn.load('reg1')
y = learn.predict()
x, _ = next(iter(md.val_dl))
from scipy.special import expit
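expit is the logistic sigmoid 1 / (1 + exp(-x)), i.e. the numpy counterpart of the F.sigmoid we applied inside the loss; multiplying by 224 maps the raw activations back into pixel space:

import numpy as np
from scipy.special import expit

print(expit(0.0))                          # 0.5
print(expit(np.array([-4.0, 4.0])) * 224)  # ~[4.0, 220.0]: raw activations -> pixel coords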
fig, axes = plt.subplots(3, 4, figsize=(12, 8))
for i, ax in enumerate(axes.flat):
ima = md.val_ds.ds.denorm(to_np(x))[i]
bb = expit(y[i][:4]) * 224
b = bb_hw(bb)
c = np.argmax(y[i][4:])
ax = show_img(ima, ax=ax)
draw_rect(ax, b)
draw_text(ax, b[:2], md2.classes[c])
plt.tight_layout()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).