Lesson 2 - Download - 1.2 Training¶

Training¶

Setup¶

We already downloaded and verified the dataset in the last session, so here we go straight to training the model.

In [1]:

from fastai.vision import *

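# Fix the seed so the random 20% validation split is the same on every run
np.random.seed(42)
path = Path('data/bears')
# Read images from class-named subfolders, apply the default augmentations,
# resize to 224 px and normalize with the ImageNet statistics
data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
        ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)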
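
To make the role of the seed and `valid_pct=0.2` concrete, here is a minimal standard-library sketch of a seeded random hold-out split. It is not fastai's implementation: `split_by_pct` and the file names are hypothetical, and the sketch only illustrates why fixing the seed yields the same validation set on every run.

```python
import random

def split_by_pct(items, valid_pct=0.2, seed=42):
    """Hypothetical helper: shuffle with a fixed seed, then hold out valid_pct of the items."""
    rng = random.Random(seed)      # local RNG, so global random state is untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]   # (train, valid)

# Made-up file names standing in for the images under data/bears
files = [f"img_{i:03d}.jpg" for i in range(50)]
train, valid = split_by_pct(files)
print(len(train), len(valid))
```

Calling `split_by_pct(files)` again with the same seed returns the same two lists, which is what keeps validation metrics comparable between runs.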

Training model¶

In [2]:

learn = create_cnn(data, models.resnet34, metrics=[error_rate, accuracy])

In [3]:

learn.fit_one_cycle(4)

Total time: 00:13

epoch	train_loss	valid_loss	error_rate	accuracy
1	0.944976	0.257580	0.056180	0.943820
2	0.534697	0.132155	0.033708	0.966292
3	0.373873	0.135866	0.022472	0.977528
4	0.282380	0.132642	0.022472	0.977528

Whoa! We got about 97.8% accuracy!
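
The final-epoch row of the table above reports both metrics. The short check below, using only values copied from that row, shows that they are complementary and converts the accuracy to a percentage.

```python
# Final-epoch metrics copied from the training table above
error_rate = 0.022472
accuracy = 0.977528

# The two metrics are complementary: accuracy = 1 - error_rate
assert abs((1 - error_rate) - accuracy) < 1e-6

accuracy_pct = round(accuracy * 100, 1)
print(f"validation accuracy: {accuracy_pct}%")
```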

In [4]:

learn.save('stage-1')

In [5]:

learn.unfreeze()

In [6]:

learn.lr_find()

LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

In [7]:

learn.recorder.plot()
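
The plot produced by this cell is not reproduced here. As a rough sketch of what the finder sweeps over: to my understanding, fastai v1's `lr_find` raises the learning rate exponentially from `start_lr=1e-7` to `end_lr=10` over `num_it=100` mini-batches while recording the loss. Those defaults and the helper `lr_sweep` are assumptions for illustration, not code taken from the library.

```python
def lr_sweep(start_lr=1e-7, end_lr=10.0, num_it=100):
    """Hypothetical helper: exponentially spaced learning rates from start_lr to end_lr."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]

lrs = lr_sweep()
print(f"first={lrs[0]:.1e}, middle={lrs[len(lrs) // 2]:.1e}, last={lrs[-1]:.1e}")
```

Because the spacing is exponential, the recorder plot is usually read on a logarithmic x-axis, and a range where the loss is still falling steadily is picked for the next training stage.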

In [8]:

learn.fit_one_cycle(2, max_lr=slice(3e-5,3e-4))

Total time: 00:06

epoch	train_loss	valid_loss	error_rate	accuracy
1	0.088693	0.156351	0.022472	0.977528
2	0.058850	0.146993	0.022472	0.977528
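
Passing `max_lr=slice(3e-5,3e-4)` gives the earliest layers a smaller learning rate than the last ones. As I understand fastai v1, the two endpoints are spread across the model's layer groups with geometric spacing. The sketch below reimplements that spacing in plain Python; the count of three layer groups is an assumption about this learner, and `even_mults` here is a stand-in written for illustration.

```python
def even_mults(start, stop, n):
    """Geometrically spaced values from start to stop (requires n >= 2)."""
    step = (stop / start) ** (1 / (n - 1))
    return [start * step ** i for i in range(n)]

# Assumption: the unfrozen resnet34 learner is split into 3 layer groups
n_groups = 3
group_lrs = even_mults(3e-5, 3e-4, n_groups)
for i, lr in enumerate(group_lrs):
    print(f"layer group {i}: lr = {lr:.2e}")
```

Under these assumptions the middle group trains at the geometric mean of the two endpoints, roughly 9.5e-5.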

In [9]:

learn.save('stage-2')

Interpretation¶

In [10]:

learn.load('stage-2');

In [11]:

interp = ClassificationInterpretation.from_learner(learn)

In [12]:

interp.plot_confusion_matrix()

So one black bear was wrongly predicted as a grizzly, and another as a teddy. Note that these results can vary every time the notebook is run.
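
For readers who want to see how such a matrix is tallied, here is a minimal sketch in plain Python. The labels are made up for illustration and are not the notebook's validation set, and the class names are assumed to match the dataset's folder names.

```python
from collections import Counter

# Assumed class names; the real ones come from the folders under data/bears
classes = ["black", "grizzly", "teddys"]

# Made-up labels for illustration only
actual    = ["black", "black", "black", "grizzly", "grizzly", "teddys", "teddys"]
predicted = ["black", "grizzly", "teddys", "grizzly", "grizzly", "teddys", "teddys"]

counts = Counter(zip(actual, predicted))
# Rows are actual classes, columns are predicted classes
matrix = [[counts[(a, p)] for p in classes] for a in classes]

for name, row in zip(classes, matrix):
    print(f"{name:>8}: {row}")
```

Correct predictions sit on the diagonal, so every off-diagonal count is a misclassification worth inspecting.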

Cleaning up¶

Let us remove the files responsible for the top losses. These losses are not necessarily due to poor model performance; they may instead come from images that are mislabeled or do not belong in the dataset.
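
As a minimal sketch of what "top losses" means, the snippet below computes a per-image cross-entropy loss and sorts the images by it. The probabilities are made up for illustration; in the notebook they would come from the trained model.

```python
import math

# Made-up probabilities the model assigned to each image's labelled class
p_true = [0.98, 0.03, 0.91, 0.40, 0.99]

# Cross-entropy loss for one sample is -log(probability of the labelled class)
losses = [-math.log(p) for p in p_true]

# Indices sorted by loss, highest first: these are the "top losses"
top_idxs = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
print(top_idxs[:3])
```

An image whose label is wrong tends to receive a low probability for that label and therefore rises to the top of this ranking, which is why the top losses are a good place to look for bad data.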

In [13]:

from fastai.widgets import *
ds, idxs = DatasetFormatter().from_toplosses(learn, ds_type=DatasetType.Valid)

In [15]:

ImageCleaner(ds, idxs, path)

'No images to show :)'

In [20]:

ds, idxs = DatasetFormatter().from_similars(learn, ds_type=DatasetType.Valid)

Getting activations...

Interrupted


TypeError                                 Traceback (most recent call last)
<ipython-input-20-8a8d23411355> in <module>
----> 1 ds, idxs = DatasetFormatter().from_similars(learn, ds_type=DatasetType.Valid)

/opt/conda/lib/python3.7/site-packages/fastai/widgets/image_cleaner.py in from_similars(cls, learn, layer_ls, **kwargs)
     35     def from_similars(cls, learn, layer_ls:list=[0, 7, 2], **kwargs):
     36         "Gets the indices for the most similar images in training and validation datasets"
---> 37         train_ds, train_idxs = cls.get_similars_idxs(learn, layer_ls, **kwargs)
     38         return train_ds, train_idxs
     39 

/opt/conda/lib/python3.7/site-packages/fastai/widgets/image_cleaner.py in get_similars_idxs(cls, learn, layer_ls, **kwargs)
     44         dl = learn.data.fix_dl
     45 
---> 46         ds_actns = cls.get_actns(learn, hook=hook, dl=dl, **kwargs)
     47         similarities = cls.comb_similarity(ds_actns, ds_actns, **kwargs)
     48         idxs = cls.sort_idxs(similarities)

/opt/conda/lib/python3.7/site-packages/fastai/widgets/image_cleaner.py in get_actns(learn, hook, dl, pool, pool_dim, **kwargs)
     58         learn.model.eval()
     59         with torch.no_grad():
---> 60             for (xb,yb) in progress_bar(dl):
     61                 learn.model(xb)
     62                 actns.append((hook.stored).cpu())

/opt/conda/lib/python3.7/site-packages/fastprogress/fastprogress.py in __iter__(self)
     63         self.update(0)
     64         try:
---> 65             for i,o in enumerate(self._gen):
     66                 yield o
     67                 if self.auto_update: self.update(i+1)

/opt/conda/lib/python3.7/site-packages/fastai/basic_data.py in __iter__(self)
     68     def __iter__(self):
     69         "Process and returns items from `DataLoader`."
---> 70         for b in self.dl:
     71             #y = b[1][0] if is_listy(b[1]) else b[1] # XXX: Why is this line here?
     72             yield self.proc_batch(b)

/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
    635                 self.reorder_dict[idx] = batch
    636                 continue
--> 637             return self._process_next_batch(batch)
    638 
    639     next = __next__  # Python 2 compatibility

/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _process_next_batch(self, batch)
    656         self._put_indices()
    657         if isinstance(batch, ExceptionWrapper):
--> 658             raise batch.exc_type(batch.exc_msg)
    659         return batch
    660 

TypeError: Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/opt/conda/lib/python3.7/site-packages/fastai/data_block.py", line 526, in __getitem__
    x = x.apply_tfms(self.tfms, **self.tfmargs)
TypeError: apply_tfms() got an unexpected keyword argument 'do_crop'
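
The last frame of the traceback shows `x.apply_tfms(self.tfms, **self.tfmargs)` failing because the keyword dictionary contains `do_crop`, which the `apply_tfms` being called does not accept. The notebook does not state the cause. The sketch below only reproduces the mechanism with a hypothetical stand-in function; it does not use fastai's real signature.

```python
def apply_tfms(tfms, size=None):
    """Hypothetical stand-in; NOT fastai's real apply_tfms signature."""
    return f"applied {len(tfms)} transforms at size {size}"

# Keyword arguments collected elsewhere, including one the callee does not accept
tfmargs = {"size": 224, "do_crop": True}

try:
    apply_tfms([], **tfmargs)
except TypeError as err:
    message = str(err)
    print(message)
```

When a mismatch like this appears inside a library, one plausible first check is whether the caller and the callee come from the same installed release, although nothing in the notebook confirms that as the cause here.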