iMet Collection 2019 - FGVC6¶

Simple baseline for iMet Collection 2019 competition using fastai v1

Model: densenet201
Loss: Focal loss
Metric: $F_{2}$ score

What to try next?

Different models
Optimize hyperparameter choice
Few-shot learning to improve score on classes with very few samples

In [1]:

import fastai
from fastai.vision import *
fastai.__version__

Out[1]:

'1.0.50.post1'

Initial setup¶

In [2]:

path = Path('../input/imet-2019-fgvc6/') # iMet data path

In [3]:

# Making pretrained weights work without needing to find the default filename
from torch.utils import model_zoo
Path('models').mkdir(exist_ok=True)
!cp '../input/densenet201/densenet201.pth' 'models/'
def load_url(*args, **kwargs):
    model_dir = Path('models')
    filename  = 'densenet201.pth'
    if not (model_dir/filename).is_file(): raise FileNotFoundError
    return torch.load(model_dir/filename)
model_zoo.load_url = load_url

In [4]:

# Load train dataframe
train_df = pd.read_csv(path/'train.csv')
train_df.head()

Out[4]:

	id	attribute_ids
0	1000483014d91860	147 616 813
1	1000fe2e667721fe	51 616 734 813
2	1001614cb89646ee	776
3	10041eb49b297c08	51 671 698 813 1092
4	100501c227f8beea	13 404 492 903 1093

In [5]:

# Load labels dataframe
labels_df = pd.read_csv(path/'labels.csv')
labels_df.head()

Out[5]:

	attribute_id	attribute_name
0	0	culture::abruzzi
1	1	culture::achaemenid
2	2	culture::aegean
3	3	culture::afghan
4	4	culture::after british

In [6]:

# Load sample submission
test_df = pd.read_csv(path/'sample_submission.csv')
test_df.head()

Out[6]:

	id	attribute_ids
0	10023b2cc4ed5f68	0 1 2
1	100fbe75ed8fd887	0 1 2
2	101b627524a04f19	0 1 2
3	10234480c41284c6	0 1 2
4	1023b0e2636dcea8	0 1 2

Create data object using datablock API¶

In [7]:

train, test = [ImageList.from_df(df, path=path, cols='id', folder=folder, suffix='.png') 
               for df, folder in zip([train_df, test_df], ['train', 'test'])]
data = (train.split_by_rand_pct(0.2, seed=42)
        .label_from_df(cols='attribute_ids', label_delim=' ')
        .add_test(test)
        .transform(get_transforms(), size=128)
        .databunch(path=Path('.'), bs=64).normalize())

In [8]:

data.show_batch()

Create learner with densenet121 and FocalLoss¶

For problems with high class imbalance Focal Loss is usually a better choice than the usual Cross Entropy Loss.

In [9]:

# Source: https://www.kaggle.com/c/human-protein-atlas-image-classification/discussion/78109
class FocalLoss(nn.Module):
    def __init__(self, gamma=2):
        super().__init__()
        self.gamma = gamma

    def forward(self, logit, target):
        target = target.float()
        max_val = (-logit).clamp(min=0)
        loss = logit - logit * target + max_val + \
               ((-max_val).exp() + (-logit - max_val).exp()).log()

        invprobs = F.logsigmoid(-logit * (target * 2.0 - 1.0))
        loss = (invprobs * self.gamma).exp() * loss
        if len(loss.size())==2:
            loss = loss.sum(dim=1)
        return loss.mean()

In [10]:

learn = cnn_learner(data, base_arch=models.densenet201, loss_func=FocalLoss(), metrics=fbeta)

Train the model¶

In [11]:

# Find a good learning rate
learn.lr_find()
learn.recorder.plot()

LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

In [12]:

lr = 3e-2
learn.fit_one_cycle(3, slice(lr))

66.67% [2/3 38:07<19:03]

epoch	train_loss	valid_loss	fbeta	time
0	4.721717	36.794640	0.260738	19:23
1	4.134648	119.958412	0.302826	18:44

68.21% [931/1365 10:36<04:56 3.6315]

In [13]:

learn.unfreeze()

In [14]:

learn.lr_find()
learn.recorder.plot()

LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

In [15]:

lr = 1e-3
learn.fit_one_cycle(21, slice(lr/10, lr))

9.52% [2/21 38:44<6:08:04]

epoch	train_loss	valid_loss	fbeta	time
0	3.553812	508.780853	0.363391	19:22
1	3.448148	33.238255	0.373615	19:22

71.79% [980/1365 11:42<04:36 3.4875]

Get predictions¶

In [16]:

def find_best_fixed_threshold(preds, targs, do_plot=True):
    score = []
    thrs = np.arange(0, 0.5, 0.01)
    for thr in progress_bar(thrs):
        score.append(fbeta(valid_preds[0],valid_preds[1], thresh=thr))
    score = np.array(score)
    pm = score.argmax()
    best_thr, best_score = thrs[pm], score[pm].item()
    print(f'thr={best_thr:.3f}', f'F2={best_score:.3f}')
    if do_plot:
        plt.plot(thrs, score)
        plt.vlines(x=best_thr, ymin=score.min(), ymax=score.max())
        plt.text(best_thr+0.03, best_score-0.01, f'$F_{2}=${best_score:.3f}', fontsize=14);
        plt.show()
    return best_thr

i2c = np.array([[i, c] for c, i in learn.data.train_ds.y.c2i.items()]).astype(int) # indices to class number correspondence

def join_preds(preds, thr):
    return [' '.join(i2c[np.where(t==1)[0],1].astype(str)) for t in (preds[0].sigmoid()>thr).long()]

In [17]:

# Validation predictions
valid_preds = learn.get_preds(DatasetType.Valid)
best_thr = find_best_fixed_threshold(*valid_preds)

100.00% [50/50 00:13<00:00]

thr=0.280 F2=0.545

In [18]:

# Test predictions
test_preds = learn.get_preds(DatasetType.Test)
test_df.attribute_ids = join_preds(test_preds, best_thr)
test_df.head()

Out[18]:

	id	attribute_ids
0	10023b2cc4ed5f68	1059 195 369 587 766
1	100fbe75ed8fd887	1039 231
2	101b627524a04f19	147 784 79
3	10234480c41284c6	1046 111 147 480 483 725 738 776 830
4	1023b0e2636dcea8	1046 1092 147 322 383 477 584 671 737 738 776 ...

In [19]:

test_df.to_csv('submission.csv', index=False)