Code used for the Hotel-ID to Combat Human Trafficking 2021 - FGVC8 Kaggle competition. The task was to identify the hotel to which a given image of a room belongs.
Detailed description: https://www.kaggle.com/c/hotel-id-2021-fgvc8/discussion/242207
For training I used only the competition data, rescaled and padded to 512x512 pixels, but including external data (such as the Hotels-50K dataset) can improve the score significantly. The following augmentations were used during training: HorizontalFlip, VerticalFlip, ShiftScaleRotate, OpticalDistortion, IAAPerspective, CoarseDropout, RandomBrightness.
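The rescale-and-pad step can be sketched roughly as below. This is an illustration only, not the code from the preprocessing notebook: the function name `resize_and_pad`, the black fill colour, the centred placement and the nearest-neighbour resize are all assumptions chosen to keep the example dependency-light.

```python
import numpy as np


def resize_and_pad(image: np.ndarray, size: int = 512, fill: int = 0) -> np.ndarray:
    """Scale the longer side of an HxW(xC) image to `size`, then pad to size x size."""
    h, w = image.shape[:2]
    scale = size / max(h, w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))

    # Nearest-neighbour resize via index lookup (assumption: a real pipeline
    # would more likely use an interpolating resize from an image library).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows][:, cols]

    # Place the resized image in the centre of a square canvas (assumption).
    canvas = np.full((size, size) + image.shape[2:], fill, dtype=image.dtype)
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas
```

Padding instead of cropping keeps the whole room visible and preserves the aspect ratio, at the cost of some unused border pixels.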
EDA: src/hotel-id-eda-with-plotly.ipynb (nbviewer)
Image preprocessing notebook: src/hotel-id-preprocess-images.ipynb
512x512 dataset: https://www.kaggle.com/michaln/hotelid-images-512x512-padded
256x256 dataset: https://www.kaggle.com/michaln/hotelid-images-256x256-padded
Notebook to download Hotels-50K dataset: src/download-hotels-50K.ipynb
Trained 3 types of models with different backbones:
ArcMargin model: src/training/hotel-id-arcmargin-training.ipynb
CosFace model: src/training/hotel-id-cosface-training.ipynb
Classification model: src/training/hotel-id-classification-training.ipynb
Parameters: Lookahead (k=3) + AdamW optimizer, OneCycleLR scheduler, CrossEntropyLoss/CosFace loss
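For readers unfamiliar with the CosFace loss, the idea is to subtract a fixed margin from the cosine between an embedding and its true class centre before applying cross-entropy. The NumPy sketch below shows only that idea; it is not the training code from the notebooks, and the function names and the scale `s` and margin `m` values are illustrative assumptions.

```python
import numpy as np


def cosface_logits(embeddings, class_weights, labels, s=30.0, m=0.35):
    """Additive cosine margin logits: s * (cos(theta) - m) for the true class."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    cosine = emb @ w.T  # shape: (batch, num_classes)

    # Subtract the margin only at the position of each sample's true class.
    margin = np.zeros_like(cosine)
    margin[np.arange(len(labels)), labels] = m
    return s * (cosine - margin)


def cross_entropy(logits, labels):
    """Mean cross-entropy computed with a numerically stable log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())
```

Because the margin lowers the true-class logit, the network has to pull embeddings of the same hotel closer together to reach the same loss, which is what makes the embeddings useful for similarity search.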
These models were used to generate an embedding for each image, and the embeddings were then used to calculate the cosine similarity of the test images to the train dataset. The product of the similarities was used to ensemble the output of the different models and to find the top 5 most similar images from different hotels.
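The retrieval step described above could look roughly like the following NumPy sketch. It is not the inference notebook's code: the function names are hypothetical, and clipping negative similarities to zero before multiplying is an assumption made here so that two negative scores cannot combine into a positive one.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def ensemble_top_hotels(test_embs, train_embs, train_hotel_ids, k=5):
    """test_embs / train_embs: lists holding one embedding matrix per model."""
    combined = None
    for test, train in zip(test_embs, train_embs):
        sim = l2_normalize(test) @ l2_normalize(train).T  # (n_test, n_train)
        sim = np.clip(sim, 0.0, None)  # assumption: ignore negative similarity
        combined = sim if combined is None else combined * sim

    predictions = []
    for row in combined:
        picked = []
        # Walk train images from most to least similar, keeping distinct hotels.
        for idx in np.argsort(-row):
            hotel = train_hotel_ids[idx]
            if hotel not in picked:
                picked.append(hotel)
            if len(picked) == k:
                break
        predictions.append(picked)
    return predictions
```

Multiplying the per-model similarities rewards train images that every model considers close, while a low score from any single model pulls the combined score down.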
Trained models: https://www.kaggle.com/michaln/hotelid-trained-models
Inference notebook: src/hotel-id-inference.ipynb
Evaluation metric: Mean Average Precision @5
Type | Backbone | Embed size | Public LB | Private LB | Epochs |
---|---|---|---|---|---|
ArcMargin | eca_nfnet_l0 | 1024 | 0.6564 | 0.6704 | 6/6 |
ArcMargin | efficientnet_b1 | 4096 | 0.6780 | 0.6962 | 9/9 |
Classification | eca_nfnet_l0 | 4096 | 0.6691 | 0.6875 | 6/9 |
CosFace | ecaresnet50d_pruned | 4096 | 0.6702 | 0.6796 | 9/9 |
Ensemble | | | 0.7273 | 0.7446 | |
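The leaderboard scores above use Mean Average Precision @5. Assuming each test image has exactly one correct hotel, the metric reduces to the mean of 1/rank of the correct hotel within the top 5 predictions, or 0 if it is missing. A minimal sketch under that assumption (the function name `map_at_5` is hypothetical):

```python
def map_at_5(true_labels, predictions):
    """Mean of 1/rank of the correct label within the first 5 predictions."""
    total = 0.0
    for truth, preds in zip(true_labels, predictions):
        for rank, pred in enumerate(preds[:5], start=1):
            if pred == truth:
                total += 1.0 / rank
                break  # only the first correct hit counts
    return total / len(true_labels)
```

For example, a correct hotel at rank 1 contributes 1.0, at rank 2 contributes 0.5, and a miss contributes 0, so three such images average to 0.5.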
- Prepare data: download the preprocessed dataset or run the hotel-id-preprocess-images notebook to generate the images
- Train models: run the hotel-id-arcmargin-training, hotel-id-cosface-training and hotel-id-classification-training notebooks, or use the trained models
- Inference: edit the models and paths in the inference notebook and run it on Kaggle