Infection Team's solution for the UCC AI Quest 2023
We are team Infection. This repository presents the methods we used in the UCC AI Quest 2023 in Cork, Ireland. It builds on the provided baseline code.
A brief introduction to the competition:
Cork is blessed with breathtaking landscapes and serene greenery. This year, UCC AI Quest will focus on stunning aerial images of a high-resolution camera to recognise vegetation patches in Irish natural places. The challenge aims to foster the development of reliable artificial intelligence models with the goal of informing sustainable development. It includes the release of a new dataset of realistic drone images for benchmarking semantic segmentation from various above ground levels. There are a number of awards for the best team (€5,000), the most creative solution (€1,000) and the top women of influence (€1,000). Read more about the competition on their website.
Our main contributions are:
- Training a DinoV2-ViTB14 with a customized two-layer FCN for the semantic segmentation task.
- Combining region-based and class-based loss functions as the objective function, namely Dice loss and OHEMCE loss.
- Applying Mosaic Augmentation to generate a variety of complex data scenarios to enhance model training.
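To make the combined objective listed above more concrete, here is a minimal NumPy sketch of a soft Dice loss added to an OHEM (online hard example mining) cross-entropy loss. It is an illustration only, not the repository's implementation: it assumes a binary vegetation/background task, uses a simple "keep the top-k hardest pixels" OHEM variant, and the function names, `keep_ratio` and the loss weights are all hypothetical.

```python
import numpy as np


def dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss (region-based) for a binary mask.

    probs and target are float arrays of the same shape with values in [0, 1].
    """
    intersection = np.sum(probs * target)
    denom = np.sum(probs) + np.sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def ohem_ce_loss(probs, target, keep_ratio=0.25, eps=1e-7):
    """Binary cross-entropy averaged over only the hardest pixels.

    Assumption: "hardest" means the keep_ratio fraction of pixels with the
    largest per-pixel loss; other OHEM variants use a loss threshold instead.
    """
    p = np.clip(probs, eps, 1.0 - eps)
    per_pixel = -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))
    hardest_first = np.sort(per_pixel.ravel())[::-1]
    k = max(1, int(keep_ratio * hardest_first.size))
    return float(hardest_first[:k].mean())


def combined_loss(probs, target, dice_weight=1.0, ce_weight=1.0):
    """Weighted sum of the two terms (equal weights are an assumption)."""
    return dice_weight * dice_loss(probs, target) + ce_weight * ohem_ce_loss(probs, target)
```

The Dice term rewards overlap of whole regions, while the OHEM term concentrates the gradient on the pixels the model currently gets most wrong.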
Introducing a simple yet effective two-stage technique for fine-tuning such a large state-of-the-art model on a small dataset:
- Freeze the encoder backbone layers and train only the segmentation head/decoder layers.
- Unfreeze all the layers of the network and fine-tune the whole model with a learning rate 10 times smaller.
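The two stages above can be sketched in PyTorch roughly as follows. This is a toy illustration under stated assumptions rather than the repository's training code: `TinySegmenter` is a hypothetical stand-in for the real backbone and head, and the base learning rate and the choice of AdamW are assumptions.

```python
import torch
from torch import nn


class TinySegmenter(nn.Module):
    """Hypothetical stand-in: a one-layer 'encoder' plus a two-layer conv head."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.encoder = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.head = nn.Sequential(
            nn.Conv2d(8, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, num_classes, kernel_size=1),
        )

    def forward(self, x):
        return self.head(self.encoder(x))


def set_encoder_trainable(model: TinySegmenter, trainable: bool) -> None:
    """Freeze or unfreeze every parameter of the encoder backbone."""
    for param in model.encoder.parameters():
        param.requires_grad = trainable


def make_optimizer(model: nn.Module, lr: float) -> torch.optim.Optimizer:
    """Build an optimizer over the currently trainable parameters only."""
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.AdamW(trainable, lr=lr)


model = TinySegmenter()
base_lr = 1e-3  # assumed value, for illustration only

# Stage 1: encoder frozen, only the segmentation head is optimized.
set_encoder_trainable(model, False)
stage1_optimizer = make_optimizer(model, lr=base_lr)

# Stage 2: everything unfrozen, learning rate 10 times smaller.
set_encoder_trainable(model, True)
stage2_optimizer = make_optimizer(model, lr=base_lr / 10)
```

Rebuilding the optimizer for the second stage ensures that the newly unfrozen encoder parameters are actually updated.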
Employing an ensemble method that further boosts the precision of the predicted masks.
We chose the 5 models with the highest metric scores on our validation set:
- two DeepLabV3+ (EfficientNet-B4 and EfficientNet-B5)
- one UNet++ (EfficientNet-B4)
- two DinoV2-base
Technical report and slides are included
- The warmup, public and private sets are provided by the competition organizers.
- Unzip warmup.zip/public.zip/private.zip in the data folder so that it has the following structure:
data/$PHASE/img/train
data/$PHASE/img/valid
data/$PHASE/img/test
data/$PHASE/ann/train
data/$PHASE/ann/valid
where $PHASE is one of warmup, public or private.
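A quick way to confirm that the archives were unzipped into the layout above is a small standard-library check such as the following. It is a sketch only: the helper name `missing_data_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Sub-folders expected under data/$PHASE, as listed above.
EXPECTED_SUBDIRS = ["img/train", "img/valid", "img/test", "ann/train", "ann/valid"]


def missing_data_dirs(data_root, phase):
    """Return the expected sub-folders that do not exist under data_root/phase."""
    phase_root = Path(data_root) / phase
    return [sub for sub in EXPECTED_SUBDIRS if not (phase_root / sub).is_dir()]
```

For example, `missing_data_dirs("data", "public")` returns an empty list when the public set is laid out as expected.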
conda create -n ucc python=3.10
conda activate ucc
git clone https://github.com/kaylode/ucc-ai-quest-2023
cd ucc-ai-quest-2023
pip install -e .
- Firstly, the configuration can be adjusted inside the infection/configs folder, depending on the $PHASE.
- To train models, run the script below:
sh infection/scripts/train.sh $PHASE $MODEL_NAME
where $MODEL_NAME follows the format $ARCHITECTURE.$BACKBONE (for example, deeplabv3plus.timm-efficientnet-b4). Other models can be implemented and specified in the same way.
- Experiment runs are stored in runs/$PHASE/$MODEL_NAME.
- Our best-performing model is DinoV2-base.
- The training pipeline can be seen in infection/tools/train.py.
- For evaluation and submission, run the prediction script first:
sh infection/scripts/predict.sh $PHASE $MODEL_NAME
- The predictions are then saved into $PREDICTION_FOLDER=submissions/prediction/$PHASE/$MODEL_NAME.
- The submission script can then be run:
sh infection/scripts/submission.sh $PHASE $PREDICTION_FOLDER
- This produces a json file for submission in the submissions/submission folder. Remember to zip it before submitting.
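The naming and folder conventions used in the steps above can be summarized in a few lines of Python. The helper names here are hypothetical and exist only for illustration; the sketch also assumes the model name is split at its first dot.

```python
from pathlib import Path


def split_model_name(model_name):
    """Split '$ARCHITECTURE.$BACKBONE' at the first dot."""
    architecture, sep, backbone = model_name.partition(".")
    if not sep or not architecture or not backbone:
        raise ValueError(f"expected ARCHITECTURE.BACKBONE, got {model_name!r}")
    return architecture, backbone


def run_dir(phase, model_name):
    """Folder where experiment runs are stored."""
    return Path("runs") / phase / model_name


def prediction_dir(phase, model_name):
    """Folder where predictions are saved ($PREDICTION_FOLDER)."""
    return Path("submissions") / "prediction" / phase / model_name
```

With the example name from above, `split_model_name("deeplabv3plus.timm-efficientnet-b4")` gives the architecture `deeplabv3plus` and the backbone `timm-efficientnet-b4`.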
Warning
Before submission, please rename the json file to results.json and zip it as results.zip.
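The renaming and zipping step could be scripted with the Python standard library, for instance as below. This is a minimal sketch: the function name `package_submission` is hypothetical, and it assumes the archive should hold results.json at its top level, which is not stated here.

```python
import shutil
import zipfile
from pathlib import Path


def package_submission(json_path, out_dir):
    """Copy the submission json to results.json and zip it as results.zip."""
    json_path = Path(json_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    results_json = out_dir / "results.json"
    # Copy instead of renaming, so the original file is left untouched.
    if json_path.resolve() != results_json.resolve():
        shutil.copyfile(json_path, results_json)

    zip_path = out_dir / "results.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # arcname keeps the file at the top level of the archive (assumption).
        archive.write(results_json, arcname="results.json")
    return zip_path
```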
Tip
The ensemble method can be run using scripts/ensemble.sh. It gathers the results from all the npy files generated by scripts/predict.sh and combines them into a new folder. scripts/submission.sh can then be run on this folder to generate the submission file.
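One plausible way to combine the npy files is sketched below. The combination rule is not described here, so the sketch makes several assumptions: each npy file holds a per-pixel score map, files for the same image share a name across model folders, the maps are averaged with equal weight, and the result is saved in the same format so the submission step can read it. The function name `ensemble_predictions` is hypothetical.

```python
from pathlib import Path

import numpy as np


def ensemble_predictions(prediction_dirs, out_dir):
    """Average same-named .npy score maps from several model folders."""
    prediction_dirs = [Path(d) for d in prediction_dirs]
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Use the first folder as the reference list of image names.
    for npy_path in sorted(prediction_dirs[0].glob("*.npy")):
        maps = [np.load(folder / npy_path.name) for folder in prediction_dirs]
        # Equal-weight mean over models (an assumption, not the confirmed rule).
        np.save(out_dir / npy_path.name, np.mean(maps, axis=0))
```

Averaging scores before thresholding lets a confident model outvote an uncertain one, which hard majority voting on binary masks would not allow.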
What we have tried but did not work:
- Several state-of-the-art models: Original ViT, OneFormer, SegFormer, MaskFormer, Mask2Former, YOLOv8
- Different segmentation losses: Focal Tversky loss, Lovasz-Softmax loss
- Smoothing the boundaries of the predicted segmentation masks, since the data annotations are of low quality
What we haven't tried:
- Tuning the hyperparameters more thoroughly
- Replacing the two-layer FCN with a more complex segmentation head (for example, a Mask2Former head)