This repository contains our research work on Aerial Object Detection.
This work proposes a novel deep learning approach that optimizes the detection of objects in aerial scenes captured by UAVs. In our setup, the power-constrained drone is used only for data collection, while the computationally intensive tasks are offloaded to a GPU edge server. We first categorise current deep learning methods for aerial object detection and discuss how the task differs from general object detection. We then delineate the specific challenges involved and experimentally demonstrate the key design decisions that significantly affect the accuracy and robustness of the model. Building on these findings, we propose an optimized architecture that combines these design choices with the recent ResNeSt backbone to achieve superior performance in aerial object detection. Finally, we reflect on what we have achieved and propose several promising directions for future work to inspire further research in aerial object detection.
To train RetinaNet with a VGG16 or ResNet50 feature extractor:
python keras-retinanet/keras_retinanet/bin/train.py --gpu <gpu_id> --backbone <vgg16 | resnet50> --epochs <total_epochs> --tensorboard-dir <tensorboard_dir> --compute-val-loss --config <path_to_config> --snapshot-path <snapshot_save_dir> --random-transform --snapshot <resume_snapshot> csv <train_csv> <class_mapping_csv> --val-annotations <val_csv>
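The `csv` mode in the command above takes an annotations file and a class mapping file. The following is a minimal sketch of how such files could be generated, assuming the layout documented by keras-retinanet (one `image_path,x1,y1,x2,y2,class_name` row per box, and one `class_name,id` row per class with ids starting at 0). The function name `write_retinanet_csvs` and the output file names `train.csv` and `classes.csv` are hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def write_retinanet_csvs(boxes, out_dir):
    """Write annotation and class-mapping CSVs for keras-retinanet's csv mode.

    boxes: iterable of (image_path, x1, y1, x2, y2, class_name) tuples.
    Returns the paths of the two files written.
    NOTE: hypothetical helper; file names below are illustrative only.
    """
    boxes = list(boxes)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Assign class ids in sorted order, starting from 0.
    class_names = sorted({box[5] for box in boxes})

    ann_path = out_dir / "train.csv"
    with ann_path.open("w", newline="") as f:
        writer = csv.writer(f)
        for image_path, x1, y1, x2, y2, name in boxes:
            # Reject degenerate boxes (zero or negative width/height).
            if x2 <= x1 or y2 <= y1:
                raise ValueError(f"invalid box for {image_path}: {(x1, y1, x2, y2)}")
            writer.writerow([image_path, int(x1), int(y1), int(x2), int(y2), name])

    cls_path = out_dir / "classes.csv"
    with cls_path.open("w", newline="") as f:
        writer = csv.writer(f)
        for class_id, name in enumerate(class_names):
            writer.writerow([name, class_id])

    return ann_path, cls_path
```

The two returned paths would then stand in for `<train_csv>` and `<class_mapping_csv>`; a validation file for `--val-annotations` can be produced the same way from the validation split.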
To train RetinaNet with a ResNeSt50 feature extractor:
python detectron2-ResNeSt/tools/train_net.py --num-gpus <num_gpus> --config-file <path_to_config>
BibTeX to be uploaded soon!
- RetinaNet paper: https://arxiv.org/pdf/1708.02002.pdf
- Blog: https://blog.zenggyu.com/en/post/2018-12-05/retinanet-explained-and-demystified/
- https://towardsdatascience.com/review-retinanet-focal-loss-object-detection-38fba6afabe4
- https://towardsdatascience.com/review-fpn-feature-pyramid-network-object-detection-262fc7482610
- https://arxiv.org/pdf/1612.03144.pdf
- https://towardsdatascience.com/neural-networks-intuitions-5-anchors-and-object-detection-fc9b12120830
- https://medium.com/@andersasac/anchor-boxes-the-key-to-quality-object-detection-ddf9d612d4f9
- https://www.youtube.com/watch?v=0frKXR-2PBY
- https://blog.zenggyu.com/en/post/2019-01-07/beyond-retinanet-and-mask-r-cnn-single-shot-instance-segmentation-with-retinamask/#fnref1
- https://towardsdatascience.com/instance-segmentation-using-mask-r-cnn-7f77bdd46abd
- https://arxiv.org/pdf/1901.03353.pdf
- https://www.youtube.com/watch?v=g7z4mkfRjI4