This project implements lightweight object detection.
Backbones supported: MobileNetV3, ShuffleNetV2, GhostNet, EfficientNet
Necks supported: FPN (cnn), PAN (cnn), FPN_Slim (non-cnn), PAN_Slim (non-cnn), BiFPN
Heads supported: GFL (Generalized Focal Loss), GFL v2 (custom)
Ubuntu18.04
PyTorch 1.7
Python 3.6
Mobile devices differ in performance, so models of different capacity can be chosen to match the device. The library provides models for both low-power and high-performance mobile devices.
If the model runs on a low-power mobile device, the backbone can be ShuffleNetV2 or MobileNetV3 and the neck PAN_Slim, which is equivalent to an implementation of NanoDet.
If the model runs on a high-performance mobile device, the backbone can be EfficientNet (only the b2 and b3 variants are written in the configuration) and the neck can be BiFPN, which is equivalent to an implementation of EfficientDet.
Backbones supported: MobileNetV3, ShuffleNetV2, GhostNet, EfficientNet
Necks supported: FPN (convolution), PAN (convolution), FPN_Slim (non-convolution), PAN_Slim (non-convolution), BiFPN
Heads supported: GFL (Generalized Focal Loss), GFL v2 (custom version)
QuarkDet can combine these components in different ways, for example:
EfficientNet + BiFPN + GFL
GhostNet + PAN + GFL
GhostNet + BiFPN + GFL
MobileNetV3 + PAN/PAN_Slim + GFLv2
ShuffleNetV2 + PAN/PAN_Slim + GFL and so on.
To train a different combination, pass a different configuration file on the training command line. The config files are in the quarkdet/config folder.
The original EfficientDet implementation is
EfficientNet + BiFPN + Box/Class Head
Here the head has been changed, giving
EfficientNet + BiFPN + GFL(Generalized Focal Loss)
load_mosaic=False,
mosaic_probability=0.2,
mosaic_area=16,
QuarkDet supports mosaic data augmentation, which is set in the config file (see the sample file config/ghostnet_slim640.yml). It differs from the original mosaic augmentation: the pictures are not randomly resized. Instead, 4 pictures of the same size are combined around a fixed center. The pictures must be square and of equal size; 320 * 320, 416 * 416 and 640 * 640 are supported.
load_mosaic: whether to enable mosaic data augmentation.
mosaic_probability: the proportion of the data that is augmented with mosaic.
mosaic_area: GT bboxes whose size is less than this threshold are filtered out.
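The fixed-center mosaic and the three parameters above can be illustrated with a small sketch. This is not QuarkDet's actual code: the function names `use_mosaic` and `mosaic_boxes` are hypothetical, and the sketch assumes that the 4 tiles are placed on a canvas twice the tile size, that the canvas is then scaled by 0.5 back to the tile size, and that "size" in mosaic_area means box area in pixels. It only shows how the boxes of each tile would be shifted and filtered under those assumptions.

```python
import random


def use_mosaic(load_mosaic, mosaic_probability, rng=random):
    """Decide whether one sample gets mosaic augmentation (hypothetical helper)."""
    return load_mosaic and rng.random() < mosaic_probability


def mosaic_boxes(tile_boxes, size, mosaic_area=16):
    """Remap the GT boxes of 4 equal-size tiles onto one mosaic image.

    tile_boxes: list of 4 lists of (x1, y1, x2, y2) boxes, one list per tile,
                in the order top-left, top-right, bottom-left, bottom-right.
    size:       width and height of every tile (e.g. 320, 416 or 640).
    """
    assert len(tile_boxes) == 4
    # Fixed center at (size, size): each tile occupies exactly one quadrant.
    offsets = [(0, 0), (size, 0), (0, size), (size, size)]
    scale = 0.5  # assumption: the 2*size canvas is shrunk back to size
    kept = []
    for boxes, (ox, oy) in zip(tile_boxes, offsets):
        for x1, y1, x2, y2 in boxes:
            nx1, ny1 = (x1 + ox) * scale, (y1 + oy) * scale
            nx2, ny2 = (x2 + ox) * scale, (y2 + oy) * scale
            # Boxes smaller than the mosaic_area threshold are filtered out.
            if (nx2 - nx1) * (ny2 - ny1) >= mosaic_area:
                kept.append((nx1, ny1, nx2, ny2))
    return kept


boxes = mosaic_boxes(
    [[(0, 0, 100, 100)], [(10, 10, 16, 16)], [], [(0, 0, 320, 320)]],
    size=320,
)
print(boxes)
```

In this example the 6 * 6 box of the second tile shrinks to 3 * 3 (area 9) and is dropped, while the other two boxes are kept.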
The GFL v2 head is slightly different from the official implementation. It improves accuracy a little, but only in the decimal places.
In class QuarkDetHead(GFLHead): you can directly replace GFLHead with GFLHeadV2.
For the original GFocalV2 implementation, please refer to the GFocalV2 repository listed at the end of this document.
Config files: ghostnet_full.yml (full network version), ghostnet_slim.yml (simplified network version)
GhostNet has been streamlined as follows:
Remove all the layers in stage5 whose expansion size is equal to 960. The removed layers also include the 1 × 1 Conv2d layers whose number of output channels equals 960 and 1280, the average pooling layer and the last fully connected layer.
The network is truncated, keeping the part from the beginning up to hs2 (bn2).
For Single-GPU config
quarkdet.yml config example
device:
  gpu_ids: [0]
python tools/train.py config/quarkdet.yml
For Multi-GPU config
quarkdet.yml config example
device:
  gpu_ids: [0,1]
python -m torch.distributed.launch --nproc_per_node=2 --master_port 30001 tools/train.py config/quarkdet.yml
The demo script can be run on a video, for example:
python ./demo/demo.py 'video' --path /media/ubuntu/data/1.mp4 --config config/efficientdet.yml --model ./workspace/efficientdet/model_best/model_best.pth
When the monitored metric has not improved for n consecutive checks, the learning rate is reduced, where n is the patience value in the configuration.
lr_schedule:
  name: ReduceLROnPlateau
  mode: min
  factor: 0.1
  patience: 10
  verbose: True
  threshold: 0.00001
  threshold_mode: rel
  cooldown: 0
  min_lr: 0
  eps: 0.000000001 # 1e-09
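The effect of these settings can be sketched in plain Python. The class below is a simplified, hypothetical re-implementation of the plateau rule for mode: min and threshold_mode: rel, written for illustration only; it is not PyTorch's ReduceLROnPlateau itself. The defaults are copied from the config values above. It follows what we understand to be the PyTorch behaviour, where the reduction happens once the number of non-improving steps exceeds patience.

```python
class PlateauSchedule:
    """Simplified reduce-on-plateau rule (mode=min, threshold_mode=rel)."""

    def __init__(self, lr, factor=0.1, patience=10, threshold=1e-5,
                 cooldown=0, min_lr=0.0, eps=1e-9):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.threshold = threshold
        self.cooldown = cooldown
        self.min_lr = min_lr
        self.eps = eps
        self.best = float("inf")
        self.num_bad_epochs = 0
        self.cooldown_counter = 0

    def step(self, metric):
        """Feed one value of the monitored metric and return the current lr."""
        # "rel" threshold in "min" mode: improvement means dropping below
        # best * (1 - threshold).
        if metric < self.best * (1.0 - self.threshold):
            self.best = metric
            self.num_bad_epochs = 0
        else:
            self.num_bad_epochs += 1

        # During cooldown, bad epochs are not counted.
        if self.cooldown_counter > 0:
            self.cooldown_counter -= 1
            self.num_bad_epochs = 0

        if self.num_bad_epochs > self.patience:
            new_lr = max(self.lr * self.factor, self.min_lr)
            # Changes smaller than eps are ignored.
            if self.lr - new_lr > self.eps:
                self.lr = new_lr
            self.cooldown_counter = self.cooldown
            self.num_bad_epochs = 0
        return self.lr


# A loss that stops improving: with patience=2 the lr drops on the 4th step.
schedule = PlateauSchedule(lr=0.1, patience=2)
lrs = [schedule.step(1.0) for _ in range(4)]
print(lrs)
```

With patience: 10 as in the config, the same rule would wait for 11 non-improving steps before multiplying the learning rate by factor.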
Resolution=320 * 320 epoch = 85
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.220
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.369
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.220
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.069
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.219
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.366
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.219
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.347
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.369
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.118
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.414
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.591
Computational complexity: 0.56 GFLOPs
Number of parameters: 1.77 M
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.198
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.339
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.198
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.059
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.197
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.323
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.211
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.340
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.362
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.105
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.410
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.583
For distributed training, edit norm_cfg, which is dict(type='BN', momentum=0.01, eps=1e-3, requires_grad=True), and change type='BN' to type='SyncBN'. The code does not check whether training is distributed, so BN is hard-coded there.
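One way to avoid editing norm_cfg by hand is to build it from a flag. The helper below is a hypothetical sketch, not part of QuarkDet: `make_norm_cfg` is an invented name, and the other values are copied from the norm_cfg shown above.

```python
def make_norm_cfg(distributed):
    """Return a norm_cfg dict, using SyncBN only for distributed training."""
    norm_type = "SyncBN" if distributed else "BN"
    return dict(type=norm_type, momentum=0.01, eps=1e-3, requires_grad=True)


single_gpu_cfg = make_norm_cfg(distributed=False)
multi_gpu_cfg = make_norm_cfg(distributed=True)
print(single_gpu_cfg["type"], multi_gpu_cfg["type"])
```

In a training script the flag could come from a check such as torch.distributed.is_initialized(); where that check would be placed in QuarkDet is an assumption left open here.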
EfficientNet + BiFPN + GFL
The original uses 5 feature levels, which is reduced to 3 levels here.
Automatic learning rate, epoch = 190
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.230
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.369
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.237
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.078
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.246
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.357
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.236
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.377
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.397
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.136
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.459
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.609
Download
efficientdet-b2 download link
code:hl3o
https://github.com/huawei-noah/ghostnet
https://github.com/xiaolai-sqlai/mobilenetv3
https://github.com/RangiLyu/nanodet
https://github.com/ultralytics/yolov5
https://github.com/implus/GFocal
https://github.com/implus/GFocalV2