Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism

Benchmark

| Model | Size | Self-Distill | Pre-Train Backbone | mAP val 0.5:0.95 | Speed (fps) T4-TRT7-FP16 bs1 / bs32 | Speed (fps) T4-TRT8-FP16 bs1 / bs32 | Params (M) | FLOPs (G) | Weight |
|---|---|---|---|---|---|---|---|---|---|
| Gold-YOLO-N | 640 | | | 39.9 | 563 / 1030 | 657 / 1191 | 5.6 | 12.1 | Google Drive / cowtransfer |
| Gold-YOLO-S | 640 | | | 46.4 | 286 / 446 | 308 / 492 | 21.5 | 46.0 | Google Drive / cowtransfer |
| Gold-YOLO-M | 640 | | | 51.1 | 152 / 220 | 157 / 241 | 41.3 | 87.5 | Google Drive / cowtransfer |
| Gold-YOLO-L | 640 | | | 53.3 | 88 / 116 | 94 / 137 | 75.1 | 151.7 | Google Drive / cowtransfer |

Table Notes

  • mAP and speed are evaluated on the COCO val2017 dataset at an input resolution of 640×640.
  • Speed is measured with TensorRT 7.2 and TensorRT 8.5 on a T4 GPU.
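
Throughput at batch size 1 converts directly to per-image latency, which can be easier to compare against a latency budget. The following is a minimal sketch that uses the bs1 TensorRT 8 figures from the benchmark table; the function and dictionary names are illustrative and are not part of the repository.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# bs1 throughput on T4 with TensorRT 8 (FP16), copied from the benchmark table.
TRT8_BS1_FPS = {
    "Gold-YOLO-N": 657,
    "Gold-YOLO-S": 308,
    "Gold-YOLO-M": 157,
    "Gold-YOLO-L": 94,
}

for model, fps in TRT8_BS1_FPS.items():
    print(f"{model}: {fps_to_latency_ms(fps):.2f} ms per image")
```

The conversion only holds for bs1; at bs32 the reported fps reflects batched throughput, not the delay seen by a single image.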

Environment

  • Python requirements

    pip install -r requirements.txt
  • Data

    Prepare the COCO dataset and YOLO-format COCO labels, then specify the dataset paths in data.yaml (the training commands below pass data/coco.yaml).
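
For readers unfamiliar with the YOLO label format: each image gets a text file with one line per object, holding a class index followed by the box center and size normalized by the image dimensions. COCO annotations instead store boxes as `[x_min, y_min, width, height]` in pixels. The sketch below shows the coordinate conversion only; the function names are hypothetical, and it assumes the COCO category ids have already been mapped to contiguous class indices starting at 0.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels
    to YOLO (x_center, y_center, width, height), normalized to [0, 1]."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,
        (y_min + h / 2) / img_h,
        w / img_w,
        h / img_h,
    )


def yolo_label_line(class_index, bbox, img_w, img_h):
    """Format one object as a line of a YOLO .txt label file."""
    coords = coco_bbox_to_yolo(bbox, img_w, img_h)
    return " ".join([str(class_index)] + [f"{c:.6f}" for c in coords])


print(yolo_label_line(0, [100, 50, 200, 100], img_w=640, img_h=480))
# prints: 0 0.312500 0.208333 0.312500 0.208333
```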

Train

Gold-YOLO-N

  • Step 1: Training a base model

    Be sure to enable use_dfl mode in the config file (use_dfl=True, reg_max=16).

    python -m torch.distributed.launch --nproc_per_node 8 tools/train.py \
            --batch 128 \
            --conf configs/gold_yolo-n.py \
            --data data/coco.yaml \
            --epoch 300 \
            --fuse_ab \
            --use_syncbn \
            --device 0,1,2,3,4,5,6,7 \
            --name gold_yolo-n
  • Step 2: Self-distillation training

    Be sure to enable use_dfl mode in the config file (use_dfl=True, reg_max=16).

    python -m torch.distributed.launch --nproc_per_node 8 tools/train.py \
            --batch 128 \
            --conf configs/gold_yolo-n.py \
            --data data/coco.yaml \
            --epoch 300 \
            --device 0,1,2,3,4,5,6,7 \
            --use_syncbn \
            --distill \
            --teacher_model_path runs/train/gold_yolo-n/weights/best_ckpt.pt \
            --name gold_yolo-n
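
Both steps above require use_dfl=True with reg_max=16. As a rough illustration of what that setting implies, the sketch below assumes the head follows the usual Distribution Focal Loss decoding: each box side is predicted as a distribution over reg_max + 1 discrete bins, and the distance is the expected bin index. This is a plain-Python illustration under that assumption, not code from the repository; the function name is hypothetical.

```python
import math

REG_MAX = 16  # matches reg_max=16 in the config; gives 17 bins (0..16)


def dfl_expected_distance(logits):
    """Decode one box-side distance as the softmax-weighted mean bin index."""
    # Numerically stable softmax over the reg_max + 1 bins.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Expected bin index, in units of the feature-map stride.
    return sum(i * p for i, p in enumerate(probs))


# A flat distribution decodes to the middle of the range (about 8).
print(dfl_expected_distance([0.0] * (REG_MAX + 1)))
```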

Gold-YOLO-S/M/L

  • Step 1: Training a base model

    Be sure to enable use_dfl mode in the config file (use_dfl=True, reg_max=16).

    # For Gold-YOLO-M/L, replace gold_yolo-s with gold_yolo-m or gold_yolo-l in --conf and --name.
    python -m torch.distributed.launch --nproc_per_node 8 tools/train.py \
            --batch 256 \
            --conf configs/gold_yolo-s.py \
            --data data/coco.yaml \
            --epoch 300 \
            --fuse_ab \
            --use_syncbn \
            --device 0,1,2,3,4,5,6,7 \
            --name gold_yolo-s
  • Step 2: Self-distillation training

    Be sure to enable use_dfl mode in the config file (use_dfl=True, reg_max=16).

    # For Gold-YOLO-M/L, replace gold_yolo-s with gold_yolo-m or gold_yolo-l
    # in --conf, --teacher_model_path, and --name.
    # Use --batch 128 when distilling gold_yolo-l.
    python -m torch.distributed.launch --nproc_per_node 8 tools/train.py \
            --batch 256 \
            --conf configs/gold_yolo-s.py \
            --data data/coco.yaml \
            --epoch 300 \
            --device 0,1,2,3,4,5,6,7 \
            --use_syncbn \
            --distill \
            --teacher_model_path runs/train/gold_yolo-s/weights/best_ckpt.pt \
            --name gold_yolo-s
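
The four commands above differ only in model variant, batch size, and the base versus distillation flags. The helper below is a hypothetical convenience sketch (not part of the repository) that assembles the same argument lists for 8 GPUs. It assumes the step 1 checkpoint is saved under runs/train/<name>, where <name> is the value passed to --name.

```python
import shlex

NUM_GPUS = 8  # the commands in this README are written for 8 GPUs


def build_train_command(variant: str, distill: bool = False) -> list:
    """Assemble the README training command for variant n, s, m, or l."""
    if variant not in ("n", "s", "m", "l"):
        raise ValueError(f"unknown variant: {variant!r}")
    name = f"gold_yolo-{variant}"
    # README batch sizes: 128 for N, 256 for S/M/L, 128 when distilling L.
    batch = 128 if variant == "n" or (variant == "l" and distill) else 256
    cmd = [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node", str(NUM_GPUS), "tools/train.py",
        "--batch", str(batch),
        "--conf", f"configs/{name}.py",
        "--data", "data/coco.yaml",
        "--epoch", "300",
        "--use_syncbn",
        "--device", ",".join(str(i) for i in range(NUM_GPUS)),
        "--name", name,
    ]
    if distill:
        # Assumption: step 1 wrote its best checkpoint under runs/train/<name>.
        cmd += ["--distill", "--teacher_model_path",
                f"runs/train/{name}/weights/best_ckpt.pt"]
    else:
        # --fuse_ab appears only in the step 1 (base model) commands.
        cmd.append("--fuse_ab")
    return cmd


# Print the step 2 command for Gold-YOLO-L as a single shell line.
print(shlex.join(build_train_command("l", distill=True)))
```

Building the command as a list keeps each flag and value separate, so it can be passed to `subprocess.run` without shell quoting issues.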

Evaluation

python tools/eval.py --data data/coco.yaml --batch 32 --weights weights/Gold_s_pre_dist.pt --task val --reproduce_640_eval

Test speed

Please refer to Test speed

Acknowledgement

The implementation is based on YOLOv6, with some components borrowed from Topformer. Thanks to the authors for their open-source code.