The key idea of RepVGG is structural re-parameterization: the model is trained with a multi-branch architecture and then converted into a single-branch architecture for inference. Figure 1 shows the basic architecture of RepVGG. Different values of the width multipliers a and b yield different RepVGG variants. On the ImageNet-1K dataset, RepVGG achieves better performance with fewer parameters than previous methods. [1]
Figure 1. Architecture of RepVGG [1]
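The branch merging behind re-parameterization can be illustrated with a small sketch. It is a simplified, single-channel toy in plain Python, not the MindCV implementation: it assumes a block with a 3x3 branch, a 1x1 branch, and an identity branch, and it leaves out the BatchNorm layers that the paper [1] folds into each branch before merging. The helper names `conv2d_same`, `fuse_branches`, and `multi_branch` are hypothetical.

```python
def conv2d_same(x, k):
    """Single-channel 2D cross-correlation with zero padding (odd square kernel)."""
    n = len(k)
    p = n // 2
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in range(n):
                for dj in range(n):
                    r, c = i + di - p, j + dj - p
                    if 0 <= r < h and 0 <= c < w:
                        s += k[di][dj] * x[r][c]
            out[i][j] = s
    return out


def multi_branch(x, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch + identity branch."""
    y3 = conv2d_same(x, k3)
    h, w = len(x), len(x[0])
    return [[y3[i][j] + k1 * x[i][j] + x[i][j] for j in range(w)] for i in range(h)]


def fuse_branches(k3, k1):
    """Inference-time form: fold the 1x1 weight and the identity into the 3x3 centre tap."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1.0  # a 1x1 conv and the identity both act only on the centre pixel
    return fused


# Illustrative values only (assumptions, not taken from any trained model).
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.2, 0.1, -0.3]]
k1 = 0.25

y_train = multi_branch(x, k3, k1)
y_infer = conv2d_same(x, fuse_branches(k3, k1))
max_diff = max(abs(a - b) for ra, rb in zip(y_train, y_infer) for a, b in zip(ra, rb))
print("max difference between the two forms:", max_diff)
```

Because convolution is linear, the three branches sum to a single 3x3 kernel, so the fused model computes the same output with one convolution per block.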
mindspore | ascend driver | firmware | cann toolkit/kernel |
---|---|---|---|
2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
Please refer to the installation instructions in MindCV.
Please download the ImageNet-1K dataset for model training and validation.
- Distributed Training
The pre-defined training recipes make it easy to reproduce the reported results. For distributed training on multiple Ascend 910 devices, run:
# distributed training on multiple NPU devices
msrun --bind_core=True --worker_num 8 python train.py --config configs/repvgg/repvgg_a1_ascend.yaml --data_dir /path/to/imagenet
For a detailed description of all hyper-parameters, please refer to config.py.
Note: The global batch size (batch_size x num_devices) is an important hyper-parameter. To reproduce the results, either keep the global batch size unchanged or scale the learning rate linearly with the new global batch size.
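As a rough illustration of the linear scaling rule in the note above, the sketch below computes an adjusted learning rate for a new device count. The function name `scale_lr` and the base learning rate of 0.1 are assumptions for illustration, not values from the recipe; the reference global batch size of 256 corresponds to the 8 cards x 32 batch size listed in the results tables.

```python
def scale_lr(base_lr, base_global_batch, batch_size, num_devices):
    """Scale the learning rate in proportion to the global batch size."""
    new_global_batch = batch_size * num_devices
    return base_lr * new_global_batch / base_global_batch


# Hypothetical base learning rate; check the YAML recipe for the real value.
base_lr = 0.1
base_global_batch = 32 * 8  # 8 cards x batch size 32, as in the results tables

# Training on 4 devices with the same per-device batch size halves the global batch.
new_lr = scale_lr(base_lr, base_global_batch, batch_size=32, num_devices=4)
print(new_lr)
```

The scaled value would then replace the learning rate in the YAML recipe, or be passed as a command-line override if the training script supports one.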
- Standalone Training
If you want to train or fine-tune the model on a smaller dataset without distributed training, please run:
# standalone training on single NPU device
python train.py --config configs/repvgg/repvgg_a1_ascend.yaml --data_dir /path/to/dataset --distribute False
To validate the accuracy of the trained model, use validate.py and pass the checkpoint path with --ckpt_path.
python validate.py -c configs/repvgg/repvgg_a1_ascend.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/ckpt
Our reproduced model performance on ImageNet-1K is reported as follows.
Experiments were run on Ascend 910* with MindSpore 2.3.1 in graph mode.
model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s | acc@top1 | acc@top5 | recipe | weight |
---|---|---|---|---|---|---|---|---|---|---|---|---|
repvgg_a0 | 9.13 | 8 | 32 | 224x224 | O2 | 76s | 24.12 | 10613.60 | 72.29 | 90.78 | yaml | weights |
repvgg_a1 | 14.12 | 8 | 32 | 224x224 | O2 | 81s | 28.29 | 9096.13 | 73.68 | 91.51 | yaml | weights |
Experiments were run on Ascend 910 with MindSpore 2.3.1 in graph mode.
model name | params(M) | cards | batch size | resolution | jit level | graph compile | ms/step | img/s | acc@top1 | acc@top5 | recipe | weight |
---|---|---|---|---|---|---|---|---|---|---|---|---|
repvgg_a0 | 9.13 | 8 | 32 | 224x224 | O2 | 50s | 20.58 | 12439.26 | 72.19 | 90.75 | yaml | weights |
repvgg_a1 | 14.12 | 8 | 32 | 224x224 | O2 | 29s | 20.70 | 12367.15 | 74.19 | 91.89 | yaml | weights |
- top-1 and top-5: Accuracy reported on the validation set of ImageNet-1K.
[1] Ding X, Zhang X, Ma N, et al. RepVGG: Making VGG-style ConvNets great again[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 13733-13742.