See open-mmlab#1665
This issue was discovered many months ago, so it may need further investigation to determine whether it still exists, and in which scenarios.
A similar issue exists for model_wrapper_cfg, so there are also some changes in mmengine.runner.runner.wrap_model.
optim_wrapper = dict(
    type=DeepSpeedOptimWrapper,
    optimizer=dict(type=AdamW, lr=lr, weight_decay=weight_decay),
    accumulative_counts=grad_accumulation,
    constructor=dict(type=DefaultOptimWrapperConstructor),
)
Related source file: mmengine/optim/optimizer/builder.py