- ymir 1.1.0 docker images are compatible with ymir 1.0.0
docker pull youdaoyzbx/ymir-executor:ymir1.0.0-detectron2-tmi
docker build -t ymir/ymir-executor:ymir1.1.0-cuda111-detectron2-tmi . -f cu111.dockerfile --build-arg SERVER_MODE=dev --build-arg YMIR=1.1.0
- Small batch sizes (=2) combined with large learning rates (>0.001) are not supported; training fails with an error such as:
FloatingPointError: Loss became infinite or NaN at iteration=902!
loss_dict = {'loss_cls': nan, 'loss_box_reg': nan}
- add folder `ymir` for utils, train, infer and mining
- modify `detectron2/engine/defaults.py` `default_writers` to change the tensorboard logging directory
- modify `detectron2/engine/hooks.py` `EvalHook` to write `monitor.txt` and `result.yaml`
- modify `tools/train_net.py` to modify the training configuration
- modify `detectron2/evaluation/coco_evaluation.py` to save `EVAL_TMP_FILE`
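
The batch size and learning rate limitation noted above could be checked before training starts. The following is a minimal sketch, not part of this image or of detectron2: the function name `check_batch_and_lr` and the warning behaviour are hypothetical, and the thresholds are simply the values quoted in the note (batch size 2, learning rate 0.001).

```python
import warnings

# Thresholds copied from the limitation note; everything else here
# (names, warning text, return convention) is an illustrative assumption.
SMALL_BATCH_SIZE = 2
MAX_SAFE_LR = 0.001


def check_batch_and_lr(batch_size: int, learning_rate: float) -> bool:
    """Return True if the combination looks safe, False otherwise.

    A warning is emitted when a small batch size is paired with a
    learning rate above the threshold, since that pairing is the one
    reported to produce NaN losses.
    """
    risky = batch_size <= SMALL_BATCH_SIZE and learning_rate > MAX_SAFE_LR
    if risky:
        warnings.warn(
            f"batch_size={batch_size} with learning_rate={learning_rate} "
            "may lead to an infinite or NaN loss; consider lowering the "
            "learning rate or increasing the batch size."
        )
    return not risky
```

In a detectron2 config these two values would typically come from `cfg.SOLVER.IMS_PER_BATCH` and `cfg.SOLVER.BASE_LR`, so a check like this could run in `tools/train_net.py` once the config is built.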