Details of Testing & Training scripts
Junyong Lee edited this page Mar 15, 2022
CUDA_VISIBLE_DEVICES=0 python run.py --mode [mode] --config [config] --data RealMCVSR --data_offset [data_offset] --output_offset [output_offset]
# e.g., CUDA_VISIBLE_DEVICES=0 python run.py --mode RefVSR_MFID --config config_RefVSR_MFID --data RealMCVSR --data_offset /data --output_offset ./result
- `--mode`: The name of the model to test.
- `--config`: The name of a config file located at `./config/[config].py`. If it is not specified, the config file used for training the model is loaded automatically. Default: `None`.
- `--data`: The name of the dataset for evaluation. Default: `RealMCVSR`.
  - The data structure can be modified through the function `set_data_path(..)` in `./configs/config.py`.
- `-ckpt_name`: Loads the checkpoint with the given name under `[LOG_ROOT]/RefVSR_CVPR2022/[mode]/checkpoint/train/epoch/ckpt/` (e.g., `python run.py --mode RefVSR --data RealMCVSR --ckpt_name RefVSR_00100.pytorch`).
- `-ckpt_abs_name`: Loads the checkpoint at the given absolute path (e.g., `python run.py --mode RefVSR --data RealMCVSR --ckpt_abs_name ./ckpt/RefVSR.pytorch`).
- `-ckpt_epoch`: Loads the checkpoint of the specified epoch (e.g., `python run.py --mode RefVSR --data RealMCVSR --ckpt_epoch 100`).
- `-ckpt_sc`: Loads the checkpoint with the best validation score (e.g., `python run.py --mode RefVSR --data RealMCVSR -ckpt_sc`).
- `-vid_name`: Evaluates only the specified videos (e.g., `python run.py --mode RefVSR --data RealMCVSR -ckpt_sc -vid_name 0024 0074 0121`).
- `-eval_mode`: The evaluation mode, one of `quan_qual` | `FOV` | `conf` (e.g., `python run.py --mode RefVSR --data RealMCVSR -ckpt_sc --eval_mode quan_qual`). Default: `quan_qual`.
- `-quantitative_only`: Computes quantitative measures (PSNR and SSIM) only. Valid only if `-eval_mode` is `quan_qual` (e.g., `python run.py --mode RefVSR --data RealMCVSR -ckpt_sc -quantitative_only`). Default: `False`.
- `-qualitative_only`: Saves qualitative results only. Valid only if `-eval_mode` is `quan_qual` or `FOV` (e.g., `python run.py --mode RefVSR --data RealMCVSR -ckpt_sc -is_quan -qualitative_only`). Default: `False`.
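The checkpoint location and the flag constraints described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `checkpoint_dir` and `check_eval_flags` are hypothetical helpers, not functions from the repository, and `/logs` stands in for whatever `[LOG_ROOT]` is configured to be. Whether `run.py` itself rejects the invalid combinations is not stated on this page.

```python
from pathlib import PurePosixPath

# Hypothetical helper (not part of the repository): rebuilds the directory
# that -ckpt_name is documented to search, given [LOG_ROOT] and [mode].
def checkpoint_dir(log_root: str, mode: str) -> PurePosixPath:
    return (PurePosixPath(log_root) / "RefVSR_CVPR2022" / mode
            / "checkpoint" / "train" / "epoch" / "ckpt")

# Hypothetical helper: lists flag combinations that the option
# descriptions mark as not valid.
def check_eval_flags(eval_mode: str = "quan_qual",
                     quantitative_only: bool = False,
                     qualitative_only: bool = False) -> list:
    problems = []
    if quantitative_only and eval_mode != "quan_qual":
        problems.append("-quantitative_only is valid only with eval_mode quan_qual")
    if qualitative_only and eval_mode not in ("quan_qual", "FOV"):
        problems.append("-qualitative_only is valid only with eval_mode quan_qual or FOV")
    return problems

if __name__ == "__main__":
    # "/logs" is an assumed value for [LOG_ROOT].
    print(checkpoint_dir("/logs", "RefVSR") / "RefVSR_00100.pytorch")
    print(check_eval_flags(eval_mode="conf", quantitative_only=True))
```

A check like this could be run before launching a long evaluation, so that a mistyped mode or an unsupported flag pairing is caught early.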
# multi GPU (with DistributedDataParallel) example
CUDA_VISIBLE_DEVICES=0,1,2,3 python -B -m torch.distributed.launch --nproc_per_node=4 --master_port=9000 run.py \
--is_train \
--mode RefVSR_MFID \
--config config_RefVSR_MFID \
--data RealMCVSR \
-b 1 \
-th 8 \
-dl \
-ss \
-dist
# resuming example 1 (the trainer loads the checkpoint and state (e.g., learning rate, optimizer parameters) saved after epoch 100; training resumes from epoch 101)
CUDA_VISIBLE_DEVICES=0,1,2,3 python -B -m torch.distributed.launch --nproc_per_node=4 --master_port=9000 run.py \
... \
-th 8 \
-r 100 \
-ss \
-dist
# resuming example 2 (the trainer loads only the checkpoint given by an absolute path; needed when fine-tuning a model for the adaptation stage)
CUDA_VISIBLE_DEVICES=0,1,2,3 python -B -m torch.distributed.launch --nproc_per_node=4 --master_port=9000 run.py \
... \
-th 8 \
-ra ./ckpt/RefVSR_MFID.pytorch \
-ss \
-dist
# single GPU (with DataParallel) example
CUDA_VISIBLE_DEVICES=0 python -B run.py \
... \
-ss
# For PyTorch >= 1.10.x (especially when running the small model using PyTorch AMP)
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 --master_port=9000 run.py \
... \
- `--is_train`: If specified, `run.py` trains the network. Default: `False`
- `--mode`: The name of the model to train. A logging folder named after `[mode]` is created at `[LOG_ROOT]/RefVSR_CVPR2022/[mode]/`. Default: `RefVSR`
- `--config`: The name of a config file located at `./config/[config].py`. Default: `None`, and the default should not be changed.
- `--trainer`: The name of a trainer file located at `./models/trainers/[trainer].py`. Default: ``
- `--network`: The name of a network file located at `./models/archs/[network].py`. Default: ``
- `-b`, `--batch_size`: The batch size. With multiple GPUs (`DistributedDataParallel`), the total batch size is `nproc_per_node * b`. Default: 8
- `-th`, `--thread_num`: The number of threads (`num_workers`) for the data loader. Default: 8
- `-dl`, `--delete_log`: Whether to delete the logs under `[mode]` (i.e., `[LOG_ROOT]/RefVSR_CVPR2022/[mode]/*`). This option takes effect only when `--is_train` is specified. Default: `False`
- `-r`, `--resume`: Resumes training from the checkpoint saved at the specified epoch (e.g., `-r 100`). Note that `-dl` should not be specified with this option. Default: `None`
- `-ra`, `--resume_ab`: Resumes training from the checkpoint given by an absolute path (e.g., `./ckpt/RefVSR_MFID.pytorch`). Note that `-dl` should not be specified with this option. Default: `None`
- `-ss`, `--save_sample`: Saves sample images for both training and testing. Images are saved in `[LOG_ROOT]/RefVSR_CVPR2022/[mode]/sample/`. Default: `False`
- `-dist`: Enables multi-processing with `DistributedDataParallel`. Default: `False`
- `--is_crop_valid`: Crops frames of the validation set during the training phase, mainly to avoid out-of-memory issues. Default: `False`
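The batch-size rule and the notes on `-dl` above can be checked with a small Python sketch. The helper names `effective_batch_size` and `check_train_flags` are hypothetical and exist only for illustration; they are not taken from `run.py`, and the script may or may not perform such checks itself.

```python
from typing import List, Optional

# Hypothetical helper: total batch size under DistributedDataParallel,
# following the documented rule nproc_per_node * b.
def effective_batch_size(nproc_per_node: int, batch_size: int) -> int:
    return nproc_per_node * batch_size

# Hypothetical helper: reports flag combinations that the option
# descriptions advise against.
def check_train_flags(is_train: bool,
                      delete_log: bool = False,
                      resume: Optional[int] = None,
                      resume_ab: Optional[str] = None) -> List[str]:
    problems = []
    if delete_log and not is_train:
        problems.append("-dl takes effect only when --is_train is specified")
    if delete_log and resume is not None:
        problems.append("-dl should not be specified together with -r")
    if delete_log and resume_ab is not None:
        problems.append("-dl should not be specified together with -ra")
    return problems

if __name__ == "__main__":
    # The multi-GPU example uses --nproc_per_node=4 with -b 1.
    print(effective_batch_size(4, 1))
    # Resuming from epoch 100 while also deleting logs is flagged.
    print(check_train_flags(is_train=True, delete_log=True, resume=100))
```

With the default `-b 8` and four processes, the same rule gives a total batch size of 32, which is worth keeping in mind when adjusting the learning rate or memory budget.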