We have used a DeepLabv3+ semantic segmentation model trained on eMARG-15k images (Good/Bad).
Specify the model architecture with '--model ARCH_NAME' and set the output stride with '--output_stride OUTPUT_STRIDE'. The available architectures are listed below; a code-level sketch follows the table.
DeepLabV3 | DeepLabV3+ |
---|---|
deeplabv3_resnet50 | deeplabv3plus_resnet50 |
deeplabv3_resnet101 | deeplabv3plus_resnet101 |
deeplabv3_mobilenet | deeplabv3plus_mobilenet |
deeplabv3_hrnetv2_48 | deeplabv3plus_hrnetv2_48 |
deeplabv3_hrnetv2_32 | deeplabv3plus_hrnetv2_32 |
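In code, the same names select the model constructor. A minimal sketch, assuming this repository keeps the upstream DeepLabV3Plus-Pytorch layout in which each architecture in the table is exposed through network.modeling (the class count here is a placeholder, not the eMARG value):

```python
import network  # assumption: the repo's network package, as in upstream DeepLabV3Plus-Pytorch

# Look up the constructor by the same string passed to '--model'
# and set the output stride as with '--output_stride'.
model = network.modeling.__dict__['deeplabv3plus_mobilenet'](
    num_classes=19,    # placeholder; use the number of classes in your dataset
    output_stride=16,
)
```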
All pretrained model checkpoints: Drive
Load a trained checkpoint and colorize the predictions ('model', 'images', and 'val_dst' are assumed to be constructed as in main.py):

```python
import torch
from PIL import Image

model.load_state_dict(torch.load(CKPT_PATH)['model_state'])
model.eval()

with torch.no_grad():
    outputs = model(images)                       # (N, C, H, W) logits
preds = outputs.max(1)[1].detach().cpu().numpy()  # (N, H, W) class indices
colorized_preds = val_dst.decode_target(preds).astype('uint8')  # RGB images, (N, H, W, 3), ranged 0~255, numpy array
# Do whatever you like here with the colorized segmentation maps
colorized_preds = Image.fromarray(colorized_preds[0])  # to PIL Image
```
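The snippet above assumes 'images' is already a normalized batch tensor. A minimal sketch of that preprocessing, assuming the usual ImageNet normalization for these backbones (the input path is hypothetical):

```python
from PIL import Image
from torchvision import transforms

# Build a (1, 3, H, W) float batch from one image file.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # assumed ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])
img = Image.open('path/to/image.png').convert('RGB')   # hypothetical input path
images = transform(img).unsqueeze(0)
```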
Train on eMARG (organized in the Cityscapes format):

python main.py --model deeplabv3plus_mobilenet --dataset cityscapes --enable_vis --vis_port 28333 --gpu_id 0 --lr 0.1 --crop_size 768 --batch_size 16 --output_stride 16 --data_root ./datasets/data/eMARG

python main.py --model deeplabv3plus_resnet101 --dataset cityscapes --enable_vis --vis_port 28333 --gpu_id 0 --lr 0.1 --crop_size 768 --batch_size 16 --output_stride 16 --data_root ./datasets/data/eMARG
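These commands pass '--enable_vis', which expects a running visdom server on the port given by '--vis_port'. Assuming the visdom package is installed, one can be started with:

visdom -port 28333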
Test the trained model; prediction samples will be saved at ./results when '--save_val_results' is passed:

python main.py --model deeplabv3plus_mobilenet --enable_vis --vis_port 28333 --gpu_id 0 --year 2012_aug --crop_val --lr 0.01 --crop_size 513 --batch_size 16 --output_stride 16 --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --test_only --save_val_results
Single image:
python predict.py --input datasets/data/eMARG/leftImg8bit/train/city0/PE-AR-7382-157_2_leftImg8bit --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results
Image folder:
python predict.py --input datasets/data/eMARG/leftImg8bit/train/city0 --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results
Training: 768x768 random crop
Validation: 512x384
Model | Batch Size | mIoU | Overall Accuracy | Mean Accuracy | Learning Rate | Checkpoint |
---|---|---|---|---|---|---|
DeepLabV3Plus-MobileNet | 4 | 0.558 | 0.896 | 0.694 | 0.01 | Download |
DeepLabV3Plus-ResNet101 | 4 | 0.600 | 0.854 | 0.741 | 0.01 | Download |
We have used the segmentation backbone of the DeepLabv3+ model pre-trained on eMARG-15k (Good/Bad) and extended it for binary classification by adding a simple Conv + FC head; a hedged sketch of such a head is given below.
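A minimal sketch of the "Conv + FC" binary-classification head described above; the layer widths and the feature-channel count are illustrative assumptions, not the exact values used in this repository:

```python
import torch.nn as nn

class RoadQualityClassifier(nn.Module):
    def __init__(self, backbone, feat_channels=256):
        super().__init__()
        self.backbone = backbone                      # pre-trained DeepLabv3+ backbone
        self.conv = nn.Sequential(
            nn.Conv2d(feat_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),                  # global pooling to (N, 64, 1, 1)
        )
        self.fc = nn.Linear(64, 2)                    # two classes: Good / Bad

    def forward(self, x):
        feats = self.backbone(x)                      # assumed (N, feat_channels, h, w) feature map
        return self.fc(self.conv(feats).flatten(1))   # (N, 2) logits
```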
The same architectures listed in the table above are available here via '--model ARCH_NAME'.
All pretrained model checkpoints: Drive
```python
# Load a classification checkpoint exactly as above
model.load_state_dict(torch.load(CKPT_PATH)['model_state'])
```
Prediction, training, and testing use the same commands and crop settings as in the segmentation section above.
Model | Batch Size | Accuracy | Precision | Recall | F1-score | Checkpoint |
---|---|---|---|---|---|---|
DeepLabV3Plus-ResNet101 | 4 | 0.884 | 0.8618 | 0.915 | 0.887 | Download |
DeepLabV3Plus-MobileNet | 8 | 0.869 | 0.841 | 0.908 | 0.874 | Download |
1. Download eMARG and extract it like the Cityscapes dataset, in the layout 'datasets/data/eMARG':
/datasets
/data
/eMARG
/gtFine
/leftImg8bit
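For reference, a hedged example of the Cityscapes-style naming expected inside those folders (the image name is taken from the prediction command above; the label suffix follows the Cityscapes convention and is an assumption here):

datasets/data/eMARG/leftImg8bit/train/city0/PE-AR-7382-157_2_leftImg8bit.png
datasets/data/eMARG/gtFine/train/city0/PE-AR-7382-157_2_gtFine_labelIds.png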
2. Install the dependencies:

pip install -r requirements.txt
[1] Rethinking Atrous Convolution for Semantic Image Segmentation (arXiv:1706.05587)
[2] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation (arXiv:1802.02611)