
Binary classification using DeepLabV3+ on the eMARG (custom) dataset.


"eMARG" Automated Road Quality Inspection PoC

I. Semantic Segmentation

We use the DeepLabv3+ semantic segmentation model, trained on the eMARG-15k images (Good/Bad).

Quick Start

1. Available Architectures


Specify the model architecture with '--model ARCH_NAME' and set the output stride using '--output_stride OUTPUT_STRIDE'.

DeepLabV3               DeepLabV3+
deeplabv3_resnet50      deeplabv3plus_resnet50
deeplabv3_resnet101     deeplabv3plus_resnet101
deeplabv3_mobilenet     deeplabv3plus_mobilenet
deeplabv3_hrnetv2_48    deeplabv3plus_hrnetv2_48
deeplabv3_hrnetv2_32    deeplabv3plus_hrnetv2_32
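
As a rough illustration of what '--output_stride' controls: the output stride is the ratio between the input resolution and the resolution of the encoder's final feature map. The sketch below assumes every downsampling stage rounds up, as padded stride-2 convolutions typically do. `feature_map_size` is a hypothetical helper name used only for illustration, not part of this repository.

```python
import math


def feature_map_size(input_size: int, output_stride: int) -> int:
    """Approximate side length of the encoder feature map for a square input.

    Assumption: each stride-2 stage rounds up, so the total reduction
    behaves like ceil(input_size / output_stride).
    """
    return math.ceil(input_size / output_stride)


# Crop sizes that appear in the commands of this README.
for crop in (768, 513):
    for stride in (8, 16):
        print(f"crop={crop} output_stride={stride} -> {feature_map_size(crop, stride)}")
```

A smaller output stride keeps a denser feature map, which usually costs more memory and compute per image.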

All pretrained model checkpoints: Drive

2. Load the pretrained model:

model.load_state_dict(torch.load(CKPT_PATH)['model_state'])
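
The loading line above expects the checkpoint to be a dictionary with a 'model_state' entry. The following is a minimal sketch of that round trip, assuming a small stand-in network instead of a real DeepLab model and an in-memory buffer instead of CKPT_PATH. `build_stand_in` is a hypothetical helper, not part of this repository.

```python
import io

import torch
import torch.nn as nn


def build_stand_in() -> nn.Module:
    """Tiny placeholder network; a real run would build the --model architecture."""
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv2d(8, 2, kernel_size=1),
    )


model = build_stand_in()

# Save in the layout the loading line expects: a dict holding "model_state".
buffer = io.BytesIO()
torch.save({"model_state": model.state_dict()}, buffer)
buffer.seek(0)

# map_location="cpu" lets a checkpoint written on a GPU load on a CPU-only machine.
checkpoint = torch.load(buffer, map_location="cpu")

restored = build_stand_in()
result = restored.load_state_dict(checkpoint["model_state"])
restored.eval()  # switch to inference mode before predicting

print(result.missing_keys, result.unexpected_keys)
```

If either list is non-empty, the checkpoint and the chosen architecture probably do not match.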

3. Visualize segmentation outputs:

from PIL import Image

outputs = model(images)  # raw class scores, shape (N, C, H, W)
preds = outputs.max(1)[1].detach().cpu().numpy()  # per-pixel class index, shape (N, H, W)
colorized_preds = val_dst.decode_target(preds).astype('uint8')  # RGB images, shape (N, H, W, 3), values 0-255, numpy array
# Do whatever you like here with the colorized segmentation maps
colorized_preds = Image.fromarray(colorized_preds[0])  # first image of the batch as a PIL Image
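
To make the two steps above concrete (arg-max over the class axis, then mapping class indices to colors), here is a NumPy-only sketch. The two-color `PALETTE` and the `colorize` helper are assumptions for illustration. The colors actually used by `decode_target` depend on the dataset class in this repository.

```python
import numpy as np

# Hypothetical palette: class 0 -> black, class 1 -> red.
PALETTE = np.array([[0, 0, 0], [255, 0, 0]], dtype=np.uint8)


def colorize(logits: np.ndarray) -> np.ndarray:
    """Turn (N, C, H, W) class scores into (N, H, W, 3) uint8 RGB images."""
    preds = logits.argmax(axis=1)  # (N, H, W) class index per pixel
    return PALETTE[preds]          # indexing the palette appends the RGB axis


rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 2, 4, 6))  # batch of 2, 2 classes, 4x6 pixels
colored = colorize(logits)
print(colored.shape, colored.dtype)
```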

4. Train your model on eMARG in the same way as on Cityscapes (the commands below pass '--dataset cityscapes').

python --model deeplabv3plus_mobilenet --dataset cityscapes --enable_vis --vis_port 28333 --gpu_id 0  --lr 0.1  --crop_size 768 --batch_size 16 --output_stride 16 --data_root ./datasets/data/eMARG
python --model deeplabv3plus_resnet101 --dataset cityscapes --enable_vis --vis_port 28333 --gpu_id 0  --lr 0.1  --crop_size 768 --batch_size 16 --output_stride 16 --data_root ./datasets/data/eMARG 

5. Testing

Results will be saved at ./results.

python --model deeplabv3plus_mobilenet --enable_vis --vis_port 28333 --gpu_id 0 --year 2012_aug --crop_val --lr 0.01 --crop_size 513 --batch_size 16 --output_stride 16 --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --test_only --save_val_results

6. Prediction

Single image:

python --input datasets/data/eMARG/leftImg8bit/train/city0/PE-AR-7382-157_2_leftImg8bit  --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results

Image folder:

python --input datasets/data/eMARG/leftImg8bit/train/city0  --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results


1. Performance on eMARG (6 classes, 512 x 384)

Training: 768x768 random crop
Validation: 512x384

Model                    Batch Size  mIoU   Overall_Accuracy  Mean_Accuracy  lr    checkpoint_link
DeepLabV3Plus-MobileNet  4           0.558  0.896             0.694          0.01  Download
DeepLabV3Plus-ResNet101  4           0.600  0.854             0.741          0.01  Download
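
For readers unfamiliar with the three columns, the sketch below shows one common way to derive mIoU, overall accuracy and mean accuracy from a confusion matrix. It is an illustration under assumptions, not necessarily the exact evaluation code behind the table. The helper names and the toy label arrays are made up.

```python
import numpy as np


def confusion_matrix(target, pred, num_classes):
    """Rows index ground-truth classes, columns index predicted classes."""
    target = np.asarray(target).ravel()
    pred = np.asarray(pred).ravel()
    index = num_classes * target + pred
    counts = np.bincount(index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_scores(hist):
    hist = hist.astype(np.float64)
    true_positive = np.diag(hist)
    # Classes absent from the ground truth give 0/0; nanmean skips them.
    with np.errstate(divide="ignore", invalid="ignore"):
        per_class_acc = true_positive / hist.sum(axis=1)
        iou = true_positive / (hist.sum(axis=1) + hist.sum(axis=0) - true_positive)
    return {
        "overall_accuracy": true_positive.sum() / hist.sum(),
        "mean_accuracy": np.nanmean(per_class_acc),
        "miou": np.nanmean(iou),
    }


# Toy 2x4 label maps with two classes (made-up values).
target = np.array([[0, 0, 1, 1], [0, 1, 1, 1]])
pred = np.array([[0, 1, 1, 1], [0, 0, 1, 0]])
scores = segmentation_scores(confusion_matrix(target, pred, num_classes=2))
print(scores)
```

Because IoU also penalizes false positives for each class, mIoU is usually lower than overall accuracy, which matches the pattern in the table above.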

Segmentation Results on Cityscapes (DeepLabv3Plus-ResNet101)

Figure columns: Image, Overlay, Prediction, Target.

II. Image Classification

We take the segmentation backbone of the DeepLabv3+ model pre-trained on eMARG-15k (Good/Bad) and extend it for binary classification by adding a simple Conv + FC layer combination.
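
One way such a Conv + FC head could be attached is sketched below. This is an assumption-laden illustration, not the repository's actual architecture: the layer sizes, the class name `BinaryClassifier`, and the toy backbone are all hypothetical, and which of Good/Bad counts as the positive class is not specified here.

```python
import torch
import torch.nn as nn


class BinaryClassifier(nn.Module):
    """Backbone features -> Conv -> global average pooling -> FC -> one logit."""

    def __init__(self, backbone: nn.Module, backbone_channels: int, hidden_channels: int = 256):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Sequential(
            nn.Conv2d(backbone_channels, hidden_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(hidden_channels),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # collapse the spatial dimensions
            nn.Flatten(),
            nn.Linear(hidden_channels, 1),
        )

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        features = self.backbone(images)
        return self.head(features).squeeze(1)  # shape (N,)


# Stand-in backbone; a real setup would reuse the pre-trained segmentation backbone.
toy_backbone = nn.Sequential(nn.Conv2d(3, 32, kernel_size=3, stride=4, padding=1), nn.ReLU())
classifier = BinaryClassifier(toy_backbone, backbone_channels=32).eval()

with torch.no_grad():
    logits = classifier(torch.randn(2, 3, 64, 64))
probs = torch.sigmoid(logits)  # probability of the positive class
print(probs.shape)
```

A single-logit output pairs naturally with `nn.BCEWithLogitsLoss` during fine-tuning. A two-logit output with cross-entropy would be an equally valid choice.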

Quick Start

Architecture of DeeplabV3+ Fine-tuned for Binary Classification.


DeepLabV3               DeepLabV3+
deeplabv3_resnet50      deeplabv3plus_resnet50
deeplabv3_resnet101     deeplabv3plus_resnet101
deeplabv3_mobilenet     deeplabv3plus_mobilenet
deeplabv3_hrnetv2_48    deeplabv3plus_hrnetv2_48
deeplabv3_hrnetv2_32    deeplabv3plus_hrnetv2_32

All pretrained model checkpoints: Drive

1. Load the pretrained model:

model.load_state_dict(torch.load(CKPT_PATH)['model_state'])

2. Prediction

Single image:

python --input datasets/data/eMARG/leftImg8bit/train/city0/PE-AR-7382-157_2_leftImg8bit  --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results

Image folder:

python --input datasets/data/eMARG/leftImg8bit/train/city0  --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results

3. Train your model on eMARG in the same way as on Cityscapes (the commands below pass '--dataset cityscapes').

python --model deeplabv3plus_mobilenet --dataset cityscapes --enable_vis --vis_port 28333 --gpu_id 0  --lr 0.1  --crop_size 768 --batch_size 16 --output_stride 16 --data_root ./datasets/data/eMARG
python --model deeplabv3plus_resnet101 --dataset cityscapes --enable_vis --vis_port 28333 --gpu_id 0  --lr 0.1  --crop_size 768 --batch_size 16 --output_stride 16 --data_root ./datasets/data/eMARG 

4. Testing

Results will be saved at ./results.

python --model deeplabv3plus_mobilenet --enable_vis --vis_port 28333 --gpu_id 0 --year 2012_aug --crop_val --lr 0.01 --crop_size 513 --batch_size 16 --output_stride 16 --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --test_only --save_val_results


1. Performance on eMARG (6 classes, 512 x 384)

Training: 768x768 random crop
Validation: 512x384

Model                    Batch Size  Accuracy  Precision  Recall  F1-score  checkpoint_link
DeepLabV3Plus-ResNet101  4           0.884     0.8618     0.915   0.887     Download
DeepLabV3Plus-MobileNet  8           0.869     0.841      0.908   0.874     Download
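
As a reminder of how these columns relate, the sketch below computes accuracy, precision, recall and F1 from confusion counts, and recomputes F1 from the precision and recall values in the table above. The counts passed to `binary_metrics` are made up for illustration, and both helper names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Made-up counts, only to show the call.
example = binary_metrics(tp=90, fp=15, fn=10, tn=85)
print(example)

# Recompute F1 from the table's precision and recall columns.
f1_resnet101 = f1_score(0.8618, 0.915)
f1_mobilenet = f1_score(0.841, 0.908)
print(round(f1_resnet101, 4), round(f1_mobilenet, 4))
```

The recomputed values land within about 0.001 of the F1-score column, which is what one would expect when the inputs are already rounded.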

GradCAM Results on eMARG (DeepLabv3Plus-MobileNet/ResNet-101)


eMARG Dataset

1. Download eMARG and extract it using the same directory layout as the Cityscapes dataset, e.g. 'datasets/data/cityscapes'.
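
A quick sanity check after extraction is to count the images per split and city folder. The sketch below assumes the 'leftImg8bit/<split>/<city>/' nesting seen in the prediction commands of this README. `summarize_images` is a hypothetical helper, and the '.png' extension in the demo is an assumption.

```python
import tempfile
from collections import Counter
from pathlib import Path


def summarize_images(data_root) -> Counter:
    """Count files under <data_root>/leftImg8bit/<split>/<city>/."""
    counts = Counter()
    for path in Path(data_root, "leftImg8bit").glob("*/*/*"):
        if path.is_file():
            split, city = path.parts[-3], path.parts[-2]
            counts[(split, city)] += 1
    return counts


# Demo on a throwaway directory that mimics the expected layout.
with tempfile.TemporaryDirectory() as root:
    city_dir = Path(root, "leftImg8bit", "train", "city0")
    city_dir.mkdir(parents=True)
    (city_dir / "sample_leftImg8bit.png").touch()  # placeholder file name
    demo_counts = summarize_images(root)

print(dict(demo_counts))
```

Running `summarize_images` on the real data root should list every split and city that training and validation are expected to see. An empty result usually means the archive was extracted one level too deep or too shallow.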


2. Requirements

pip install -r requirements.txt


[1] Rethinking Atrous Convolution for Semantic Image Segmentation

[2] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation