Binary classification using DeepLabV3+ on the custom eMARG dataset, which comprises images and video analytics.

shubhampundhir/dlv3plus_binaryClf_VideoAnalytics

Video Classification on Indian-Roads using DeepLabV3Plus-Pytorch

We use the segmentation backbone of a DeepLabv3+ model pre-trained on eMARG-15k (Good/Bad) and extend it for binary classification by adding a simple combination of Conv and FC layers.

Quick Overview

Architecture of DeepLabV3+ fine-tuned for binary classification:

Image
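The Conv + FC extension described above can be illustrated with a minimal PyTorch sketch. The class names `BinaryClassifierHead` and `BackboneWithHead`, the hidden width of 256, the global average pooling step, and the toy stand-in backbone are all assumptions made for illustration; the layer sizes actually used in this repository may differ.

```python
import torch
import torch.nn as nn


class BinaryClassifierHead(nn.Module):
    """Hypothetical Conv + FC head applied to backbone feature maps."""

    def __init__(self, in_channels: int, hidden_channels: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, hidden_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(hidden_channels),
            nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)   # collapse H x W to 1 x 1
        self.fc = nn.Linear(hidden_channels, 1)  # one logit: Good vs. Bad

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        x = self.conv(features)
        x = self.pool(x).flatten(1)
        return self.fc(x)


class BackboneWithHead(nn.Module):
    """Chains a feature extractor with the classification head."""

    def __init__(self, backbone: nn.Module, head: nn.Module):
        super().__init__()
        self.backbone = backbone
        self.head = head

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(images))


# Toy stand-in for the real DeepLabv3+ backbone (assumption, for shape checking only).
toy_backbone = nn.Sequential(nn.Conv2d(3, 32, kernel_size=3, stride=16, padding=1), nn.ReLU())
model = BackboneWithHead(toy_backbone, BinaryClassifierHead(in_channels=32))
model.eval()

with torch.no_grad():
    logits = model(torch.randn(2, 3, 384, 512))  # two 512 x 384 images
probs = torch.sigmoid(logits)  # probability of the positive class
```

A single-logit output with a sigmoid is one common choice for binary classification; a two-logit output with softmax would work equally well.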

DeepLabV3              DeepLabV3+
deeplabv3_resnet50     deeplabv3plus_resnet50
deeplabv3_resnet101    deeplabv3plus_resnet101
deeplabv3_mobilenet    deeplabv3plus_mobilenet
deeplabv3_hrnetv2_48   deeplabv3plus_hrnetv2_48
deeplabv3_hrnetv2_32   deeplabv3plus_hrnetv2_32

All pretrained model checkpoints: Drive

1. Load the pretrained model:

model.load_state_dict(torch.load(CKPT_PATH)['model_state'])
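Loading can fail if the checkpoint was saved from a model wrapped in `nn.DataParallel`, because every parameter name then carries a `module.` prefix. The helper below is a minimal sketch of one way to handle that; `extract_model_state` is a hypothetical name, not part of this repository, and the `model_state` key is taken from the line above.

```python
from collections import OrderedDict


def extract_model_state(checkpoint, key="model_state"):
    """Return the weights dict from a checkpoint, stripping any 'module.' prefix."""
    # Accept both a wrapped checkpoint ({'model_state': ...}) and a bare state dict.
    state = checkpoint[key] if key in checkpoint else checkpoint
    prefix = "module."
    cleaned = OrderedDict()
    for name, value in state.items():
        cleaned[name[len(prefix):] if name.startswith(prefix) else name] = value
    return cleaned


# Usage with PyTorch (not executed here):
# checkpoint = torch.load(CKPT_PATH, map_location="cpu")
# model.load_state_dict(extract_model_state(checkpoint))
```

Passing `map_location="cpu"` to `torch.load` lets a checkpoint trained on a GPU be loaded on a machine without one.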

2. Prediction

Single image:

python predict.py --input datasets/data/eMARG/leftImg8bit/train/city0/PE-AR-7382-157_2_leftImg8bit  --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results

Image folder:

python predict.py --input datasets/data/eMARG/leftImg8bit/train/city0  --dataset cityscapes --model deeplabv3plus_mobilenet --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --save_val_results_to test_results
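Both commands pass either a file or a directory to `--input`. The sketch below shows one way such an argument could be resolved into a list of image files. `collect_images` and the extension set are assumptions for illustration and do not describe what `predict.py` actually does.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def collect_images(input_path):
    """Return a sorted list of image paths for a single file or a directory."""
    path = Path(input_path)
    if path.is_file():
        return [path]
    if path.is_dir():
        return sorted(
            p for p in path.rglob("*")
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
    return []  # path does not exist
```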

Results

1. Performance on eMARG (6 classes, 512 x 384)

Training: 768x768 random crop
Validation: 512x384

Model                    Batch Size  Accuracy  Precision  Recall  F1-score  Checkpoint
DeepLabV3Plus-ResNet101  4           0.884     0.8618     0.915   0.887     Download
DeepLabV3Plus-MobileNet  8           0.869     0.841      0.908   0.874     Download
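For reference, the four metrics in the table are related as in the sketch below. The function names are hypothetical, and which label (Good or Bad) is treated as the positive class is not stated here, so that choice is left to the caller.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


# Harmonic mean of the precision and recall reported for each model above.
f1_resnet101 = f1_score(0.8618, 0.915)
f1_mobilenet = f1_score(0.841, 0.908)
```

Both recomputed values agree with the F1-scores in the table to within about 0.001.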

GradCAM Results on eMARG (DeepLabv3Plus-MobileNet/ResNet-101)

Image
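Grad-CAM weights each channel of a chosen feature map by the spatially averaged gradient of the class score, sums the weighted channels, and applies a ReLU. The following is a minimal sketch of that computation for a single-logit classifier. The function `grad_cam`, the toy model, and the choice of target layer are assumptions for illustration; the layers used to produce the figures above are not specified here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def grad_cam(model, target_layer, image):
    """Return a Grad-CAM heatmap in [0, 1] with the spatial size of `image`."""
    captured = []
    handle = target_layer.register_forward_hook(
        lambda module, inputs, output: captured.append(output)
    )
    try:
        logit = model(image)
    finally:
        handle.remove()

    activations = captured[0]                       # (N, C, h, w)
    gradients = torch.autograd.grad(logit.sum(), activations)[0]
    weights = gradients.mean(dim=(2, 3), keepdim=True)  # one weight per channel
    cam = F.relu((weights * activations).sum(dim=1, keepdim=True)).detach()
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    cam = cam - cam.amin(dim=(2, 3), keepdim=True)
    return cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-8)


# Toy classifier standing in for the real network (assumption).
# The target layer must not be followed by an in-place op on its output.
toy = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 1),
)
heatmap = grad_cam(toy, toy[1], torch.randn(1, 3, 32, 48))
```

The resulting heatmap can be overlaid on the input image to show which regions contributed most to the prediction.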

eMARG Dataset

1. Requirements

pip install -r requirements.txt

2. Download eMARG and extract it into 'datasets/data/eMARG', following the same layout as the Cityscapes dataset:

/datasets
    /data
        /eMARG
            /gtFine
            /leftImg8bit
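Before training, it can help to confirm that the extracted dataset matches this layout. The check below is a small sketch; `missing_subdirs` is a hypothetical helper, and it only verifies the two top-level folders shown above.

```python
from pathlib import Path

# The two folders expected directly under the dataset root (from the layout above).
REQUIRED_SUBDIRS = ("gtFine", "leftImg8bit")


def missing_subdirs(data_root):
    """Return the names of required sub-folders that are absent from data_root."""
    root = Path(data_root)
    return [name for name in REQUIRED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subdirs("./datasets/data/eMARG")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset layout looks correct.")
```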

3. Train your model on eMARG in the same way as on Cityscapes.

python main.py --model deeplabv3plus_mobilenet --dataset cityscapes --enable_vis --vis_port 28333 --gpu_id 0  --lr 0.1  --crop_size 768 --batch_size 16 --output_stride 16 --data_root ./datasets/data/eMARG
python main.py --model deeplabv3plus_resnet101 --dataset cityscapes --enable_vis --vis_port 28333 --gpu_id 0  --lr 0.1  --crop_size 768 --batch_size 16 --output_stride 16 --data_root ./datasets/data/eMARG 
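The `--crop_size 768` flag corresponds to the 768x768 random crop listed under Results. The sketch below shows how such a crop window might be chosen. `random_crop_box` is a hypothetical helper, and padding images smaller than the crop size is an assumption, not a confirmed detail of this repository's data pipeline.

```python
import random


def random_crop_box(width, height, crop_size, rng=None):
    """Pick a random crop_size x crop_size window.

    Returns ((left, top, right, bottom), (pad_w, pad_h)), where the padding is
    the amount the image would need to grow to fit the crop (assumed behaviour).
    """
    rng = rng or random
    pad_w = max(crop_size - width, 0)
    pad_h = max(crop_size - height, 0)
    padded_w, padded_h = width + pad_w, height + pad_h
    left = rng.randint(0, padded_w - crop_size)  # randint is inclusive on both ends
    top = rng.randint(0, padded_h - crop_size)
    return (left, top, left + crop_size, top + crop_size), (pad_w, pad_h)


# Example: a 1024 x 768 image needs no padding; only the horizontal offset varies.
box, pad = random_crop_box(1024, 768, 768, rng=random.Random(0))
```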

4. Testing

Results will be saved at ./results.

python main.py --model deeplabv3plus_mobilenet --enable_vis --vis_port 28333 --gpu_id 0 --year 2012_aug --crop_val --lr 0.01 --crop_size 513 --batch_size 16 --output_stride 16 --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --test_only --save_val_results

Reference

[1] Rethinking Atrous Convolution for Semantic Image Segmentation

[2] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
