Important: This repository does not include a data loader. Apologies for the inconvenience; please treat these results as a reference only.
This project performs semantic segmentation on the RailSem19 dataset using the DeepLabV3Plus model. The goal is to classify rails and tracks and to detect other railway-related structures and objects, in order to improve object detection performance across varied railway environments.
- Perform semantic segmentation on the RailSem19 dataset.
- Classify rail and track, and detect other railway-related structures.
- Evaluate and improve object detection performance in railway environments.
- Discuss a dataset construction plan based on the experiments.
- This project uses the DeepLabV3Plus model.
- For more details on the model, visit the following repository: DeepLabV3Plus for Beginners.
- The dataset used in this project is RailSem19.
- For more information on the dataset, visit: RailSem19 Dataset.
- This repository provides a visualization tool that complements existing tools: example-vis_quad.py
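A four-panel view (original image, ground truth, prediction, blended overlay) can be sketched with NumPy as shown below. This is not the code of example-vis_quad.py; the function names, the palette argument, and the `alpha=0.5` blending weight are all assumptions made for illustration.

```python
import numpy as np

def colorize(mask: np.ndarray, palette: np.ndarray) -> np.ndarray:
    """Turn an (H, W) integer label mask into an (H, W, 3) color image."""
    return palette[mask]

def blend(image: np.ndarray, color_mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Alpha-blend a colorized mask over the image (alpha is an assumed default)."""
    out = (1.0 - alpha) * image.astype(np.float32) + alpha * color_mask.astype(np.float32)
    return np.clip(out, 0, 255).astype(np.uint8)

def quad(image, gt, pred, palette, alpha=0.5):
    """Concatenate original, ground truth, prediction, and blend side by side."""
    gt_rgb = colorize(gt, palette)
    pred_rgb = colorize(pred, palette)
    return np.concatenate([image, gt_rgb, pred_rgb, blend(image, pred_rgb, alpha)], axis=1)
```

The palette is expected to be a `(num_classes, 3)` `uint8` array, and the masks must not contain ids outside that range.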
Three experiments were conducted in this project:
- Training on only 8 classes, excluding the background class.
  - tram-track, rail-track, traffic-light, traffic-sign, on-rails, rail-raised, rail-embedded, human.
- Training on only 8 classes, including the background class.
  - Same as above, plus background.
- Training with all labels, including the background class.
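Since this repository ships no data loader, the following is only a sketch of how the label masks for the first two setups might be derived from full label masks. The class order in `ALL_CLASSES`, the ignore value 255, and the function names are assumptions; check the class order against the dataset's own configuration before relying on it.

```python
import numpy as np

# ASSUMPTION: order of the full label set; verify against the dataset config.
ALL_CLASSES = [
    "road", "sidewalk", "construction", "tram-track", "fence", "pole",
    "traffic-light", "traffic-sign", "vegetation", "terrain", "sky",
    "human", "rail-track", "car", "truck", "trackbed", "on-rails",
    "rail-raised", "rail-embedded",
]
# The 8 target classes listed above.
TARGETS = [
    "tram-track", "rail-track", "traffic-light", "traffic-sign",
    "on-rails", "rail-raised", "rail-embedded", "human",
]
IGNORE = 255  # ASSUMPTION: label value skipped by the loss

def build_lut(with_background: bool) -> np.ndarray:
    """Lookup table from original label ids to training ids."""
    # Non-target pixels become class 8 (background) or the ignore value.
    fill = len(TARGETS) if with_background else IGNORE
    lut = np.full(256, fill, dtype=np.uint8)
    lut[IGNORE] = IGNORE  # unlabeled pixels stay ignored in both setups
    for new_id, name in enumerate(TARGETS):
        lut[ALL_CLASSES.index(name)] = new_id
    return lut

def remap(mask: np.ndarray, with_background: bool) -> np.ndarray:
    """Remap an integer label mask for the 8-class experiments."""
    return build_lut(with_background)[mask]
```

With `with_background=False` the non-target pixels receive the ignore value, which corresponds to the first setup; with `with_background=True` they become a ninth class, which corresponds to the second.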
- OS: Ubuntu 18.04.6 LTS
- GPU: 4 x NVIDIA RTX2080Ti
- Python: 3.8.5
- PyTorch: 1.7.1
- CUDA: 11.0
- Train: 7,650 images (90%), rs00000.jpg ~ rs07649.jpg
- Validation: 850 images (10%), rs07650.jpg ~ rs08499.jpg
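The split above is a contiguous cut by file index, so the two file lists can be generated directly. A minimal sketch follows; the function name is hypothetical, and it assumes the files are numbered consecutively with no gaps.

```python
def split_filenames(total: int = 8500, n_train: int = 7650):
    """Return (train, val) file name lists using a contiguous index split."""
    names = [f"rs{i:05d}.jpg" for i in range(total)]
    return names[:n_train], names[n_train:]

train_files, val_files = split_filenames()
```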
- Encoder: ResNet101-OS16
- Epochs: 200
- Batch Size: 4
- Learning Rate: 0.01
- Optimizer: SGD
- Scheduler: CosineAnnealingLR
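To see what the cosine schedule implies for the learning rate, the closed-form curve can be evaluated in plain Python. This is a sketch under assumptions not stated above: the scheduler is stepped once per epoch, the period `t_max` equals the 200 training epochs, and the minimum learning rate `eta_min` is 0.

```python
import math

def cosine_lr(epoch: int, base_lr: float = 0.01, t_max: int = 200,
              eta_min: float = 0.0) -> float:
    """Closed-form cosine annealing: base_lr at epoch 0, eta_min at epoch t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

# Learning rate at the start, midpoint, and end of training.
schedule = [cosine_lr(e) for e in (0, 100, 200)]
```

Under these assumptions the learning rate starts at 0.01, passes through 0.005 at epoch 100, and reaches 0 at epoch 200.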
Clone the repository:
git clone https://github.com/yourusername/railsem19-semantic-segmentation.git
cd railsem19-semantic-segmentation
On a single GPU:
python train_single.py --batch-size 4 --num-classes 19
On multiple GPUs:
python -m torch.distributed.launch --nproc_per_node=4 train.py --batch-size 4 --num-classes 19
python evaluate.py --weight ./saved_model/best.pth --num-classes 19
Note: this repository doesn't contain checkpoints.
- This repository provides about 60 samples for each task.
- The images are in the outputs directory; file names start with bad, good, or merged.
- Only good cases are shown below.
- Achieved 0.717 mIoU for semantic segmentation on 8 classes of the RailSem19 dataset.
- Everything except the target classes is ignored (masked out) during training.
- Rail-track and tram-track may be confused depending on the terrain (further study and balanced training are needed).
Original image, ground truth, prediction, and blended (image + prediction), in that order
Class | IoU | Class | IoU | Class | IoU | Class | IoU |
---|---|---|---|---|---|---|---|
tram-track | 0.7106 | rail-track | 0.8785 | traffic-light | 0.7759 | on-rails | 0.8192 |
traffic-sign | 0.5961 | rail-raised | 0.7597 | human | 0.6202 | rail-embedded | 0.5790 |
mIoU | 0.7174 | - | - | - | - | - | - |
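For reference, per-class IoU and mIoU are commonly computed from a confusion matrix accumulated over the validation set. The sketch below shows that standard computation; it is not taken from this repository's evaluate.py, and the function names and the ignore value 255 are assumptions.

```python
import numpy as np

def confusion_matrix(gt, pred, num_classes: int, ignore_index: int = 255) -> np.ndarray:
    """Rows are ground-truth classes, columns are predicted classes."""
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    keep = gt != ignore_index  # drop pixels that carry the ignore label
    idx = gt[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def per_class_iou(cm: np.ndarray) -> np.ndarray:
    """IoU = TP / (TP + FP + FN); NaN for classes that never occur."""
    tp = np.diag(cm).astype(np.float64)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    return np.divide(tp, union, out=np.full_like(tp, np.nan), where=union > 0)

def mean_iou(cm: np.ndarray) -> float:
    """Mean over the classes that occur at least once."""
    return float(np.nanmean(per_class_iou(cm)))
```

Predictions are assumed to lie in the range `0..num_classes-1`; confusion matrices from individual batches can be summed before the IoU is computed.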
- Achieved 0.605 mIoU for semantic segmentation on 8 classes plus background of the RailSem19 dataset.
- All pixels outside the target classes are treated as background during training.
- IoU drops significantly for classes with few instances, such as the traffic-related classes.
Original image, ground truth, prediction, and blended (image + prediction), in that order
Class | IoU | Class | IoU | Class | IoU | Class | IoU |
---|---|---|---|---|---|---|---|
tram-track | 0.7332 | rail-track | 0.8611 | traffic-light | 0.4522 | on-rails | 0.5715 |
traffic-sign | 0.3348 | rail-raised | 0.5952 | human | 0.4601 | rail-embedded | 0.4576 |
background | 0.9792 | mIoU | 0.6050 | - | - | - | - |
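The background IoU is much higher than the others and pulls the average up. To compare this experiment with the 8-class experiment on equal terms, the mean can be recomputed without the background entry. The snippet below only does arithmetic on the values in the table above; the variable names are arbitrary.

```python
# Per-class IoU values copied from the table above.
ious = {
    "tram-track": 0.7332, "rail-track": 0.8611, "traffic-light": 0.4522,
    "on-rails": 0.5715, "traffic-sign": 0.3348, "rail-raised": 0.5952,
    "human": 0.4601, "rail-embedded": 0.4576, "background": 0.9792,
}

# Mean over all 9 classes, as reported in the table.
miou_all = sum(ious.values()) / len(ious)

# Mean over the 8 target classes only.
foreground = {name: v for name, v in ious.items() if name != "background"}
miou_foreground = sum(foreground.values()) / len(foreground)

print(f"{miou_all:.4f} {miou_foreground:.4f}")  # prints "0.6050 0.5582"
```

On the 8 target classes alone, the mean is about 0.558, against 0.7174 when the same classes are trained without a background class.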
- Achieved 0.579 mIoU for semantic segmentation on all classes of the RailSem19 dataset.
- Because all classes are learned together, overall IoU decreases (the dataset likely needs to be expanded for training to go smoothly).
Original image, ground truth, prediction, and blended (image + prediction), in that order
- In the table, the two left class columns list the 8 classes used above, and the two right class columns list the remaining classes.
Class | IoU | Class | IoU | Class | IoU | Class | IoU |
---|---|---|---|---|---|---|---|
tram-track | 0.4272 | rail-track | 0.8381 | road | 0.5354 | terrain | 0.6277 |
traffic-light | 0.5091 | on-rails | 0.4667 | sidewalk | 0.5501 | sky | 0.9537 |
traffic-sign | 0.3491 | rail-raised | 0.6020 | construction | 0.7279 | car | 0.6825 |
human | 0.5682 | rail-embedded | 0.3260 | fence | 0.5075 | truck | 0.1540 |
- | - | - | - | pole | 0.6240 | trackbed | 0.7053 |
- | - | - | - | vegetation | 0.8449 | Total mIoU | 0.5789 |
- Rail-track and rail-raised are segmented well, but tram-track and on-rails are relatively difficult to segment.
- Where rail-tracks cross each other they are easy to segment, but where rail-tracks and tram-tracks cross, the two are confused.
- This problem is believed to be caused by pixel-level class imbalance, so further research on balanced training techniques is needed.
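One common way to counter pixel-level class imbalance is to weight the loss per class by inverse pixel frequency. The sketch below shows a simplified median-frequency weighting as an illustration only: it was not used in these experiments, the function name is hypothetical, and the pixel counts in the example are made up.

```python
from statistics import median

def median_frequency_weights(pixel_counts: dict) -> dict:
    """Weight each class by median(frequency) / frequency.

    Simplification: frequencies are global pixel shares, not shares
    computed only over the images in which the class appears.
    """
    total = sum(pixel_counts.values())
    freqs = {name: count / total for name, count in pixel_counts.items() if count > 0}
    med = median(freqs.values())
    return {name: med / freq for name, freq in freqs.items()}

# Hypothetical pixel counts, for illustration only.
weights = median_frequency_weights({"common": 100, "typical": 10, "rare": 1})
```

Rare classes receive weights above 1 and frequent classes weights below 1; such weights would typically be passed to a weighted cross-entropy loss.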
- Address pixel-level class imbalance.
- Experiment with other segmentation models and techniques.
- Expand the dataset to include more diverse railway environments.
Special thanks to:
- @J911 for invaluable experience and guidance.
- @Testworks for their support.