This project presents a robust autonomous driving pipeline that combines Self-Supervised Learning (SSL) for perception with Proximal Policy Optimization (PPO) for decision-making in the CARLA simulator. RGB, depth, and semantic segmentation inputs are fused into a unified representation that drives a reinforcement learning agent through complex environments.
- Multi-modal Perception: fusion of RGB, depth, and semantic segmentation modalities.
- Self-Supervised Learning: contrastive and reconstruction losses for learning high-level visual embeddings.
- Deep Reinforcement Learning: PPO agent trained on perception embeddings for control in CARLA.
- Temporal Consistency: GRU-based temporal modeling in the perception system.
- Robust Reward Function: penalizes collisions, lane deviation, and off-road behavior; rewards smooth turns and optimal speed (a sketch of this shaping follows below).
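A minimal sketch of that reward shaping is given below. The weights and signal names (`speed_kmh`, `lane_offset_m`, `steering_delta`, collision/off-road flags) are illustrative assumptions, not the exact terms used in `rl_agent/reward_function.py`.

```python
# Hypothetical sketch of the reward terms listed above; weights and
# signal names are illustrative, not the values in reward_function.py.
def compute_reward(speed_kmh, target_speed_kmh, lane_offset_m,
                   collided, off_road, steering_delta):
    reward = 0.0

    # Encourage driving near the target speed.
    reward += 1.0 - abs(speed_kmh - target_speed_kmh) / max(target_speed_kmh, 1e-3)

    # Penalize lateral deviation from the lane center.
    reward -= 0.5 * abs(lane_offset_m)

    # Reward smooth steering (small change between consecutive actions).
    reward -= 0.1 * abs(steering_delta)

    # Hard penalties for safety violations.
    if collided:
        reward -= 10.0
    if off_road:
        reward -= 5.0

    return reward
```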
```
Inputs → Encoders (ResNet/DeepLab) → Fusion Transformer → Temporal GRU
       → SSL heads (contrastive, reconstruction) → 256D Perception Embedding → PPO Policy
```
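A rough PyTorch sketch of this perception stack is shown below. It is a minimal illustration, assuming ResNet-18 backbones from torchvision for all three modalities (the segmentation branch could equally be a DeepLab encoder), single-channel depth and segmentation inputs, and mean-pooled fusion; the actual modules live under `src/models/`.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MultiModalPerception(nn.Module):
    """Sketch: per-modality encoders -> transformer fusion -> GRU -> 256-D embedding."""

    def __init__(self, embed_dim=256):
        super().__init__()
        # One lightweight encoder per modality (RGB, depth, segmentation).
        self.encoders = nn.ModuleList()
        for in_ch in (3, 1, 1):
            backbone = resnet18(weights=None)
            backbone.conv1 = nn.Conv2d(in_ch, 64, 7, stride=2, padding=3, bias=False)
            backbone.fc = nn.Linear(backbone.fc.in_features, embed_dim)
            self.encoders.append(backbone)

        # Transformer layer fuses the three modality tokens.
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)

        # GRU enforces temporal consistency across frames.
        self.gru = nn.GRU(embed_dim, embed_dim, batch_first=True)

    def forward(self, rgb, depth, seg, hidden=None):
        # rgb: (B, 3, H, W); depth, seg: (B, 1, H, W)
        tokens = torch.stack(
            [enc(x) for enc, x in zip(self.encoders, (rgb, depth, seg))], dim=1
        )                                         # (B, 3, embed_dim)
        fused = self.fusion(tokens).mean(dim=1)   # (B, embed_dim)
        out, hidden = self.gru(fused.unsqueeze(1), hidden)
        return out.squeeze(1), hidden             # 256-D perception embedding per frame
```

Carrying the GRU hidden state across frames is what keeps consecutive embeddings temporally consistent.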
Additional modules:
- Perception decoder for SSL supervision
- Reinforcement Learning agent with continuous control output
- Simulator: CARLA
- RL Algorithm: PPO (Stable-Baselines3); a minimal setup sketch follows the metrics table below
- SSL Tasks: Contrastive learning, pixel reconstruction
- Evaluation Metrics:
| Metric | Baseline RL | SSL + RL |
|---|---|---|
| Total Reward | 1500 | 4200 |
| Collision Rate | 4.5 | 1.2 |
| Lane Deviation | 0.78 m | 0.32 m |
| Off-road % | 22% | 4% |
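The PPO side can be set up with Stable-Baselines3 roughly as sketched below. The wrapper name `CarlaSSLEnv`, the flat 256-D observation layout, and the hyperparameters are assumptions for illustration; the real wrapper (`carla_utils/environment_wrapper.py`) talks to a running CARLA server and feeds sensor frames through the perception model.

```python
# Sketch of the PPO setup, assuming a Gymnasium-style CARLA wrapper
# (hypothetical CarlaSSLEnv) that returns the 256-D SSL embedding as the observation.
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO

class CarlaSSLEnv(gym.Env):
    """Hypothetical wrapper: steps CARLA, encodes sensors, returns the SSL embedding."""

    def __init__(self, perception_model):
        super().__init__()
        self.perception = perception_model
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(256,), dtype=np.float32)
        # Continuous control: [steer, throttle/brake].
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        obs = np.zeros(256, dtype=np.float32)  # placeholder; real code encodes sensor frames
        return obs, {}

    def step(self, action):
        obs = np.zeros(256, dtype=np.float32)  # placeholder
        reward, terminated, truncated = 0.0, False, False
        return obs, reward, terminated, truncated, {}

env = CarlaSSLEnv(perception_model=None)
model = PPO("MlpPolicy", env, ent_coef=0.01, verbose=1)  # entropy regularization via ent_coef
model.learn(total_timesteps=1_000_000)
```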
- Pretraining: SSL on multi-modal CARLA scenes
- Reinforcement Learning: PPO using 256D SSL embeddings
- Hardware: NVIDIA A100 (64GB), training time ~6 days
- Losses Used (a sketch of the first two follows this list):
  - InfoNCE contrastive loss
  - L1 reconstruction loss
  - PPO loss with entropy regularization
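A compact sketch of the first two losses is given below; the temperature and tensor shapes are assumptions, and the PPO loss itself, including the entropy term, is handled inside Stable-Baselines3.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z_a, z_b, temperature=0.07):
    """InfoNCE between two batches of embeddings from augmented views.
    z_a, z_b: (B, D) projections of the same frames under different augmentations."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature           # (B, B) similarity matrix
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)        # positives lie on the diagonal

def reconstruction_loss(decoded, target):
    """L1 pixel reconstruction between the decoder output and the input frame."""
    return F.l1_loss(decoded, target)
```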
Project structure:
```
├── src/
│   ├── models/                   # ResNet/DeepLab, Transformer, GRU
│   ├── ssl_trainer.py            # Self-supervised pretraining
│   └── rl_training_with_ssl.py   # RL training with SSL embeddings
├── rl_agent/
│   ├── ppo_policy.py             # PPO policy and learning loop
│   └── reward_function.py        # Custom reward function
├── evaluation/
│   ├── metrics.py                # Evaluation metrics
│   └── visualize.py              # Trajectory and reward plots
├── carla_utils/
│   └── environment_wrapper.py
├── configs/
│   ├── perception.yaml
│   └── rl.yaml
└── README.md
```

Installation:
```bash
git clone https://github.com/your-username/self-driving-ssl-rl.git
cd self-driving-ssl-rl
conda create -n sslrl python=3.6
conda activate sslrl
pip install -r requirements.txt
```

SSL pretraining:
```bash
python perception/train_ssl.py --config configs/perception.yaml
```

RL training:
```bash
python rl_agent/ppo_policy.py --config configs/rl.yaml
```
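Evaluation numbers like those in the table above come from `evaluation/metrics.py`; the snippet below is only a sketch of how such per-episode logs might be aggregated, with the log field names assumed for illustration.

```python
import numpy as np

def summarize_episodes(episodes):
    """episodes: list of dicts with per-episode logs, e.g.
    {"reward": [...], "lane_offset_m": [...], "collisions": int, "off_road_steps": int}."""
    total_reward = np.mean([sum(ep["reward"]) for ep in episodes])
    collision_rate = np.mean([ep["collisions"] for ep in episodes])
    lane_deviation = np.mean([np.mean(np.abs(ep["lane_offset_m"])) for ep in episodes])
    off_road_pct = 100.0 * np.mean(
        [ep["off_road_steps"] / len(ep["reward"]) for ep in episodes]
    )
    return {
        "total_reward": total_reward,
        "collision_rate": collision_rate,
        "lane_deviation_m": lane_deviation,
        "off_road_pct": off_road_pct,
    }
```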
Results:
- Smoother and safer driving behavior
- ~3.7x reduction in collision rate (4.5 → 1.2 in the table above)
- Improved handling at junctions and turns
Future work:
- Human-in-the-loop reward tuning
- Online self-supervised learning with exploration
- Integration with template-based planning modules
If you use or reference this work in academic or industrial projects, please cite:
```bibtex
@misc{SushmaMareddy2025sslrl,
  title={Self-Supervised Multi-Modal Perception and Reinforcement Learning for Autonomous Driving},
  author={Sushma Mareddy and Ravan Ranveer Budda},
  year={2025},
  note={Project Report, NYU Courant}
}
```