A self-contained reinforcement learning project where an agent learns to navigate randomly generated mazes.
- Custom Gymnasium environment: `GridMazeEnv`
- DQN agent in PyTorch with:
  - Experience replay
  - Target network
  - Epsilon-greedy exploration with linear decay
  - Gradient clipping
- Train/Evaluate scripts and model checkpointing
- Deterministic training via `--seed`
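
To make the experience replay and linear epsilon decay items above concrete, here is a minimal, standard-library-only sketch. It is not the code in `rl_maze/agents/replay_buffer.py` or `rl_maze/agents/dqn.py`; the names `Transition`, `ReplayBuffer`, and `linear_epsilon`, and the default values (`start=1.0`, `end=0.05`, `decay_steps=10_000`), are assumptions chosen for illustration.

```python
import random
from collections import deque
from typing import List, NamedTuple, Optional


class Transition(NamedTuple):
    """One environment step (hypothetical field layout)."""
    state: tuple
    action: int
    reward: float
    next_state: tuple
    done: bool


class ReplayBuffer:
    """Fixed-size FIFO buffer: once full, the oldest transition is evicted."""

    def __init__(self, capacity: int, seed: Optional[int] = None):
        self.buffer = deque(maxlen=capacity)
        # A private RNG keeps sampling reproducible for a given seed.
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done) -> None:
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement from the stored transitions.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self) -> int:
        return len(self.buffer)


def linear_epsilon(step: int, start: float = 1.0, end: float = 0.05,
                   decay_steps: int = 10_000) -> float:
    """Anneal epsilon linearly from `start` to `end`, then hold at `end`."""
    fraction = min(max(step, 0) / decay_steps, 1.0)
    return start + fraction * (end - start)
```

With these assumed defaults, `linear_epsilon(5000)` is halfway between 1.0 and 0.05, and any step past `decay_steps` stays at the floor value. Seeding the buffer's own RNG is one way to keep sampling deterministic alongside a `--seed` flag.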
python -m venv .venv
# macOS/Linux
source .venv/bin/activate
# Windows (PowerShell)
# .venv\Scripts\Activate.ps1
pip install -r requirements.txt
# Train DQN on a fixed maze (seeded for reproducibility)
python rl_maze/train_dqn.py --episodes 400 --maze-size 9 --seed 42 --save-dir runs/dqn_9x9
# Evaluate on randomized mazes of the same size (generalization)
python rl_maze/evaluate.py --episodes 100 --maze-size 9 --seed 123 --load runs/dqn_9x9/best.pt

- Success rate: fraction of episodes reaching the goal
- Average steps to goal (over successful episodes)
- Episode return (reward)
- Wall bump rate: attempted moves into walls / steps
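
The metric definitions above can be sketched as a small aggregation over per-episode records. This is an illustration, not the logic in `rl_maze/evaluate.py`: the `EpisodeResult` record and `summarize` function are hypothetical, and the wall bump rate is assumed to be pooled over all steps of all episodes rather than averaged per episode.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EpisodeResult:
    """Per-episode record (hypothetical; field names are assumptions)."""
    reached_goal: bool
    steps: int
    total_reward: float
    wall_bumps: int


def summarize(episodes: List[EpisodeResult]) -> Dict[str, float]:
    """Aggregate the four metrics over a list of evaluation episodes."""
    if not episodes:
        raise ValueError("need at least one episode")
    n = len(episodes)
    successes = [e for e in episodes if e.reached_goal]
    total_steps = sum(e.steps for e in episodes)
    return {
        # Fraction of episodes that reached the goal.
        "success_rate": len(successes) / n,
        # Mean steps over successful episodes only; NaN if none succeeded.
        "avg_steps_to_goal": (
            sum(e.steps for e in successes) / len(successes)
            if successes else math.nan
        ),
        # Mean episode return over all episodes.
        "avg_return": sum(e.total_reward for e in episodes) / n,
        # Attempted moves into walls divided by total steps (pooled).
        "wall_bump_rate": (
            sum(e.wall_bumps for e in episodes) / total_steps
            if total_steps else 0.0
        ),
    }
```

Averaging steps over successful episodes only keeps timeouts from inflating the figure, which is why the success rate should always be read alongside it.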
rl_maze_project/
├── README.md
├── requirements.txt
└── rl_maze
    ├── __init__.py
    ├── agents
    │   ├── __init__.py
    │   ├── dqn.py
    │   └── replay_buffer.py
    ├── envs
    │   ├── __init__.py
    │   └── grid_maze_env.py
    ├── evaluate.py
    └── train_dqn.py