A Reinforcement Learning project where an agent (Thief) learns to navigate a hazardous 10x10 city grid to reach a getaway vehicle while avoiding police officers, hardcoded walls, and a central police station's surveillance zone.
- Environment: Custom Gymnasium-compatible environment (`StealthThiefEnv`) with:
  - 10x10 grid with specified coordinate mapping (01 to 100).
  - Randomly spawned Police officers.
  - Central Police Station (44, 45, 54, 55) with a 1-grid "Busted" search radius.
  - Static Wall obstacles.
- Agent: Advanced Dueling Double DQN architecture for stable and efficient learning.
- Persistence:
  - Automatic saving and loading of model weights (`models/stealth_thief_latest.pth`).
  - Persistent Replay Buffer (`data/stealth_replay_buffer.pkl`) to preserve experience across training sessions.
- Continuous Training: Training script runs in a loop, resetting episodes automatically but allowing manual termination via the 'Q' key in the game window, which triggers a final persistent save.
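The coordinate mapping and the station's "Busted" zone described in the Environment bullet can be sketched as follows. This is a minimal illustration, not the code in `StealthThiefEnv`. It assumes cells are numbered row by row from the top-left (cell 1 at row 0, column 0; cell 100 at row 9, column 9) and that the 1-grid radius includes diagonal neighbours. The helper names `cell_to_rc`, `rc_to_cell`, and `busted_zone` are hypothetical.

```python
GRID_SIZE = 10
STATION_CELLS = (44, 45, 54, 55)  # station cells as listed in the README


def cell_to_rc(cell: int) -> tuple[int, int]:
    """Convert a 1-based cell number (1..100) to a 0-based (row, col).

    Assumption: row-major numbering that starts at the top-left corner.
    """
    if not 1 <= cell <= GRID_SIZE * GRID_SIZE:
        raise ValueError(f"cell {cell} is outside the grid")
    return divmod(cell - 1, GRID_SIZE)


def rc_to_cell(row: int, col: int) -> int:
    """Inverse of cell_to_rc: 0-based (row, col) back to a 1-based cell number."""
    return row * GRID_SIZE + col + 1


def busted_zone(station=STATION_CELLS, radius: int = 1) -> set[int]:
    """Return every in-bounds cell within `radius` steps of a station cell.

    Assumption: diagonal neighbours count, so the zone is a square
    around the station. The station cells themselves are included.
    """
    zone = set()
    for cell in station:
        row, col = cell_to_rc(cell)
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                r, c = row + dr, col + dc
                if 0 <= r < GRID_SIZE and 0 <= c < GRID_SIZE:
                    zone.add(rc_to_cell(r, c))
    return zone
```

Under these assumptions an environment step could treat the thief as "Busted" whenever `rc_to_cell(row, col) in busted_zone()`. If the project numbers cells differently, only `cell_to_rc` and `rc_to_cell` need to change.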
- Install dependencies: `pip install -r requirements.txt`
- Assets: Ensure the `assets/` directory contains:
  - `agent.png` (Thief)
  - `police.png` (Police)
  - `wall.png` (Walls)
  - `car.png` (Getaway Vehicle)
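A quick way to confirm the assets are in place before launching the game window is a small check like the one below. It is a sketch that is not part of the repository; the function name `missing_assets` is hypothetical, and it assumes the script runs from the project root so that `assets` resolves correctly.

```python
from pathlib import Path

# File names taken from the asset list in this README.
REQUIRED_ASSETS = ("agent.png", "police.png", "wall.png", "car.png")


def missing_assets(assets_dir="assets"):
    """Return the required asset file names that are not present in assets_dir."""
    root = Path(assets_dir)
    return [name for name in REQUIRED_ASSETS if not (root / name).is_file()]
```

Calling `missing_assets()` returns an empty list when all four images are present, and otherwise lists the file names that still need to be added.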
Watch the agent explore the environment and learn optimal escape routes:

`python scripts/train.py`

Press 'Q' while focused on the game window to save progress and exit.
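The save-on-exit behaviour for the replay buffer could look roughly like the sketch below. It assumes the buffer is a `collections.deque` of transitions pickled as a whole; the project's actual buffer class and on-disk format may differ. The function names `save_buffer` and `load_buffer` and the `capacity` default are hypothetical. Only the file path comes from this README.

```python
import os
import pickle
from collections import deque

BUFFER_PATH = "data/stealth_replay_buffer.pkl"  # path named in this README


def save_buffer(buffer, path=BUFFER_PATH):
    """Pickle the buffer to a temporary file, then move it into place."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(buffer, f)
    # os.replace swaps the file in one step, so an interrupted save
    # does not leave a truncated buffer behind.
    os.replace(tmp_path, path)


def load_buffer(path=BUFFER_PATH, capacity=50_000):
    """Return the saved buffer, or an empty one on the first run.

    Assumption: capacity is an illustrative default, not the project's value.
    """
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return deque(maxlen=capacity)
```

A training loop would call `load_buffer()` once at startup and `save_buffer(buffer)` when 'Q' is pressed. Because `pickle.load` can execute arbitrary code, only load buffer files that were produced by your own training runs.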
Watch the best-trained model attempt the escape:

`python scripts/play.py`

This project is developed for game simulation and research purposes. Please refer to DISCLAIMER.md for more information.