A comprehensive reinforcement learning project comparing different agent types for autonomous lunar landing using OpenAI Gym's LunarLander-v2 environment. This project demonstrates the superior performance of Deep Q-Networks (DQN) over Linear Q-Learning and Random agents across various environmental conditions.
This project implements and compares three different reinforcement learning approaches:
- Random Agent: Baseline random action selection
- Linear Q-Learning Agent: Traditional linear Q-learning approach
- Deep Q-Network (DQN) Agent: Neural network-based Q-learning with experience replay
The agents are tested under various environmental conditions including wind and turbulence to evaluate robustness and performance.
- DQN Agent: Achieves up to 90.3% success rate in optimal conditions
- Linear Q-Learning: Moderate performance with simpler implementation
- Random Agent: Baseline performance for comparison
Performance varies significantly under different wind and turbulence conditions, with DQN showing superior adaptability.
- Multiple Agent Types: Random, Linear Q-Learning, and DQN implementations
- Environmental Variations: Testing with/without wind and turbulence
- Comprehensive Analysis: Statistical comparison using t-tests
- Visualization: Training curves, performance plots, and success rate pie charts
- Data Export: CSV files with episode scores for further analysis
- Model Persistence: Save and load trained models
Install the required dependencies using:
```bash
pip install -r requirements.txt
```

- `gym==0.26.2` - OpenAI Gym environment
- `matplotlib==3.8.2` - Plotting and visualization
- `numpy==1.26.3` - Numerical computations
- `pandas==2.2.0` - Data manipulation and analysis
- `scipy==1.12.0` - Statistical tests
- `torch==2.1.2` - PyTorch for neural networks
- `tqdm==4.65.0` - Progress bars
Clone the repository and install the dependencies:

```bash
git clone https://github.com/yourusername/lunarlander.git
cd lunarlander
pip install -r requirements.txt
```

Train and test the agents:

```bash
cd src
python main.py
```

Run the statistical analysis:

```bash
cd src
python stats.py
```

Project structure:
```
lunarlander/
├── src/
│   ├── main.py          # Main training and testing script
│   ├── stats.py         # Statistical analysis and comparison
│   ├── dqnAgent.py      # Deep Q-Network agent implementation
│   ├── linearAgent.py   # Linear Q-Learning agent implementation
│   └── randomAgent.py   # Random agent implementation
├── data/                # Generated data and results
│   ├── last/            # Latest run results
│   └── model_*/         # Timestamped experiment results
├── requirements.txt     # Python dependencies
└── README.md            # This file
```
The main script (main.py) allows you to train and test different agents:
```python
# Configure environment conditions
wind = True                     # Enable/disable wind
wind_powerInput = 15.0          # Wind strength
turbulence_powerInput = 1.5     # Turbulence level

# Enable/disable agent types
randomAgent = True
dqnMethod = True
linearQLearning = True
```

Run the statistical comparison:

```bash
python stats.py
```

This performs t-tests comparing agent performance under different conditions.
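The wind and turbulence settings above map onto LunarLander-v2's constructor arguments. As a rough sketch of how the environment might be created from those values (the variable names mirror the configuration above; the exact wiring in main.py may differ):

```python
import gym

# Values mirroring the configuration flags above (illustrative).
wind = True
wind_powerInput = 15.0
turbulence_powerInput = 1.5

# enable_wind, wind_power, and turbulence_power are standard LunarLander-v2 arguments.
env = gym.make(
    "LunarLander-v2",
    enable_wind=wind,
    wind_power=wind_powerInput,              # recommended range 0.0-20.0
    turbulence_power=turbulence_powerInput,  # recommended range 0.0-2.0
)

state, info = env.reset(seed=0)
```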
- Architecture: Configurable neural network layers
- Features: Experience replay, epsilon-greedy exploration, target network
- Performance: Best overall performance with 90.3% success rate
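The actual network layout and hyperparameters live in dqnAgent.py; the sketch below only illustrates the ingredients listed above (a small fully connected Q-network, experience replay, epsilon-greedy exploration, and a target network) and should not be read as the project's exact implementation:

```python
import random
from collections import deque

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Fully connected Q-network: 8-dimensional state -> Q-values for 4 actions."""
    def __init__(self, state_dim=8, action_dim=4, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, x):
        return self.net(x)

policy_net = QNetwork()
target_net = QNetwork()
target_net.load_state_dict(policy_net.state_dict())  # target network starts as a copy
replay_buffer = deque(maxlen=100_000)                 # experience replay memory

def select_action(state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if random.random() < epsilon:
        return random.randrange(4)
    with torch.no_grad():
        q_values = policy_net(torch.as_tensor(state, dtype=torch.float32))
    return int(q_values.argmax().item())
```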
- Architecture: Linear function approximation
- Features: Traditional Q-learning with linear state representation
- Performance: Moderate success rate, faster training
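For reference, linear Q-learning approximates Q(s, a) as a dot product between the state vector and a per-action weight vector, updated with the TD error. A minimal sketch of that idea (assumed form; see linearAgent.py for the project's actual code):

```python
import numpy as np

class LinearQ:
    """Illustrative linear Q-learning agent: Q(s, a) = w[a] . s."""
    def __init__(self, state_dim=8, n_actions=4, lr=0.01, gamma=0.99):
        self.w = np.zeros((n_actions, state_dim))
        self.lr = lr        # learning rate (illustrative value)
        self.gamma = gamma  # discount factor (illustrative value)

    def q_values(self, state):
        return self.w @ state

    def update(self, state, action, reward, next_state, done):
        target = reward if done else reward + self.gamma * np.max(self.w @ next_state)
        td_error = target - self.w[action] @ state
        # For a linear approximator the gradient of Q(s, a) w.r.t. w[a] is the state itself.
        self.w[action] += self.lr * td_error * state
```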
- Architecture: Random action selection
- Features: Baseline comparison agent
- Performance: Low success rate, used for statistical significance testing
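The random baseline needs no learning machinery; it simply samples a uniformly random action from the environment's action space at every step, roughly like this:

```python
import gym

env = gym.make("LunarLander-v2")
state, info = env.reset(seed=0)
done = False
while not done:
    action = env.action_space.sample()   # uniformly random action
    state, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```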
- Success Rate: Percentage of successful landings (score ≥ 200)
- Average Score: Mean episode reward
- Statistical Significance: T-test comparisons between agents
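As an example, both metrics can be computed from the exported score CSVs roughly as follows (the file and column names are illustrative; stats.py contains the actual analysis):

```python
import pandas as pd
from scipy import stats

# Illustrative file and column names; the real CSVs are written under data/.
dqn_scores = pd.read_csv("dqn_scores.csv")["score"]
linear_scores = pd.read_csv("linear_scores.csv")["score"]

success_rate = (dqn_scores >= 200).mean() * 100   # % of episodes with score >= 200
t_stat, p_value = stats.ttest_ind(dqn_scores, linear_scores, equal_var=False)
print(f"DQN success rate: {success_rate:.1f}%  t = {t_stat:.2f}, p = {p_value:.4g}")
```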
- No Wind, No Turbulence: Optimal conditions
- Wind (15.0), No Turbulence: Wind-only challenge
- Wind (15.0), Turbulence (1.5): Most challenging conditions
- DQN significantly outperforms other agents across all conditions
- Environmental complexity affects all agents but DQN shows best adaptability
- Statistical tests confirm significant performance differences
For comprehensive analysis, methodology, and detailed results, see the full project report: Reinforcement Learning Project - Lunar Landing
- Episodes: 2000 training episodes
- Testing: 1000 test episodes
- Epsilon Decay: 0.995 with minimum 0.0
- Episode Limit: 1000 steps maximum
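The epsilon schedule corresponds to a multiplicative per-episode decay; a minimal sketch using the values above (the starting epsilon of 1.0 is an assumption, not stated here):

```python
EPS_DECAY = 0.995
EPS_MIN = 0.0
epsilon = 1.0                   # assumed starting value

for episode in range(2000):     # 2000 training episodes
    # ... run one episode with epsilon-greedy exploration, at most 1000 steps ...
    epsilon = max(EPS_MIN, epsilon * EPS_DECAY)
```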
- Models achieving score ≥ 200 are automatically saved
- Saved models include timestamp and configuration details
- Models can be loaded for continued training or testing
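With PyTorch, saving and reloading a checkpoint typically looks like the sketch below; the real file names produced by the project include a timestamp and configuration details, and the network shown here is only illustrative:

```python
import torch
import torch.nn as nn

# Illustrative DQN; the real architecture is defined in dqnAgent.py.
policy_net = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 4),
)

# Save the weights once a model reaches the success threshold (illustrative path).
torch.save(policy_net.state_dict(), "dqn_checkpoint.pt")

# Rebuild the same architecture and reload the weights for testing or further training.
restored = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 4),
)
restored.load_state_dict(torch.load("dqn_checkpoint.pt"))
restored.eval()
```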
- Training Plots: Episode scores over time
- Test Results: Success/failure pie charts
- CSV Files: Raw score data for each experiment
- Statistical Analysis: T-test results and significance
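A training curve and a success/failure pie chart can be produced from the exported scores along these lines (sketch; the CSV path and column name are assumptions):

```python
import pandas as pd
import matplotlib.pyplot as plt

scores = pd.read_csv("data/last/dqn_scores.csv")["score"]   # illustrative path and column

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Training plot: episode scores over time.
ax1.plot(scores)
ax1.set_xlabel("Episode")
ax1.set_ylabel("Score")
ax1.set_title("Training scores")

# Test results: success (score >= 200) vs. failure.
successes = int((scores >= 200).sum())
ax2.pie([successes, len(scores) - successes],
        labels=["Success", "Failure"], autopct="%1.1f%%")
ax2.set_title("Landing outcomes")

plt.tight_layout()
plt.show()
```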
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This is part of a Reinforcement Learning course final project.
For more details, see the LICENSE file or visit https://www.gnu.org/licenses/agpl-3.0.html.
- Efe Gรถrkem ลirin
- Nihat Aksu
Date: 30/01/2024
- OpenAI Gym for the LunarLander-v2 environment
- PyTorch team for the deep learning framework
- Course instructors and teaching assistants