Solving the LunarLander-v3 Environment: A 60-hour approach for the final term report in Foundation of Artificial Intelligence, VNU-UET (May 2025).
Updated Jun 8, 2025 - Jupyter Notebook
Apollo DQN — reproducible DQN variants for LunarLander-v3 with a Residual MLP backbone, PER-lite replay, strict evaluation pipeline and human baselines.
Going through the Hugging Face Deep Reinforcement Learning course.
Project for CENG567 Reinforcement Course that I took at IZTECH
Empirical study of over-estimation bias in DDPG vs TD3 on LunarLanderContinuous-v3.
Implementation of the PPO algorithm (https://arxiv.org/pdf/1707.06347).
Open-source implementation/adaptation of DeepSeek GRPO applied to reinforcement learning control problems, with an example on LunarLander-v3.
An RL agent trained to solve the LunarLander-v3 Gymnasium environment. The agent uses a standard DQN built with PyTorch and TorchRL.