LunarLander doc
Reinforcement learning has recently been applied successfully to problems such as self-driving cars, trading and finance, and video game playing. In this paper, we solve a well-known robotic control problem, the lunar lander problem, using Deep Q-Learning in OpenAI Gym’s LunarLander-v2 environment. The best agent achieves an average reward of over 266 across 100 test episodes. We also show that hyper-parameters such as batch size, learning rate and update size affect both training speed, measured in episodes, and performance, measured in reward.
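As a rough illustration of the two quantities this abstract relies on, the sketch below computes the standard Q-learning target that a Deep Q-Learning agent regresses toward, and the average reward over a window of test episodes. It is a minimal sketch, not the implementation used in this paper: the function names `td_targets` and `average_reward`, the discount factor `gamma=0.99`, and the sample numbers are all assumptions made for the example.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Q-learning targets for a batch of transitions.

    target = r + gamma * max_a' Q(s', a'), with no bootstrapping when the
    episode ended at this step. gamma=0.99 is an assumed default.
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets


def average_reward(episode_rewards, window=100):
    """Mean total reward over the last `window` episodes (or fewer if unavailable)."""
    recent = episode_rewards[-window:]
    return sum(recent) / len(recent)


if __name__ == "__main__":
    # Hypothetical batch of two transitions; the second one is terminal.
    batch_targets = td_targets(
        rewards=[1.0, -100.0],
        next_q_values=[[0.5, 2.0], [3.0, 4.0]],
        dones=[False, True],
        gamma=0.5,
    )
    print(batch_targets)

    # Hypothetical per-episode totals, averaged over the last 100 episodes.
    print(average_reward([float(r) for r in range(200)], window=100))
```

In a full agent, the batch passed to `td_targets` would be sampled from a replay buffer whose sample size is the batch size, and the resulting targets would drive a gradient step scaled by the learning rate.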