This is an implementation of deep reinforcement learning for a navigation task. Specifically, the DQN algorithm with experience replay is used to solve the task.
The environment is a Unity environment consisting of a square surface with yellow and blue bananas scattered around.
The agent needs to collect as many yellow bananas as possible while avoiding the blue bananas.
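To illustrate the experience replay idea mentioned above, here is a minimal sketch of a replay buffer. It is not taken from this project's code: the names `Experience` and `ReplayBuffer`, the `capacity` and `seed` parameters, and the choice of uniform sampling are all assumptions made for illustration.

```python
import random
from collections import deque, namedtuple

# One transition observed by the agent (field names are illustrative).
Experience = namedtuple(
    "Experience", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size buffer that stores transitions and samples them uniformly."""

    def __init__(self, capacity, seed=0):
        # A deque with maxlen silently drops the oldest entry when full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        """Store a single transition."""
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return batch_size transitions drawn uniformly without replacement."""
        return self.rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)
```

Sampling past transitions at random, rather than learning from them in the order they occurred, reduces the correlation between consecutive training samples.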
The agent can choose between four actions:
- move forward
- move backward
- turn left
- turn right
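A DQN agent typically chooses among actions like these with an epsilon-greedy rule. The sketch below is an assumption rather than this project's code: the `ACTIONS` index order simply follows the list above and may not match the environment, and `epsilon_greedy` is a hypothetical helper.

```python
import random

# Assumed index order, following the list above; the real environment may differ.
ACTIONS = ["move forward", "move backward", "turn left", "turn right"]


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    if rng.random() < epsilon:
        # Explore: choose any action index uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: choose the index with the largest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon=0.0` the choice is purely greedy, so `ACTIONS[epsilon_greedy([0.1, 0.5, -0.2, 0.0], epsilon=0.0)]` gives `"move backward"` under the assumed ordering.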
Rewards are assigned as follows:
- Yellow banana: +1 reward
- Blue banana: -1 reward
Banana collection is an episodic game. The goal is to maximise the total score in an episode. The environment is considered solved once the agent achieves an average score of at least +13 points over 100 consecutive episodes.
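The solved criterion above can be checked with a trailing average over the last 100 episode scores. The following is a minimal sketch; the function name `first_solved_episode` and its parameters are hypothetical and not part of the project.

```python
from collections import deque

TARGET_SCORE = 13.0  # required average score
WINDOW = 100         # number of consecutive episodes averaged


def first_solved_episode(scores, target=TARGET_SCORE, window=WINDOW):
    """Return the 1-based episode at which the trailing average over
    `window` episodes first reaches `target`, or None if it never does."""
    recent = deque(maxlen=window)  # keeps only the latest `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        # Only judge the average once a full window of episodes exists.
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None
```

For example, an agent scoring exactly 13 in each of its first 100 episodes would be reported as solved at episode 100, while one that always scores 12 would never be.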
- Gain a basic understanding of the Unity environment
- Set up a Python 3.6 environment and install the dependencies: PyTorch, the ML-Agents toolkit, and a few other Python packages.
- Download the Unity environment for your platform: Windows (64-bit), Windows (32-bit), Mac OSX, or Linux
- Run Navigation.ipynb
- See a glimpse of my agent during training and my trained agent collecting bananas on YouTube.
- Check out my Report for a more theoretical explanation of the project implementation.