In this project, I have implemented a deep Q-network (DQN) to play a simple game of Snake. The goal is to maximize the game score on a 10x10 grid and to modify the DQN to push its performance as far as possible.
There are two main approaches to solving RL problems:
- Methods based on value functions
- Methods based on policy search
There is also a hybrid actor-critic approach that employs both value functions and policy search. Within the scope of this project, I have used the value-function approach to play the Snake game.
In Q-learning, which is a value-function approach, we use a Q-table that maps environment states to the actions that can be taken in them. For each (state, action) pair there is a value estimating the reward the agent can achieve. The idea is to pick the action that maximizes the cumulative reward by the end of an episode. To do this, we use the Bellman equation, shown below:
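In its standard Q-learning form, the update reads:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$

where $s$ is the current state, $a$ the action taken, $r$ the reward received, $s'$ the resulting state, $\alpha$ the learning rate, and $\gamma$ the discount factor that weighs future rewards against immediate ones.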
The first step is to create an environment for our agent to operate in. This has been implemented in a simple way using Python coroutines and is rendered using pyplot (from Matplotlib). It could also be implemented in a way similar to OpenAI Gym environments, but I specifically wanted my own environment so I could control every aspect of its execution behavior.
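A minimal sketch of what such a coroutine-based environment could look like is shown below. The grid size, reward values, action encoding, and the `send()` protocol here are illustrative assumptions, not the project's actual code:

```python
import random

def snake_env(grid_size=10):
    """Toy coroutine environment: receives an action via send(), yields (state, reward, done).

    Reward values and the action encoding (0=up, 1=right, 2=down, 3=left) are
    illustrative assumptions, not the project's actual implementation.
    """
    snake = [(grid_size // 2, grid_size // 2)]           # single-cell snake at the centre
    food = (random.randrange(grid_size), random.randrange(grid_size))
    moves = [(-1, 0), (0, 1), (1, 0), (0, -1)]           # up, right, down, left
    reward, done = 0.0, False

    while True:
        action = yield (snake, food), reward, done        # wait for the agent's action
        dr, dc = moves[action]
        head = (snake[0][0] + dr, snake[0][1] + dc)

        if not (0 <= head[0] < grid_size and 0 <= head[1] < grid_size) or head in snake:
            reward, done = -1.0, True                     # hit a wall or its own body
        elif head == food:
            snake.insert(0, head)                         # grow and respawn the food
            food = (random.randrange(grid_size), random.randrange(grid_size))
            reward, done = 1.0, False
        else:
            snake.insert(0, head)
            snake.pop()                                   # move without growing
            reward, done = 0.0, False

# Usage: prime the coroutine, then drive it with actions.
env = snake_env()
state, reward, done = next(env)                           # prime: get the initial state
state, reward, done = env.send(1)                         # take one step to the right
```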
The agent takes these parameters and uses a convolutional neural network to build a DQN. It then obtains an instance of the game environment and begins training on it. The training process comprises the following steps, sketched in code after the list:
- For half of the nb_epochs, take random actions and store the resulting game states and rewards in a memory buffer - referred to as experience replay.
- While the first half of training consists of random actions (exploration), in the second half the model chooses the actions itself (exploitation).
- The stored experiences are sub-sampled into batches of size batch_size.
- Each batch is iterated through, and target Q-values are calculated using the Bellman equation.
- These target Q-values are mapped to their states, giving us something like a Q-table that is fed to the neural network for training.
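The sketch below shows how such Bellman targets can be computed from a replay batch, using Keras for the network. The architecture, layer sizes, discount factor, and hyperparameters are assumptions for illustration, not the project's exact model:

```python
import numpy as np
from tensorflow.keras import layers, models

NB_ACTIONS = 4       # up, right, down, left (assumed encoding)
GRID_SIZE = 10
GAMMA = 0.9          # discount factor (assumed value)

def build_dqn():
    """Small CNN mapping a 10x10 board image to one Q-value per action (illustrative architecture)."""
    model = models.Sequential([
        layers.Input(shape=(GRID_SIZE, GRID_SIZE, 1)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(NB_ACTIONS, activation="linear"),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

def train_on_batch(model, batch):
    """One training step: build Bellman targets for a batch of (state, action, reward, next_state, done) tuples."""
    states = np.array([t[0] for t in batch])
    next_states = np.array([t[3] for t in batch])

    q_current = model.predict(states, verbose=0)          # Q(s, .) for every action
    q_next = model.predict(next_states, verbose=0)        # Q(s', .) used for the targets

    for i, (_, action, reward, _, done) in enumerate(batch):
        if done:
            q_current[i, action] = reward                 # no future reward after a terminal state
        else:
            q_current[i, action] = reward + GAMMA * np.max(q_next[i])   # Bellman target

    model.fit(states, q_current, verbose=0)               # fit the network towards the targets
```

Only the Q-value of the action actually taken is overwritten with its Bellman target; the other outputs keep the network's own predictions, so the loss only pushes on the observed transition.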
After training for about 30,000 epochs and testing for 100 episodes, the agent achieved a highest score of 7.
In this project, I have applied the concept of deep reinforcement learning to the classic game Snake. I used Q-learning, a value-function approach that estimates the expected return of being in a given state. I extended this approach to a deep Q-network, implemented with a convolutional neural network. Using this approach, the maximum score I could achieve was 7.
This project can be further extended using concepts like policy search and the actor-critic method. An interesting extension would be to incorporate a genetic algorithm to create a population of agents and filter out the best over successive generations.