Self-Play Reinforcement Model for Connect 4

A simple CNN that learns Connect 4 through self-play.

The model predicts the probability of winning from the current board state, learned from previously played games. That's it: no tree search, no policies.
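As a rough illustration of how a value-only model can choose moves without tree search, the sketch below scores the board that results from each legal move and plays the best one. Everything in it is an assumption made for illustration: the board encoding (`+1` / `-1` / `0`), the helper names, and `toy_value_fn`, which is only a placeholder standing in for the trained CNN.

```python
import numpy as np

ROWS, COLS = 6, 7  # standard Connect 4 board; row 0 is the top row

def legal_moves(board):
    """Columns that still have room (top cell is empty)."""
    return [c for c in range(COLS) if board[0, c] == 0]

def drop(board, col, player):
    """Return a copy of the board with `player`'s piece dropped into `col`."""
    new = board.copy()
    row = np.flatnonzero(new[:, col] == 0).max()  # lowest empty cell
    new[row, col] = player
    return new

def toy_value_fn(board, player):
    """Placeholder for the CNN (hypothetical): favors pieces near the center."""
    weights = 1.0 - np.abs(np.arange(COLS) - COLS // 2) / COLS
    score = ((board == player) * weights).sum() - ((board == -player) * weights).sum()
    return 1.0 / (1.0 + np.exp(-score))  # squash to a pseudo-probability

def pick_move(board, player, value_fn):
    """Evaluate every successor state once and pick the highest-valued move."""
    moves = legal_moves(board)
    preds = np.array([value_fn(drop(board, c, player), player) for c in moves])
    return moves[int(np.argmax(preds))], preds

board = np.zeros((ROWS, COLS), dtype=int)
move, preds = pick_move(board, player=1, value_fn=toy_value_fn)
print(move, preds.round(3))
```

The only lookahead here is one move deep: each candidate move is applied, the resulting state is scored, and nothing is searched beyond that.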

All training is through self-play; no external data or players are used.

The two key elements that make this work are exploration (driven here by adding noise to the predictions) and sampling to reduce correlation between training samples.

Exploration:

Exploration is implemented by adding Gaussian noise to the neural net predictions:

preds = preds + np.random.normal(0, std_noise, len(preds))

That's it. This keeps the model from replaying the same games over and over while still letting it play relatively good moves.
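To see how the noise level trades off exploration against move quality, here is a small self-contained sketch. The prediction values and the helper name `noisy_argmax` are made up for illustration, and it uses NumPy's `default_rng` generator instead of the global `np.random.normal` so that runs are reproducible.

```python
import numpy as np

def noisy_argmax(preds, std_noise, rng):
    """Add Gaussian noise to the predictions, then pick the best-looking move."""
    preds = np.asarray(preds, dtype=float)
    noisy = preds + rng.normal(0, std_noise, len(preds))
    return int(np.argmax(noisy))

rng = np.random.default_rng(0)
preds = [0.20, 0.45, 0.60, 0.10]  # hypothetical win probabilities for 4 moves

for std_noise in (0.0, 0.1, 1.0):
    picks = [noisy_argmax(preds, std_noise, rng) for _ in range(1000)]
    counts = np.bincount(picks, minlength=len(preds))
    print(f"std_noise={std_noise}: picks per move = {counts}")
```

With `std_noise = 0` the same move is always chosen; a small value mostly picks the top moves but occasionally tries the runner-up; a large value approaches uniformly random play.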

Sampling:
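The details of the sampling scheme are not given here. One common way to reduce correlation between training samples, shown purely as an assumed sketch, is to pool positions from many games in a buffer and train on uniformly random minibatches instead of on consecutive positions from a single game. The class name `ReplayBuffer`, its methods, and the capacity are hypothetical.

```python
from collections import deque

import numpy as np

class ReplayBuffer:
    """Keeps the most recent (state, target) pairs and samples them at random."""

    def __init__(self, capacity, seed=None):
        self.items = deque(maxlen=capacity)  # oldest entries are dropped first
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        return len(self.items)

    def add(self, state, target):
        """Store one board state and its training target (e.g. game outcome)."""
        self.items.append((np.asarray(state), float(target)))

    def sample(self, batch_size):
        """Uniform sample without replacement, mixing positions across games."""
        k = min(batch_size, len(self.items))
        idx = self.rng.choice(len(self.items), size=k, replace=False)
        states = np.stack([self.items[int(i)][0] for i in idx])
        targets = np.array([self.items[int(i)][1] for i in idx])
        return states, targets

buffer = ReplayBuffer(capacity=10_000, seed=0)
for game in range(20):                 # pretend we played 20 games
    outcome = game % 2                 # placeholder result of each game
    for _ in range(15):                # 15 placeholder positions per game
        buffer.add(np.zeros((6, 7)), outcome)

states, targets = buffer.sample(32)
print(states.shape, targets.shape)
```

Consecutive positions in one game differ by a single piece and share the same outcome, so sampling across many games gives each minibatch a more varied set of examples.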

Model Architecture:

A simple 3-layer convolutional neural network does the trick here.
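The exact layer sizes are not listed here, so the following is only a framework-free sketch of what a forward pass through a small network of this kind could look like: three 3x3 convolutions with ReLU, followed by a dense layer and a sigmoid that outputs a win probability. The channel counts, the `+1` / `-1` / `0` board encoding, the dense head, and the random untrained weights are all assumptions.

```python
import numpy as np

ROWS, COLS = 6, 7

def conv3x3_same(x, w, b):
    """Zero-padded 3x3 convolution. x: (C_in, H, W), w: (C_out, C_in, 3, 3)."""
    _, h, wd = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # pad height and width by 1
    out = np.zeros((w.shape[0], h, wd))
    for i in range(h):
        for j in range(wd):
            patch = padded[:, i:i + 3, j:j + 3]   # (C_in, 3, 3) window
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2])) + b
    return out

def init_params(rng, channels=(1, 16, 16, 16)):
    """Random, untrained weights for three conv layers plus a dense head."""
    convs = [(rng.normal(0, 0.1, (c_out, c_in, 3, 3)), np.zeros(c_out))
             for c_in, c_out in zip(channels[:-1], channels[1:])]
    dense_w = rng.normal(0, 0.1, channels[-1] * ROWS * COLS)
    return convs, dense_w, 0.0

def predict_win_prob(board, params):
    """board: (6, 7) array with +1 own pieces, -1 opponent pieces, 0 empty."""
    convs, dense_w, dense_b = params
    x = board[np.newaxis].astype(float)           # add a channel axis
    for w, b in convs:
        x = np.maximum(conv3x3_same(x, w, b), 0)  # conv followed by ReLU
    logit = x.ravel() @ dense_w + dense_b
    return float(1.0 / (1.0 + np.exp(-logit)))    # sigmoid -> probability

params = init_params(np.random.default_rng(0))
board = np.zeros((ROWS, COLS), dtype=int)
board[5, 3], board[5, 4] = 1, -1
print(predict_win_prob(board, params))
```

In practice a deep learning framework would replace the hand-written convolution; the point here is only the shape of the computation, from a 6x7 board to a single number between 0 and 1.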

Evaluation:

A negamax player:

- checks for a winning move and takes it
- blocks the winning move of an opponent
- avoids moves that allow the opponent to win in the next round

So the only way to beat a negamax player is to force a win, for example by creating two threats at once so that only one of them can be blocked.
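The three behaviors listed above fall out of a shallow negamax search. The sketch below is one possible implementation, not necessarily the one used for evaluation here: the search depth of 2, the board encoding (`+1` / `-1` / `0`, row 0 at the top), and all function names are assumptions.

```python
import numpy as np

ROWS, COLS = 6, 7

def legal_moves(board):
    return [c for c in range(COLS) if board[0, c] == 0]

def drop(board, col, player):
    """Copy of the board with `player`'s piece in the lowest empty cell of `col`."""
    new = board.copy()
    row = max(r for r in range(ROWS) if new[r, col] == 0)
    new[row, col] = player
    return new

def has_won(board, player):
    """True if `player` has four in a row in any direction."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + k * dr, c + k * dc) for k in range(4)]
                if all(0 <= rr < ROWS and 0 <= cc < COLS and board[rr, cc] == player
                       for rr, cc in cells):
                    return True
    return False

def negamax(board, player, depth):
    """Score for `player` to move: +1 forced win, -1 forced loss, 0 undecided."""
    if has_won(board, -player):        # the previous move already won the game
        return -1
    moves = legal_moves(board)
    if depth == 0 or not moves:
        return 0
    return max(-negamax(drop(board, c, player), -player, depth - 1) for c in moves)

def negamax_move(board, player, depth=2, rng=None):
    """Pick a best-scoring move; ties are broken at random."""
    rng = rng or np.random.default_rng()
    scores = {c: -negamax(drop(board, c, player), -player, depth - 1)
              for c in legal_moves(board)}
    best = max(scores.values())
    return int(rng.choice([c for c, s in scores.items() if s == best]))

board = np.zeros((ROWS, COLS), dtype=int)
board[5, 0:3] = -1                     # opponent threatens to complete the bottom row
print(negamax_move(board, player=1))   # the only non-losing move is the block
```

With a depth of 2, a move scores +1 if it wins immediately, -1 if it lets the opponent win on the reply, and 0 otherwise, which is why blocking is chosen whenever it is the only move that does not lose.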

Setup:

Run python nnplayer.py

apapiu/connect_4_cnn

Folders and files

Latest commit

History

Repository files navigation

Self-Play Reinforcement Model for Connect 4

Exploration:

Sampling:

Model Architecture:

Evaluation:

Setup:

Play against the model

Gradio Setup

Speeding up the model

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages