ZeroGo is an attempt to use machine learning techniques to develop agents for the game of Go with limited hardware resources. It uses AlphaGo-style Monte Carlo tree search to combine the results of a policy agent and a value agent. By reducing the dimensionality of the board encoder and the number of layers/units in the network, the ZeroGo agent can run on MacBooks and PCs. Depending on the search depth and the number of rollouts, the ZeroGo agent reaches up to 1-dan player level.
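As a rough illustration of how a policy prior and a value estimate can be combined during tree search, here is a minimal PUCT-style scoring function in the spirit of AlphaGo. The function and parameter names are illustrative, not ZeroGo's actual API:

```python
import math

def puct_score(parent_visits, child_visits, child_value_sum, prior, c_puct=1.5):
    """AlphaGo-style selection score: the child's average value (from the
    value network / rollouts) plus an exploration bonus weighted by the
    policy network's prior probability for that move."""
    q = child_value_sum / child_visits if child_visits > 0 else 0.0
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u
```

During selection, the search descends to the child with the highest score; the bonus term shrinks as a child accumulates visits, so the search gradually shifts from the policy prior toward observed values.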
Deliverables of ZeroGo:
- `$ python3 dl_app.py`
- ZeroGo was developed with TensorFlow 1.13; you may need TensorFlow 1.x to run the program
- models/AC: actor critic models based on 5x5 board
- models/AlphaGo: policy and value agents on the 19x19 board. Policy v0-0-0 is based on the earlier NN model with 27% accuracy
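The reduced board encoder can be pictured as a single-plane encoding, in contrast to AlphaGo's many-plane feature stacks. A minimal sketch, where the dict-based board representation is an assumption for illustration, not ZeroGo's actual data structure:

```python
import numpy as np

def one_plane_encode(board, size=19):
    """Encode a position as one size x size plane: +1 for the player to
    move, -1 for the opponent, 0 for empty points. `board` maps
    (row, col) -> +1/-1. A low-dimensional stand-in for richer encoders."""
    plane = np.zeros((size, size), dtype=np.float32)
    for (row, col), stone in board.items():
        plane[row, col] = stone
    return plane
```

Shrinking the encoder from dozens of feature planes to one cuts both the input size and the first-layer parameter count, which is what makes laptop-scale training feasible.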
- Agents:
  - random (5x5 and 19x19)
  - greedy (5x5 and 19x19)
  - depth_pruned (5x5)
  - alpha_beta (5x5)
  - mcts (5x5)
  - actor_critic (5x5)
  - NN (19x19)
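The simpler baselines in this list need little machinery. For example, the random agent can be sketched as below; the `select_move` interface and the set-of-occupied-points board model are assumptions for illustration:

```python
import random

class RandomAgent:
    """Illustrative baseline: pick a uniformly random empty point.
    ZeroGo's actual agent interface may differ."""
    def __init__(self, size):
        self.size = size

    def select_move(self, occupied):
        # `occupied` is a set of (row, col) points already played.
        empties = [(r, c)
                   for r in range(self.size)
                   for c in range(self.size)
                   if (r, c) not in occupied]
        return random.choice(empties) if empties else None
```

A greedy agent would replace `random.choice` with an argmax over a cheap evaluation of each candidate move.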
- __5x5_host.py: process to simulate games and train the RL agents
- __19x19_host.py: process to debug and test the AlphaGo MCTS agent
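A simulation host process like these boils down to alternating two agents and recording experience for training. A hedged sketch with a trivial stand-in agent; none of these names come from the ZeroGo codebase:

```python
class FirstEmptyAgent:
    """Deterministic stub agent: always plays the first empty point."""
    def __init__(self, size):
        self.size = size

    def select_move(self, occupied):
        for r in range(self.size):
            for c in range(self.size):
                if (r, c) not in occupied:
                    return (r, c)
        return None

def simulate_game(black_agent, white_agent, max_moves=50):
    """Alternate moves between two agents, recording (state, move)
    pairs as training experience for the RL agents."""
    occupied, experience = set(), []
    players = [black_agent, white_agent]  # black moves first
    for turn in range(max_moves):
        agent = players[turn % 2]
        move = agent.select_move(occupied)
        if move is None:
            break
        experience.append((frozenset(occupied), move))
        occupied.add(move)
    return experience
```

A real host would also score the finished game and attach the outcome to each recorded pair before handing the batch to the trainer.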
- Known AlphaGo MCTS bugs:
  - dlgo-rl-simulate.py, line 50: `white_player, black_player = agent1, agent2`
- End-to-end script from downloading data to launching the web app
  - steps:
    - set up the encoder
    - set up the data processor/generator
    - construct the NN layers
    - compile the model and train it
    - save the model
    - initialize the deep learning agent
    - start the web app
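The steps above can be sketched end to end. To keep the example self-contained and runnable without TensorFlow, it uses a tiny NumPy softmax policy as a stand-in for the real TensorFlow 1.x model; every name here is hypothetical:

```python
import numpy as np

# Step 1: encoder -- one plane per board, flattened (see the encoder note above)
def encode(board, size):
    plane = np.zeros((size, size), dtype=np.float32)
    for (r, c), stone in board.items():
        plane[r, c] = stone
    return plane.reshape(-1)

# Step 2: data processor/generator -- (encoded board, move index) pairs
def generate_data(games, size):
    X = np.stack([encode(board, size) for board, _ in games])
    y = np.array([r * size + c for _, (r, c) in games])
    return X, y

# Steps 3-4: a linear softmax policy trained by gradient descent,
# standing in for the real multi-layer TensorFlow network
class TinyPolicy:
    def __init__(self, size, lr=0.1):
        self.W = np.zeros((size * size, size * size))
        self.lr = lr

    def train(self, X, y, epochs=50):
        for _ in range(epochs):
            logits = X @ self.W
            p = np.exp(logits - logits.max(axis=1, keepdims=True))
            p /= p.sum(axis=1, keepdims=True)
            p[np.arange(len(y)), y] -= 1.0     # softmax cross-entropy gradient
            self.W -= self.lr * X.T @ p / len(y)

    def predict_move(self, board_vec):
        return int(np.argmax(board_vec @ self.W))

# Step 5: save the model weights for the web app to load
def save(model, path):
    np.save(path, model.W)
```

The final two steps (initializing the agent and starting the web app) would load the saved weights, wrap the model in an agent, and serve it, e.g. via `dl_app.py`.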