State your prime directive! - "... to ... roll ..." 🤖 🎲
Yahtzotron is a bot for Yahtzee and Yatzy, trained via advantage actor-critic (A2C) through self-play. It is implemented with the JAX library ecosystem (JAX + Haiku + optax + rlax).
Yahtzee is a game of chance played with 5 dice, in which you have to make strategic decisions early in the game based on the outcome of your rolls. This makes it a surprisingly challenging task for reinforcement learning.
The pre-trained agents are close to perfect play (average scores are around 240 for both Yahtzee and Yatzy, just 5-10 points below perfect play).
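For readers unfamiliar with A2C, here is a minimal sketch of the per-sample objective in plain, dependency-free Python. It is an illustration only, not Yahtzotron's actual training code (which is built on JAX and rlax). The function names and the `value_weight` and `entropy_weight` defaults are assumptions chosen for this example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-probabilities for a list of action logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def a2c_loss(logits, action, value, ret, value_weight=0.5, entropy_weight=0.01):
    """Advantage actor-critic loss for a single (state, action) sample.

    logits: actor output, one entry per action
    action: index of the action that was taken
    value:  critic's estimate of the return from this state
    ret:    return actually observed (e.g. final game score)
    The two weights are hypothetical defaults, not Yahtzotron's settings.
    """
    log_probs = log_softmax(logits)
    # Advantage: how much better the outcome was than the critic expected.
    # In a real implementation it is treated as a constant for the actor.
    advantage = ret - value
    policy_loss = -log_probs[action] * advantage
    value_loss = advantage ** 2
    # Entropy bonus discourages the policy from collapsing too early.
    entropy = -sum(math.exp(lp) * lp for lp in log_probs)
    return policy_loss + value_weight * value_loss - entropy_weight * entropy


loss = a2c_loss(logits=[0.0, 0.0], action=0, value=1.0, ret=2.0)
```

When the critic predicts the return exactly, the advantage is zero and only the entropy term remains, so the actor receives no push towards or away from the chosen action.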
Read my blog post about the making of Yahtzotron here.
Just clone the repository and run
$ pip install .
Then, you can use the Yahtzotron command-line interface:
$ yahtzotron --help
Usage: yahtzotron [OPTIONS] COMMAND [ARGS]...

  This is Yahtzotron, the friendly robot that beats you in Yahtzee.

Options:
  --version                       Show the version and exit.
  -v, --loglevel [debug|info|warning|error]
  --help                          Show this message and exit.

Commands:
  evaluate  Evaluate performance of trained agents.
  origin    Show Yahtzotron's origin story.
  play      Play a game against Yahtzotron.
  train     Train a new model through self-play.
Why don't you try a game against one of the pre-trained agents?
$ yahtzotron play pretrained/yahtzee-score.pkl
When you play against Yahtzotron, it tells you its current strategy before every action (to teach us puny humans how to play):
> My turn!
> Roll #1: [3, 3, 3, 5, 6].
> I think I should go for Threes, so I'm keeping [3, 3, 3].
> Roll #2: [3, 3, 3, 3, 4].
> I think I should go for Threes or Yatzy, so I'm keeping [3, 3, 3, 3].
> Roll #3: [1, 3, 3, 3, 3].
> I'll pick the "Threes" category for that.
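To make the turn above concrete, the following sketch replays it in plain Python. It checks that each set of kept dice really comes from the preceding roll, and then scores the final roll in the "Threes" category (the sum of all dice showing a 3). The helper names `score_upper` and `reroll_count` are hypothetical and are not part of Yahtzotron's API.

```python
from collections import Counter


def score_upper(dice, face):
    """Upper-section score: the sum of all dice showing `face`."""
    return face * dice.count(face)


def reroll_count(roll, kept):
    """Return how many dice get rerolled when keeping `kept` from `roll`.

    Raises ValueError if `kept` contains dice that are not in `roll`.
    """
    missing = Counter(kept) - Counter(roll)
    if missing:
        raise ValueError(f"cannot keep dice that were not rolled: {dict(missing)}")
    return len(roll) - len(kept)


# (roll, kept dice) pairs taken from the transcript above
turn = [
    ([3, 3, 3, 5, 6], [3, 3, 3]),
    ([3, 3, 3, 3, 4], [3, 3, 3, 3]),
]
final_roll = [1, 3, 3, 3, 3]

rerolled = [reroll_count(roll, kept) for roll, kept in turn]
threes = score_upper(final_roll, 3)
print(rerolled, threes)  # [2, 1] 12
```

Four threes are worth 12 points in the "Threes" category.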