AI-related issues to address #75

Bartleby2718 · 2018-10-25T16:13:18Z

How should I design my reward function? Having read an OpenAI post, I'm thinking of...
- a large constant reward for winning the game
- an exponentially decaying function for 'good' or 'bad' behavior
  - ex) good: shouting "die" when the chance of winning is low
  - ex) bad: shouting "die" when the chance of winning is high
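One possible reading of the reward idea above, as a rough sketch only. The issue does not say what the exponential decays over, so this assumes it decays with an estimated win probability `p_win`. The names `WIN_REWARD`, `shaping_reward`, `scale`, and `rate` are all hypothetical and not taken from the project.

```python
import math

# Hypothetical constant: large terminal reward for winning the game.
WIN_REWARD = 100.0


def shaping_reward(p_win, scale=1.0, rate=5.0):
    """Shaping term for shouting "die", given an estimated win probability.

    Assumption: the bonus decays exponentially as p_win rises, and a mirrored
    penalty decays as p_win falls. The result is positive when p_win is low
    ("good"), negative when p_win is high ("bad"), and zero at p_win = 0.5.
    """
    bonus = math.exp(-rate * p_win)
    penalty = math.exp(-rate * (1.0 - p_win))
    return scale * (bonus - penalty)


def reward(won, shouted_die, p_win):
    """Total reward for one step: terminal win reward plus optional shaping."""
    total = WIN_REWARD if won else 0.0
    if shouted_die:
        total += shaping_reward(p_win)
    return total
```

Keeping `scale` small relative to `WIN_REWARD` is one way to make sure the shaping term nudges behavior without outweighing the actual goal of winning.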
What should the value of my epsilon be?
- a constant function: e_k = .1 (.05 afterwards) (source)
- an inverse function: e_k = 1/k (source)
- an exponentially decaying function: e_k = 0.9 * a^k for 0 < a < 1 (source)
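The three candidate schedules could be written as plain functions of the episode index `k`, as in the sketch below. The switch point `switch_after` for the constant schedule and the default `a=0.99` are assumptions, since the issue does not give them; `choose_action` is a hypothetical helper showing where epsilon would be used.

```python
import random


def epsilon_constant(k, switch_after=1000, early=0.1, late=0.05):
    """Constant schedule: 0.1 at first, 0.05 afterwards.

    switch_after is an assumed cutoff; the issue does not specify one.
    """
    return early if k <= switch_after else late


def epsilon_inverse(k):
    """Inverse schedule e_k = 1/k, for k >= 1."""
    return 1.0 / k


def epsilon_exponential(k, a=0.99, start=0.9):
    """Exponential schedule e_k = 0.9 * a^k, for 0 < a < 1."""
    return start * a ** k


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

With `epsilon=0.0`, `choose_action` always returns the index of the largest Q-value, which makes the greedy path easy to check in isolation.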
How should I implement multiple rule sets? The rules vary along at least two axes: whether input is timed, and what happens when both players' sums are equal. I'm thinking of subclassing Game and overriding the necessary methods, but I'm not sure yet.
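A minimal sketch of the subclassing approach, assuming each rule difference can be isolated behind one overridable attribute or method. The `Game` class here is a hypothetical stand-in, not the project's actual class, and `input_time_limit`, `resolve_tie`, and `round_winner` are made-up names. It also assumes the higher sum wins a round.

```python
class Game:
    """Hypothetical stand-in for the project's Game class."""

    # Seconds allowed per input; None means input is untimed.
    input_time_limit = None

    def resolve_tie(self, player1, player2):
        """Decide the round when both sums are equal. Default: a draw (None)."""
        return None

    def round_winner(self, player1, sum1, player2, sum2):
        """Return the round winner, assuming the higher sum wins."""
        if sum1 > sum2:
            return player1
        if sum2 > sum1:
            return player2
        return self.resolve_tie(player1, player2)


class TimedFirstPlayerWinsTies(Game):
    """Variant with timed input where the first player wins ties."""

    input_time_limit = 10.0

    def resolve_tie(self, player1, player2):
        return player1
```

The shared flow stays in the base class and each variant overrides only the hook that differs, so adding a new rule set should not require copying `round_winner`.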
Both players have the same information about their own and the opponent's decks, as in Go, chess, and Gomoku. That means the same algorithm can be applied to both sides of a game, doubling the data collected per game. However, I also read that "[Q learning] isn't likely to lead to very good results if you assume that the opponent can also learn."


Bartleby2718 added the question Further information is requested label Oct 25, 2018

Bartleby2718 self-assigned this Oct 25, 2018

Bartleby2718 removed their assignment Mar 12, 2024
