# zuma-flow

AI learns to play Zuma.

Baseline evaluation with a random policy, 10 episodes: 81.0

todo:

- [ ] add verbose mode to zumaEnvirnment

work plan:

- [ ] learn DQN
- [ ] implement DQN with a 1D action space :(
- [ ] research other multi-dimensional action space solutions :)
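
One possible way to approach the "DQN with 1D action space" item is to encode each multi-dimensional discrete action as a single integer, so the network only needs one output per flat index. The sketch below is a minimal illustration, not the project's implementation. The action layout (`DIMS`, with 36 shooting angles and 2 swap choices) and the helper names `flatten_action` and `unflatten_action` are assumptions made up for this example; the real environment may use different dimensions.

```python
from math import prod
from typing import Sequence, Tuple


def flatten_action(action: Sequence[int], dims: Sequence[int]) -> int:
    """Encode a multi-dimensional discrete action as one integer (row-major order)."""
    if len(action) != len(dims):
        raise ValueError("action and dims must have the same length")
    index = 0
    for a, n in zip(action, dims):
        if not 0 <= a < n:
            raise ValueError(f"action component {a} out of range [0, {n})")
        index = index * n + a
    return index


def unflatten_action(index: int, dims: Sequence[int]) -> Tuple[int, ...]:
    """Decode a flat index back into its multi-dimensional action."""
    if not 0 <= index < prod(dims):
        raise ValueError(f"index {index} out of range [0, {prod(dims)})")
    parts = []
    # Peel off components starting from the last (fastest-varying) dimension.
    for n in reversed(dims):
        index, a = divmod(index, n)
        parts.append(a)
    return tuple(reversed(parts))


# Hypothetical layout (an assumption, not taken from the environment):
# 36 discrete shooting angles x 2 "swap ball" choices.
DIMS = (36, 2)
NUM_ACTIONS = prod(DIMS)  # size of the DQN output layer under this layout

flat = flatten_action((5, 1), DIMS)
angle, swap = unflatten_action(flat, DIMS)
```

With this encoding the agent picks an index in `range(NUM_ACTIONS)` and the environment wrapper decodes it before stepping. The flat action count is the product of the dimension sizes, so it grows quickly as dimensions are added, which is one reason the other multi-dimensional approaches in the work plan may be worth researching.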