- state_after_action(self, a)
- test
- successors()
- test
- set_children()
- test
- get_children()
- test
- class structure
- tests
- for a given state, build a game tree until either max_steps is reached or the game is finished
- test
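
The tree-building item above could look roughly like the sketch below. `Node`, `build_tree`, `step_fn` and `is_done` are hypothetical names, not the repo's actual API, and the toy counter game only stands in for the Sokoban environment.

```python
from dataclasses import dataclass, field
from typing import Callable, Hashable, List, Sequence


@dataclass
class Node:
    """One state in the game tree (hypothetical structure)."""
    state: Hashable
    depth: int = 0
    children: List["Node"] = field(default_factory=list)


def build_tree(root_state: Hashable, actions: Sequence, step_fn: Callable,
               is_done: Callable, max_steps: int) -> Node:
    """Expand the tree until max_steps is reached or the game is finished."""
    root = Node(root_state)
    stack = [root]
    while stack:
        node = stack.pop()
        # Do not expand nodes at the depth limit or in a finished game.
        if node.depth >= max_steps or is_done(node.state):
            continue
        for a in actions:
            child = Node(step_fn(node.state, a), node.depth + 1)
            node.children.append(child)
            stack.append(child)
    return root


def count_nodes(node: Node) -> int:
    """Number of nodes in the subtree rooted at node."""
    return 1 + sum(count_nodes(c) for c in node.children)


# Toy stand-in for the environment: the state is a counter, the game ends at 3.
tree = build_tree(0, actions=(1, 2), step_fn=lambda s, a: s + a,
                  is_done=lambda s: s >= 3, max_steps=2)
```

With `max_steps=2` the toy tree has one root, two nodes at depth 1 and four at depth 2.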
- clean code
- track metrics to separate file for later plotting
- implement different search algorithms
- Backtracking
- tests
- Depth First Search (DFS)
- tests
- Breadth First Search (BFS)
- tests
- Uniform Cost Search (UCS)
- tests
- A*
- tests
- manhattan_distance + test
- manhattan_heuristic + test
- other heuristics
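
A minimal sketch of the two Manhattan helpers for A*. The signatures are assumptions (the list only gives the names), and positions are taken to be `(row, col)` tuples. The heuristic here sums, for every box, the distance to its nearest goal. Since each push moves one box by one cell, this should never overestimate the remaining number of pushes.

```python
from typing import Iterable, Tuple

Pos = Tuple[int, int]  # (row, col); assumed representation


def manhattan_distance(a: Pos, b: Pos) -> int:
    """|row difference| + |column difference| between two cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def manhattan_heuristic(boxes: Iterable[Pos], goals: Iterable[Pos]) -> int:
    """Sum over all boxes of the distance to the nearest goal."""
    goals = list(goals)  # allow generators; goals is iterated once per box
    return sum(min(manhattan_distance(b, g) for g in goals) for b in boxes)
```

Several boxes may share the same nearest goal in this version, which keeps it cheap but loose. A tighter variant would match boxes to goals one-to-one.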
- Backtracking
- implement MCTS and make it runnable
- different policies: random, eps-greedy
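
The two policies named above could be sketched as follows. `values` is a hypothetical dict mapping each legal action to its current value estimate. How MCTS stores these estimates in this repo is not specified here.

```python
import random
from typing import Dict, Hashable


def random_policy(values: Dict[Hashable, float], rng: random.Random) -> Hashable:
    """Pick any legal action uniformly at random."""
    return rng.choice(list(values))


def eps_greedy_policy(values: Dict[Hashable, float], eps: float,
                      rng: random.Random) -> Hashable:
    """With probability eps explore randomly, otherwise take the best-valued action."""
    if rng.random() < eps:
        return random_policy(values, rng)
    return max(values, key=values.get)
```

Passing a seeded `random.Random` instance keeps rollouts reproducible in tests.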
- different policies
- Single Agent from Feng et al., 2020 (page 6)
- implement Resnets/ConvNets for Learning
- implement MCTS for Planning
- tests
- AlphaGo
- MCTS
- Integrate DCNN to predict value and probability of states
- implement deadlock detection
- train CNN to predict best possible action for a given state
- How to play a single world to test an agent's behaviour
- Research which algorithms to implement
- Deadlock detection (helps make the game tree sparser)
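
One simple form of deadlock detection is the corner check sketched below: a box that is not on a goal and has a wall on one vertical side and on one horizontal side can never be pushed again, so that state can be pruned from the tree. The function name and the set-of-coordinates representation are assumptions. Other deadlock types (for example boxes frozen against each other) are not covered.

```python
from typing import Set, Tuple

Pos = Tuple[int, int]  # (row, col); assumed representation


def is_corner_deadlock(walls: Set[Pos], goals: Set[Pos], box: Pos) -> bool:
    """True if the box is stuck in a wall corner and is not on a goal."""
    if box in goals:
        return False
    r, c = box
    blocked_vertical = (r - 1, c) in walls or (r + 1, c) in walls
    blocked_horizontal = (r, c - 1) in walls or (r, c + 1) in walls
    return blocked_vertical and blocked_horizontal
```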
- change dfs, bfs to recursive implementation
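
A recursive DFS could look roughly like this. `dfs_recursive`, `is_goal` and the callable `successors` are hypothetical and not the existing `dfs` signature. Only DFS is shown. Note that Python's default recursion limit is about 1000 frames, so `max_depth` should stay well below that.

```python
from typing import Callable, Hashable, Iterable, List, Optional, Set


def dfs_recursive(state: Hashable,
                  is_goal: Callable[[Hashable], bool],
                  successors: Callable[[Hashable], Iterable[Hashable]],
                  max_depth: int,
                  visited: Optional[Set[Hashable]] = None) -> Optional[List[Hashable]]:
    """Return a list of states from state to a goal, or None if none is found."""
    if visited is None:
        visited = set()
    if is_goal(state):
        return [state]
    if max_depth == 0:
        return None
    visited.add(state)
    for nxt in successors(state):
        if nxt in visited:
            continue  # skip states already explored on this search
        path = dfs_recursive(nxt, is_goal, successors, max_depth - 1, visited)
        if path is not None:
            return [state] + path
    return None
```

Because `visited` is shared across branches, combining it with a tight `max_depth` can miss solutions that reach a state by a shorter route later on. Dropping the visited set, or storing the depth at which each state was seen, avoids that at the cost of more expansions.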
- When comparing the previous and the new network's performance, how to choose the set of Sokoban environments to test on?
- How to structure the NN architecture?