# Computational Intelligence Project

## Players - Design Choices

During the semester we developed several agents based on the techniques explained in the lectures, so for this project we focused mainly on methods that we had not already implemented in the laboratories or in the additional material we proposed in our personal repositories.

Keeping this in mind, we decided to implement the following methods (a small sketch of the alpha-beta pruning idea follows the list):
- [x] Human Player
- [x] MinMax
- [x] MinMax + Alpha-Beta pruning
- [x] Monte Carlo Reinforcement Learning (TD learning + Symmetries)
- [x] Monte Carlo Tree Search

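As a quick illustration of the pruning idea, here is a minimal, self-contained sketch on a toy game tree. It is illustrative only, not the project's `AlphaBetaMinMaxPlayer`:

```python
from math import inf

def alpha_beta(node, alpha, beta, maximizing):
    """Minimax value of a game tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):       # leaf: heuristic evaluation of the position
        return node
    if maximizing:
        value = -inf
        for child in node:
            value = max(value, alpha_beta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:            # beta cutoff: the minimizer avoids this branch
                break
        return value
    value = inf
    for child in node:
        value = min(value, alpha_beta(child, alpha, beta, True))
        beta = min(beta, value)
        if beta <= alpha:                # alpha cutoff: the maximizer avoids this branch
            break
    return value

# Toy tree: the maximizer can secure a value of 3.
tree = [[3, 5], [2, [9, 1]]]
print(alpha_beta(tree, -inf, inf, True))  # -> 3
```

Note how the subtree `[9, 1]` is never explored: once the minimizer finds the 2, the maximizer's guaranteed 3 from the left branch makes further search pointless.
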
Although _Monte Carlo Tree Search_ is not a topic of the course, we included it because _Quixo_ has a very large branching factor and we wanted an agent that could cope with it.

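MCTS scales with a large branching factor because it does not expand all children uniformly: the UCT selection rule repeatedly descends into the statistically most promising child. Below is a minimal sketch of that rule with a hypothetical `Node` structure; our actual player follows the aima-python example listed in the Resources section:

```python
from dataclasses import dataclass, field
from math import log, sqrt, inf

@dataclass
class Node:
    # Hypothetical tree node: win statistics plus expanded children.
    wins: float = 0.0
    visits: int = 0
    children: list = field(default_factory=list)

def ucb1(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """UCB1: average result plus an exploration bonus that shrinks with visits."""
    if child.visits == 0:
        return inf                       # unexplored moves are tried first
    return child.wins / child.visits + c * sqrt(log(parent_visits) / child.visits)

def select(parent: Node) -> Node:
    """UCT selection step: descend into the child with the highest UCB1 score."""
    return max(parent.children, key=lambda ch: ucb1(ch, parent.visits))

root = Node(visits=10, children=[Node(wins=3, visits=5), Node(wins=1, visits=2), Node()])
best = select(root)                      # the unvisited child wins (UCB1 = inf)
```
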
### Space Optimization

Since the _Quixo_ game has a huge number of states, we focused our attention on optimizing the space required by our serialized agents. Before this new representation, the Monte Carlo RL player weighed more than 1 GB, while now its size is 57 KB.

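The representation itself is not spelled out above, so the following is only a hypothetical example of the kind of trick that shrinks a serialized value table: a 5x5 board whose cells are in `{-1, 0, 1}` (empty, player 0, player 1) can be packed into a single base-3 integer and used as a dictionary key instead of storing whole boards:

```python
def encode(board):
    """Pack a 5x5 board (cell values -1, 0, 1) into one int usable as a dict key."""
    key = 0
    for row in board:
        for cell in row:
            key = key * 3 + (cell + 1)   # shift -1..1 into base-3 digits 0..2
    return key

def decode(key):
    """Inverse of encode: rebuild the 5x5 board from the integer key."""
    cells = []
    for _ in range(25):
        key, digit = divmod(key, 3)      # extract least-significant digit first
        cells.append(digit - 1)
    cells.reverse()                      # restore row-major order
    return [cells[i * 5:(i + 1) * 5] for i in range(5)]

empty = [[-1] * 5 for _ in range(5)]
assert decode(encode(empty)) == empty
```

Twenty-five base-3 digits fit comfortably in one Python int (at most 3^25, about 8.5 * 10^11), which is far cheaper to serialize than a nested list per state.
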
### Players Improvements

To improve the performance of the players, we implemented the following (a sketch of how hash tables and symmetries combine follows the list):
- [x] parallelization
- [x] hash tables
- [x] symmetries

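Hash tables and symmetries combine naturally: if every position is first mapped to a canonical representative of its symmetry class, all eight rotations and reflections of a board share one table entry. A sketch of that idea (helper names are hypothetical; the project's `Symmetry` class may proceed differently):

```python
import numpy as np

def canonical_key(board: np.ndarray) -> bytes:
    """Smallest byte string among the 8 symmetries of a square board."""
    variants = []
    for k in range(4):
        rotated = np.rot90(board, k)
        variants.append(rotated.tobytes())            # 4 rotations
        variants.append(np.fliplr(rotated).tobytes()) # 4 reflections
    return min(variants)

cache = {}  # canonical key -> evaluated value

def cached_value(board, evaluate):
    """Look up (or compute once) the value shared by a whole symmetry class."""
    key = canonical_key(board)
    if key not in cache:
        cache[key] = evaluate(board)
    return cache[key]

board = np.zeros((5, 5), dtype=np.int8)
board[0, 0] = 1
assert canonical_key(board) == canonical_key(np.rot90(board))  # same class
```
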
### Failed Attempts

We also tried to include a Q-learning player in the project, but we failed resoundingly due to the huge number of _state-action_ pairs to learn. For this reason, we removed it from the repository.

We tried to reuse the agents implemented for the last laboratory, but the formulas we used were not sufficient to learn the expected returns over the millions of states in which _Quixo_ can be found. \
We performed several trials and, after a consultation with [Riccardo Cardona](https://github.com/Riden15/Computational-Intelligence), we found that the formula he used for his project is quite efficient and effective.

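For context on the kind of formula involved, below is the standard constant-step-size Monte Carlo value update from Sutton & Barto (listed in the Resources section); the exact update adopted after the consultation may differ:

```python
from collections import defaultdict

def mc_update(V, trajectory, final_reward, alpha=0.1, gamma=0.95):
    """Constant-alpha Monte Carlo update: V(s) += alpha * (G - V(s)),
    where G is the discounted return observed after visiting s."""
    g = final_reward                      # only the terminal outcome is rewarded
    for state in reversed(trajectory):    # returns are known once the game ends
        V[state] += alpha * (g - V[state])
        g *= gamma                        # discount as we move away from the end

V = defaultdict(float)                    # value table: state key -> estimate
mc_update(V, trajectory=['s0', 's1', 's2'], final_reward=1.0)
```
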
## Repository Structure

- [players](players): this directory contains the implemented agents
    - [human_player.py](players/human_player.py): class which implements a human player
    - [min_max.py](players/min_max.py): class which implements the MinMax algorithm and the Alpha-Beta pruning technique
    - [monte_carlo_rl.py](players/monte_carlo_rl.py): class which implements the Monte Carlo Reinforcement Learning player
    - [monte_carlo_tree_search.py](players/monte_carlo_tree_search.py): class which implements the Monte Carlo Tree Search algorithm
    - [random_player.py](players/random_player.py): class which implements a player that plays randomly
- [trained_agents](trained_agents): this directory contains the trained agents
- [utils](utils): this directory contains files which are necessary for the agents to play and which implement performance improvements
    - [investigate_game.py](utils/investigate_game.py): class which extends `Game` and is used by our agents
    - [symmetry.py](utils/symmetry.py): class which implements all the possible symmetries and is used by our agents
- [project_summary.ipynb](project_summary.ipynb): notebook used to train agents and to show results

The serialized `MinMax` and `MinMax + Alpha-Beta pruning` players with a non-empty hash table can be found in the release section.

## How to run

To run a specific `module.py` file, open a terminal and type the following command from the root of the project:
```bash
python -m folder.module
```
For example, to run the `min_max.py` file:
```bash
python -m players.min_max
```

If you are using VS Code as your editor, you can add
```json
"terminal.integrated.env.[your os]": {
    "PYTHONPATH": "${workspaceFolder}"
}
```
to your settings (replace `[your os]` with `osx`, `linux`, or `windows`) and run a module directly using the <kbd>▶</kbd> button.

## Resources

* Sutton & Barto, _Reinforcement Learning: An Introduction_ [2nd edition]
* Russell & Norvig, _Artificial Intelligence: A Modern Approach_ [4th edition]
* Nils J. Nilsson, _Artificial Intelligence: A New Synthesis_, Morgan Kaufmann Publishers, Inc. (1998)
* [Quixo Is Solved](https://arxiv.org/pdf/2007.15895.pdf)
* [aimacode/aima-python](https://github.com/aimacode/aima-python/tree/master) + [Monte Carlo Tree Search implementation example](https://github.com/aimacode/aima-python/blob/master/games4e.py#L178)

## License

[MIT License](LICENSE)

The root test script after this merge boils down to a quick random-vs-random match:

```python
from game import Game  # note: the diff dropped this import, but Game() is still used below
from utils.investigate_game import InvestigateGame
from players.random_player import RandomPlayer


if __name__ == '__main__':
    g = Game()
    g.print()                               # show the starting board
    g.play(RandomPlayer(), RandomPlayer())  # play one random-vs-random game
```