
Commit

Merge pull request #22 from Max-We/train
Train script and documentation
  • Loading branch information
Max-We authored May 23, 2024
2 parents 07e72bd + 373a050 commit 3c4097f
Showing 18 changed files with 1,374 additions and 607 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -165,3 +165,5 @@ local_experiments/
# Examples / training outputs
examples/runs
runs
wandb
videos
42 changes: 22 additions & 20 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,41 +1,43 @@
![logo](./docs/_static/logo.png "Tetris Gymnasium")

> Tetris Gymnasium is currently under early development!
Tetris Gymnasium is tightly integrated with Gymnasium and exposes a simple API for training agents to play Tetris.

The environment offers state-of-the-art performance and holds a high standard for code quality. With it, researchers and developers can focus on their research and development, rather than the environment itself.

Getting started is easy. Here is a simple example of an environment with random actions:

```python
import gymnasium as gym
from tetris_gymnasium.envs import Tetris

env = gym.make("tetris_gymnasium/Tetris")
observation, info = env.reset(seed=42)
for _ in range(1000):
    action = env.action_space.sample()  # this is where you would insert your policy
    observation, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        observation, info = env.reset()
env = gym.make("tetris_gymnasium/Tetris", render_mode="ansi")
env.reset(seed=42)

env.close()
terminated = False
while not terminated:
    print(env.render() + "\n")
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
print("Game Over!")
```

##
## Background

Tetris Gymnasium tries to solve problems of other environments by being modular, understandable and adjustable. You can read more about the background in our paper: _Piece by Piece: Assembling a Modular Reinforcement Learning Environment for Tetris_ (SOON).

Abstract:

The documentation can be found on [GitHub Pages](https://max-we.github.io/Tetris-Gymnasium/)
>The game of Tetris is an open challenge in machine learning and especially Reinforcement Learning (RL). Despite its popularity, contemporary environments for the game lack key qualities, such as clear documentation, an up-to-date codebase or game-related features.
This work introduces Tetris Gymnasium, a modern RL environment built with Gymnasium that aims to address these problems by being modular, understandable and adjustable.
To evaluate Tetris Gymnasium on these qualities, a Deep Q-Learning agent was trained and compared to a baseline environment; it was found to fulfill all requirements of a feature-complete RL environment while being adjustable to many different requirements.
The source code and documentation are available on GitHub and can be used for free under the MIT license.

## Documentation

The full documentation of the project can be found on [GitHub Pages](https://max-we.github.io/Tetris-Gymnasium/).

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgements

We would like to thank the creators and maintainers of Gymnasium, CleanRL and Tetris-deep-Q-learning-pytorch for providing powerful frameworks and reference implementations

---

Enjoy using the Gymnasium Tetris Environment for your reinforcement learning experiments! If you have any questions or need further assistance, don't hesitate to reach out to us. Happy coding! 🎮🕹️
We would like to thank the creators and maintainers of [Gymnasium](https://github.com/Farama-Foundation/Gymnasium), [CleanRL](https://github.com/vwxyzjn/cleanrl) and [Tetris-deep-Q-learning-pytorch](https://github.com/uvipen/Tetris-deep-Q-learning-pytorch) for providing powerful frameworks and reference implementations.
53 changes: 52 additions & 1 deletion docs/development/contributing.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,54 @@
# Contributing

Under construction.
Contributions are always welcome, no matter how large or small. Feel free to create an issue or a pull request.
If you want to contribute, please read the following guidelines.

The following section is based on the [Gymnasium contributing guidelines](https://github.com/Farama-Foundation/Gymnasium/blob/main/CONTRIBUTING.md).

## Type checking

The project uses `pyright` to check types.
To type check locally, install `pyright` per the official [instructions](https://github.com/microsoft/pyright#command-line).
Its configuration lives within `pyproject.toml` and includes the lists of files that are currently included in and excluded from type checking.
To run `pyright` for the project, run the pre-commit process (`pre-commit run --all-files`) or `pyright --project=pyproject.toml`.
Alternatively, pyright is a built-in feature of VSCode that will automatically provide type hinting.

### Adding typing to more modules and packages

If you would like to add typing to a module in the project,
the lists of included, excluded and strict files can be found in `pyproject.toml` under `[tool.pyright]`.
To run `pyright` for the project, run the pre-commit process (`pre-commit run --all-files`) or `pyright`.
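For orientation, a `[tool.pyright]` section of this shape is what the guideline refers to. The concrete file lists below are illustrative assumptions, not the project's actual configuration; the authoritative values live in the repository's own `pyproject.toml`:

```toml
# Hypothetical sketch only -- see the repository's pyproject.toml for the real lists.
[tool.pyright]
include = ["tetris_gymnasium/**"]
exclude = ["**/__pycache__"]
strict = []
reportMissingImports = "warning"
```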

## Git hooks

The CI will run several checks on new code pushed to the repository. These checks can also be run locally without waiting for the CI by following the steps below:

1. [Install `pre-commit`](https://pre-commit.com/#install).
2. Install the Git hooks by running `pre-commit install`.

Once those two steps are done, the Git hooks will be run automatically at every new commit.
The Git hooks can also be run manually with `pre-commit run --all-files`, and if needed they can be skipped (not recommended) with `git commit --no-verify`.
**Note:** you may have to run `pre-commit run --all-files` manually a couple of times to make it pass when you commit, as each formatting tool will first format the code and fail the first time but should pass the second time.
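For context, the hooks installed by `pre-commit install` are declared in a `.pre-commit-config.yaml` file at the repository root. The repos, revisions and hook ids below are an illustrative assumption of what such a file looks like, not the project's actual configuration:

```yaml
# Hypothetical example -- the project's real .pre-commit-config.yaml may differ.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
```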

Additionally, for pull requests, the project runs a number of tests for the whole project using [pytest](https://docs.pytest.org/en/latest/getting-started.html#install-pytest).
These tests can be run locally with `pytest` in the root folder. If any doctest is modified, run `pytest --doctest-modules --doctest-continue-on-failure tetris_gymnasium` to check the changes.

## Docstrings

Pydocstyle has been added to the pre-commit process such that all new functions follow the [google docstring style](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html).
All new functions require either a short docstring (a single line explaining the purpose of the function)
or a multiline docstring that documents each argument and the return type (if there is one).
In addition, new files and classes require top docstrings that outline the purpose of the file or class.
For classes, code block examples can be provided in the top docstring rather than in the constructor.

To check that your docstrings are correct, run `pre-commit run --all-files` or `pydocstyle --source --explain --convention=google`.
For every docstring that fails, the source and the reason for the failure are provided.
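As an illustration, a function with a Google-style multiline docstring might look like the following. The function itself is a hypothetical example written for this guide, not part of the library:

```python
def clear_lines(board):
    """Remove all fully filled rows from a Tetris board.

    Args:
        board: A list of rows, where each row is a list of cell values
            (0 for empty, any non-zero value for filled).

    Returns:
        A tuple ``(new_board, cleared)`` where ``new_board`` has the same
        dimensions as ``board`` with full rows removed and empty rows
        prepended, and ``cleared`` is the number of rows that were removed.
    """
    # Keep only rows that still contain at least one empty cell.
    remaining = [row for row in board if not all(row)]
    cleared = len(board) - len(remaining)
    width = len(board[0]) if board else 0
    # Prepend empty rows so the board keeps its original height.
    new_board = [[0] * width for _ in range(cleared)] + remaining
    return new_board, cleared
```

A single-line docstring (`"""Remove all fully filled rows from a Tetris board."""`) would also satisfy the guideline for simple helpers.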

## Building the docs

You can use `sphinx-autobuild` to build the documentation locally.

```shell
cd docs
poetry run sphinx-autobuild -b dirhtml --watch ../tetris_gymnasium --re-ignore "pickle$" . _build
```
45 changes: 33 additions & 12 deletions docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,29 +11,50 @@ lastpage:
```{project-heading}
A customisable, easy-to-use and performant Tetris environment for Gymnasium
```

Tetris Gymnasium is tightly integrated with Gymnasium and exposes a simple API for training agents to play Tetris.

The environment offers state-of-the-art performance and holds a high standard for code quality. With it, researchers and developers can focus on their research and development, rather than the environment itself.

Getting started is easy. Here is a simple example of an environment with random actions:

```{code-block} python
import gymnasium as gym
from tetris_gymnasium.envs import Tetris
env = gym.make("tetris_gymnasium/Tetris")
observation, info = env.reset(seed=42)
for _ in range(1000):
    action = env.action_space.sample()  # this is where you would insert your policy
    observation, reward, terminated, truncated, info = env.step(action)
env = gym.make("tetris_gymnasium/Tetris", render_mode="ansi")
env.reset(seed=42)
    if terminated or truncated:
        observation, info = env.reset()
env.close()
terminated = False
while not terminated:
    print(env.render() + "\n")
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
print("Game Over!")
```

# Information

## Background

Tetris Gymnasium tries to solve problems of other environments by being modular, understandable and adjustable. You can read more about the background in our paper: _Piece by Piece: Assembling a Modular Reinforcement Learning Environment for Tetris_ (SOON).

Abstract:

>The game of Tetris is an open challenge in machine learning and especially Reinforcement Learning (RL). Despite its popularity, contemporary environments for the game lack key qualities, such as clear documentation, an up-to-date codebase or game-related features.
This work introduces Tetris Gymnasium, a modern RL environment built with Gymnasium that aims to address these problems by being modular, understandable and adjustable.
To evaluate Tetris Gymnasium on these qualities, a Deep Q-Learning agent was trained and compared to a baseline environment; it was found to fulfill all requirements of a feature-complete RL environment while being adjustable to many different requirements.
The source code and documentation are available on GitHub and can be used for free under the MIT license.

## Documentation

The full documentation of the project can be found on [GitHub Pages](https://max-we.github.io/Tetris-Gymnasium/).

## License

This project is licensed under the MIT License - see the [LICENSE](../LICENSE) file for details.

## Acknowledgements

We would like to thank the creators and maintainers of [Gymnasium](https://github.com/Farama-Foundation/Gymnasium), [CleanRL](https://github.com/vwxyzjn/cleanrl) and [Tetris-deep-Q-learning-pytorch](https://github.com/uvipen/Tetris-deep-Q-learning-pytorch) for providing powerful frameworks and reference implementations.

```{toctree}
:maxdepth: 2
:caption: Introduction
Expand Down
6 changes: 3 additions & 3 deletions docs/introduction/installation.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
# Installation

At the moment, the only way to install the package is to clone the repository and to install it using poetry.
At the moment, you can install the environment by cloning the repository and running `poetry install`.

> In the near future, this library will be distributed via PyPI.
```{code-block} bash
git clone https://github.com/Max-We/Tetris-Gymnasium.git
cd Tetris-Gymnasium
poetry install
```

In the near future, this library will be distributed via PyPI.
2 changes: 2 additions & 0 deletions docs/introduction/quickstart.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,3 +43,5 @@ poetry run python examples/train_lin.py # uses a linear model
# or
poetry run python examples/train_cnn.py # uses convolutions
```

You can refer to the [CleanRL documentation](https://docs.cleanrl.dev/rl-algorithms/dqn/) for more information on the training script. Note: If you have tracking enabled, you will be prompted to log in to Weights & Biases during the first run. This behavior can be adjusted in the script.
6 changes: 3 additions & 3 deletions docs/utilities/wrappers.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,11 +12,11 @@ shape of the observation space or for adding additional information to the obser
### Implementations

```{eval-rst}
.. autoclass:: tetris_gymnasium.wrappers.observation.CnnObservation
.. autoclass:: tetris_gymnasium.wrappers.observation.RgbObservation
```

#### Methods
```{eval-rst}
.. automethod:: tetris_gymnasium.wrappers.observation.CnnObservation.observation
.. automethod:: tetris_gymnasium.wrappers.observation.CnnObservation.render
.. automethod:: tetris_gymnasium.wrappers.observation.RgbObservation.observation
.. automethod:: tetris_gymnasium.wrappers.observation.RgbObservation.render
```
28 changes: 13 additions & 15 deletions examples/play_interactive.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,48 +7,46 @@

if __name__ == "__main__":
    # Create an instance of Tetris
    tetris_game = gym.make("tetris_gymnasium/Tetris", render_mode="human")
    tetris_game.reset(seed=42)
    env = gym.make("tetris_gymnasium/Tetris", render_mode="human")
    env.reset(seed=42)

    # Main game loop
    terminated = False
    while not terminated:
        # Render the current state of the game as text
        tetris_game.render()
        env.render()

        # Pick an action from user input mapped to the keyboard
        action = None
        while action is None:
            key = cv2.waitKey(1)

            if key == ord("a"):
                action = tetris_game.unwrapped.actions.move_left
                action = env.unwrapped.actions.move_left
            elif key == ord("d"):
                action = tetris_game.unwrapped.actions.move_right
                action = env.unwrapped.actions.move_right
            elif key == ord("s"):
                action = tetris_game.unwrapped.actions.move_down
                action = env.unwrapped.actions.move_down
            elif key == ord("w"):
                action = tetris_game.unwrapped.actions.rotate_counterclockwise
                action = env.unwrapped.actions.rotate_counterclockwise
            elif key == ord("e"):
                action = tetris_game.unwrapped.actions.rotate_clockwise
                action = env.unwrapped.actions.rotate_clockwise
            elif key == ord(" "):
                action = tetris_game.unwrapped.actions.hard_drop
                action = env.unwrapped.actions.hard_drop
            elif key == ord("q"):
                action = tetris_game.unwrapped.actions.swap
                action = env.unwrapped.actions.swap
            elif key == ord("r"):
                tetris_game.reset(seed=42)
                env.reset(seed=42)
                break

            if (
                cv2.getWindowProperty(
                    tetris_game.unwrapped.window_name, cv2.WND_PROP_VISIBLE
                )
                cv2.getWindowProperty(env.unwrapped.window_name, cv2.WND_PROP_VISIBLE)
                == 0
            ):
                sys.exit()

        # Perform the action
        observation, reward, terminated, truncated, info = tetris_game.step(action)
        observation, reward, terminated, truncated, info = env.step(action)

    # Game over
    print("Game Over!")
32 changes: 15 additions & 17 deletions examples/play_interactive_cnn.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,53 +4,51 @@
import gymnasium as gym

from tetris_gymnasium.envs import Tetris
from tetris_gymnasium.wrappers.observation import CnnObservation
from tetris_gymnasium.wrappers.observation import RgbObservation

if __name__ == "__main__":
    # Create an instance of Tetris
    tetris_game = gym.make("tetris_gymnasium/Tetris", render_mode="human")
    tetris_game.reset(seed=42)
    tetris_game = CnnObservation(tetris_game)
    env = gym.make("tetris_gymnasium/Tetris", render_mode="human")
    env = RgbObservation(env)
    env.reset(seed=42)

    # Main game loop
    terminated = False
    while not terminated:
        # Render the current state of the game as text
        tetris_game.render()
        env.render()

        # Pick an action from user input mapped to the keyboard
        action = None
        while action is None:
            key = cv2.waitKey(1)

            if key == ord("a"):
                action = tetris_game.unwrapped.actions.move_left
                action = env.unwrapped.actions.move_left
            elif key == ord("d"):
                action = tetris_game.unwrapped.actions.move_right
                action = env.unwrapped.actions.move_right
            elif key == ord("s"):
                action = tetris_game.unwrapped.actions.move_down
                action = env.unwrapped.actions.move_down
            elif key == ord("w"):
                action = tetris_game.unwrapped.actions.rotate_counterclockwise
                action = env.unwrapped.actions.rotate_counterclockwise
            elif key == ord("e"):
                action = tetris_game.unwrapped.actions.rotate_clockwise
                action = env.unwrapped.actions.rotate_clockwise
            elif key == ord(" "):
                action = tetris_game.unwrapped.actions.hard_drop
                action = env.unwrapped.actions.hard_drop
            elif key == ord("q"):
                action = tetris_game.unwrapped.actions.swap
                action = env.unwrapped.actions.swap
            elif key == ord("r"):
                tetris_game.reset(seed=42)
                env.reset(seed=42)
                break

            if (
                cv2.getWindowProperty(
                    tetris_game.unwrapped.window_name, cv2.WND_PROP_VISIBLE
                )
                cv2.getWindowProperty(env.unwrapped.window_name, cv2.WND_PROP_VISIBLE)
                == 0
            ):
                sys.exit()

        # Perform the action
        observation, reward, terminated, truncated, info = tetris_game.step(action)
        observation, reward, terminated, truncated, info = env.step(action)

    # Game over
    print("Game Over!")
20 changes: 5 additions & 15 deletions examples/play_random.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,22 +3,12 @@
from tetris_gymnasium.envs.tetris import Tetris

if __name__ == "__main__":
    # Create an instance of Tetris
    tetris_game = gym.make("tetris_gymnasium/Tetris", render_mode="ansi")
    tetris_game.reset(seed=42)
    env = gym.make("tetris_gymnasium/Tetris", render_mode="ansi")
    env.reset(seed=42)

    # Main game loop
    terminated = False
    while not terminated:
        # Render the current state of the game as text
        ansi = tetris_game.render()
        print(ansi + "\n")

        # Take a random action (for demonstration purposes)
        action = tetris_game.action_space.sample()

        # Perform the action
        observation, reward, terminated, truncated, info = tetris_game.step(action)

        # Game over
        print(env.render() + "\n")
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
    print("Game Over!")
