
Commit

Merge pull request #22 from Max-We/train
Train script and documentation
  • Loading branch information
Max-We authored May 23, 2024
2 parents 07e72bd + 373a050 commit 3c4097f
Showing 18 changed files with 1,374 additions and 607 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -165,3 +165,5 @@ local_experiments/
# Examples / training outputs
examples/runs
runs
wandb
videos
42 changes: 22 additions & 20 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,41 +1,43 @@
![logo](./docs/_static/logo.png "Tetris Gymnasium")

> Tetris Gymnasium is currently under early development!
Tetris Gymnasium is tightly integrated with Gymnasium and exposes a simple API for training agents to play Tetris.

The environment offers state-of-the-art performance and holds a high standard for code quality. With it, researchers and developers can focus on their research and development, rather than the environment itself.

Getting started is easy. Here is a simple example of an environment with random actions:

```python
import gymnasium as gym
from tetris_gymnasium.envs import Tetris

env = gym.make("tetris_gymnasium/Tetris")
observation, info = env.reset(seed=42)
for _ in range(1000):
    action = env.action_space.sample()  # this is where you would insert your policy
    observation, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        observation, info = env.reset()
env = gym.make("tetris_gymnasium/Tetris", render_mode="ansi")
env.reset(seed=42)

env.close()
terminated = False
while not terminated:
    print(env.render() + "\n")
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
print("Game Over!")
```

##
## Background

Tetris Gymnasium tries to solve problems of other environments by being modular, understandable and adjustable. You can read more about the background in our paper: _Piece by Piece: Assembling a Modular Reinforcement Learning Environment for Tetris_ (SOON).

Abstract:

The documentation can be found on [GitHub Pages](https://max-we.github.io/Tetris-Gymnasium/)
>The game of Tetris is an open challenge in machine learning and especially Reinforcement Learning (RL). Despite its popularity, contemporary environments for the game lack key qualities, such as clear documentation, an up-to-date codebase or game-related features.
This work introduces Tetris Gymnasium, a modern RL environment built with Gymnasium that aims to address these problems by being modular, understandable and adjustable.
To evaluate Tetris Gymnasium on these qualities, a Deep Q-Learning agent was trained and compared to a baseline environment; it was found to fulfill all requirements of a feature-complete RL environment while being adjustable to many different requirements.
The source code and documentation are available on GitHub and can be used for free under the MIT license.

## Documentation

The full documentation of the project can be found on [GitHub Pages](https://max-we.github.io/Tetris-Gymnasium/).

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgements

We would like to thank the creators and maintainers of Gymnasium, CleanRL and Tetris-deep-Q-learning-pytorch for providing powerful frameworks and reference implementations

---

Enjoy using the Gymnasium Tetris Environment for your reinforcement learning experiments! If you have any questions or need further assistance, don't hesitate to reach out to us. Happy coding! 🎮🕹️
We would like to thank the creators and maintainers of [Gymnasium](https://github.com/Farama-Foundation/Gymnasium), [CleanRL](https://github.com/vwxyzjn/cleanrl) and [Tetris-deep-Q-learning-pytorch](https://github.com/uvipen/Tetris-deep-Q-learning-pytorch) for providing powerful frameworks and reference implementations.
53 changes: 52 additions & 1 deletion docs/development/contributing.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,54 @@
# Contributing

Under construction.
Contributions are always welcome, no matter how large or small. Feel free to create an issue or a pull request.
If you want to contribute, please read the following guidelines.

The following section is based on the [Gymnasium contributing guidelines](https://github.com/Farama-Foundation/Gymnasium/blob/main/CONTRIBUTING.md).

## Type checking

The project uses `pyright` to check types.
To type check locally, install `pyright` per the official [instructions](https://github.com/microsoft/pyright#command-line).
Its configuration lives within `pyproject.toml` and includes the lists of files that are currently included in and excluded from type checking.
To run `pyright` for the project, run the pre-commit process (`pre-commit run --all-files`) or `pyright --project=pyproject.toml`.
Alternatively, pyright is a built-in feature of VSCode that will automatically provide type hinting.

### Adding typing to more modules and packages

If you would like to add typing to a module in the project,
the lists of included, excluded and strict files can be found in `pyproject.toml` under `[tool.pyright]`.
To run `pyright` for the project, run the pre-commit process (`pre-commit run --all-files`) or `pyright`.
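For orientation, a `[tool.pyright]` section of this shape is what the guideline refers to. The concrete file lists below are illustrative assumptions, not the project's actual configuration; the authoritative values live in the repository's own `pyproject.toml`:

```toml
# Hypothetical sketch only -- see the repository's pyproject.toml for the real lists.
[tool.pyright]
include = ["tetris_gymnasium/**"]
exclude = ["**/__pycache__"]
strict = []
reportMissingImports = "warning"
```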

## Git hooks

The CI will run several checks on new code pushed to the repository. These checks can also be run locally without waiting for the CI by following the steps below:

1. [Install `pre-commit`](https://pre-commit.com/#install).
2. Install the Git hooks by running `pre-commit install`.

Once those two steps are done, the Git hooks will be run automatically at every new commit.
The Git hooks can also be run manually with `pre-commit run --all-files`, and if needed they can be skipped (not recommended) with `git commit --no-verify`.
**Note:** you may have to run `pre-commit run --all-files` manually a couple of times to make it pass when you commit, as each formatting tool will first format the code and fail the first time but should pass the second time.
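For context, the hooks installed by `pre-commit install` are declared in a `.pre-commit-config.yaml` file at the repository root. The repos, revisions and hook ids below are an illustrative assumption of what such a file looks like, not the project's actual configuration:

```yaml
# Hypothetical example -- the project's real .pre-commit-config.yaml may differ.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
```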

Additionally, for pull requests, the project runs a number of tests for the whole project using [pytest](https://docs.pytest.org/en/latest/getting-started.html#install-pytest).
These tests can be run locally with `pytest` in the root folder. If any doctest is modified, run `pytest --doctest-modules --doctest-continue-on-failure tetris_gymnasium` to check the changes.

## Docstrings

Pydocstyle has been added to the pre-commit process such that all new functions follow the [google docstring style](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html).
All new functions require either a short docstring (a single line explaining the purpose of the function)
or a multiline docstring that documents each argument and the return type (if there is one).
In addition, new files and classes require top docstrings that outline the purpose of the file or class.
For classes, code block examples can be provided in the top docstring rather than in the constructor.

To check that your docstrings are correct, run `pre-commit run --all-files` or `pydocstyle --source --explain --convention=google`.
For every docstring that fails, the source and the reason for the failure are provided.
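As an illustration, a function with a Google-style multiline docstring might look like the following. The function itself is a hypothetical example written for this guide, not part of the library:

```python
def clear_lines(board):
    """Remove all fully filled rows from a Tetris board.

    Args:
        board: A list of rows, where each row is a list of cell values
            (0 for empty, any non-zero value for filled).

    Returns:
        A tuple ``(new_board, cleared)`` where ``new_board`` has the same
        dimensions as ``board`` with full rows removed and empty rows
        prepended, and ``cleared`` is the number of rows that were removed.
    """
    # Keep only rows that still contain at least one empty cell.
    remaining = [row for row in board if not all(row)]
    cleared = len(board) - len(remaining)
    width = len(board[0]) if board else 0
    # Prepend empty rows so the board keeps its original height.
    new_board = [[0] * width for _ in range(cleared)] + remaining
    return new_board, cleared
```

A single-line docstring (`"""Remove all fully filled rows from a Tetris board."""`) would also satisfy the guideline for simple helpers.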

## Building the docs

You can use `sphinx-autobuild` to build the documentation locally.

```shell
cd docs
poetry run sphinx-autobuild -b dirhtml --watch ../tetris_gymnasium --re-ignore "pickle$" . _build
```
45 changes: 33 additions & 12 deletions docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,29 +11,50 @@ lastpage:
```{project-heading}
A customisable, easy-to-use and performant Tetris environment for Gymnasium
```

Tetris Gymnasium is tightly integrated with Gymnasium and exposes a simple API for training agents to play Tetris.

The environment offers state-of-the-art performance and holds a high standard for code quality. With it, researchers and developers can focus on their research and development, rather than the environment itself.

Getting started is easy. Here is a simple example of an environment with random actions:

```{code-block} python
import gymnasium as gym
from tetris_gymnasium.envs import Tetris
env = gym.make("tetris_gymnasium/Tetris")
observation, info = env.reset(seed=42)
for _ in range(1000):
    action = env.action_space.sample()  # this is where you would insert your policy
    observation, reward, terminated, truncated, info = env.step(action)
env = gym.make("tetris_gymnasium/Tetris", render_mode="ansi")
env.reset(seed=42)
    if terminated or truncated:
        observation, info = env.reset()
env.close()
terminated = False
while not terminated:
    print(env.render() + "\n")
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
print("Game Over!")
```

# Information

## Background

Tetris Gymnasium tries to solve problems of other environments by being modular, understandable and adjustable. You can read more about the background in our paper: _Piece by Piece: Assembling a Modular Reinforcement Learning Environment for Tetris_ (SOON).

Abstract:

>The game of Tetris is an open challenge in machine learning and especially Reinforcement Learning (RL). Despite its popularity, contemporary environments for the game lack key qualities, such as clear documentation, an up-to-date codebase or game-related features.
This work introduces Tetris Gymnasium, a modern RL environment built with Gymnasium that aims to address these problems by being modular, understandable and adjustable.
To evaluate Tetris Gymnasium on these qualities, a Deep Q-Learning agent was trained and compared to a baseline environment; it was found to fulfill all requirements of a feature-complete RL environment while being adjustable to many different requirements.
The source code and documentation are available on GitHub and can be used for free under the MIT license.

## Documentation

The full documentation of the project can be found on [GitHub Pages](https://max-we.github.io/Tetris-Gymnasium/).

## License

This project is licensed under the MIT License - see the [LICENSE](../LICENSE) file for details.

## Acknowledgements

We would like to thank the creators and maintainers of [Gymnasium](https://github.com/Farama-Foundation/Gymnasium), [CleanRL](https://github.com/vwxyzjn/cleanrl) and [Tetris-deep-Q-learning-pytorch](https://github.com/uvipen/Tetris-deep-Q-learning-pytorch) for providing powerful frameworks and reference implementations.

```{toctree}
:maxdepth: 2
:caption: Introduction
Expand Down
6 changes: 3 additions & 3 deletions docs/introduction/installation.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
# Installation

At the moment, the only way to install the package is to clone the repository and to install it using poetry.
At the moment, you can install the environment by cloning the repository and running `poetry install`.

> In the near future, this library will be distributed via PyPI.
```{code-block} bash
git clone https://github.com/Max-We/Tetris-Gymnasium.git
cd Tetris-Gymnasium
poetry install
```

In the near future, this library will be distributed via PyPI.
2 changes: 2 additions & 0 deletions docs/introduction/quickstart.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,3 +43,5 @@ poetry run python examples/train_lin.py # uses a linear model
# or
poetry run python examples/train_cnn.py # uses convolutions
```

You can refer to the [CleanRL documentation](https://docs.cleanrl.dev/rl-algorithms/dqn/) for more information on the training script. Note: If you have tracking enabled, you will be prompted to log in to Weights & Biases during the first run. This behavior can be adjusted in the script.
6 changes: 3 additions & 3 deletions docs/utilities/wrappers.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,11 +12,11 @@ shape of the observation space or for adding additional information to the obser
### Implementations

```{eval-rst}
.. autoclass:: tetris_gymnasium.wrappers.observation.CnnObservation
.. autoclass:: tetris_gymnasium.wrappers.observation.RgbObservation
```

#### Methods
```{eval-rst}
.. automethod:: tetris_gymnasium.wrappers.observation.CnnObservation.observation
.. automethod:: tetris_gymnasium.wrappers.observation.CnnObservation.render
.. automethod:: tetris_gymnasium.wrappers.observation.RgbObservation.observation
.. automethod:: tetris_gymnasium.wrappers.observation.RgbObservation.render
```
28 changes: 13 additions & 15 deletions examples/play_interactive.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,48 +7,46 @@

if __name__ == "__main__":
    # Create an instance of Tetris
    tetris_game = gym.make("tetris_gymnasium/Tetris", render_mode="human")
    tetris_game.reset(seed=42)
    env = gym.make("tetris_gymnasium/Tetris", render_mode="human")
    env.reset(seed=42)

    # Main game loop
    terminated = False
    while not terminated:
        # Render the current state of the game as text
        tetris_game.render()
        env.render()

        # Pick an action from user input mapped to the keyboard
        action = None
        while action is None:
            key = cv2.waitKey(1)

            if key == ord("a"):
                action = tetris_game.unwrapped.actions.move_left
                action = env.unwrapped.actions.move_left
            elif key == ord("d"):
                action = tetris_game.unwrapped.actions.move_right
                action = env.unwrapped.actions.move_right
            elif key == ord("s"):
                action = tetris_game.unwrapped.actions.move_down
                action = env.unwrapped.actions.move_down
            elif key == ord("w"):
                action = tetris_game.unwrapped.actions.rotate_counterclockwise
                action = env.unwrapped.actions.rotate_counterclockwise
            elif key == ord("e"):
                action = tetris_game.unwrapped.actions.rotate_clockwise
                action = env.unwrapped.actions.rotate_clockwise
            elif key == ord(" "):
                action = tetris_game.unwrapped.actions.hard_drop
                action = env.unwrapped.actions.hard_drop
            elif key == ord("q"):
                action = tetris_game.unwrapped.actions.swap
                action = env.unwrapped.actions.swap
            elif key == ord("r"):
                tetris_game.reset(seed=42)
                env.reset(seed=42)
                break

            if (
                cv2.getWindowProperty(
                    tetris_game.unwrapped.window_name, cv2.WND_PROP_VISIBLE
                )
                cv2.getWindowProperty(env.unwrapped.window_name, cv2.WND_PROP_VISIBLE)
                == 0
            ):
                sys.exit()

        # Perform the action
        observation, reward, terminated, truncated, info = tetris_game.step(action)
        observation, reward, terminated, truncated, info = env.step(action)

    # Game over
    print("Game Over!")
32 changes: 15 additions & 17 deletions examples/play_interactive_cnn.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,53 +4,51 @@
import gymnasium as gym

from tetris_gymnasium.envs import Tetris
from tetris_gymnasium.wrappers.observation import CnnObservation
from tetris_gymnasium.wrappers.observation import RgbObservation

if __name__ == "__main__":
    # Create an instance of Tetris
    tetris_game = gym.make("tetris_gymnasium/Tetris", render_mode="human")
    tetris_game.reset(seed=42)
    tetris_game = CnnObservation(tetris_game)
    env = gym.make("tetris_gymnasium/Tetris", render_mode="human")
    env = RgbObservation(env)
    env.reset(seed=42)

    # Main game loop
    terminated = False
    while not terminated:
        # Render the current state of the game as text
        tetris_game.render()
        env.render()

        # Pick an action from user input mapped to the keyboard
        action = None
        while action is None:
            key = cv2.waitKey(1)

            if key == ord("a"):
                action = tetris_game.unwrapped.actions.move_left
                action = env.unwrapped.actions.move_left
            elif key == ord("d"):
                action = tetris_game.unwrapped.actions.move_right
                action = env.unwrapped.actions.move_right
            elif key == ord("s"):
                action = tetris_game.unwrapped.actions.move_down
                action = env.unwrapped.actions.move_down
            elif key == ord("w"):
                action = tetris_game.unwrapped.actions.rotate_counterclockwise
                action = env.unwrapped.actions.rotate_counterclockwise
            elif key == ord("e"):
                action = tetris_game.unwrapped.actions.rotate_clockwise
                action = env.unwrapped.actions.rotate_clockwise
            elif key == ord(" "):
                action = tetris_game.unwrapped.actions.hard_drop
                action = env.unwrapped.actions.hard_drop
            elif key == ord("q"):
                action = tetris_game.unwrapped.actions.swap
                action = env.unwrapped.actions.swap
            elif key == ord("r"):
                tetris_game.reset(seed=42)
                env.reset(seed=42)
                break

            if (
                cv2.getWindowProperty(
                    tetris_game.unwrapped.window_name, cv2.WND_PROP_VISIBLE
                )
                cv2.getWindowProperty(env.unwrapped.window_name, cv2.WND_PROP_VISIBLE)
                == 0
            ):
                sys.exit()

        # Perform the action
        observation, reward, terminated, truncated, info = tetris_game.step(action)
        observation, reward, terminated, truncated, info = env.step(action)

    # Game over
    print("Game Over!")
20 changes: 5 additions & 15 deletions examples/play_random.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,22 +3,12 @@
from tetris_gymnasium.envs.tetris import Tetris

if __name__ == "__main__":
    # Create an instance of Tetris
    tetris_game = gym.make("tetris_gymnasium/Tetris", render_mode="ansi")
    tetris_game.reset(seed=42)
    env = gym.make("tetris_gymnasium/Tetris", render_mode="ansi")
    env.reset(seed=42)

    # Main game loop
    terminated = False
    while not terminated:
        # Render the current state of the game as text
        ansi = tetris_game.render()
        print(ansi + "\n")

        # Take a random action (for demonstration purposes)
        action = tetris_game.action_space.sample()

        # Perform the action
        observation, reward, terminated, truncated, info = tetris_game.step(action)

        # Game over
        print(env.render() + "\n")
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
    print("Game Over!")
