Update AEC & Parallel docs to include examples, fix all docs links to use sphinx links (#1055)
elliottower authored Aug 11, 2023
1 parent a83b98c commit a3c096d
Showing 20 changed files with 35 additions and 28 deletions.
11 changes: 7 additions & 4 deletions docs/api/aec.md
@@ -8,13 +8,16 @@ title: AEC

By default, PettingZoo models games as [*Agent Environment Cycle*](https://arxiv.org/abs/2009.13051) (AEC) environments. This allows PettingZoo to represent any type of game that multi-agent RL can consider.

For more information, see [About AEC](#about-aec) or [*PettingZoo: A Standard API for Multi-Agent Reinforcement Learning*](https://arxiv.org/pdf/2009.14471.pdf).

[PettingZoo Wrappers](/api/wrappers/pz_wrappers/) can be used to convert between Parallel and AEC environments, with some restrictions (e.g., an AEC env must only update once at the end of each cycle).

## Examples
[PettingZoo Classic](/environments/classic/) provides standard examples of AEC environments for turn-based games, many of which implement [Illegal Action Masking](#action-masking).

We provide a [tutorial](/content/environment_creation/) for creating a simple Rock-Paper-Scissors AEC environment, showing how games with simultaneous actions can also be represented with AEC environments.

[PettingZoo Wrappers](/api/wrappers/pz_wrappers/) can be used to convert between Parallel and AEC environments, with some restrictions (e.g., an AEC env must only update once at the end of each cycle).

For more information, see [About AEC](#about-aec) or [*PettingZoo: A Standard API for Multi-Agent Reinforcement Learning*](https://arxiv.org/pdf/2009.14471.pdf).

## Usage

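The usage code itself is collapsed in this diff view. For reference, a minimal AEC interaction loop looks roughly like the sketch below; the choice of the Rock-Paper-Scissors environment is illustrative only and not part of this commit.

```python
from pettingzoo.classic import rps_v2

env = rps_v2.env()
env.reset(seed=42)

# AEC environments are stepped one agent at a time.
for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()
    # A terminated or truncated agent must receive a None action.
    action = None if termination or truncation else env.action_space(agent).sample()
    env.step(action)
env.close()
```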
@@ -69,12 +72,12 @@ env.close()

Note: action masking is optional, and can be implemented using either `observation` or `info`.

* [PettingZoo Classic](https://pettingzoo.farama.org/environments/classic/) environments store action masks in the `observation` dict:
* [PettingZoo Classic](/environments/classic/) environments store action masks in the `observation` dict:
* `mask = observation["action_mask"]`
* [Shimmy](https://shimmy.farama.org/)'s [OpenSpiel environments](https://shimmy.farama.org/environments/open_spiel/) store action masks in the `info` dict:
* `mask = info["action_mask"]`
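
Either way, the retrieved mask is typically passed to the action space's `sample()` call so that only legal actions are drawn. A minimal sketch of the body of the `agent_iter()` loop shown under Usage, assuming a discrete action space and an `int8` mask as provided by the Classic environments:

```python
observation, reward, termination, truncation, info = env.last()
if termination or truncation:
    action = None
else:
    mask = observation["action_mask"]              # or info["action_mask"] for OpenSpiel envs
    action = env.action_space(agent).sample(mask)  # sample only among legal actions
env.step(action)
```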

To implement action masking in a custom environment, see [Environment Creation: Action Masking](https://pettingzoo.farama.org/tutorials/environmentcreation/3-action-masking/)
To implement action masking in a custom environment, see [Environment Creation: Action Masking](/tutorials/environmentcreation/3-action-masking/)

For more information on action masking, see [A Closer Look at Invalid Action Masking in Policy Gradient Algorithms](https://arxiv.org/abs/2006.14171) (Huang, 2022)

6 changes: 5 additions & 1 deletion docs/api/parallel.md
@@ -11,7 +11,11 @@ For a comparison with the AEC API, see [About AEC](https://pettingzoo.farama.org

[PettingZoo Wrappers](/api/wrappers/pz_wrappers/) can be used to convert between Parallel and AEC environments, with some restrictions (e.g., an AEC env must only update once at the end of each cycle).

We provide tutorials for creating two custom Parallel environments: [Rock-Paper-Scissors](https://pettingzoo.farama.org/content/environment_creation/#example-custom-parallel-environment), and a simple [gridworld environment](https://pettingzoo.farama.org/tutorials/environmentcreation/2-environment-logic/)
## Examples

[PettingZoo Butterfly](/environments/butterfly/) provides standard examples of Parallel environments, such as [Pistonball](/environments/butterfly/pistonball).

We provide tutorials for creating two custom Parallel environments: [Rock-Paper-Scissors (Parallel)](https://pettingzoo.farama.org/content/environment_creation/#example-custom-parallel-environment), and a simple [gridworld environment](/tutorials/environmentcreation/2-environment-logic/)

## Usage

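As with the AEC docs, the usage example is collapsed in this diff view; a minimal Parallel-API loop (sketched here with Pistonball purely for illustration) looks like:

```python
from pettingzoo.butterfly import pistonball_v6

env = pistonball_v6.parallel_env()
observations, infos = env.reset(seed=42)

# Parallel environments step all live agents at once with a dict of actions.
while env.agents:
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
```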
2 changes: 1 addition & 1 deletion docs/api/wrappers/supersuit_wrappers.md
@@ -6,7 +6,7 @@ title: Supersuit Wrappers

The [SuperSuit](https://github.com/Farama-Foundation/SuperSuit) companion package (`pip install supersuit`) includes a collection of pre-processing functions which can be applied to both [AEC](/api/aec/) and [Parallel](/api/parallel/) environments.

To convert [space invaders](https://pettingzoo.farama.org/environments/atari/space_invaders/) to a greyscale observation space and stack the last 4 frames:
To convert [space invaders](/environments/atari/space_invaders/) to a greyscale observation space and stack the last 4 frames:

``` python
from pettingzoo.atari import space_invaders_v2
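
# The rest of this snippet is collapsed in the diff view. A plausible completion,
# assuming SuperSuit's color_reduction_v0 and frame_stack_v1 wrappers:
from supersuit import color_reduction_v0, frame_stack_v1

env = space_invaders_v2.env()
env = color_reduction_v0(env, mode="full")  # greyscale observations
env = frame_stack_v1(env, 4)                # stack the 4 most recent frames
```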
2 changes: 1 addition & 1 deletion docs/environments/atari.md
@@ -52,7 +52,7 @@ Install ROMs using [AutoROM](https://github.com/Farama-Foundation/AutoROM), or s

### Usage

To launch a [Space Invaders](https://pettingzoo.farama.org/environments/atari/space_invaders/) environment with random agents:
To launch a [Space Invaders](/environments/atari/space_invaders/) environment with random agents:
```python
from pettingzoo.atari import space_invaders_v2

10 changes: 5 additions & 5 deletions docs/environments/butterfly.md
@@ -21,9 +21,9 @@ Butterfly environments are challenging scenarios created by Farama, using Pygame
All environments require a high degree of coordination and the learning of emergent behaviors to achieve an optimal policy. As such, these environments are currently very challenging to learn.

Environments are highly configurable via arguments specified in their respective documentation:
[Cooperative Pong](https://pettingzoo.farama.org/environments/butterfly/cooperative_pong/),
[Knights Archers Zombies](https://pettingzoo.farama.org/environments/butterfly/knights_archers_zombies/),
[Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/).
[Cooperative Pong](/environments/butterfly/cooperative_pong/),
[Knights Archers Zombies](/environments/butterfly/knights_archers_zombies/),
[Pistonball](/environments/butterfly/pistonball/).

### Installation
The unique dependencies for this set of environments can be installed via:
@@ -34,7 +34,7 @@ pip install pettingzoo[butterfly]

### Usage

To launch a [Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environment with random agents:
To launch a [Pistonball](/environments/butterfly/pistonball/) environment with random agents:
```python
from pettingzoo.butterfly import pistonball_v6

@@ -49,7 +49,7 @@ while env.agents:
env.close()
```

To launch a [Knights Archers Zombies](https://pettingzoo.farama.org/environments/butterfly/knights_archers_zombies/) environment with interactive user input (see [manual_policy.py](https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/butterfly/knights_archers_zombies/manual_policy.py)):
To launch a [Knights Archers Zombies](/environments/butterfly/knights_archers_zombies/) environment with interactive user input (see [manual_policy.py](https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/butterfly/knights_archers_zombies/manual_policy.py)):
```python
import pygame
from pettingzoo.butterfly import knights_archers_zombies_v10
2 changes: 1 addition & 1 deletion docs/environments/classic.md
@@ -36,7 +36,7 @@ pip install pettingzoo[classic]

### Usage

To launch a [Texas Holdem](https://pettingzoo.farama.org/environments/classic/texas_holdem/) environment with random agents:
To launch a [Texas Holdem](/environments/classic/texas_holdem/) environment with random agents:
``` python
from pettingzoo.classic import texas_holdem_v4

2 changes: 1 addition & 1 deletion docs/environments/mpe.md
@@ -34,7 +34,7 @@ pip install pettingzoo[mpe]
````

### Usage
To launch a [Simple Tag](https://pettingzoo.farama.org/environments/mpe/simple_tag/) environment with random agents:
To launch a [Simple Tag](/environments/mpe/simple_tag/) environment with random agents:

``` python
from pettingzoo.mpe import simple_tag_v3
2 changes: 1 addition & 1 deletion docs/environments/sisl.md
@@ -27,7 +27,7 @@ pip install pettingzoo[sisl]
````

### Usage
To launch a [Waterworld](https://pettingzoo.farama.org/environments/sisl/waterworld/) environment with random agents:
To launch a [Waterworld](/environments/sisl/waterworld/) environment with random agents:

```python
from pettingzoo.sisl import waterworld_v4
2 changes: 1 addition & 1 deletion docs/environments/third_party_envs.md
@@ -106,7 +106,7 @@ Interactive PettingZoo implementation of the [Cathedral](https://en.wikipedia.or
[![PettingZoo version dependency](https://img.shields.io/badge/PettingZoo-v1.22.4-blue)]()
[![HuggingFace likes](https://img.shields.io/badge/stars-_2-blue)]()

Play [Connect Four](https://pettingzoo.farama.org/environments/classic/connect_four/) in real-time against an [RLlib](https://docs.ray.io/en/latest/rllib/index.html) agent trained via self-play and PPO.
Play [Connect Four](/environments/classic/connect_four/) in real-time against an [RLlib](https://docs.ray.io/en/latest/rllib/index.html) agent trained via self-play and PPO.
* Online game demo (using [Gradio](https://www.gradio.app/) and [HuggingFace Spaces](https://huggingface.co/docs/hub/spaces-overview)): [link](https://huggingface.co/spaces/ClementBM/connectfour), [tutorial](https://clementbm.github.io/project/2023/03/29/reinforcement-learning-connect-four-rllib.html)


2 changes: 1 addition & 1 deletion docs/index.md
@@ -72,7 +72,7 @@ An API standard for multi-agent reinforcement learning.
**PettingZoo is a simple, pythonic interface capable of representing general multi-agent reinforcement learning (MARL) problems.**
PettingZoo includes a wide variety of reference environments, helpful utilities, and tools for creating your own custom environments.

The [AEC API](https://pettingzoo.farama.org/api/aec/) supports sequential turn based environments, while the [Parallel API](https://pettingzoo.farama.org/api/parallel/) supports environments with simultaneous actions.
The [AEC API](/api/aec/) supports sequential turn based environments, while the [Parallel API](/api/parallel/) supports environments with simultaneous actions.

Environments can be interacted with using a similar interface to [Gymnasium](https://gymnasium.farama.org):

2 changes: 1 addition & 1 deletion docs/tutorials/cleanrl/advanced_PPO.md
@@ -4,7 +4,7 @@ title: "CleanRL: Advanced PPO"

# CleanRL: Advanced PPO

This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on [Atari](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environments ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on [Atari](/environments/atari/) environments ([Parallel](/api/parallel/)).
This is a full training script including CLI, logging and integration with [TensorBoard](https://www.tensorflow.org/tensorboard) and [WandB](https://wandb.ai/) for experiment tracking.

This tutorial is mirrored from [CleanRL](https://github.com/vwxyzjn/cleanrl)'s examples. Full documentation and experiment results can be found at [https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy](https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy)
2 changes: 1 addition & 1 deletion docs/tutorials/cleanrl/implementing_PPO.md
@@ -4,7 +4,7 @@ title: "CleanRL: Implementing PPO"

# CleanRL: Implementing PPO

This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on the [Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on the [Pistonball](/environments/butterfly/pistonball/) environment ([Parallel](/api/parallel/)).

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
2 changes: 1 addition & 1 deletion docs/tutorials/rllib/holdem.md
@@ -4,7 +4,7 @@ title: "RLlib: DQN for Simple Poker"

# RLlib: DQN for Simple Poker

This tutorial shows how to train a [Deep Q-Network](https://docs.ray.io/en/latest/rllib/rllib-algorithms.html#deep-q-networks-dqn-rainbow-parametric-dqn) (DQN) agent on the [Leduc Hold'em](https://pettingzoo.farama.org/environments/classic/leduc_holdem/) environment ([AEC](https://pettingzoo.farama.org/api/aec/)).
This tutorial shows how to train a [Deep Q-Network](https://docs.ray.io/en/latest/rllib/rllib-algorithms.html#deep-q-networks-dqn-rainbow-parametric-dqn) (DQN) agent on the [Leduc Hold'em](/environments/classic/leduc_holdem/) environment ([AEC](/api/aec/)).

After training, run the provided code to watch your trained agent play vs itself. See the [documentation](https://docs.ray.io/en/latest/rllib/rllib-saving-and-loading-algos-and-policies.html) for more information.

2 changes: 1 addition & 1 deletion docs/tutorials/rllib/pistonball.md
@@ -4,7 +4,7 @@ title: "RLlib: PPO for Pistonball (Parallel)"

# RLlib: PPO for Pistonball

This tutorial shows how to train [Proximal Policy Optimization](https://docs.ray.io/en/latest/rllib/rllib-algorithms.html#ppo) (PPO) agents on the [Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
This tutorial shows how to train [Proximal Policy Optimization](https://docs.ray.io/en/latest/rllib/rllib-algorithms.html#ppo) (PPO) agents on the [Pistonball](/environments/butterfly/pistonball/) environment ([Parallel](/api/parallel/)).

After training, run the provided code to watch your trained agent play vs itself. See the [documentation](https://docs.ray.io/en/latest/rllib/rllib-saving-and-loading-algos-and-policies.html) for more information.

2 changes: 1 addition & 1 deletion docs/tutorials/sb3/connect_four.md
@@ -4,7 +4,7 @@ title: "SB3: Action Masked PPO for Connect Four"

# SB3: Action Masked PPO for Connect Four

This tutorial shows how to train a agents using Maskable [Proximal Policy Optimization](https://sb3-contrib.readthedocs.io/en/master/modules/ppo_mask.html) (PPO) on the [Connect Four](https://pettingzoo.farama.org/environments/classic/chess/) environment ([AEC](https://pettingzoo.farama.org/api/aec/)).
This tutorial shows how to train agents using Maskable [Proximal Policy Optimization](https://sb3-contrib.readthedocs.io/en/master/modules/ppo_mask.html) (PPO) on the [Connect Four](/environments/classic/connect_four/) environment ([AEC](/api/aec/)).

It creates a custom Wrapper to convert to a [Gymnasium](https://gymnasium.farama.org/)-like environment which is compatible with [SB3 action masking](https://sb3-contrib.readthedocs.io/en/master/modules/ppo_mask.html).

2 changes: 1 addition & 1 deletion docs/tutorials/sb3/kaz.md
@@ -4,7 +4,7 @@ title: "SB3: PPO for Knights-Archers-Zombies"

# SB3: PPO for Knights-Archers-Zombies

This tutorial shows how to train agents using [Proximal Policy Optimization](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html) (PPO) on the [Knights-Archers-Zombies](https://pettingzoo.farama.org/environments/butterfly/knights_archers_zombies/) environment ([AEC](https://pettingzoo.farama.org/api/aec/)).
This tutorial shows how to train agents using [Proximal Policy Optimization](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html) (PPO) on the [Knights-Archers-Zombies](/environments/butterfly/knights_archers_zombies/) environment ([AEC](/api/aec/)).

We use SuperSuit to create vectorized environments, leveraging multithreading to speed up training (see SB3's [vector environments documentation](https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html)).

2 changes: 1 addition & 1 deletion docs/tutorials/sb3/waterworld.md
@@ -4,7 +4,7 @@ title: "SB3: PPO for Waterworld (Parallel)"

# SB3: PPO for Waterworld

This tutorial shows how to train agents using [Proximal Policy Optimization](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html) (PPO) on the [Waterworld](https://pettingzoo.farama.org/environments/sisl/waterworld/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
This tutorial shows how to train agents using [Proximal Policy Optimization](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html) (PPO) on the [Waterworld](/environments/sisl/waterworld/) environment ([Parallel](/api/parallel/)).

We use SuperSuit to create vectorized environments, leveraging multithreading to speed up training (see SB3's [vector environments documentation](https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html)).

4 changes: 2 additions & 2 deletions docs/tutorials/tianshou/advanced.md
@@ -4,9 +4,9 @@ title: "Tianshou: CLI and Logging"

# Tianshou: CLI and Logging

This tutorial is a full example using Tianshou to train a [Deep Q-Network](https://tianshou.readthedocs.io/en/master/tutorials/dqn.html) (DQN) agent on the [Tic-Tac-Toe](https://pettingzoo.farama.org/environments/classic/tictactoe/) environment.
This tutorial is a full example using Tianshou to train a [Deep Q-Network](https://tianshou.readthedocs.io/en/master/tutorials/dqn.html) (DQN) agent on the [Tic-Tac-Toe](/environments/classic/tictactoe/) environment.

It extends the code from [Training Agents](https://pettingzoo.farama.org/tutorials/tianshou/intermediate/) to add CLI (using [argparse](https://docs.python.org/3/library/argparse.html)) and logging (using Tianshou's [Logger](https://tianshou.readthedocs.io/en/master/tutorials/logger.html)).
It extends the code from [Training Agents](/tutorials/tianshou/intermediate/) to add CLI (using [argparse](https://docs.python.org/3/library/argparse.html)) and logging (using Tianshou's [Logger](https://tianshou.readthedocs.io/en/master/tutorials/logger.html)).


## Environment Setup
2 changes: 1 addition & 1 deletion docs/tutorials/tianshou/beginner.md
@@ -6,7 +6,7 @@ title: "Tianshou: Basic API Usage"

This tutorial is a simple example of how to use [Tianshou](https://github.com/thu-ml/tianshou) with a PettingZoo environment.

It demonstrates a game betwenen two [random policy](https://tianshou.readthedocs.io/en/master/_modules/tianshou/policy/random.html) agents in the [rock-paper-scissors](https://pettingzoo.farama.org/environments/classic/rps/) environment.
It demonstrates a game between two [random policy](https://tianshou.readthedocs.io/en/master/_modules/tianshou/policy/random.html) agents in the [rock-paper-scissors](/environments/classic/rps/) environment.

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
2 changes: 1 addition & 1 deletion docs/tutorials/tianshou/intermediate.md
@@ -4,7 +4,7 @@ title: "Tianshou: Training Agents"

# Tianshou: Training Agents

This tutorial shows how to use [Tianshou](https://github.com/thu-ml/tianshou) to train a [Deep Q-Network](https://tianshou.readthedocs.io/en/master/tutorials/dqn.html) (DQN) agent to play vs a [random policy](https://tianshou.readthedocs.io/en/master/_modules/tianshou/policy/random.html) agent in the [Tic-Tac-Toe](https://pettingzoo.farama.org/environments/classic/tictactoe/) environment.
This tutorial shows how to use [Tianshou](https://github.com/thu-ml/tianshou) to train a [Deep Q-Network](https://tianshou.readthedocs.io/en/master/tutorials/dqn.html) (DQN) agent to play vs a [random policy](https://tianshou.readthedocs.io/en/master/_modules/tianshou/policy/random.html) agent in the [Tic-Tac-Toe](/environments/classic/tictactoe/) environment.

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.