
Commit 9beb324

Alex Rutherford authored and committed

docs updates

1 parent 4d36509 commit 9beb324

10 files changed: +77 -37 lines changed

docs/Environments/mpe.md (21 additions & 22 deletions)
@@ -1,32 +1,27 @@
# MPE

- Multi Particle Environments (MPE) are a set of communication-oriented environments where particle agents can (sometimes) move, communicate, see each other, push each other around, and interact with fixed landmarks. We implement all of the [PettingZoo MPE Environments](https://pettingzoo.farama.org/environments/mpe/).
+ Multi Particle Environments (MPE) are a set of communication-oriented environments where particle agents can (sometimes) move, communicate, see each other, push each other around, and interact with fixed landmarks.

+ ![MPE](https://github.com/FLAIROx/JaxMARL/blob/main/docs/imgs/qmix_MPE_simple_tag_v3.gif?raw=true){ width=300px }
+ /// caption
+ MPE Simple Tag
+ ///

- <div class="collage">
- <div class="row" align="left">
- <img src="docs/qmix_MPE_simple_tag_v3.gif" alt="MPE Simple Tag" width="30%"/>
- <img src="docs/vdn_MPE_simple_spread_v3.gif" alt="MPE Simple Spread" width="30%"/>
- <img src="docs/qmix_MPE_simple_speaker_listener_v4.gif" alt="MPE Speaker Listener" width="30%">
- </div>
- </div>
-
+ ## Environments

+ We implement all of the [PettingZoo MPE Environments](https://pettingzoo.farama.org/environments/mpe/):

| Environment | JaxMARL Registry Name |
|---|---|
- | Simple | `MPE_simple_v3` |
- | Simple Push | `MPE_simple_push_v3` |
- | Simple Spread | `MPE_simple_spread_v3` |
- | Simple Crypto | `MPE_simple_crypto_v3` |
- | Simple Speaker Listener | `MPE_simple_speaker_listener_v4` |
- | Simple Tag | `MPE_simple_tag_v3` |
- | Simple World Comm | `MPE_simple_world_comm_v3` |
- | Simple Reference | `MPE_simple_reference_v3` |
- | Simple Adversary | `MPE_simple_adversary_v3` |
-
-
- The implementations follow the PettingZoo code as closely as possible, including sharing variable names and version numbers. There are occasional discrepancies between the PettingZoo code and docs; where this occurs we have followed the code. As our implementation closely follows the PettingZoo code, please refer to their documentation for further information on the environments.
+ | Simple | `MPE_simple_v3` |
+ | Simple Push | `MPE_simple_push_v3` |
+ | Simple Spread | `MPE_simple_spread_v3` |
+ | Simple Crypto | `MPE_simple_crypto_v3` |
+ | Simple Speaker Listener | `MPE_simple_speaker_listener_v4` |
+ | Simple Tag | `MPE_simple_tag_v3` |
+ | Simple World Comm | `MPE_simple_world_comm_v3` |
+ | Simple Reference | `MPE_simple_reference_v3` |
+ | Simple Adversary | `MPE_simple_adversary_v3` |

We additionally include a fully cooperative variant of Simple Tag, first used to evaluate FACMAC. In this environment, a number of agents attempt to tag a number of prey, where the prey are controlled by a heuristic AI.

@@ -36,6 +31,10 @@ We additionally include a fully cooperative variant of Simple Tag, first used to
| 6 agents, 2 prey | `MPE_simple_facmac_6a_v1` |
| 9 agents, 3 prey | `MPE_simple_facmac_9a_v1` |

+ ## Implementation notes
+
+ The implementations follow the PettingZoo code as closely as possible, including sharing variable names and version numbers. There are occasional discrepancies between the PettingZoo code and docs; where this occurs we have followed the code. As our implementation closely follows the PettingZoo code, please refer to their documentation for further information on the environments.
+
## Action Space
Following the PettingZoo implementation, we allow for both discrete and continuous action spaces in all MPE environments. The environments use discrete actions by default.

@@ -53,7 +52,7 @@ The exact observation varies for each environment, but in general it is a vector
## Visualisation
Check the example `mpe_introduction.py` file in the tutorials folder for an introduction to our implementation of the MPE environments, including an example visualisation. We animate the environment after the state transitions have been collected as follows:

- ```python
+ ``` python
import jax
from jaxmarl import make
from jaxmarl.environments.mpe import MPEVisualizer
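
The hunk above ends inside the snippet; in the `mpe_introduction.py` tutorial the flow continues by stepping the environment, storing each state, and only then animating. A minimal sketch of that flow, assuming the env.reset/env.step API shown in the docs/index.md hunk below; the MPEVisualizer constructor and the animate argument names are assumptions, so check the tutorial for the exact signatures:

``` python
import jax
from jaxmarl import make
from jaxmarl.environments.mpe import MPEVisualizer

key = jax.random.PRNGKey(0)
env = make("MPE_simple_tag_v3")

key, key_reset = jax.random.split(key)
obs, state = env.reset(key_reset)

# Collect the state transitions first; animation happens afterwards.
state_seq = [state]
for _ in range(25):  # MPE episodes default to 25 steps
    key, key_act, key_step = jax.random.split(key, 3)
    keys_act = jax.random.split(key_act, env.num_agents)
    actions = {agent: env.action_space(agent).sample(keys_act[i])
               for i, agent in enumerate(env.agents)}
    obs, state, reward, done, infos = env.step(key_step, state, actions)
    state_seq.append(state)

# Assumed call pattern: pass the env and collected states, then render a gif.
viz = MPEVisualizer(env, state_seq)
viz.animate(save_fname="mpe_simple_tag.gif")  # argument name is an assumption
```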

docs/Environments/smax.md (9 additions & 2 deletions)
@@ -1,6 +1,13 @@
# SMAX
- ## Description
- SMAX is a purely JAX SMAC-like environment. It, like SMAC, focuses on decentralised unit micromanagement across a range of scenarios. Each scenario features fixed teams.
+
+ **SMAX is a purely JAX SMAC-like environment**. Like SMAC, it focuses on decentralised unit micromanagement across a range of scenarios. Each scenario features fixed teams.
+
+ ![SMAX](https://github.com/FLAIROx/JaxMARL/blob/main/docs/imgs/smax.gif?raw=true){ width=300px }
+ /// caption
+ 2s3z Scenario
+ ///
+
+

## Scenarios

docs/Environments/storm.md (7 additions & 1 deletion)
@@ -1,15 +1,21 @@
# STORM

+
+ ![STORM](https://github.com/FLAIROx/JaxMARL/blob/main/docs/imgs/storm.gif?raw=true){ width=250px }
+
Spatial-Temporal Representations of Matrix Games (STORM) is inspired by the "in the Matrix" games in [Melting Pot 2.0](https://arxiv.org/abs/2211.13746); the [STORM](https://openreview.net/forum?id=54F8woU8vhq) environment expands on matrix games by representing them as grid-world scenarios. Agents collect resources which define their strategy during interactions and are rewarded based on a pre-specified payoff matrix. This allows for the embedding of fully cooperative, competitive or general-sum games, such as the prisoner's dilemma.

Thus, STORM can be used for studying paradigms such as *opponent shaping*, where agents act with the intent to change other agents' learning dynamics. Compared to the Coin Game or matrix games, the grid-world setting presents a variety of new challenges, such as partial observability, multi-step agent interactions, temporally-extended actions, and longer time horizons. Unlike the "in the Matrix" games from Melting Pot, STORM features stochasticity, increasing the difficulty.

+ ## Environment explanation
+
+

## Visualisation

We render each timestep and then create a gif from the collection of images. Further examples are provided [here](https://github.com/FLAIROx/JaxMARL/tree/main/jaxmarl/tutorials).

- ```python
+ ``` python
import jax
import jax.numpy as jnp
from PIL import Image

docs/Installation.md (3 additions & 3 deletions)
@@ -4,7 +4,7 @@

Before installing, ensure you have the correct [JAX installation](https://github.com/google/jax#installation) for your hardware accelerator. We have tested up to JAX version 0.4.25. The JaxMARL environments can be installed directly from PyPI:

- ``` sh { .yaml .copy }
+ ``` sh
pip install jaxmarl
```

@@ -13,11 +13,11 @@ pip install jaxmarl
If you would like to also run the algorithms, install the source code as follows:

1. Clone the repository:
- ``` sh { .yaml .copy }
+ ``` sh
git clone https://github.com/FLAIROx/JaxMARL.git && cd JaxMARL
```
2. Install requirements:
- ``` sh { .yaml .copy }
+ ``` sh
pip install -e .[algs] && export PYTHONPATH=./JaxMARL:$PYTHONPATH
```
3. For the fastest start, we recommend using our Dockerfile, the usage of which is outlined below.
24.7 KB

docs/imgs/mpe_qlearning_speed-1.png (93.6 KB)

docs/imgs/mpe_speedup-1.png (38.4 KB)

docs/imgs/sc2_speedup-1.png (25 KB)

docs/index.md (33 additions & 8 deletions)
@@ -58,21 +58,46 @@ actions = {agent: env.action_space(agent).sample(key_act[i]) for i, agent in enu
obs, state, reward, done, infos = env.step(key_step, state, actions)
```

- ## Performance Examples
- *coming soon*
+ ## JaxMARL's performance
+
+ ![MPE](imgs/mpe_speedup-1.png){ width=300px }
+ /// caption
+ Speed of JaxMARL's training pipeline compared to two popular MARL libraries when training an RNN agent using IPPO on an MPE task.
+ ///
+
+ Our paper contains further results, but the plot above illustrates the speed-ups made possible by JIT-compiling the entire training loop. JaxMARL is much faster than traditional approaches, while also producing results consistent with existing implementations.

## Related Works
- This work is heavily related to and builds on many other works. We would like to highlight some of the works that we believe would be relevant to readers:
+ This work is heavily related to and builds on many other works; PureJaxRL provides a [list of projects](https://github.com/luchris429/purejaxrl/blob/main/RESOURCES.md) within the JaxRL ecosystem. Those particularly relevant to multi-agent work are:
+
+ JAX-native algorithms:
+
+ - [Mava](https://github.com/instadeepai/Mava): JAX implementations of IPPO and MAPPO, two popular MARL algorithms.
+ - [PureJaxRL](https://github.com/luchris429/purejaxrl): JAX implementation of PPO, and demonstration of end-to-end JAX-based RL training.
+
+ JAX-native environments:
+
+ - [Gymnax](https://github.com/RobertTLange/gymnax): Implementations of classic RL tasks including classic control, bsuite and MinAtar.
+ - [Jumanji](https://github.com/instadeepai/jumanji): A diverse set of environments ranging from simple games to NP-hard combinatorial problems.
+ - [Pgx](https://github.com/sotetsuk/pgx): JAX implementations of classic board games, such as Chess, Go and Shogi.
+ - [Brax](https://github.com/google/brax): A fully differentiable physics engine written in JAX, featuring continuous control tasks. We use this as the base for MABrax (as the name suggests!).
+ - [XLand-MiniGrid](https://github.com/corl-team/xland-minigrid): Meta-RL gridworld environments inspired by XLand and MiniGrid.
+
+ Other great JAX-related works from our lab:
+
+ - [JaxIRL](https://github.com/FLAIROx/jaxirl?tab=readme-ov-file): JAX implementation of algorithms for inverse reinforcement learning.
+ - [Craftax](https://github.com/MichaelTMatthews/Craftax): (Crafter + NetHack) in JAX.
+ - [JaxUED](https://github.com/DramaCow/jaxued?tab=readme-ov-file): JAX implementations of autocurricula baselines for RL.
+ - [Kinetix](https://kinetix-env.github.io/): Large-scale training of RL agents in a vast and diverse space of simulated tasks, enabled by JAX.
+
+ Other tools that could help:

- * [Jumanji](https://github.com/instadeepai/jumanji). A suite of JAX-based RL environments. It includes some multi-agent ones such as RobotWarehouse.
- * [VectorizedMultiAgentSimulator (VMAS)](https://github.com/proroklab/VectorizedMultiAgentSimulator). It performs similar vectorization for some MARL environments, but is done in PyTorch.
- * More to be added soon :)
+ - [Benchmarl](https://github.com/facebookresearch/BenchMARL): A collection of MARL benchmarks based on TorchRL.

- More documentation to follow soon!

## Citing JaxMARL
If you use JaxMARL in your work, please cite us as follows:
- ```bibtex
+ ``` bibtex
@article{flair2023jaxmarl,
title={JaxMARL: Multi-Agent RL Environments in JAX},
author={Alexander Rutherford and Benjamin Ellis and Matteo Gallici and Jonathan Cook and Andrei Lupu and Gardar Ingvarsson and Timon Willi and Akbir Khan and Christian Schroeder de Witt and Alexandra Souly and Saptarashmi Bandyopadhyay and Mikayel Samvelyan and Minqi Jiang and Robert Tjarko Lange and Shimon Whiteson and Bruno Lacerda and Nick Hawes and Tim Rocktaschel and Chris Lu and Jakob Nicolaus Foerster},
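
To make the JIT-compilation claim above concrete: because env.reset and env.step are pure JAX functions, an entire rollout can be written with jax.lax.scan, vectorised over seeds with jax.vmap, and compiled end to end with jax.jit. A minimal random-policy sketch built only from the API shown in this page's hunks (the scenario, batch size, and episode length are illustrative, not the paper's benchmark setup):

``` python
import jax
from jaxmarl import make

env = make("MPE_simple_spread_v3")
NUM_STEPS = 25  # illustrative episode length

def rollout(key):
    """One random-policy episode, written as a pure function of the RNG key."""
    key, key_reset = jax.random.split(key)
    obs, state = env.reset(key_reset)

    def step_fn(carry, _):
        key, state = carry
        key, key_act, key_step = jax.random.split(key, 3)
        keys_act = jax.random.split(key_act, env.num_agents)
        actions = {agent: env.action_space(agent).sample(keys_act[i])
                   for i, agent in enumerate(env.agents)}
        obs, state, reward, done, infos = env.step(key_step, state, actions)
        return (key, state), reward

    _, rewards = jax.lax.scan(step_fn, (key, state), None, length=NUM_STEPS)
    return rewards  # pytree of per-agent reward sequences

# Run 128 rollouts in parallel and compile the whole batch as one program.
keys = jax.random.split(jax.random.PRNGKey(0), 128)
rewards = jax.jit(jax.vmap(rollout))(keys)  # each agent: shape (128, NUM_STEPS)
```

Replacing the random actions with a policy and folding the learner update into the scanned step is, in essence, the end-to-end compiled training loop that the speed-up plot refers to.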

mkdocs.yml (4 additions & 1 deletion)
@@ -5,6 +5,7 @@ theme:
  name: material
  features:
    - navigation.sections
+     - content.code.copy
  palette:
    # Dark Mode
    - scheme: slate
@@ -40,4 +41,6 @@ markdown_extensions:
      pygments_lang_class: true
  - pymdownx.inlinehilite
  - pymdownx.snippets
-   - pymdownx.superfences
+   - pymdownx.superfences
+   - pymdownx.blocks.caption
+   - attr_list

0 commit comments