Merge branch 'master' of github.com:cameronangliss/poke-env into open-sheets-fill-battle-obj

cameronangliss committed Jan 4, 2025
2 parents 2178ac2 + ea4b5f3 commit 3b1d398
Showing 22 changed files with 474 additions and 578 deletions.
2 changes: 1 addition & 1 deletion docs/source/examples/index.rst
@@ -11,4 +11,4 @@ This page lists detailled examples demonstrating how to use this package. They a
quickstart
using_a_custom_teambuilder
connecting_to_showdown_and_challenging_humans
-rl_with_open_ai_gym_wrapper
+rl_with_gymnasium_wrapper
@@ -1,18 +1,18 @@
-.. _rl_with_open_ai_gym_wrapper:
+.. _rl_with_gymnasium_wrapper:

-Reinforcement learning with the OpenAI Gym wrapper
+Reinforcement learning with the Gymnasium wrapper
==================================================

-The corresponding complete source code can be found `here <https://github.com/hsahovic/poke-env/blob/master/examples/rl_with_new_open_ai_gym_wrapper.py>`__.
+The corresponding complete source code can be found `here <https://github.com/hsahovic/poke-env/blob/master/examples/rl_with_new_gymnasium_wrapper.py>`__.

-The goal of this example is to demonstrate how to use the `open ai gym <https://gym.openai.com/>`__ interface proposed by ``EnvPlayer``, and to train a simple deep reinforcement learning agent comparable in performance to the ``MaxDamagePlayer`` we created in :ref:`max_damage_player`.
+The goal of this example is to demonstrate how to use the `farama gymnasium <https://gymnasium.farama.org/>`__ interface proposed by ``EnvPlayer``, and to train a simple deep reinforcement learning agent comparable in performance to the ``MaxDamagePlayer`` we created in :ref:`max_damage_player`.

-.. note:: This example necessitates `keras-rl <https://github.com/keras-rl/keras-rl>`__ (compatible with Tensorflow 1.X) or `keras-rl2 <https://github.com/wau/keras-rl2>`__ (Tensorflow 2.X), which implement numerous reinforcement learning algorithms and offer a simple API fully compatible with the Open AI Gym API. You can install them by running ``pip install keras-rl`` or ``pip install keras-rl2``. If you are unsure, ``pip install keras-rl2`` is recommended.
+.. note:: This example necessitates `keras-rl <https://github.com/keras-rl/keras-rl>`__ (compatible with Tensorflow 1.X) or `keras-rl2 <https://github.com/wau/keras-rl2>`__ (Tensorflow 2.X), which implement numerous reinforcement learning algorithms and offer a simple API fully compatible with the Gymnasium API. You can install them by running ``pip install keras-rl`` or ``pip install keras-rl2``. If you are unsure, ``pip install keras-rl2`` is recommended.

Implementing rewards and observations
*************************************

-The open ai gym API provides *rewards* and *observations* for each step of each episode. In our case, each step corresponds to one decision in a battle and battles correspond to episodes.
+The Gymnasium API provides *rewards* and *observations* for each step of each episode. In our case, each step corresponds to one decision in a battle and battles correspond to episodes.
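As an illustration of this mapping, a minimal rollout under the standard Gymnasium contract might look like the sketch below (``env`` stands for any ``EnvPlayer``-based environment; the loop is generic Gymnasium usage, not code from this commit):

.. code-block:: python

    # One episode corresponds to one battle; each step is one decision.
    observation, info = env.reset()
    done = False
    episode_reward = 0.0
    while not done:
        action = env.action_space.sample()  # stand-in for a trained policy
        observation, reward, terminated, truncated, info = env.step(action)
        episode_reward += reward
        done = terminated or truncated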

Defining observations
^^^^^^^^^^^^^^^^^^^^^
@@ -26,9 +26,9 @@ Observations are embeddings of the current state of the battle. They can be an a

To define our observations, we will create a custom ``embed_battle`` method. It takes one argument, a ``Battle`` object, and returns our embedding.

-In addition to this, we also need to describe the embedding to the gym interface.
+In addition to this, we also need to describe the embedding to the gymnasium interface.
To achieve this, we need to implement the ``describe_embedding`` method where we specify the low bound and the high bound
-for each component of the embedding vector and return them as a ``gym.Space`` object.
+for each component of the embedding vector and return them as a ``gymnasium.Space`` object.
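For concreteness, a minimal ``embed_battle``/``describe_embedding`` pair might look like the following sketch (the two-component embedding is purely illustrative; the example this page describes uses a richer one):

.. code-block:: python

    import numpy as np
    from gymnasium.spaces import Box

    def embed_battle(self, battle):
        # Illustrative embedding: remaining HP fraction of each active Pokémon.
        return np.array(
            [
                battle.active_pokemon.current_hp_fraction,
                battle.opponent_active_pokemon.current_hp_fraction,
            ],
            dtype=np.float32,
        )

    def describe_embedding(self):
        # Low/high bounds for each component, returned as a gymnasium Space.
        return Box(
            low=np.array([0.0, 0.0]),
            high=np.array([1.0, 1.0]),
            dtype=np.float32,
        )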

Defining rewards
^^^^^^^^^^^^^^^^
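The body of this section is collapsed in this view. As a sketch of the kind of reward it defines, ``reward_computing_helper`` (part of poke-env's env-player API) can combine faint, HP, and victory terms; the weights below are illustrative:

.. code-block:: python

    def calc_reward(self, last_battle, current_battle) -> float:
        # Illustrative weights: fainted mons, HP swings, and victory.
        return self.reward_computing_helper(
            current_battle, fainted_value=2.0, hp_value=1.0, victory_value=30.0
        )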
@@ -108,7 +108,7 @@ Our player will play the ``gen8randombattle`` format. We can therefore inherit f
Instantiating and testing a player
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Now that our custom class is defined, we can instantiate our RL player and test if it's compliant with the OpenAI gym API.
+Now that our custom class is defined, we can instantiate our RL player and test if it's compliant with the Gymnasium API.
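A minimal compliance check might look like this sketch, mirroring the pattern used elsewhere in this commit (``SimpleRLPlayer`` is the custom class defined above):

.. code-block:: python

    from gymnasium.utils.env_checker import check_env

    from poke_env.player import RandomPlayer

    opponent = RandomPlayer(battle_format="gen8randombattle")
    test_env = SimpleRLPlayer(
        battle_format="gen8randombattle", start_challenging=True, opponent=opponent
    )
    check_env(test_env)
    test_env.close()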

.. code-block:: python
@@ -340,7 +340,7 @@ To use the ``cross_evaluate`` method, the strategy is the same to the one used f
Final result
************

-Running the `whole file <https://github.com/hsahovic/poke-env/blob/master/examples/rl_with_new_open_ai_gym_wrapper.py>`__ should take a couple of minutes and print something similar to this:
+Running the `whole file <https://github.com/hsahovic/poke-env/blob/master/examples/rl_with_gymnasium_wrapper.py>`__ should take a couple of minutes and print something similar to this:

.. code-block:: console
2 changes: 1 addition & 1 deletion docs/source/getting_started.rst
@@ -41,7 +41,7 @@ Agents in ``poke-env`` are instances of the ``Player`` class. Explore the follow

- Basic agent: :ref:`/examples/cross_evaluate_random_players.ipynb`
- Advanced agent: :ref:`max_damage_player`
-- RL agent: :ref:`rl_with_open_ai_gym_wrapper`
+- RL agent: :ref:`rl_with_gymnasium_wrapper`
- Using teams: :ref:`ou_max_player`
- Custom team builder: :ref:`using_a_custom_teambuilder`

2 changes: 1 addition & 1 deletion docs/source/index.rst
@@ -6,7 +6,7 @@ Poke-env: A Python Interface for Training Reinforcement Learning Pokémon Bots

Poke-env provides an environment for engaging in `Pokémon Showdown <https://pokemonshowdown.com/>`__ battles with a focus on reinforcement learning.

-It boasts a straightforward API for handling Pokémon, Battles, Moves, and other battle-centric objects, alongside an `OpenAI Gym <https://gym.openai.com/>`__ interface for training agents.
+It boasts a straightforward API for handling Pokémon, Battles, Moves, and other battle-centric objects, alongside a `Farama Gymnasium <https://gymnasium.farama.org/>`__ interface for training agents.

.. attention:: While poke-env aims to support all Pokémon generations, it was primarily developed with the latest generations in mind. If you discover any missing or incorrect functionalities for earlier generations, please `open an issue <https://github.com/hsahovic/poke-env/issues>`__ to help improve the library.

4 changes: 2 additions & 2 deletions docs/source/modules/player.rst
@@ -21,10 +21,10 @@ Player
:undoc-members:
:show-inheritance:

-OpenAIGymEnv
+GymnasiumEnv
************

-.. automodule:: poke_env.player.openai_api
+.. automodule:: poke_env.player.gymnasium_api
:members:
:undoc-members:
:show-inheritance:
20 changes: 10 additions & 10 deletions examples/openai_example.py → examples/gymnasium_example.py
@@ -7,13 +7,13 @@
from poke_env.environment.abstract_battle import AbstractBattle
from poke_env.player import (
    Gen8EnvSinglePlayer,
+    GymnasiumEnv,
    ObservationType,
-    OpenAIGymEnv,
    RandomPlayer,
)


-class TestEnv(OpenAIGymEnv):
+class TestEnv(GymnasiumEnv):
    def __init__(self, **kwargs):
        self.opponent = RandomPlayer(
            battle_format="gen8randombattle",
@@ -66,31 +66,31 @@ def describe_embedding(self) -> Space:
        return Box(np.array([0, 0]), np.array([6, 6]), dtype=int)


-def openai_api():
-    gym_env = TestEnv(
+def gymnasium_api():
+    gymnasium_env = TestEnv(
        battle_format="gen8randombattle",
        server_configuration=LocalhostServerConfiguration,
        start_challenging=True,
    )
-    check_env(gym_env)
-    gym_env.close()
+    check_env(gymnasium_env)
+    gymnasium_env.close()


def env_player():
    opponent = RandomPlayer(
        battle_format="gen8randombattle",
        server_configuration=LocalhostServerConfiguration,
    )
-    gym_env = Gen8(
+    gymnasium_env = Gen8(
        battle_format="gen8randombattle",
        server_configuration=LocalhostServerConfiguration,
        start_challenging=True,
        opponent=opponent,
    )
-    check_env(gym_env)
-    gym_env.close()
+    check_env(gymnasium_env)
+    gymnasium_env.close()


if __name__ == "__main__":
-    openai_api()
+    gymnasium_api()
    env_player()
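To run the renamed example, a Pokémon Showdown server must be listening locally (that is what ``LocalhostServerConfiguration`` points at); the script can then be invoked directly:

.. code-block:: console

    $ python examples/gymnasium_example.py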
File renamed without changes.
@@ -72,7 +72,7 @@ def describe_embedding(self) -> Space:

async def main():
    # First test the environment to ensure the class is consistent
-    # with the OpenAI API
+    # with the Gymnasium API
    opponent = RandomPlayer(battle_format="gen8randombattle")
    test_env = SimpleRLPlayer(
        battle_format="gen8randombattle", start_challenging=True, opponent=opponent
106 changes: 32 additions & 74 deletions integration_tests/test_env_player.py
@@ -1,7 +1,7 @@
import numpy as np
import pytest
from gymnasium.spaces import Box, Space
-from gymnasium.utils.env_checker import check_env
+from pettingzoo.test.parallel_test import parallel_api_test

from poke_env.player import (
    Gen4EnvSinglePlayer,
@@ -10,7 +10,6 @@
    Gen7EnvSinglePlayer,
    Gen8EnvSinglePlayer,
    Gen9EnvSinglePlayer,
-    RandomPlayer,
)


@@ -80,81 +79,61 @@ def embed_battle(self, battle):
        return np.array([0])


-def play_function(player, n_battles):
+def play_function(env, n_battles):
    for _ in range(n_battles):
        done = False
-        player.reset()
+        env.reset()
        while not done:
-            _, _, terminated, truncated, _ = player.step(player.action_space.sample())
-            done = terminated or truncated
+            actions = {name: env.action_space(name).sample() for name in env.agents}
+            _, _, terminated, truncated, _ = env.step(actions)
+            done = any(terminated.values()) or any(truncated.values())
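The rewritten loop reflects PettingZoo's parallel API, in which ``reset`` and ``step`` exchange dicts keyed by agent name rather than single values. A sketch of that contract (generic PettingZoo usage, assuming ``env`` is a parallel env):

.. code-block:: python

    # Every value is a dict keyed by agent name.
    observations, infos = env.reset()
    actions = {name: env.action_space(name).sample() for name in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
    battle_over = any(terminations.values()) or any(truncations.values())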


@pytest.mark.timeout(30)
-def test_random_gym_players_gen4():
-    random_player = RandomPlayer(battle_format="gen4randombattle", log_level=25)
-    env_player = RandomGen4EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=False
-    )
+def test_random_gymnasium_players_gen4():
+    env_player = RandomGen4EnvPlayer(log_level=25, start_challenging=False)
    env_player.start_challenging(3)
    play_function(env_player, 3)


@pytest.mark.timeout(30)
-def test_random_gym_players_gen5():
-    random_player = RandomPlayer(battle_format="gen5randombattle", log_level=25)
-    env_player = RandomGen5EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=False
-    )
+def test_random_gymnasium_players_gen5():
+    env_player = RandomGen5EnvPlayer(log_level=25, start_challenging=False)
    env_player.start_challenging(3)
    play_function(env_player, 3)


@pytest.mark.timeout(30)
-def test_random_gym_players_gen6():
-    random_player = RandomPlayer(battle_format="gen6randombattle", log_level=25)
-    env_player = RandomGen6EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=False
-    )
+def test_random_gymnasium_players_gen6():
+    env_player = RandomGen6EnvPlayer(log_level=25, start_challenging=False)
    env_player.start_challenging(3)
    play_function(env_player, 3)


@pytest.mark.timeout(30)
-def test_random_gym_players_gen7():
-    random_player = RandomPlayer(battle_format="gen7randombattle", log_level=25)
-    env_player = RandomGen7EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=False
-    )
+def test_random_gymnasium_players_gen7():
+    env_player = RandomGen7EnvPlayer(log_level=25, start_challenging=False)
    env_player.start_challenging(3)
    play_function(env_player, 3)


@pytest.mark.timeout(30)
-def test_random_gym_players_gen8():
-    random_player = RandomPlayer(battle_format="gen8randombattle", log_level=25)
-    env_player = RandomGen8EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=False
-    )
+def test_random_gymnasium_players_gen8():
+    env_player = RandomGen8EnvPlayer(log_level=25, start_challenging=False)
    env_player.start_challenging(3)
    play_function(env_player, 3)


@pytest.mark.timeout(30)
-def test_random_gym_players_gen9():
-    random_player = RandomPlayer(battle_format="gen9randombattle", log_level=25)
-    env_player = RandomGen9EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=False
-    )
+def test_random_gymnasium_players_gen9():
+    env_player = RandomGen9EnvPlayer(log_level=25, start_challenging=False)
    env_player.start_challenging(3)
    play_function(env_player, 3)


@pytest.mark.timeout(60)
def test_two_successive_calls_gen8():
-    random_player = RandomPlayer(battle_format="gen8randombattle", log_level=25)
-    env_player = RandomGen8EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=False
-    )
+    env_player = RandomGen8EnvPlayer(log_level=25, start_challenging=False)
    env_player.start_challenging(2)
    play_function(env_player, 2)
    env_player.start_challenging(2)
@@ -163,10 +142,7 @@ def test_two_successive_calls_gen8():

@pytest.mark.timeout(60)
def test_two_successive_calls_gen9():
-    random_player = RandomPlayer(battle_format="gen9randombattle", log_level=25)
-    env_player = RandomGen9EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=False
-    )
+    env_player = RandomGen9EnvPlayer(log_level=25, start_challenging=False)
    env_player.start_challenging(2)
    play_function(env_player, 2)
    env_player.start_challenging(2)
@@ -175,39 +151,21 @@ def test_two_successive_calls_gen9():

@pytest.mark.timeout(60)
def test_check_envs():
-    random_player = RandomPlayer(battle_format="gen4randombattle", log_level=25)
-    env_player_gen4 = RandomGen4EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=True
-    )
-    check_env(env_player_gen4)
+    env_player_gen4 = RandomGen4EnvPlayer(log_level=25, start_challenging=True)
+    parallel_api_test(env_player_gen4)
    env_player_gen4.close()
-    random_player = RandomPlayer(battle_format="gen5randombattle", log_level=25)
-    env_player_gen5 = RandomGen5EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=True
-    )
-    check_env(env_player_gen5)
+    env_player_gen5 = RandomGen5EnvPlayer(log_level=25, start_challenging=True)
+    parallel_api_test(env_player_gen5)
    env_player_gen5.close()
-    random_player = RandomPlayer(battle_format="gen6randombattle", log_level=25)
-    env_player_gen6 = RandomGen6EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=True
-    )
-    check_env(env_player_gen6)
+    env_player_gen6 = RandomGen6EnvPlayer(log_level=25, start_challenging=True)
+    parallel_api_test(env_player_gen6)
    env_player_gen6.close()
-    random_player = RandomPlayer(battle_format="gen7randombattle", log_level=25)
-    env_player_gen7 = RandomGen7EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=True
-    )
-    check_env(env_player_gen7)
+    env_player_gen7 = RandomGen7EnvPlayer(log_level=25, start_challenging=True)
+    parallel_api_test(env_player_gen7)
    env_player_gen7.close()
-    random_player = RandomPlayer(battle_format="gen8randombattle", log_level=25)
-    env_player_gen8 = RandomGen8EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=True
-    )
-    check_env(env_player_gen8)
+    env_player_gen8 = RandomGen8EnvPlayer(log_level=25, start_challenging=True)
+    parallel_api_test(env_player_gen8)
    env_player_gen8.close()
-    random_player = RandomPlayer(battle_format="gen9randombattle", log_level=25)
-    env_player_gen9 = RandomGen9EnvPlayer(
-        log_level=25, opponent=random_player, start_challenging=True
-    )
-    check_env(env_player_gen9)
+    env_player_gen9 = RandomGen9EnvPlayer(log_level=25, start_challenging=True)
+    parallel_api_test(env_player_gen9)
    env_player_gen9.close()
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "poke_env"
version = "0.8.2"
version = "0.8.3"
description = "A python interface for training Reinforcement Learning bots to battle on pokemon showdown."
readme = "README.md"
requires-python = ">=3.9.0"
1 change: 1 addition & 0 deletions requirements.txt
@@ -1,6 +1,7 @@
gymnasium
numpy
orjson
+pettingzoo
requests
tabulate
websockets==12.0
1 change: 1 addition & 0 deletions src/poke_env/environment/effect.py
@@ -828,6 +828,7 @@ def is_from_move(self) -> bool:
"FLOWERVEIL": Effect.FLOWER_VEIL,
"FOCUSBAND": Effect.FOCUS_BAND,
"FOCUSENERGY": Effect.FOCUS_ENERGY,
"FOCUSPUNCH": Effect.FOCUS_PUNCH,
"FOLLOWME": Effect.FOLLOW_ME,
"FORESIGHT": Effect.FORESIGHT,
"FOREWARN": Effect.FOREWARN,
2 changes: 2 additions & 0 deletions src/poke_env/environment/pokemon.py
@@ -276,6 +276,8 @@ def forme_change(self, species: str):

    def heal(self, hp_status: str):
        self.set_hp_status(hp_status)
+        if self.fainted:
+            self._status = None

    def invert_boosts(self):
        self._boosts = {k: -v for k, v in self._boosts.items()}
