🕹 Pikachu-volleyball game-based multi-agent RL environment using PettingZoo

helpingstar/pika-zoo

pika-zoo

The original Pikachu Volleyball (対戦ぴかちゅ～　ﾋﾞｰﾁﾊﾞﾚｰ編) was developed by

  • 1997 (C) SACHI SOFT / SAWAYAKAN Programmers
  • 1997 (C) Satoshi Takenouchi

All of the code for pika-zoo is based on gorisanson/pikachu-volleyball.

Differences from the original code
  • Random numbers are generated by the environment's NumPy generator (self.np_random) rather than by the global random function defined in rand.js of the original code.
  • Some code logic has been improved for faster iteration.

You can play the game at the site below.

| Property | Value |
|---|---|
| Import | `from pikazoo import pikazoo_v0` |
| Actions | Discrete |
| Parallel API | Yes |
| Manual Control | No |
| Agents | `agents = ['player_1', 'player_2']` |
| Agents | 2 |
| Action Shape | (1,) |
| Action Values | Discrete(18) |
| Observation Shape | (35,) |
| Observation Values | [-124, 432] |

Action Space

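| Value | Meaning | Value | Meaning | Value | Meaning |
|---|---|---|---|---|---|
| 0 | NOOP | 1 | FIRE | 2 | UP |
| 3 | RIGHT | 4 | LEFT | 5 | DOWN |
| 6 | UPRIGHT | 7 | UPLEFT | 8 | DOWNRIGHT |
| 9 | DOWNLEFT | 10 | UPFIRE | 11 | RIGHTFIRE |
| 12 | LEFTFIRE | 13 | DOWNFIRE | 14 | UPRIGHTFIRE |
| 15 | UPLEFTFIRE | 16 | DOWNRIGHTFIRE | 17 | DOWNLEFTFIRE |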
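As a rough illustration of the table above, the sketch below maps an action value to its name and to the set of keys that name combines. `ACTION_NAMES`, `KEYS`, and `pressed_keys` are hypothetical helpers written for this example; they are not part of pikazoo.

```python
# Action names in table order; the tuple index is the action value.
ACTION_NAMES = (
    "NOOP", "FIRE", "UP", "RIGHT", "LEFT", "DOWN",
    "UPRIGHT", "UPLEFT", "DOWNRIGHT", "DOWNLEFT",
    "UPFIRE", "RIGHTFIRE", "LEFTFIRE", "DOWNFIRE",
    "UPRIGHTFIRE", "UPLEFTFIRE", "DOWNRIGHTFIRE", "DOWNLEFTFIRE",
)

# The individual keys that the action names are built from.
KEYS = ("UP", "DOWN", "LEFT", "RIGHT", "FIRE")


def pressed_keys(action):
    """Return the set of keys combined by an action value (hypothetical helper)."""
    if not 0 <= action < len(ACTION_NAMES):
        raise ValueError(f"action must be in [0, {len(ACTION_NAMES) - 1}]")
    name = ACTION_NAMES[action]
    # Each key name appears as a substring of the composite action name.
    return {key for key in KEYS if key in name}
```

For example, `pressed_keys(14)` looks up `UPRIGHTFIRE` and returns the keys `UP`, `RIGHT`, and `FIRE`.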

Observation Space

| Index | Description | min | max |
|---|---|---|---|
| 0 | X position of player | 32 | 400 |
| 1 | Y position of player | 108 | 244 |
| 2 | Y Velocity of player | -15 | 16 |
| 3 | Diving direction of player | -1 | 1 |
| 4 | Player's remaining duration of lying down | -2 | 3 |
| 5 | Player's frame number | 0 | 4 |
| 6 | Player's delay_before_next_frame | 0 | 4 |
| 7 | State of player (normal) | 0 | 1 |
| 8 | State of player (jumping) | 0 | 1 |
| 9 | State of player (jumping and power hitting) | 0 | 1 |
| 10 | State of player (diving) | 0 | 1 |
| 11 | State of player (lying down after diving) | 0 | 1 |
| 12 | Whether player's power hit key was down previously | 0 | 1 |
| 13 | X position of opponent player | 32 | 400 |
| 14 | Y position of opponent player | 108 | 244 |
| 15 | Y Velocity of opponent player | -15 | 16 |
| 16 | Diving direction of opponent player | -1 | 1 |
| 17 | Opponent player's remaining duration of lying down | -2 | 3 |
| 18 | Opponent player's frame number | 0 | 4 |
| 19 | Opponent player's delay_before_next_frame | 0 | 4 |
| 20 | State of opponent player (normal) | 0 | 1 |
| 21 | State of opponent player (jumping) | 0 | 1 |
| 22 | State of opponent player (jumping and power hitting) | 0 | 1 |
| 23 | State of opponent player (diving) | 0 | 1 |
| 24 | State of opponent player (lying down after diving) | 0 | 1 |
| 25 | Whether opponent player's power hit key was down previously | 0 | 1 |
| 26 | X position of ball | 20 | 432 |
| 27 | Y position of ball | 0 | 252 |
| 28 | Previous X position of ball | 0 | 432 |
| 29 | Previous Y position of ball | 0 | 252 |
| 30 | Previous previous X position of ball | 0 | 432 |
| 31 | Previous previous Y position of ball | 0 | 252 |
| 32 | X Velocity of ball | -20 | 20 |
| 33 | Y Velocity of ball | -124 | 124 |
| 34 | If the ball is in POWER HIT status | 0 | 1 |
  • Since I do not know the exact minimum and maximum values of the ball's y velocity, I used the minimum and maximum values I observed.
  • Previous X position of ball and Previous previous X position of ball have a lower bound of 0 (rather than 20) because they are initialized to 0 at the start of the round.
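One way to use the bounds in the table is to rescale each observation entry to [-1, 1] before feeding it to a policy. The sketch below is a minimal example under that assumption; `OBS_BOUNDS` and `normalize_observation` are hypothetical names, not part of pikazoo. Because the ball's Y velocity bounds are only observed values, a real observation may occasionally fall slightly outside [-1, 1] for index 33.

```python
# (min, max) per index, copied from the observation table.
PLAYER_BOUNDS = [
    (32, 400),   # X position
    (108, 244),  # Y position
    (-15, 16),   # Y velocity
    (-1, 1),     # diving direction
    (-2, 3),     # remaining duration of lying down
    (0, 4),      # frame number
    (0, 4),      # delay_before_next_frame
] + [(0, 1)] * 6  # five state flags + previous power hit key flag

BALL_BOUNDS = [
    (20, 432), (0, 252),     # current X, Y
    (0, 432), (0, 252),      # previous X, Y
    (0, 432), (0, 252),      # previous previous X, Y
    (-20, 20), (-124, 124),  # X velocity, Y velocity (observed bounds)
    (0, 1),                  # POWER HIT status
]

# Player block (13) + opponent block (13) + ball block (9) = 35 entries.
OBS_BOUNDS = PLAYER_BOUNDS + PLAYER_BOUNDS + BALL_BOUNDS


def normalize_observation(obs):
    """Linearly rescale a 35-element observation to [-1, 1] per index."""
    if len(obs) != len(OBS_BOUNDS):
        raise ValueError(f"expected {len(OBS_BOUNDS)} values, got {len(obs)}")
    return [2 * (v - lo) / (hi - lo) - 1 for v, (lo, hi) in zip(obs, OBS_BOUNDS)]
```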

Range of x position

  • player1 : [32, 184]
  • player2 : [248, 400]

Arguments

pikazoo_v0.env(
    winning_score=15,
    serve="winner",
    is_player1_computer=False,
    is_player2_computer=False,
)
  • winning_score : The number of points needed to win a game.
  • serve : The method to determine the player to serve
    • winner : The winner of the previous round serves.
    • alternate : The two players alternate serving each round.
    • random : The player to serve is determined randomly.
  • is_player1_computer : If this argument is True, player1 (left) will behave as the original game's rule-based AI, and its inputs will be ignored.
  • is_player2_computer : If this argument is True, player2 (right) will behave as the original game's rule-based AI, and its inputs will be ignored.
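The three `serve` options described above can be read as a simple selection rule. The sketch below is an illustration of that rule only, not the environment's code: `next_server` and its parameters are hypothetical, and it uses Python's `random` module for brevity, whereas the environment draws its random numbers from `self.np_random`.

```python
import random

PLAYERS = ("player_1", "player_2")


def next_server(serve, previous_winner, previous_server, rng=None):
    """Pick who serves the next round under a given `serve` option (hypothetical helper)."""
    if serve == "winner":
        # The winner of the previous round serves.
        return previous_winner
    if serve == "alternate":
        # The player who did not serve last round serves now.
        return PLAYERS[1] if previous_server == PLAYERS[0] else PLAYERS[0]
    if serve == "random":
        # Either player, chosen at random.
        return (rng or random).choice(PLAYERS)
    raise ValueError(f"unknown serve option: {serve!r}")
```

How the very first round's server is chosen is not stated above, so the sketch only covers rounds that follow a previous one.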

Wrappers

SimplifyAction

Represents actions in relative directions instead of absolute directions and excludes actions that are not meaningful in gameplay, reducing the number of valid actions from 18 to 13.

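| Value | Meaning | Value | Meaning | Value | Meaning |
|---|---|---|---|---|---|
| 0 | NOOP | 1 | FIRE | 2 | UP |
| 3 | FRONT | 4 | BACK | 5 | UPFRONT |
| 6 | UPBACK | 7 | UPFIRE | 8 | FRONTFIRE |
| 9 | BACKFIRE | 10 | DOWNFIRE | 11 | UPFRONTFIRE |
| 12 | DOWNFRONTFIRE | | | | |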
| Relative Direction | Absolute Direction |
|---|---|
| FRONT | player_1 : RIGHT, player_2 : LEFT |
| BACK | player_1 : LEFT, player_2 : RIGHT |
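The two tables above are enough to derive, for each player, which original action value a simplified action stands for. The sketch below does this by renaming FRONT and BACK in the simplified action name and looking the result up in the original 18-action table. It is an illustration of the mapping, not the wrapper's actual implementation; `to_absolute_action` and the constants are hypothetical names.

```python
# Original 18 actions (index = action value in the base environment).
ORIGINAL_ACTIONS = (
    "NOOP", "FIRE", "UP", "RIGHT", "LEFT", "DOWN",
    "UPRIGHT", "UPLEFT", "DOWNRIGHT", "DOWNLEFT",
    "UPFIRE", "RIGHTFIRE", "LEFTFIRE", "DOWNFIRE",
    "UPRIGHTFIRE", "UPLEFTFIRE", "DOWNRIGHTFIRE", "DOWNLEFTFIRE",
)

# Simplified 13 actions (index = action value under SimplifyAction).
SIMPLIFIED_ACTIONS = (
    "NOOP", "FIRE", "UP", "FRONT", "BACK", "UPFRONT",
    "UPBACK", "UPFIRE", "FRONTFIRE", "BACKFIRE", "DOWNFIRE",
    "UPFRONTFIRE", "DOWNFRONTFIRE",
)

# Relative-to-absolute direction per agent, from the table above.
DIRECTIONS = {
    "player_1": {"FRONT": "RIGHT", "BACK": "LEFT"},
    "player_2": {"FRONT": "LEFT", "BACK": "RIGHT"},
}


def to_absolute_action(simplified_action, agent):
    """Translate a simplified action value into an original action value."""
    name = SIMPLIFIED_ACTIONS[simplified_action]
    for relative, absolute in DIRECTIONS[agent].items():
        name = name.replace(relative, absolute)
    return ORIGINAL_ACTIONS.index(name)
```

For instance, simplified action 3 (FRONT) becomes 3 (RIGHT) for `player_1` and 4 (LEFT) for `player_2`.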

RewardByBallPosition

  [20, x_line) [x_line, 432]
          x_line
        ┌───┼───┬→
        │ 0 │ 2 │ [0, y_line]
        ├───┼───┼─ y_line
        │ 1 │ 3 │ (y_line, 252]
        ├───┴───┘
        ↓
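  • Arguments
    • additional_reward : A sequence of rewards indexed by zone. When the ball is in zone n, player_1 gets the value at index n, and player_2 gets the value at index n+4 (so indices 0 to 3 apply to player_1 and 4 to 7 to player_2).
    • x_line : The x-coordinate of the vertical line separating zones 0 and 1 from zones 2 and 3.
    • y_line : The y-coordinate of the horizontal line separating zones 0 and 2 from zones 1 and 3.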
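The zone diagram and the indexing rule for `additional_reward` can be sketched as follows. This is a minimal reading of the description above, not the wrapper's source; `ball_zone` and `additional_rewards` are hypothetical helper names, and the `x_line`, `y_line`, and reward values used in the usage note are made-up examples.

```python
def ball_zone(ball_x, ball_y, x_line, y_line):
    """Return the zone index (0-3) for a ball position, following the diagram."""
    column = 1 if ball_x >= x_line else 0  # [20, x_line) -> 0, [x_line, 432] -> 1
    row = 1 if ball_y > y_line else 0      # [0, y_line] -> 0, (y_line, 252] -> 1
    return 2 * column + row


def additional_rewards(ball_x, ball_y, additional_reward, x_line, y_line):
    """Look up each player's extra reward for the ball's current zone."""
    if len(additional_reward) != 8:
        raise ValueError("additional_reward needs 8 values (4 per player)")
    zone = ball_zone(ball_x, ball_y, x_line, y_line)
    return {
        "player_1": additional_reward[zone],      # index n
        "player_2": additional_reward[zone + 4],  # index n + 4
    }
```

With hypothetical values `x_line=216`, `y_line=176`, and `additional_reward=[0, 0, 1, 2, 1, 2, 0, 0]`, a ball at (300, 200) lies in zone 3, so `player_1` receives 2 and `player_2` receives 0.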
