PyTorch implementation of Proximal Policy Optimization (PPO) for discrete action spaces

naivoder/DiscretePPO

Proximal Policy Optimization (Discrete)

Overview

🚧 🛠️👷‍♀️ 🛑 Under construction...

This repository contains an implementation of Proximal Policy Optimization (PPO) for discrete action spaces, which has been evaluated against a variety of Gymnasium and Atari environments.
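
For readers new to PPO, the core of the method is the clipped surrogate objective. The following is a minimal, dependency-free sketch of that objective for a categorical (discrete) policy. It is not the code in this repository: the function names, the plain-list inputs, and the default `clip_eps=0.2` are assumptions made for illustration, and a real implementation would compute the same quantity on PyTorch tensors.

```python
import math


def softmax_log_probs(logits):
    """Log-probabilities of one categorical action distribution."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def clipped_surrogate(new_logits, old_log_probs, actions, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (a value to minimize) over a batch.

    new_logits:    one list of action logits per sample, from the current policy
    old_log_probs: log-probability of each taken action under the old policy
    actions:       index of the action taken in each sample
    advantages:    advantage estimate for each sample
    """
    total = 0.0
    for logits, old_lp, action, adv in zip(new_logits, old_log_probs, actions, advantages):
        new_lp = softmax_log_probs(logits)[action]
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(actions)
```

When the new and old policies agree, the ratio is 1 and the loss reduces to the negative mean advantage. When the ratio moves outside `[1 - clip_eps, 1 + clip_eps]` in the direction that would increase the objective, the clipped term takes over and the sample stops contributing a gradient.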

The main script in its current form is configured for Atari environments, with a custom environment wrapper that follows the approach outlined in the original DQN paper (for this reason, it is recommended to use the 'NoFrameskip' versions of the environments).
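
The DQN-style preprocessing repeats each action for several raw frames, takes a pixel-wise maximum over the last two frames to remove sprite flicker, and stacks the most recent processed frames into one observation. Since a wrapper of this kind does its own frame skipping, it is normally paired with the 'NoFrameskip' variants so that skipping is not applied twice. The sketch below shows only the repeat, max-pool, and stack steps. The class name `RepeatMaxStack` and its parameters are hypothetical and this is not the wrapper shipped in this repository. Frames are flat lists of pixel values to keep the example dependency-free, and grayscale conversion and resizing are omitted.

```python
from collections import deque


class RepeatMaxStack:
    """Hypothetical DQN-style wrapper around an env with the Gymnasium
    reset()/step() return conventions."""

    def __init__(self, env, repeat=4, stack=4):
        self.env = env
        self.repeat = repeat
        self.frames = deque(maxlen=stack)

    def reset(self):
        obs, info = self.env.reset()
        # Fill the stack with copies of the first frame.
        for _ in range(self.frames.maxlen):
            self.frames.append(list(obs))
        return list(self.frames), info

    def step(self, action):
        total_reward = 0.0
        last_two = deque(maxlen=2)
        terminated = truncated = False
        info = {}
        for _ in range(self.repeat):
            obs, reward, terminated, truncated, info = self.env.step(action)
            total_reward += reward
            last_two.append(obs)
            if terminated or truncated:
                break
        # Pixel-wise max over the last two raw frames.
        pooled = [max(pixels) for pixels in zip(*last_two)]
        self.frames.append(pooled)
        return list(self.frames), total_reward, terminated, truncated, info
```

The reward returned by `step` is the sum over the repeated frames, so one agent step corresponds to up to `repeat` emulator frames.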

Setup

Required Dependencies

Install the required dependencies using the following command:

pip install -r requirements.txt

Running the Algorithm

You can run the algorithm on any supported Gymnasium environment. For example:

python main.py --env 'MsPacmanNoFrameskip-v4'

Results

🤔 For your consideration:

The Atari environments were trained for 20,000 games each. I regret this decision, as it led to inconsistent numbers of learning steps between environments (some games require more or fewer steps per game than others).
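
One way to avoid this inconsistency is to stop on a total environment-step budget instead of a game count. The sketch below illustrates the idea. It is not how the results in this repository were produced, and `run_episode` is a hypothetical callable that plays one game and returns the number of steps it took.

```python
def train_for_steps(run_episode, max_steps):
    """Run whole episodes until at least `max_steps` environment steps are used.

    Returns the number of episodes played and the total steps taken.
    The total can exceed `max_steps` by at most one episode's length.
    """
    total_steps = 0
    episodes = 0
    while total_steps < max_steps:
        total_steps += run_episode()
        episodes += 1
    return episodes, total_steps
```

With a fixed step budget, a game with short episodes simply plays more of them, and every environment gets a comparable amount of experience.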

I also did not use reward scaling, which I use for most other algorithms. This was a nearly arbitrary decision that came about due to initial debugging - at a certain point things suddenly began to work so I just kinda rolled with it...
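
For context, one common form of reward scaling on Atari is the sign clipping used in the DQN paper, where every positive reward becomes +1 and every negative reward becomes -1. The sketch below shows that variant only as an example of what was left out here. The function name is hypothetical, and it is an assumption that this is the kind of scaling meant above.

```python
def clip_reward(reward):
    """Map a raw reward to -1.0, 0.0, or 1.0 according to its sign."""
    return float((reward > 0) - (reward < 0))
```

Without any scaling, the magnitude of the returns differs widely between games, which affects the scale of the critic loss and the advantages.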

I only started tracking the average critic value for a set of fixed states after many environments had already been trained, but I feel that this provides an additional interesting piece of context.
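
The fixed-state metric can be sketched as follows: collect a set of states once, keep it unchanged, and periodically average the critic's estimate over it. This is a minimal sketch under stated assumptions, not the repository's code. `critic` is a hypothetical callable mapping a state to a scalar value, and how the fixed states are collected (for example, from a random policy before training) is not specified in this README.

```python
def average_value(critic, fixed_states):
    """Mean critic estimate over a fixed, pre-collected set of states."""
    values = [critic(state) for state in fixed_states]
    return sum(values) / len(values)
```

Because the states never change, a rising curve reflects a change in the critic's estimates and not a change in which states happen to be visited.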

CartPole-v1

MountainCar-v0

Acrobot-v1

LunarLander-v2

AirRaid

Alien

Amidar

Assault

Asterix

Asteroids

Atlantis

BankHeist

BattleZone

BeamRider

Breakout

Krull

Berzerk

CrazyClimber

DemonAttack

Kangaroo

KungFuMaster

Zaxxon

Skiing

MontezumaRevenge

Bowling

Boxing

Carnival

Centipede

ChopperCommand

Defender

DoubleDunk

NameThisGame

Solaris

SpaceInvaders

Phoenix

StarGunner

Pitfall

Tennis

Pong

Pooyan

TimePilot

Tutankham

Enduro

UpNDown

PrivateEye

Qbert

Riverraid

RoadRunner

FishingDerby

Venture

Freeway

Seaquest

Robotank

Frostbite

VideoPinball

Gopher

Gravitar

WizardOfWor

Hero

YarsRevenge

ElevatorAction

IceHockey

Jamesbond

JourneyEscape

Acknowledgements

Special thanks to Phil Tabor, an excellent teacher! I highly recommend his YouTube channel.
