# Reinforcement Learning Project for CS W182/282A: Designing, Visualizing and Understanding Deep Neural Networks @ UC Berkeley
Reinforcement Learning (RL) is a machine learning paradigm centered on training agents to take actions in an environment so as to maximize a reward or goal. One research direction focuses on an agent's ability to “generalize”: to apply what it has learned in one environment to perform well in similar yet novel environments. For instance, an agent playing a game trains to survive a sequence of levels while maximizing its score, and is then tested on unseen levels. We benchmark the agent's performance as the score it earns on those unseen test levels.
Ideally, an agent should learn not only how to “survive” a level and reach the end without hitting obstacles, but also how to optimize its score by eating fruit and avoiding non-fruit objects, including on different, unseen levels. This matters not only for building more capable agents that can handle more tasks, but also for ensuring that an agent is truly learning skills and behaviors independent of its environment.
Our work focuses on entropy regularization and noise regularization. Recent research suggests that entropy regularization -- tuning and scheduling penalties on the entropy of the policy distribution -- can improve convergence rates in natural policy gradient algorithms. We seek to confirm these results experimentally while also exploring entropy regularization's impact on agent generalization. Additionally, we explore noise regularization, a novel technique (inspired by dropout) that introduces random perturbations into the policy distribution in an effort to build redundancy and prevent overfitting to the specific parameters of training environments.
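As a rough illustration of both ideas, here is a minimal NumPy sketch (not this repo's actual training code; `ent_coeff`, `alpha`, `eps`, and `sigma` are illustrative names) of an entropy bonus and of Dirichlet- and Gaussian-style policy perturbations:

```python
# Illustrative sketch of the two regularizers for a discrete policy.
# This is not the project's implementation, just the underlying idea.
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy_bonus(logits, ent_coeff):
    """Entropy of the policy distribution, scaled by ent_coeff.
    Added to the policy-gradient objective, this term discourages the
    policy from collapsing to a deterministic action too early."""
    probs = softmax(logits)
    entropy = -(probs * np.log(probs + 1e-8)).sum(axis=-1)
    return ent_coeff * entropy.mean()

def dirichlet_perturb(probs, alpha=0.3, eps=0.25, rng=np.random):
    """Mix the action distribution with a Dirichlet noise sample;
    eps controls how much probability mass the noise receives."""
    noise = rng.dirichlet([alpha] * probs.shape[-1])
    return (1 - eps) * probs + eps * noise

def gaussian_perturb(logits, sigma=0.1, rng=np.random):
    """Add zero-mean Gaussian noise to the logits before the softmax."""
    return logits + rng.normal(scale=sigma, size=logits.shape)
```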
First, change directory to `train_procgen`.
If you want to train a new agent:
- Use `nohup python3 -m train --start_level=0 --num_levels=500 --high_entropy={False,True} --scheduler={none,linear,piecewise,exponential} --log_dir=NAME`
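For example, training a high-entropy agent with an exponential schedule might look like this (the run name `high_exp` is a hypothetical placeholder for `NAME`):

```
nohup python3 -m train --start_level=0 --num_levels=500 --high_entropy=True --scheduler=exponential --log_dir=high_exp
```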
If you want to test an agent:
- Use `nohup python3 -m train --start_level=500 --num_levels=100 --high_entropy={False,True} --scheduler={none,linear,piecewise,exponential} --log_dir=NAME --load_path=FOLDER/NAME/checkpoints/00305`
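For example, to evaluate the hypothetical `high_exp` run above on 100 held-out levels (the checkpoint path assumes its results were saved under `training_runs`):

```
nohup python3 -m train --start_level=500 --num_levels=100 --high_entropy=True --scheduler=exponential --log_dir=high_exp_test --load_path=training_runs/high_exp/checkpoints/00305
```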
- `high_entropy` = whether to run an agent with an initial `ent_coeff` of 0.01 (low, `False`) or 0.1 (high, `True`)
- `scheduler` = whether to run an agent with a linear, piecewise step, or exponential scheduler for `ent_coeff` decay (see the sketch after this list)
  - `high_entropy=False` = schedulers will decay from `ent_coeff=1e-2` to `ent_coeff=1e-5`
  - `high_entropy=True` = schedulers will decay from `ent_coeff=1e-1` to `ent_coeff=1e-4`
- `log_dir` = file path to save results
- `load_path` = file path to an existing model
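For concreteness, here is a minimal sketch of how the three schedules could interpolate `ent_coeff` between the endpoints above (the decay shapes and the piecewise step boundaries are assumptions, not necessarily this repo's exact values):

```python
# Illustrative ent_coeff schedules; the piecewise boundaries and exact
# interpolation used in the repo are assumptions.
def ent_coeff_schedule(scheduler, step, total, start=1e-2, end=1e-5):
    frac = min(step / total, 1.0)  # fraction of training completed
    if scheduler == "none":
        return start                         # constant coefficient
    if scheduler == "linear":
        return start + frac * (end - start)  # straight-line decay
    if scheduler == "piecewise":
        # Drop by a factor of 10 at fixed fractions of training:
        # 1e-2 -> 1e-3 -> 1e-4 -> 1e-5 for the low-entropy setting.
        return start * 10.0 ** -sum(frac >= b for b in (0.25, 0.5, 0.75))
    if scheduler == "exponential":
        return start * (end / start) ** frac  # geometric interpolation
    raise ValueError(f"unknown scheduler: {scheduler}")
```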
- `train_procgen`: Files from train_procgen required to train/run agents
- `training_runs`: Checkpoints and Progress for Training Runs
- `test_runs`: Checkpoints and Progress for Test Runs
- `preliminary_runs`: Initial Experimentation
- `noise`: Experiments with noise
  - `dirichlet_noise`: Dirichlet Distribution Modification
  - `gaussian_noise`: Gaussian Distribution Modification
- `README.md`: You are here!
- `requirements.txt`: Required Python Packages
- Maxwell Chen (@maxhchen)
- Abinav Routhu (@abinavcal)
- Jason Lin (@jasonlin18)