This codebase contains the implementation, visualizations, and analysis of IRM from *Skill-Based Reinforcement Learning with Intrinsic Reward Matching* (Adeniji and Xie et al., 2022). IRM leverages the skill discriminator from unsupervised RL pretraining to perform environment-interaction-free skill sequencing for unseen downstream tasks.
This codebase is built on top of the Contrastive Intrinsic Control (CIC) codebase.
Before running IRM, you must first pretrain an agent with the unsupervised RL method of your choice.
To pretrain an agent with `dads` or `cic`, run the following command:
```
python pretrain.py agent=AGENT domain=DOMAIN experiment_folder=YOUR_EXP_FOLDER experiment_name=YOUR_EXP_NAME
```
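For example, a CIC pretraining run on the walker domain might look like this (the folder and experiment names below are placeholders):

```
python pretrain.py agent=cic domain=walker experiment_folder=exp_folder experiment_name=cic_walker_pretrain
```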
To finetune your pretrained agent, run the following command. Make sure to specify the directory of your saved snapshots with `YOUR_EXP_NAME`.
```
python finetune.py agent=AGENT irm=IRM_METHOD experiment=YOUR_EXP_NAME task=TASK extr_reward=[REWARD] restore_snapshot_ts=2000000 restore_snapshot_dir=PATH_TO_PRETRAINED_MODEL
```
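For instance, a hypothetical finetuning run using CEM-based skill selection on the walker `walk` task (the reward name and snapshot path here are illustrative; see the domain/task table below for valid options):

```
python finetune.py agent=cic irm=irm_cem experiment=cic_walker_pretrain task=walk extr_reward=[walk] restore_snapshot_ts=2000000 restore_snapshot_dir=PATH_TO_PRETRAINED_MODEL
```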
In addition, we include a visualization script that provides detailed insights into the IRM skill selection process:
```
python visualize_irm.py agent=AGENT experiment=YOUR_EXP_NAME domain=DOMAIN restore_snapshot_ts=2000000 restore_snapshot_dir=PATH_TO_PRETRAINED_MODEL
```
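As a concrete (illustrative) example, matching the pretraining run above:

```
python visualize_irm.py agent=cic experiment=cic_walker_pretrain domain=walker restore_snapshot_ts=2000000 restore_snapshot_dir=PATH_TO_PRETRAINED_MODEL
```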
For sequential task finetuning (or IRM visualizations), add the flag `extr_reward_seq=[REW1,REW2,REW3]`.
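For example, to finetune on the three barrier-pushing goals in sequence (assuming the reward names match the task names in the table below):

```
python finetune.py agent=cic irm=irm_cem experiment=YOUR_EXP_NAME task=goal_barrier1 extr_reward_seq=[goal_barrier1,goal_barrier2,goal_barrier3] restore_snapshot_ts=2000000 restore_snapshot_dir=PATH_TO_PRETRAINED_MODEL
```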
We use the `IRM` class to perform skill selection (`env_rollout`, `irm_cem`, `random_skill`, etc.) and to process rewards for sequential goal-reaching environments. To implement a new skill selection method, create a subclass of `IRM` and implement the `run_skill_selection_method` method, as in the sketch below.
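A minimal sketch of what such a subclass might look like. The import path, the `self.skill_dim` attribute, and the `score_skill` helper are hypothetical placeholders for illustration; consult the actual `IRM` class in this codebase for the real interface:

```python
import numpy as np

from irm import IRM  # hypothetical import path; use the actual module in this repo


class RandomSearchSkillSelection(IRM):
    """Illustrative skill selection: score uniformly sampled candidate skills
    and return the best match. Attribute and helper names are assumptions."""

    def run_skill_selection_method(self):
        # Sample candidate skills uniformly from the skill space
        # (self.skill_dim is assumed to be provided by the IRM base class).
        candidates = np.random.uniform(0.0, 1.0, size=(128, self.skill_dim))
        # score_skill is a hypothetical helper standing in for the IRM
        # matching objective between intrinsic and extrinsic rewards.
        scores = np.array([self.score_skill(z) for z in candidates])
        return candidates[int(np.argmax(scores))]
```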
We assume you have access to a GPU that can run CUDA 10.2 and cuDNN 8. The simplest way to install all required dependencies is to create an Anaconda environment by running:
```
conda env create -f conda_env.yml
```
After the installation completes, you can activate the environment with:
```
conda activate irm
```
Note that we use a custom implementation of OpenAI Gym's Fetch environment.
We work on the following domains and tasks:

Domain | Tasks | Reduced State
---|---|---
`fetch_reach` | `goal_0.5_0.5_0.5`, `goal_1_1.2_1` | `fetch_reach_xyz`
`fetch_push` | `goal_barrier1`, `goal_barrier2`, `goal_barrier3` | `fetch_push_xy`
`fetch_barrier` | `goal_barrier1`, `goal_barrier2`, `goal_barrier3` | `fetch_push_xy`
`walker` | `stand`, `walk`, `run`, `flip` | `walker_delta_xyz`
`quadruped` | `walk`, `run`, `stand`, `jump` | `quadruped_velocity`
`jaco` | `reach_top_left`, `reach_top_right`, `reach_bottom_left`, `reach_bottom_right` | `jaco_xyz`
`plane` | `goal_top_right`, `goal_top_left` | `states`
Logs are stored in the `exp_local` folder. To launch TensorBoard, run:
```
tensorboard --logdir exp_local
```
You may also enable logging to wandb and view logs there.
If you use this code in your own research, please consider citing:
A. Adeniji, A. Xie, and P. Abbeel. Skill-based reinforcement learning with intrinsic reward matching, 2022. URL https://arxiv.org/abs/2210.07426.