forked from Xinyuz26/DORA

Code for ICML'24 paper "Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics"


LAMDA-RL/DORA

 
 


DORA

Code for DORA: Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics.

Installation

  • Install the dependencies from the requirements file using Python 3.8.13:

    pip install -r ./requirement.txt

Usage

Offline Datasets Collection

Collect your offline datasets with the environments in ./envs, which generate tasks with different dynamics in MuJoCo. For each single-task dataset, you may train a policy independently with any general online RL method and use all the transitions stored in the replay buffer during training as the dataset.

As described in our paper, we use Soft Actor-Critic (SAC) to train RL policies on 10 tasks with different dynamics. We gather 200,000 transitions from the replay buffer for each single-task dataset, except for the Pendulum-gravity tasks, whose datasets comprise 40,000 transitions each. For example, our datasets for Cheetah-gravity can be found in drive.
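
The collection procedure described above can be sketched roughly as follows. This is a minimal illustration and not the repository's code: the `ReplayBuffer` class, the `collect_dataset` helper, the toy one-dimensional environment, and the pickle-based file format are all assumptions made for the example. In the paper's setup the transitions come from the replay buffer of an SAC agent during training; here a generic `policy_fn` stands in for that agent.

```python
import pickle
from pathlib import Path


class ReplayBuffer:
    """Hypothetical buffer that keeps every transition it is given."""

    def __init__(self):
        self.transitions = []

    def add(self, obs, action, reward, next_obs, done):
        self.transitions.append(
            {"obs": obs, "action": action, "reward": reward,
             "next_obs": next_obs, "done": done}
        )

    def save(self, path):
        # Assumed storage format: one pickle file per single-task dataset.
        path = Path(path)
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("wb") as f:
            pickle.dump(self.transitions, f)


def toy_reset():
    """Toy stand-in for an environment reset; returns the initial state."""
    return 0.0


def toy_step(obs, action, gravity=-9.8):
    """Toy stand-in for one environment step under a given gravity value."""
    next_obs = obs + action + 0.01 * gravity
    reward = -abs(next_obs)
    done = abs(next_obs) > 5.0
    return next_obs, reward, done


def collect_dataset(step_fn, reset_fn, policy_fn, num_transitions):
    """Roll out policy_fn and record every transition until the budget is met."""
    buffer = ReplayBuffer()
    obs = reset_fn()
    while len(buffer.transitions) < num_transitions:
        action = policy_fn(obs)
        next_obs, reward, done = step_fn(obs, action)
        buffer.add(obs, action, reward, next_obs, done)
        # Start a new episode when the current one terminates.
        obs = reset_fn() if done else next_obs
    return buffer
```

One such buffer would be collected and saved per task, for instance by calling `collect_dataset` once for each gravity value and writing each result to its own file with `save`.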

Encoder Training

An example:

python run_offline_encoder_trainer.py --env_type cheetah-gravity --rnn_fix_length 8 --varying_params gravity --seed 0

Parameters:

  • --env_type: The environment to train on. Available environments include Cheetah-gravity, Cheetah-dof, Cheetah-body_mass, Hopper-gravity, Pendulum-gravity, and Walker-gravity, whose changing dynamics are named gravity, dof, and body_mass.
  • --rnn_fix_length: The RNN history length.
  • --varying_params: The changing dynamics, e.g. gravity.
  • --seed: The random seed.
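
For orientation, the flags listed above could be declared with `argparse` along the following lines. This is only a sketch: the function name `build_encoder_parser`, the argument types, and the default values (taken from the example command) are assumptions, and the actual parser in run_offline_encoder_trainer.py may differ.

```python
import argparse


def build_encoder_parser():
    """Hypothetical parser mirroring the documented encoder-training flags."""
    parser = argparse.ArgumentParser(
        description="Sketch of the encoder-training command-line interface."
    )
    # Defaults below are assumptions copied from the example command.
    parser.add_argument("--env_type", type=str, default="cheetah-gravity",
                        help="Environment to train on.")
    parser.add_argument("--rnn_fix_length", type=int, default=8,
                        help="RNN history length.")
    parser.add_argument("--varying_params", type=str, default="gravity",
                        help="Name of the changing dynamics.")
    parser.add_argument("--seed", type=int, default=0,
                        help="Random seed.")
    return parser


if __name__ == "__main__":
    args = build_encoder_parser().parse_args()
    print(vars(args))
```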

Meta Policy Training

An example:

python run_offline_policy_trainer.py --env cheetah-dof --rnn_fix_length 8 --encoder_path ''  --seed 0

Parameters:

  • --encoder_path: The path of the trained encoder model, e.g. ./log/env_type/your_experiment/model.
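
When running several seeds against the same trained encoder, the command above can be assembled programmatically. The sketch below is an illustration only: the helper `build_policy_commands` is hypothetical, it uses just the flags shown in the example command, and the encoder path is left as the placeholder from the parameter description.

```python
import subprocess
import sys


def build_policy_commands(env, encoder_path, seeds, rnn_fix_length=8):
    """Return one argument list per seed for run_offline_policy_trainer.py."""
    commands = []
    for seed in seeds:
        commands.append([
            sys.executable, "run_offline_policy_trainer.py",
            "--env", env,
            "--rnn_fix_length", str(rnn_fix_length),
            "--encoder_path", str(encoder_path),
            "--seed", str(seed),
        ])
    return commands


def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder path; replace it with the location of your trained encoder.
    cmds = build_policy_commands(
        "cheetah-dof", "./log/env_type/your_experiment/model", seeds=[0, 1, 2]
    )
    for cmd in cmds:
        print(" ".join(cmd))
```

The main block only prints the commands; calling `run_all(cmds)` from the repository root would execute them sequentially.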

We express our gratitude to Fanmin Luo and Yihao Sun for their help with our code.

Reference

Citation

If you find our paper or code useful, please consider citing via

@inproceedings{dora,
  author       = {Xinyu Zhang and
                  Wenjie Qiu and
                  Yi{-}Chen Li and
                  Lei Yuan and
                  Chengxing Jia and
                  Zongzhang Zhang and
                  Yang Yu},
  title        = {Debiased Offline Representation Learning for Fast Online Adaptation
                  in Non-stationary Dynamics},
  booktitle    = {Forty-first International Conference on Machine Learning, {ICML} 2024,
                  Vienna, Austria, July 21-27, 2024},
  publisher    = {OpenReview.net},
  year         = {2024},
  url          = {https://openreview.net/forum?id=BrZPj9rEpN},
}
