Code for the NeurIPS 2024 paper: Doubly Mild Generalization for Offline Reinforcement Learning.
The paper's results were collected with MuJoCo 2.1.0 (mujoco-py 2.1.2.14) and OpenAI Gym 0.23.1 on the D4RL datasets. Networks are trained with PyTorch 1.11.0 under Python 3.7.
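The version pins above can be set up with pip along the lines of the following sketch. The D4RL install source is an assumption (D4RL is typically installed from its GitHub repository), and the MuJoCo binary location is the conventional default; adjust both to your setup.

```shell
# Environment sketch matching the versions noted above (Python 3.7).
# MuJoCo 2.1.0 binaries must be installed separately, conventionally
# under ~/.mujoco/mujoco210, before building mujoco-py.
pip install "torch==1.11.0" "gym==0.23.1" "mujoco-py==2.1.2.14"
# Assumed D4RL source; substitute the fork/revision you use.
pip install "git+https://github.com/Farama-Foundation/D4RL.git"
```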
Use the following commands to train offline RL agents on D4RL tasks (Gym locomotion and AntMaze) and save the models.
python train_offline.py --env halfcheetah-medium-v2 --lam 0.25 --nu 0.1 --save_model
python train_offline.py --env antmaze-large-diverse-v2 --lam 0.25 --nu 0.5 --no_normalize --save_model
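To sweep several datasets with the paper's Gym locomotion hyperparameters, the commands above can be wrapped in a loop. This sketch only prints the commands it would run (drop the echo to launch training); hopper-medium-v2 and walker2d-medium-v2 are assumed standard D4RL dataset names not listed in this README.

```shell
# Dry-run sweep over assumed D4RL locomotion datasets with the
# Gym hyperparameters from above (lam=0.25, nu=0.1).
for env in halfcheetah-medium-v2 hopper-medium-v2 walker2d-medium-v2; do
    echo python train_offline.py --env "$env" --lam 0.25 --nu 0.1 --save_model
done
```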
Use the following command to fine-tune the pretrained offline models online on AntMaze tasks.
python train_finetune.py --env antmaze-large-diverse-v2 --lam 0.25 --nu 0.5 --lam_end 0.5 --nu_end 0.005 --no_normalize
You can view saved runs using TensorBoard.
tensorboard --logdir <run_dir>
If you find this work useful, please consider citing:
@article{mao2024doubly,
title={Doubly mild generalization for offline reinforcement learning},
author={Mao, Yixiu and Wang, Qi and Qu, Yun and Jiang, Yuhang and Ji, Xiangyang},
journal={Advances in Neural Information Processing Systems},
volume={37},
pages={51436--51473},
year={2024}
}