IQL Implementation in PyTorch


This repo is an unofficial PyTorch implementation of Implicit Q-Learning (In-sample Q-Learning) for offline RL, aiming to replicate the paper's results.

@inproceedings{
    kostrikov2022offline,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=68n2s9ZJWF8}
}
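
At its core, IQL fits a state-value function to an upper expectile of Q over dataset actions and extracts the policy with advantage-weighted regression. Below is a minimal PyTorch sketch of those two losses; the object names (`q_target`, `value_net`, `policy.log_prob`) and the weight clip are illustrative assumptions, not this repo's actual modules:

```python
import torch

def expectile_loss(diff: torch.Tensor, expectile: float = 0.7) -> torch.Tensor:
    """Asymmetric L2 loss from the paper: |tau - 1(u < 0)| * u^2."""
    weight = torch.abs(expectile - (diff < 0).float())
    return (weight * diff.pow(2)).mean()

def value_loss(q_target, value_net, states, actions):
    # V(s) regresses toward an upper expectile of Q(s, a) over dataset
    # actions, approximating max_a Q(s, a) without ever evaluating
    # out-of-sample actions.
    with torch.no_grad():
        q = q_target(states, actions)
    return expectile_loss(q - value_net(states))

def awr_policy_loss(policy, q_target, value_net, states, actions,
                    temperature: float = 3.0):
    # Advantage-weighted regression: maximize the exp(beta * A(s, a))-weighted
    # log-likelihood of dataset actions; the clip at 100 mirrors common
    # practice (an assumption, not necessarily this repo's choice).
    with torch.no_grad():
        adv = q_target(states, actions) - value_net(states)
        weights = torch.exp(temperature * adv).clamp(max=100.0)
    return -(weights * policy.log_prob(states, actions)).mean()
```

The `--expectile` and `--temperature` flags in the training commands below correspond to tau and beta here.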

Note: the reward standardization used in the official implementation (MuJoCo locomotion task rewards are divided by the difference between the returns of the best and worst trajectories in each dataset) is missing from this implementation. It is straightforward to add yourself; see the sketch below.
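
A minimal sketch of that standardization, assuming a D4RL-style flat dataset with `rewards` and `terminals` arrays; the `standardize_rewards` helper and the final 1000x rescale (taken from the official JAX code) are assumptions, not this repo's API:

```python
import numpy as np

def standardize_rewards(rewards: np.ndarray, terminals: np.ndarray,
                        scale: float = 1000.0) -> np.ndarray:
    # Split the flat dataset into trajectories at terminal flags, compute
    # each trajectory's return, then divide all rewards by the difference
    # between the best and worst trajectory returns.
    returns, ep_ret = [], 0.0
    for r, done in zip(rewards, terminals):
        ep_ret += float(r)
        if done:
            returns.append(ep_ret)
            ep_ret = 0.0
    if ep_ret != 0.0:  # count a final, truncated trajectory as well
        returns.append(ep_ret)
    span = max(returns) - min(returns)
    # The official JAX code also rescales by 1000 after dividing (an
    # assumption worth verifying against ikostrikov/implicit_q_learning).
    return rewards / span * scale
```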

Train

Gym-MuJoCo

python main_iql.py --env halfcheetah-medium-v2 --expectile 0.7 --temperature 3.0 --eval_freq 5000 --eval_episodes 10 --normalize
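
The `--normalize` flag presumably standardizes observations with dataset statistics, as in sfujim/TD3_BC (credited in the Acknowledgement below). A hedged sketch with illustrative names:

```python
import numpy as np

def normalize_states(states: np.ndarray, eps: float = 1e-3):
    # Per-dimension standardization using the dataset mean/std; the same
    # mean and std must also be applied to observations at evaluation time.
    mean = states.mean(axis=0, keepdims=True)
    std = states.std(axis=0, keepdims=True) + eps
    return (states - mean) / std, mean, std
```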

AntMaze

python main_iql.py --env antmaze-medium-play-v2 --expectile 0.9 --temperature 10.0 --eval_freq 50000 --eval_episodes 100
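
The numbers under Results are, presumably, D4RL normalized scores averaged over `--eval_episodes` rollouts every `--eval_freq` training steps. A minimal evaluation sketch using the older Gym API that d4rl targets; `policy.act` is an assumed interface, while `get_normalized_score` is d4rl's real API:

```python
import gym
import d4rl  # noqa: F401 -- registers the offline RL environments
import numpy as np

def evaluate(policy, env_name: str, episodes: int = 10, seed: int = 0) -> float:
    env = gym.make(env_name)
    env.seed(seed + 100)
    returns = []
    for _ in range(episodes):
        state, done, ep_ret = env.reset(), False, 0.0
        while not done:
            action = policy.act(np.asarray(state))  # assumed policy interface
            state, reward, done, _ = env.step(action)
            ep_ret += reward
        returns.append(ep_ret)
    # d4rl's normalized score: 0 = random policy, 100 = expert policy.
    return 100.0 * env.get_normalized_score(float(np.mean(returns)))
```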

Results

(Figure: Gym-MuJoCo locomotion results)

(Figure: AntMaze results)

Acknowledgement

This repo borrows heavily from sfujim/TD3_BC and ikostrikov/implicit_q_learning.
