This repo is an unofficial PyTorch implementation of Implicit Q-Learning (IQL), an in-sample Q-learning algorithm for offline RL, introduced in:
```bibtex
@inproceedings{
  kostrikov2022offline,
  title={Offline Reinforcement Learning with Implicit Q-Learning},
  author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=68n2s9ZJWF8}
}
```
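For reference, the core of IQL is fitting the value function to an upper expectile of the Q-function via an asymmetric L2 loss. A minimal PyTorch sketch of that loss (the function name and signature are illustrative, not necessarily what this repo uses):

```python
import torch

def expectile_loss(diff: torch.Tensor, expectile: float = 0.7) -> torch.Tensor:
    """Asymmetric L2 loss L_2^tau(u) = |tau - 1(u < 0)| * u^2.

    `diff` is Q(s, a) - V(s). With expectile > 0.5, positive errors are
    weighted more heavily than negative ones, so V is pushed toward an
    upper expectile of Q, approximating an in-sample maximum over actions.
    """
    weight = torch.abs(expectile - (diff < 0).float())
    return (weight * diff.pow(2)).mean()
```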
Note: the reward standardization used in the official implementation (MuJoCo locomotion rewards are divided by the difference between the returns of the best and worst trajectories in each dataset) is missing from this implementation. It is straightforward to add yourself, e.g. along the lines of the sketch below.
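A minimal sketch of that standardization, assuming the dataset is stored as flat reward/done arrays (as D4RL provides). The official implementation also splits trajectories on timeouts and rescales by 1000 afterwards, so treat the details here as an approximation:

```python
import numpy as np

def standardize_rewards(rewards: np.ndarray, dones: np.ndarray) -> np.ndarray:
    # Split the flat dataset into trajectories at episode boundaries and
    # compute each trajectory's undiscounted return.
    returns, ret, n = [], 0.0, 0
    for r, d in zip(rewards, dones):
        ret += float(r)
        n += 1
        if d:
            returns.append(ret)
            ret, n = 0.0, 0
    if n > 0:
        returns.append(ret)  # trailing partial trajectory, if any
    # Divide by the return gap between the best and worst trajectory;
    # the 1000x rescaling follows the official implementation.
    scale = max(returns) - min(returns)
    assert scale > 0, "dataset needs trajectories with different returns"
    return rewards / scale * 1000.0
```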
To run a MuJoCo locomotion task:
```bash
python main_iql.py --env halfcheetah-medium-v2 --expectile 0.7 --temperature 3.0 --eval_freq 5000 --eval_episodes 10 --normalize
```
To run an AntMaze task:
```bash
python main_iql.py --env antmaze-medium-play-v2 --expectile 0.9 --temperature 10.0 --eval_freq 50000 --eval_episodes 100
```
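These settings mirror the hyperparameters reported in the paper (expectile 0.7 and temperature 3.0 for locomotion, 0.9 and 10.0 for AntMaze). Note that state normalization (`--normalize`) is enabled for the locomotion command but not for AntMaze above.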
This repo borrows heavily from sfujim/TD3_BC and ikostrikov/implicit_q_learning.