Multi reward decision making using offline reinforcement learning

Calle Ryge Carlsen, Christian Ole Nielsen, Karl Meisner-Jensen, Magnus Elgaard Bennett

Based on code originally from Berkeley (MIT license), available on GitHub.

This project relates to the paper Decision Transformer: Reinforcement Learning via Sequence Modeling (Chen, Lu, et al., 2021), available on arXiv.

Overview

This project uses the Decision Transformer (DT) to perform offline reinforcement learning in Markovian Gym MuJoCo environments.

A multi-return variant of the transformer has been added, which allows the model to condition on multiple return signals at once. Code for generating the multi-return data is included as well.
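To make the idea of conditioning on multiple return signals more concrete, the sketch below computes a separate return-to-go for each reward signal in a trajectory. It is only an illustration under stated assumptions: the function name `multi_returns_to_go`, the `(T, K)` array layout, and the `gamma` parameter are hypothetical and are not taken from the code in /gym.

```python
import numpy as np


def multi_returns_to_go(rewards, gamma=1.0):
    """Compute per-signal returns-to-go for one trajectory.

    Hypothetical helper for illustration only.

    rewards: array-like of shape (T, K), one row per timestep and
             one column per reward signal.
    gamma:   discount factor; 1.0 gives undiscounted returns-to-go.

    Returns an array of shape (T, K) where entry [t, k] is the
    (discounted) sum of reward signal k from timestep t to the end.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    if rewards.ndim != 2:
        raise ValueError("rewards must have shape (T, K)")

    rtg = np.zeros_like(rewards)
    running = np.zeros(rewards.shape[1])
    # Walk backwards so each step reuses the sum already accumulated.
    for t in range(rewards.shape[0] - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


# Toy trajectory with T=3 timesteps and K=2 reward signals.
example_rewards = [[1.0, 0.5], [2.0, 0.0], [3.0, 1.5]]
example_rtg = multi_returns_to_go(example_rewards)
print(example_rtg)
```

Each row of the result is a vector of K target returns, which is the kind of quantity a multi-return model could be conditioned on at that timestep in place of a single scalar return-to-go.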

Several submit_environment_case.sh files are included to allow for easy training on the DTU HPC. Running experiments from the console has also been made easier.

Instructions

See /gym/readme-gym.md for instructions on initializing the environment and for common errors associated with this step.

All code associated with the decision transformer is found in the /gym folder.

Data

Evaluation data for the experiments can be found on Drive.

License

DTU

