brandstetter-johannes / rudder Public

forked from ml-jku/rudder


RUDDER: Return Decomposition for Delayed Rewards

RUDDER efficiently learns optimal policies in finite Markov decision processes with delayed rewards. The following links point to the paper, the blog, and example code:

Our RUDDER paper: https://arxiv.org/abs/1806.07857
RUDDER blog: https://www.jku.at/index.php?id=16426
Code for the RUDDER demonstration on the example task from the blog: https://github.com/ml-jku/rudder-demonstration-code
A practical step-by-step guide to applying RUDDER in PyTorch: https://github.com/widmi/rudder-a-practical-tutorial
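
To give a rough feel for what "return decomposition" means, the sketch below redistributes a delayed reward along an episode by taking differences of consecutive return predictions. It is only an illustration under stated assumptions, not the code from the repositories linked above: the function name `redistribute_rewards` and the example numbers are hypothetical, and the per-step return predictions are assumed to come from some already-trained return-prediction model (see the paper and the tutorial for how such a model is actually trained).

```python
def redistribute_rewards(return_predictions, initial_prediction=0.0):
    """Turn per-timestep predictions of the final return into per-step rewards.

    return_predictions[t] is assumed to be a model's estimate of the episode's
    final return after seeing the sequence up to step t (hypothetical input).
    The redistributed reward at step t is the change in that estimate.
    """
    redistributed = []
    previous = initial_prediction
    for prediction in return_predictions:
        # Reward is credited to the step at which the predicted return changes.
        redistributed.append(prediction - previous)
        previous = prediction
    return redistributed


# Hypothetical episode: the return prediction jumps at step 2, for example
# because a key action happened there, while the true reward arrives only
# at the end of the episode.
predictions = [0.0, 0.0, 1.0, 1.0]
rewards = redistribute_rewards(predictions)
print(rewards)
```

By construction the redistributed rewards sum to the last prediction minus the initial one, so the total return of the episode is preserved while the credit moves to the step that caused it.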
