RUDDER: Return Decomposition for Delayed Rewards


RUDDER efficiently learns optimal policies in finite Markov decision processes (MDPs) with delayed rewards. Further material is available at the following links:

Our RUDDER paper: https://arxiv.org/abs/1806.07857
RUDDER blog: https://ml-jku.github.io/rudder/
Code for the RUDDER demonstration on the example task from the blog: https://github.com/ml-jku/rudder-demonstration-code
A practical step-by-step guide to applying RUDDER in PyTorch: https://github.com/widmi/rudder-a-practical-tutorial
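
To give a rough feel for what "return decomposition" means, here is a minimal sketch that assumes a model has already produced, for each time step of an episode, a prediction of the final return. The redistributed reward at each step is then taken as the change in that prediction, so the per-step rewards sum to the last prediction. The function name `redistribute_reward` and the toy prediction values are hypothetical and do not come from this repository; the paper and the tutorial linked above describe the actual method.

```python
from typing import List, Sequence


def redistribute_reward(return_predictions: Sequence[float]) -> List[float]:
    """Turn per-step predictions of the final return into per-step rewards.

    return_predictions[t] is a (hypothetical) model's estimate of the episode
    return after seeing the first t + 1 steps. The redistributed reward at
    step t is the change in that estimate; the estimate before step 0 is
    assumed to be 0.
    """
    rewards = []
    previous = 0.0
    for prediction in return_predictions:
        rewards.append(prediction - previous)
        previous = prediction
    return rewards


# Toy episode: the environment pays a single reward of 1.0 at the last step.
# The predictions are made up for illustration; they jump at step 2, where a
# decisive action is assumed to happen.
predictions = [0.0, 0.0, 1.0, 1.0, 1.0]
redistributed = redistribute_reward(predictions)
print(redistributed)       # reward is moved to the step where the prediction changes
print(sum(redistributed))  # telescopes to the last prediction
```

In this toy case the delayed reward ends up assigned to the step at which the predicted return changes, which is the intuition behind shortening the delay between an action and its reward.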
