Goal Selection Strategies for Learning Goal-Oriented Value Functions

Degree: COMS

Description: Recent work in compositional reinforcement learning has demonstrated how to combine learned skills to solve tasks specified using Boolean algebra operators. However, the underlying algorithm uses standard Q-learning with ε-greedy exploration. One component of the algorithm is how the agent decides which goal to pursue while exploring, which is currently done greedily. In this project, we propose extending the algorithm with alternative goal selection strategies, such as uniform random or bandit-based selection. The project also involves creating a virtual environment in Unity or mujoco-worldgen.
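To make the goal selection step concrete, here is a minimal sketch contrasting greedy and uniform random goal selection. It assumes a tabular goal-oriented value function stored, for the current state, as a mapping from goal to a list of action values. The names `select_goal`, `select_action`, and `q_state` are hypothetical and are not taken from this repository.

```python
import random


def select_goal(q_state, strategy, rng):
    """Pick a goal for the current state.

    q_state maps each goal to its list of action values (assumed layout).
    """
    goals = list(q_state)
    if strategy == "greedy":
        # Greedy: the goal whose best action value is highest.
        return max(goals, key=lambda g: max(q_state[g]))
    if strategy == "uniform":
        # Uniform random: every goal is equally likely.
        return rng.choice(goals)
    raise ValueError(f"unknown strategy: {strategy}")


def select_action(q_state, goal, epsilon, rng):
    """Epsilon-greedy action selection for the chosen goal."""
    values = q_state[goal]
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=values.__getitem__)


rng = random.Random(0)
q_state = {"goal_a": [0.1, 0.2], "goal_b": [0.9, 0.0]}
greedy_goal = select_goal(q_state, "greedy", rng)
random_goal = select_goal(q_state, "uniform", rng)
action = select_action(q_state, greedy_goal, epsilon=0.0, rng=rng)
```

In this sketch only the goal selection rule changes between strategies; the ε-greedy action rule stays the same, which is the separation the project proposes to study.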

Tags/topics: Reinforcement learning, deep reinforcement learning, game design

Algorithms:

Explore only
Exploit only
ε-greedy (Epsilon greedy)
UCB (Upper Confidence Bound)
EXP4
Softmax
Optimistic initialization
Intrinsic rewards
Q-map
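
As an illustration of how some of the strategies above could be applied to goal selection, the following sketch treats each goal as a bandit arm with a running mean of observed returns. It covers only ε-greedy, UCB1, and softmax. The class name `GoalBandit`, its methods, and the choice of reward signal are assumptions made for this example, not the repository's implementation.

```python
import math
import random


class GoalBandit:
    """Treats each goal as a bandit arm (hypothetical helper)."""

    def __init__(self, n_goals, initial_value=0.0):
        # A high initial_value gives a simple form of optimistic
        # initialization: untried goals look attractive until sampled.
        self.counts = [0] * n_goals
        self.values = [initial_value] * n_goals

    def update(self, goal, reward):
        # Incremental sample-average update of the goal's estimate.
        self.counts[goal] += 1
        self.values[goal] += (reward - self.values[goal]) / self.counts[goal]

    def epsilon_greedy(self, epsilon, rng):
        # Explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            return rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def ucb1(self, c=2.0):
        # Try every goal once before using confidence bounds.
        for goal, n in enumerate(self.counts):
            if n == 0:
                return goal
        total = sum(self.counts)

        def score(goal):
            bonus = math.sqrt(c * math.log(total) / self.counts[goal])
            return self.values[goal] + bonus

        return max(range(len(self.values)), key=score)

    def softmax(self, temperature, rng):
        # Subtract the maximum before exponentiating for stability.
        top = max(self.values)
        weights = [math.exp((v - top) / temperature) for v in self.values]
        return rng.choices(range(len(self.values)), weights=weights)[0]


rng = random.Random(0)
bandit = GoalBandit(n_goals=3)
first_pick = bandit.ucb1()
for goal, reward in [(0, 1.0), (1, 0.0), (2, 0.0)]:
    bandit.update(goal, reward)
ucb_pick = bandit.ucb1()
greedy_pick = bandit.epsilon_greedy(0.0, rng)
softmax_pick = bandit.softmax(0.5, rng)
```

Setting `epsilon` to 1 or 0 in `epsilon_greedy` recovers the explore-only and exploit-only strategies, respectively.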

Repository: mamello-justice/research-gsslgovf

Folders and files

Latest commit

History

Repository files navigation

Goal Selection Strategies for Learning Goal-Oriented Value Functions

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages