Hindsight experience replay allows an agent to learn in environments with sparse rewards and multiple goals.
HER utilizes UVFAs and works by augmenting experience replays with additional goals. The intuition being there is valuable information to be learned even when the end goal is not reached e.g. if I miss a shot in basketball I can still reason that had the hoop been slightly moved I would have made it.
HER is a sort of intrisic ciriculum learning in which the agent is able to learn from smaller goals before reaching the larger ones.
See the experiments folder for example implementations.
- n>15 on bitflip
- More hindsight types
- More environments (push-drag)