Learning with Policy Gradients in a Full Reinforcement Learning Environment Provided by OpenAI Gym

Overview

Policy Gradients is a Reinforcement Learning algorithm. In this experiment we consider a full RL problem: the environment has several states, and each action must be chosen with regard not only to its immediate reward but also to the rewards it leads to in the long run. To learn an optimal policy, we must therefore account for the temporal dynamics of the environment.
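One common way to account for long-run rewards is to weight each action by the discounted sum of the rewards that follow it. The snippet below is a minimal sketch of that idea, not code taken from this repository: the function name `discounted_returns` and the discount factor `gamma` are assumptions chosen for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every step t.

    `rewards` is the list of per-step rewards from one episode.
    `gamma` (an assumed hyperparameter) controls how much future rewards count.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return already computed for the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


episode_rewards = [1.0, 1.0, 1.0]
print(discounted_returns(episode_rewards, gamma=0.5))  # prints [1.75, 1.5, 1.0]
```

Earlier actions receive larger returns because more reward follows them. In a policy gradient update, these returns would typically scale the log-probability of each chosen action.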

Dependencies

Credits

Most of the conceptual and programmatic understanding is borrowed from the Reinforcement Learning Series by Arthur Juliani here.