A collection of algorithms to explore and experiment with Reinforcement Learning (RL) methods.

ivtikhon/rl-sandbox

RL Sandbox

This repository contains a collection of Reinforcement Learning (RL) algorithms implemented as Jupyter notebooks. It serves as a sandbox for experimenting with RL algorithms and understanding how they work. The goal is to provide clean, easy-to-follow implementations with explanations that support learning and experimentation in RL.

Overview

Currently, the repository includes the following algorithms:

  1. Policy Gradient with Baseline (policy-gradient-baseline.ipynb)
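    • This notebook implements the vanilla policy gradient algorithm with a baseline. Subtracting the baseline from the return reduces the variance of the gradient estimates and leads to more stable learning.
    • The baseline is typically the state-value function. Because it does not depend on the action, it leaves the expected gradient unchanged, and the lower variance usually speeds up convergence.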
  2. Natural Policy Gradients (natural-policy-gradient.ipynb)
    • This notebook implements the natural policy gradient algorithm, which improves on standard policy gradient methods by preconditioning the gradient with the inverse of the Fisher information matrix to account for the geometry of the policy space.
    • The resulting "natural" gradient measures step size by the change in the policy's action distribution rather than in its raw parameters, which makes updates more efficient and leads to faster convergence in many cases.
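
The two ideas above can be illustrated on a toy problem. The sketch below is not taken from the notebooks. It assumes a hypothetical two-action bandit with a single-parameter Bernoulli policy, and every name in it (`sample_gradients`, `rewards`, `theta`) is made up for illustration. It compares the variance of the score-function gradient with and without a state-value baseline, then rescales the gradient by a sampled (scalar) Fisher information to obtain a natural gradient.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def sample_gradients(theta, rewards, baseline, n, rng):
    """Return per-sample gradient estimates and score values.

    The policy picks action 1 with probability p = sigmoid(theta),
    so d/dtheta log pi(a) = a - p (the "score").
    """
    p = sigmoid(theta)
    grads, scores = [], []
    for _ in range(n):
        a = 1 if rng.random() < p else 0
        score = a - p
        grads.append(score * (rewards[a] - baseline))
        scores.append(score)
    return grads, scores


rng = random.Random(0)
theta = 0.3            # assumed policy parameter
rewards = (1.0, 2.0)   # assumed rewards for action 0 and action 1
p = sigmoid(theta)

# Exact value of the single state under the current policy (the baseline).
value = (1 - p) * rewards[0] + p * rewards[1]

plain, _ = sample_gradients(theta, rewards, 0.0, 20000, rng)
based, scores = sample_gradients(theta, rewards, value, 20000, rng)

print("gradient without baseline:", mean(plain), "variance:", variance(plain))
print("gradient with baseline:   ", mean(based), "variance:", variance(based))

# Fisher information estimated as the mean squared score; with one
# parameter it is a scalar, so "inverting" it is a division.
fisher = mean([s * s for s in scores])
natural_grad = mean(based) / (fisher + 1e-8)  # small damping term
print("natural gradient:", natural_grad)
```

Both estimators target the same gradient, but the baseline version has much lower variance. In this one-parameter case the natural gradient reduces to the reward gap between the two actions. With a full parameter vector, the division would become a linear solve against the Fisher matrix.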

Getting Started

Prerequisites

Make sure you have the following installed:
