RL Sandbox

This repository collects Reinforcement Learning (RL) algorithms implemented as Jupyter notebooks. It is a sandbox for experimenting with RL algorithms and understanding how they work. The goal is to provide clean, easy-to-follow implementations with explanations that support learning and experimentation.

Overview

Currently, the repository includes the following algorithms:

Policy Gradient with Baseline (policy-gradient-baseline.ipynb)
- This notebook implements the vanilla policy gradient algorithm with a baseline. Subtracting the baseline from the return reduces the variance of the gradient estimates and leads to more stable learning.
- The baseline is typically an estimate of the state-value function. Subtracting it leaves the expected gradient unchanged, and the lower variance usually speeds up convergence.
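
To make the role of the baseline concrete, here is a minimal sketch on a toy multi-armed bandit, not the notebook's implementation. The function names (`softmax`, `train_bandit`), the learning rates, and the bandit setup are assumptions chosen for illustration. Because a bandit has a single state, the state-value baseline reduces to a running average of the rewards.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(arm_means, steps=2000, lr=0.1, baseline_lr=0.1, seed=0):
    """Policy gradient with a baseline on a Gaussian bandit (illustrative only)."""
    rng = random.Random(seed)
    prefs = [0.0] * len(arm_means)  # policy parameters, one per arm
    baseline = 0.0                  # running estimate of the expected reward
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = rng.gauss(arm_means[action], 1.0)

        # The baseline is subtracted from the reward before it scales the gradient.
        advantage = reward - baseline

        # For a softmax policy, d/d prefs[k] of log pi(action) = 1[k == action] - probs[k].
        for k in range(len(prefs)):
            grad_log_prob = (1.0 if k == action else 0.0) - probs[k]
            prefs[k] += lr * advantage * grad_log_prob

        # Move the baseline toward the observed reward.
        baseline += baseline_lr * (reward - baseline)
    return softmax(prefs), baseline


if __name__ == "__main__":
    probs, baseline = train_bandit([0.0, 2.0])
    print("action probabilities:", probs)
    print("baseline estimate:", baseline)
```

With arm means of 0.0 and 2.0, the learned policy should put most of its probability on the second arm. Setting `baseline_lr=0.0` keeps the baseline at zero, which gives a quick way to compare against the plain policy gradient.
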
Natural Policy Gradients (natural-policy-gradient.ipynb)
- This notebook implements the natural policy gradient algorithm, which improves on standard policy gradient methods by preconditioning the gradient with the inverse of the Fisher information matrix to account for the geometry of the policy space.
- The resulting "natural" gradient makes updates that depend less on how the policy is parameterized, which leads to faster convergence in many cases.
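
The effect of the Fisher information can be seen in a one-parameter toy problem. The sketch below is not taken from the notebook: it assumes a hypothetical two-action policy that picks action 1 with probability `sigmoid(theta)`, with fixed rewards `r0` and `r1`, and computes everything in closed form. For this policy the Fisher information is the scalar `p * (1 - p)`, so the natural gradient is the vanilla gradient divided by it. The small `damping` term is an assumption added to keep the division stable.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def vanilla_gradient(theta, r0, r1):
    """d/d theta of the expected reward J = (1 - p) * r0 + p * r1, with p = sigmoid(theta)."""
    p = sigmoid(theta)
    return p * (1.0 - p) * (r1 - r0)


def fisher_information(theta):
    """E[(d/d theta log pi(a))^2] for the Bernoulli policy, which equals p * (1 - p)."""
    p = sigmoid(theta)
    return p * (1.0 - p)


def natural_gradient(theta, r0, r1, damping=1e-8):
    """Vanilla gradient rescaled by the inverse (damped) Fisher information."""
    return vanilla_gradient(theta, r0, r1) / (fisher_information(theta) + damping)


if __name__ == "__main__":
    for theta in (-5.0, 0.0, 5.0):
        print(theta, vanilla_gradient(theta, 0.0, 1.0), natural_gradient(theta, 0.0, 1.0))
```

When the policy is nearly deterministic (large `|theta|`), the vanilla gradient shrinks toward zero, while the natural gradient stays close to `r1 - r0`. With many parameters the Fisher information is a matrix, usually estimated from samples, and the division becomes a linear solve.
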

Getting Started

Prerequisites

Make sure you have the following installed:

Python 3.x
Jupyter Notebook
Recommended: A virtual environment such as venv or conda.

