Toolbox with highly optimized implementations of deep reinforcement learning algorithms for robotics using PyTorch and Python.
In this project, two colleagues and I are developing a toolbox of state-of-the-art reinforcement learning algorithms using PyTorch and Python. The toolbox also contains other useful features such as custom robotics environments, loggers, plotters and much more.
All the algorithms are in the rlbotics directory. Each algorithm listed below has its own directory.
- Deep Q Network (DQN)
- Double Deep Q Network (DDQN)
- Deep Deterministic Policy Gradient (DDPG)
- Twin Delayed Deep Deterministic Policy Gradient (TD3)
- Vanilla Policy Gradient (VPG)
- Soft Actor Critic (SAC)
- Trust Region Policy Optimization (TRPO)
- Proximal Policy Optimization (PPO)
The directory common contains common modular classes to easily build new algorithms.
- approximators: Basic Deep Neural Networks (Dense, Conv, LSTM).
- logger: Log training data and other information
- visualize: Plot graphs
- policies: Common policies such as Random, Softmax, Parametrized Softmax and Gaussian Policy
- utils: Functions to compute the expected return, the Generalized Advantage Estimation (GAE), etc.
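To illustrate the kind of helpers the utils module refers to, here is a minimal sketch of the discounted return and of Generalized Advantage Estimation for a single trajectory. It is written in plain Python for readability and is not the toolbox's own code: the function names `discounted_returns` and `gae_advantages`, their parameters, and the default values of `gamma` and `lam` are assumptions made for this example.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    last_value: float = 0.0,
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Generalized Advantage Estimation for one trajectory.

    delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    A_t     = delta_t + gamma * lam * A_{t+1}

    `last_value` is the value estimate of the state after the final step
    (0.0 if the episode terminated there).
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


# Hypothetical usage with made-up numbers.
print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))  # [1.75, 1.5, 1.0]
```

With `lam=1.0` and a terminal episode, the advantages reduce to the discounted returns minus the value estimates. With `lam=0.0` they reduce to the one-step TD errors.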
Each algorithm directory contains at least 3 files:
- main.py: Main script to run the algorithm
- hyperparameters.py: File to contain the default hyperparameters
- <algo>.py: Implementation of the algorithm
- utils.py: (Optional) File containing some utility functions
Some algorithm directories may have additional files specific to the algorithm.
To contribute to this package, it is recommended to follow this structure:
- The new algorithm directory should contain at least the 3 files mentioned above.
- main.py should contain at least the following functions:
  - argparse: Parses input arguments and loads default hyperparameters from hyperparameters.py.
  - main: Parses input arguments, builds the environment and agent, and trains the agent.
  - train: Main training loop called by main()
- <algo>.py should contain at least the following methods:
  - __init__: Initializes the class
  - _build_policy: Builds the policy
  - _build_value_function: Builds the value function
  - compute_policy_loss: Builds the policy loss function
  - update_policy: Updates the policy
  - update_value: Updates the value function
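The structure above could be sketched roughly as follows. This is a toy skeleton and not code from the repository: the `DEFAULTS` dictionary stands in for hyperparameters.py, the hyperparameter names are invented, the argument parser is called `parse_args` here (rather than argparse) so that it does not shadow the standard library module, and `ToyAgent` replaces real networks and environment rollouts with scalar placeholders so the example runs with the standard library only.

```python
import argparse
import random

# Hypothetical stand-in for the defaults kept in hyperparameters.py.
DEFAULTS = {"lr": 0.1, "max_iterations": 5, "batch_size": 8, "seed": 0}


def parse_args(argv=None):
    """Expose every default hyperparameter as a command-line option."""
    parser = argparse.ArgumentParser(description="Toy algorithm runner")
    for name, default in DEFAULTS.items():
        parser.add_argument(f"--{name}", type=type(default), default=default)
    return parser.parse_args(argv)


class ToyAgent:
    """Skeleton exposing the methods listed above; the bodies are placeholders."""

    def __init__(self, args):
        self.lr = args.lr
        self._build_policy()
        self._build_value_function()

    def _build_policy(self):
        self.policy_param = 0.0  # a real agent would build a network here

    def _build_value_function(self):
        self.value_param = 0.0  # a real agent would build a network here

    def compute_policy_loss(self, batch):
        # Placeholder loss: squared distance to the batch mean.
        target = sum(batch) / len(batch)
        return (self.policy_param - target) ** 2

    def update_policy(self, batch):
        loss = self.compute_policy_loss(batch)
        target = sum(batch) / len(batch)
        # One gradient-descent step on the placeholder loss.
        self.policy_param -= self.lr * 2.0 * (self.policy_param - target)
        return loss

    def update_value(self, batch):
        target = sum(batch) / len(batch)
        self.value_param -= self.lr * 2.0 * (self.value_param - target)


def train(agent, args):
    """Main training loop, called by main()."""
    rng = random.Random(args.seed)
    losses = []
    for _ in range(args.max_iterations):
        # Random numbers stand in for data collected from an environment.
        batch = [rng.uniform(-1.0, 1.0) for _ in range(args.batch_size)]
        losses.append(agent.update_policy(batch))
        agent.update_value(batch)
    return losses


def main(argv=None):
    """Parse arguments, build the agent, and train it."""
    args = parse_args(argv)
    agent = ToyAgent(args)
    return train(agent, args)
```

Calling `main(["--lr", "0.05", "--max_iterations", "10"])` overrides two of the defaults and leaves the others untouched, which mirrors the idea of changing a run either from the command line or by editing the defaults file.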
- The program was created using Python 3.7
- PyTorch
- Numpy
- Pandas
- TensorBoard
- Seaborn
- Scipy
- Gym
To install the RLBotics toolbox, clone this repository and install the required libraries using the following commands:
git clone https://github.com/dyumanaditya/rlbotics
cd rlbotics
pip install -r requirements.txt
To run an algorithm on a particular environment, open a terminal, navigate to the folder you just cloned, and run the following command:
python3 -m rlbotics.algo.main
where algo can be replaced by the name of the algorithm you wish to use. You can also pass in arguments, or modify the hyperparameters.py file contained in each algorithm folder, to change the environment and other hyperparameters related to the algorithm.
Once the algorithm is running, you can start a TensorBoard session to track the progress.
Distributed under the BSD-3-Clause License. See LICENSE for more information.
Dyuman Aditya - dyuman.aditya@gmail.com
Kousheek Chakraborty - kousheekc@gmail.com
Suman Pal - suman7495@gmail.com