Deep Reinforcement Learning library based on PyTorch, OpenAI Gym, gin-config, and TensorBoard (for visualization and logging). The library is designed for research and fast iteration: it is highly modular to speed up the implementation of new algorithms and shorten iteration cycles.
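Experiments configured with gin-config are typically driven by a plain-text config file of parameter bindings. The fragment below is only an illustrative sketch of gin's binding syntax; the binding names (`optimize.learning_rate`, `train.maxt`) are hypothetical, not this library's actual configurables:

```
# Hypothetical gin bindings -- real names depend on the library's @gin.configurable functions.
optimize.learning_rate = 3e-4
train.maxt = 1000000
```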
Some key abstractions include:
- Flexible data collection and storage for on-policy (rollouts) and off-policy (replay buffer) methods.
- Code for evaluating and recording trained agents, as well as checkpointing and logging experiments.
- Abstract interface for algorithms with a training loop suitable for both RL and supervised learning.
- Two versions of PPO: One with a clipped objective and one with an adaptive KL penalty.
- DQN with Double Q-Learning and Prioritized Experience Replay.
- DDPG
- TD3
- SAC
- AlphaZero
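To illustrate the two PPO variants listed above, here is a per-sample sketch of the standard objectives (a sketch of the published math, not this library's implementation; function names are hypothetical):

```python
def ppo_clip_loss(ratio, advantage, clip_eps=0.2):
    """Clipped surrogate objective, negated so it can be minimized.

    ratio = pi_new(a|s) / pi_old(a|s) for one sample; advantage is its
    estimated advantage. Clipping removes the incentive to move the
    policy ratio outside [1 - clip_eps, 1 + clip_eps].
    """
    clipped_ratio = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps)
    return -min(ratio * advantage, clipped_ratio * advantage)


def ppo_kl_penalty_loss(ratio, advantage, kl, beta=1.0):
    """Adaptive-KL variant: the surrogate minus a KL penalty.

    In the full algorithm, beta is adjusted between updates so the mean
    KL divergence between old and new policies stays near a target.
    """
    return -(ratio * advantage - beta * kl)
```

In a real implementation these operate on tensors of ratios and advantages and are averaged over a minibatch; the scalar version above just shows the objective's shape.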
Examples of how to launch experiments can be found here.
The code can be installed using Docker or pip.
Pip:
- In the top level directory, run
pip install -e .
Docker:
- Install docker.
- Install x-docker, a wrapper around docker for running GUI applications inside a container.
- In the top level directory, build the docker image by running:
./build_docker.sh
- Launch the docker container by running:
./launch_docker.sh
This will start a container and mount the code at /root/pkgs/dl.
- From inside the container, run:
cd /root/pkgs/dl/examples/ppo
(You can replace ppo with another example algorithm.)
- Run:
./train.sh
This will create a log directory and start training with the default environment and hyperparameters. Pressing Ctrl-C will interrupt training and save the current model.
- In another terminal, run:
tensorboard --logdir /path/to/log/directory