RLbench

A simple reinforcement learning benchmark framework

Prerequisites

  • Python 3.7+
  • PyTorch 1.11.0+
  • stable-baselines3 (sb3-contrib) 1.6.0+

Setup

Tested on Ubuntu 20.04 LTS only.

Create the conda environment, then execute setup.sh. This may require sudo privileges; you may be prompted for your sudo password during installation.

conda create -n rlbench python=3.9.7
conda activate rlbench

git clone https://github.com/HRKimLab/RLbench.git
cd RLbench/

sh setup.sh

If you want to use your GPU for training, install a CUDA toolkit that matches your GPU and PyTorch build. You can verify that PyTorch detects the GPU with the snippet below.
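A quick sanity check (plain PyTorch, nothing RLbench-specific):

python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"

If this prints False, training will fall back to the CPU.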

Quick start

After finishing the setup, change directory to src/ and run the pre-defined script with the following command.

sh ../scripts/train_and_plot.sh

Directory structure of data files

Overall structure

LunarLanderContinuous-v2/
├── a1
│   ├── a1s1
│   │   ├── a1s1r1-7
│   │   ├── a1s1r2-42
│   │   └── a1s1r3-53
│   └── a1s2
│       ├── a1s2r1-7
│       ├── a1s2r2-42
│       └── a1s2r3-53
...

CartPole-v1/
├── a1
│   ├── a1s1
...
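A minimal sketch for walking this layout with Python's standard library, assuming the tree above sits in the current directory; the root path is a placeholder, and the trailing number in each run name (e.g. -7, -42) appears to be the random seed:

from pathlib import Path

# Placeholder: point this at the environment directory shown above
root = Path("LunarLanderContinuous-v2")

# Run directories follow the aXsYrZ-SEED pattern seen in the tree above
for run_dir in sorted(root.glob("a*/a*s*/a*s*r*-*")):
    print(run_dir)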

Internal files

...
├── a1s1
│   ├── a1s1r1-0
│   │   ├── 0.monitor.csv   - Learning stats (raw)
│   │   ├── progress.csv    - Learning stats
│   │   ├── best_model.zip  - Best model parameters
│   │   ├── evaluations.npz - Evaluation stats
│   │   └── info.zip        - Other info related to learning

*.monitor.csv is the file used to draw plots. It contains the episode reward, episode length, and elapsed time, while progress.csv holds more detailed information such as the current exploration rate, learning rate, mean episode reward, and so on.
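A minimal loading-and-plotting sketch, assuming the standard stable-baselines3 monitor format (one JSON comment line, then columns r, l, t for episode reward, episode length, and elapsed time); the path is a placeholder:

import pandas as pd
import matplotlib.pyplot as plt

# Placeholder path to one run's raw stats
path = "LunarLanderContinuous-v2/a1/a1s1/a1s1r1-7/0.monitor.csv"

# skiprows=1 skips the '#{"t_start": ...}' header line that SB3 writes
df = pd.read_csv(path, skiprows=1)

# r = episode reward, l = episode length, t = elapsed time (seconds)
df["r"].rolling(20).mean().plot(xlabel="episode", ylabel="reward (20-episode mean)")
plt.show()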

How to use

Training

At src/,

python train.py --env [ENV_NAME] \
    --algo [ALGORITHM_NAME] \
    --hp [CONFIG_PATH] \
    --nseed [NUMBER_OF_EXPS] \
    --nstep [N_TIMESTEPS] \
    --eval_freq [EVAL_FREQ] \
    --eval_eps [N_EVAL_EPISODES]

Example

python train.py --env CartPole-v1 \
    --algo dqn \
    --hp default/dqn \
    --nseed 3 \
    --nstep 100000

For more information, use the --help option:
python train.py --help
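The best_model.zip written to each run directory is a standard stable-baselines3 checkpoint, so it can also be loaded outside this framework. A minimal evaluation sketch, assuming a DQN run on CartPole-v1 and the gym-style step API used by SB3 1.6 (the path is a placeholder):

import gym
from stable_baselines3 import DQN

# Placeholder path to a finished run's best checkpoint
model = DQN.load("CartPole-v1/a1/a1s1/a1s1r1-7/best_model.zip")

env = gym.make("CartPole-v1")
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    total_reward += reward
print("episode reward:", total_reward)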

Train with multiple algorithms and environments

The current implementation only supports running multiple experiments with the same hyperparameters.

Modify the hyperparameters in scripts/run_multiple_trains.py as needed, then run the following command from src/:

python ../scripts/run_multiple_trains.py
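run_multiple_trains.py is not reproduced here; as a rough illustration only, a sweep over train.py could look like the following (the environment/algorithm lists and hyperparameters are placeholders, not the script's actual contents):

import subprocess

# Hypothetical sweep; edit to match your own experiments
ENVS = ["CartPole-v1", "LunarLanderContinuous-v2"]
ALGOS = ["a2c", "ppo"]  # both handle discrete and continuous actions

for env_name in ENVS:
    for algo in ALGOS:
        subprocess.run([
            "python", "train.py",
            "--env", env_name,
            "--algo", algo,
            "--hp", f"default/{algo}",
            "--nseed", "3",
            "--nstep", "100000",
        ], check=True)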

Plotting

At src/,

python plot/plot_mean_combined.py --env [ENV_NAME] \
    --agents [AGENT_LIST] \
    --x [X-AXIS] \
    --y [Y-AXIS]

Example

python plot/plot_mean_combined.py --env LunarLanderContinuous-v2 \
    --agents "[a1s1r1,a2s1r1,a3s1r1,a4s1r1,a5s1r1,a6s1r1,a7s1r1,a8s1r1]" \
    --x timesteps \
    --y rew

For more information, use the --help option:
python plot/plot_mean_combined.py --help

Rendering

At src/,

python render_q_value.py
