HPO and Architecture Benchmarking for RL: Dynamically, Reactive and Efficient

automl/arlbench


🦾 Automated Reinforcement Learning Benchmark

ARLBench is a benchmark for hyperparameter optimization (HPO) in reinforcement learning (RL). It lets you evaluate your HPO methods quickly and on a representative set of environments. For more information, see our documentation. The dataset is available on HuggingFace.

Features

  • Lightning-fast JAX-based implementations of DQN, PPO, and SAC
  • Compatible with many different environment domains via Gymnax, XLand and EnvPool
  • Representative benchmark set of HPO settings

[Figure: ARLBench subsets]

Installation

There are currently two ways to install ARLBench: from PyPI or from source. Whichever you choose, we recommend creating a virtual environment for the installation:

conda create -n arlbench python=3.10
conda activate arlbench

The instructions below install the default version of ARLBench with the CPU version of JAX. If you want to run ARLBench on a GPU, we recommend checking the JAX installation guide for the correct version for your GPU setup before proceeding.

PyPI

You can install ARLBench using `pip`:

pip install arlbench

If you want to use envpool environments (not currently supported for Mac!), instead choose:

pip install arlbench[envpool]

From source: GitHub

First, you need to clone the ARLBench repository:

git clone git@github.com:automl/arlbench.git
cd arlbench

Then you can install the benchmark. For the base version, use:

make install

For the envpool functionality (not available on Mac!), instead use:

make install-envpool

Caution

Windows is currently neither supported nor tested. If you are on a Windows machine, we recommend using the Windows Subsystem for Linux (WSL).

Quickstart

Here are the two ways you can use ARLBench: via the command line or as an environment. To see them in action, take a look at our examples.

Use the CLI

We provide a command line script for black-box configuration in ARLBench, which also saves the results in a 'results' directory. To execute one run of DQN on CartPole, simply run:

python run_arlbench.py

You can use Hydra's command line syntax to override parts of the configuration, for example to switch to PPO:

python run_arlbench.py algorithm=ppo

Or run several seeds one after another:

python run_arlbench.py -m autorl.seed=0,1,2,3,4

All hyperparameters to adapt are in the 'hp_config' and all architecture settings are in the 'nas_config', so to run a grid of different configurations for 5 seeds each, you can do this:

python run_arlbench.py -m autorl.seed=0,1,2,3,4 nas_config.hidden_size=8,16,32 hp_config.learning_rate=0.001,0.01
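To get a feel for how large this sweep is, the following minimal sketch enumerates the grid in plain Python. It assumes Hydra's default sweeper, which expands comma-separated values into their Cartesian product. The helper `expand_grid` is hypothetical and not part of ARLBench or Hydra, and the order in which Hydra actually launches the jobs may differ.

```python
from itertools import product

# Sweep values copied from the multirun command above.
sweep = {
    "autorl.seed": [0, 1, 2, 3, 4],
    "nas_config.hidden_size": [8, 16, 32],
    "hp_config.learning_rate": [0.001, 0.01],
}


def expand_grid(sweep):
    """Return one list of `key=value` overrides per point of the grid.

    Hypothetical helper for illustration only: it mimics a Cartesian-product
    sweep and does not call Hydra.
    """
    keys = list(sweep)
    return [
        [f"{key}={value}" for key, value in zip(keys, values)]
        for values in product(*(sweep[key] for key in keys))
    ]


jobs = expand_grid(sweep)
print(len(jobs))  # 5 seeds * 3 hidden sizes * 2 learning rates = 30
print("python run_arlbench.py " + " ".join(jobs[0]))
```

The grid above therefore corresponds to 30 individual runs, which is worth keeping in mind before adding further swept values.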

We recommend creating your own custom config files if you use the CLI (for more information, check out Hydra's guide to config files). Our examples show what these can look like.

Use the AutoRL environment

If you want fine-grained control over the ARLBench loop, want to configure dynamically, or want to learn based on the agent state, use the environment-like interface of ARLBench in your script.

To do so, import ARLBench and use the AutoRLEnv to run an RL agent:

from arlbench import AutoRLEnv

env = AutoRLEnv()

obs, info = env.reset()

action = env.config_space.sample_configuration()
obs, objectives, term, trunc, info = env.step(action)

Just like with RL agents, you can call 'step' multiple times until termination (which you define via the AutoRLEnv's config). For all configuration options, check out our documentation.
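A loop over 'step' could look roughly like the sketch below. It relies only on the calls shown above (`reset`, `config_space.sample_configuration` and `step` with its five return values). The function name `run_autorl_episode` and the `max_steps` cap are hypothetical and not part of ARLBench. Random sampling stands in for whatever HPO method you actually want to evaluate.

```python
def run_autorl_episode(env, max_steps=10):
    """Step an AutoRLEnv-like object with random configurations until it ends.

    Assumptions: `env` exposes reset(), step() and config_space as in the
    snippet above. `max_steps` is a hypothetical safety cap for this sketch,
    not an ARLBench setting.
    """
    obs, info = env.reset()
    history = []
    for _ in range(max_steps):
        # Replace random sampling with your own (possibly state-dependent) method.
        action = env.config_space.sample_configuration()
        obs, objectives, term, trunc, info = env.step(action)
        history.append((action, objectives))
        if term or trunc:
            break
    return history
```

With ARLBench installed, `history = run_autorl_episode(AutoRLEnv())` would collect one `(configuration, objectives)` pair per step, which you can then feed to your optimizer.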

Cite Us

If you use ARLBench in your work, please cite us:

@misc{beckdierkes24,
      title={ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning}, 
      author={J. Becktepe and J. Dierkes and C. Benjamins and D. Salinas and A. Mohan and R. Rajan and F. Hutter and H. Hoos and M. Lindauer and T. Eimer},
      year={2024},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2409.18827},
      note={GitHub: https://github.com/automl/arlbench}, 
}
