Merge pull request #80 from rdnfn/dev/general
v0.5.0
rdnfn authored May 26, 2022
2 parents 008a004 + 7af3b97 commit c51e9a6
Showing 49 changed files with 951 additions and 357 deletions.
3 changes: 1 addition & 2 deletions .gitignore
@@ -119,6 +119,5 @@ docs/generated/
#beobench
beobench_results*
notebooks/archive
*beo.yaml
*beo.yml
.beobench.yml
perf_tests*
22 changes: 22 additions & 0 deletions CITATION.cff
@@ -0,0 +1,22 @@
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: >-
  Beobench: A Toolkit for Unified Access to Building
  Simulations for Reinforcement Learning
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
version: 0.5.0
url: https://github.com/rdnfn/beobench
authors:
  - given-names: Arduin
    family-names: Findeis
  - given-names: Fiodar
    family-names: Kazhamiaka
  - given-names: Scott
    family-names: Jeen
  - given-names: Srinivasan
    family-names: Keshav
7 changes: 7 additions & 0 deletions CONTRIBUTING.rst
@@ -42,6 +42,13 @@ beobench could always use more documentation, whether as part of the
official beobench docs, in docstrings, or even on the web in blog posts,
articles, and such.

To update the API docs, use the following command inside the ``/docs`` directory:

.. code-block::

    sphinx-apidoc -f -o . ..

Submit Feedback
~~~~~~~~~~~~~~~

23 changes: 23 additions & 0 deletions HISTORY.rst
@@ -2,6 +2,29 @@
History
=======

0.5.0 (2022-05-26)
------------------

* Features:

  * Mean and cumulative metrics can now be logged by the WandbLogger wrapper.
  * Support for automatically running multiple samples/trials of the same experiment via the ``num_samples`` config parameter.
  * Configs named ``.beobench.yml`` are automatically parsed when Beobench is run in a directory containing such a config. This allows users to set e.g. wandb API keys without referring to the config in every Beobench command call (see the sketch below this list).
  * Configs from experiments now specify the Beobench version used. When rerunning an experiment, this version is checked and an error is thrown if the installed and requested versions do not match.
  * Add improved high-level API for getting started. This uses the CLI arguments ``--method``, ``--gym`` and ``--env``. Example usage: ``beobench run --method ppo --gym sinergym --env Eplus-5Zone-hot-continuous-v1``.
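
A minimal sketch of how these config features might be used together, shown here in the Python-dictionary form of a Beobench config (which could equally be stored as a ``.beobench.yml`` file in the working directory). The ``wandb_api_key`` field name is an assumption for illustration only; ``wandb_project`` and ``num_samples`` follow the config structure used in the baseline configs added in this commit.

.. code-block:: python

    # Python-dictionary form of a minimal user config; the same keys could live
    # in a .beobench.yml file in the working directory so that Beobench picks
    # them up automatically (per the feature described above).
    user_config = {
        "general": {
            "wandb_project": "my_project",  # illustrative project name
            "wandb_api_key": "<your-key>",  # assumed key name, for illustration
            "num_samples": 3,  # run three samples/trials of the experiment
        },
    }
    # Such a config would typically be combined with env/agent settings,
    # e.g. those passed via the CLI arguments mentioned in the last item above.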

* Improvements

  * Add ``CITATION.cff`` file to make citing the software easier.
  * By default, docker builds of experiment images are now skipped if an image with a tag corresponding to the installed Beobench version already exists.
  * Remove outdated guides from the docs and add a yaml configuration description.
  * Add support for logging multidimensional actions to wandb.
  * Add support for logging summary metrics on every env reset to wandb.

* Fixes

  * Updated BOPTEST integration to work with the current version of Beobench.

0.4.4 (2022-05-09)
------------------

4 changes: 3 additions & 1 deletion PYPI_README.rst
@@ -1,3 +1,5 @@
A toolbox for benchmarking reinforcement learning (RL) algorithms on building energy optimisation (BEO) problems. Beobench tries to make working on RL for BEO easier: it provides simple access to existing libraries defining BEO problems (such as `BOPTEST <https://github.com/ibpsa/project1-boptest>`_) and provides a large set of pre-configured RL algorithms. Beobench is *not* a gym library itself - instead it leverages the brilliant work done by many existing gym-type projects and makes their work more easily accessible.
A toolkit providing easy and unified access to building control environments for reinforcement learning (RL). Compared to other domains, `RL environments for building control <https://github.com/rdnfn/rl-building-control#environments>`_ tend to be more difficult to install and handle. Most environments require the user to either manually install a building simulator (e.g. `EnergyPlus <https://github.com/NREL/EnergyPlus>`_) or to manually manage Docker containers. This can be tedious.

Beobench was created to make building control environments easier to use and experiments more reproducible. Beobench uses Docker to manage all environment dependencies in the background so that the user doesn't have to. A standardised API allows the user to easily configure experiments and evaluate new RL agents on building control environments.

For more information go to the `documentation <https://beobench.readthedocs.io/>`_ and the `GitHub code repository <https://github.com/rdnfn/beobench>`_.
24 changes: 19 additions & 5 deletions README.rst
@@ -24,7 +24,7 @@
    :target: https://opensource.org/licenses/MIT
    :alt: License

A toolkit providing easy and unified access to building control environments for reinforcement learning (RL). Compared to other domains, `RL environments for building control <https://github.com/rdnfn/rl-building-control#environments>`_ tend to be more difficult to install and handle. Most environments require the user to either manually install a building simulator (e.g. `EnergyPlus <https://github.com/NREL/EnergyPlus>`_) or to manually manage Docker containers. This is tedious.
A toolkit providing easy and unified access to building control environments for reinforcement learning (RL). Compared to other domains, `RL environments for building control <https://github.com/rdnfn/rl-building-control#environments>`_ tend to be more difficult to install and handle. Most environments require the user to either manually install a building simulator (e.g. `EnergyPlus <https://github.com/NREL/EnergyPlus>`_) or to manually manage Docker containers. This can be tedious.

Beobench was created to make building control environments easier to use and experiments more reproducible. Beobench uses Docker to manage all environment dependencies in the background so that the user doesn't have to. A standardised API, illustrated in the figure below, allows the user to easily configure experiments and evaluate new RL agents on building control environments.

@@ -66,7 +66,7 @@ Installation
------------

1. `Install docker <https://docs.docker.com/get-docker/>`_ on your machine (if on Linux, check the `additional installation steps <https://beobench.readthedocs.io/en/latest/guides/installation_linux.html>`_)
2. Install *beobench* using:
2. Install Beobench using:

.. code-block:: console
@@ -94,9 +94,23 @@ Experiment configuration

To get started with our first experiment, we set up an *experiment configuration*.
Experiment configurations
can be given as a yaml file or a Python dictionary. Such a configuration
can be given as a yaml file or a Python dictionary. The configuration
fully defines an experiment, configuring everything
from the RL agent to the environment and its wrappers.
from the RL agent to the environment and its wrappers. The figure below illustrates the config structure.

.. raw:: html

    <p align="center">

.. image:: https://github.com/rdnfn/beobench/raw/2cf961a8135b25c9a66e70d67eea9890ce0b878a/docs/_static/beobench_config_v1.png
    :align: center
    :width: 350 px
    :alt: Beobench

.. raw:: html

    </p>


Let's look at a concrete example. Consider this ``config.yaml`` file:

@@ -174,7 +188,7 @@ Execution

.. end-qs-sec3
Given the configuration and agent script above, we can run the experiment using either via the command line:
Given the configuration and agent script above, we can run the experiment either via the command line:

.. code-block:: console
2 changes: 1 addition & 1 deletion beobench/__init__.py
@@ -2,7 +2,7 @@

__author__ = """Beobench authors"""
__email__ = "-"
__version__ = "0.4.4"
__version__ = "0.5.0"

from beobench.utils import restart
from beobench.experiment.scheduler import run
2 changes: 1 addition & 1 deletion beobench/beobench_contrib
14 changes: 14 additions & 0 deletions beobench/cli.py
@@ -25,6 +25,11 @@ def cli():
    default=None,
    help="Name of RL method to use in experiment.",
)
@click.option(
    "--gym",
    default=None,
    help="Name of gym framework to use in experiment.",
)
@click.option(
    "--env",
    default=None,
@@ -82,9 +87,15 @@ def cli():
    default=None,
    help="For developer use only: location of custom beobench package version.",
)
@click.option(
    "--force-build",
    is_flag=True,
    help="whether to force a re-build, even if image already exists.",
)
def run(
    config: str,
    method: str,
    gym: str,
    env: str,
    local_dir: str,
    wandb_project: str,
@@ -96,6 +107,7 @@
    no_additional_container: bool,
    use_no_cache: bool,
    dev_path: str,
    force_build: bool,
) -> None:
    """Run beobench experiment from command line.
@@ -110,6 +122,7 @@
    beobench.experiment.scheduler.run(
        config=list(config),
        method=method,
        gym=gym,
        env=env,
        local_dir=local_dir,
        wandb_project=wandb_project,
@@ -121,6 +134,7 @@
        no_additional_container=no_additional_container,
        use_no_cache=use_no_cache,
        dev_path=dev_path,
        force_build=force_build,
    )


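The new ``--gym`` and ``--force-build`` options above are forwarded to the same-named parameters of ``beobench.experiment.scheduler.run``, which the ``__init__.py`` change in this commit re-exports as ``beobench.run``. A rough Python-side equivalent of the example CLI call from the changelog (the method, gym, and environment names are just those example values) might therefore look like this:

.. code-block:: python

    import beobench

    # Rough Python equivalent of the CLI call
    #   beobench run --method ppo --gym sinergym \
    #       --env Eplus-5Zone-hot-continuous-v1 --force-build
    beobench.run(
        method="ppo",
        gym="sinergym",
        env="Eplus-5Zone-hot-continuous-v1",
        force_build=True,  # rebuild the experiment image even if it already exists
    )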
2 changes: 2 additions & 0 deletions beobench/constants.py
@@ -2,6 +2,8 @@

import pathlib

USER_CONFIG_PATH = pathlib.Path("./.beobench.yml")

# available gym-framework integrations
AVAILABLE_INTEGRATIONS = [
    "boptest",
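The new ``USER_CONFIG_PATH`` constant points at the ``.beobench.yml`` user config that is now parsed automatically (see the changelog above). The helper below is a hypothetical illustration of that idea, not the actual Beobench implementation:

.. code-block:: python

    import pathlib

    import yaml

    USER_CONFIG_PATH = pathlib.Path("./.beobench.yml")

    def load_user_config() -> dict:
        """Return the working-directory user config if present, else an empty dict.

        Illustrative sketch only; the real parsing and merging happens inside Beobench.
        """
        if USER_CONFIG_PATH.is_file():
            with open(USER_CONFIG_PATH, encoding="utf-8") as config_file:
                return yaml.safe_load(config_file) or {}
        return {}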
14 changes: 14 additions & 0 deletions beobench/data/agents/random_action.py
@@ -28,10 +28,24 @@
except KeyError:
    horizon = 1000

try:
    imitate_rllib_env_checks = config["agent"]["config"]["imitate_rllib_env_checks"]
except KeyError:
    imitate_rllib_env_checks = False


print("Random agent: starting test.")

env = create_env()

if imitate_rllib_env_checks:
    # RLlib appears to reset and take single action in env
    # this may be to check compliance of env with space etc.
    env.reset()
    action = env.action_space.sample()
    _, _, _, _ = env.step(action)


observation = env.reset()

num_steps_per_ep = 0
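The new flag is read from ``config["agent"]["config"]``, so a config fragment enabling it might look like the following sketch (it simply mirrors, in Python-dictionary form, the agent section of the random-action baseline yaml further down in this commit):

.. code-block:: python

    # Hypothetical agent-config fragment enabling the RLlib-style env checks
    # for the random-action agent (mirrors the baseline yaml below).
    agent_config = {
        "agent": {
            "origin": "random_action",
            "config": {
                "config": {"horizon": 96},
                "stop": {"timesteps_total": 10000},
                "imitate_rllib_env_checks": True,
            },
        },
    }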
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
72 changes: 72 additions & 0 deletions beobench/data/configs/baselines/boptest_arroyo2022_dqn.yaml
@@ -0,0 +1,72 @@
# A first attempt at reproduction of experiments in the following paper by Arroyo et al.
# https://lirias.kuleuven.be/retrieve/658452
#
# Some of the descriptions of RLlib config values are taken from
# https://docs.ray.io/en/latest/rllib/rllib-training.html
# other from
# https://github.com/ibpsa/project1-boptest-gym/blob/master/boptestGymEnv.py

env:
  gym: boptest
  config:
    name: bestest_hydronic_heat_pump
    # whether to normalise the observations and actions
    normalize: True
    discretize: True
    gym_kwargs:
      actions: ["oveHeaPumY_u"]
      # Dictionary mapping observation keys to a tuple with the lower
      # and upper bound of each observation. Observation keys must
      # belong either to the set of measurements or to the set of
      # forecasting variables of the BOPTEST test case. Contrary to
      # the actions, the expected minimum and maximum values of the
      # measurement and forecasting variables are not provided from
      # the BOPTEST framework, although they are still relevant here
      # e.g. for normalization or discretization. Therefore, these
      # bounds need to be provided by the user.
      # If `time` is included as an observation, the time in seconds
      # will be passed to the agent. This is the remainder time from
      # the beginning of the episode and for periods of the length
      # specified in the upper bound of the time feature.
      observations:
        reaTZon_y: [280.0, 310.0]
      # Set to True if desired to use a random start time for each episode
      random_start_time: True
      # Maximum duration of each episode in seconds
      max_episode_length: 31536000 # one year in seconds
      # Desired simulation period to initialize each episode
      warmup_period: 10
      # Sampling time in seconds
      step_period: 900 # = 15min
agent:
  origin: rllib
  config:
    run_or_experiment: DQN
    config:
      lr: 0.0001
      gamma: 0.99
      # Number of steps after which the episode is forced to terminate. Defaults
      # to `env.spec.max_episode_steps` (if present) for Gym envs.
      horizon: 24 # one week 672 = 96 * 7 # other previous values: 96 # 10000 #
      # Calculate rewards but don't reset the environment when the horizon is
      # hit. This allows value estimation and RNN state to span across logical
      # episodes denoted by horizon. This only has an effect if horizon != inf.
      soft_horizon: True
      num_workers: 1 # this is required, otherwise effectively assuming simulator.
      # Training batch size, if applicable. Should be >= rollout_fragment_length.
      # Samples batches will be concatenated together to a batch of this size,
      # which is then passed to SGD.
      train_batch_size: 24
    stop:
      timesteps_total: 105120 # = 3 years # 35040 # = 365 * 96 (full year)
wrappers:
  - origin: general
    class: WandbLogger
    config:
      log_freq: 1
      summary_metric_keys:
        - env.returns.reward
general:
  wandb_project: boptest_arroyo2022_baseline
  wandb_group: random_action
  num_samples: 1
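Assuming the scheduler accepts a path to a yaml config, as the CLI's ``--config`` option suggests, a baseline config like the one above could be launched roughly as follows (the path simply mirrors this file's location in the repository):

.. code-block:: python

    import beobench

    # Launch the DQN baseline experiment defined in the config above.
    beobench.run(
        config="beobench/data/configs/baselines/boptest_arroyo2022_dqn.yaml",
    )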
@@ -0,0 +1,59 @@
# A first attempt at reproduction of experiments in the following paper by Arroyo et al.
# https://lirias.kuleuven.be/retrieve/658452
#
# Some of the descriptions of RLlib config values are taken from
# https://docs.ray.io/en/latest/rllib/rllib-training.html
# other from
# https://github.com/ibpsa/project1-boptest-gym/blob/master/boptestGymEnv.py

env:
  gym: boptest
  name: bestest_hydronic_heat_pump
  config:
    boptest_testcase: bestest_hydronic_heat_pump
    # whether to normalise the observations and actions
    normalize: True
    gym_kwargs:
      actions: ["oveHeaPumY_u"]
      # Dictionary mapping observation keys to a tuple with the lower
      # and upper bound of each observation. Observation keys must
      # belong either to the set of measurements or to the set of
      # forecasting variables of the BOPTEST test case. Contrary to
      # the actions, the expected minimum and maximum values of the
      # measurement and forecasting variables are not provided from
      # the BOPTEST framework, although they are still relevant here
      # e.g. for normalization or discretization. Therefore, these
      # bounds need to be provided by the user.
      # If `time` is included as an observation, the time in seconds
      # will be passed to the agent. This is the remainder time from
      # the beginning of the episode and for periods of the length
      # specified in the upper bound of the time feature.
      observations:
        reaTZon_y: [280.0, 310.0]
      # Set to True if desired to use a random start time for each episode
      random_start_time: True
      # Maximum duration of each episode in seconds
      max_episode_length: 31536000 # one year in seconds
      # Desired simulation period to initialize each episode
      warmup_period: 10
      # Sampling time in seconds
      step_period: 900 # = 15min
agent:
  origin: random_action
  config:
    config:
      horizon: 96
    stop:
      timesteps_total: 10000
    imitate_rllib_env_checks: True
wrappers:
  - origin: general
    class: WandbLogger
    config:
      log_freq: 1
      summary_metric_keys:
        - env.returns.reward
general:
  wandb_project: boptest_arroyo2022_baseline
  wandb_group: random_action
  num_samples: 1