Usage – Configuration


This page introduces the various configuration files used by the different tasks that can be performed within the recommerce framework.

All configuration in the recommerce framework is done through `.json` files. These files must be located in the `configuration_files` folder of the user's datapath (link) to be discoverable by the framework.

The framework itself offers a number of default files which can be used as-is or modified as needed. After installation, the default files can be extracted to the user's datapath by executing one of the following commands:

```
recommerce --get-defaults
```

This command extracts the default files to a folder called `default_data` in the user's datapath. Aside from the default configuration files, the folder also contains a number of pre-trained models for various RL algorithms, though these are not necessarily well-performing and are mostly intended for demonstration purposes. The other available command is:

```
recommerce --get-defaults-unpack
```

This command extracts the default files and unpacks them into the folders `configuration_files` and `data` in the user's datapath, respectively.

Please note that the `--get-defaults-unpack` command will overwrite any existing files with identical names in the user's datapath.

The following sections introduce all types of configuration files and the parameters that can be set through them.

environment_config

The `environment_config` file contains the parameters at the highest level of the simulation, including, among others, the definition of the task that should be performed. Currently, there are three possible tasks, each with their own `environment_config` file, detailed below. The following parameters are common to all tasks; only the `agent_monitoring` task requires additional parameters.

| Key | Type | Description |
|---|---|---|
| `task` | string; one of `"training"`, `"exampleprinter"` or `"agent_monitoring"` | The task that should be performed. |
| `marketplace` | A valid recommerce marketplace class, as string | The marketplace class that should be used for the simulation. |
| `agents` | list of dictionaries | The agents that should be playing on the selected marketplace. See below for the format of the dictionaries. |

As mentioned above, the `agents` key should have a list of dictionaries as its value. Each dictionary should have the following structure; a complete example file is shown below the table:

| Key | Type | Description |
|---|---|---|
| `name` | string | The name that should be used to identify the agent. |
| `agent_class` | A valid recommerce agent class, as string | The class of the agent. Must fit the marketplace class. |
| `argument` | string or list | Depending on the chosen `agent_class` and task, an argument may be needed. For trained RL-agents, this is the name of the model file that should be used. For FixedPrice-agents, this is a list containing the fixed prices that should be used. If no argument is needed, this must be the empty string. |
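
To make the structure concrete, here is a minimal sketch of an `environment_config` for the training task. The class paths are placeholders modeled on the framework's module layout and may differ in your installation; substitute class names that are valid for your setup:

```json
{
    "task": "training",
    "marketplace": "recommerce.market.circular.circular_sim_market.CircularEconomyRebuyPriceDuopoly",
    "agents": [
        {
            "name": "QLearning Agent",
            "agent_class": "recommerce.rl.q_learning.q_learning_agent.QLearningAgent",
            "argument": ""
        }
    ]
}
```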

environment_config_agent_monitoring

The `agent_monitoring` task requires the following parameters in addition to the parameters common to all tasks:

| Key | Type | Description |
|---|---|---|
| `episodes` | int | The number of episodes that should be simulated. |
| `plot_interval` | int | The interval at which the plots should be updated. Planned to be removed in the future, see this issue. |
| `separate_markets` | bool | If true, the agents play on separate but identical markets. If false, the agents play on the same market against each other. |
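
Putting the common and task-specific keys together, a hypothetical `environment_config_agent_monitoring` might look like this; the class paths are again placeholders and the parameter values are only examples:

```json
{
    "task": "agent_monitoring",
    "marketplace": "recommerce.market.circular.circular_sim_market.CircularEconomyRebuyPriceDuopoly",
    "agents": [
        {
            "name": "Rule Based Agent",
            "agent_class": "recommerce.market.circular.circular_vendors.RuleBasedCERebuyAgent",
            "argument": ""
        }
    ],
    "episodes": 500,
    "plot_interval": 50,
    "separate_markets": false
}
```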

market_config

The `market_config` contains parameters that influence how the market simulation itself plays out. Circular Economy marketplaces require the following parameters:

| Key | Type | Description |
|---|---|---|
| `max_storage` | int | The maximum number of items a vendor can have in storage. This also influences the policies of many rule-based vendors. |
| `episode_length` | int | The number of steps in an episode. |
| `max_price` | int | The maximum price vendors can set for any price channel. |
| `number_of_customers` | int | The number of customers that make purchasing decisions each step. |
| `production_price` | int | The cost of producing a new product. |
| `storage_cost_per_product` | float | The cost of storing one item in inventory for one episode. |
| `opposite_own_state_visibility` | bool | Whether or not vendors know the amount of items their competitors have in storage. |
| `common_state_visibility` | bool | Whether or not vendors know the amount of items in circulation. |
| `reward_mixed_profit_and_difference` | bool | A toggle for the way an agent's reward is calculated. If true, the reward depends not only on the agent's own profits, but also on the profits of its competitors. |
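
As a sketch, a `market_config` for a Circular Economy marketplace could look as follows; the values are illustrative, not recommended settings:

```json
{
    "max_storage": 100,
    "episode_length": 50,
    "max_price": 10,
    "number_of_customers": 20,
    "production_price": 3,
    "storage_cost_per_product": 0.1,
    "opposite_own_state_visibility": false,
    "common_state_visibility": true,
    "reward_mixed_profit_and_difference": false
}
```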

market_config for Linear Markets

In addition to the parameters required by the Circular Economy market, the Linear Markets also require the following parameter:

| Key | Type | Description |
|---|---|---|
| `max_quality` | int | The maximum quality a product can be (randomly) assigned at the start of an episode. |
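
A `market_config` for a Linear market therefore contains all keys from the Circular Economy sketch above plus `max_quality`, for example (again with illustrative values):

```json
{
    "max_storage": 100,
    "episode_length": 50,
    "max_price": 10,
    "number_of_customers": 20,
    "production_price": 3,
    "storage_cost_per_product": 0.1,
    "opposite_own_state_visibility": false,
    "common_state_visibility": true,
    "reward_mixed_profit_and_difference": false,
    "max_quality": 50
}
```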

TODO: If there are parameters exclusive to Circular Economies, list them here

rl_config

The `rl_config` file contains parameters used to configure Reinforcement Learning agents for the training task. Each RL algorithm requires different parameters, listed in the sections below. The sections are named after the respective class names of the agents in the framework.

TODO: Find out if the config is used for something else than training

QLearningAgent

The QLearningAgent is an algorithm implemented by the project team. It requires the `rl_config` to contain the following parameters:

| Key | Type | Description |
|---|---|---|
| `gamma` | float | The discount factor for future rewards. |
| `batch_size` | int | The number of transitions sampled from the replay buffer per training step. |
| `replay_size` | int | The maximum number of transitions stored in the replay buffer. |
| `learning_rate` | float | The learning rate of the optimizer. |
| `sync_target_frames` | int | The number of frames between synchronizations of the target network with the training network. |
| `replay_start_size` | int | The number of frames to collect before training starts. |
| `epsilon_decay_last_frame` | int | The frame at which epsilon reaches its final value. |
| `epsilon_start` | float | The initial value of epsilon for the epsilon-greedy policy. |
| `epsilon_final` | float | The final value of epsilon for the epsilon-greedy policy. |
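
An `rl_config` for the QLearningAgent might then look like this sketch; the values are illustrative choices for a DQN-style setup, not tuned recommendations:

```json
{
    "gamma": 0.99,
    "batch_size": 32,
    "replay_size": 100000,
    "learning_rate": 0.0001,
    "sync_target_frames": 1000,
    "replay_start_size": 10000,
    "epsilon_decay_last_frame": 75000,
    "epsilon_start": 1.0,
    "epsilon_final": 0.1
}
```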

ActorCriticAgent

There are two ActorCriticAgents implemented by the project team: the `DiscreteActorCriticAgent` and the `ContinuousActorCriticAgent`. Both require their `rl_config` to contain the following parameters:

| Key | Type | Description |
|---|---|---|
| `gamma` | float | The discount factor for future rewards. |
| `sync_target_frames` | int | The number of frames between synchronizations of the target network with the training network. |
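
A matching `rl_config` sketch, with illustrative values:

```json
{
    "gamma": 0.99,
    "sync_target_frames": 1000
}
```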

StableBaselinesA2C

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:

| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | The learning rate of the optimizer. |
| `n_steps` | int | The number of steps to run for each environment per update. |
| `gamma` | float | The discount factor for future rewards. |
| `neurones_per_hidden_layer` | int | The number of neurons in each hidden layer of the policy network. |
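
A sketch of a matching `rl_config`; the first three values mirror StableBaselines3's documented defaults for A2C, while `neurones_per_hidden_layer` is framework-specific and its value here is only an assumption:

```json
{
    "learning_rate": 0.0007,
    "n_steps": 5,
    "gamma": 0.99,
    "neurones_per_hidden_layer": 64
}
```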

StableBaselinesDDPG

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:

| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | The learning rate of the optimizer. |
| `buffer_size` | int | The size of the replay buffer. |
| `learning_starts` | int | The number of steps to collect before learning starts. |
| `batch_size` | int | The minibatch size for each gradient update. |
| `tau` | float | The soft update coefficient for the target networks. |
| `gamma` | float | The discount factor for future rewards. |
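
A sketch of a matching `rl_config`, with values mirroring StableBaselines3's documented defaults for DDPG:

```json
{
    "learning_rate": 0.001,
    "buffer_size": 1000000,
    "learning_starts": 100,
    "batch_size": 100,
    "tau": 0.005,
    "gamma": 0.99
}
```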

StableBaselinesPPO

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:

| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | The learning rate of the optimizer. |
| `n_steps` | int | The number of steps to run for each environment per update. |
| `batch_size` | int | The minibatch size for each gradient update. |
| `n_epochs` | int | The number of epochs when optimizing the surrogate loss. |
| `gamma` | float | The discount factor for future rewards. |
| `clip_range` | float | The clipping parameter for the PPO objective. |
| `neurones_per_hidden_layer` | int | The number of neurons in each hidden layer of the policy network. |
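
A sketch of a matching `rl_config`; all values except the framework-specific `neurones_per_hidden_layer` (whose value here is only an assumption) mirror StableBaselines3's documented defaults for PPO:

```json
{
    "learning_rate": 0.0003,
    "n_steps": 2048,
    "batch_size": 64,
    "n_epochs": 10,
    "gamma": 0.99,
    "clip_range": 0.2,
    "neurones_per_hidden_layer": 64
}
```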

StableBaselinesSAC

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:

| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | The learning rate of the optimizer. |
| `buffer_size` | int | The size of the replay buffer. |
| `learning_starts` | int | The number of steps to collect before learning starts. |
| `batch_size` | int | The minibatch size for each gradient update. |
| `tau` | float | The soft update coefficient for the target networks. |
| `gamma` | float | The discount factor for future rewards. |
| `ent_coef` | float or str | The entropy regularization coefficient; set to `"auto"` to have it learned automatically. |
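
A sketch of a matching `rl_config`, with values mirroring StableBaselines3's documented defaults for SAC:

```json
{
    "learning_rate": 0.0003,
    "buffer_size": 1000000,
    "learning_starts": 100,
    "batch_size": 256,
    "tau": 0.005,
    "gamma": 0.99,
    "ent_coef": "auto"
}
```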

StableBaselinesTD3

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:

| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | The learning rate of the optimizer. |
| `buffer_size` | int | The size of the replay buffer. |
| `learning_starts` | int | The number of steps to collect before learning starts. |
| `batch_size` | int | The minibatch size for each gradient update. |
| `tau` | float | The soft update coefficient for the target networks. |
| `gamma` | float | The discount factor for future rewards. |
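
A sketch of a matching `rl_config`, with values mirroring StableBaselines3's documented defaults for TD3:

```json
{
    "learning_rate": 0.001,
    "buffer_size": 1000000,
    "learning_starts": 100,
    "batch_size": 100,
    "tau": 0.005,
    "gamma": 0.99
}
```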