
Usage – Configuration

Nikkel Mollenhauer edited this page Jul 19, 2022 · 24 revisions

This page will introduce the various configuration files used by the different tasks that can be performed within the recommerce-framework.

All configuration in the recommerce-framework is done through .json files. These files are located in the configuration_files folder of the user's datapath. The following sections introduce all types of configuration files and the parameters that can be set through them.

environment_config

The environment_config file contains the parameters at the highest level of the simulation, including, among other things, the definition of the task that should be performed. Currently, there are three possible tasks, each with their own environment_config file, detailed below. However, the following parameters are common to all tasks, and only the agent_monitoring task requires additional parameters.

| Key | Value | Description |
| --- | --- | --- |
| `task` | `training`, `exampleprinter` or `agent_monitoring` | The task that should be performed. |
| `marketplace` | A valid recommerce marketplace class, as a string | The marketplace class that should be used for the simulation. |
| `agents` | List of dictionaries | The agents that should play on the selected marketplace. See below for the format of the dictionaries. |

As mentioned above, the agents key should have a list of dictionaries as its value. Each dictionary should have the following structure:

| Key | Value | Description |
| --- | --- | --- |
| `name` | string | The name used to identify the agent. |
| `agent_class` | A valid recommerce agent class, as a string | The class of the agent. Must fit the marketplace class. |
| `argument` | string or list | Depending on the chosen `agent_class` and task, an argument may be needed. For trained RL-agents, this is the name of the model file that should be used. For FixedPrice-agents, this is a list containing the fixed prices that should be used. If no argument is needed, this must be the empty string. |
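Putting these keys together, a minimal environment_config for the training task might look like the following sketch. The class paths and values are hypothetical placeholders chosen for illustration; consult the framework's class list for the actual names.

```json
{
    "task": "training",
    "marketplace": "recommerce.market.circular.circular_sim_market.CircularEconomyRebuyPriceDuopoly",
    "agents": [
        {
            "name": "QLearning Agent",
            "agent_class": "recommerce.rl.q_learning.q_learning_agent.QLearningAgent",
            "argument": ""
        }
    ]
}
```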

environment_config_agent_monitoring

The agent_monitoring task requires the following parameters in addition to the parameters common to all tasks:

| Key | Value | Description |
| --- | --- | --- |
| `episodes` | int | The number of episodes that should be simulated. |
| `plot_interval` | int | The interval at which the plots are updated. Slated for removal in the future, see this issue. |
| `separate_markets` | bool | If `true`, the agents play on separate but identical markets. If `false`, the agents play on the same market, against each other. |
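An agent_monitoring configuration combines the common keys with the three keys above. The following sketch uses hypothetical class paths and illustrative values, not verified names from the framework:

```json
{
    "task": "agent_monitoring",
    "episodes": 500,
    "plot_interval": 50,
    "separate_markets": false,
    "marketplace": "recommerce.market.circular.circular_sim_market.CircularEconomyRebuyPriceDuopoly",
    "agents": [
        {
            "name": "Trained Agent",
            "agent_class": "recommerce.rl.q_learning.q_learning_agent.QLearningAgent",
            "argument": "my_trained_model.dat"
        },
        {
            "name": "Rule-Based Competitor",
            "agent_class": "recommerce.market.circular.circular_vendors.RuleBasedCERebuyAgent",
            "argument": ""
        }
    ]
}
```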

market_config

The market_config contains parameters that influence how the market simulation itself plays out. Circular Economy marketplaces require the following parameters:

| Key | Value | Description |
| --- | --- | --- |
| `max_storage` | int | The maximum number of items a vendor can have in storage. This also influences the policies of many rule-based vendors. |
| `episode_length` | int | The number of steps in an episode. |
| `max_price` | int | The maximum price vendors can set on any price channel. |
| `number_of_customers` | int | The number of customers that make purchasing decisions each step. |
| `production_price` | int | The production cost incurred for each new product a vendor sells. |
| `storage_cost_per_product` | float | The cost of storing one item in inventory for one episode. |
| `opposite_own_state_visibility` | bool | Whether or not vendors know the amount of items in storage of their competitors. |
| `common_state_visibility` | bool | Whether or not vendors know the amount of items in circulation. |
| `reward_mixed_profit_and_difference` | bool | A toggle for the way an agent's reward is calculated. If `true`, the reward depends not only on the agent's own profits, but also on the profits of its competitors. |
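A complete market_config for a Circular Economy marketplace could then look like the sketch below. All values are illustrative, not recommended defaults:

```json
{
    "max_storage": 100,
    "episode_length": 50,
    "max_price": 10,
    "number_of_customers": 20,
    "production_price": 3,
    "storage_cost_per_product": 0.1,
    "opposite_own_state_visibility": true,
    "common_state_visibility": true,
    "reward_mixed_profit_and_difference": false
}
```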

market_config for Linear Markets

In addition to the parameters required by the Circular Economy market, the Linear Markets also require the following parameter:

| Key | Value | Description |
| --- | --- | --- |
| `max_quality` | int | The maximum quality a product can be (randomly) assigned at the start of an episode. |
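A Linear Market config therefore repeats the Circular Economy keys and adds `max_quality`. Again, the values are purely illustrative:

```json
{
    "max_storage": 100,
    "episode_length": 50,
    "max_price": 10,
    "number_of_customers": 20,
    "production_price": 3,
    "storage_cost_per_product": 0.1,
    "opposite_own_state_visibility": true,
    "common_state_visibility": true,
    "reward_mixed_profit_and_difference": false,
    "max_quality": 100
}
```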

TODO: If there are parameters exclusive to Circular Economies, list them here

rl_config

The rl_config file contains parameters that are used to configure Reinforcement Learning agents for the training-task. Each RL-algorithm requires different parameters, listed in the sections below. The sections are named after the respective class names of the agents in the framework.

TODO: Find out if the config is used for something else than training

QLearningAgent

The QLearningAgent is an algorithm implemented by the project team. It requires the rl_config to contain the following parameters:

| Key | Value | Description |
| --- | --- | --- |
| `gamma` | float | The discount factor for future rewards. |
| `batch_size` | int | The number of experiences sampled from the replay buffer per training step. |
| `replay_size` | int | The maximum number of experiences kept in the replay buffer. |
| `learning_rate` | float | The learning rate of the optimizer. |
| `sync_target_frames` | int | The number of frames between synchronizations of the target network with the training network. |
| `replay_start_size` | int | The number of experiences that must be collected before training starts. |
| `epsilon_decay_last_frame` | int | The last frame in which epsilon is decreased. |
| `epsilon_start` | float | The starting value of epsilon. Must be between 0 and 1 and greater than `epsilon_final`. |
| `epsilon_final` | float | The final value of epsilon. Must be between 0 and 1 and smaller than `epsilon_start`. |
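An rl_config for the QLearningAgent might then look like the following sketch. The values shown are typical DQN-style hyperparameters chosen for illustration, not tuned or recommended settings for this framework:

```json
{
    "gamma": 0.99,
    "batch_size": 32,
    "replay_size": 100000,
    "learning_rate": 1e-6,
    "sync_target_frames": 1000,
    "replay_start_size": 10000,
    "epsilon_decay_last_frame": 75000,
    "epsilon_start": 1.0,
    "epsilon_final": 0.1
}
```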

ActorCriticAgent

There are two ActorCriticAgents that were implemented by the project team: The DiscreteActorCriticAgent and the ContinuousActorCriticAgent. Both of them require their rl_config to contain the following parameters:

StableBaselinesA2C

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the rl_config:

StableBaselinesDDPG

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the rl_config:

StableBaselinesPPO

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the rl_config:

StableBaselinesSAC

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the rl_config:

StableBaselinesTD3

This algorithm is imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the rl_config: