Usage – Configuration
This page will introduce the various configuration files used by the different tasks that can be performed within the recommerce-framework.
All configuration in the recommerce-framework is done through `.json`-files. These files must be located in the `configuration_files` folder of the user's `datapath` to be discoverable by the framework.
The framework itself offers a number of default files which can be used outright or modified as needed. After installation, the default files can be extracted to the user's `datapath` by executing one of the following commands:

recommerce --get-defaults

This command extracts the default files to a folder called `default_data` in the user's `datapath`. Aside from the default configuration files, the folder also contains a number of pre-trained models for various RL-algorithms, though these are not necessarily very good and are mostly useful for demonstration purposes. The other available command is:

recommerce --get-defaults-unpack

This command extracts the default files and unpacks them into the two folders `configuration_files` and `data` in the user's `datapath` respectively.
Please note that the `--get-defaults-unpack` command will overwrite any existing files with identical names in the user's `datapath`.
The following sections will introduce all types of configuration files, and which parameters can be set through them. See the respective linked example files for the exact syntax.
Example environment_config_training.json
Example environment_config_exampleprinter.json
Example environment_config_agent_monitoring.json
The `environment_config` file contains parameters that are at the highest level of the simulation, among them the definition of the task that should be performed. Currently, there are three possible tasks, each with their own `environment_config` file, detailed below. However, the following parameters are common to all tasks, and only the `agent_monitoring` task requires additional parameters.
| Key | Type | Description |
|---|---|---|
| `task` | string; one of `"training"`, `"exampleprinter"` or `"agent_monitoring"` | The task that should be performed. |
| `marketplace` | A valid recommerce marketplace class as string | The marketplace class that should be used for the simulation. |
| `agents` | list of dictionaries | The agents (vendors) that should be playing on the selected `marketplace`. See below for the format of these dictionaries. |
As mentioned above, the `agents` key should have a list of dictionaries as its value. Each dictionary should have the following structure:
| Key | Type | Description |
|---|---|---|
| `name` | string | The name that should be used to identify the agent. |
| `agent_class` | A valid recommerce agent class as string | The class of the agent. Must fit the marketplace class. |
| `argument` | string or list | Depending on the chosen `agent_class` and `task`, an argument may be needed. For trained RL-agents, this is the name of the modelfile that should be used. For `FixedPrice`-agents this is a list containing the fixed prices that should be used. If no argument is needed, this must be the empty string. |
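To make the structure concrete, here is a minimal sketch of an `environment_config_training.json`, written as a Python dictionary so the placeholders can be annotated. The marketplace and agent class paths shown are illustrative assumptions and may not match the class names in your installation; check the shipped example files for the exact values.

```python
import json

# Sketch of an environment_config_training.json.
# The class paths below are placeholders - copy the exact ones from the
# example files shipped with the framework.
environment_config = {
    "task": "training",
    "marketplace": "recommerce.market.circular.circular_sim_market.CircularEconomyRebuyPriceDuopoly",  # assumed class path
    "agents": [
        {
            "name": "QLearning Agent",
            "agent_class": "recommerce.rl.q_learning.q_learning_agent.QLearningAgent",  # assumed class path
            "argument": ""  # empty string, since no argument is needed here
        }
    ]
}

# The file must end up in <datapath>/configuration_files to be discoverable.
with open("environment_config_training.json", "w") as file:
    json.dump(environment_config, file, indent=4)
```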
The `agent_monitoring` task requires the following parameters in addition to the parameters common to all tasks:
| Key | Type | Description |
|---|---|---|
| `episodes` | int | The number of episodes that should be simulated. |
| `plot_interval` | int | The interval at which the plots should be updated. This is supposed to be removed in the future, see this issue. |
| `separate_markets` | bool | If `true`, the agents will play on different but identical markets. If `false`, the agents will play on the same market against each other. |
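As a rough sketch, an `environment_config_agent_monitoring.json` combines the common keys with the three monitoring-specific ones. The class paths, the model file name and all numbers below are illustrative placeholders, not values taken from the framework.

```python
# Sketch of an environment_config_agent_monitoring.json as a Python dict.
# Class paths, model file name and numbers are illustrative placeholders.
agent_monitoring_config = {
    "task": "agent_monitoring",
    "marketplace": "recommerce.market.circular.circular_sim_market.CircularEconomyRebuyPriceDuopoly",  # assumed class path
    "agents": [
        {
            "name": "Trained QLearning Agent",
            "agent_class": "recommerce.rl.q_learning.q_learning_agent.QLearningAgent",  # assumed class path
            "argument": "QLearningAgent.dat"  # placeholder model file name
        }
    ],
    "episodes": 100,            # number of simulated episodes (placeholder)
    "plot_interval": 10,        # plot update interval (placeholder)
    "separate_markets": False   # all agents compete on the same market
}
```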
The `market_config` contains parameters that influence how the market simulation itself plays out. Circular economy marketplaces require the following parameters:
| Key | Type | Description |
|---|---|---|
| `max_storage` | int | The maximum number of items a vendor can have in storage. This also influences the policies of many rule based vendors. |
| `episode_length` | int | How many steps an episode has. |
| `max_price` | int | The maximum price vendors can set for any price channel. |
| `number_of_customers` | int | The number of customers that make purchasing decisions each step. |
| `production_price` | int | The cost of selling a new product. |
| `storage_cost_per_product` | float | The cost for storing one item in inventory for one episode. |
| `opposite_own_state_visibility` | bool | Whether or not vendors know the amount of items in storage of their competitors. |
| `common_state_visibility` | bool | Whether or not vendors know the amount of items in circulation. |
| `reward_mixed_profit_and_difference` | bool | A toggle for the way an agent's reward is calculated. If `true`, the reward is not only dependent on the agent's own profits, but also the profits of its competitors. |
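A `market_config` for a circular economy marketplace could be sketched as follows; every number is a placeholder chosen only to show the expected type, not a tuned or recommended setting.

```python
# Sketch of a market_config.json for a circular economy marketplace.
# All numbers are illustrative placeholders, not recommended values.
market_config = {
    "max_storage": 100,
    "episode_length": 50,
    "max_price": 10,
    "number_of_customers": 20,
    "production_price": 3,
    "storage_cost_per_product": 0.1,
    "opposite_own_state_visibility": True,
    "common_state_visibility": True,
    "reward_mixed_profit_and_difference": False
}
# Linear markets additionally need "max_quality" (see the following table).
```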
In addition to the parameters required by the circular economy market, the linear markets also require the following parameter:
| Key | Type | Description |
|---|---|---|
| `max_quality` | int | The maximum quality a product can be (randomly) assigned at the start of an episode. |
TODO: If there are parameters exclusive to circular economies, list them here
Example rl_config.json for a Q-Learning agent
The `rl_config` file contains parameters that are used to configure reinforcement learning agents for the `training` task (please note that currently, the RL-agents always require the `rl_config`, even if they are not used for training. This issue aims to fix this). Each RL-algorithm requires different parameters, listed in the sections below. The sections are named after the respective class names of the agents in the framework.
Please note that unfortunately, specific descriptions of RL-parameters are still missing.
The `QLearningAgent` is an algorithm implemented by the project team. It requires the `rl_config` to contain the following parameters:
| Key | Type | Description |
|---|---|---|
| `gamma` | float | |
| `batch_size` | int | |
| `replay_size` | int | |
| `learning_rate` | float | |
| `sync_target_frames` | int | |
| `replay_start_size` | int | |
| `epsilon_decay_last_frame` | int | |
| `epsilon_start` | float | |
| `epsilon_final` | float | |
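A minimal `rl_config.json` sketch for the `QLearningAgent` might look like the following; all values are placeholders picked only to illustrate the expected types and are not tuned recommendations.

```python
# Sketch of an rl_config.json for the QLearningAgent.
# Every value is an illustrative placeholder, not a recommendation.
rl_config_q_learning = {
    "gamma": 0.99,
    "batch_size": 32,
    "replay_size": 100000,
    "learning_rate": 1e-4,
    "sync_target_frames": 1000,
    "replay_start_size": 10000,
    "epsilon_decay_last_frame": 50000,
    "epsilon_start": 1.0,
    "epsilon_final": 0.1
}
```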
There are two `ActorCriticAgents` that were implemented by the project team: the `DiscreteActorCriticAgent` and the `ContinuousActorCriticAgent`. Both of them require their `rl_config` to contain the following parameters:
| Key | Type | Description |
|---|---|---|
| `gamma` | float | |
| `sync_target_frames` | int | |
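Accordingly, an `rl_config` sketch for either `ActorCriticAgent` only needs these two keys; the values below are placeholders.

```python
# Sketch of an rl_config.json for the Discrete-/ContinuousActorCriticAgent.
# Values are illustrative placeholders.
rl_config_actor_critic = {
    "gamma": 0.99,
    "sync_target_frames": 1000
}
```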
This algorithm is being imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:
| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | |
| `n_steps` | int | |
| `gamma` | float | |
| `neurones_per_hidden_layer` | int | |
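A corresponding `rl_config` sketch could look like this; the numbers are placeholders loosely oriented at common StableBaselines3 defaults and are not tuned for the recommerce markets.

```python
# Sketch of an rl_config.json for this StableBaselines3 algorithm.
# Placeholder values, loosely based on common StableBaselines3 defaults.
rl_config = {
    "learning_rate": 0.0007,
    "n_steps": 5,
    "gamma": 0.99,
    "neurones_per_hidden_layer": 64  # framework-specific key; placeholder value
}
```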
This algorithm is being imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:
| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | |
| `buffer_size` | int | |
| `learning_starts` | int | |
| `batch_size` | int | |
| `tau` | float | |
| `gamma` | float | |
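For this algorithm, an `rl_config` sketch with placeholder values (again loosely based on common StableBaselines3 defaults, not recommendations) could be:

```python
# Sketch of an rl_config.json for this StableBaselines3 algorithm.
# Placeholder values, not tuned for the recommerce markets.
rl_config = {
    "learning_rate": 0.001,
    "buffer_size": 1000000,
    "learning_starts": 100,
    "batch_size": 100,
    "tau": 0.005,
    "gamma": 0.99
}
```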
This algorithm is being imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:
| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | |
| `n_steps` | int | |
| `batch_size` | int | |
| `n_epochs` | int | |
| `gamma` | float | |
| `clip_range` | float | |
| `neurones_per_hidden_layer` | int | |
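An `rl_config` sketch for this algorithm, with placeholder values loosely oriented at common StableBaselines3 defaults, might look like this:

```python
# Sketch of an rl_config.json for this StableBaselines3 algorithm.
# Placeholder values, loosely based on common StableBaselines3 defaults.
rl_config = {
    "learning_rate": 0.0003,
    "n_steps": 2048,
    "batch_size": 64,
    "n_epochs": 10,
    "gamma": 0.99,
    "clip_range": 0.2,
    "neurones_per_hidden_layer": 64  # framework-specific key; placeholder value
}
```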
This algorithm is being imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:
| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | |
| `buffer_size` | int | |
| `learning_starts` | int | |
| `batch_size` | int | |
| `tau` | float | |
| `gamma` | float | |
| `ent_coef` | float or str | |
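As a sketch, an `rl_config` for this algorithm could contain the following; all values are placeholders loosely based on common StableBaselines3 defaults, not recommendations.

```python
# Sketch of an rl_config.json for this StableBaselines3 algorithm.
# Placeholder values, loosely based on common StableBaselines3 defaults.
rl_config = {
    "learning_rate": 0.0003,
    "buffer_size": 1000000,
    "learning_starts": 100,
    "batch_size": 256,
    "tau": 0.005,
    "gamma": 0.99,
    "ent_coef": "auto"  # the table allows a float or a string here
}
```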
This algorithm is being imported from StableBaselines3. Find the relevant page in their documentation here. In our framework, the algorithm requires the following parameters to be present in the `rl_config`:
| Key | Type | Description |
|---|---|---|
| `learning_rate` | float | |
| `buffer_size` | int | |
| `learning_starts` | int | |
| `batch_size` | int | |
| `tau` | float | |
| `gamma` | float | |
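Finally, an `rl_config` sketch for this algorithm, again with placeholder values that are not tuned recommendations:

```python
# Sketch of an rl_config.json for this StableBaselines3 algorithm.
# Placeholder values, not tuned for the recommerce markets.
rl_config = {
    "learning_rate": 0.001,
    "buffer_size": 1000000,
    "learning_starts": 100,
    "batch_size": 100,
    "tau": 0.005,
    "gamma": 0.99
}
```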
Online Marketplace Simulation: A Testbed for Self-Learning Agents is the 2021/2022 bachelor's project of the Enterprise Platform and Integration Concepts (@hpi-epic, epic.hpi.de) research group of the Hasso Plattner Institute.