Usage – Examples
This page gives an introduction to the major tools our framework offers, together with exemplary walkthrough guides to follow along.
Please refer to this page for an explanation of the `training` task.
The following is needed before the `training`-task can be started:
These files must all be present in a folder called `configuration_files` within the user's `datapath`.
After placing the above files in the `configuration_files` folder, the `training`-task can be started by running the following command:

```
recommerce -c training
```
If the configuration files and the given marketplace and vendor classes are valid, the `training`-task will automatically start by initializing the reinforcement learning agent. Once initialization is done, the agent will be trained for the number of episodes defined in the `rl_config.json` file.
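The exact contents of `rl_config.json` depend on the chosen agent class; the sketch below is only an illustration of the file's general shape — the key names shown here are assumptions, not the framework's authoritative schema.

```json
{
    "episodes": 1000,
    "learning_rate": 0.0001
}
```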
At any point, you can check the training progress by taking a look at the terminal output, which displays a progress bar with an estimate of the remaining time. The terminal also displays the current episode and the maximum reward achieved by the trained agent so far. Whenever a certain number of episodes has been completed, the simulation saves a so-called intermediate model, which contains the current policy of the agent. These models are used after the training session has finished.
After the training has completed, the framework will automatically start a monitoring session. During this monitoring, two things are done:
- Statistics collected during the training session are visualized in a number of diagrams.
- A set number of additional episodes is simulated, monitoring each of the saved intermediate models. For all of these, additional metrics are collected and visualized, which allows users to compare the different models.
The `training`-task produces a number of different output files, in addition to the terminal output while the task is running.
All output folders are located within the user's `datapath`.
- `results/runs`: This folder contains the raw data collected during training for TensorBoard. It can be viewed even after the training has finished by starting a TensorBoard session.
- `results/trainedModels`: This folder contains the intermediate models saved during the training session. They can be used to monitor the agent using one of the other tasks.
- `results/monitoring`: This folder contains the visualizations created during the monitoring session run directly after the training session.
Please refer to this page for an explanation of the `exampleprinter` task.
The following is needed before the `exampleprinter`-task can be started:
These files must all be present in a folder called `configuration_files` within the user's `datapath`.
If the `exampleprinter` should be run on a reinforcement learning agent:

- A trained RL-agent model (see the `training` task) in the `data` folder of the user's `datapath`
- An `rl_config.json` file
After placing the above files in their respective folders, the `exampleprinter`-task can be started by running the following command:

```
recommerce -c exampleprinter
```
If the configuration files and the given marketplace and vendor classes are valid, the `exampleprinter`-task will automatically start. During the task, the marketplace will be simulated for the number of steps defined in the `market_config.json` file.
During each step of the simulation, the `exampleprinter` will record all actions taken by the vendors as well as all market states. Within each step, the current state is also printed to the terminal, so the prices set by the various vendors as well as their respective inventory levels can be seen. The following is an exemplary state produced in a circular economy marketplace with rebuy prices enabled and two vendors playing:

```
[279. 100. 3. 5. 1. 10.]
```
The table below explains what the different values mean in this example:

| Value | Explanation |
|---|---|
| 279 | The number of items currently in circulation. |
| 100 | The number of items the first vendor has in its inventory. |
| 3 | The price for the refurbished product set by the second vendor. |
| 5 | The price for the new product set by the second vendor. |
| 1 | The rebuy price set by the second vendor. |
| 10 | The number of items the second vendor has in its inventory. |
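To make the mapping above concrete, the snippet below pairs the example state vector with descriptive labels. The label names are purely illustrative, chosen for readability; they are not identifiers from the framework's API.

```python
# Illustrative decoding of the example state vector above. The labels are
# descriptive placeholders, not identifiers from the recommerce API.

def decode_state(state):
    """Map the flat market observation to labelled fields (two-vendor
    circular economy market with rebuy prices)."""
    labels = [
        'items_in_circulation',
        'vendor1_inventory',
        'vendor2_refurbished_price',
        'vendor2_new_price',
        'vendor2_rebuy_price',
        'vendor2_inventory',
    ]
    return dict(zip(labels, state))

state = [279.0, 100.0, 3.0, 5.0, 1.0, 10.0]
decoded = decode_state(state)
print(decoded['vendor2_rebuy_price'])  # 1.0
```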
Depending on the chosen market scenario (at the time of writing, only the `CircularEconomyRebuyPriceDuopoly` class is supported), the `exampleprinter` will also visualize all actions and market states in a separate `.svg` file for each step, and at the end compile all diagrams into an animated `.html` slideshow. See this page for an excerpt from one such slideshow.
The `exampleprinter`-task produces a number of different output files, in addition to the terminal output while the task is running.
All output folders are located within the user's `datapath`.
- `results/runs`: This folder contains the raw data collected during the simulation for TensorBoard. It can be viewed after the session has finished by starting a TensorBoard session.

If the chosen marketplace is compatible:

- `results/monitoring`: This folder contains the animated overview diagram, as well as `.svg` files for the different steps of the simulation, outlining actions and states within the steps.
Please refer to this page for an explanation of the `agent_monitoring` task.
The following is needed before the `agent_monitoring`-task can be started:
These files must all be present in a folder called `configuration_files` within the user's `datapath`.
For each reinforcement learning agent that is monitored, the following files must be present:
After placing the above files in their respective folders, the `agent_monitoring`-task can be started by running the following command:

```
recommerce -c agent_monitoring
```
If the configuration files and the given marketplace and vendor classes are valid, the `agent_monitoring`-task will automatically start by printing the configuration to the terminal.
The framework will then proceed to simulate the number of episodes configured in the `market_config.json` file. The progress of the monitoring session is displayed through a progress bar in the terminal.
After the simulation has finished, the `agent_monitoring`-task will visualize the collected metrics using a range of different diagrams, in addition to some output printed directly to the terminal. For the full list of diagrams and the metrics they visualize, please refer to this page.
The `agent_monitoring` task produces a large number of different diagrams, which are all collected within the `results/monitoring` folder.
Please refer to this page for an explanation of the `policyanalyzer` task.
Disclaimer: As the `policyanalyzer`-task is currently not integrated into the `recommerce` workflow (meaning there is no CLI command as of yet), this section is not very detailed and should be updated as soon as the tool is properly integrated. See this issue.
The following is needed before the `policyanalyzer`-task can be started:
Currently, additional configuration must be done within the `policyanalyzer.py` file, which is not covered in this guide, as it should no longer be necessary once the task has been included in the `recommerce` workflow.
After the `policyanalyzer`-task has been configured and the configuration file has been placed in the `configuration_files` folder, the task can be started by executing the `policyanalyzer.py` file.
During the `policyanalyzer`-task, the framework feeds the monitored agent all combinations of the configured features. The action the agent takes in each resulting state is recorded and then visualized in a diagram.
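Conceptually, the loop described above can be sketched as follows. The `policy` callable stands in for the monitored agent and the feature ranges are toy values; neither reflects the framework's actual API.

```python
from itertools import product

def analyze_policy(policy, feature_ranges):
    """Feed the policy every combination of feature values and
    record the action it takes in each resulting state."""
    return {state: policy(state) for state in product(*feature_ranges)}

# Toy stand-in policy: set a high price when inventory (feature 0) is low.
def toy_policy(state):
    return 9 if state[0] < 50 else 5

# Hypothetical feature ranges, e.g. own inventory level and competitor price.
actions = analyze_policy(toy_policy, [range(0, 100, 25), range(0, 10, 5)])
print(len(actions))  # 8 combinations: 4 inventory levels x 2 competitor prices
```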
Depending on the configuration, the `policyanalyzer`-task produces one or more diagrams within the `results/monitoring` folder.
Online Marketplace Simulation: A Testbed for Self-Learning Agents is the 2021/2022 bachelor's project of the Enterprise Platform and Integration Concepts (@hpi-epic, epic.hpi.de) research group of the Hasso Plattner Institute.