Usage – Examples
This page gives an introduction to the major tools our framework offers, together with exemplary walkthrough guides to follow along.
Please refer to this page for an explanation of the `training` task.
The following is needed before the `training`-task can be started:
These files must all be present in a folder called `configuration_files` within the user's `datapath`.
After placing the above files in the `configuration_files` folder, the `training`-task can be started by running the following command:

```
recommerce -c training
```
If the configuration files and the given marketplace and vendor classes are valid, the `training`-task will automatically start by initializing the reinforcement learning agent. Once initialization is done, the agent will be trained for the number of episodes defined in the `rl_config.json` file.
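The exact contents of `rl_config.json` depend on the chosen agent class; the sketch below is only an illustration of the file's general shape — the key names shown here are assumptions, not the framework's authoritative schema.

```json
{
    "episodes": 1000,
    "learning_rate": 0.0001
}
```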
At any point, you can check the training progress by taking a look at the terminal output, which displays a progress bar with an estimate of the remaining time. The terminal also displays the current episode and the maximum reward achieved by the trained agent so far. Whenever a certain number of episodes has been completed, the simulation saves a so-called intermediate model, which contains the current policy of the agent. These models are used after the training session has finished.
After the training has completed, the framework will automatically start a monitoring session. During this monitoring, two things are done:
- Statistics collected during the training session are visualized in a number of diagrams.
- A set number of additional episodes is simulated, monitoring each of the saved intermediate models. For all of these, additional metrics are collected and visualized, which allows users to compare the different models.
The `training`-task produces a number of different output files, in addition to the terminal output while the task is running.
All output folders are located within the user's `datapath`.
- `results/runs`: This folder contains the raw data collected during training for TensorBoard. It can be viewed even after the training has finished by starting a TensorBoard session.
- `results/trainedModels`: This folder contains the intermediate models saved during the training session. They can be used to monitor the agent using one of the other tasks.
- `results/monitoring`: This folder contains the visualizations created during the monitoring session run directly after the training session.
Please refer to this page for an explanation of the `exampleprinter` task.
The following is needed before the `exampleprinter`-task can be started:
These files must all be present in a folder called `configuration_files` within the user's `datapath`.
If the `exampleprinter` should be run on a reinforcement learning agent:

- A trained RL-agent model (see the `training` task) in the `data` folder of the user's `datapath`
- An `rl_config.json` file
After placing the above files in their respective folders, the `exampleprinter`-task can be started by running the following command:

```
recommerce -c exampleprinter
```
If the configuration files and the given marketplace and vendor classes are valid, the `exampleprinter`-task will automatically start. During the task, the marketplace will be simulated for the number of steps defined in the `market_config.json` file.
During each step of the simulation, the `exampleprinter` will record all actions taken by the vendors as well as all market states. Within each step, the current state is also printed to the terminal, so the prices set by the various vendors as well as their respective inventory levels can be seen. The following is an exemplary state produced in a circular economy marketplace with rebuy prices enabled and two vendors playing:

```
[279. 100. 3. 5. 1. 10.]
```
The table below explains what the different values mean in this example:

| Value | Explanation |
|---|---|
| 279 | The number of items currently in circulation. |
| 100 | The number of items the first vendor has in its inventory. |
| 3 | The price for the refurbished product set by the second vendor. |
| 5 | The price for the new product set by the second vendor. |
| 1 | The rebuy price set by the second vendor. |
| 10 | The number of items the second vendor has in its inventory. |
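To make the mapping above concrete, the snippet below pairs the example state vector with descriptive labels. The label names are purely illustrative, chosen for readability; they are not identifiers from the framework's API.

```python
# Illustrative decoding of the example state vector above. The labels are
# descriptive placeholders, not identifiers from the recommerce API.

def decode_state(state):
    """Map the flat market observation to labelled fields (two-vendor
    circular economy market with rebuy prices)."""
    labels = [
        'items_in_circulation',
        'vendor1_inventory',
        'vendor2_refurbished_price',
        'vendor2_new_price',
        'vendor2_rebuy_price',
        'vendor2_inventory',
    ]
    return dict(zip(labels, state))

state = [279.0, 100.0, 3.0, 5.0, 1.0, 10.0]
decoded = decode_state(state)
print(decoded['vendor2_rebuy_price'])  # 1.0
```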
Depending on the chosen market scenario (at the time of writing, only the `CircularEconomyRebuyPriceDuopoly` class is supported), the `exampleprinter` will also visualize all actions and market states in a separate `.svg` file for each step, and at the end compile all diagrams into an animated `.html` slideshow. See this page for an excerpt from one such slideshow.
The `exampleprinter`-task produces a number of different output files, in addition to the terminal output while the task is running.
All output folders are located within the user's `datapath`.
- `results/runs`: This folder contains the raw data collected during the simulation for TensorBoard. It can be viewed after the session has finished by starting a TensorBoard session.

If the chosen marketplace is compatible:

- `results/monitoring`: This folder contains the animated overview diagram, as well as `.svg` files for the different steps of the simulation, outlining actions and states within the steps.
Please refer to this page for an explanation of the `agent_monitoring` task.
The following is needed before the `agent_monitoring`-task can be started:
These files must all be present in a folder called `configuration_files` within the user's `datapath`.
For each reinforcement learning agent that is monitored, the following files must be present:
After placing the above files in their respective folders, the `agent_monitoring`-task can be started by running the following command:

```
recommerce -c agent_monitoring
```
If the configuration files and the given marketplace and vendor classes are valid, the `agent_monitoring`-task will automatically start by printing the configuration to the terminal.
The framework will then proceed to simulate the number of episodes configured in the `market_config.json` file. The progress of the monitoring session is displayed through a progress bar in the terminal.
After the simulation has finished, the `agent_monitoring`-task will visualize the collected metrics using a range of different diagrams, in addition to some output printed directly to the terminal. For the full list of diagrams and the metrics they visualize, please refer to this page.
The `agent_monitoring` task produces a large number of different diagrams, which are all collected within the `results/monitoring` folder.
Please refer to this page for an explanation of the `policyanalyzer` task.
Disclaimer: As the `policyanalyzer`-task is currently not integrated into the `recommerce` workflow (meaning there is no CLI command as of yet), this section is not very detailed and should be updated as soon as the tool is properly integrated. See this issue.
The following is needed before the `policyanalyzer`-task can be started:
Currently, additional configuration must be done within the `policyanalyzer.py` file, which is not covered in this guide, as it should no longer be necessary once the task has been included in the `recommerce` workflow.
After the `policyanalyzer`-task has been configured and the configuration file has been placed in the `configuration_files` folder, the task can be started by executing the `policyanalyzer.py` file.
During the `policyanalyzer`-task, the framework feeds the monitored agent all combinations of the configured features. The action the agent takes in each resulting state is recorded and then visualized in a diagram.
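Conceptually, the loop described above can be sketched as follows. The `policy` callable stands in for the monitored agent and the feature ranges are toy values; neither reflects the framework's actual API.

```python
from itertools import product

def analyze_policy(policy, feature_ranges):
    """Feed the policy every combination of feature values and
    record the action it takes in each resulting state."""
    return {state: policy(state) for state in product(*feature_ranges)}

# Toy stand-in policy: set a high price when inventory (feature 0) is low.
def toy_policy(state):
    return 9 if state[0] < 50 else 5

# Hypothetical feature ranges, e.g. own inventory level and competitor price.
actions = analyze_policy(toy_policy, [range(0, 100, 25), range(0, 10, 5)])
print(len(actions))  # 8 combinations: 4 inventory levels x 2 competitor prices
```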
Depending on the configuration, the `policyanalyzer`-task produces one or more diagrams within the `results/monitoring` folder.
Online Marketplace Simulation: A Testbed for Self-Learning Agents is the 2021/2022 bachelor's project of the Enterprise Platform and Integration Concepts (@hpi-epic, epic.hpi.de) research group of the Hasso Plattner Institute.