A study consists of a collection of bandits,
When 'pulling' a bandit, we are generating a random variable from this bandit's distribution. For example, if generating a random variable from bandit
We therefore wish to find the optimal bandit to exploit; or in other words the bandit with the highest parameter:
There is a second constraint, however: this should be done using as few generations (bandit pulls) as possible. Reasons include maximizing the cumulative reward given a fixed number of generations (e.g. under a time constraint), or that each generation carries its own cost (e.g. paying for each bandit pull).
The problem can therefore be encapsulated as: Given a fixed number of generations,
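The trade-off described above, finding the best bandit while spending as few pulls as possible, can be illustrated with a minimal sketch. It is not this repository's implementation: it assumes Bernoulli bandits (reward 1 with probability p, else 0) and uses an epsilon-greedy strategy purely as an example, and the names `pull`, `epsilon_greedy`, `true_params`, `n_pulls` and `epsilon` are hypothetical.

```python
import random


def pull(p, rng):
    """Pull an assumed Bernoulli bandit: reward 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0


def epsilon_greedy(true_params, n_pulls, epsilon=0.1, seed=0):
    """Spend a fixed budget of n_pulls, mostly exploiting the best estimate."""
    rng = random.Random(seed)
    n_bandits = len(true_params)
    counts = [0] * n_bandits      # number of pulls per bandit
    means = [0.0] * n_bandits     # running mean reward per bandit
    total_reward = 0
    for _ in range(n_pulls):
        if rng.random() < epsilon:
            # Explore: pick a bandit uniformly at random.
            arm = rng.randrange(n_bandits)
        else:
            # Exploit: pick the bandit with the highest estimated parameter.
            arm = max(range(n_bandits), key=means.__getitem__)
        reward = pull(true_params[arm], rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total_reward += reward
    return counts, means, total_reward


# Illustrative parameters only; they are not taken from the repository.
counts, means, total_reward = epsilon_greedy([0.2, 0.5, 0.8], n_pulls=2000)
print("pulls per bandit:", counts)
print("estimated parameters:", [round(m, 3) for m in means])
print("cumulative reward:", total_reward)
```

A larger `epsilon` spends more of the budget exploring, which improves the parameter estimates but typically lowers the cumulative reward. A smaller one exploits sooner, at the risk of settling on a suboptimal bandit.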
To run the commands below, installing Just is recommended.
Simulations can be run either directly from the command line or by passing a JSON configuration file. The latter is recommended for reproducible simulation studies.
To see all possible simulation commands, run:
just help
or help for a specific command:
just {{COMMAND}} help
For example:
just list-distributions help
To view possible distributions, and information about their associated parameters, run:
just list-distributions
To run a simulation by passing arguments directly, run:
just {{COMMAND}} {{*ARGS}}
Where COMMAND is a command returned from just help, and ARGS are those required by a specific command upon running just {{COMMAND}} help.
A MAB simulation can also read its arguments and configuration from a config file. To perform this simulation, run:
just simulate-from-json {{COMMAND}} {{CONFIG}}
To run all unit and integration tests in the repository, execute:
just local-test
Metrics from the simulations are output to stdout.
Plots detailing the performance of the simulation can optionally be produced as well.