Skip to content

Simulation

Petros Papapanagiotou edited this page Nov 20, 2018 · 4 revisions

Concepts

A ValueGenerator represents a function that can generate a value for some simulation parameters (typically duration and cost). It can be a ConstantGenerator which always provides the same value or any probability distribution.

ValueGenerators must also implement an estimate method (such as the median of the distribution) that provides an estimate of the generated values. This can help create an environment of imperfect knowledge. For example, the Scheduler does not know the actual durations of tasks, which can vary from the expected estimate for various reasons.

Tasks are carried out through a period of virtual time. They have the following fields:

  • id: A unique id that separates them from any other task. A workflow may spawn multiple tasks with the same name, so this is necessary. The id is generated by the Coordinator which means tasks should only be generated by a TaskGenerator.
  • name: A descriptive name.
  • simulation: This is the name of the simulated workflow instance (see Simulation below) that spawned the task.
  • created: The timestamp when this task was created in the Coordinator.
  • resources: The list of names of persistent resources (see TaskResource below) required by the task.
  • duration: A pre-generated duration period from a ValueGenerator.
  • estimatedDuration: The estimated value from the ValueGenerator that produced the duration. This forces the Scheduler to make realistically imperfect decisions instead of being able to predict durations perfectly.
  • initialCost: A one-off cost of this task.
  • interrupt: The number of times the task can be interrupted in favour of a more urgent one. This parameter is not currently being used.
  • priority: There are 5 priority values, namely Highest, High, Medium, Low, VeryLow.

Tasks are ordered based on their various parameters. The compare method defines how tasks are ordered by highest to lowest priority when we are trying to assign them to resources. The current implementation orders tasks in this order of criteria:

  1. Priority: Putting this parameter first makes it a very strong influencer of priority. A task with higher Priority will always be executed before any other task with lower priority, even if, for example, the latter has been queueing for a long time.
  2. Age: Tasks who were created earlier take priority. This attempts to minimize delays, but may have side-effects.
  3. Resources: Tasks that involve a higher number of resources take priority. The intuition is that it is generally harder to achieve a state where more resources are available at the same time, so we do not want to postpone the task.
  4. Estimated duration: Tasks that require more time are prioritized.
  5. Interrupt: Inflexible tasks that can not be interrupted are prioritized.
  6. Tasks are considered of equal priority at this point, so we use their id as a deterministic ordering.

A TaskGenerator is a factory of Tasks. Its parameters are similar to those of a Task, although the duration and cost are ValueGenerators. As a case class, it can be easily manipulated dynamically to alter its values.

The addTo method is the main function used to generate a task. It sends the TaskGenerator to the Coordinator. It also specified the names of the required persistent resources (see TaskResource below). The Coordinator then uses create to generate a task and add it to its priority queue.

A Promise[Unit] is also generated and passed to the Coordinator. This is used to signal when the task has finished executing (or if an exception was thrown).

A TaskResource or simply "resource" corresponds to a persistent resource. Typical examples are human actors or persistent machinery. Each Task may require a different combination of resources.

Each TaskResource is assumed to have a unique name. It also has a costPerTick, i.e. the cost (if any) of using the resource per unit of time.

We say a TaskResource is attached to a Task when the Task that uses this resource is being performed. The currentTask variable holds the Task that is currently attached to the resource (if any) coupled with the virtual timestamp of when it started. A TaskResource is idle when it has no Tasks attached to it.

The Coordinator calls upon startTask to attach a Task to the resource and finishTask to detach it. The Coordinator has full control over the resource and makes all necessary checks to ensure consistent behaviour (e.g. not starting a Task when another Task is already attached).

The nextAvailableTimestamp method lets the Scheduler know an estimate of when we expect to have this resource available again. This is based off of the estimatedDuration of the Task (see Task above).

A Simulation typically corresponds to the execution of a single workflow instance. It has a unique name to distinguish it from any other instance.

The run method executes the simulation with a given SimulatorExecutor (see below). The getProcesses method must return the list of all PiProcesses involved in the simulation. It is necessary to provide this list to the corresponding ProcessExecutor that will execute the workflow.

A Simulation does not necessarily need to be tied to a PiProcess though. Anything that can generate Tasks can be a Simulation. As a useful example, TaskSimulation is a simple Simulation that executes one Task and then terminates.

A SimulationActor wraps the execution of a Simulation in an Akka Actor. This is used internally by the Coordinator and does not need to be constructed by the user.

Any PiProcess (usually AtomicProcess) that depends on the completion of one (or more) Task should mixin the SimulatedProcess trait. This primarily identifies it as a simulated process that takes some virtual time to complete as opposed to a (virtually) instantaneous one.

The Coordinator waits for all AtomicProcesses to complete before virtual time is allowed to progress. This ensures workflows have reduced as much as possible and generated as many new Tasks as they need to in time. The Coordinator should not wait for SimulatedProcesses to complete because their associated Task(s) need virtual time to progress.

Every SimulatedProcess should also be associated with a particular Simulation by simulationName.

In the typical scenario, simulated processes use TaskGenerators. Execution involves a call to TaskGenerator.addTo followed by a map of the returned Future to the actual result value the process needs to return (if any). This is conveniently wrapped in the simulate method.

A ProcessExecutor that can also execute simulated workflows is a SimulatorExecutor.

Currently the only requirements is the simulationReady method. It should return true if all AtomicProcess calls have either completed or are calls to a SimulatedProcess. This allows the Coordinator to progress virtual time.

In practice, simulationReady should return true if all PiInstances loaded in the executor are simulationReady (see PiInstance.simulationReady).

The Scheduler is responsible for selecting the next tasks to be executed by each resource based on Task priorities and TaskResource availability.

The Scheduler trait defines the getNextTasks method that implements the above scheduling. Its parameters include the pending Tasks, the current virtual timestamp (currentTime), and the resourceMap of all involved resources.

The DefaultScheduler implements a simple but decent scheduling algorithm. It constructs a Schedule object for every TaskResource. This includes a sequence of busy periods as pairs of starting and ending timestamps (Long,Long). It then does the following:

  1. It iterates through the list of pending tasks which is assumed sorted (prioritized).
  2. For every Task, it merges all the Schedules of its associated TaskResources. This results in a single Schedule that represents the time ranges where any of the involved TaskResources are busy.
  3. It tries to fit the Task in the merged Schedule. It uses Task.estimatedDuration for this. It calculates the starting time (start) that the Task may start in the merged Schedule.
  4. It adds the Task in each individual Schedule of every involved TaskResource starting at time start.
  5. If start is the current time and all TaskResources are idle, it suggests the Task be started now.
  6. It repeats with the next Task in the priority queue.

This algorithm ensures that Tasks are assigned by priority and that no Task blocks the earliest (estimated) possible starting time of any other Task with higher priority. At the same time, it allows for Tasks of lower priority to start earlier if the TaskResources of the higher priority Task are not be available (and will not become available before the lower priority Task is estimated to complete).

Clone this wiki locally