Authors: Travis Takai, Ed Schouten
Date: 2021-03-23
Action results presented through bb-browser's frontend provide a great way to gain insight into the behavior of remote execution by looking at a single action at a time. The next logical step is to analyze build results in aggregate and in real time. There are many reasons for wanting to analyze remote execution behavior, such as identifying build duration trends, estimating the total cost of a build, identifying computationally expensive build targets, evaluating hardware utilization, or correlating completed build actions with data in the Build Event Protocol. bb-browser displays all of the necessary information, but does not allow for programmatic analysis of action results for a number of reasons:
- Most of the useful information is stored within the ExecutedActionMetadata's auxiliary_metadata field, but is not exposed by the Bazel client or displayed in the Build Event Protocol output.
- There is no convenient way to explore build information via bb-browser, as the instance name, action digest, and size of the action result are all needed before querying for an action result.
- The goal of bb-storage is to offer short-term data storage, which does not allow for any form of querying for historical data.
- bb-scheduler does not provide a persisted list of build actions, and bb-storage has no way of providing access to a sequence of action results in an efficient way.
An alternative approach that keeps configuration flexible for clients is to stream action results, along with their associated metadata, to an external service. This allows for both long-term data persistence and real-time build action analysis. We should work towards creating this streaming service, which we can call Completed Action Logger.
The protocol will be used by bb_worker to stream details of completed actions, known as CompletedActions, to an external logging service:
service CompletedActionLogger {
  // Send a CompletedAction to another service as soon as a build action has
  // completed. Receiving a message from the return stream indicates that the
  // service successfully received the CompletedAction.
  rpc LogCompletedActions(stream CompletedAction)
      returns (stream google.protobuf.Empty);
}
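The acknowledgement semantics described in the comment above can be sketched from the sender's side. This is only an illustration, not bb_worker's implementation: the `PendingActions` class and its method names are hypothetical, and it assumes the service returns exactly one Empty per CompletedAction, in the order the actions were sent.

```python
from collections import deque


class PendingActions:
    """Tracks CompletedActions that were sent but not yet acknowledged.

    Assumption (not stated by the protocol): acknowledgements arrive one
    per CompletedAction and in sending order.
    """

    def __init__(self):
        self._unacked = deque()

    def sent(self, action):
        # Remember the action until the service confirms receipt.
        self._unacked.append(action)

    def acknowledged(self):
        # One message on the return stream confirms the oldest pending action.
        return self._unacked.popleft()

    def to_resend(self):
        # After a broken stream, everything still pending must be sent again.
        return list(self._unacked)


pending = PendingActions()
for name in ["action-a", "action-b", "action-c"]:
    pending.sent(name)
confirmed = pending.acknowledged()
remaining = pending.to_resend()
```

With this bookkeeping, a sender that loses its connection after one acknowledgement would retransmit only the two actions that were never confirmed.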
Each worker will have the ability to stream CompletedActions to the logging service once the action result has been created and all metadata has been attached. CompletedActions will take the form of:
// CompletedAction wraps a finished build action in order to transmit to
// an external service.
message CompletedAction {
  // A wrapper around the action's digest and REv2 ActionResult, which contains
  // the action's associated metadata.
  buildbarn.cas.HistoricalExecuteResponse historical_execute_response = 1;

  // A unique identifier associated with the CompletedAction, which is
  // generated by the build executor. This provides a means by which the
  // external logging service may be able to deduplicate incoming
  // CompletedActions. The usage of this field is left to the external
  // logging service to determine.
  string uuid = 2;

  // The REv2 instance name of the remote cluster that workers are returning
  // the action result from.
  string instance_name = 3;
}
The HistoricalExecuteResponse message is simply bb-storage's UncachedActionResult message, which will be renamed so that it can be used more generally. Apart from the Action digest included in the historical_execute_response field, no information about the Action is part of CompletedAction. Implementations of the CompletedActionLogger service can load objects from the Content Addressable Storage in case they need to inspect details of the Action.
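One way a logging service might use the uuid field for deduplication is sketched below. The protocol leaves this entirely up to the service, so everything here is an assumption: the `DeduplicatingLogger` class is hypothetical, and the dataclass is a simplified stand-in for the protobuf message that carries a bare digest string instead of a HistoricalExecuteResponse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletedAction:
    # Simplified stand-in for the protobuf message; the real message holds
    # a HistoricalExecuteResponse rather than a plain digest string.
    action_digest: str
    uuid: str
    instance_name: str


class DeduplicatingLogger:
    """Hypothetical in-memory sink that stores each uuid at most once."""

    def __init__(self):
        self._seen = set()
        self.stored = []

    def log(self, action):
        # Returns True if the action was stored, False if it was a duplicate.
        if action.uuid in self._seen:
            return False
        self._seen.add(action.uuid)
        self.stored.append(action)
        return True


logger = DeduplicatingLogger()
first = CompletedAction("digest-1", "uuid-1", "main")
results = [
    logger.log(first),
    logger.log(first),  # retransmission of the same CompletedAction
    logger.log(CompletedAction("digest-1", "uuid-2", "main")),
]
```

Note that the second execution of the same action digest is kept because it has a different uuid; only retransmissions are dropped. A real service would likely rely on a unique key in its datastore rather than an unbounded in-memory set.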
Now that we've defined what the Completed Action Logger is, let's go ahead and implement one of the desired uses for action analysis: metadata for measuring the cost of build execution. Defining a new message, which we'll call MonetaryResourceUsage, provides a nice way of calculating how much a given build costs based on the amount of time spent executing its actions. It will take the form of:
// A representation of unique factors that may be aggregated to
// compute a given build action's total price.
message MonetaryResourceUsage {
  message Expense {
    // The type of currency the cost is measured in. Required to be in
    // ISO 4217 format: https://en.wikipedia.org/wiki/ISO_4217#Active_codes
    string currency = 1;

    // The value of a specific expense for a build action.
    double cost = 2;
  }

  // A mapping of expense categories to their respective costs.
  map<string, Expense> expenses = 1;
}
This message will be appended to the auxiliary_metadata field of the REv2 ExecutedActionMetadata message at the end of execution, which automatically ensures it is cached along with the other build metadata in bb-storage. While expenses are not significant on a per-action basis, combining them with the CompletedActionLogger service gives us a way to quantify how much a given build invocation or target costs and to see how that changes over time. Implementations of the CompletedActionLogger service are responsible for aggregating these MonetaryResourceUsage messages. It is possible to aggregate this data by making use of the fields within the RequestMetadata message, such as tool_invocation_id or the recently added target_id field, as RequestMetadata is always appended to the auxiliary_metadata field.
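A minimal sketch of such an aggregation is shown below. It assumes the logging service has already unpacked MonetaryResourceUsage and RequestMetadata from auxiliary_metadata into plain dictionaries; the dictionary layout, the function name, and the expense category names are all hypothetical. Totals are kept per currency, since costs in different currencies cannot simply be added together.

```python
from collections import defaultdict


def total_cost_per_invocation(completed_actions):
    """Sum expenses per (tool_invocation_id, currency) pair.

    Each element is a dict with the hypothetical keys "tool_invocation_id"
    and "expenses", where "expenses" maps an expense category to a
    (currency, cost) tuple mirroring MonetaryResourceUsage.Expense.
    """
    totals = defaultdict(float)
    for action in completed_actions:
        for currency, cost in action["expenses"].values():
            totals[(action["tool_invocation_id"], currency)] += cost
    return dict(totals)


actions = [
    {"tool_invocation_id": "inv-1",
     "expenses": {"compute": ("USD", 0.25), "storage": ("USD", 0.125)}},
    {"tool_invocation_id": "inv-1",
     "expenses": {"compute": ("USD", 0.5)}},
    {"tool_invocation_id": "inv-2",
     "expenses": {"compute": ("EUR", 0.75)}},
]
totals = total_cost_per_invocation(actions)
```

Grouping by target_id instead of tool_invocation_id would follow the same pattern with a different key.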