Zhe ZHAO edited this page Apr 9, 2021 · 7 revisions

General

Q: My project is using multiple different CD tools at the same time. How do you consolidate the execution data into one and compute the metrics?

A: To be answered soon.

Q: Why do you synchronize the execution history locally? Does this pose a security risk?

A: To be answered soon.

Computing four metrics

How to compute DF, Deployment Frequency:

Q: Do you include failed deployments when calculating DF?

A: No. DF is meant to reflect how often we ship value to the target environments. A failed deployment does not ship any change to the target environments, so it is not included.

Q: Do you include "empty builds" (i.e. the builds which include no code changes, like a manual build) when calculating DF?

A: Yes, we do. A successful execution of the pipeline/workflow usually represents a successful deployment to the target environment, regardless of whether it contains code changes.
And, in case you're worried: no, empty builds have no positive or negative impact on the mean lead time value at all, because lead time is computed per commit and an empty build contributes no commits.
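
The two DF rules above (skip failed deployments, keep empty builds) can be sketched as follows. This is a minimal illustration, not the tool's implementation: the `Execution` shape, the status strings, and the choice of "per day" as the unit are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Execution:
    """One pipeline run (hypothetical shape, not the tool's data model)."""
    finished_at: datetime
    status: str        # assumed values: "success", "failed", "aborted"
    commit_count: int  # 0 for an "empty build"


def deployment_frequency(executions, start, end):
    """Successful deployments per day within [start, end)."""
    successes = [
        e for e in executions
        if e.status == "success" and start <= e.finished_at < end
    ]
    days = (end - start) / timedelta(days=1)
    return len(successes) / days


runs = [
    Execution(datetime(2021, 4, 1, 10), "success", 3),
    Execution(datetime(2021, 4, 2, 10), "failed", 2),   # excluded: shipped nothing
    Execution(datetime(2021, 4, 3, 10), "success", 0),  # empty build: still counted
    Execution(datetime(2021, 4, 4, 10), "success", 1),
]
df = deployment_frequency(runs, datetime(2021, 4, 1), datetime(2021, 4, 5))
print(df)  # 3 successful deployments over 4 days
```

Note that `commit_count` is never consulted: only the outcome of the run decides whether it counts.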

How to compute MLT, Mean Lead Time for Changes:

Q: Is the MLT "deployment based" or "commit based"? Do you simply compute lead time for each deployment respectively?

A: Every deployment has a different batch size and a different number of commits, sometimes zero. Instead of calculating the lead time for each deployment and taking the mean, we calculate the lead time for every commit included in the evaluation period. This gives us a more accurate measure of "how long it takes from code commit to code running in the environment".
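
As a rough sketch of the commit-based approach, the snippet below averages one lead time per commit rather than one per deployment. The data layout (a deployment timestamp paired with the commit timestamps it shipped) and the function name are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean

# Each entry: (deployed_at, [commit timestamps shipped by that deployment]).
# Hypothetical data for illustration only.
deployments = [
    (datetime(2021, 4, 1, 12), [datetime(2021, 4, 1, 8),
                                datetime(2021, 4, 1, 10),
                                datetime(2021, 4, 1, 11)]),
    (datetime(2021, 4, 2, 12), []),  # empty build: contributes nothing
    (datetime(2021, 4, 3, 12), [datetime(2021, 4, 2, 12)]),
]


def commit_based_mlt(deployments):
    """Mean of (deployment time - commit time) over every shipped commit."""
    lead_times = [
        (deployed_at - committed_at).total_seconds()
        for deployed_at, commits in deployments
        for committed_at in commits
    ]
    if not lead_times:
        return None  # no commits were shipped, so there is nothing to average
    return timedelta(seconds=mean(lead_times))


mlt = commit_based_mlt(deployments)
print(mlt)  # mean of 4h, 2h, 1h and 24h
```

Because the empty build adds no entries to `lead_times`, it cannot move the mean in either direction.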

Q: How do you compute "delayed deployments"? Do "delayed deployments" have a negative impact on the MLT value?

A: When computing the MLT value, commits are selected based on the time range in which the deployment occurs, not the time at which the build was triggered.
i.e. if the deployment occurs within the selected range, its lead time is computed; otherwise it is not.
There's one special case though: imagine we have two deployments, where deployment #1 happens before deployment #2. If we trace them back to their respective build times, however, it is possible that build #1 actually occurred after build #2.

[Figure: calculate_MLT_for_selected_time_range]
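
The selection rule for delayed deployments can be illustrated like this. It is only a sketch: the tuple layout and the half-open range `[start, end)` are assumptions, and the out-of-order special case mentioned above is not handled here.

```python
from datetime import datetime

# (build_triggered_at, deployed_at, [commit timestamps]); hypothetical records.
builds = [
    # Triggered before the range but deployed inside it: a delayed deployment.
    (datetime(2021, 3, 30, 9), datetime(2021, 4, 1, 9), [datetime(2021, 3, 30, 8)]),
    (datetime(2021, 4, 2, 9), datetime(2021, 4, 2, 10), [datetime(2021, 4, 2, 8)]),
    # Triggered inside the range but deployed after it ends.
    (datetime(2021, 4, 6, 9), datetime(2021, 4, 8, 9), [datetime(2021, 4, 6, 8)]),
]


def lead_times_in_range(builds, start, end):
    """Lead times in hours for commits whose deployment falls in [start, end)."""
    return [
        (deployed_at - committed_at).total_seconds() / 3600
        for _triggered_at, deployed_at, commits in builds
        if start <= deployed_at < end  # filter on deployment time only
        for committed_at in commits
    ]


hours = lead_times_in_range(builds, datetime(2021, 4, 1), datetime(2021, 4, 8))
print(hours)
```

The first record is included with its full lead time, delay and all, while the third is left for whichever range contains its deployment.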

Q: Is story cycle time included in the lead time computation?

A: No, story cycle time is not included. Based on the data fetched from CD tools, we evaluate the deployment lead time rather than the story lead time, i.e. the time taken from code being committed until code being shipped to the target environment. There are other tools on the market, like Jira, that calculate other useful metrics such as story card lead time, bug card percentage, or rolling averages.

How to compute CFR, Change Failure Rate:

Q: How do you classify a failure from the CI? A build failure? A deployment failure? Or how do you know that the actual deployed release failed to deliver value?

A: For errors that happen after the deployment, it is difficult to determine whether the error was introduced by the released code or by something else, e.g. an infrastructure failure or a pre-existing bug. Such cases require some manual work. At this stage we want to rely on as little manual work as possible, so we have chosen the pipeline as the only data source. In the future, to make the statistics more accurate, user input or Jira/GitHub issue integration could be considered.

Q: Is "an aborted build" a failed build? Should it be included in CFR?

A: No. For each failure we also want to calculate its TTR value, and an aborted build doesn't necessarily represent a failure that the team wants to fix.

Q: When calculating CFR, besides the failed executions of the selected stage/workflow, should we also count the ones that failed at preceding stages? For example, at the build stage.

A: No. Errors that occur in different stages can be visualized separately, based on the stages the user selects in the chart filters.
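
A minimal sketch of a CFR calculation consistent with the answers above. The page does not spell out the formula, so the ratio used here (failed runs divided by successful plus failed runs of the selected stage) is an assumption, as are the status strings and the function name.

```python
from collections import Counter

# Status of each run of the selected stage only (hypothetical data).
# Runs that failed at a preceding stage never reach this list.
statuses = ["success", "failed", "aborted", "success",
            "success", "failed", "success"]


def change_failure_rate(statuses):
    """Failed runs as a fraction of completed (successful or failed) runs."""
    counts = Counter(statuses)
    completed = counts["success"] + counts["failed"]  # aborted runs are ignored
    if completed == 0:
        return None  # nothing completed, so the rate is undefined
    return counts["failed"] / completed


cfr = change_failure_rate(statuses)
print(cfr)  # 2 failed out of 6 completed runs
```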

How to compute MTTR, Mean Time to Restore Service:

Q: For consecutive failures, do we measure the TTR for all of them, or for the first failure only?

A: For consecutive failures, we measure the TTR value for the first failure only,
i.e. the time between the first failure and the first fix.
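
The "first failure to first fix" rule might be sketched as below. The run records and status strings are hypothetical, and treating a failure that is still unresolved at the end of the data as not counted is an assumption of this example, not something the page states.

```python
from datetime import datetime, timedelta

# (finished_at, status) for the selected stage; hypothetical data.
runs = [
    (datetime(2021, 4, 1, 9), "success"),
    (datetime(2021, 4, 1, 10), "failed"),   # first failure: clock starts
    (datetime(2021, 4, 1, 11), "failed"),   # consecutive failure: ignored
    (datetime(2021, 4, 1, 13), "success"),  # first fix: TTR is 3 hours
    (datetime(2021, 4, 2, 9), "failed"),
    (datetime(2021, 4, 2, 10), "success"),  # TTR is 1 hour
]


def mean_time_to_restore(runs):
    """Mean time from the first failure of each streak to the next success."""
    restore_times = []
    failure_started_at = None
    for finished_at, status in sorted(runs):
        if status == "failed" and failure_started_at is None:
            failure_started_at = finished_at
        elif status == "success" and failure_started_at is not None:
            restore_times.append(finished_at - failure_started_at)
            failure_started_at = None
    if not restore_times:
        return None  # no failure was restored in this data
    return sum(restore_times, timedelta()) / len(restore_times)


mttr = mean_time_to_restore(runs)
print(mttr)  # mean of 3 hours and 1 hour
```

Any other status, such as an aborted run, falls through both branches and neither starts nor stops the clock.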