Revamp the end-of-test summary #4089
base: master
Conversation
Co-authored-by: oleiade <theo@crevon.me>
I did a pass on it, but will need a couple more for sure; commenting just to signal that I'm on it.
Sure, thanks! Take your time!
}

// NewSummary instantiates a new empty Summary.
func NewSummary() *Summary {
Out of curiosity, why have we decided to use the empty summary constructor? I mean, why don't we require the mandatory values through the constructor (and maybe even apply validation)?
I don't think I have a concrete answer, to be honest. I think the reason I followed this approach is that it's mostly a DTO, and concretely a recursive one, so it felt easier to initialize it empty, like when you initialize a map, and populate it as you go, instead of asking for the inner data in the constructor.
Most of the logic is on the summary.Output side, but I preferred not to couple both, even if that's the main use, at least for now.
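A minimal sketch of the empty-constructor pattern described here, using a hypothetical, simplified recursive DTO (the names Group/NewGroup are illustrative, not the actual k6 types):

```go
package main

import "fmt"

// Group is a hypothetical, simplified stand-in for the recursive summary DTO
// discussed above: each node holds metrics and nested sub-groups.
type Group struct {
	Metrics map[string]float64
	Groups  map[string]*Group
}

// NewGroup mirrors the "empty constructor" approach: initialize the inner maps
// so callers can populate the tree as data arrives, like with a plain map.
func NewGroup() *Group {
	return &Group{
		Metrics: make(map[string]float64),
		Groups:  make(map[string]*Group),
	}
}

func main() {
	root := NewGroup()
	root.Groups["checks"] = NewGroup()
	root.Groups["checks"].Metrics["checks_total"] = 42
	fmt.Println(len(root.Groups))
}
```

The trade-off is the one named in the question above: nothing forces callers to provide mandatory values, so validation has to happen elsewhere.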
Co-authored-by: Oleg Bespalov <oleg.bespalov@grafana.com>
The key difference between Do you have any other suggestion to differentiate among them? To be fully transparent, I don't have a strong opinion here, but I'd advocate to either make a change we all fully agree on, or move forward as-is, to avoid looping in cycles. As far as I know, we offer no guarantee on the text summary format, so it should be fine to iterate on it in the near future if needed. The one shipped as part of this PR doesn't need to be the definitive one before and throughout v1.
@joanlopez like I said internally, I'm totally fine with proceeding as it is, with one exception: it's worth adjusting the text when the documentation lands.
IMO it's too generic and vague, and in my case it was a source of confusion, since I expected to see the more valuable results from the beginning.
Nice, thanks for your input @olegbespalov!
Hey @joanlopez @olegbespalov 👋🏻 Apologies for the delay and for being blocking here. I overlooked the compact vs full list of metrics during the last phases of the design, but doing a bit of digging into some of our initial design docs, I found that we had come up with a candidate list of metrics to exclude in compact mode (and include in full/extended mode):
The rationale there being that we wanted to show only what is relevant to the vast majority of users in compact mode, and only bring the rest back when users explicitly request the full mode. In general, I remember that we discussed focusing on removing, by default, things that would only be relevant to a very small portion of users or to very specific use-cases. At the time we outlined only HTTP metrics, because that's where we thought most of those cases were, but I think we should feel free to expand that list if we consider other metrics "not absolutely mandatory". I also agree with @olegbespalov that we are explicitly excluding the end-of-test summary from the v1.Y.Z support policy, and we should feel free to iterate on it in the future. So we don't have to block on this if there are diverging ideas about what the list of included/excluded metrics should be, for instance: even if we kept the list at what it is now, that would be 👍🏻 for me 🙇🏻 Hope that's helpful, and again, great work @joanlopez, I love it ❤️ 🚀
LGTM! 🚀
# Ignore everything in this folder
*

# But the end-of-test summary file
!summary.js
I am not certain whether this was intended to be committed or not, but IMO all of these configurations for eslint and prettier should either:
- not exist, or
- be project-wide.
This currently seems a bit ad hoc, and I am not certain how common it will be for them to be updated.
The original idea with the JS summary code is that it lives in jslib, and k6 just gets the artefact and puts it here.
This allows us to skip having these checks in the k6 repo, and to have them in one place where they can be used or combined with others by users in scripts.
@oleiade I think it's you who added it. Any preference?
I guess the reason here is that DX is a bit worse, because you normally use both in combination (k6+summary), even if we distribute the summary code as a jslib. But it's also true that in theory we shouldn't be changing it very often, maybe it's just fine to keep everything in the jslib repo and just copy files back and forth when needed.
Hey folks, this is indeed a legacy artifact left behind when I rewrote the summary.js file completely. During the refactor, I wanted to address most linting issues, to catch potential pre-existing problems before I modified it, and to ensure the style was consistent across the whole file at the very least.
If we move the JS code to a jslib, then I'd be completely fine with moving the configuration there indeed 👍🏻
If we keep it in the repository, I would then still prefer for us to have an eslint/prettier setup in some fashion; project-wide, as suggested, would be ideal for me, but I acknowledge there might be a lot of special cases/ignores to handle.
I have no strong opinion either way, but I would prefer not to have to copy-paste eslint/prettier configurations around whenever I need to touch the code ^^
Still not done with the review, but I would like to try to do it in parts, so posting this for now.
Very happy with this being an output and still supporting handleSummary.
I do have a bunch of comments around the review of this; I'm not certain whether we should work on them or just keep them for future reference:
- it would've been nice for some of the unrelated refactoring to have been separate, as it now bumps up the line numbers, and GitHub is not great at figuring out what actually changed.
- I would've preferred if the old summary had stayed in the same file and the new one had a new name. I did the same thing a few days ago with the k6/timers PR, and it definitely isn't great to have to figure out whether changes were made or things were just renamed. Given the code changes here, it is even harder.
My main problems are:
- reusing metrics.Sink: I am even okay with just copy-pasting it into the output and using it, but given this is likely to be (re)moved soon to make metrics immutable, I really would prefer not to have to make changes to this later on. This is also probably a good chance to use HDR for the Trend summary, doing it now instead of later and making another breaking change.
- the current code depends on the summary knowing the exact names of the metrics it wants to group. This couples it to the names, means we will have to keep editing it, and, probably worse, it will not work very well with extensions.
flags.String("with-summary", lib.SummaryModeCompact.String(), "determine the summary mode,"+
	" \"compact\", \"full\" or \"legacy\"")
nit: not certain about the name with-summary; it doesn't really sound like any other option we have. I would expect this to be either summary or summary-mode or something like this.
Is there a particular reason for the with- part?
I think the initial suggestion was from @oleiade, but as far as I know it's not set in stone.
I'm happy with --summary-mode, wdyt? 🤔
It was the original suggestion indeed, but it is not set in stone. I'm not completely hung up on --with-summary, but I proposed it because I like how it is close to natural language and somewhat guessable: "run k6 with its summary in full mode".
But like I said, I really don't have a strong opinion on this one; I'd be fine with --summary-mode too. What I care about is that summary-related options are consistent; thus if we go for --summary-mode, when we revamp the machine-readable summary, we should make sure the option also keeps the --summary-* prefix 🙇🏻
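For illustration only, here is how the discussed --summary-mode spelling could be registered and validated. This sketch uses the standard library's flag package, whereas k6 registers its flags differently, so the wiring here is an assumption:

```go
package main

import (
	"flag"
	"fmt"
)

// validSummaryMode reports whether the value is one of the three modes
// discussed in this PR.
func validSummaryMode(m string) bool {
	switch m {
	case "compact", "full", "legacy":
		return true
	}
	return false
}

func main() {
	// Illustrative registration of the renamed option; k6 itself uses a
	// different flag library and defaults via lib.SummaryModeCompact.
	mode := flag.String("summary-mode", "compact",
		`determine the summary mode, "compact", "full" or "legacy"`)
	flag.Parse()

	if !validSummaryMode(*mode) {
		fmt.Printf("invalid summary mode: %q\n", *mode)
		return
	}
	fmt.Println("summary mode:", *mode)
}
```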
if envVar, ok := environment["K6_SUMMARY_EXPORT"]; !opts.SummaryExport.Valid && ok {
	opts.SummaryExport = null.StringFrom(envVar)
Some of those refactorings would've been nicer as a separate PR, so that we don't mix them up here.
I can try that now, yeah!
}
}
}
/**
for code review reasons it would've been easier if the files weren't switched yet but done in a separate PR; as it is, I do not know whether the other code was changed and how much, and this diff here is pointless
Sorry, I'm not sure I can follow you.
Do you mean that it would have been better to refactor the old code first, and only then apply the changes required for the new end-of-test summary?
If so, I may agree, but I think @oleiade, who did a great job on this refactoring, did both mostly in a single pass.
But it may help to have a look at the related iterative commits: 5d30088, b80a09c, b2b5823, b80a09c, b2b5823, 4153ba7, af4d6f4, 8b1da1d, 126a188, and 63d89e0.
Although not ideal, I know 🙏🏻
output/summary/data.go
Outdated
case isSkippedMetric(summaryMode, info.Name):
	// Do nothing, just skip.
case isHTTPMetric(info.Name):
	dest.HTTP[info.Name] = summaryMetric
case isExecutionMetric(info.Name):
	dest.Execution[info.Name] = summaryMetric
case isNetworkMetric(info.Name):
	dest.Network[info.Name] = summaryMetric
case isBrowserMetric(info.Name):
	dest.Browser[info.Name] = summaryMetric
case isGrpcMetric(info.Name):
	dest.Grpc[info.Name] = summaryMetric
case isWebSocketsMetric(info.Name):
	dest.WebSocket[info.Name] = summaryMetric
case isWebVitalsMetric(info.Name):
	dest.WebVitals[info.Name] = summaryMetric
default:
	dest.Custom[info.Name] = summaryMetric
This makes the summary very much coupled to which modules are in core and what their metrics are, in a way that means it will need to be updated whenever we add things, and anything outside of the core modules will go to custom.
So in the future, when/if we have more and more modules as extensions, or even move some out of here, this will either have to stop working or will need some different way to achieve the same thing.
Have other approaches been considered?
We haven't considered any other approach for this iteration. And I'm not sure there's any without extending the current metrics API, to register new metrics with some sort of common/group identifier.
Do you have any particular idea? cc/ @oleiade
I think at the time we looked into various options. But the only way that allowed us to move forward and ship this in reasonable time was to do it this way, as, just like you mention @joanlopez, at least currently we couldn't really think of another way to do it that wouldn't involve a big, scary change.
To be fair, I also think the tradeoff there is limited, in the sense that, everything being equal otherwise, we quite rarely add official metrics. Moving forward, I suggest we open an issue and tackle it separately, in the future. But if there's another somewhat quick and achievable solution that we can get in in time for v1, let's go for it 👍🏻
Can you open the issue and please list the options that were big and scary? I also do not have a particular proposal, but it will be nice to have them out there.
Arguably we do not add a lot of official metrics, in an attempt to reduce the things we have to support forever. One of the ideas with v1 is that in v2, v3 and so on, we will be making breaking changes more easily, without necessarily having a transition period for every breaking change.
So there is a good chance that we will add a lot more metrics.
Also, even if we do not add new official metrics, everybody using extensions more (another v1 goal) will get custom metrics for everything from those extensions.
I do not think we will be fixing this perfectly either way, but I would still prefer to have at least an artifact of what the reasons were and what other things we considered.
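One alternative hinted at in this thread can be sketched as metrics declaring a category at registration time, so the summary no longer hard-codes metric names and extension metrics can group themselves. All names below are hypothetical, not the actual k6 metrics API:

```go
package main

import "fmt"

// Category labels a metric for summary grouping.
type Category string

const (
	CategoryHTTP   Category = "http"
	CategoryCustom Category = "custom"
)

// Registry records each metric's declared category.
type Registry struct {
	categories map[string]Category
}

func NewRegistry() *Registry {
	return &Registry{categories: make(map[string]Category)}
}

// Register associates a metric name with a category, as a core module or an
// extension would do when defining its metrics.
func (r *Registry) Register(name string, cat Category) {
	r.categories[name] = cat
}

// CategoryOf falls back to "custom" for unregistered metrics, matching the
// default branch of the current name-based switch.
func (r *Registry) CategoryOf(name string) Category {
	if c, ok := r.categories[name]; ok {
		return c
	}
	return CategoryCustom
}

func main() {
	r := NewRegistry()
	r.Register("http_req_duration", CategoryHTTP)
	fmt.Println(r.CategoryOf("http_req_duration"), r.CategoryOf("my_ext_metric"))
}
```

This is the "extend the metrics API with a common/group identifier" direction mentioned above; the cost is exactly the scary part discussed: it touches metric registration itself.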
output/summary/data.go
Outdated
lib.SummaryMetricInfo{
	Name:     metricData.Metric.Name,
	Type:     metricData.Metric.Type.String(),
	Contains: metricData.Metric.Contains.String(),
},
metricData.Sink,
I am not particularly certain this should actually get the whole Metric instead of just some parts of it.
Do you mean that aggregatedMetric may already hold the lib.SummaryMetricInfo, which is a subset of the data held by the *metrics.Metric, and that we could store that instead?
If so, I agree, and I'm happy to make that change.
But could you confirm, please? If not, can you clarify what you mean? 🙏🏻
Yeah, basically metrics.Metric will get thinned out, so a lot of the stuff you currently don't use from it will be removed. So no real reason to copy stuff around, IMO.
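The thinning suggestion can be sketched as copying only the fields the summary actually reads into its own value type. The metric struct below is a simplified stand-in for metrics.Metric, not the real type:

```go
package main

import "fmt"

// metric is a simplified stand-in for the parts of metrics.Metric that the
// summary reads; the real struct carries much more (sinks, thresholds, etc.).
type metric struct {
	Name, Type, Contains string
}

// SummaryMetricInfo holds only what the summary needs, so the rest of the
// metric type can be thinned out later without touching this code.
type SummaryMetricInfo struct {
	Name, Type, Contains string
}

// infoFromMetric copies the needed fields once, instead of keeping a pointer
// to the whole metric.
func infoFromMetric(m metric) SummaryMetricInfo {
	return SummaryMetricInfo{Name: m.Name, Type: m.Type, Contains: m.Contains}
}

func main() {
	info := infoFromMetric(metric{Name: "http_reqs", Type: "counter", Contains: "default"})
	fmt.Println(info.Name, info.Type, info.Contains)
}
```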
output/summary/data.go
Outdated
// FIXME (@joan): rename this to make it explicit this is different from an actual k6 metric, and this is used
// only to keep an aggregated view of specific metric-check-group-scenario-thresholds set of values.
type aggregatedMetric struct {
	// FIXME (@joan): Drop this and replace it with a concrete copy of the metric data we want to track
	// to avoid any potential confusion.
	Metric *metrics.Metric

	// FIXME (@joan): Introduce our own way of tracking thresholds, and whether they're crossed or not.
	// Without relying on the internal submetrics the engine maintains specifically for thresholds.
	// Thresholds []OurThreshold // { crossed: boolean }

	Sink metrics.Sink
}
@joanlopez I would really prefer if you do not use metrics.Sink, as it is hopefully to be removed before v1.0.0 #3907
In the past it has been the source of many races, as people do not realize there is no locking on it. The sinks are also mostly designed to serve the current summary, thresholds, and k6 stats, all of which have slightly different requirements, which makes the code very clunky and arguably not great. #2320
This also means that this summary still makes Trends collect each and every sample value that has been emitted, which is a thing we could fix now, instead of adding another breaking change later. Maybe in a separate PR, if that is a problem. See #763 and #3213 for a possible fix for the summary, and https://github.com/grafana/k6/blob/master/output/cloud/expv2/hdr.go for the one currently used in the cloud output.
Sorry, I don't fully get what the suggestion is here.
To "duplicate" the metrics.Sink values locally (in the summary) and use HDR in that case, to avoid that duplication being painful with Trend metrics?
But don't I need my own Sink instance (or similar) to keep values locally, or to keep the reference to do something like the Drain-ing implemented in the PR you shared above? 🤔
If so, I'm happy doing such a change, as long as we're fine with the loss of precision, or whatever implications HDR histograms may have on the summary results.
And in that case, why don't we change the metrics.Sink implementation itself? It's part of the public API, but since the struct fields are private, I guess we could keep compatibility on all the public methods while using HDR under the hood? 🤔
Following on from the end of my last comment, and re-reading your comment here (below):
- reusing metrics.Sink: I am even okay with just copy-pasting it into the output and using it, but given this is likely to be (re)moved soon to make metrics immutable, I really would prefer not to have to make changes to this later on. This is also probably a good chance to use HDR for the Trend summary, doing it now instead of later and making another breaking change.
It looks like you're suggesting precisely to re-implement TrendSink with HDR histograms? Do I read that correctly? I can do that in a separate PR, if that's what we want, yeah. Can you confirm, @mstoykov, please?
metrics.Sink is very likely going to stay in some capacity, as the internal implementation for the current summary, the thresholds, and so on.
The idea, though, is that it will not be part of the metrics.Metric struct, and likely not part of the metrics package.
Ideally it will be replaced with something more ergonomic for each use case it has, instead of trying to be useful in all contexts.
As such, it is preferable if this new summary just has a new struct to actually keep the aggregated data, one that doesn't add to those cases.
Moving to HDR for Trend will also be nice: we keep talking about how that will happen, but it is a breaking change, so we keep postponing it. Now that we are making one breaking change, we can piggyback this in.
Honestly, now that Mihail brought it up, I must share that I have the same feeling. That's why I was going to suggest putting the classification methods under the metrics package, to at least show future contributors that this classification exists and should be considered. However, I'm not sure where else it could be relevant 🤷, so that's why I went with an existing approach.
Can you expand a bit on what you mean by that, please? 🤔
I think that's the main reason why I haven't explored any other option (none really convinced me).
I meant that these functions, which are currently internal, could potentially be part of the metrics package (along with the metric categories). If future contributors want to introduce new metrics, my assumption is that they would then also realize they need to care about the metric category, while the summary logic would remain unchanged, or at least be less affected by changes.
👍 yes, I totally accept it, just wanted to share my perspective
Overview
This pull request replaces the current end-of-test summary with a new design, with two available formats, aiming to bring clearer and more valuable results to users. Find a screenshot below.
User-facing details
- compact summary with --with-summary=compact, or with no argument, as it is the default choice.
- full summary with --with-summary=full.
- legacy summary with --with-summary=legacy.
- The handleSummary function is now different, so users relying on it must migrate their implementation, or use --with-summary=legacy meanwhile.
Technical details (for review purposes)
- An output.Output named summary.
- internal/cmd/testdata/summary/... with different scenarios, groups, thresholds, custom metrics and whatnot, which can be used for both automated and manual testing. If you think anything is missing, just suggest it.
- lib.Summary vs lib.LegacySummary, to keep support for the legacy summary for some time, in an easy way, so things aren't mixed in complexity and the future cleanup is simpler: just remove that type and all references to it, and simplify the few conditionals that behave differently depending on which summary type is provided.
- summary-legacy.js, for simpler cleanup whenever we remove that support, which I guess might be for v2 (once we ship the formalized JSON output format within v1).
Internal Checklist
Before review readiness
- Decide whether we want lib.Report as the new API for custom handleSummary (note this would be a breaking change), or if we want to ship this progressively.
- Adjust js/summary.go according to that decision.
- js/summary.js: decide whether to keep both summarizeMetrics and summarizeMetricsWithThresholds, or just replace the first with the second one (as we may no longer need the first one if we remove the old summary code).
- output/summary package:
- playground/full-summary files, or define a proper location for them.
--- Ideas left for a second iteration ---
General
- I have run the linter locally (make lint) and all checks pass.
- I have run tests locally (make tests) and all tests pass.
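The three summary modes described above could be modeled as a string-backed enum with a parser that defaults to compact; the names mirror the flag values, but the types below are illustrative, not the actual lib types:

```go
package main

import "fmt"

// SummaryMode is an illustrative string-backed enum for the three modes.
type SummaryMode string

const (
	SummaryModeCompact SummaryMode = "compact"
	SummaryModeFull    SummaryMode = "full"
	SummaryModeLegacy  SummaryMode = "legacy"
)

// ParseSummaryMode defaults to compact for an empty value, matching the
// "no argument" behaviour described above.
func ParseSummaryMode(s string) (SummaryMode, error) {
	switch s {
	case "":
		return SummaryModeCompact, nil
	case "compact", "full", "legacy":
		return SummaryMode(s), nil
	}
	return "", fmt.Errorf("invalid summary mode: %q", s)
}

func main() {
	m, err := ParseSummaryMode("")
	fmt.Println(m, err)
}
```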