Use the registry for (sub-)metric validation and move data crunching out of the Engine
#2426
Conversation
The xk6 test is failing because of this: #2428
core/engine.go
Outdated
```go
for metricName, thresholds := range e.options.Thresholds {
	metric, err := e.getOrInitPotentialSubmetric(metricName)

	if e.runtimeOptions.NoThresholds.Bool {
```
Shouldn't this skip the whole parsing either way, regardless of error?
Hmm, I don't think so, given that thresholds are basically the only way to currently create sub-metrics for the end-of-test summary, and we can have `--no-thresholds` without also having `--no-summary`. So, if `--no-thresholds` is enabled, I'm following the previously established logic that we shouldn't fail the test for wrongly configured thresholds, but I'm still trying to parse them for sub-metrics and just emit any errors as warnings.
core/engine.go
Outdated
```go
e.Metrics[m.Name] = m
m.Observed = true
```
I really dislike this new property on the `Metric` type. I would really prefer if we go back to the map.
I do think `Metric` already has too many responsibilities, and now adding knowledge of whether it was "observed" seems counterproductive.
We even still add it to the map, so... 🤷 Why not just keep it out of it?
I really dislike it as well, I want to completely split apart the current `Metric` into at least two parts:
- The first part, which should be the actual `Metric`, will be just the `Name`, `Type` and `Contains` - that should be the only thing that the `Registry` actually controls and cares about.
- A second, completely separate `struct` should be the `Sink` and `Observed`, and probably the threshold and submetric things. That will only be a concern of the `MetricsRegistry`. I don't think they should even be exported... In any case, the mapping from a `*Metric` to that second struct will happen only in the `MetricsRegistry`.
Unfortunately, until we have that separation, I kind of had to put `Observed` in the `Metric` struct, since that's where the `Sink` already was 😞 Doing otherwise without even more refactoring would cause us to have a map lookup for every sample of every metric that k6 generates... 😞 I intend to split this when I work on the time series refactoring.
BTW, this is what I mean by this TODO:

Lines 449 to 450 in f06d6f0

```go
// TODO: decouple the metrics from the sinks and thresholds... have them
// linked, but not in the same struct?
```
> ...have a map lookup for every sample of every metric that k6 generates...

Didn't we have the same up to this point as well?
Yes, we did, but we don't actually need to have that, so I guess you can consider it an optimization? 😅

I probably dislike the extra item in `Metric` even more than you, but with `Sink` already in there, this actually seems a better (temporary and transient) architecture to me, since all of this state is now kept in `Metric` and will be easy to refactor into a different struct all at the same time. Previously, the `Observed` state was implicitly tracked by that map in the `Engine`, while all of the other state was in every `Metric`...
If you want, I am fine with implementing my TODO above now, before we merge this PR, though I'd prefer to do it in a separate PR, after #2442 from @oleiade is also merged. It will leave the `metrics.Registry` as the only source of truth for metrics, so we'll easily be able to assign them sequential IDs and use those in the `MetricsEngine` without the need for a map.
@na-- the panic you posted is due to the test parallelization, not because of the code.
Ugh, it's a big one, but a good one 👍
In general it looks good; I left a few comments.
```go
}

if err != nil {
	return fmt.Errorf("invalid metric '%s' in threshold definitions: %w", metricName, err)
```
Maybe not for this PR, but what do you think about first trying to collect all invalid metrics and returning an error that contains all the wrong variations?
Some minor things
```go
// TODO: maybe even move the outputs to a sub-folder here? it may be worth it to
// do a new Output v2 implementation that uses channels and is more usable and
// easier to write? this way the old extensions can still work for a while, with
// an adapter and a deprecation notice
```
I think we should have this comment in the issue instead of having it here.
I kind of already have it in the issue too, see the end of #2430 (comment)
```go
e.ingester = me.GetIngester()
outputs = append(outputs, e.ingester)
```
Is it not possible to have this in the same place where we set the other outputs?
It may be, but I think it will actually be worse to do it that way. The other outputs receive a whole bunch of things this one doesn't need, while this one needs the `MetricsEngine`, something we create a few lines above here (or in `cmd/run.go` in the follow-up PR without an Engine). Besides, we add this output based on completely different criteria, only some of the time, based on RuntimeOptions values...
LGTM 🎉
A bunch of minor, non-blocking comments on my side. I do agree with the brought-up argument about `metrics.Observed`; I dislike it too, but I understand the arguments in (temporary) favor of it, and can live with it for now.
```go
func (me *MetricsEngine) getOrInitPotentialSubmetric(name string) (*stats.Metric, error) {
	// TODO: replace with strings.Cut after Go 1.18
	nameParts := strings.SplitN(name, "{", 2)
```
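For illustration, here is a rough sketch of what that TODO could look like once `strings.Cut` (available since Go 1.18) can be used. `splitMetricName` is a hypothetical standalone helper, not the actual k6 function:

```go
package main

import (
	"fmt"
	"strings"
)

// splitMetricName separates a metric name from its submetric tag filter,
// e.g. "http_req_duration{status:200}", using strings.Cut instead of
// strings.SplitN. The trailing "}" handling is simplified here.
func splitMetricName(name string) (metricName, filter string, isSubmetric bool) {
	metricName, rest, found := strings.Cut(name, "{")
	if !found {
		return metricName, "", false
	}
	return metricName, strings.TrimSuffix(rest, "}"), true
}

func main() {
	m, f, ok := splitMetricName("http_req_duration{status:200}")
	fmt.Println(m, f, ok) // http_req_duration status:200 true

	m, _, ok = splitMetricName("http_req_duration")
	fmt.Println(m, ok) // http_req_duration false
}
```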
I ended up doing something somewhat similar to extract the metric/submetric name for thresholds validation. I ended up using `strings.FieldsFunc`, which, I believe, could lead to somewhat clearer code here:

```go
delimiterFn := func(c rune) bool {
	return c == '{' || c == '}' || c == ':'
}
fmt.Printf("Fields are: %q\n", strings.FieldsFunc("metric_name{foo:bar}", delimiterFn))
```

The resulting length is either 1 (metric), or an odd value >= 3. No need to match the closing `}`, for instance. You could also drop the matching of `:` in your case, I believe.
Hope that's helpful 🙇🏻
Wouldn't that also accept `metric_name}foo{bar:` as valid? 😕
This is a prerequisite for solving other issues like always evaluating thresholds correctly, and as a side-benefit, it also allows us to validate them in the init context, before the test has started.
This allows us to slowly deconstruct and split apart the Engine. It also clears the way for us to have test suites, where every test has a separate pool of VUs and its own ExecutionScheduler.
🍾
This PR is the first part of #1889. It aims to prepare the `Engine` for complete annihilation 😅 I want to get completely rid of it and use its constituent parts separately instead, basically to almost completely decouple the test execution from the metrics handling.

However, the `Engine` has tons of tests that would have to be re-written. So this PR is an attempt to softly split the `Engine` by moving things around and defining and implementing its replacement components, while still keeping them under the same `Engine` facade. This way we can have a reasonably high assurance we haven't broken anything, since all of the old tests still work with minimal adjustment. Hopefully, this will allow us to merge things gradually, and not have to do it all at once.

Given the extensive refactoring here, there were a few opportunities to fix some bugs and side-issues along the way:
- Fix the bug of thresholds not working for unused metrics (e5d8c32): fixes "Metrics for which no data was recorded are not displayed or evaluated for Thresholds" (#1346)
- Fix submetric matching bug when nonexistent keys are specified (a3138b1): fixes "Custom metric threshold calculation using wrong statistics" (#2390)
- Move the Engine data crunching logic in a new component under metrics (2476cfc): goes a long way towards closing "Remove the `Engine` and split metrics/VUs handling" (#1889), since now the `MetricsEngine` gets its metrics like just another output, with almost no special handling when it comes to it 🎉

Overall, I am not sure if it will be easier to review this commit by commit or all at once. The commits are logically separated, but a lot of the same code is changed in multiple commits because of that 🤷‍♂️