
Allow setting the metric batch exporter early #11545

Merged: 3 commits merged into main from set-batch-processor on Sep 5, 2023

Conversation

dmathieu (Member)

Motivation/summary

We need a batch processor to set up the local metrics exporter, but we also need a metrics provider before we have a processor.
So this change allows setting up the metrics provider very early and setting the batch processor only later on.

Checklist

For functional changes, consider:

  • Is it observable through the addition of either logging or metrics?
  • Is its use being published in telemetry to enable product improvement?
  • Have system tests been added to avoid regression?

How to test these changes

This is unit-tested.

Related issues

@dmathieu dmathieu requested a review from a team as a code owner August 31, 2023 13:02
@dmathieu dmathieu added the backport-skip Skip notification from the automated backport with mergify label Aug 31, 2023
dmathieu (Member Author) commented Sep 4, 2023

Ping @elastic/apm-server for review.

@@ -67,6 +71,10 @@ func (e *MetricExporter) Aggregation(k metric.InstrumentKind) metric.Aggregation
 }

 func (e *MetricExporter) Export(ctx context.Context, rm *metricdata.ResourceMetrics) error {
+	if e.processor == nil {
+		return nil
+	}
Member:

question: is it considered a valid state to run the exporter without a processor? If not, we should return an error here.

Member Author (dmathieu):

We may then end up with errors at boot time just because one export happened before the batch processor was set up. I don't think that's good behavior either.

Member:

Sorry, I might be missing some context. If we end up with errors for early exports, why are we creating the exporter early?

Member Author (dmathieu):

I meant we may end up with early errors if we don't allow the processor to be nil, hence this check.
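
To make the behavior under discussion concrete, here is a minimal sketch of the guard, following the names in the diffs; the conversion logic is elided and the exact apm-server implementation may differ:

```go
// Sketch only: Export silently drops metrics while no batch processor has
// been set, instead of failing exports that happen before full boot.
func (e *MetricExporter) Export(ctx context.Context, rm *metricdata.ResourceMetrics) error {
	if e.processor == nil {
		// No processor yet: ignore the data rather than surface boot-time errors.
		return nil
	}
	// ... convert rm into a modelpb.Batch and hand it to e.processor ...
	return nil
}

// SetBatchProcessor wires in the processor once the rest of boot has created it.
func (e *MetricExporter) SetBatchProcessor(p modelpb.BatchProcessor) {
	e.processor = p
}
```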

@@ -34,12 +34,11 @@ import (
 // NewMetricExporter initializes a new MetricExporter
 // This export logic is heavily inspired from the OTLP input in apm-data.
 // https://github.com/elastic/apm-data/blob/main/input/otlp/metrics.go
-func NewMetricExporter(p modelpb.BatchProcessor, opts ...ConfigOption) *MetricExporter {
+func NewMetricExporter(opts ...ConfigOption) *MetricExporter {
Member:

question: tied to the question below, but if the processor is required we should leave it as a parameter of the constructor instead of moving it to the config.

Member Author (dmathieu):

Then it goes back to the issue we currently have: we cannot create the metrics exporter (and therefore get a valid meter) until we have a batch processor. But we also need a meter earlier, when we create the middlewares/interceptors.

We can't add a new exporter to an existing meter provider either. So this PR makes the batch processor optional, which lets us configure telemetry early and extend it once we have fully booted.
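
A hedged sketch of the functional-options shape this implies; field and option names here are illustrative (WithBatchProcessor is assumed from the follow-up commit message), not a verbatim copy of the apm-server code:

```go
// config holds the optional exporter settings, including the batch processor.
type config struct {
	processor modelpb.BatchProcessor
	// ... other exporter settings ...
}

// ConfigOption mutates the exporter configuration.
type ConfigOption func(*config)

// WithBatchProcessor sets the processor at construction time, for callers
// that already have one.
func WithBatchProcessor(p modelpb.BatchProcessor) ConfigOption {
	return func(c *config) { c.processor = p }
}

// NewMetricExporter no longer requires a processor: it may stay nil until
// SetBatchProcessor is called, during which time Export is a no-op.
func NewMetricExporter(opts ...ConfigOption) *MetricExporter {
	var cfg config
	for _, opt := range opts {
		opt(&cfg)
	}
	return &MetricExporter{processor: cfg.processor}
}
```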

@@ -293,7 +301,8 @@ func TestMetricExporter(t *testing.T) {
 		batch = append(batch, (*b)...)
 		return nil
 	})
-	e := NewMetricExporter(p, tt.exporterConfig...)
+	e := NewMetricExporter(tt.exporterConfig...)
+	e.SetBatchProcessor(p)
Member:

suggestion: I think we can assign the batch processor to the field in the exporterConfig and avoid the p variable and the additional call. WDYT?

Member Author (dmathieu) commented Sep 5, 2023:

Sure, though it means we have to append to tt.exporterConfig.
Also, this is a test, so an additional call isn't really problematic.
And doing so also means SetBatchProcessor's behavior gets exercised by an additional test.
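
For illustration, the reviewer's suggestion would roughly look like this in the test loop (assuming a WithBatchProcessor config option; this is a sketch, not the merged code):

```go
// Pass the processor through the exporter config instead of a separate call.
e := NewMetricExporter(append(tt.exporterConfig, WithBatchProcessor(p))...)
```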

dmathieu (Member Author) commented Sep 5, 2023

@kruskall and I had a quick Zoom call about this.
As a quick written recap: this is needed because we need to set a global MeterProvider before the HTTP and gRPC servers are created, or those middlewares would get the default/noop one.
At the same time, we can't set the local exporter's batch processor until we have one.

We could move everything around to ensure things are started in exactly the right order, but that would be brittle.

Instead, this change lets us remain order-agnostic: we set up telemetry as early as possible, and the local exporter simply ignores exports until a batch processor is set.
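
As an illustration of that ordering, a minimal sketch; the server constructors and the batchProcessor variable are placeholders, the OTel calls come from go.opentelemetry.io/otel and go.opentelemetry.io/otel/sdk/metric (imported as sdkmetric), and the exporter API follows the diffs above:

```go
// 1. Early in boot: install a real global MeterProvider backed by the local
//    exporter, even though no batch processor exists yet.
exporter := NewMetricExporter()
provider := sdkmetric.NewMeterProvider(
	sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exporter)),
)
otel.SetMeterProvider(provider)

// 2. Create the HTTP and gRPC servers; their middlewares/interceptors now pick
//    up the real MeterProvider instead of the noop default. (Placeholders.)
httpSrv := newHTTPServer()
grpcSrv := newGRPCServer()

// 3. Later, once the batch processor has been built, wire it in. Until this
//    point the local exporter's Export calls return nil and drop the data.
exporter.SetBatchProcessor(batchProcessor)
```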

@dmathieu dmathieu enabled auto-merge (squash) September 5, 2023 14:16
@dmathieu dmathieu merged commit 3216c04 into main Sep 5, 2023
7 checks passed
@dmathieu dmathieu deleted the set-batch-processor branch September 5, 2023 14:30
bmorelli25 pushed a commit to bmorelli25/apm-server that referenced this pull request Sep 5, 2023
* allow setting the metric batch exporter early

* use WithBatchProcessor in tests
Labels
backport-skip (Skip notification from the automated backport with mergify), test-plan-skip, v8.11.0
4 participants