-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding AgentHealth Middleware to Exporters #266
Conversation
retryCnt: *awsConfig.MaxRetries, | ||
collectorID: collectorIdentifier.String(), | ||
retryCnt: config.AWSSessionSettings.MaxRetries, | ||
collectorID: uuid.New().String(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change from uuid.NewRandom()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it back to uuid.NewRandom().
emf.pusherMapLock.Lock() | ||
defer emf.pusherMapLock.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good find
sender = registry.Register(set.ID, cfg.TelemetryConfig, xrayClient, opts...) | ||
} | ||
|
||
var xrayClient awsxray.XRayClient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super familiar with this exporter, but is it possible that that the reference to xrayClient on line 78 could happen before the initialization in the exporthelper.WithStart
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So based off the BaseExporter struct and the Start funciton of the Base exporter, the Start function is called before the exporter begins processing data. I believe this is a standard practice in the OpenTelemetry Collector's component lifecycle.
// baseExporter contains common fields between different exporter types.
type baseExporter struct {
component.StartFunc
component.ShutdownFunc
signal component.DataType
batchMergeFunc exporterbatcher.BatchMergeFunc[Request]
batchMergeSplitfunc exporterbatcher.BatchMergeSplitFunc[Request]
marshaler exporterqueue.Marshaler[Request]
unmarshaler exporterqueue.Unmarshaler[Request]
set exporter.Settings
obsrep *ObsReport
// Message for the user to be added with an export failure message.
exportFailureMessage string
// Chain of senders that the exporter helper applies before passing the data to the actual exporter.
// The data is handled by each sender in the respective order starting from the queueSender.
// Most of the senders are optional, and initialized with a no-op path-through sender.
batchSender requestSender
queueSender requestSender
obsrepSender requestSender
retrySender requestSender
timeoutSender *timeoutSender // timeoutSender is always initialized.
consumerOptions []consumer.Option
}
func (be *baseExporter) Start(ctx context.Context, host component.Host) error {
// First start the wrapped exporter.
if err := be.StartFunc.Start(ctx, host); err != nil {
return err
}
// If no error then start the batchSender.
if err := be.batchSender.Start(ctx, host); err != nil {
return err
}
// Last start the queueSender.
return be.queueSender.Start(ctx, host)
}
@@ -34,6 +34,7 @@ type cwlExporter struct { | |||
collectorID string | |||
svcStructuredLog *cwlogs.Client | |||
pusherFactory cwlogs.MultiStreamPusherFactory | |||
params exp.Settings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we name this something like exporter settings?
@@ -39,6 +39,7 @@ type emfExporter struct { | |||
pusherMap map[cwlogs.StreamKey]cwlogs.Pusher | |||
svcStructuredLog *cwlogs.Client | |||
config *Config | |||
set exporter.Settings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we name this settings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-blocking comments
Description of the Issue
The CloudWatch Agent lacked an
agenthealth
configuration to monitor API health by tracking HTTP status codes for specific responses. Monitoring the status codes (200
,400
,408
,419
, and429
) across all APIs is critical for diagnosing issues and ensuring comprehensive observability of API behaviors. Without this configuration, users were unable to quickly identify trends in API health metrics or correlate specific status codes with performance issues.Changes Made
This change adds agent health to the emf_exporter and awscloudwatchlogs exporter. AwsXrayExporter already has handlers being added.
Added
agenthealth
Configuration:200
: Success400
: Bad Request408
: Request Timeout419
: Authentication Timeout429
: Too Many RequestsUpdated CloudWatch Agent Configuration JSON:
agenthealth
metrics.Validated the Configuration:
Impact
Testing
200
,400
,408
,419
, and429
are captured in the header as you can see from the image below: