receiver/prometheusreceiver: add option to fallback to collector starttime #36365

Open: 2 commits to be merged into base: main

ridwanmsharif
Contributor

Description

This change adds an option to the metric adjuster to use an approximation of the collector starttime as a fallback for the start time of scraped cumulative metrics. This is useful when no start time is found and when the collector starts up alongside its targets (like in serverless environments or sidecar approaches).

Link to tracking issue

Fixes #36364

Testing

Added unit test for this config option

Documentation

Config option added to the README.

@ridwanmsharif ridwanmsharif requested review from dashpole and a team as code owners November 14, 2024 02:37
@github-actions github-actions bot added the receiver/prometheus Prometheus receiver label Nov 14, 2024
@github-actions github-actions bot requested a review from Aneurysm9 November 14, 2024 02:38
@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif/starttime-fallback branch 2 times, most recently from 4ac0c41 to 05b85e8 Compare November 14, 2024 02:50
Signed-off-by: Ridwan Sharif <ridwanmsharif@google.com>
@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif/starttime-fallback branch from 05b85e8 to d67a526 Compare November 14, 2024 15:15
@ArthurSens
Member

The code itself looks correct!

To be completely honest, I'm pretty new to this component and haven't used it myself. I noticed we have waaaay too many fallbacks for Created Timestamps; that looks weird 🤔.

Do I understand correctly that the flow is like this:

  • If metric StartTimeUnixNano is set, use that to populate the Created Timestamp from the prometheus/client_golang SDK.
  • If StartTimeUnixNano is not set, get it from another metric called process_start_time_seconds (where does that come from?)
  • Finally, if we still don't have a timestamp, we use the collector start time.

Did I get the flow correctly?

It makes me wonder: when would an OpenTelemetry metric not have StartTimeUnixNano set? I understand this is not a required field in the spec, but maybe we could work on making it required instead?

@dashpole dashpole added the enhancement New feature or request label Nov 21, 2024
func NewStartTimeMetricAdjuster(logger *zap.Logger, startTimeMetricRegex *regexp.Regexp, useCollectorStartTimeFallback bool) MetricsAdjuster {
	var fallbackStartTime *time.Time
	if useCollectorStartTimeFallback {
		now := time.Now()
Contributor

Rather than assume that this is always called at/near start time of collector, should we use the github.com/shirou/gopsutil/v4/host package to request uptime like hostmetricsreceiver does for boottime?

Contributor

Is that correct in a containerized environment, or would it give the start time of the host?

Contributor

I don't know the answer to that question; it would need to be tested (I don't currently have capacity to test it myself).

func NewStartTimeMetricAdjuster(logger *zap.Logger, startTimeMetricRegex *regexp.Regexp, useCollectorStartTimeFallback bool) MetricsAdjuster {
	var fallbackStartTime *time.Time
	if useCollectorStartTimeFallback {
		now := time.Now()
Member

This won't be the collector start time, but the start time of this instance of the component. If the pipeline is stopped and restarted then this will result in a different timestamp even though the process has not restarted. This is a subtlety, but could be very confusing for someone using OpAMP to manage collector instances. I'm not sure whether there's a way to get a reliable process start time from the collector host. The system uptime is almost certainly not the correct value to use here. Perhaps the best option is just to clarify in the description of the new configuration field that the approximated start time will be relative to the component start time and not necessarily the collector start time.

Contributor

Can we populate it using an init function, or a variable outside of the metric adjuster?

Member

Populating a variable in an init function could work. It would be executed once near the start of the process.
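
For illustration, the init-function idea might look roughly like the sketch below. The names processStartApprox, adjuster, and newAdjuster are hypothetical stand-ins and are not taken from the PR; the point is only that the timestamp is captured once per process, so adjusters created later (for example after a pipeline restart) share the same value.

```go
package main

import (
	"fmt"
	"time"
)

// processStartApprox is a hypothetical package-level variable holding an
// approximation of the process start time. It is set once, in init.
var processStartApprox time.Time

func init() {
	// init runs once per process, before main, so this value does not
	// change when a component is stopped and started again.
	processStartApprox = time.Now()
}

// adjuster is a simplified stand-in for the receiver's start time adjuster.
type adjuster struct {
	fallbackStartTime *time.Time // nil when the fallback option is disabled
}

// newAdjuster copies the process-wide approximation instead of calling
// time.Now() at construction time.
func newAdjuster(useFallback bool) *adjuster {
	a := &adjuster{}
	if useFallback {
		t := processStartApprox
		a.fallbackStartTime = &t
	}
	return a
}

func main() {
	first := newAdjuster(true)
	time.Sleep(10 * time.Millisecond) // simulate a later component restart
	second := newAdjuster(true)
	// Both adjusters report the same fallback start time.
	fmt.Println(first.fallbackStartTime.Equal(*second.fallbackStartTime)) // prints: true
}
```

This is still only an approximation: init runs shortly after the process starts, not at the exact process start time.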

if stma.fallbackStartTime == nil {
	return err
}
stma.logger.Warn("Couldn't get start time for metrics. Using fallback start time.", zap.Error(err))
Member

Does this need to be at the Warn level? Wouldn't this be a fairly high-rate log entry if none of the processed metrics have a start time?
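
One possible way to address the log-rate concern, sketched here under assumptions, is to emit the message only once per adjuster; lowering the level to Debug would be another option. The fallbackNotifier type and its logf field are hypothetical and use a plain function instead of the zap logger so the example stays self-contained; this is not what the PR implements.

```go
package main

import (
	"fmt"
	"sync"
)

// fallbackNotifier is a hypothetical helper that reports the use of the
// fallback start time at most once, no matter how many scrapes hit it.
type fallbackNotifier struct {
	once sync.Once
	logf func(msg string) // stand-in for a real logger call
}

// notify logs on the first call only; later calls are no-ops.
func (n *fallbackNotifier) notify() {
	n.once.Do(func() {
		n.logf("Couldn't get start time for metrics. Using fallback start time.")
	})
}

func main() {
	count := 0
	n := &fallbackNotifier{logf: func(msg string) {
		count++
		fmt.Println(msg)
	}}
	// Simulate many scrapes that all lack a start time.
	for i := 0; i < 1000; i++ {
		n.notify()
	}
	fmt.Println("log entries:", count) // prints: log entries: 1
}
```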

@Aneurysm9
Member

It makes me wonder: when would an OpenTelemetry metric not have StartTimeUnixNano set? I understand this is not a required field in the spec, but maybe we could work on making it required instead?

This is the prometheus receiver, so the concern here is prometheus metrics not having a start time and we want to ensure that the pdata metrics produced by the receiver do have a start time. The flow you described appears correct, though it is in the context of populating StartTimeUnixNano instead of looking to it for a value.

Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Dec 10, 2024
@dashpole dashpole removed the Stale label Dec 10, 2024
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Dec 25, 2024
@dashpole dashpole removed the Stale label Jan 6, 2025
Successfully merging this pull request may close these issues.

receiver/promettheusreceiver: add option to the metric adjuster to fallback to collector start time
6 participants