Skip to content

Conversation

@ashwanthgoli
Copy link
Contributor

What this PR does / why we need it:

Extends xcap regions to also support timestamped events and status similar to otel spans. This is replicating more of the functionality of spans, but the added code is trivial to maintain.

The reason for doing this is that the spans created out of band are not correctly linked with the ones generated by xcap exporter. By supporting events & status, we can replace most of the span usage with regions in the new engine. This ensures the resulting spans from the exporter are all correctly linked.

This pr also increases xcap coverage:

  • metastore observations are recorded
  • engine.go records events to xcap region
  • remove duplicate spans created in executor
  • rangeio uses regions, xcap_bucket records object stats to it as its the immediate parent

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@ashwanthgoli ashwanthgoli requested a review from a team as a code owner December 3, 2025 07:50
@ashwanthgoli ashwanthgoli changed the title Xcap improve coverage chore(xcap): support events, status and improve coverage Dec 3, 2025
statOptimizedRangesCount = xcap.NewStatisticInt64("optimized.ranges", xcap.AggregationTypeSum)
statOptimizedRangesSize = xcap.NewStatisticInt64("optimized.ranges.size.bytes", xcap.AggregationTypeSum)
// pick the min value when aggregating to track the slowest read.
statOptimizedThroughput = xcap.NewStatisticFloat64("optimized.ranges.throughput", xcap.AggregationTypeMin)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting with min seems fine to me, but we might want to name the statistic to make it clearer it's min_throughput or something.

Longer term, we might want optimized.ranges.throughput to become our first calculated statistic, calculated from (end_time - start_time) / bytes. If end_time is a max stat, and start_time is a min stat, then the throughput would "adjust" automatically based on how much aggregation you're doing.

children := parentToChildren[region.id]
// Add events to the span
for _, event := range region.events {
span.AddEvent(event.Name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, no action needed: this doesn't look like a problem yet, but OTel SDKs have a span event limit, defaulting to 128, so we might accidentally have some of our capture events drop here.

Longer term, we might want to explore whether events should only appear in non-trace exports (like logs).

@ashwanthgoli ashwanthgoli enabled auto-merge (squash) December 4, 2025 05:30
@ashwanthgoli ashwanthgoli merged commit cbad9d4 into main Dec 4, 2025
70 checks passed
@ashwanthgoli ashwanthgoli deleted the xcap-improve-coverage branch December 4, 2025 05:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants