otel: fix flakiness and various issues in TestFBOtelRestartE2E #6819

Draft: wants to merge 6 commits into main

mauri870
Member

@mauri870 mauri870 commented Feb 11, 2025

What does this PR do?

This test starts the collector with a timeout, but the error returned is not
always a context cancelled error. Sometimes it returns err == nil, which is also
fine, but the test did not handle that case properly.

While at it, fix some other issues I found while testing:

  • Remove the requirement that an ignored field cannot be equal in both documents. There are cases for agent.version where it matches on main but not in 8.x or 9.0.
  • Using require inside a goroutine calls runtime.Goexit on failure, meaning
    the test exits immediately without doing any cleanup, causing resource leaks. Use assert in those
    cases.
  • Now that the beats dependency is up to date, deduplication works as intended (see "otelconsumer: set document id attribute for elasticsearchexporter", beats#42412). Update the test to use logs_dynamic_id in the elasticsearchexporter options and ensure data is deduplicated in Elasticsearch.
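
The require versus assert point can be illustrated with the standard library alone. In the sketch below, runWorker is a hypothetical function: the failFast branch mimics require, which ends up in t.FailNow and therefore runtime.Goexit, while the other branch mimics assert, which records the failure and keeps going. Cleanup that is not deferred runs only in the second case:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runWorker simulates a goroutine started by a test that hits a failing check
// and then releases a resource. Hypothetical illustration only.
//
// failFast=true mimics require: the goroutine is terminated via runtime.Goexit.
// failFast=false mimics assert: the failure is recorded and execution continues.
func runWorker(failFast bool) (failed, cleanedUp bool) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		// Deferred calls still run when runtime.Goexit is called.
		defer wg.Done()

		failed = true // the check failed
		if failFast {
			runtime.Goexit() // statements below this line are skipped
		}

		// Cleanup that is not deferred is reached only if the goroutine survives.
		cleanedUp = true
	}()
	wg.Wait()
	return failed, cleanedUp
}

func main() {
	failed, cleanedUp := runWorker(true)
	fmt.Printf("require-like: failed=%v cleanedUp=%v\n", failed, cleanedUp)

	failed, cleanedUp = runWorker(false)
	fmt.Printf("assert-like:  failed=%v cleanedUp=%v\n", failed, cleanedUp)
}
```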

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Related issues

@mauri870 mauri870 added the Team:Elastic-Agent-Data-Plane (Label for the Agent Data Plane team), backport-8.x (Automated backport to the 8.x branch with mergify), and backport-9.0 (Automated backport to the 9.0 branch) labels Feb 11, 2025
@mauri870 mauri870 self-assigned this Feb 11, 2025
@mauri870 mauri870 requested a review from a team as a code owner February 11, 2025 16:56
@mauri870 mauri870 requested review from swiatekm and pchila February 11, 2025 16:56
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@mauri870 mauri870 changed the title from "otel: adjust TestFBOtelRestartE2E to validate deduplication works" to "otel: adjust TestFBOtelRestartE2E to validate that deduplication works" Feb 11, 2025
@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team) label Feb 11, 2025
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@mauri870 mauri870 marked this pull request as draft February 12, 2025 11:31
@mauri870
Member Author

mauri870 commented Feb 12, 2025

Moving this to draft since it requires work done in beats via elastic/beats#42412. I need to bump the beats dependency in go.mod (#6837).

@mauri870 mauri870 force-pushed the otel-restart-test-duplicates branch from dc95b0a to d04606e Compare February 19, 2025 20:34
@mauri870 mauri870 changed the title from "otel: adjust TestFBOtelRestartE2E to validate that deduplication works" to "otel: fix flaky behavior in TestFBOtelRestartE2E" Feb 19, 2025
@mauri870 mauri870 changed the title from "otel: fix flaky behavior in TestFBOtelRestartE2E" to "otel: fix flakiness and various issues in TestFBOtelRestartE2E" Feb 19, 2025
@mauri870 mauri870 marked this pull request as ready for review February 19, 2025 20:41
@mauri870
Member Author

mauri870 commented Feb 19, 2025

I'm repurposing this PR to include a series of fixes for the otel tests. Having the fixes as a batch, as opposed to separate PRs, speeds up the continuous integration builds.

@mauri870 mauri870 requested a review from swiatekm February 19, 2025 20:43
@mauri870 mauri870 force-pushed the otel-restart-test-duplicates branch from d04606e to 95c25c9 Compare February 20, 2025 11:29
@mauri870 mauri870 enabled auto-merge (squash) February 20, 2025 12:51
@mauri870 mauri870 marked this pull request as draft February 20, 2025 18:51
auto-merge was automatically disabled February 20, 2025 18:51

Pull request was converted to draft

Labels
  • backport-8.x: Automated backport to the 8.x branch with mergify
  • backport-9.0: Automated backport to the 9.0 branch
  • skip-changelog
  • Team:Elastic-Agent-Control-Plane: Label for the Agent Control Plane team
  • Team:Elastic-Agent-Data-Plane: Label for the Agent Data Plane team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Flaky Test]: TestFBOtelRestartE2E – expected the collector to have stopped
4 participants