Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce test flakiness for telemetry request spec #3962

Closed
wants to merge 1 commit into from

Conversation

p-datadog
Copy link
Contributor

@p-datadog p-datadog commented Sep 30, 2024

What does this PR do?

This PR inserts a brief wait (up to 0.25 seconds) for telemetry request spec to avoid the code under test and the test code happening in different whole seconds, which would then fail a test assertion.

Motivation:

Repair the following observed CI failure:


  1) Datadog::Core::Telemetry::Request.build_payload is expected to eq {:api_version=>"v2", :application=>{:env=>"env", :language_name=>"ruby", :language_version=>"2.5.9", ...e_id=>"40935bf2-7e96-4797-af75-b687b45ad719", :seq_id=>#<Double "seq_id">, :tracer_time=>1727716350}
     Failure/Error:
       is_expected.to eq(
         api_version: api_version,
         application: application,
         debug: debug,
         host: host,
         payload: payload,
         request_type: request_type,
         runtime_id: runtime_id,
         seq_id: seq_id,
         tracer_time: tracer_time,

       expected: {:api_version=>"v2", :application=>{:env=>"env", :language_name=>"ruby", :language_version=>"2.5.9", ...e_id=>"40935bf2-7e96-4797-af75-b687b45ad719", :seq_id=>#<Double "seq_id">, :tracer_time=>1727716350}
            got: {:api_version=>"v2", :application=>{:env=>"env", :language_name=>"ruby", :language_version=>"2.5.9", ...e_id=>"40935bf2-7e96-4797-af75-b687b45ad719", :seq_id=>#<Double "seq_id">, :tracer_time=>1727716349}

       (compared using ==)

       Diff:
       @@ -6,5 +6,5 @@
        :request_type => #<Double "request_type">,
        :runtime_id => "40935bf2-7e96-4797-af75-b687b45ad719",
        :seq_id => #<Double "seq_id">,
       -:tracer_time => 1727716350,
       +:tracer_time => 1727716349,
     # ./spec/datadog/core/telemetry/request_spec.rb:65:in `block (3 levels) in <top (required)>'
     # ./spec/spec_helper.rb:231:in `block (2 levels) in <top (required)>'
     # ./spec/spec_helper.rb:116:in `block (2 levels) in <top (required)>'

Additional Notes:

How to test the change?

On my machine the failure can be reliably reproduced by changing the sleep into

    sleep 0 until Time.now.usec > 995700

The required time is the approximate time of the second test's execution subtracted from 1,000,000.

Unsure? Have a question? Request a review!

@p-datadog p-datadog requested a review from a team as a code owner September 30, 2024 17:26
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.86%. Comparing base (74363cc) to head (8427d6f).
Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #3962   +/-   ##
=======================================
  Coverage   97.85%   97.86%           
=======================================
  Files        1305     1305           
  Lines       78224    78222    -2     
  Branches     3887     3887           
=======================================
+ Hits        76549    76553    +4     
+ Misses       1675     1669    -6     
Flag Coverage Δ
97.86% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pr-commenter
Copy link

pr-commenter bot commented Sep 30, 2024

Benchmarks

Benchmark execution time: 2024-09-30 18:02:10

Comparing candidate commit 8427d6f in PR branch reduce-test-flakiness with baseline commit 69ac3b8 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 23 metrics, 2 unstable metrics.

Comment on lines +6 to +12
before do
# The tests here assert on time equality after rounding to a second;
# these assertions will fail if the code executes just before
# a second boundary and the assertions run just after the second boundary.
# Align the runs closer to the beginning of a second to reduce flakiness.
sleep 0 until Time.now.usec < 750000
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the profiler tests I have a few similar situations where I don't know exactly the expected timestamp.

My suggestion here, rather than the sleep/loop (which doesn't exactly guarantee the timings we need -- just makes it more likely to happen) would be to do something like:

let!(:before_time) { Time.now.to_i }

# ...

    it do
      is_expected.to eq(
        api_version: api_version,
        application: application,
        debug: debug,
        host: host,
        payload: payload,
        request_type: request_type,
        runtime_id: runtime_id,
        seq_id: seq_id,
        tracer_time: be_between(before_time, Time.now.to_i),
      )
    end

This way we validate the timestamp is correct without the loop or any flakiness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented this in #3966.

p-datadog pushed a commit that referenced this pull request Oct 1, 2024
Use range match for tracer time instead of rounding and requiring
exact match to rounded time, which does not always work
(see #3962).
@p-datadog p-datadog closed this Oct 1, 2024
@p-datadog p-datadog deleted the reduce-test-flakiness branch October 1, 2024 15:38
p-datadog pushed a commit that referenced this pull request Oct 1, 2024
Use range match for tracer time instead of rounding and requiring
exact match to rounded time, which does not always work
(see #3962).
p-datadog pushed a commit that referenced this pull request Oct 1, 2024
Use range match for tracer time instead of rounding and requiring
exact match to rounded time, which does not always work
(see #3962).
p-datadog pushed a commit that referenced this pull request Oct 1, 2024
Use range match for tracer time instead of rounding and requiring
exact match to rounded time, which does not always work
(see #3962).
p-datadog added a commit that referenced this pull request Oct 16, 2024
* Remove flakiness from telemetry request spec

Use range match for tracer time instead of rounding and requiring
exact match to rounded time, which does not always work
(see #3962).

* use match instead of eq

---------

Co-authored-by: Oleg Pudeyev <code@olegp.name>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants