Metrics collect stress test #2247

fraillt · 2024-10-25T05:31:15Z

Changes

Added two tests (delta and cumulative temporality) to measuring how much metrics collection phase impact measurements throughput.
Implementation consists of running collection phase in the loop, while simultaneously testing measurements throughput.
The main concern isn't the speed of collection phase itself, but ensuring that Opentelemetry Metrics doesn't significantly contribute to p99 latency.
I didn't customize it for different metrics type, idea is to eventually land #2117, so that I could unify collection phase for other metrics #2145.

Additionally, significantly improved code reuse for stress crate.

Here are some results on my machine:

cargo run --release --package stress --bin metrics_collect -- delta
    Finished `release` profile [optimized] target(s) in 0.07s
     Running `target/release/metrics_collect delta`
     17.38 measurements/ms
     52.54 measurements/it
   3023.88 μs/it
   
cargo run --release --package stress --bin metrics_collect -- cumulative
    Finished `release` profile [optimized] target(s) in 0.07s
     Running `target/release/metrics_collect cumulative`
      9.14 measurements/ms
     24.35 measurements/it
   2665.30 μs/it

For curiosity I have also implemented #2117 and #2145 locally, and results are as follows:

cargo run --release --package stress --bin metrics_collect -- delta
    Finished `release` profile [optimized] target(s) in 0.07s
     Running `target/release/metrics_collect delta`
     56.38 measurements/ms
    154.24 measurements/it
   2735.73 μs/it
   
cargo run --release --package stress --bin metrics_collect -- cumulative
    Finished `release` profile [optimized] target(s) in 0.07s
     Running `target/release/metrics_collect cumulative`
    840.84 measurements/ms
   2077.64 measurements/it
   2470.90 μs/it

Merge requirement checklist

CONTRIBUTING guidelines followed
Unit tests added/updated (if applicable)
Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
Changes in public API reviewed (if applicable)

linux-foundation-easycla · 2024-10-25T05:31:19Z

The committers listed above are authorized under a signed CLA.

✅ login: fraillt / name: Mindaugas Vinkelis (b4fffe6)
✅ login: lalitb / name: Lalit Kumar Bhasin (7265639)

utpilla · 2024-10-25T22:09:21Z

@fraillt Thanks for the continued interest in making metrics better! I have two suggestions around sending PRs that I would like you to consider. These are mainly directed towards making the PR review process easy and simple (which would in turn result in the getting PRs merged faster).

Send targeted PRs aimed at solving one thing at a time. For example, there's a lot of things going on this PR. There's a new test being added as the PR title suggests, there is moving around of files, refactoring around reuse of logic across projects. Each of these different additions/enhancements could be added incrementally instead of doing it in a single shot.
Avoid force pushing the commit. With force push, we have no way to know how many files changed since your last commit. We then have to review all the files again to ensure that the latest commit only made the changes that we were expecting. Retaining the individual commits makes it easy for reviewer.

fraillt · 2024-10-29T19:10:21Z

@fraillt Thanks for the continued interest in making metrics better! I have two suggestions around sending PRs that I would like you to consider. These are mainly directed towards making the PR review process easy and simple (which would in turn result in the getting PRs merged faster).

1. Send targeted PRs aimed at solving one thing at a time. For example, there's a lot of things going on this PR. There's a new test being added as the PR title suggests, there is moving around of files, refactoring around reuse of logic across projects. Each of these different additions/enhancements could be added incrementally instead of doing it in a single shot.

2. Avoid force pushing the commit. With force push, we have no way to know how many files changed since your last commit. We then have to review all the files again to ensure that the latest commit only made the changes that we were expecting. Retaining the individual commits makes it easy for reviewer.

I'll try to be more careful regarding force-push in the future.
Regarding targeted PR, I see your point, that targeted PRs are easier to review, but sometimes they doesn't make sense in isolation. For instance I moved SharedReader from opentelemetry-sdk/benches/metric.rs to opentelemetry-sdk/src/testing/metrics/metric_reader.rs, in isolation that looks like a pointless effort, but I do need this functionality for metrics_collect.rs.

Basically what I'm trying to say, is that I'm afraid that if I create PR just for refactoring, it might look worthless on its own...

All changes that was done in this PR does relate to one file that brings new functionality stress/src/bin/metrics_collect.rs.

utpilla · 2024-10-29T22:36:32Z

Basically what I'm trying to say, is that I'm afraid that if I create PR just for refactoring, it might look worthless on its own...

All changes that was done in this PR does relate to one file that brings new functionality stress/src/bin/metrics_collect.rs.

I think we could still have a simpler split up of PRs. Something like this:

Send a PR that adds the new stress test. (along with the boiler plate code: SharedReader etc.) You don't have to add the reuse related changes at this point.
Once that gets merged, you could send a refactoring PR to allow for reuse of SharedReader. This would showcase the motivation for reuse as well.

fraillt · 2024-10-30T20:29:18Z

I force-pushed, because it basically delete everything I did, except for one file, so I guess it is ok :).
I'll not force-push, once someone reviews and I'll update my revision to address review comments.

utpilla · 2024-10-30T21:16:22Z

stress/src/metrics_collect.rs

+                    barrier.wait();
+                    let now = Instant::now();
+                    let mut count = 0;
+                    while is_collecting.load(Ordering::Acquire) {


I think we could have a much simpler and effective setup here. If we want to know whether running collect stops the world, it's better to spawn a thread that keeps calling reader.collect on a loop. And have other threads, record measurements simultaneously.

I did that initially. The problem was that collection phase was not realistic as it had 0 measurements and basically held lock in a loop. So I decided to make it more realistic, by generating some measurements so collection wouldn't be empty.

The problem was that collection phase was not realistic as it had 0 measurements and basically held lock in a loop

I don't follow. You could always record some measurements before you call collect. My concern is around the way this test is setup. The test is using some custom "iterations" which seems unnecessary. You could simply do this instead:

Emit some measurements.

Spawn a thread that runs collect in a loop.

Spawn additional threads that record measurements.

Calculate throughput for the measurements recorded in step 3.

If doing the above, leads to zero measurements being recorded, then we have our answer: Collect is going to "stop the world" as long as it runs.

As I mentioned in this comment, unless we plan to use a more efficient synchronization mechanism in ValueMap, this stress test would not be adding any value.

First of all, this test is adding value, as it is able to measure difference with different "collect" implementation.
Second, running collect in the loop doesn't work with delta temporality, because after first run there will be zero measurements and all other iterations will basically hold write lock in the loop. While in reality no one runs collect in loop and usually there are many measurements and its important that write lock is held as short as possible (which is not the case with current implementation, hence I was able to improve it and this test actually measures it).
I could probably have different tests for delta an cumulative temporality, but I decided to emulate realistic environment.

While in reality no one runs collect in loop

In reality, no one runs the collect the way it's run in this stress test either. Collect would usually be run periodically in 10/30/60 seconds intervals.

First of all, this test is adding value, as it is able to measure difference with different "collect" implementation.

usually there are many measurements and its important that write lock is held as short as possible (which is not the case with current implementation, hence I was able to improve it and this test actually measures it).

This test claims to measure update throughput "while collect is running". The actual implementation of this test however relies on squeezing in some updates before collect runs. That's not the same as testing "while collect is running".

I would love to improve collect to take the write lock for the shortest possible time, but it's better tested using a benchmark.

The actual implementation of this test however relies on squeezing in some updates before collect runs.

There's a lot of truth to this statement, but none the less it appears to measure something useful :)

temporality type of change before change (measurements/ms) after change (measurements/ms)

Cumulative changed write to read lock of hashmap 9 840

Delta reduce the amount of time write lock is held 17 56

Ideally I would like to "catch" changes to how attribute-set hashmap are locked with both: existing types of measurements (with existing attribute-set and new attribute-set combination).
Maybe we need to tests these things separately... I don't know, any ideas are welcome :)

utpilla · 2024-10-30T21:18:25Z

stress/src/metrics_collect.rs

+        let barrier = Barrier::new(num_threads + 1);
+        std::thread::scope(|s| {
+            // first create bunch of measurements,
+            // so that collection phase wouldn't be "empty"


This can be done without spawning new threads. (before you spawn threads for running collect and recording measurements)

utpilla · 2024-10-30T21:27:51Z

stress/src/metrics_collect.rs

+                };
+                is_collecting.store(true, Ordering::Release);
+                barrier.wait();
+                reader.collect(&mut rm).unwrap();


In the PR description, you mentioned that you saw a major improvement when running this with the changes for #2145 locally. I'm curious, the code changes for #2145 are mostly related to reuse of collect code across instruments. How is that improving the perf numbers?

Throughput perf here should mostly depend on the thread synchronization mechanism used for collect and update. Currently, it's using a RwLock in ValueMap and the collect operation requires a write lock which would "stop the world" until collect has the write lock. Unless you change that, I wouldn't expect any significant improvement in throughput here.

For delta temporality I was able to reduce the amount of time a write lock is held while collecting.
For cumulative temporality I simply changed to read lock (currently there is write lock, although cumulative temporality doesn't modify anything, and IMO locking individual measurements is sufficient and much better, then "stop the world until all measurements are read" approach). Although read lock is better, but measurements with new attribute-sets will still be locked. I see two more alternatives:

clone entire attribute-set hashmap (acquire read lock, clone hashmap, release lock, iterate individual measurements)

implement some sort of sharding (more throughtput at the cost of single measurement performance)

In any case, there are ways to improve it:) so I created this PR so we could actually measure different ideas.

These are good ideas!

For cumulative temporality I simply changed to read lock (currently there is write lock, although cumulative temporality doesn't modify anything,

Great! In that case, we should be able test update throughput while collect runs in a loop for Cumulative and a use a simple benchmark for Delta. 🙂

implement some sort of sharding (more throughtput at the cost of single measurement performance)

I did try sharding approach earlier in #1564 - to reduce the write-lock duration by locking only the specific shard of measurements during collection. However, I encountered some concurrency issues. It’s still an approach worth revisiting at some point.

Great! In that case, we should be able test update throughput while collect runs in a loop for Cumulative and a use a simple benchmark for Delta. 🙂

+1

codecov · 2024-11-01T13:46:01Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.3%. Comparing base (e1860c7) to head (7265639).

Additional details and impacted files

@@          Coverage Diff          @@
##            main   #2247   +/-   ##
=====================================
  Coverage   79.3%   79.3%           
=====================================
  Files        121     121           
  Lines      20968   20968           
=====================================
  Hits       16646   16646           
  Misses      4322    4322

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

fraillt requested a review from a team as a code owner October 25, 2024 05:31

fraillt force-pushed the metrics_collect_stress_test branch 2 times, most recently from 02be567 to 1fc6d80 Compare October 25, 2024 11:46

Metrics collect stress test

b4fffe6

fraillt force-pushed the metrics_collect_stress_test branch from 1fc6d80 to b4fffe6 Compare October 30, 2024 20:26

utpilla reviewed Oct 30, 2024

View reviewed changes

Merge branch 'main' into metrics_collect_stress_test

7265639

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Metrics collect stress test #2247

Metrics collect stress test #2247

fraillt commented Oct 25, 2024 •

edited

Loading

linux-foundation-easycla bot commented Oct 25, 2024 •

edited

Loading

utpilla commented Oct 25, 2024

fraillt commented Oct 29, 2024

utpilla commented Oct 29, 2024

fraillt commented Oct 30, 2024

utpilla Oct 30, 2024 •

edited

Loading

fraillt Oct 31, 2024

utpilla Oct 31, 2024

fraillt Oct 31, 2024

utpilla Oct 31, 2024 •

edited

Loading

fraillt Nov 1, 2024 •

edited

Loading

utpilla Oct 30, 2024

utpilla Oct 30, 2024

fraillt Oct 31, 2024

utpilla Oct 31, 2024 •

edited

Loading

lalitb Oct 31, 2024 •

edited

Loading

codecov bot commented Nov 1, 2024

temporality	type of change	before change (measurements/ms)	after change (measurements/ms)
Cumulative	changed write to read lock of hashmap	9	840
Delta	reduce the amount of time write lock is held	17	56

Metrics collect stress test #2247

Are you sure you want to change the base?

Metrics collect stress test #2247

Conversation

fraillt commented Oct 25, 2024 • edited Loading

Changes

Merge requirement checklist

linux-foundation-easycla bot commented Oct 25, 2024 • edited Loading

utpilla commented Oct 25, 2024

fraillt commented Oct 29, 2024

utpilla commented Oct 29, 2024

fraillt commented Oct 30, 2024

utpilla Oct 30, 2024 • edited Loading

Choose a reason for hiding this comment

fraillt Oct 31, 2024

Choose a reason for hiding this comment

utpilla Oct 31, 2024

Choose a reason for hiding this comment

fraillt Oct 31, 2024

Choose a reason for hiding this comment

utpilla Oct 31, 2024 • edited Loading

Choose a reason for hiding this comment

fraillt Nov 1, 2024 • edited Loading

Choose a reason for hiding this comment

utpilla Oct 30, 2024

Choose a reason for hiding this comment

utpilla Oct 30, 2024

Choose a reason for hiding this comment

fraillt Oct 31, 2024

Choose a reason for hiding this comment

utpilla Oct 31, 2024 • edited Loading

Choose a reason for hiding this comment

lalitb Oct 31, 2024 • edited Loading

Choose a reason for hiding this comment

codecov bot commented Nov 1, 2024

Codecov Report

fraillt commented Oct 25, 2024 •

edited

Loading

linux-foundation-easycla bot commented Oct 25, 2024 •

edited

Loading

utpilla Oct 30, 2024 •

edited

Loading

utpilla Oct 31, 2024 •

edited

Loading

fraillt Nov 1, 2024 •

edited

Loading

utpilla Oct 31, 2024 •

edited

Loading

lalitb Oct 31, 2024 •

edited

Loading