Increase the number of times CI runs benchmarks #49092
base: master
Conversation
Force-pushed from 622b657 to 3ffcce1
@r0mant @doggydogworld what are your thoughts on this change? The intent is to provide more reliable information by running the benchmark tests a few times, though that comes at the expense of the tests taking longer to complete. The benchmark tests are still faster than the unit and integration tests, however.
Yeah, I originally chose 1x to ensure that it finishes ASAP, to not be a blocker, and to prove that changes won't break the benchmark. Running a minimum of 6 times would be necessary to test for performance regressions, so this was eventually going to make it in. Since we've been able to observe so far that 6 runs probably won't slow CI down too much, I think it's good to do it now. The eventual goal would then be to create a baseline (which would have more than 6 runs) to compare with these results to determine regressions.
Makefile
Outdated
@@ -943,14 +943,14 @@ endif
 test-go-bench: PACKAGES = $(shell grep --exclude-dir api --include "*_test.go" -lr testing.B . | xargs dirname | xargs go list | sort -u)
 test-go-bench: BENCHMARK_SKIP_PATTERN = "^BenchmarkRoot"
 test-go-bench: | $(TEST_LOG_DIR)
-	go test -run ^$$ -bench . -skip $(BENCHMARK_SKIP_PATTERN) -benchtime 1x $(PACKAGES) \
+	go test -run ^$$ -bench . -skip $(BENCHMARK_SKIP_PATTERN) -count=6 -benchtime 1x $(PACKAGES) \
Do we take advantage of the increased runs in any way? I can see this being part of a larger change that makes use of the bench results, but on its own I'm not sure it adds much. (Even so I'm a bit skeptical that every benchmark we have was built with this in mind.)
Also notice that we are still running 1x, so even though the count is higher, each individual benchmark run is still minimal.
We'd like to use our benchmarks as a signal that nothing has broken or regressed between releases. However, right now output between two consecutive tags on a release branch can vary quite dramatically. I'm hoping this might have some impact on stabilizing things.
Should we increase the benchtime too?
I'm not against it, I was trying to limit the changes here to keep test times from ballooning. Do you have a recommendation?
I think it's hard to pitch a generic number; my 2c is that the flags should be set on a per-benchmark basis. It also depends on how long we want the jobs to take.
Yeah, it seems we have so many benchmarks that adding 6s for each one really balloons it. I think this is a good first step, at least, to get some data.
It sounds like we'll want to revert to -benchtime=1x, but I'll leave my stamp regardless. Thanks for all the explanations here, folks.
Oh, one last suggestion: what about -benchtime=6x instead of -benchtime=1x -count=6?
I'll give that a try. I don't really want to merge this as is and require all PRs to wait >30 minutes just so the benchmark tests can finish.
With -benchtime=1x -count=6 we are back to meeting benchstat's "need >= 6 samples for confidence interval at level 0.95" requirement.
This is in an attempt to provide more reliable test results. 6 was chosen as that's the minimum number benchstat requires for a 95% confidence interval.
Force-pushed from 1dcfe4f to ede6752
Example of warning emitted from benchstat in CI: https://github.com/gravitational/teleport/actions/runs/11862270592