Initial benchmark API #42

zyga · 2020-10-05T06:10:01Z

This patch lays the groundwork for supporting micro-benchmarks
inside libzt. Test suites can now visit benchmarks, in addition
to test cases and other test suites. A benchmark is a function
taking one argument of type zt_b, similar to zt_t for test cases.

The typedef zt_b is a pointer to struct zt_benchmark, which holds one
parameter: a 64-bit counter, n, giving the desired number of iterations
to execute.
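
To make the shape of a benchmark concrete, here is a minimal sketch. The
struct and typedef below are stand-ins written from the description
above so the snippet compiles on its own; the real definitions live in
libzt and may differ. The workload isqrt_slow and the name bench_isqrt
are hypothetical.

```c
#include <stdint.h>

/* Stand-in declarations (assumption): mirrors the description of zt_b. */
struct zt_benchmark {
    uint64_t n; /* desired number of iterations */
};
typedef struct zt_benchmark *zt_b;

/* Hypothetical workload: integer square root by linear search. */
static uint64_t isqrt_slow(uint64_t x) {
    uint64_t r = 0;
    while ((r + 1) * (r + 1) <= x) {
        r++;
    }
    return r;
}

/* A volatile sink keeps the compiler from discarding the measured work. */
static volatile uint64_t sink;

/* A benchmark takes a single zt_b and runs its workload b->n times. */
static void bench_isqrt(zt_b b) {
    uint64_t out = 0; /* initialized because the loop may run zero times */
    for (uint64_t i = 0; i < b->n; i++) {
        out = isqrt_slow(1000 + (i & 0xff));
    }
    sink = out;
}
```

The function only loops b->n times; choosing n and taking the time are
left to the caller.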

Internally, libzt executes all benchmarks at least once to ensure
they do not crash. In verbose mode, when invoked with the -v
command-line option, precise measurements are taken to compute the
number of nanoseconds required to execute a single loop iteration.

Timing is based on the portable, microsecond-accurate clock_t clock()
function. There are several warm-up phases in which the loop is executed
enough times to take roughly ten milliseconds. In my crude measurements
this stabilizes the result well enough to estimate the cost of a single
iteration.

Following that, benchmark.n is set to a value that should give about
one second of execution. This is when final measurements are taken.
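
The warm-up and scaling steps could be sketched roughly as follows. This
is not the code from the patch: struct bench, bench_spin, run_ns and
calibrate are hypothetical names, and doubling n until the run is long
enough is an assumed strategy. Note that clock() reports CPU time, not
wall-clock time.

```c
#include <stdint.h>
#include <time.h>

/* Stand-in for struct zt_benchmark; only the iteration count matters. */
struct bench {
    uint64_t n;
};
typedef void (*bench_fn)(struct bench *);

/* Hypothetical workload: the volatile store keeps the loop from being
 * optimized away. */
static volatile uint64_t sink;
static void bench_spin(struct bench *b) {
    for (uint64_t i = 0; i < b->n; i++) {
        sink = i;
    }
}

/* Run fn once with n iterations; return elapsed CPU time in nanoseconds. */
static double run_ns(bench_fn fn, uint64_t n) {
    struct bench b = { n };
    clock_t start = clock();
    fn(&b);
    clock_t end = clock();
    return (double)(end - start) * 1e9 / (double)CLOCKS_PER_SEC;
}

/* Double n until a run lasts at least min_ns, then scale n so that the
 * final run should last about target_ns. */
static uint64_t calibrate(bench_fn fn, double min_ns, double target_ns) {
    uint64_t n = 1;
    double elapsed = run_ns(fn, n);
    while (elapsed < min_ns && n <= UINT64_MAX / 2) {
        n *= 2;
        elapsed = run_ns(fn, n);
    }
    if (elapsed <= 0.0) {
        return n; /* clock too coarse to measure; keep the last n */
    }
    double scaled = target_ns * (double)n / elapsed;
    if (scaled < 1.0) {
        return 1;
    }
    if (scaled >= 18446744073709551615.0) {
        return UINT64_MAX; /* clamp instead of overflowing the cast */
    }
    return (uint64_t)scaled;
}
```

With the timings described above, a call would pass 10e6 for min_ns
(ten milliseconds) and 1e9 for target_ns (one second).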

I've experimented with several different ideas and found significant
noise in the early estimation phase when the effective runtime was
lower than 10ms. At 1ms, results were several orders of magnitude
off the duration measured over 10ms.

The duration of the complete test is currently excessive.
I found no difference between desired runtime lengths of 1000ms
and 100ms, suggesting there is room for improvement.

There's a chance to improve accuracy by switching to non-portable,
nanosecond-resolution APIs that internally fuel clock(), but this
has not been attempted yet.

The code is not tested yet and the manual pages are not complete, but
there is a small example of the new functionality.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>

zyga added 10 commits October 5, 2020 08:08

Fix comment about zt_b

5ccfdaf

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>

Add baseline no-op benchmark

4e25793

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>

Use clock_gettime for nanosecond precision

a0c516b

This is not portable; a more portable fallback will follow.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
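
For illustration, a nanosecond timestamp based on clock_gettime might
look like the sketch below. The choice of CLOCK_MONOTONIC and the helper
names now_ns and ns_per_iteration are assumptions; the commit may use a
different clock ID or structure.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Current reading of a monotonic clock, in nanoseconds.
 * CLOCK_MONOTONIC is an assumption, not taken from the commit. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Average cost of one iteration, given two timestamps and the count. */
static double ns_per_iteration(uint64_t start_ns, uint64_t end_ns,
                               uint64_t n) {
    if (n == 0) {
        return 0.0; /* nothing ran, so there is nothing to average */
    }
    return (double)(end_ns - start_ns) / (double)n;
}
```

A caller would read now_ns() before and after running the benchmark
function and pass both readings, with the iteration count, to
ns_per_iteration.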

Add redundant switch cases

7ccf317

Those are identical to the default case, but defining them silences a warning that MSVC emits by default.

Port benchmark to win32

f637a3f

The benchmark logic is implemented with QueryPerformanceFrequency and QueryPerformanceCounter.

Fix ignore pattern for bench-sqrt

7db0065

Initialize out, silence warning

4a94156

There is no guarantee that the loop will have at least one iteration.

Optimize Windows build

4ce670d

Trim spaces

8d42fd3

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>

zyga changed the base branch from master to main December 28, 2020 19:47

zyga added 2 commits November 21, 2021 22:48

Silence inline and macro warnings

568c5c2

Avoid printing benchmark twice when on Windows

fa6b175
