Initial benchmark API #42

Open
wants to merge 12 commits into main

Conversation

zyga
Owner

@zyga zyga commented Oct 5, 2020

This patch lays the groundwork for supporting micro-benchmarks
inside libzt. Test suites can now visit benchmarks, in addition
to test cases and other test suites. A benchmark is a function
taking one argument of type zt_b, similar to zt_t for test cases.

The typedef zt_b is a pointer to struct zt_benchmark, which holds
one parameter: n, a 64-bit counter with the desired number of
iterations to execute.
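
A benchmark written against this description might look like the sketch below. The struct and typedef are local stand-ins that mirror the text above, not the actual libzt header; the `uint64_t` field type, the `sink` accumulator and the `bench_add` name are assumptions made for illustration.

```c
#include <stdint.h>

/* Stand-ins mirroring the description; the real definitions live in libzt. */
struct zt_benchmark {
    uint64_t n; /* desired number of iterations (field type is assumed) */
};
typedef struct zt_benchmark *zt_b;

/* Volatile accumulator so the compiler cannot optimize the loop away. */
static volatile uint64_t sink;

/* Hypothetical benchmark: runs the measured operation exactly b->n times. */
static void bench_add(zt_b b)
{
    for (uint64_t i = 0; i < b->n; ++i) {
        sink += i;
    }
}
```

The benchmark body only reads `n`; choosing its value is left to the harness.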

Internally libzt executes all benchmarks at least once, to ensure
they do not crash. In verbose mode, when invoked with the -v
command-line option, precise measurements are taken to compute the
number of nanoseconds required to execute a single loop iteration.

Timing is based on the portable clock() function, which returns a
clock_t with nominal microsecond resolution (on POSIX systems
CLOCKS_PER_SEC is one million). There are several warm-up phases in
which the loop is executed enough times to take roughly ten
milliseconds. In my crude measurements this stabilizes the result well
enough to estimate the cost of a single iteration.
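
The warm-up idea can be sketched roughly as follows. This is not the code from the patch: the `bench_fn` signature, the doubling strategy and all function names are assumptions chosen to keep the example small. Only `clock()` and `CLOCKS_PER_SEC` come from the C standard library.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical benchmark signature: run the measured loop n times. */
typedef void (*bench_fn)(uint64_t n);

/* Volatile accumulator so the sample loop is not optimized away. */
static volatile uint64_t sink;

/* Sample workload used only to exercise the estimator. */
static void spin(uint64_t n)
{
    for (uint64_t i = 0; i < n; ++i) {
        sink += i;
    }
}

/* Run fn(n) once and return the CPU time it used, in nanoseconds. */
static double time_run_ns(bench_fn fn, uint64_t n)
{
    clock_t start = clock();
    fn(n);
    clock_t end = clock();
    return (double)(end - start) * 1e9 / (double)CLOCKS_PER_SEC;
}

/*
 * Double n until a single run takes at least min_ns (assumed strategy).
 * Stores the final n in *n_out and returns the estimated cost of one
 * iteration in nanoseconds.
 */
static double estimate_ns_per_iter(bench_fn fn, double min_ns, uint64_t *n_out)
{
    uint64_t n = 1;
    double elapsed = time_run_ns(fn, n);
    while (elapsed < min_ns && n < UINT64_MAX / 2) {
        n *= 2;
        elapsed = time_run_ns(fn, n);
    }
    *n_out = n;
    return elapsed / (double)n;
}
```

Calling `estimate_ns_per_iter(spin, 10e6, &n)` asks for a run of at least ten milliseconds, the threshold mentioned above. Note that `clock()` reports processor time, not wall-clock time.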

Following that, benchmark.n is set to a value that should give about
one second of execution. This is when final measurements are taken.
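
Choosing n for the final run is a single division once a per-iteration estimate exists. The helper below is a sketch under that assumption; the name `iterations_for` and the clamping behaviour are illustrative and not taken from the patch.

```c
#include <stdint.h>

/*
 * Return the iteration count expected to take about target_ns, given an
 * estimated cost of ns_per_iter per iteration. The result is clamped to
 * the range [1, UINT64_MAX] so the loop always runs at least once.
 */
static uint64_t iterations_for(double ns_per_iter, double target_ns)
{
    if (ns_per_iter <= 0.0) {
        return 1; /* no usable estimate: fall back to a single iteration */
    }
    double n = target_ns / ns_per_iter;
    if (n < 1.0) {
        return 1;
    }
    if (n >= (double)UINT64_MAX) {
        return UINT64_MAX;
    }
    return (uint64_t)n;
}
```

For example, with an estimate of 2 ns per iteration and a target of one second (1e9 ns), the helper returns 500000000.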

I've experimented with several different ideas, and found significant
noise in the early estimation phase when the effective runtime was
lower than 10ms. At 1ms, results were several orders of magnitude
off the duration measured over 10ms.

The complete test currently runs for longer than it needs to.
I found no difference between a desired runtime of 1000ms
and 100ms, suggesting there is room for improvement.

There's a chance to improve accuracy by switching to the non-portable,
nanosecond-resolution APIs that clock() is built on internally, but
this has not been attempted yet.
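
One candidate for such an API on POSIX systems is `clock_gettime`. The description does not name a specific call, so the choice of `clock_gettime` with `CLOCK_PROCESS_CPUTIME_ID` is an assumption, and `cpu_time_ns` is a hypothetical helper name.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose clock_gettime in strict modes */
#endif
#include <stdint.h>
#include <time.h>

/*
 * Return the CPU time consumed by this process, in nanoseconds,
 * or 0 if the clock cannot be read.
 */
static uint64_t cpu_time_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0) {
        return 0;
    }
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
```

Taking the difference of two `cpu_time_ns()` readings around a benchmark run gives elapsed CPU time with nanosecond units. The actual resolution depends on the platform.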

The code is not tested yet and the manual pages are not complete,
but there is a small example of the new functionality.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>

zyga added 10 commits October 5, 2020 08:08
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
This is not portable; a more portable fallback will follow.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Those are identical to the default case,
but defining them silences a warning
emitted by MSVC by default.
The benchmark logic is implemented
with QueryPerformanceFrequency
and QueryPerformanceCounter.
There is no guarantee that the loop
will have at least one iteration.
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
@zyga zyga changed the base branch from master to main December 28, 2020 19:47