Add more parameters and wait for server availability #18

franz1981 · 2025-10-22T06:30:20Z

IDK @holly-cummins if that defeats the purpose of simplicity, but it should be invisible to users who run with the default values.

If you like this type of change I can do the same for the other scripts (that's why it's in DRAFT).

edeandrea · 2025-10-22T13:18:50Z

LGTM!

holly-cummins · 2025-10-23T11:07:27Z

Hmm, I feel conflicted about this. It clearly makes the scripts more powerful, but I wonder if it makes them less useful for the intended purposes:

A sceptical user can read the scripts, and quickly see it's just a simple wrk2/hyperfoil invocation
A user who has 62 other things to do can just invoke-and-go without thinking

I know the defaults mean users can still just invoke-and-go, but the extra capability does reduce the readability of the scripts. It means that if we're doing a live demo and we run 'more' on the script to show what we just did, it's a bit overwhelming for the audience.

If we wanted to do things 'properly', wouldn't we use the 'medium-complexity' scripts that Eric is working on? If the crappy scripts are actually ok, it leaves less of a gap for the medium scripts to fit into. :)

Did you happen to spot how much of a difference waiting for the first request makes to the throughput? I guess not doing so would penalise the runtime with the slower start time and thus be 'unfair'. I can't decide if the extra complexity is worth it for the fairness, or not. I think on that one it probably is, but for the parameters, I'd almost just want to say people should edit the script if they don't like the defaults, because it's only a simple shell script. We could maybe give variable names to the arguments, though. That does make it more obvious what's going on when the script is shown with 'more'.

So my initial take is

Yes to the || true because that reduces noise from the output and I should have done it anyway
No to the multiple iterations because in a live demo/user in a hurry, more output on screen is worse, and slower is worse
No to the arguments because it makes the script text so long
Yes to using variables in the hyperfoil invocation, rather than magic numbers (improves clarity)
Tentative yes to waiting for first request to improve the result fairness
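
To make the two 'yes' points about named variables and || true concrete, here is a rough sketch of what they could look like. It is not the script from this PR: the variable names, values, endpoint and function names are invented for illustration, and it assumes the wrk2 build of wrk (which takes -R for the target rate) rather than the exact Hyperfoil invocation.

```shell
#!/usr/bin/env bash
# Hypothetical names and values, not the ones used in the actual script.
THREADS=2
CONNECTIONS=100
DURATION=30s
RATE=2000                             # target requests per second (wrk2 -R)
URL="http://localhost:8080/fruits"    # assumed endpoint

# Run the load with named parameters instead of magic numbers.
run_load() {
  wrk -t"$THREADS" -c"$CONNECTIONS" -d"$DURATION" -R"$RATE" --latency "$URL"
}

# Stop a container without printing an error if it is not running.
stop_quietly() {
  docker stop "$1" >/dev/null 2>&1 || true
}
```

Someone reading the script with 'more' sees five named values and a one-line invocation, and changing a default is a one-word edit rather than argument parsing.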

franz1981 · 2025-10-23T20:35:37Z

I have mixed feelings as well.
What is the exact purpose of the script? What is it intended to demonstrate (from our pov)?

holly-cummins · 2025-10-23T20:50:20Z

I have mixed feelings as well. What is the exact purpose of the script? What is it intended to demonstrate (from our pov)?

Two purposes:

When we're doing live talks, we know we run scripts exactly like this, and every time we do so, we re-invent them, and probably get things wrong. Eric's done it, Clement's got a repo he uses, Julien's got a set of scripts, I've done it ... So rather than every member of the Quarkus team continually reinventing comparative benchmarks and writing mini-performance harnesses, we want a shared resource that uses at least some best practices (and is easy for you to keep an eye on, because it's in one place).
Everyone else comparing Quarkus to Spring, either for their own talks or just for their own internal research, also goes through a similar process. Seeing an application run on their own machine is always going to be more compelling than numbers we publish, even if the numbers are more methodologically sound. So we want to make it easy and accessible.

In order to feel trustworthy, either to an audience or to someone exploring on their laptop, our scripts have to be easy to understand and digest, which means they'd ideally be one or two lines, and not use any unfamiliar tools. We know they'll actually be less valid if they're that simple, but that's why we have the 'ok, now do it properly' version, and the surrounding discussion about 'here's what's wrong with the numbers you just got'. But there's no point in having two 'do it properly' sets of scripts. :)

franz1981 · 2025-10-23T21:43:44Z

'here's what's wrong with the numbers you just got'. But there's no point in having two 'do it properly' sets of scripts

Thanks, got it!
In this regard I think that, even before my changes, this script was "too good".
It sets the number of cores and the memory, it uses wrk2, and it properly starts and stops a DBMS via Docker...
And still, due to some missed configuration options, it is likely not able to deliver a reliable comparison or reliable data.

Now, let's say we are at a conference, using this script, and it reports bad numbers (which is possible): the speaker then has to "fix" it, live, in order to obtain something better.
And maybe that won't be enough, and more changes need to be made, still live and step by step, to show users why each one is required, until the numbers become "good enough".
Another option is to just say "yeah, it was expected not to be good enough" and show the much bigger script made by Eric, which would overwhelm users and fail to explain what the original was missing, because the two are too different.

A third option, which is the purpose of this PR, is to have a slightly more complex script which doesn't need to be fixed live, but just configured, to obtain "good enough" numbers, making it

less error prone for the speaker
easier to grasp for users

At the same time, configuring it to be broken by default makes it easier to show how the numbers can become more reliable by changing some parameter values.

That said, I could remove:

parsing the args: it creates too much visual noise
the measurement iteration

And see how it looks.
I've still left the curl command in, to silently wait for the server to be up and running; otherwise wrk could fail because the server is not there yet (especially with Spring, which can be quite slow to start...).
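
For readers who have not seen the pattern, waiting for the server with curl can be as small as the sketch below. This is an illustration rather than the code in the PR: the function name, the one-second polling interval and the 60-second default timeout are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: block until the server answers, or give up after a deadline.
# Function name, polling interval and default timeout are illustrative.
wait_for_server() {
  url="$1"
  timeout_s="${2:-60}"
  waited=0
  # -s: no progress output, -f: fail on HTTP errors, -o /dev/null: drop the body
  until curl -sf -o /dev/null "$url"; do
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "server at $url not ready after ${timeout_s}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

Calling something like wait_for_server "http://localhost:8080/fruits" 30 before starting the load keeps a slow-starting runtime from being measured while it is still booting, and the timeout stops a demo from hanging forever if the app never comes up.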

franz1981 · 2025-10-27T05:29:19Z

I've tried to reduce the visual noise while still allowing a speaker to tune the script more easily,
e.g. by having more explicit, named params

franz1981 · 2025-10-27T06:25:29Z

FYI Hyperfoil/Hyperfoil#626

This is why the timeout has been added here ^^
We will work on a fix on the Hyperfoil side, although it only happens under a specific condition, i.e. when the requested throughput is immensely higher than what the system under test can sustain

franz1981 · 2025-10-28T06:35:22Z

Last but not least: having a way to parametrize the number of cores is good for showing people how performance can be affected in some unexpected ways, i.e. giving the runtime a single core can silently switch the GC algorithm (as well as scale the number of compiler threads) ^^
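
One way to see the GC switch for yourself is sketched below. It is only an illustration: the CPUS and MEMORY names, their defaults and the function name are assumptions, and it relies on Docker's --cpus/--memory limits and the JVM's -XX:+PrintFlagsFinal flag, with any image that has java on its path.

```shell
#!/usr/bin/env bash
# Hypothetical parameter names; override them from the environment.
CPUS="${CPUS:-4}"
MEMORY="${MEMORY:-2g}"

# Print the GC selection flags the JVM settles on under the given limits.
# $1 is a container image that provides a java binary.
show_selected_gc() {
  docker run --rm --cpus="$CPUS" --memory="$MEMORY" "$1" \
    java -XX:+PrintFlagsFinal -version | grep -E 'Use(Serial|Parallel|G1)GC '
}
```

Running it once with the default and once with CPUS=1 against the same image should show which of the Use...GC flags is true in each case, which makes the silent change visible without touching the application.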

franz1981 force-pushed the stresstest_params branch 2 times, most recently from 0061dcd to c8d5d8e Compare October 27, 2025 05:28

franz1981 marked this pull request as ready for review October 27, 2025 05:28

Add more parameters and wait for server availability

4e57fd9

franz1981 force-pushed the stresstest_params branch from c8d5d8e to 4e57fd9 Compare October 28, 2025 06:34

holly-cummins mentioned this pull request Nov 3, 2025

Do not fix load in a test intended to measure throughput #28

Merged
