QQ: checkpointing frequency improvements #11964
The forced push was a rebase.
May I assume that a checkpoint taken roughly every 1M messages is a reasonable rate? With a 50M message backlog across 4 queues, the node takes 18s to start on a mostly idle 10-core machine with a reasonably fast 3-year-old SSD. With a workload that simulates peak throughput with 4 queues, 3 publishers and 3 consumers,
Depending on how many messages there are in the backlog, it will grow the number of indexes between checkpoints from 4096 to ~1M (max), so yes, that tallies. Cheers.
Also remove a resolved TODO about conversion for the `last_checkpoint` field.
QQ: checkpointing frequency improvements (backport #11964)
The current approach takes too many checkpoints which affects performance negatively, especially with large backlogs.
This PR takes an approach more similar to what was done for release cursors in 3.13.x.
Also add a `force_checkpoint` aux command that the purge operation emits; this can also be used to try to force a checkpoint.
The checkpointing config can be changed by setting the `quorum_queue_checkpoint_config` persistent term:
`persistent_term:set(quorum_queue_checkpoint_config, {MinIntervalMs, MinIndexes, MaxIndexes}).`
The current values are:
`{1000, 4096, 666667}`
This means a checkpoint is taken at most once every 1s, and only if at least 4096 indexes have been applied since the previous one. The minimum number of indexes between checkpoints grows in line with the message backlog, up to at most 666667.
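As a rough illustration of the setting described above, the term could be changed from an Erlang shell attached to the node (for example one opened with `rabbitmq-diagnostics remote_shell`). The values below are made up for the example and are not recommendations. Persistent terms live in memory only, so this sketch assumes the override has to be reapplied after a node restart. When a running queue picks up a changed value is not stated here.

```erlang
%% Tuple layout, per the description above:
%%   {MinIntervalMs, MinIndexes, MaxIndexes}
%%
%% Hypothetical override: checkpoint at most once every 5s, require at
%% least 8192 applied indexes, and keep the upper bound at its current
%% value of 666667.
persistent_term:set(quorum_queue_checkpoint_config, {5000, 8192, 666667}).

%% Read the value back to confirm what is currently stored.
persistent_term:get(quorum_queue_checkpoint_config).
```

Note that updating an existing persistent term to a different value triggers a global garbage collection pass in the Erlang VM, so this is something to change rarely rather than tune continuously.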