Removing bias from shard allocation. (#5233)
We noticed that in airmail, after running for a long time, the shard count is distributed roughly uniformly across ingesters, but one or two indexers receive most of the throughput.

This could be caused by an indirect bias in the allocation of shard to
ingester. For instance, in airmail, most indexes are very small, but a
few of them are much larger. Small indexes have 1 shard with a very low
throughput. Large indexes, on the other hand, have several shards with
typically >2MB of throughput.

Larger indexes are also more subject to scale up/down, since other
indexes tend to stick to having 1 shard (we don't scale down to 0).

This PR tries to remove any possible bias when assigning / removing
shards in
- scale up
- scale down
- rebalance.

Scale up
---------------------------

This is the most important change/bias.
When several ingesters were tied, we always picked the one with the
lowest ingester id.
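
The effect of the tie-breaking rule can be illustrated with a minimal sketch. It is not the code from this commit: the function names, the `BTreeMap` of shard counts, and the small xorshift generator (used only to avoid an external dependency) are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Tiny xorshift64 PRNG, used here only to keep the sketch dependency-free.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Biased variant (hypothetical): `min_by_key` returns the first minimum,
/// and a `BTreeMap` iterates in key order, so ties always go to the
/// lowest ingester id.
fn pick_lowest_id(shard_counts: &BTreeMap<String, usize>) -> Option<&str> {
    shard_counts
        .iter()
        .min_by_key(|(_, count)| **count)
        .map(|(id, _)| id.as_str())
}

/// Unbiased variant (hypothetical): collect every ingester tied for the
/// lowest shard count, then pick one of them at random.
fn pick_random_among_ties<'a>(
    shard_counts: &'a BTreeMap<String, usize>,
    rng: &mut XorShift64,
) -> Option<&'a str> {
    let min_count = *shard_counts.values().min()?;
    let candidates: Vec<&str> = shard_counts
        .iter()
        .filter(|(_, count)| **count == min_count)
        .map(|(id, _)| id.as_str())
        .collect();
    // Modulo bias is negligible for a handful of candidates.
    let idx = (rng.next_u64() % candidates.len() as u64) as usize;
    Some(candidates[idx])
}
```

With two ingesters tied for the lowest count, the first function returns the same ingester on every call, while the second spreads new shards across both.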

Also, on replication, the logic picking a follower was broken
(for a given leader, we were always picking the same follower).

The `max_num_shards_to_allocate_per_node` was also computed incorrectly
(integer division instead of ceil div). This is probably minor.
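
To see why the rounding direction matters, here is a small sketch. It assumes the cap is derived from the total number of shards divided by the number of ingesters; the exact formula used in the code is not shown in this message, and both function names are hypothetical.

```rust
/// Hypothetical per-node cap using plain integer (floor) division.
fn max_shards_per_node_floor(num_shards: usize, num_nodes: usize) -> usize {
    num_shards / num_nodes
}

/// Hypothetical per-node cap using ceiling division.
/// Assumes `num_nodes > 0`.
fn max_shards_per_node_ceil(num_shards: usize, num_nodes: usize) -> usize {
    (num_shards + num_nodes - 1) / num_nodes
}
```

For 7 shards over 3 ingesters, floor division gives a cap of 2, so at most 6 shards fit under the cap. Ceiling division gives 3, which leaves room for all 7.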

Scale down
----------------------------

The code was relying on the long term ingestion rate, and then
ties were broken by the HashMap iterator. HashMap iteration order
is effectively random, so this was probably not a problem.

Rebalance
----------------------------

Arithmetic used to compute the target number of shards was a
little bit inaccurate.

The shards that are rebalanced are now picked at random (instead of
picking the oldest shards in the model).
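
Picking shards at random can be sketched with a partial Fisher-Yates shuffle. This is only an illustration under assumptions: the function name, the use of `u64` shard ids, and the small xorshift generator are hypothetical and not taken from the commit.

```rust
/// Tiny xorshift64 PRNG, used here only to keep the sketch dependency-free.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical helper: selects up to `num_to_move` shard ids at random,
/// rather than taking the first (for example, oldest) entries of the list.
fn pick_shards_to_move(
    mut shard_ids: Vec<u64>,
    num_to_move: usize,
    rng: &mut XorShift64,
) -> Vec<u64> {
    let n = shard_ids.len();
    let k = num_to_move.min(n);
    for i in 0..k {
        // Swap a random element of the not-yet-selected suffix into slot `i`.
        let j = i + (rng.next_u64() % (n - i) as u64) as usize;
        shard_ids.swap(i, j);
    }
    shard_ids.truncate(k);
    shard_ids
}
```

Because the selection no longer depends on shard age or position, the shards of large, frequently rescaled indexes are not systematically favored when shards are moved.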
fulmicoton authored Jul 22, 2024
1 parent 8e6dc17 commit 69f700a
Showing 4 changed files with 364 additions and 147 deletions.
2 changes: 2 additions & 0 deletions quickwit/quickwit-control-plane/src/control_plane.rs
@@ -626,6 +626,7 @@ impl Handler<DeleteIndexRequest> for ControlPlane {
.model
.list_shards_for_index(&index_uid)
.flat_map(|shard_entry| shard_entry.ingesters())
.map(|node_id_ref| node_id_ref.to_owned())
.collect();

self.model.delete_index(&index_uid);
@@ -750,6 +751,7 @@ impl Handler<DeleteSourceRequest> for ControlPlane {
shard_entries
.values()
.flat_map(|shard_entry| shard_entry.ingesters())
.map(|node_id_ref| node_id_ref.to_owned())
.collect()
} else {
BTreeSet::new()