Removing bias from shard allocation. (#5233)

We have noticed after running for a long time, in airmail the distribution of shard count amongst ingester seems uniform, but one or two indexers are getting most of the throughput. This could be caused by an indirect bias in the allocation of shard to ingester. For instance, in airmail, most indexes are very small, but a few of them are much larger. Small indexes have 1 shard with a very low throughput. Large indexes on the other hands have several shards with typically >2MB of throughut. Larger indexes are also more subject to scale up/down, since other indexes tend to stick to having 1 shard (we don't scale down to 0). This PR tries to remove any possible bias when assigning / removing shards in - scale up - scale down - rebalance. Scale up --------------------------- This is the most important change/bias. In presence of a tie, we were picking the ingester with the lowest ingester id. Also, on replication, the logic picking a follower was broken (for a given leader, we were always picking the same follower). The `max_num_shards_to_allocate_per_node` was also wrong (division instead of ceil div) (probably minor). Scale down ---------------------------- The code was relying on the long term ingestion rate, and then ties were solved by the hashmap iterator. The Hashmap iterator is quite random so this was probably not a problem. Rebalance ---------------------------- Arithmetic used to compute the target number of shards was a little bit inaccurate. The shard that are rebalanced are now picked at random (instead of picking the oldest shards in the model).
quickwit-oss · Jul 22, 2024 · 69f700a · 69f700a
1 parent 8e6dc17
commit 69f700a
Show file tree

Hide file tree

Showing 4 changed files with 364 additions and 147 deletions.
diff --git a/quickwit/quickwit-control-plane/src/control_plane.rs b/quickwit/quickwit-control-plane/src/control_plane.rs
@@ -626,6 +626,7 @@ impl Handler<DeleteIndexRequest> for ControlPlane {
             .model
             .list_shards_for_index(&index_uid)
             .flat_map(|shard_entry| shard_entry.ingesters())
+            .map(|node_id_ref| node_id_ref.to_owned())
             .collect();
 
         self.model.delete_index(&index_uid);
@@ -750,6 +751,7 @@ impl Handler<DeleteSourceRequest> for ControlPlane {
                 shard_entries
                     .values()
                     .flat_map(|shard_entry| shard_entry.ingesters())
+                    .map(|node_id_ref| node_id_ref.to_owned())
                     .collect()
             } else {
                 BTreeSet::new()