
Cluster refactoring #4541

Open
BorysTheDev opened this issue Jan 31, 2025 · 4 comments

@BorysTheDev (Contributor) commented Jan 31, 2025

Problem: we have a lot of methods that use the vector of migrations, and because migrations can be removed, all of these methods need to lock a mutex.
Solution: make the vector of migrations immutable. We already treat the slot-migration process as part of the config, so we can apply exactly the same approach we use for the config: store the vector in the config itself, or keep a thread_local pointer to it. Then no synchronization is needed to read the migrations, and the mutex is taken only when the config is updated (during the DFLYCLUSTER CONFIG command and at migration finalization). A sketch of this scheme follows.
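
A minimal sketch of the idea, with hypothetical type and function names (not current Dragonfly APIs): readers follow a thread-local pointer to an immutable snapshot, and only writers take the mutex and republish.

```cpp
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical names; this is a sketch of the proposal, not existing code.
struct MigrationInfo { /* source node, slot ranges, state, ... */ };

// The migrations live inside an immutable snapshot alongside the config.
struct ClusterSnapshot {
  // ClusterConfig config;                // the existing config data
  std::vector<MigrationInfo> migrations;  // never mutated after publish
};

// Each thread keeps its own pointer to the current snapshot.
thread_local std::shared_ptr<const ClusterSnapshot> tl_snapshot;

std::mutex set_config_mu;  // taken only by writers

// Writers (DFLYCLUSTER CONFIG, migration finalization) build a fresh
// snapshot and republish it. In Dragonfly the republish would need to be
// dispatched to every proactor thread, since each thread owns its own
// thread_local copy; the assignment below covers the calling thread only.
void PublishSnapshot(std::shared_ptr<const ClusterSnapshot> next) {
  std::lock_guard<std::mutex> lk(set_config_mu);
  tl_snapshot = std::move(next);
}

// Readers: no mutex, because a published snapshot is never mutated.
const ClusterSnapshot* Current() { return tl_snapshot.get(); }
```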

Other improvements:

  1. make the interface of incoming and outgoing migrations more similar, or even give them a common ancestor
  2. flush slots in one place for all migrations and the config, during the DFLYCLUSTER CONFIG command
  3. use set_config_mu_ instead of migration_mu_; this works without any issues
  4. update the streamer
  5. refactor DflyClusterGetSlotInfo to do simple atomic adds (relaxed) instead of taking a mutex (see the first sketch after this list)
  6. remove unused code:
  • auto host_ip = cntx->conn()->RemoteEndpointAddress();
  • #include
  7. rewrite SlotRanges::Merge(), because it is slow and is called under a mutex (see the second sketch after this list)
  8. move tl_cluster_config out of cluster_family into cluster_config
  9. block requests only for the migrated slots instead of a global Pause()
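
For item 5, a minimal sketch of the relaxed-atomic approach (the fields of SlotStats here are assumed for illustration, not Dragonfly's actual struct): shard threads bump counters with relaxed atomics, and the command thread reads them without a mutex, which is fine for an informational command that does not need a consistent cross-slot snapshot.

```cpp
#include <atomic>
#include <cstdint>

// Illustrative per-slot counters (assumed fields).
struct SlotStats {
  std::atomic<uint64_t> key_count{0};
  std::atomic<uint64_t> total_reads{0};
  std::atomic<uint64_t> total_writes{0};
};

// Hot path on a shard thread: a single relaxed RMW, no lock.
inline void OnWriteAccess(SlotStats& s) {
  s.total_writes.fetch_add(1, std::memory_order_relaxed);
}

// Aggregation path in DflyClusterGetSlotInfo: relaxed loads per slot.
inline uint64_t TotalWrites(const SlotStats& s) {
  return s.total_writes.load(std::memory_order_relaxed);
}
```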
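
For item 7, one standard way to make SlotRanges::Merge() fast is a sort followed by a single coalescing pass, O(n log n) overall. This is a sketch under that assumption, not the existing implementation:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct SlotRange {
  uint16_t start, end;  // inclusive bounds, 0..16383
};

// Sort once, then coalesce overlapping or adjacent ranges in one pass.
// O(n log n) total, so it stays cheap even when called under a mutex.
std::vector<SlotRange> Merge(std::vector<SlotRange> ranges) {
  if (ranges.empty())
    return ranges;
  std::sort(ranges.begin(), ranges.end(),
            [](const SlotRange& a, const SlotRange& b) { return a.start < b.start; });
  std::vector<SlotRange> merged;
  merged.push_back(ranges.front());
  for (size_t i = 1; i < ranges.size(); ++i) {
    SlotRange& last = merged.back();
    if (ranges[i].start <= last.end + 1)  // overlapping or adjacent
      last.end = std::max(last.end, ranges[i].end);
    else
      merged.push_back(ranges[i]);
  }
  return merged;
}
```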
@BorysTheDev self-assigned this Jan 31, 2025

@romange (Collaborator) commented Feb 27, 2025

While I am supportive of the direction, I have some reservations.
In Dragonfly we have state transitions that are often atomic and transactional.

For example, when we call "REPLICAOF host port" we launch an asynchronous flow, but when we later call "REPLICAOF NO ONE" we synchronously wait for the previous replica instance to stop and join the previous flow. This is done synchronously, and the previous instance of the "state" is destroyed before the next one starts. This is how we ensure the consistency of our operations.

I am sure such transitions exist in the cluster code as well. So it's not about data members being protected by a mutex; it's about the transactional model of our states: how do we ensure that if we kick off flow A and then flow B, the end result will be B_end and not A_end?

@romange (Collaborator) commented Feb 27, 2025

And I feel a task of such complexity requires more than just a few short sentences.

@adiholden (Collaborator) commented

This is a good point and indeed worth writing about. To answer the question briefly:
we do need to keep a mutex for such operations. For example, with REPLICAOF we will have a mutex guaranteeing that we don't have two REPLICAOF flows running at the same time. When a new flow starts, it first cancels the previous flow and only then starts the new one. The mutex is held until we finish updating the state on all threads; this way we make sure that when flow B starts from some thread, we take the mutex, cancel A, start B, and set the new state on all threads. A sketch of this pattern follows.
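
A minimal sketch of that discipline, with hypothetical class names (Dragonfly's actual replication code is structured differently): the mutex serializes transitions, and cancelling and joining the old flow before publishing the new one guarantees the end state is B_end.

```cpp
#include <memory>
#include <mutex>

// Hypothetical names, for illustration only.
class ReplicationFlow {
 public:
  // Signals the async flow to stop and joins its fiber.
  void CancelAndJoin() { /* ... */ }
  void Start() { /* ... */ }
};

class FlowOwner {
 public:
  void SwitchTo(std::unique_ptr<ReplicationFlow> next) {
    std::lock_guard<std::mutex> lk(flow_mu_);  // serializes all transitions
    if (current_)
      current_->CancelAndJoin();  // A is fully stopped before B begins
    current_ = std::move(next);
    current_->Start();
    // The mutex stays held until the new state is published to all
    // threads, so the end result is always the last requested flow.
  }

 private:
  std::mutex flow_mu_;
  std::unique_ptr<ReplicationFlow> current_;
};
```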

@romange (Collaborator) commented Mar 2, 2025

But then I feel it's not enough, and we will always have bugs like #4663.

As long as we block on a mutex in global transactions, and this mutex covers not only CPU-only operations that are simple to reason about but also non-trivial flows, we will keep having such deadlocks.
