Cluster refactoring #4541
Comments
While I am supportive of the direction, I have some reservations. For example, when we call "REPLICAOF host port" we launch an asynchronous flow, but when we then call "REPLICAOF NO ONE" we synchronously wait for the previous replica instance to stop and join on the previous flow. The previous instance of the "state" is destroyed before the next one starts, and this is how we ensure the consistency of our operations. I am sure such transitions exist in cluster as well. So it's not about data members being protected by a mutex, it's about the transactional model of our states: how do we ensure that if we kick off a flow |
And I feel a task of such complexity requires more than just a few short sentences. |
This is a good point and indeed worth writing about. To answer this question briefly, |
but then I feel it's not enough: we will always have bugs like #4663 as long as we block on a mutex in global transactions. When that mutex covers not only simple, CPU-only operations that are easy to reason about but also non-trivial flows, we will have such deadlocks. |
Problem: many methods use the vector of migrations, and because migrations can be removed, every one of these methods has to lock the mutex.
Solution: make the vector of migrations immutable. We already consider the slot migration process part of the config, so we can handle the vector of migrations exactly as we handle the config: store it in the config, or create a thread_local ptr for it. In that case we don't need any synchronization to work with migrations, and we take the mutex only when the config is updated (during the DFLYCLUSTER CONFIG command and migration finalization).
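To make the proposal concrete, here is a minimal sketch of the immutable-snapshot idea. It is not the project's actual code: `MigrationInfo`, `ClusterConfig`, `tl_config`, `config_update_mu`, `WithoutMigration`, `InstallOnThisThread` and `CountMigrations` are all hypothetical names made up for illustration. Readers only copy a thread-local pointer to a const config, and only the update path takes the mutex.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one slot migration (illustration only).
struct MigrationInfo {
  std::string node_id;
  int slot_begin;
  int slot_end;
};

// Hypothetical config object: the migrations live inside it.
struct ClusterConfig {
  std::vector<MigrationInfo> migrations;
};

// Readers only ever see a pointer to a const config.
using ConfigPtr = std::shared_ptr<const ClusterConfig>;

// Each thread keeps its own pointer to the current snapshot.
thread_local ConfigPtr tl_config = std::make_shared<const ClusterConfig>();

// Taken only on the update path (config change, migration finalization).
std::mutex config_update_mu;

// Builds a new config without the given migration.
// The old snapshot is copied, never modified.
ConfigPtr WithoutMigration(const ConfigPtr& old_cfg, const std::string& node_id) {
  assert(old_cfg);
  std::lock_guard<std::mutex> lk(config_update_mu);
  auto next = std::make_shared<ClusterConfig>(*old_cfg);
  auto& m = next->migrations;
  m.erase(std::remove_if(m.begin(), m.end(),
                         [&](const MigrationInfo& x) { return x.node_id == node_id; }),
          m.end());
  return next;
}

// Assumption: a real system would run this on every thread after an update.
void InstallOnThisThread(ConfigPtr cfg) {
  tl_config = std::move(cfg);
}

// Read path: no mutex, just a copy of the thread-local pointer.
std::size_t CountMigrations() {
  ConfigPtr snapshot = tl_config;
  return snapshot->migrations.size();
}
```

A reader that already holds a snapshot keeps a consistent view of the migrations even if a new config is installed in the meantime. The sketch does not answer the question about the transactional model of state transitions raised above, it only shows how the read path can avoid the mutex.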
Other improvements: