-
Notifications
You must be signed in to change notification settings - Fork 955
Fix a race condition in CompositeEndpointGroup #6220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Motivation: `CompositeEndpointGroup` has a race condition that can occur when multiple child `EndpointGroup`s notify listeners at the same time from different threads. Consider the following sequence of events involving two groups, A and B: 1. A's listener is invoked by thread A. https://github.com/line/armeria/blob/2367de299f71478bd93a618bd40b70a3676656c0/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java#L64 2. Thread A sets `dirty = true`. https://github.com/line/armeria/blob/2367de299f71478bd93a618bd40b70a3676656c0/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java#L65 3. Thread A sets `dirty = false`. https://github.com/line/armeria/blob/2367de299f71478bd93a618bd40b70a3676656c0/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java#L89 4. Thread A builds `newEndpointsA`. https://github.com/line/armeria/blob/2367de299f71478bd93a618bd40b70a3676656c0/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java#L95-L98 5. B's listener is invoked by thread B. 6. Thread B sets `dirty = true`. 7. Thread B sets `dirty = false`. 8. Thread B builds `newEndpointsB` (which includes latest data from both A and B) and sets it. 9. Thread A sets the `newEndpointsA`, which does not contain updates from B. https://github.com/line/armeria/blob/2367de299f71478bd93a618bd40b70a3676656c0/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java#L100 This results in stale endpoints being set, despite a newer state already being computed and applied. Modifications: - Fixed the race condition in `CompositeEndpointGroup`. Result: - `CompositeEndpointGroup` now handles concurrent updates correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation looks good in terms of correctness.
Alternatively, I think a lock based implementation could help avoid comparing endpoint lists which might be large.
I wanted to avoid using a lock since we don't know what's inside in
CAS uses the references when comparing, so it's no big deal. 😉 |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6220 +/- ##
============================================
- Coverage 74.46% 0 -74.47%
============================================
Files 1963 0 -1963
Lines 82437 0 -82437
Branches 10764 0 -10764
============================================
- Hits 61385 0 -61385
+ Misses 15918 0 -15918
+ Partials 5134 0 -5134 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
newEndpointsBuilder.addAll(endpointGroup.endpoints()); | ||
} | ||
final List<Endpoint> newEndpoints = newEndpointsBuilder.build(); | ||
if (merged.compareAndSet(oldEndpoints, newEndpoints)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because ImmutableList.of()
returns a singleton instance, rebuildEndpoints
may update outdated endpoints in rare cases.
- Thread A updates
endpointA
.- Thread A prepares to set
list(endpointA)
in L93
- Thread A prepares to set
- Thread B removes
endpointA
- Thread B sets empty -> empty in L94
- Thread A sets empty ->
list(endpointA)
in L94 - merged is
list(endpointA)
but the actual endpoints inendpointGroups
are empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation! Updated. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! 👍 👍
Motivation:
CompositeEndpointGroup
has a race condition that can occur when multiple childEndpointGroup
s notify listeners at the same time from different threads. Consider the following sequence of events involving two groups, A and B:armeria/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java
Line 64 in 2367de2
dirty = true
.armeria/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java
Line 65 in 2367de2
dirty = false
.armeria/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java
Line 89 in 2367de2
newEndpointsA
.armeria/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java
Lines 95 to 98 in 2367de2
dirty = true
.dirty = false
.newEndpointsB
(which includes latest data from both A and B) and sets it.newEndpointsA
, which does not contain updates from B.armeria/core/src/main/java/com/linecorp/armeria/client/endpoint/CompositeEndpointGroup.java
Line 100 in 2367de2
This results in stale endpoints being set, despite a newer state already being computed and applied.
Modifications:
CompositeEndpointGroup
.Result:
CompositeEndpointGroup
now handles concurrent updates correctly.Motivation:
Explain why you're making this change and what problem you're trying to solve.
Modifications:
Result: