[#4140] ping-check: fix potential deadlock between PingCheckMgr and P… #151
Thank you for finding this race. I am not sure the provided fix is correct: keeping a reference to the channel while stop is called might prevent some resources from being released (a memory leak caused by not calling io_service_->stopAndPoll() after the last reference to the channel has been released). I'll try to fix the intertwined mutex locking between PingCheckMgr and PingChannel.
Thanks for pointing this out. I think the current order (channel_.reset() before io_service_->stopAndPoll()) is still safe:
Another possible fix would be to modify PingChannel::sendNext(): release the channel lock before invoking next_to_send_cb_, then reacquire the channel lock afterward for a second check and send. This would unify the lock acquisition order and avoid the deadlock. Why I think this might be safe:
Do you think this alternative would be acceptable, or do you see potential issues I may have missed?
The second approach seems better. I will try to understand whether there are other possible issues.
I think that moving:
before MultiThreadingLock lock(*mutex_); in PingChannel::sendNext() is the best and simplest approach. The callback just returns the next target, if available, regardless of socket state, so it can be called at any time.
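The ordering proposed in this comment can be sketched in isolation. This is a minimal illustration, not the Kea code: ChannelSketch, its members, and the callback signature are hypothetical stand-ins, and std::mutex replaces MultiThreadingLock. The only point it shows is that the callback runs before the channel lock is taken, so the callback is free to take its own lock.

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a channel; not the Kea PingChannel class.
class ChannelSketch {
public:
    // Assumed callback shape: fills 'target' and returns true when a
    // target is available, returns false otherwise.
    using NextToSendCallback = std::function<bool(std::string&)>;

    explicit ChannelSketch(NextToSendCallback cb)
        : next_to_send_cb_(std::move(cb)) {}

    // Returns true if a target was fetched and recorded as sent.
    bool sendNext() {
        // Step 1: invoke the callback with no channel lock held.
        std::string target;
        const bool have_target = next_to_send_cb_(target);

        // Step 2: only now take the channel lock and check channel state.
        std::lock_guard<std::mutex> lock(mutex_);
        if (!have_target || !open_) {
            return false;
        }
        sent_.push_back(target);
        return true;
    }

    size_t sentCount() {
        std::lock_guard<std::mutex> lock(mutex_);
        return sent_.size();
    }

private:
    NextToSendCallback next_to_send_cb_;
    std::mutex mutex_;
    bool open_ = true;
    std::vector<std::string> sent_;
};
```

One consequence of this ordering is that a target can be fetched and then dropped if the channel turns out to be closed, so a real implementation would need to decide how to handle that case.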
Thank you for your review and helpful suggestions. I've updated the code as discussed. Could you please take another look and confirm whether this aligns with your proposal?
…ingChannel Split PingCheckMgr::nextToSend into fetch/update phases to avoid lock inversion
Looks alright. The ticket will be triaged, updated, reviewed and tested. Thank you again for finding this and for your patience.
I’d be glad if my contribution could be acknowledged. I appreciate your review and look forward to seeing it in the next dev release.
Of course we will acknowledge you for this. Please provide the credentials to include in the ChangeLog: name, nickname, alias, company, or whatever you would like us to mention. Thank you.
Thanks! You can use the name liyunqing_kylin |
[PR] Fix potential deadlock between PingCheckMgr and PingChannel
Problem:
https://gitlab.isc.org/isc-projects/kea/-/issues/4140
A potential deadlock exists due to inconsistent lock acquisition order between PingCheckMgr and PingChannel:
Path 1:
PingChannel::sendNext() (holds PingChannel lock)
→ callback into PingCheckMgr::nextToSend()
→ checkSuspended() attempts to acquire PingCheckMgr lock.
Path 2:
PingCheckMgr::expirationTimedOut() (holds PingCheckMgr lock)
→ calls channel_->startSend() / channel_->startRead()
→ these attempt to acquire PingChannel lock.
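The two acquisition orders above can be modelled with two plain mutexes. The sketch below is an illustration only: LockOrderDemo and its members are hypothetical and stand in for the PingChannel and PingCheckMgr locks. Each path is run on its own and records the order in which it takes the locks; when the two paths run concurrently on different threads, the opposite orders are what makes a deadlock possible.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Hypothetical model of the two locks; not the Kea classes.
struct LockOrderDemo {
    std::mutex channel_mutex;  // stands in for the PingChannel lock
    std::mutex mgr_mutex;      // stands in for the PingCheckMgr lock

    // Path 1: channel lock first, then the manager lock via the callback.
    std::vector<std::string> path1() {
        std::vector<std::string> order;
        std::lock_guard<std::mutex> channel_lock(channel_mutex);
        order.push_back("channel");
        {
            std::lock_guard<std::mutex> mgr_lock(mgr_mutex);
            order.push_back("mgr");
        }
        return order;
    }

    // Path 2: manager lock first, then the channel lock via startSend().
    std::vector<std::string> path2() {
        std::vector<std::string> order;
        std::lock_guard<std::mutex> mgr_lock(mgr_mutex);
        order.push_back("mgr");
        {
            std::lock_guard<std::mutex> channel_lock(channel_mutex);
            order.push_back("channel");
        }
        return order;
    }
};
```

If one thread is inside path1() holding the channel lock while another is inside path2() holding the manager lock, each waits for the lock the other holds, and neither can proceed.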
Fix
Move the calls to channel_->startSend() / channel_->startRead() outside the critical section guarded by PingCheckMgr::mutex_. In the current implementation this was already mostly the case; this change ensures the call site in expirationTimedOut() consistently follows the lock-free pattern.
Concurrency Considerations
more_pings: …
channel_: … nullptr, the null check prevents invalid access.
Impact
… PingCheckMgr and PingChannel.
… PingCheckMgr does not call into PingChannel while holding its mutex.
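The pattern described in the Fix section can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual patch: ChannelStub, MgrSketch, and the pending_ counter are hypothetical, and std::mutex stands in for the real locking primitives. The decision is made under the manager lock and copied into a local more_pings flag; the channel is called only after that lock has been released, and only if the pointer is non-null.

```cpp
#include <mutex>

// Hypothetical channel stand-in with its own lock; not PingChannel.
struct ChannelStub {
    std::mutex mutex_;
    int sends = 0;

    void startSend() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++sends;
    }
};

// Hypothetical manager stand-in; not PingCheckMgr.
class MgrSketch {
public:
    MgrSketch(ChannelStub* channel, int pending)
        : channel_(channel), pending_(pending) {}

    void expirationTimedOut() {
        bool more_pings = false;
        {
            // Critical section: touch only manager state here.
            std::lock_guard<std::mutex> lock(mutex_);
            more_pings = (pending_ > 0);
            if (more_pings) {
                --pending_;
            }
        }
        // The manager lock is released before calling into the channel,
        // so the channel lock is never requested while holding it.
        if (more_pings && channel_ != nullptr) {
            channel_->startSend();
        }
    }

private:
    std::mutex mutex_;
    ChannelStub* channel_;
    int pending_;
};
```

Copying the result into a local variable keeps the decision consistent with the state observed under the lock, while the call itself happens lock-free.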