Fix deadlock between syncd and orchagent during syncd initialization failure #1723
Merged
prsunny merged 8 commits into sonic-net:master on Jan 23, 2026
fccbb5b to 6859348
Collaborator
@lolyu, could you please help review?
Contributor
Requested a review from @prsunny. This looks like a very nice-to-have fix.
479cb16 to c8b1c16
When syncd requests a shutdown, orchagent may be blocked waiting for a response to an init view (or other NOTIFY) command. Since syncd stops processing commands while waiting for the shutdown response, orchagent never receives its response and cannot acknowledge the shutdown request, resulting in a deadlock.

This fix adds the selectable channel to the select loop during shutdown-wait mode and handles incoming commands appropriately:
- NOTIFY commands receive a SAI_STATUS_FAILURE response to unblock the waiting orchagent
- Other commands are logged and ignored

This prevents orchagent from hanging until timeout when syncd is failing.

Signed-off-by: david.zagury <davidza@nvidia.com>
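The behavior described in the commit message can be sketched as a small Python model. This is only an illustration of the idea, not the syncd C++ code: the `Command` and `ShutdownWaitLoop` classes, the event tuples, and the `"INIT_VIEW"` and `"CREATE"` labels are all hypothetical names chosen for the example, and `SAI_STATUS_FAILURE` is represented by a plain string.

```python
from collections import deque
from dataclasses import dataclass, field

# Stand-in for the real SAI status code (assumption: a string is enough here).
SAI_STATUS_FAILURE = "SAI_STATUS_FAILURE"


@dataclass
class Command:
    kind: str     # "NOTIFY" or anything else (hypothetical labels)
    payload: str  # e.g. "INIT_VIEW" (hypothetical label)


@dataclass
class ShutdownWaitLoop:
    """Toy model of syncd waiting for orchagent to acknowledge a shutdown."""
    responses: list = field(default_factory=list)  # replies sent to orchagent
    ignored: list = field(default_factory=list)    # commands logged and dropped

    def handle_command(self, cmd: Command) -> None:
        if cmd.kind == "NOTIFY":
            # Reply with a failure so the blocked caller can continue.
            self.responses.append((cmd.payload, SAI_STATUS_FAILURE))
        else:
            # Everything else is only recorded, never executed.
            self.ignored.append(cmd)

    def run(self, events) -> bool:
        """Process events in order; return True once the shutdown is acknowledged."""
        queue = deque(events)
        while queue:
            source, item = queue.popleft()
            if source == "shutdown_ack":
                return True
            if source == "command":
                self.handle_command(item)
        return False


loop = ShutdownWaitLoop()
acked = loop.run([
    ("command", Command("NOTIFY", "INIT_VIEW")),  # orchagent is blocked on this
    ("command", Command("CREATE", "route")),      # logged and ignored
    ("shutdown_ack", None),                       # orchagent can now acknowledge
])
```

In this model the acknowledgement is only reachable because the NOTIFY command is answered first; without the command branch in `run`, the pending command would never get a reply, which mirrors the deadlock the commit message describes.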
c8b1c16 to 0b72321
Contributor Author
@prabhataravind I addressed the Copilot comments.
prabhataravind approved these changes Jan 22, 2026
Collaborator
Cherry-pick PR to msft-202412: Azure/sonic-sairedis.msft#104
This PR fixes sonic-net/sonic-buildimage#24799.