Add pop batch size support for ZMQ Consumer#1084
Merged
qiluo-msft merged 4 commits into sonic-net:master on Oct 8, 2025
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Collaborator
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
oleksandrivantsiv
approved these changes
Sep 30, 2025
Contributor
@qiluo-msft, please review/merge.
qiluo-msft
reviewed
Oct 2, 2025
qiluo-msft
reviewed
Oct 2, 2025
mssonicbld
added a commit
to mssonicbld/sonic-buildimage-msft
that referenced
this pull request
Oct 2, 2025
#### Why I did it
Increase the pop batch size and the maximum bulker limit to 65536 to speed up applying high-volume DASH configuration.
Depends on
sonic-net/sonic-sairedis#1660
sonic-net/sonic-swss-common#1084
sonic-net/sonic-swss#3910
##### Work item tracking
- Microsoft ADO **(number only)**:
#### How I did it
#### How to verify it
```
root@sonic:/home/admin# ps -aux | grep orch
root 11118 1.5 0.4 464804 267368 pts/0 Sl 02:50 0:00 /usr/bin/orchagent -d /var/log/swss -b 65536 -z zmq_sync -k 65536 -m B0:CF:0E:20:8E:DE -q tcp://eth0-midplane
2025 Sep 30 18:48:38.911835 sonic NOTICE swss#orchagent: :- main: Setting maximum bulk size in bulk mode as 65536
```
Apply Scale config and verify
mssonicbld
added a commit
to Azure/sonic-buildimage-msft
that referenced
this pull request
Oct 3, 2025
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Contributor
Author
@qiluo-msft, handled all comments. Please help sign off.
qiluo-msft
approved these changes
Oct 8, 2025
Collaborator
Cherry-pick PR to msft-202506: Azure/sonic-swss-common.msft#67
prsunny
pushed a commit
to sonic-net/sonic-swss
that referenced
this pull request
Oct 9, 2025
* [ZmqOrch] Optimize memory by popping batch size at a time
What I did
Used a reference instead of an unnecessary copy of a set object
Optimize memory by popping batch size at a time
NOTE: Please merge only after the below two PRs are merged
sonic-net/sonic-swss-common#1084
sonic-net/sonic-sairedis#1660
mssonicbld
added a commit
to mssonicbld/sonic-swss.msft
that referenced
this pull request
Oct 9, 2025
**What I did**
1. Used a reference instead of an unnecessary copy of a set object
2. Optimize memory by popping batch size at a time
**NOTE: Please merge only after the below two PRs are merged**
sonic-net/sonic-swss-common#1084
sonic-net/sonic-sairedis#1660
**Why I did it**
To reduce peak memory usage when applying high-volume dash configuration
**How I verified it**
**Details if related**
mssonicbld
added a commit
to Azure/sonic-swss.msft
that referenced
this pull request
Oct 9, 2025
balanokia
pushed a commit
to balanokia/sonic-swss
that referenced
this pull request
Nov 17, 2025
Collaborator
Cherry-pick PR to 202511: #1126
theasianpianist
pushed a commit
to theasianpianist/sonic-swss
that referenced
this pull request
Feb 4, 2026
…3910)
* [ZmqOrch] Optimize memory by popping batch size at a time
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
baorliu
pushed a commit
to baorliu/sonic-swss
that referenced
this pull request
Feb 23, 2026
…3910)
* [ZmqOrch] Optimize memory by popping batch size at a time
Signed-off-by: Baorong Liu <96146196+baorliu@users.noreply.github.com>
What I did
Add pop batch size support to ZmqConsumerStateTable to reduce peak memory and to speed up CRM counter updates and DASH feedback when applying DASH configuration at scale.
Example:
Suppose a GNMI server pushes X entries to orchagent. With the current logic, ZmqConsumerStateTable moves all X entries into the m_toSync map.
Dashorch then creates X entries in the bulker. However, the max bulk size is limited (currently 1000) and is much smaller than the size of m_toSync in this scale scenario.
So the effective memory during this time is
2 * X * (size per object), that is, one copy in m_toSync plus one copy in the bulker, until all those entries are applied to the ASIC.
With this change, only pop-batch-size entries are popped into m_toSync and added to the bulker at a time. Peak memory utilization is thus cut roughly in half in the DASH scale case.
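The memory comparison can be written out as a rough estimate. Here $s$ (size per object) and $B$ (pop batch size) are symbols introduced only for illustration, and the "after" expression assumes that entries not yet popped are still held once by the consumer, which the description does not state explicitly.

```latex
M_{\text{before}} \approx 2 \cdot X \cdot s
\qquad
M_{\text{after}} \approx X \cdot s + 2 \cdot B \cdot s
\qquad
\frac{M_{\text{after}}}{M_{\text{before}}} \approx \frac{1}{2} + \frac{B}{X}
\;\longrightarrow\; \frac{1}{2} \quad (B \ll X)
```

Under that assumption, the saving approaches one half only when the batch size is small relative to the number of pushed entries.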
Another side effect of this change is that post-processing for each batch of popped items runs immediately in orchagent, so CRM updates and the GNMI feedback loop are not delayed. Without it, post-processing starts only after all entries in m_toSync have been applied to syncd, and the size of m_toSync is not capped in the current design.
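The batching behaviour described here can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `BatchedConsumer`, `DrainStats`, and `drain` are hypothetical names, and entries are reduced to plain strings. It only shows the idea of popping at most a fixed batch size at a time and running post-processing after each batch rather than after the whole backlog.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one received operation (key only, for brevity).
using Entry = std::string;

// Hypothetical simplified consumer: keeps received entries in a queue and
// hands out at most `batchSize` of them per pops() call.
class BatchedConsumer {
public:
    explicit BatchedConsumer(std::size_t batchSize) : m_batchSize(batchSize) {}

    void receive(Entry e) { m_received.push_back(std::move(e)); }

    // Move up to m_batchSize entries into `out`; the rest stay queued.
    void pops(std::vector<Entry>& out) {
        out.clear();
        const std::size_t n = std::min(m_batchSize, m_received.size());
        out.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            out.push_back(std::move(m_received.front()));
            m_received.pop_front();
        }
    }

    bool empty() const { return m_received.empty(); }

private:
    std::size_t m_batchSize;
    std::deque<Entry> m_received;
};

struct DrainStats {
    std::size_t peakInFlight = 0;     // most entries held in toSync at once
    std::size_t postProcessRuns = 0;  // number of per-batch post-processing passes
};

// Drain the consumer batch by batch, counting one post-processing pass per batch.
DrainStats drain(BatchedConsumer& consumer) {
    DrainStats stats;
    std::vector<Entry> toSync;  // plays the role of m_toSync in this sketch
    while (!consumer.empty()) {
        consumer.pops(toSync);
        stats.peakInFlight = std::max(stats.peakInFlight, toSync.size());
        // A real orch would add toSync to the bulker and flush it here.
        ++stats.postProcessRuns;  // post-processing happens right after each batch
    }
    return stats;
}
```

With 10 queued entries and a batch size of 4, `drain` processes batches of 4, 4, and 2, so at most 4 entries are in flight at once and post-processing runs three times instead of once at the very end.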
How I verified
Ran unit tests and applied a DASH config, confirming that everything works.
Before the update:
After the update: