[doc]: Stream Telemetry HLD #1795
base: master
bc0e247 to c14e643
### Architecture Design

This section covers the changes that are required in the SONiC architecture. In general, it is expected that the current architecture is not changed.
This section should explain how the new feature/enhancement (module/sub-module) fits in the existing architecture.
Same here. These template "guidance" words should be removed; they are very confusing to read.
Sorry, yes, there are many template "guidance" words that I haven't removed yet.
#### Modules ####

##### Netlink Module #####
I know the numbered prefixes are not in the template, but the doc would be much more readable with them. We should consider adding them here.
Hi @zhangyanzhao, do you have any suggestions or concerns about adding a number prefix to each title in the template?
Pin CPU?

![netlink_dma_channel](netlink_dma_channel.drawio.svg)
Are we still using the chunk size concept in our API?
Yes, we are.
Or leverage the existing solution, time-based reporting?
Signed-off-by: Ze Gan <zegan@microsoft.com>
290f7f9 to 0604b27
- For high-frequency counters, the native IPFIX timestamp unit of seconds is insufficient. Therefore, we introduce an additional element, `observationTimeNanoseconds`, for each record to meet our requirements.
- The enterprise bit is always set to 1 for stats records.
- The element ID of IPFIX is derived from the object index. For example, for `Ethernet5`, the element ID will be `0x5 | 0x8000 = 0x8005`.
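
The element ID rule above can be sketched in a few lines of Python. This is only an illustration of the bit arithmetic in the `Ethernet5` example; the names `ENTERPRISE_BIT` and `element_id_for` are hypothetical and do not come from the HLD, and the 15-bit range check is an assumption based on the 16-bit IPFIX element ID field.

```python
# Hypothetical helper illustrating the element ID derivation described above.
# Names are illustrative only; they are not part of the HLD or any SONiC API.

ENTERPRISE_BIT = 0x8000  # top bit of the 16-bit element ID field


def element_id_for(object_index: int) -> int:
    """Return the element ID for an object index with the enterprise bit set."""
    # Assumption: the index must fit in the remaining 15 bits.
    if not 0 <= object_index < ENTERPRISE_BIT:
        raise ValueError("object index must fit in 15 bits")
    return object_index | ENTERPRISE_BIT


# Object index 5 (the Ethernet5 example) maps to 0x8005.
ethernet5_id = element_id_for(0x5)
print(hex(ethernet5_id))
```

Rejecting indexes that do not fit in 15 bits keeps an oversized index from silently colliding with the enterprise bit.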
Having a variable element ID per SAI object will blow up the number of templates in the report (separate template ID per port, queue etc.).
My assumption is that all stats (per port, queue, etc.) in one profile will be encoded into ONE template. I will generate a unique template ID for each profile.
### Data format

We will use IPFIX as the report format, with all numbers in the IPFIX message in network byte order (big-endian).
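
As a rough illustration of what network byte order means for the message, the sketch below packs the standard 16-byte IPFIX message header (version, length, export time, sequence number, observation domain ID) with Python's `struct` module. The function name `pack_ipfix_header` and the sample field values are assumptions for the example, not values from this design.

```python
import struct

# "!" selects network byte order (big-endian) with no padding.
# Fields: version (u16), length (u16), export time (u32),
# sequence number (u32), observation domain ID (u32).
HEADER_FORMAT = "!HHIII"
IPFIX_VERSION = 10  # version number carried in every IPFIX message header


def pack_ipfix_header(length, export_time, sequence_number, observation_domain_id):
    """Pack an IPFIX message header in network byte order (illustrative helper)."""
    return struct.pack(
        HEADER_FORMAT,
        IPFIX_VERSION,
        length,
        export_time,
        sequence_number,
        observation_domain_id,
    )


# Sample values chosen only to make the byte order visible.
header = pack_ipfix_header(
    length=16, export_time=0x01020304, sequence_number=1, observation_domain_id=0
)
print(header.hex())
```

The most significant byte of each field comes first, so the export time `0x01020304` appears on the wire as the bytes `01 02 03 04`.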
You cannot make the assumption that data will be big- or little-endian. If it is collected by the ASIC and DMA'd to a ring buffer, then the kernel driver would need to correct the endianness of each data set. We need to have a bit in the element ID or enterprise number that would indicate endianness.
Are you worried that endian conversion will consume too many CPU clocks?
```
STREAM_TELEMETRY_GROUP|{{profile_name}}|{{group_name}}
"object_names": {{list of object name}}
"object_counters": {{list of stats of object}}
```
How do you plan to handle an update of the counter list or object list from the report's point of view? Is it going to be a new template ID?
Yes, I assume we will use a new template ID.
We treat any configuration update as a new profile. The previous session will be interrupted. The SAI report object will get a new `SAI_TAM_REPORT_ATTR_REPORT_IPFIX_TEMPLATE_ID`.
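
One way to picture "any configuration update is a new profile" is an allocator that hands out a fresh template ID on every (re)configuration. This is a minimal sketch under assumptions: the class `TemplateIdAllocator` and its methods are hypothetical, it does not call any SAI API, and it never reuses IDs. The only fixed fact it relies on is that IPFIX template IDs lie in the range 256 to 65535.

```python
class TemplateIdAllocator:
    """Hypothetical allocator: each profile (re)configuration gets a new template ID."""

    FIRST_ID = 256    # IPFIX template IDs start at 256; lower set IDs are reserved
    LAST_ID = 65535   # template ID is a 16-bit field

    def __init__(self):
        self._next_id = self.FIRST_ID
        self._active = {}  # profile name -> template ID currently in use

    def configure(self, profile_name: str) -> int:
        """Register a new or updated profile and return its fresh template ID."""
        if self._next_id > self.LAST_ID:
            raise RuntimeError("template ID space exhausted")
        template_id = self._next_id
        self._next_id += 1
        # Replacing the entry models the previous session being interrupted.
        self._active[profile_name] = template_id
        return template_id

    def active_template(self, profile_name: str) -> int:
        """Return the template ID currently associated with the profile."""
        return self._active[profile_name]


allocator = TemplateIdAllocator()
first = allocator.configure("profile_a")   # initial configuration
second = allocator.configure("profile_a")  # counter or object list updated
print(first, second)
```

Because the old ID is never handed out again in this sketch, a collector cannot confuse records encoded with the old template with records from the updated profile. A real implementation would need a policy for ID exhaustion and reuse.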
Stream Telemetry HLD initial version