Read SFP temperature from TRANSCEIVER_DOM_TEMPERATURE table #60
judyjoseph merged 1 commit into Azure:202412, Feb 25, 2026
The xcvrd daemon reads SFP DOM sensor data from hardware and populates Redis tables in STATE_DB. This change modifies thermalctld to read SFP temperature data from these Redis tables instead of making direct platform API calls to the hardware.

Temperature reading:
- First tries TRANSCEIVER_DOM_TEMPERATURE table
- Falls back to TRANSCEIVER_DOM_SENSOR table if not present

Threshold reading:
- First tries TRANSCEIVER_DOM_THRESHOLD table
- Falls back to TRANSCEIVER_DOM_SENSOR table if not present

Port mapping:
- Uses SfpUtilHelper.get_physical_to_logical() API to map SFP index to logical port name for Redis table lookup

Benefits:
- Avoids duplicate hardware access (xcvrd already reads this data)
- Reduces I2C bus contention
- Uses cached data from xcvrd which is already available

Signed-off-by: Vasundhara Volam <vvolam@microsoft.com>
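The port-mapping step in the commit message can be illustrated with a minimal sketch. It does not import the real SfpUtilHelper; instead it assumes a plain dictionary from physical SFP index to a list of logical port names, which is roughly the shape of data that get_physical_to_logical() is used to look up. The function names, the choice of the first logical port when a module maps to several, and the example indices are assumptions of this sketch, not taken from the PR.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the mapping held by SfpUtilHelper:
# physical SFP index -> logical port names served by that module.
PhysicalToLogical = Dict[int, List[str]]


def logical_port_for_sfp(mapping: PhysicalToLogical, sfp_index: int) -> Optional[str]:
    """Return the first logical port name for a physical SFP index, or None.

    Picking the first name is an assumption of this sketch; a breakout
    module can map to several logical ports.
    """
    names = mapping.get(sfp_index)
    return names[0] if names else None


def dom_temperature_key(port: str) -> str:
    """Build the STATE_DB key used for the temperature lookup."""
    return f"TRANSCEIVER_DOM_TEMPERATURE|{port}"


if __name__ == "__main__":
    example = {1: ["Ethernet0"], 2: ["Ethernet8", "Ethernet10"]}
    port = logical_port_for_sfp(example, 2)
    print(dom_temperature_key(port))
```

An index with no mapping yields None, so the caller can skip the Redis lookup for that module rather than query a malformed key.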
Cherry-pick: sonic-net/sonic-platform-daemons#747
Description
This change modifies thermalctld to read SFP temperature and threshold data from Redis tables instead of making direct platform API calls to the hardware.
Changes:
- Read SFP temperature from the TRANSCEIVER_DOM_TEMPERATURE table first, falling back to the TRANSCEIVER_DOM_SENSOR table if not present
- Read thresholds from the TRANSCEIVER_DOM_THRESHOLD table first, falling back to the TRANSCEIVER_DOM_SENSOR table if not present
- Use the SfpUtilHelper.get_physical_to_logical() API to map SFP physical index to logical port name for Redis table lookup
- Add the _init_sfp_util_helper() method to initialize port mappings
- Add the _get_sfp_temperature_from_db() method to read temperature from Redis
- Add the _refresh_sfp_temperature_status() method to update SFP thermal status from Redis data
Motivation and Context
The xcvrd daemon already reads SFP DOM sensor data from hardware and populates Redis tables in STATE_DB. Having thermalctld also read directly from hardware causes duplicate hardware access and additional I2C bus contention.
This change makes thermalctld consume the cached data from xcvrd instead of polling hardware directly, reducing I2C traffic and ensuring consistent temperature data across SONiC components.
Related HLD: https://github.com/sonic-net/SONiC/blob/master/doc/nvidia-thermal-algorithm/improve-sonic-thermal-algo.md
How Has This Been Tested?
Tested on Nvidia SN5640 platform with 66 SFP modules:
Verified temperature values match Redis source:
Verified thresholds are read from TRANSCEIVER_DOM_THRESHOLD table:
Additional Information (Optional)
The implementation follows the HLD design which specifies:
Temperature: Read from STATE_DB::TRANSCEIVER_DOM_TEMPERATURE|Ethernet*.temperature
Thresholds: Read from STATE_DB::TRANSCEIVER_DOM_THRESHOLD|Ethernet*.temphighwarning/temphighalarm
Fallback to TRANSCEIVER_DOM_SENSOR table is provided for backward compatibility with platforms that don't have the new tables populated yet.
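The lookup order described above can be sketched as follows. This is an illustration only, not the code from the PR: it models STATE_DB as a plain dictionary keyed by "TABLE|port" instead of using the real Redis bindings, and the helper names are hypothetical. Treating a non-numeric value such as "N/A" as missing, so that the fallback table is tried next, is also an assumption of the sketch.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical stand-in for STATE_DB: "TABLE|port" -> {field: value}.
StateDb = Mapping[str, Mapping[str, str]]

TEMPERATURE_TABLES = ("TRANSCEIVER_DOM_TEMPERATURE", "TRANSCEIVER_DOM_SENSOR")
THRESHOLD_TABLES = ("TRANSCEIVER_DOM_THRESHOLD", "TRANSCEIVER_DOM_SENSOR")


def _read_float(db: StateDb, tables: Sequence[str], port: str, field: str) -> Optional[float]:
    """Return the first numeric value of `field` found in `tables`, in order."""
    for table in tables:
        entry = db.get(f"{table}|{port}")
        if not entry or field not in entry:
            continue  # table or field not populated: try the next table
        try:
            return float(entry[field])
        except ValueError:
            continue  # non-numeric value (e.g. "N/A"): treated as missing here
    return None


def read_temperature(db: StateDb, port: str) -> Optional[float]:
    """New table first, then the TRANSCEIVER_DOM_SENSOR fallback."""
    return _read_float(db, TEMPERATURE_TABLES, port, "temperature")


def read_threshold(db: StateDb, port: str, field: str) -> Optional[float]:
    """`field` is a threshold name such as temphighwarning or temphighalarm."""
    return _read_float(db, THRESHOLD_TABLES, port, field)


if __name__ == "__main__":
    db = {
        "TRANSCEIVER_DOM_TEMPERATURE|Ethernet0": {"temperature": "53.809"},
        "TRANSCEIVER_DOM_SENSOR|Ethernet8": {"temperature": "59.086", "temphighalarm": "80.0"},
    }
    print(read_temperature(db, "Ethernet0"))
    print(read_temperature(db, "Ethernet8"))
```

Returning None when neither table has a usable value leaves it to the caller to decide how to report the sensor.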
Command output after the changes:
$ show plat temp
Sensor                  Temperature    High TH    Low TH    Crit High TH    Crit Low TH    Warning    Timestamp
----------------------  -------------  ---------  --------  --------------  -------------  ---------  -----------------
ASIC                    80.0           105        N/A       120             N/A            False      20260211 19:13:38
Ambient Fan Side Temp   41.937         N/A        N/A       N/A             N/A            False      20260211 19:13:38
Ambient Port Side Temp  42.187         N/A        N/A       N/A             N/A            False      20260211 19:13:38
CPU Pack Temp           43.25          95.0       N/A       100.0           N/A            False      20260211 19:13:38
PSU-1 Temp              N/A            N/A        N/A       N/A             N/A            False      20260211 19:13:38
PSU-2 Temp              42.5           63.0       N/A       N/A             N/A            False      20260211 19:13:38
PSU-3 Temp              N/A            N/A        N/A       N/A             N/A            False      20260211 19:13:38
PSU-4 Temp              41.0           63.0       N/A       N/A             N/A            False      20260211 19:13:38
SODIMM 2 Temp           43.25          85.0       N/A       95.0            N/A            False      20260211 19:13:38
xSFP module 1 Temp      53.809         75.0       -5.0      80.0            -10.0          False      20260211 19:13:38
xSFP module 2 Temp      59.086         75.0       -5.0      80.0            -10.0          False      20260211 19:13:38
xSFP module 3 Temp      56.512         75.0       -5.0      80.0            -10.0          False      20260211 19:13:38
xSFP module 4 Temp      53.934         75.0       -5.0      80.0            -10.0          False      20260211 19:13:38
xSFP module 5 Temp      54.75          75.0       -5.0      80.0            -10.0          False      20260211 19:13:38
xSFP module 6 Temp      64.375         75.0       -5.0      80.0            -10.0          False      20260211 19:13:38
xSFP module 7 Temp      63.812         75.0       -5.0      80.0            -10.0          False      20260211 19:13:38
xSFP module 8 Temp      55.645         75.0       -5.0      80.0            -10.0          False      20260211 19:13:38
xSFP module 9 Temp      56.277         75.0       -5.0      80.0            -10.0          False      20260211 19:13:38
xSFP module 10 Temp     64.066         75.0       -5.0      80.0            -10.0          False      20260211 19:13:38