Skip to content

Fix health check for SSD vendors: add a parser for ATP, and add a generic health ID for other brands#595

Merged
StormLiangMS merged 3 commits intosonic-net:masterfrom
markx-arista:fix-health-check-for-atp-and-generic-ssds
Sep 18, 2025
Merged

Fix health check for SSD vendors: add a parser for ATP, and add a generic health ID for other brands#595
StormLiangMS merged 3 commits intosonic-net:masterfrom
markx-arista:fix-health-check-for-atp-and-generic-ssds

Conversation

@markx-arista
Copy link
Contributor

@markx-arista markx-arista commented Sep 9, 2025

Description

Fix health check for SSD vendors: add a parser for ATP, and add a generic health ID for other brands.
Each vendor stores health information in different SMART attributes.
ATP stores it in attribute ID 248, we add a parser for it.
We also have SSDs use Attribute ID 231 and it is commonly used, so add it in the generic parser.
Skip obtaining vendor SSD info for ATP and Virtium NVMe SSD because they are handle by parse_generic_ssd_info and parse_vendor_ssd_info will overwrite data with N/A.
Add unit test cases for ATP SATA/NVMe SSD.

Motivation and Context

show platform ssdhealth shows N/A health for some qualified SSDs.

Back port request

  • 202412
  • 202505

How Has This Been Tested?

We have tested the code change on DUTs with different SSDs including all the qualified SSDs that show N/A in health and also on the ones that worked fine before.

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@StormLiangMS
Copy link

hi @judyjoseph could you help to review?

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Skip obtaining vendor SSD info for ATP and Virtium NVMe SSD
Add unit test case for ATP NVMe SSD
@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Contributor

@judyjoseph judyjoseph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@markx-arista
Copy link
Contributor Author

Expect conflict on 202412, so a separate PR for it is created: Azure/sonic-platform-common.msft#106

@StormLiangMS StormLiangMS merged commit 2bcbe3a into sonic-net:master Sep 18, 2025
5 checks passed
@StormLiangMS
Copy link

Hi @markx-arista
I noticed this failure only on 7050cx3, why we need to cherrypick to 202412?

@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202505: #598

@markx-arista
Copy link
Contributor Author

Hi @markx-arista I noticed this failure only on 7050cx3, why we need to cherrypick to 202412?

Hi, there were failures on our 202412, but we just found out those SSDs are not qualified. So PR to 202412 is not necessary, I will close it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants