@Frostman
Member

@Frostman Frostman commented Oct 1, 2025

It looks like the agent keeps trying to apply breakouts on some switches over and over because gNMI isn't reporting the Completed status for some breakouts, for reasons that are still unclear. The CLI still shows those breakouts as completed, so it seems safe to treat a missing status as completed as well.

Related githedgehog/internal#239

Signed-off-by: Sergei Lukianov <me@slukjanov.name>
@Frostman Frostman requested a review from Copilot October 1, 2025 05:07
@Frostman Frostman requested a review from a team as a code owner October 1, 2025 05:07

Copilot AI left a comment

Pull Request Overview

This PR modifies the agent's breakout handling logic to treat missing status as "Completed" to prevent repeated breakout attempts on switches where gNMI doesn't report status correctly.

Key Changes

  • Modified the status validation logic to treat a nil status as Completed
  • Added explanatory comment about the new behavior
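
A minimal Go sketch of the behavior summarized above, for illustration only. The actual diff is not shown in this conversation, so the `BreakoutState` type, the `breakoutCompleted` helper, the `"InProgress"` status value, and the `"4x25G"` mode are all hypothetical names chosen for the example.

```go
package main

import "fmt"

// BreakoutState is a hypothetical stand-in for the per-port breakout state
// read back over gNMI; the real agent types are not shown in this thread.
type BreakoutState struct {
	Mode   string
	Status *string // nil when gNMI reports no status at all
}

// statusCompleted is the status value named in the PR description.
const statusCompleted = "Completed"

// breakoutCompleted reports whether a breakout should be treated as applied.
// A missing (nil) status counts as completed, because the CLI still shows
// such breakouts as completed even when gNMI reports nothing.
func breakoutCompleted(s BreakoutState) bool {
	if s.Status == nil {
		return true
	}
	return *s.Status == statusCompleted
}

func main() {
	completed := statusCompleted
	inProgress := "InProgress" // hypothetical non-final status value

	states := []BreakoutState{
		{Mode: "4x25G"},                      // no status reported
		{Mode: "4x25G", Status: &completed},  // explicit Completed
		{Mode: "4x25G", Status: &inProgress}, // still being applied
	}
	for _, s := range states {
		fmt.Println(breakoutCompleted(s))
	}
}
```

With a check of this shape, only an explicitly reported non-Completed status would cause the agent to keep waiting; a breakout with no status would no longer trigger another apply attempt.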

@github-actions

github-actions bot commented Oct 1, 2025

🚀 Temp artifacts published: v0-888c70a44 🚀

@Frostman Frostman requested a review from pau-hedgehog October 1, 2025 16:07
@Frostman
Member Author

Frostman commented Oct 1, 2025

@pau-hedgehog could you please test it on both ds5k and ds3k by changing breakouts, to make sure we aren't introducing any new issues with ports?

For example, it may cause a different problem: if a breakout isn't fully completed but we already try to configure the resulting ports, we may get an error that some port doesn't exist.

@Frostman Frostman changed the title from "chore(agent): conside no status as completed for breakouts" to "chore(agent): consider no status as completed for breakouts" Oct 2, 2025
@pau-hedgehog pau-hedgehog added the ci:+hlab Enable hybrid VLAB tests label Oct 6, 2025
@pau-hedgehog
Contributor

pau-hedgehog commented Oct 6, 2025

When running the RoCE Release Tests I'm hitting this on ds3000-06 in env-5:

time=2025-10-06T07:18:11.982Z level=DEBUG msg=Action idx=4 weight=7 summary="Delete Port Breakout 1/1" command=delete path="/components/component[name=1/1]/port/breakout-mode/groups/group[index=1]"
time=2025-10-06T07:18:12.535Z level=ERROR msg=Failed err="failed to run agent: failed to process agent config from file: failed to apply actions: GNMI set request failed: gnmi set request failed: rpc error: code = InvalidArgument desc = Port breakout is not allowed when switch is in lossless mode. Remove roce or lossless buffer configuration to breakout ports."

@github-actions

github-actions bot commented Oct 6, 2025

🚀 Temp artifacts published: v0-888c70a44 🚀

@Frostman
Member Author

Frostman commented Oct 9, 2025

@pau-hedgehog that doesn't look related... Could you please create a separate bug for attempting to set breakouts when lossless mode is enabled?

@pau-hedgehog
Contributor

pau-hedgehog commented Oct 10, 2025

> @pau-hedgehog that doesn't look related... Could you please create a separate bug for attempting to set breakouts when lossless mode is enabled?

Opened: #1129

@pau-hedgehog
Contributor

Going to test this with:

kubectl -n fab patch fab/default --type=merge -p '{"spec":{"overrides":{"versions":{"fabric":{"api":"v0-888c70a44","agent":"v0-888c70a44","boot":"v0-888c70a44","controller":"v0-888c70a44","ctl":"v0-888c70a44","dhcpd":"v0-888c70a44"}}}}}'
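
The merge patch in the command above pins every fabric component to the same temp artifact version. A rough Go sketch of how such a payload could be generated instead of typed by hand follows; `buildVersionsPatch` is a hypothetical helper, and the component list is simply copied from the command above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildVersionsPatch is a hypothetical helper that returns a JSON merge patch
// pinning every listed fabric component to the same version, mirroring the
// structure of the payload in the kubectl command above.
func buildVersionsPatch(version string, components []string) (string, error) {
	fabric := make(map[string]string, len(components))
	for _, c := range components {
		fabric[c] = version
	}
	patch := map[string]any{
		"spec": map[string]any{
			"overrides": map[string]any{
				"versions": map[string]any{
					"fabric": fabric,
				},
			},
		},
	}
	b, err := json.Marshal(patch)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Component names taken from the command in this comment.
	components := []string{"api", "agent", "boot", "controller", "ctl", "dhcpd"}
	patch, err := buildVersionsPatch("v0-888c70a44", components)
	if err != nil {
		panic(err)
	}
	fmt.Printf("kubectl -n fab patch fab/default --type=merge -p '%s'\n", patch)
}
```

Note that `encoding/json` emits map keys in sorted order, so the key order in the generated payload may differ from the hand-written one; for a JSON merge patch the order does not matter.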

@Frostman Frostman marked this pull request as draft October 14, 2025 15:28
Labels

ci:+hlab Enable hybrid VLAB tests

3 participants