[countersyncd]: Modify the exit behavior of the main function#4197
prsunny merged 2 commits into sonic-net:master
Signed-off-by: Ze Gan <ganze718@gmail.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Pull request overview
This pull request modifies the countersyncd daemon's exit behavior to terminate the entire process as soon as any actor completes, rather than waiting for all actors to finish. This addresses a specific issue where the OpenTelemetry actor could fail (e.g., due to connection failures to the OTEL collector) but the daemon would continue running without exiting.
Changes:
- Replaced the "wait-for-all" actor completion pattern with a `tokio::select!` block that exits on first actor termination
- Added a new `exit_on_join` helper function to handle actor exit with appropriate exit codes
- Modified OtelActor error propagation to properly return errors rather than using early `?` returns
- Added .cargo/config.toml with rustflags to treat warnings as errors
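The first-exit behavior described above can be illustrated with a standard-library analog. This is only a sketch: the PR itself uses `tokio::select!` over the actors' join handles, whereas the version below swaps in plain threads and an `mpsc` channel so that it runs without external crates. The actor names, the exit code values, and the function `exit_code_for` (a stand-in for the PR's `exit_on_join`, whose real signature is not shown in this thread) are all assumptions made for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical exit codes; the daemon's real values are not shown in this thread.
const EXIT_OK: i32 = 0;
const EXIT_ACTOR_FAILED: i32 = 1;

/// Outcome an actor reports when it stops.
type ActorResult = Result<(), String>;

/// Maps the first actor outcome to a process exit code
/// (hypothetical stand-in for the PR's `exit_on_join`).
fn exit_code_for(name: &str, result: &ActorResult) -> i32 {
    match result {
        Ok(()) => {
            eprintln!("actor {name} finished; shutting down");
            EXIT_OK
        }
        Err(e) => {
            eprintln!("actor {name} failed: {e}; shutting down");
            EXIT_ACTOR_FAILED
        }
    }
}

/// Spawns two actors and blocks only until the FIRST one reports back.
fn wait_for_first_exit() -> (String, ActorResult) {
    let (tx, rx) = mpsc::channel::<(String, ActorResult)>();

    // A long-running actor that would keep a wait-for-all design blocked.
    let tx_slow = tx.clone();
    thread::spawn(move || {
        thread::sleep(Duration::from_secs(2));
        let _ = tx_slow.send(("stats".to_string(), Ok(())));
    });

    // An actor that fails quickly, e.g. because its collector is unreachable.
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        let _ = tx.send(("otel".to_string(), Err("collector unreachable".to_string())));
    });

    // Return as soon as any actor has reported, without waiting for the rest.
    rx.recv().expect("at least one actor must report")
}

fn main() {
    let (name, result) = wait_for_first_exit();
    // Exiting the process also stops the actors that are still running.
    std::process::exit(exit_code_for(&name, &result));
}
```

With a wait-for-all design, `main` would stay blocked on the slow actor even after the failing one had stopped; here the failure alone decides the exit code.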
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| crates/countersyncd/src/main.rs | Replaced synchronous wait-for-all actors with tokio::select! for first-exit semantics; added exit_on_join helper; updated exit code imports |
| crates/countersyncd/src/actor/otel.rs | Modified error handling to capture and propagate errors through run_error variable instead of early returns |
| .cargo/config.toml | Added build configuration to treat compiler warnings as errors |
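The `run_error` change listed for otel.rs can be sketched as follows. The `Exporter` type and its methods are hypothetical stand-ins, not the actor's real API, and the assumption that the motivation is to let shutdown code run before the error is returned is mine; the thread only states that errors are captured in a `run_error` variable instead of being returned early.

```rust
/// Hypothetical stand-in for whatever the OTel actor exports through.
struct Exporter {
    fail_on: Option<u32>,
    shut_down: bool,
}

impl Exporter {
    /// Fails on the configured batch number, succeeds otherwise.
    fn export(&mut self, batch: u32) -> Result<(), String> {
        if self.fail_on == Some(batch) {
            Err(format!("export of batch {batch} failed"))
        } else {
            Ok(())
        }
    }

    fn shutdown(&mut self) {
        self.shut_down = true;
    }
}

/// Runs until the input ends or an export fails. The error is stored in
/// `run_error` rather than returned with `?`, so `shutdown` always runs
/// before the error is handed back to the caller.
fn run(exporter: &mut Exporter, batches: &[u32]) -> Result<(), String> {
    let mut run_error: Option<String> = None;

    for &batch in batches {
        if let Err(e) = exporter.export(batch) {
            run_error = Some(e);
            break; // leave the loop, but do not return yet
        }
    }

    exporter.shutdown();

    match run_error {
        Some(e) => Err(e),
        None => Ok(()),
    }
}

fn main() {
    let mut exporter = Exporter { fail_on: Some(2), shut_down: false };
    let outcome = run(&mut exporter, &[1, 2, 3]);
    println!("outcome: {outcome:?}, shut down: {}", exporter.shut_down);
}
```

Because `run` still returns `Err`, a caller that exits on the first finished actor can turn the failure into a non-zero exit code.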
Signed-off-by: Ze Gan <ganze718@gmail.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Hi @prsunny, could you please help to merge this PR?
displayName: "Compile sonic swss"
- script: |
cargo test
RUSTFLAGS=-Dwarnings cargo test
@Pterosaur could you set this in Cargo.toml instead, so that it applies always?
https://doc.rust-lang.org/cargo/reference/manifest.html#the-lints-section for reference.
I did that before, but here is Copilot's review suggestion:
Setting rustflags to treat warnings as errors (-Dwarnings) is a good practice for CI/CD pipelines to ensure code quality. However, this can be problematic for local development as it makes the build process more strict. Different Rust versions may introduce new warnings, causing builds to fail unexpectedly on local machines or with newer toolchain versions.
Consider moving this configuration to CI-specific settings instead of applying it globally via .cargo/config.toml. This could be done through environment variables (RUSTFLAGS=-Dwarnings) in your CI configuration, or by using a separate profile. This approach allows developers to work with warnings locally while still enforcing zero-warning policy in CI.
I think that's reasonable. For developers, enforcing the warning restriction locally may be inconvenient.
Fair point; it can also help with moving to newer versions of the Rust compiler if it is constrained to this CI pipeline.
Cherry-pick PR to msft-202412: Azure/sonic-swss.msft#206
Retrigger mssonicbld
Cherry-pick PR to 202511: #4225
…net#4197) What I did The main function exits as soon as any actor terminates. Why I did it Otel actor may terminate due to failed to connect the otel collector. In the previous behavior, the main function will not exit because it's waiting for all actor terminating. Signed-off-by: Baorong Liu <96146196+baorliu@users.noreply.github.com>
@Pterosaur cherry pick PR didn't pass PR checker. Please check!!!
What I did
The main function exits as soon as any actor terminates.
Why I did it
The Otel actor may terminate if it fails to connect to the OTEL collector. Previously, the main function would not exit in that case because it waited for all actors to terminate.
How I verified it
Check it locally:
Details if related