-
Notifications
You must be signed in to change notification settings - Fork 436
feat(service): add close() method for graceful connection shutdown #588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(service): add close() method for graceful connection shutdown #588
Conversation
This PR primarily fixes modelcontextprotocol#572 by enabling graceful shutdown without consuming self. While implementing this, I noticed delete_session() is spawned as a background task, which means close() may return before HTTP session cleanup completes. Since this is part of the same shutdown lifecycle and can cause resource leaks/races, I'm including a small, localized fix to ensure cleanup is completed before close() returns. If maintainers prefer, I can split the cleanup timing change into a follow-up PR. Changes: - Add close(&mut self) for graceful shutdown without consuming - Add close_with_timeout() for bounded shutdown operations - Add is_closed() to check connection state - Move HTTP delete_session from background spawn to inline cleanup - Add 5-second timeout on session cleanup to prevent indefinite hangs - Add Drop impl with debug log if dropped without explicit close Fixes modelcontextprotocol#572
cefbbd7 to
89db1bf
Compare
|
This seems to be using "cancel" to implement close, should these be two separate things? ie. closing the underlying transport, instead of just always cancelling? |
|
Good point; conceptually, 'close' and 'cancel' represent different intents. close() is meant to be a graceful, deterministic shutdown that ensures the underlying transport is closed and remote session cleanup completes (with timeouts so it can’t hang). cancel() is more of an abort/interrupt. In this PR, cancel() delegates to close() mainly to make the existing API safe by default and ensure authenticated sessions aren’t leaked (the core issue in #572). I’m happy to split the semantics more explicitly (e.g., graceful vs abort paths) if maintainers prefer keeping those behaviors distinct. |
|
@scutuatua-crypto do you have any opinions on the unanswered questions raised here before the PR gets merged? |
|
Does this submission seem to address the issue of graceless exit due to the server not closing invalid connections when the client aborts unexpectedly? |
|
The solution given here technically does address the issue in a workable way yes, maybe it should be left to later / a different issue to nail down the details? |
|
I'm fine with the change coupling the two concepts. Thanks |
Summary
This PR primarily fixes #572 by enabling graceful shutdown without consuming self. While implementing this, I noticed
delete_session()is spawned as a background task, which meansclose()may return before HTTP session cleanup completes. Since this is part of the same shutdown lifecycle and can cause resource leaks/races, I'm including a small, localized fix to ensure cleanup is completed beforeclose()returns. If maintainers prefer, I can split the cleanup timing change into a follow-up PR.Changes
Service Layer (
service.rs)close(&mut self)- Gracefully shuts down the connection and waits for cleanup to complete without consuming theRunningService. This was the main ask in Client connection is not closed on drop, and no way to gracefully stop #572.close_with_timeout(&mut self, Duration)- Same asclose()but with a bounded wait time. Useful for apps that need deterministic shutdown.is_closed(&self)- Check if the service has already been closed or cancelled.Dropimpl - Logs a debug message if dropped without explicitclose()call. TheDropGuardstill handles async cleanup, but this helps developers catch missing cleanup calls.HTTP Transport (
streamable_http_client.rs)delete_session()from a fire-and-forgettokio::spawnto inline cleanup at the end of the worker'srun()methodclose()from hanging indefinitely if the server is unresponsiveclose()returns, preventing orphaned sessions on authenticated MCP serversWhy both changes together?
The original issue mentions that "for authenticated MCP connections, the sessions on the remote would still be active." The service-layer
close()method alone doesn't fully solve this because the HTTP transport's session deletion was spawned as a background task. By moving it inline with a timeout,close()now guarantees bounded, deterministic cleanup.Design Decisions
Q: Can
close()hang forever?No. The HTTP session cleanup has a 5-second timeout. If the server doesn't respond, we log a warning and continue.
close_with_timeout()provides an additional layer of control at the service level.Q: What happens if cancelled mid-cleanup?
The cleanup runs after the main loop exits, so the cancellation token has already fired. The timeout ensures we don't block indefinitely regardless of server behavior.
Manual Verification
All tests pass, clippy is clean.
Test Plan
test_close_method- Verifiesclose()works and can be called twice safelytest_close_with_timeout- Verifies timeout-bounded shutdowntest_cancel_method- Confirms existingcancel()API still works (backward compat)test_drop_without_close- Verifies drop behavior doesn't panic and async cleanup still happensBackward Compatibility
cancel()method still works (now delegates toclose()internally)waiting()method still worksclose()still triggers cleanup viaDropGuard- just with a debug log nowFixes #572