feat: add uvicorn workers and connection backpressure for scalability #358
Draft
Conversation
This addresses the primary bottleneck where Python's asyncio event loop runs on a single thread. With many concurrent SSE connections (e.g., 1000+), every new request competes for the same event loop, so simple operations still succeed while complex ones (like `tools/list`) time out.

Solution 2: Uvicorn Workers

- Add `--workers` CLI argument (default: 1) to run multiple event loops in parallel
- Each worker process handles its own set of connections
- Note: SSE connections are stateful, so sticky sessions at the load balancer are required when using multiple workers

Solution 5: Connection Limits with Backpressure

- Add `--max-connections` CLI argument (default: 1000) per worker
- New `ConnectionMetrics` class tracks active connections thread-safely
- `ConnectionLimitMiddleware` returns HTTP 503 when at capacity
- This prevents degradation for existing connections by rejecting new ones rather than letting the event loop become overloaded
- Health check and info endpoints are excluded from connection tracking

Co-Authored-By: Martin Vasko <Matovidlo2@gmail.com>
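To make Solution 5 concrete, here is a minimal sketch of how the two pieces named above could fit together as pure-ASGI middleware. Only the class names `ConnectionMetrics` and `ConnectionLimitMiddleware`, the 503 behavior, and the excluded endpoints come from this PR; every method name, the excluded paths, and the rest of the interface are assumptions for illustration, not the actual contents of `src/keboola_mcp_server/connections.py`.

```python
import threading


class ConnectionMetrics:
    """Thread-safe counter of active connections (illustrative sketch)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0

    def acquire(self, limit: int) -> bool:
        """Register a new connection; returns False when already at capacity."""
        with self._lock:
            if self._active >= limit:
                return False
            self._active += 1
            return True

    def release(self) -> None:
        with self._lock:
            self._active = max(0, self._active - 1)


class ConnectionLimitMiddleware:
    """Pure-ASGI middleware that rejects new connections with HTTP 503 at capacity."""

    EXCLUDED_PATHS = {"/", "/health-check"}  # assumed: not counted toward the limit

    def __init__(self, app, metrics: ConnectionMetrics, max_connections: int = 1000):
        self.app = app
        self.metrics = metrics
        self.max_connections = max_connections

    async def __call__(self, scope, receive, send):
        # Pass through non-HTTP traffic and the excluded endpoints untracked.
        if scope["type"] != "http" or scope["path"] in self.EXCLUDED_PATHS:
            await self.app(scope, receive, send)
            return
        if not self.metrics.acquire(self.max_connections):
            # Backpressure: shed the new request instead of overloading the loop.
            await send({"type": "http.response.start", "status": 503,
                        "headers": [(b"retry-after", b"5")]})
            await send({"type": "http.response.body", "body": b"server at capacity"})
            return
        try:
            await self.app(scope, receive, send)
        finally:
            self.metrics.release()
```

The design choice to count connections rather than queue them is what keeps existing SSE streams responsive: a 503 is cheap to produce, while an admitted-but-starved connection would degrade every other request on the same event loop.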
Description
Link to Devin run: https://app.devin.ai/sessions/58bff70f3347474283d6dc04fa2d4783
Requested by: Martin Vasko (@Matovidlo)
Change Type
Summary
This PR addresses the primary scalability bottleneck where Python's asyncio event loop runs on a single thread. With many concurrent SSE connections (e.g., 1000+), every new request competes for the same event loop, so simple operations still succeed while complex ones (like `tools/list`) time out.

Changes:
- Uvicorn Workers (`--workers` CLI argument, default: 1)
- Connection Limits with Backpressure (`--max-connections` CLI argument, default: 1000)
  - `ConnectionMetrics` class tracks active connections thread-safely
  - `ConnectionLimitMiddleware` returns HTTP 503 when at capacity
  - Health check (`/health-check`) and info (`/`) endpoints are excluded from tracking

Key files:

- `src/keboola_mcp_server/connections.py` - new module with connection tracking and middleware
- `src/keboola_mcp_server/cli.py` - CLI arguments and middleware integration

Human Review Checklist

- Verify the `workers` parameter works correctly with `uvicorn.Server.serve()` - typically multiple workers are spawned via the uvicorn CLI, not the programmatic API. This may require using `uvicorn.run()` instead or a different approach.
- Review `ConnectionMetrics` using `threading.Lock` in an asyncio context
- Confirm `max_connections=1000` is appropriate for production

Testing
- Tested on HTTP-based transports (`SSE` and `Streamable-HTTP`)

Optional testing
- `canary-orion` MCP (`SSE` and `Streamable-HTTP`)

Checklist
Release Notes
Justification, description
Adds scalability improvements for high-concurrency scenarios with new `--workers` and `--max-connections` CLI options for HTTP-based transports.

Plans for Customer Communication
N/A
Impact Analysis
Low risk - new optional CLI arguments with sensible defaults (workers=1, max-connections=1000). Existing behavior unchanged unless explicitly configured.
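The two optional arguments and their defaults can be illustrated with an argparse sketch. The flag names and defaults match the PR; the program name, the app import path, and the `factory` wiring are assumptions. As the review checklist notes, uvicorn spawns multiple worker processes only when given an app import string via `uvicorn.run()`, not through the programmatic `Server.serve()` API, which this sketch reflects.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical wiring of the two new flags; defaults match the PR."""
    parser = argparse.ArgumentParser(prog="keboola-mcp-server")
    parser.add_argument(
        "--workers", type=int, default=1,
        help="number of uvicorn worker processes (each runs its own event loop)",
    )
    parser.add_argument(
        "--max-connections", type=int, default=1000,
        help="per-worker cap on concurrent connections; excess requests get HTTP 503",
    )
    return parser


def main(argv: "list[str] | None" = None) -> None:
    args = build_parser().parse_args(argv)
    # uvicorn forks worker subprocesses only when handed an app *import string*,
    # so multi-worker mode needs uvicorn.run() rather than Server.serve().
    import uvicorn  # imported lazily so the parser can be exercised standalone

    uvicorn.run(
        "keboola_mcp_server.app:create_app",  # hypothetical import path
        factory=True,
        workers=args.workers,
    )
```

With `workers=1` (the default) behavior is unchanged, which is what keeps the rollout low risk; operators opt in to multiple processes explicitly and must then provide sticky sessions for stateful SSE connections.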
Deployment Plan
N/A
Rollback Plan
N/A
Post-Release Support Plan
N/A