
feat: add uvicorn workers and connection backpressure for scalability#358

Draft
Matovidlo wants to merge 2 commits into main from devin/1768823690-scalability-workers-backpressure

Conversation


@Matovidlo commented Jan 19, 2026

Description

Link to Devin run: https://app.devin.ai/sessions/58bff70f3347474283d6dc04fa2d4783
Requested by: Martin Vasko (@Matovidlo)

Change Type

  • Major (breaking changes, significant new features)
  • Minor (new features, enhancements, backward compatible)
  • Patch (bug fixes, small improvements, no new features)

Summary

This PR addresses the primary scalability bottleneck: Python's asyncio event loop runs on a single thread. With many concurrent SSE connections (e.g., 1000+), every new request competes for the same event loop, so simple operations still succeed while more complex ones (like tools/list) time out.

Changes:

  1. Uvicorn Workers (--workers CLI argument, default: 1)

    • Each worker runs its own asyncio event loop, distributing load across CPU cores
    • Note: SSE connections are stateful, so sticky sessions at the load balancer are required when using multiple workers
  2. Connection Limits with Backpressure (--max-connections CLI argument, default: 1000)

    • New ConnectionMetrics class tracks active connections thread-safely
    • ConnectionLimitMiddleware returns HTTP 503 when at capacity
    • Prevents degradation for existing connections by rejecting new ones early
    • Health check (/health-check) and info (/) endpoints are excluded from tracking

Key files:

  • src/keboola_mcp_server/connections.py - New module with connection tracking and middleware
  • src/keboola_mcp_server/cli.py - CLI arguments and middleware integration

Human Review Checklist

  • IMPORTANT: Verify workers parameter works correctly with uvicorn.Server.serve() - typically multiple workers are spawned via uvicorn CLI, not the programmatic API. This may require using uvicorn.run() instead or a different approach.
  • Review thread-safety of ConnectionMetrics using threading.Lock in asyncio context
  • Confirm 503 response format is appropriate for MCP clients
  • Consider if default max_connections=1000 is appropriate for production
  • Note: No unit tests added for the new module - consider if tests should be required

Testing

  • Tested with Cursor AI desktop (Streamable-HTTP transport)

Optional testing

  • Tested with Cursor AI desktop (all transports)
  • Tested with claude.ai web and canary-orion MCP (SSE and Streamable-HTTP)
  • Tested with In Platform Agent on canary-orion
  • Tested with RO chat on canary-orion

Checklist

  • Self-review completed
  • Unit tests added/updated (if applicable)
  • Integration tests added/updated (if applicable)
  • Project version bumped according to the change type (if applicable)
  • Documentation updated (if applicable)

Release Notes

Justification, description

Adds scalability improvements for high-concurrency scenarios with new --workers and --max-connections CLI options for HTTP-based transports.

Plans for Customer Communication

N/A

Impact Analysis

Low risk - new optional CLI arguments with sensible defaults (workers=1, max-connections=1000). Existing behavior unchanged unless explicitly configured.

Deployment Plan

N/A

Rollback Plan

N/A

Post-Release Support Plan

N/A

This addresses the primary bottleneck where Python's asyncio event loop
runs on a single thread. With many concurrent SSE connections (e.g., 1000+),
every new request competes for the same event loop, causing simple operations
to work but complex ones (like tools/list) to time out.

Solution 2: Uvicorn Workers
- Add --workers CLI argument (default: 1) to run multiple event loops in parallel
- Each worker process handles its own set of connections
- Note: SSE connections are stateful, so sticky sessions at the load balancer
  are required when using multiple workers

Solution 5: Connection Limits with Backpressure
- Add --max-connections CLI argument (default: 1000) per worker
- New ConnectionMetrics class tracks active connections thread-safely
- ConnectionLimitMiddleware returns HTTP 503 when at capacity
- This prevents degradation for existing connections by rejecting new ones
  rather than allowing the event loop to become overloaded
- Health check and info endpoints are excluded from connection tracking

Co-Authored-By: Martin Vasko <Matovidlo2@gmail.com>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

