
docs(designs): MCP integration beyond tools#718

Open
mkmeral wants to merge 2 commits into strands-agents:main from mkmeral:design/mcp-integration

Conversation

Contributor

@mkmeral mkmeral commented Mar 30, 2026

Description

A design proposal for how to integrate MCP with Strands

Related Issues

strands-agents/sdk-python#1659

Type of Change

Design Doc

Checklist

  • I have read the CONTRIBUTING document
  • My changes follow the project's documentation style
  • I have tested the documentation locally using npm run dev
  • Links in the documentation are valid and working

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Contributor

github-actions bot commented Mar 30, 2026

Documentation Preview Ready

Your documentation preview has been successfully deployed!

Preview URL: https://d3ehv1nix5p99z.cloudfront.net/pr-cms-718/docs/user-guide/quickstart/overview/

Updated at: 2026-04-07T16:51:40.815Z

…ections

Incremental additions to the MCP design doc:
- Tasks section: current implementation status, spec gaps table, P1/P2 priorities
- Configuration & Auth: env passthrough, transport defaults, bearer token config sugar
- Open question strands-agents#6: model-immediate-response as future concern (async plumbing)

- **Tool list changes go unnoticed.** Some MCP servers dynamically add or remove tools based on context (e.g., auth state, project type). The server sends `notifications/tools/list_changed`, but Strands' message handler only processes exceptions. The notification falls through silently, and the agent keeps using a stale tool list until restart.
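A minimal sketch of what handling this notification could look like; `ToolCache`, `make_message_handler`, and the `.root.method` access are illustrative assumptions (the latter mirrors the mcp Python SDK's notification wrapper), not current Strands APIs:

```python
# Sketch only: react to notifications/tools/list_changed instead of
# dropping it. ToolCache and make_message_handler are hypothetical names.

class ToolCache:
    def __init__(self):
        self.stale = False

    def invalidate(self):
        # A real implementation would re-run tools/list against the session.
        self.stale = True


def make_message_handler(cache: ToolCache):
    async def handle(message) -> None:
        # Transport errors arrive as exceptions; everything else is a
        # JSON-RPC message whose method name we can inspect.
        if isinstance(message, Exception):
            raise message
        method = getattr(getattr(message, "root", message), "method", None)
        if method == "notifications/tools/list_changed":
            cache.invalidate()
    return handle
```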

- **Servers can't request LLM completions.** The MCP spec allows servers to ask the client to generate text via `sampling/createMessage`. No production server uses this today, but the pattern is growing — it enables MCP servers that behave as agents rather than just tool providers.
Member

How is the pattern growing if no production servers use it? Are hobby or dev servers using it?

```python
agent = Agent(
    model=my_model,
    tools=[my_local_tool],  # local tools still go in tools=
    plugins=[plugin],       # MCP integration via plugin
)
```

Need to de-dupe tool/client names being passed across the plugin initialization and agent initialization, otherwise we'll get a ValueError for tool already found when registering

Contributor Author

Need to de-dupe tool/client names being passed across the plugin initialization and agent initialization

We should be adding a prefix to tool names when they come through MCP. If not, I'll add it. But yes, that's a common issue: tool names across local tools and other MCP servers can collide.


There are also two bugs: `_create_call_tool_coroutine()` doesn't forward the `_meta` field from tool call arguments (breaking progress tokens and custom metadata), and `MCPToolResult` discards the `isError` flag from `CallToolResult` (making it impossible to distinguish application errors from protocol errors).

Beyond the callback gaps, there's no integrated story. MCP events don't connect to the Strands hook system. There's no config file loading (every other MCP client supports this). There's no way to map MCP elicitation to Strands interrupts. If one of five MCP servers fails to start, the entire agent crashes.
Member

Are these actually blockers in the current setup? I think it's pretty trivial to add config-based loading in the current MCP client.

Contributor Author

Are these actually blockers in the current setup?

I think they're not really "blockers", otherwise we would have heard people saying it doesn't work.

That said, if you want to implement mcp.json with optional loading, you need to do a bunch of manual work and check the connection before passing it to the agent. See the code here: https://github.com/mkmeral/containerized-strands-agents/blob/main/src/containerized_strands_agents/agent.py#L253

Member

Sorry, trivial was probably the wrong phrase. I meant more "uncontroversial"

Comment on lines +78 to +83
Or from a config file (standard format used by Claude Desktop, Cursor, VS Code):

```python
plugin = MCPPlugin.from_config("mcp.json", fail_open=True)
agent = Agent(model=my_model, plugins=[plugin])
```
Member

I think this is taking away from the doc a bit. Agree this is a gap, but the proposal here is to cover the gaps in the new mcp spec

Contributor Author

What do you mean? I take the task as improving MCP in strands. Definitely agree that adding new spec updates is part of it, but not all imo

Member

I thought the purpose of this doc was to cover the mcp spec updates, not necessarily the mcp config feature request.

I agree that we should also have mcp config, but it is already being tracked as a part of this issue: strands-agents/sdk-python#482

I don't want this design discussion to get caught in the weeds of "if/how should we do mcp config" when we already have an issue tracking it that has been accepted by the team.

- Every user reinvents the same patterns. "Route MCP logs to Python logging" is a ~15-line function everyone will write. "Refresh tool cache when tools change" is another ~20-line function everyone will write.
- The marginal cost per MCP feature is low but constant — each new spec feature means a new `MCPClient.__init__` parameter and documentation.
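For concreteness, that log-routing helper might look roughly like this; the field names (`level`, `logger`, `data`) follow the MCP spec's `notifications/message` params, and the function itself is a sketch, not an existing Strands utility:

```python
import logging

# Map MCP's RFC 5424-style severity strings onto Python logging levels.
_LEVELS = {
    "debug": logging.DEBUG, "info": logging.INFO, "notice": logging.INFO,
    "warning": logging.WARNING, "error": logging.ERROR,
    "critical": logging.CRITICAL, "alert": logging.CRITICAL,
    "emergency": logging.CRITICAL,
}

async def route_mcp_logs(params) -> None:
    # `params.logger` is the server-supplied logger name, if any.
    logger = logging.getLogger(getattr(params, "logger", None) or "mcp.server")
    logger.log(_LEVELS.get(params.level, logging.INFO), "%s", params.data)
```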

**Recommendation:** Ship the wire-through callbacks as part of any option — they're small, useful, and serve as an escape hatch for users who need direct control or want to bypass the plugin.
Contributor

I think it's probably cleaner and more maintainable to carve one clear path for integration. This one looks like it requires a lot of lift on the user, so I'd lean towards not exposing callbacks at all

4. Installs a default `logging_callback` that routes MCP server logs to Python's `logging` module
5. Installs a default `list_roots_callback` that exposes the current working directory

Users who want to react to MCP events subscribe via the hook system they already know:
Member

Does our hooks system allow for customer defined hook events? If not, we should just do that, and have this feature take advantage of that

Contributor Author

I think so; the MCP plugin essentially does that.

You can create any event that extends the base hook event; then the only blocker (not sure if it is one) is calling the invoke callbacks on the agent's hook registry.
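A self-contained sketch of that pattern; `MCPProgressEvent` and the minimal registry below are illustrative stand-ins for the real Strands hook classes:

```python
# Sketch: a custom event type plus a type-keyed registry, mimicking how a
# plugin could publish MCP events through the agent's hook system.
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable

@dataclass
class MCPProgressEvent:  # hypothetical event name from this design
    tool_name: str
    progress: float
    total: float

class HookRegistry:
    """Stand-in for the agent's hook registry: callbacks keyed by event type."""
    def __init__(self):
        self._callbacks = defaultdict(list)

    def add_callback(self, event_type: type, cb: Callable) -> None:
        self._callbacks[event_type].append(cb)

    def invoke_callbacks(self, event) -> None:
        for cb in self._callbacks[type(event)]:
            cb(event)

# Subscribing looks the same as for built-in agent events:
hooks = HookRegistry()
seen = []
hooks.add_callback(MCPProgressEvent, lambda e: seen.append(e.progress))
hooks.invoke_callbacks(MCPProgressEvent("fetch_docs", 3, 10))
```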

Comment on lines +89 to +90
4. Installs a default `logging_callback` that routes MCP server logs to Python's `logging` module
5. Installs a default `list_roots_callback` that exposes the current working directory
Member

Curious what the "secure by default" version of this is? Do we want to allow an mcp server to read a filesystem by default? Or is it an explicit opt in?

Contributor Author

I'll add more explicit wording there. In the MCP plugin, all features are opt-in. I don't want to expose customer data to a bunch of MCP servers :)


---

### Option 2: Wire Through (pass callbacks to MCPClient)
Member

This is what we have today right? A lowlevel client that does some tool specific stuff, and lets the user implement the rest?

Contributor Author

Yep, but only for elicitation. There are more callbacks that we can hook up


These patterns make MCP feel like a natural part of the framework rather than just plumbing. They can be built on top of any option but are easiest with Options 1 or 3 because of hook integration.

### Elicitation as Interrupts
Member

One problem here is that elicitation requires the mcp client connection to remain open. You can't shut down the agent, restore from session, and then respond to interrupts. That is partly why we setup elicitation as a pass through. Not sure if things have changed since.

Contributor Author

That's actually a pretty good call out 😅

I think that limitation still exists. I'll dive a bit deeper


---

### Option 3: Full Integration (first-class `mcp_clients` on Agent)
Member

I personally like this one; it follows the session manager approach of a top-level plugin primitive. MCP is THE industry standard for agentic communication today, so I think it's ubiquitous enough to deserve a top-level primitive spot.

Contributor Author

MCP is THE industry standard for agentic communication today

I wouldn't say agentic communication, but tool proxying, sure. As I see it, the main use case of MCP is tools, and everything else is optional/additional. That's why I did not want to auto-connect everything (sampling, elicitation, etc.) to the agent; then we would have more complexity in the core agent.

I think plugins are a good middle ground


---

## Immediate Improvements (Ship Regardless of Option)
Contributor

Can TS have the same parity?

Contributor Author

Overall feature-wise? Yep. I think it should be part of the MCP project.


**P1 (ship soon):**
- Graceful startup failures (`fail_open`) — 30 lines, one broken server shouldn't crash the agent
- Progress callback passthrough — 20 lines, pass `progress_callback` to `call_tool()`
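A rough sketch of the `fail_open` behavior, with `client.start()` standing in for whatever connect method `MCPClient` exposes; the function name and shape are illustrative:

```python
import logging

# Sketch: start each MCP client, dropping the ones that fail instead of
# crashing the whole agent when fail_open is set.
def start_clients(clients, fail_open: bool = False):
    started, failed = [], []
    for client in clients:
        try:
            client.start()
            started.append(client)
        except Exception as exc:
            if not fail_open:
                raise
            logging.getLogger("mcp").warning("MCP server failed to start: %s", exc)
            failed.append((client, exc))
    return started, failed
```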

Is this just a stopgap solution until the MCPProgressEvent hook is in place? Or do both progress event paths have different purposes?

- **Opt-in via `TasksConfig`**: Pass `TasksConfig()` to `MCPClient` constructor to enable
- **Server capability detection**: Caches `tasks.requests.tools.call` during `session.initialize()`
- **Tool-level negotiation**: Reads `execution.taskSupport` per tool (`required`, `optional`, `forbidden`)
- **Full lifecycle**: `call_tool_as_task` → `poll_task` → `get_task_result` with timeout protection
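The lifecycle above might be driven by a polling loop along these lines; the method names mirror the proposal, and `client` is assumed (not guaranteed) to expose them:

```python
import time

# Sketch of call_tool_as_task -> poll_task -> get_task_result with
# timeout protection. Terminal status strings are illustrative.
def run_tool_as_task(client, name, args, timeout=300.0, interval=1.0):
    task_id = client.call_tool_as_task(name, args)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.poll_task(task_id)
        if status in ("completed", "failed", "cancelled"):
            return client.get_task_result(task_id)
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
```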
Contributor

Do we need to consider durable if we want a redesign?

Contributor Author

good question, what would that look like? 🤔


---

## Tasks
Member

@Unshure Unshure Apr 7, 2026

I've seen a few attempts at this, and I think we need this feature regardless of MCP: a way for an LLM to schedule a background task and wait for it to respond at some point in the future. It could be particularly useful for running a background bash command, triggering a research agent, or calling an MCP tool. MCP should use this async task tool for its implementation.

Contributor Author

I implemented the same for MCP Dev Summit https://github.com/agent-of-mkmeral/strands-cli-agent

The main problem is plumbing. We can invoke the agent again with the tool result, but where will the response go?

Additionally, if there is an ongoing conversation, asynchronously injected context can hurt more than it helps. That's why I left this as more of a follow-up for now.

Maybe we should vend both options and let users configure, e.g. default callbacks for task completion? 🤔


---

## Immediate Improvements (Ship Regardless of Option)
Member

I think calling out active bugs takes a bit away from the discussion of a design proposal. A bug is a bug and should be fixed; we don't need a design discussion for that. Let's try to keep these designs focused on new feature proposals.


3. **Include Option 2 (wire-through callbacks) as escape hatches** inside MCPClient. Power users who want raw control or have unusual requirements can bypass the plugin.

4. **Revisit Option 3 (first-class)** once we have adoption data on the plugin. If most users end up using MCPPlugin, promoting it to a native Agent parameter is straightforward.
Contributor

Based on preliminary feature usage data, we might already have enough signal to jump to this option, which exposes the neatest interface to customers.

MCP is among the most popular features we measured internally and on GH.


---

## Willingness to Implement
Member

nit: Do we need this section?


---

## Tasks
Member

@pgrayy pgrayy Apr 7, 2026

Some model providers also support a background mode where you can send a request to the model and receive a response id to use for polling. I think it would actually make sense to exit out of the agent loop under these circumstances to allow the user to poll themselves. Polling internally defeats the purpose as connections remain open for the agent caller. I'd be curious if we could support something similar for background tools. It should work similarly to interrupts. We exit the agent loop and allow the user to reinvoke when ready.

This gets tricky though when executing multiple tools concurrently.

Contributor Author

Instead of polling, you can also use notifications, so the server can send a notification. We need to implement both IMHO, so we can support whatever the MCP server supports.


### Option 2: Wire Through (pass callbacks to MCPClient)

**The idea:** The simplest possible approach. Add the four missing callback parameters to `MCPClient.__init__()`, pass them through to `ClientSession`, and let users handle everything themselves. No hook integration, no auto-wiring, no plugin.
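The shape of that wire-through might look like this sketch; the parameter names follow the mcp Python SDK's `ClientSession`, but the `MCPClient` signature here is illustrative, not the current one:

```python
# Sketch: MCPClient grows the missing callback parameters and simply
# forwards them to the underlying ClientSession when a session opens.
class MCPClient:
    def __init__(self, transport, *, sampling_callback=None,
                 elicitation_callback=None, list_roots_callback=None,
                 logging_callback=None, message_handler=None):
        self._transport = transport
        # Held until session creation, then splatted into ClientSession(...).
        self._session_kwargs = {
            "sampling_callback": sampling_callback,
            "elicitation_callback": elicitation_callback,
            "list_roots_callback": list_roots_callback,
            "logging_callback": logging_callback,
            "message_handler": message_handler,
        }
```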
Contributor

If this is the simplest possible approach, can't we ship this with P0 tasks, then make option 1 (Plugin) as a follow-up?


## Open Questions

1. **Plugin location** — Should MCPPlugin live inside the SDK (`strands.plugins.mcp`) or as a separate package? Inside = better discoverability, separate = faster iteration.
Contributor

If we choose this option and this is the recommended path for MCP, it is much better DX to include it directly in sdk-python

@@ -0,0 +1,429 @@
# MCP Integration Beyond Tools
Contributor

What happens to existing tools=[mcp_client] users?

Contributor Author

It works as is. I don't think we should change that behavior for now.


The `mcpServers` JSON config format we support today handles the basics (`command`, `args`, `url`, `headers`). A few small additions would improve the developer experience:

- **Pass-through environment keys**: Let users specify env var names to forward from the host environment, instead of hardcoding values. Example: `"env": {"passthrough": ["AWS_PROFILE", "DATABASE_URL"]}` forwards those vars from the host into the stdio subprocess without exposing secrets in config files.
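Resolving such a config could be as simple as the following sketch; the `passthrough` key is the proposed addition, not the current format:

```python
import os

# Sketch: copy named variables from the host environment at launch time,
# so secrets never need to be written into mcp.json itself.
def resolve_env(env_config: dict, host_env=None) -> dict:
    host_env = os.environ if host_env is None else host_env
    # Literal key/value pairs are kept as-is.
    resolved = {k: v for k, v in env_config.items() if k != "passthrough"}
    # Passthrough names are looked up on the host; missing ones are skipped.
    for name in env_config.get("passthrough", []):
        if name in host_env:
            resolved[name] = host_env[name]
    return resolved
```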
Contributor

Can you elaborate more about the reference here?

Contributor Author

@mkmeral mkmeral Apr 7, 2026

e.g. https://kiro.dev/docs/mcp/configuration/

The idea is that mcp.json is not a good place for secrets, because it causes you to persist tokens in multiple places. With pass-through env vars, you can just say this variable will come from the environment.


2. **Message handler API** — The plugin monkey-patches `_handle_error_message`. Should we add a public `set_message_handler()` on MCPClient?

3. **Elicitation-as-interrupts timing** — The elicitation callback fires during tool execution (not before). The current interrupt mechanism lives on `BeforeToolCallEvent`. Bridging these needs design work. Worth doing now or deferring?
Member

@pgrayy pgrayy Apr 7, 2026

We support interrupts from within decorator tool definitions as well. The interrupt method is on ToolContext. Just need to support raising interrupts from MCPTool is all. The piping is already in place. But see comment further above regarding why elicitation was setup as a pass through.

**Cons:**

- Requires changes to `Agent.__init__()` — adding a parameter, import paths, and initialization logic. This is a higher-risk change that affects every user, even those who don't use MCP.
- Needs more design work around lifecycle (when do MCP sessions start/stop?), multi-agent sharing (can two agents share an MCPClient?), and backward compatibility (what about existing `tools=[mcp_client]` code?).
Contributor

multi-agent sharing (can two agents share an MCPClient?)

This applies to all three options right? This is kind of a design question?
