refactor: domain-driven prompt-response correlation via ExecutionEvent #794
Open
BlueHotDog wants to merge 8 commits into main
2c89b4d to c031117
Surface :already_running from submit_user_message when an agent is
already executing for the task, and add an integration test that
exercises the guard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Skip title generation and log accurately when submit_user_message
returns {:ok, :already_running} instead of falling through to the
catch-all {:ok, _interaction} clause.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
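The clause-ordering point in this commit can be illustrated with a minimal sketch. The module name and the returned atoms are hypothetical and only stand in for the real title-generation and logging logic; the one thing taken from the commit message is that the `{:ok, :already_running}` clause has to be matched before the catch-all `{:ok, _interaction}` clause.

```elixir
defmodule Sketch.PromptResult do
  # Hypothetical module: the return atoms are placeholders, not the PR's code.

  # The specific clause must come first; otherwise the catch-all
  # {:ok, _interaction} clause below would match :already_running too.
  def handle({:ok, :already_running}), do: :skip_title_generation

  # Any other successful result is treated as a persisted interaction.
  def handle({:ok, _interaction}), do: :generate_title

  # Errors are passed through unchanged.
  def handle({:error, reason}), do: {:error, reason}
end
```

Swapping the first two clauses would compile (with an unreachable-clause warning) but would route `:already_running` into the title-generation branch, which is the behaviour this commit removes.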
- execution_test: use chained setup matching main's pattern
- tasks_test: use add_agent_response for sequence test to avoid
already_running from blocking agent
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…correlation
Replace the fragile `pending_prompt_id` scalar in TaskChannel with a
correlated `pending_prompt` map that ties JSON-RPC request IDs to
domain interaction IDs.
**Domain changes:**
- Add `ExecutionEvent` struct as the ACL boundary between SwarmAi
infrastructure events and Frontman domain events. Carries `caused_by`
(the interaction_id of the triggering UserMessage).
- Thread `interaction_id` through execution metadata so completion
events carry causation context back to the channel.
- `submit_user_message` now rejects entirely when an execution is
already running — no message persisted, immediate error response.
- Extract `add_user_message` as a domain primitive for persisting
messages without starting execution.
- SwarmDispatcher wraps raw swarm events into `ExecutionEvent` before
PubSub broadcast.
- Rename `Execution.handle_swarm_event/3` to `classify_event/1`,
accepting `%ExecutionEvent{}` directly.
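As a rough illustration of the domain changes listed above, the sketch below shows one way a raw swarm event could be wrapped into a struct that carries `caused_by`. Only the `ExecutionEvent` name, the `caused_by` field, and the `interaction_id` metadata key come from this commit message; the module path, the `type` and `payload` fields, the function names, and the event types are assumptions made for the example.

```elixir
defmodule Sketch.ExecutionEvent do
  @moduledoc "Illustrative only; not the struct defined in this PR."

  # :type and :payload are assumed field names; :caused_by is from the PR.
  @enforce_keys [:type]
  defstruct [:type, :payload, :caused_by]

  # Wrap a raw {type, payload} swarm event and attach the interaction_id
  # found in the execution metadata as the causation reference.
  def from_swarm({type, payload}, metadata) when is_atom(type) and is_map(metadata) do
    %__MODULE__{type: type, payload: payload, caused_by: Map.get(metadata, :interaction_id)}
  end

  # Classify an event into a coarse outcome (hypothetical categories).
  def classify(%__MODULE__{type: :completed}), do: :done
  def classify(%__MODULE__{type: type}) when type in [:failed, :crashed], do: :error
  def classify(%__MODULE__{}), do: :ignore
end
```

Because the struct is built at the dispatcher boundary, code downstream can pattern-match on a single `%ExecutionEvent{}` shape instead of on raw `{type, payload}` tuples.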
**Transport changes:**
- Replace `pending_prompt_id` with `pending_prompt` map containing
both `interaction_id` and `jsonrpc_id`.
- Channel receives `{:execution_event, %ExecutionEvent{}}` instead
of `{:swarm_event, {type, payload}}`.
- `resolve_pending_prompt` verifies causation match via `caused_by`.
Fixes the bug where a second prompt during execution overwrote the
pending prompt ID, causing the first prompt's response to be lost.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
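The correlation described in the transport changes can be sketched as follows. The `pending_prompt` map and its `interaction_id` and `jsonrpc_id` keys are taken from the commit message; the module name, the plain-map state, and the return shapes are assumptions for the sake of a self-contained example, and the real `resolve_pending_prompt` may treat a `nil` `caused_by` differently.

```elixir
defmodule Sketch.PendingPrompt do
  # Hypothetical helper module; the state here is a plain map, not channel assigns.

  # Record which JSON-RPC request is waiting on which domain interaction.
  def track(state, interaction_id, jsonrpc_id) do
    Map.put(state, :pending_prompt, %{interaction_id: interaction_id, jsonrpc_id: jsonrpc_id})
  end

  # Resolve only when the event was caused by the tracked interaction.
  def resolve(%{pending_prompt: %{interaction_id: id, jsonrpc_id: rpc_id}} = state, caused_by)
      when caused_by == id do
    {{:reply, rpc_id}, Map.put(state, :pending_prompt, nil)}
  end

  # Nothing pending, or causation mismatch: leave the state untouched.
  def resolve(state, _caused_by), do: {:noreply, state}
end
```

With the JSON-RPC ID and the interaction ID stored together, a completion event for a different interaction cannot resolve the wrong request, which is the failure mode the bare scalar allowed.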
The 'crashed agent' and 'failed agent' e2e tests (added in #798) still asserted on the raw {:swarm_event, ...} tuple. After the ExecutionEvent refactor, SwarmDispatcher now broadcasts {:execution_event, %ExecutionEvent{}} instead. Update both assert_receive calls to match the new domain event format.
3022668 to 0c630f9
Fixed
When retries are exhausted, finalize_turn was called without caused_by,
silently skipping the causation mismatch check in resolve_pending_prompt.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ng pending prompt
Add a test verifying that when all retries are exhausted, the pending
JSON-RPC prompt is resolved with an error response, covering the
caused_by propagation path through handle_transient_error.
Also remove the `caused_by \\ nil` default from finalize_turn/3. The two
call sites without an execution event now pass nil explicitly, making
the argument mandatory.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
These functions operate solely on ExecutionEvent data — moving them there eliminates Feature Envy and keeps domain semantics co-located with the type. Callers (TaskChannel, SwarmDispatcher) now call ExecutionEvent.classify/1 and ExecutionEvent.classify_error/1 directly.
Summary
- Add `ExecutionEvent` domain struct as the ACL boundary between SwarmAi infrastructure events and Frontman domain events
- Thread `interaction_id` (causation) through execution metadata so completion events carry context back to the channel
- Replace the `pending_prompt_id` scalar with a correlated `pending_prompt` map (`interaction_id` + `jsonrpc_id`)
- Extract `add_user_message` as a domain primitive (persist without execution), compose `submit_user_message` from it
- Rename `Execution.handle_swarm_event/3` → `Execution.classify_event/1`, accepting `%ExecutionEvent{}` directly

Motivation
Devin review flagged a bug: when a second `session/prompt` arrives while an agent is running, `pending_prompt_id` gets overwritten. The first prompt's JSON-RPC response is lost. Root cause: the transport layer was manually mirroring domain execution state via a bare scalar.

Rather than patching the symptom, this PR fixes the architecture: the domain now carries its own correlation context through the execution lifecycle, and the transport maps domain IDs to protocol IDs.

Test plan
- `execution_test.exs`: `{:ok, :already_running}` → `{:error, :already_running}`, verify rejected message is not persisted
- `task_channel_test.exs`: all event builders use `ExecutionEvent`, `pending_prompt_id` → `pending_prompt`
- `error_propagation_test.exs`, `execution_sentry_test.exs`: `{:swarm_event, ...}` → `{:execution_event, %ExecutionEvent{...}}`
- `tasks_channel_test.exs`: use `add_user_message` for history-only tests

🤖 Generated with Claude Code