Add magic string detection for reliable sub-agent hook reporting#118
Closed
Reconciled with main branch (which added deepwork rules clear_queue CLI).

Key improvements:

1. TASK_START/HOOK_FIRED magic string detection
   - Sub-agents ALWAYS output TASK_START at response start
   - If hook fires and blocks, they also output HOOK_FIRED
   - Detection: TASK_START present + no HOOK_FIRED = hook did NOT fire
   - Eliminates impossible HOOK_NOT_FIRED requirement
2. Fixed all file names in sub-agent prompts to match actual test files
   - feature.py → test_trigger_safety_mode.py
   - module_source.py → test_set_mode_source.py
   - etc.
3. Added max_turns: 5 timeout to prevent infinite hangs
   - For "should fire" tests: timeout = PASSED (confirms blocking)
   - For "should NOT fire" tests: timeout = FAILED
4. Fixed infinite block promise tags to match rule names
5. Uses new deepwork rules clear_queue CLI command
4b2b707 to 6f71bb0
- Document ~16k baseline input token cost per sub-agent (system prompt + tools)
- Add "Keep your response brief" instruction to sub-agent prompts
- Helps minimize additional token usage on top of unavoidable baseline
Summary
This PR introduces a critical fix to the manual testing workflow by implementing magic string detection for reliable sub-agent hook reporting. Instead of relying on visual observation of blocking prompts or manual queue inspection, sub-agents now return standardized strings (HOOK_FIRED: or HOOK_NOT_FIRED:) that the main agent checks to determine test pass/fail status.

Key Changes

- Magic String Protocol: Sub-agents are now instructed to return:
  - HOOK_FIRED: <description> if a DeepWork hook blocked them
  - HOOK_NOT_FIRED: Task completed successfully if no hook fired
- Updated Test Instructions: All 8 test cases in both run_fire_tests.md and run_not_fire_tests.md now include the magic string instruction at the end of each sub-agent prompt
- Fixed File Names: Corrected all test file references in sub-agent prompts to match actual test files:
  - feature.py → test_trigger_safety_mode.py
  - module_source.py → test_set_mode_source.py
  - handler_trigger.py → test_pair_mode_trigger.py
  - dangerous.py → test_infinite_block_prompt.py
  - risky.py → test_infinite_block_command.py
- Updated Promise Tags: Fixed infinite block test promise tags to match rule names:
  - <promise>I have verified this change is safe</promise> → <promise>Manual Test: Infinite Block Prompt</promise>
  - <promise>I have verified this change is safe</promise> → <promise>Manual Test: Infinite Block Command</promise>
- Quality Criteria Updates: Modified acceptance criteria to check for magic string detection instead of manual observation:
  - rules_check
- Documentation Improvements: Enhanced test_reference.md with clear magic string detection explanation and updated critical rules
- Version Bump: Updated job version from 1.2.1 to 1.3.0 with changelog entry
Implementation Details
The magic string approach provides:
This change ensures the manual testing workflow is more robust and less dependent on visual observation or manual verification commands.
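As an illustration of the HOOK_FIRED: / HOOK_NOT_FIRED: protocol described in the summary, the main agent's check could look roughly like the sketch below. The function name `parse_magic_string` and the assumption that the magic string begins a line of the sub-agent's response are hypothetical; the PR itself expresses this check as instructions in the test markdown files, not as Python.

```python
import re
from typing import Optional, Tuple

# Assumed format: the magic string starts a line and is followed by a
# free-text description on the same line.
MAGIC_LINE = re.compile(
    r"^(HOOK_FIRED|HOOK_NOT_FIRED):[ \t]*(.*)$", re.MULTILINE
)


def parse_magic_string(response: str) -> Optional[Tuple[bool, str]]:
    """Return (hook_fired, description), or None if no magic string is found."""
    match = MAGIC_LINE.search(response)
    if match is None:
        return None
    return match.group(1) == "HOOK_FIRED", match.group(2).strip()


if __name__ == "__main__":
    print(parse_magic_string("HOOK_NOT_FIRED: Task completed successfully"))
```

Returning `None` when neither string appears lets the caller treat a missing report as its own case instead of silently counting it as "hook did not fire".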