Add magic string detection for reliable sub-agent hook reporting by nhorton · Pull Request #118 · Unsupervisedcom/deepwork

nhorton · 2026-01-22T20:10:49Z

Summary

This PR introduces a critical fix to the manual testing workflow by implementing magic string detection for reliable sub-agent hook reporting. Instead of relying on visual observation of blocking prompts or manual queue inspection, sub-agents now return standardized strings (HOOK_FIRED: or HOOK_NOT_FIRED:) that the main agent checks to determine test pass/fail status.

Key Changes

- Magic String Protocol: Sub-agents are now instructed to return:
  - HOOK_FIRED: <description> if a DeepWork hook blocked them
  - HOOK_NOT_FIRED: Task completed successfully if no hook fired
- Updated Test Instructions: All 8 test cases in both run_fire_tests.md and run_not_fire_tests.md now include the magic string instruction at the end of each sub-agent prompt
- Fixed File Names: Corrected all test file references in sub-agent prompts to match actual test files:
  - feature.py → test_trigger_safety_mode.py
  - module_source.py → test_set_mode_source.py
  - handler_trigger.py → test_pair_mode_trigger.py
  - dangerous.py → test_infinite_block_prompt.py
  - risky.py → test_infinite_block_command.py
  - And similar corrections for all other test files
- Updated Promise Tags: Fixed infinite block test promise tags to match rule names:
  - <promise>I have verified this change is safe</promise> → <promise>Manual Test: Infinite Block Prompt</promise>
  - <promise>I have verified this change is safe</promise> → <promise>Manual Test: Infinite Block Command</promise>
- Quality Criteria Updates: Modified acceptance criteria to check for magic string detection instead of manual observation:
  - Changed from "Hooks Observed" to "Magic String Detection"
  - Clarified that agents must NOT manually run rules_check
- Documentation Improvements: Enhanced test_reference.md with clear magic string detection explanation and updated critical rules
- Version Bump: Updated job version from 1.2.1 to 1.3.0 with changelog entry

Implementation Details

The magic string approach provides:

- Reliability: Explicit, unambiguous signal from sub-agents about hook behavior
- Simplicity: Main agent checks response text instead of inferring from blocking behavior
- Consistency: All tests use the same detection mechanism
- Fallback: Queue inspection still available if magic string is missing (inconclusive case)

This change ensures the manual testing workflow is more robust and less dependent on visual observation or manual verification commands.
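As a rough illustration of the check described above, the main agent's side of the protocol might look like the following minimal sketch. The names `HookResult` and `detect_hook` are hypothetical and do not come from the DeepWork codebase. Giving HOOK_FIRED precedence when both markers appear is also an assumption.

```python
from enum import Enum


class HookResult(Enum):
    """Possible outcomes of inspecting a sub-agent response (hypothetical type)."""
    FIRED = "fired"
    NOT_FIRED = "not_fired"
    INCONCLUSIVE = "inconclusive"  # no marker found: fall back to queue inspection


# Marker strings as described in the PR summary.
FIRED_MARKER = "HOOK_FIRED:"
NOT_FIRED_MARKER = "HOOK_NOT_FIRED:"


def detect_hook(response: str) -> HookResult:
    """Classify a sub-agent response by the magic string it contains."""
    # Assumption: if both markers somehow appear, treat the hook as fired.
    if FIRED_MARKER in response:
        return HookResult.FIRED
    if NOT_FIRED_MARKER in response:
        return HookResult.NOT_FIRED
    return HookResult.INCONCLUSIVE
```

A response with neither marker maps to the inconclusive case, which is where the queue-inspection fallback would apply.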

Reconciled with main branch (which added deepwork rules clear_queue CLI).

Key improvements:

1. TASK_START/HOOK_FIRED magic string detection
   - Sub-agents ALWAYS output TASK_START at response start
   - If hook fires and blocks, they also output HOOK_FIRED
   - Detection: TASK_START present + no HOOK_FIRED = hook did NOT fire
   - Eliminates impossible HOOK_NOT_FIRED requirement
2. Fixed all file names in sub-agent prompts to match actual test files
   - feature.py → test_trigger_safety_mode.py
   - module_source.py → test_set_mode_source.py
   - etc.
3. Added max_turns: 5 timeout to prevent infinite hangs
   - For "should fire" tests: timeout = PASSED (confirms blocking)
   - For "should NOT fire" tests: timeout = FAILED
4. Fixed infinite block promise tags to match rule names
5. Uses new deepwork rules clear_queue CLI command
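The revised TASK_START/HOOK_FIRED detection and the timeout rules above can be sketched as a single decision function. This is only an illustration: `evaluate`, its parameters, and the string verdicts are hypothetical. Treating a response without TASK_START as inconclusive is an assumption, since the commit message does not say how that case is handled.

```python
def evaluate(response: str, timed_out: bool, expect_fire: bool) -> str:
    """Return "PASSED", "FAILED", or "INCONCLUSIVE" for one sub-agent run.

    response:    text returned by the sub-agent (may be empty on timeout)
    timed_out:   True if the run hit the max_turns limit
    expect_fire: True for "should fire" tests, False for "should NOT fire"
    """
    # Timeout rule: a hang confirms blocking, so it passes only when a hook
    # was expected to fire.
    if timed_out:
        return "PASSED" if expect_fire else "FAILED"

    # Assumption: without TASK_START the sub-agent did not follow the
    # protocol, so no verdict can be drawn from the text.
    if "TASK_START" not in response:
        return "INCONCLUSIVE"

    # TASK_START present: the hook fired only if HOOK_FIRED also appears.
    fired = "HOOK_FIRED" in response
    return "PASSED" if fired == expect_fire else "FAILED"
```

Under this scheme the sub-agent never has to assert that a hook did not fire. The absence of HOOK_FIRED after TASK_START carries that information.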


nhorton force-pushed the claude/review-manual-tests-AR4k2 branch from 4b2b707 to 6f71bb0 on January 22, 2026 20:19

claude and others added 3 commits January 22, 2026 20:22

Explicitly require model: haiku for all sub-agent Task calls

74911cd

Add token overhead note and efficiency instructions (v1.3.1)

231605b

- Document ~16k baseline input token cost per sub-agent (system prompt + tools)
- Add "Keep your response brief" instruction to sub-agent prompts
- Helps minimize additional token usage on top of unavoidable baseline
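To put the baseline in perspective, a back-of-the-envelope estimate is shown below. The ~16k figure is the approximate value from the note above. The function name and the choice of 8 sub-agents (one per test case) are illustrative assumptions, not measured values.

```python
# Approximate baseline input tokens per sub-agent (system prompt + tools),
# taken from the commit note; the real value will vary.
BASELINE_INPUT_TOKENS = 16_000


def baseline_cost(num_sub_agents: int, baseline: int = BASELINE_INPUT_TOKENS) -> int:
    """Estimate the unavoidable input-token cost of spawning sub-agents."""
    return num_sub_agents * baseline


# Assumption: one sub-agent per test case, 8 test cases in a run.
print(baseline_cost(8))  # 128000
```

Because this cost is fixed per sub-agent, the "Keep your response brief" instruction can only reduce what is spent on top of it.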

Ran install for the changes

85f429d

nhorton closed this Jan 22, 2026

nhorton wants to merge 4 commits into main from claude/review-manual-tests-AR4k2
