Add manual test for infinite block handling #68
Merged
This test verifies that a rule with no safety file option will always block when its trigger file is edited. The only way to proceed is by providing a promise tag in the correct format:

<promise>Manual Test: Infinite Block</promise>

This tests the promise mechanism for rules that cannot be satisfied by editing additional files, which is useful for enforcing policies where explicit acknowledgment is the only valid response.
Renamed the original infinite block test to clarify it uses a prompt action, and added a new command-based variant that uses `false` (always fails). Both tests verify the promise mechanism:
- Infinite Block Prompt: Shows instructions, promise bypasses
- Infinite Block Command: Runs failing command, promise skips execution

Testing confirmed that promises work correctly for both action types: when a promise is provided, command-action rules are skipped entirely and the command never runs.
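The skip behavior for command-action rules can be illustrated with a minimal sketch. This is not the project's actual implementation: `CommandRule`, `evaluate_command_rule`, and the injectable `run` parameter are hypothetical names chosen for illustration. The point it shows is that the promise check happens before the command is executed, so a promised rule never runs its command.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple


@dataclass
class CommandRule:
    """Hypothetical stand-in for a command-action rule."""
    name: str
    command: str


def evaluate_command_rule(
    rule: CommandRule,
    promised_names: Set[str],
    run: Callable = subprocess.run,
) -> Tuple[str, Optional[int]]:
    """Return (status, exit_code) for a rule.

    If the rule's name was promised, the command is never executed.
    """
    if rule.name in promised_names:
        # Promise given: skip the rule entirely, without running the command.
        return ("skipped", None)
    result = run(rule.command, shell=True)
    status = "passed" if result.returncode == 0 else "failed"
    return (status, result.returncode)
```

With a rule whose command is `false`, an empty promise set yields a failure with a non-zero exit code, while a promise naming the rule yields `("skipped", None)` and no process is started.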
When a command-action rule fails, the error output now includes guidance on how to skip the rule using a promise tag. Previously, command errors only showed the command name and exit code, leaving the agent with no way to know how to proceed.

The error output now includes: "The following command rules failed. To skip a rule, include `<promise>Rule Name</promise>` in your response."

Also updated the infinite block command test to verify this behavior and removed redundant promise guidance from the test file (since the guidance should come from the hook output, not the test file).
When a command-action rule fails, the error output now includes the exact
promise tag needed to skip that specific rule:
To skip, include `<promise>Manual Test: Infinite Block Command</promise>` in your response.
Previously, the guidance used a generic placeholder ("Rule Name"), which didn't
help agents understand how to proceed. Now each failed rule shows its actual name.
Bumps version to 0.4.1 as this is a user-facing bug fix.
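As a rough sketch of the per-rule hint described above, the following shows one way the failure output could be assembled so that each failed rule carries its own promise tag. The function name `format_command_failures`, the tuple layout, and the wording of the per-rule status line are assumptions for illustration; only the header sentence and the "To skip, include ..." sentence are taken from the text.

```python
from typing import Iterable, Tuple


def format_command_failures(failures: Iterable[Tuple[str, str, int]]) -> str:
    """Build error text for failed command rules.

    Each item is (rule_name, command, exit_code). The layout of the
    per-rule status line is an assumption, not the project's real format.
    """
    lines = ["The following command rules failed."]
    for name, command, exit_code in failures:
        lines.append(f"- {name}: `{command}` exited with code {exit_code}")
        # Use the rule's actual name rather than a generic placeholder.
        lines.append(
            f"  To skip, include `<promise>{name}</promise>` in your response."
        )
    return "\n".join(lines)
```

Passing `[("Manual Test: Infinite Block Command", "false", 1)]` produces a hint containing the exact tag for that rule, so the reader of the error does not have to guess the rule name.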
The promise skip hint now shows `<promise>✓ Rule Name</promise>` for nicer visual output when users see the instructions. Also updated the promise extraction regex to properly handle the checkmark prefix when parsing promises from the transcript.
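A minimal sketch of promise extraction that tolerates the checkmark prefix might look like the following. The pattern and the `extract_promises` helper are illustrative assumptions, not the regex used in the project; they show one way to accept tags with or without `✓`, in any letter case, and with surrounding whitespace or newlines.

```python
import re
from typing import Set

# Optional "✓" prefix, case-insensitive tag names, and DOTALL so that
# rule names may be surrounded by (or contain) newlines.
PROMISE_RE = re.compile(
    r"<promise>\s*(?:✓\s*)?(.*?)\s*</promise>",
    re.IGNORECASE | re.DOTALL,
)


def extract_promises(text: str) -> Set[str]:
    """Return the set of promised rule names found in text."""
    names = set()
    for raw in PROMISE_RE.findall(text):
        # Collapse internal whitespace so wrapped names compare equal.
        name = " ".join(raw.split())
        if name:
            names.add(name)
    return names
```

Because the prefix is matched outside the capture group, `<promise>✓ Rule Name</promise>` and `<promise>Rule Name</promise>` both yield the same rule name, which keeps rule lookup independent of how the hint was displayed.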
Tests cover:
- Simple promise tags
- Promise tags with checkmark prefix (✓)
- Multiple promises in same text
- Case insensitivity of tag names
- Whitespace and newline handling
- Real-world command error promise format
- Mixed formats (with and without checkmark)
- Special characters in rule names

This ensures the checkmark prefix handling never regresses.