diff --git a/.claude/commands/add_platform.verify.md b/.claude/commands/add_platform.verify.md
index d92da75a..d937b537 100644
--- a/.claude/commands/add_platform.verify.md
+++ b/.claude/commands/add_platform.verify.md
@@ -14,7 +14,7 @@ hooks:
  2. Running `deepwork install --platform ` completes without errors
  3. Expected command files are created in the platform's command directory
  4. Command file content matches the templates and job definitions
- 5. Established DeepWork jobs (deepwork_jobs, deepwork_policy) are installed correctly
+ 5. Established DeepWork jobs (deepwork_jobs, deepwork_rules) are installed correctly
  6. The platform can be used alongside existing platforms without conflicts

  If ALL criteria are met, include `✓ Quality Criteria Met`.
@@ -121,7 +121,7 @@ Ensure the implementation step is complete:
  - `deepwork_jobs.define.md` exists (or equivalent for the platform)
  - `deepwork_jobs.implement.md` exists
  - `deepwork_jobs.refine.md` exists
- - `deepwork_policy.define.md` exists
+ - `deepwork_rules.define.md` exists
  - All expected step commands exist

 4. **Validate command file content**
@@ -151,7 +151,7 @@ Ensure the implementation step is complete:
 - `deepwork install --platform ` completes without errors
 - All expected command files are created:
  - deepwork_jobs.define, implement, refine
- - deepwork_policy.define
+ - deepwork_rules.define
  - Any other standard job commands
 - Command file content is correct:
  - Matches platform's expected format
@@ -218,7 +218,7 @@ Verify the installation meets ALL criteria:
 2. Running `deepwork install --platform ` completes without errors
 3. Expected command files are created in the platform's command directory
 4. Command file content matches the templates and job definitions
-5. Established DeepWork jobs (deepwork_jobs, deepwork_policy) are installed correctly
+5. Established DeepWork jobs (deepwork_jobs, deepwork_rules) are installed correctly
 6. The platform can be used alongside existing platforms without conflicts

 If ALL criteria are met, include `✓ Quality Criteria Met`.
diff --git a/.claude/commands/deepwork_jobs.implement.md b/.claude/commands/deepwork_jobs.implement.md
index 132330f1..7c224679 100644
--- a/.claude/commands/deepwork_jobs.implement.md
+++ b/.claude/commands/deepwork_jobs.implement.md
@@ -19,9 +19,9 @@ hooks:
  6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"?
  7. **Sync Complete**: Has `deepwork sync` been run successfully?
  8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`?
- 9. **Policies Considered**: Have you thought about whether policies would benefit this job?
- - If relevant policies were identified, did you explain them and offer to run `/deepwork_policy.define`?
- - Not every job needs policies - only suggest when genuinely helpful.
+ 9. **Rules Considered**: Have you thought about whether rules would benefit this job?
+ - If relevant rules were identified, did you explain them and offer to run `/deepwork_rules.define`?
+ - Not every job needs rules - only suggest when genuinely helpful.

  If ANY criterion is not met, continue working to address it.
  If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
@@ -200,19 +200,19 @@ This will:

 After running `deepwork sync`, look at the "To use the new commands" section in the output. **Relay these exact reload instructions to the user** so they know how to pick up the new commands. Don't just reference the sync output - tell them directly what they need to do (e.g., "Type 'exit' then run 'claude --resume'" for Claude Code, or "Run '/memory refresh'" for Gemini CLI).
-### Step 7: Consider Policies for the New Job
+### Step 7: Consider Rules for the New Job

-After implementing the job, consider whether there are **policies** that would help enforce quality or consistency when working with this job's domain.
+After implementing the job, consider whether there are **rules** that would help enforce quality or consistency when working with this job's domain.

-**What are policies?**
+**What are rules?**

-Policies are automated guardrails defined in `.deepwork.policy.yml` that trigger when certain files change during an AI session. They help ensure:
+Rules are automated guardrails stored as markdown files in `.deepwork/rules/` that trigger when certain files change during an AI session. They help ensure:
 - Documentation stays in sync with code
 - Team guidelines are followed
 - Architectural decisions are respected
 - Quality standards are maintained

-**When to suggest policies:**
+**When to suggest rules:**

 Think about the job you just implemented and ask:
 - Does this job produce outputs that other files depend on?
@@ -220,28 +220,28 @@ Think about the job you just implemented and ask:
 - Are there quality checks or reviews that should happen when certain files in this domain change?
 - Could changes to the job's output files impact other parts of the project?
-**Examples of policies that might make sense:**
+**Examples of rules that might make sense:**

-| Job Type | Potential Policy |
-|----------|------------------|
+| Job Type | Potential Rule |
+|----------|----------------|
 | API Design | "Update API docs when endpoint definitions change" |
 | Database Schema | "Review migrations when schema files change" |
 | Competitive Research | "Update strategy docs when competitor analysis changes" |
 | Feature Development | "Update changelog when feature files change" |
 | Configuration Management | "Update install guide when config files change" |

-**How to offer policy creation:**
+**How to offer rule creation:**

-If you identify one or more policies that would benefit the user, explain:
-1. **What the policy would do** - What triggers it and what action it prompts
+If you identify one or more rules that would benefit the user, explain:
+1. **What the rule would do** - What triggers it and what action it prompts
 2. **Why it would help** - How it prevents common mistakes or keeps things in sync
 3. **What files it would watch** - The trigger patterns

 Then ask the user:

-> "Would you like me to create this policy for you? I can run `/deepwork_policy.define` to set it up."
+> "Would you like me to create this rule for you? I can run `/deepwork_rules.define` to set it up."

-If the user agrees, invoke the `/deepwork_policy.define` command to guide them through creating the policy.
+If the user agrees, invoke the `/deepwork_rules.define` command to guide them through creating the rule.

 **Example dialogue:**

@@ -250,15 +250,15 @@
 Based on the competitive_research job you just created, I noticed that when competitor analysis files change, it would be helpful to remind you to update your strategy documentation.
-I'd suggest a policy like:
+I'd suggest a rule like:
 - **Name**: "Update strategy when competitor analysis changes"
 - **Trigger**: `**/positioning_report.md`
 - **Action**: Prompt to review and update `docs/strategy.md`

-Would you like me to create this policy? I can run `/deepwork_policy.define` to set it up.
+Would you like me to create this rule? I can run `/deepwork_rules.define` to set it up.
 ```

-**Note:** Not every job needs policies. Only suggest them when they would genuinely help maintain consistency or quality. Don't force policies where they don't make sense.
+**Note:** Not every job needs rules. Only suggest them when they would genuinely help maintain consistency or quality. Don't force rules where they don't make sense.

 ## Example Implementation

@@ -292,8 +292,8 @@ Before marking this step complete, ensure:
 - [ ] `deepwork sync` executed successfully
 - [ ] Commands generated in platform directory
 - [ ] User informed to follow reload instructions from `deepwork sync`
-- [ ] Considered whether policies would benefit this job (Step 7)
-- [ ] If policies suggested, offered to run `/deepwork_policy.define`
+- [ ] Considered whether rules would benefit this job (Step 7)
+- [ ] If rules suggested, offered to run `/deepwork_rules.define`

 ## Quality Criteria

@@ -305,7 +305,7 @@ Before marking this step complete, ensure:
 - Steps with user inputs explicitly use "ask structured questions" phrasing
 - Sync completed successfully
 - Commands available for use
-- Thoughtfully considered relevant policies for the job domain
+- Thoughtfully considered relevant rules for the job domain

 ## Inputs

@@ -355,9 +355,9 @@ Verify the implementation meets ALL quality criteria before completing:
 6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"?
 7. **Sync Complete**: Has `deepwork sync` been run successfully?
 8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`?
-9. **Policies Considered**: Have you thought about whether policies would benefit this job?
- - If relevant policies were identified, did you explain them and offer to run `/deepwork_policy.define`?
- - Not every job needs policies - only suggest when genuinely helpful.
+9. **Rules Considered**: Have you thought about whether rules would benefit this job?
+ - If relevant rules were identified, did you explain them and offer to run `/deepwork_rules.define`?
+ - Not every job needs rules - only suggest when genuinely helpful.

 If ANY criterion is not met, continue working to address it.
 If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
diff --git a/.claude/commands/deepwork_policy.define.md b/.claude/commands/deepwork_policy.define.md
deleted file mode 100644
index 9e7d1c20..00000000
--- a/.claude/commands/deepwork_policy.define.md
+++ /dev/null
@@ -1,288 +0,0 @@
----
-description: Create or update policy entries in .deepwork.policy.yml
----
-
-# deepwork_policy.define
-
-**Standalone command** in the **deepwork_policy** job - can be run anytime
-
-**Summary**: Policy enforcement for AI agent sessions
-
-## Job Overview
-
-Manages policies that automatically trigger when certain files change during an AI agent session.
-Policies help ensure that code changes follow team guidelines, documentation is updated,
-and architectural decisions are respected.
-
-Policies are defined in a `.deepwork.policy.yml` file at the root of your project.
Each policy -specifies: -- Trigger patterns: Glob patterns for files that, when changed, should trigger the policy -- Safety patterns: Glob patterns for files that, if also changed, mean the policy doesn't need to fire -- Instructions: What the agent should do when the policy triggers - -Example use cases: -- Update installation docs when configuration files change -- Require security review when authentication code is modified -- Ensure API documentation stays in sync with API code -- Remind developers to update changelogs - - - -## Instructions - -# Define Policy - -## Objective - -Create or update policy entries in the `.deepwork.policy.yml` file to enforce team guidelines, documentation requirements, or other constraints when specific files change. - -## Task - -Guide the user through defining a new policy by asking structured questions. **Do not create the policy without first understanding what they want to enforce.** - -**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. - -### Step 1: Understand the Policy Purpose - -Start by asking structured questions to understand what the user wants to enforce: - -1. **What guideline or constraint should this policy enforce?** - - What situation triggers the need for action? - - What files or directories, when changed, should trigger this policy? - - Examples: "When config files change", "When API code changes", "When database schema changes" - -2. **What action should be taken?** - - What should the agent do when the policy triggers? - - Update documentation? Perform a security review? Update tests? - - Is there a specific file or process that needs attention? - -3. **Are there any "safety" conditions?** - - Are there files that, if also changed, mean the policy doesn't need to fire? 
- - For example: If config changes AND install_guide.md changes, assume docs are already updated - - This prevents redundant prompts when the user has already done the right thing - -### Step 2: Define the Trigger Patterns - -Help the user define glob patterns for files that should trigger the policy: - -**Common patterns:** -- `src/**/*.py` - All Python files in src directory (recursive) -- `app/config/**/*` - All files in app/config directory -- `*.md` - All markdown files in root -- `src/api/**/*` - All files in the API directory -- `migrations/**/*.sql` - All SQL migrations - -**Pattern syntax:** -- `*` - Matches any characters within a single path segment -- `**` - Matches any characters across multiple path segments (recursive) -- `?` - Matches a single character - -### Step 3: Define Safety Patterns (Optional) - -If there are files that, when also changed, mean the policy shouldn't fire: - -**Examples:** -- Policy: "Update install guide when config changes" - - Trigger: `app/config/**/*` - - Safety: `docs/install_guide.md` (if already updated, don't prompt) - -- Policy: "Security review for auth changes" - - Trigger: `src/auth/**/*` - - Safety: `SECURITY.md`, `docs/security_review.md` - -### Step 3b: Choose the Comparison Mode (Optional) - -The `compare_to` field controls what baseline is used when detecting "changed files": - -**Options:** -- `base` (default) - Compares to the base of the current branch (merge-base with main/master). This is the most common choice for feature branches, as it shows all changes made on the branch. -- `default_tip` - Compares to the current tip of the default branch (main/master). Useful when you want to see the difference from what's currently in production. -- `prompt` - Compares to the state at the start of each prompt. Useful for policies that should only fire based on changes made during a single agent response. - -**When to use each:** -- **base**: Best for most policies. "Did this branch change config files?" 
→ trigger docs review -- **default_tip**: For policies about what's different from production/main -- **prompt**: For policies that should only consider very recent changes within the current session - -Most policies should use the default (`base`) and don't need to specify `compare_to`. - -### Step 4: Write the Instructions - -Create clear, actionable instructions for what the agent should do when the policy fires. - -**Good instructions include:** -- What to check or review -- What files might need updating -- Specific actions to take -- Quality criteria for completion - -**Example:** -``` -Configuration files have changed. Please: -1. Review docs/install_guide.md for accuracy -2. Update any installation steps that reference changed config -3. Verify environment variable documentation is current -4. Test that installation instructions still work -``` - -### Step 5: Create the Policy Entry - -Create or update `.deepwork.policy.yml` in the project root. - -**File Location**: `.deepwork.policy.yml` (root of project) - -**Format**: -```yaml -- name: "[Friendly name for the policy]" - trigger: "[glob pattern]" # or array: ["pattern1", "pattern2"] - safety: "[glob pattern]" # optional, or array - compare_to: "base" # optional: "base" (default), "default_tip", or "prompt" - instructions: | - [Multi-line instructions for the agent...] -``` - -**Alternative with instructions_file**: -```yaml -- name: "[Friendly name for the policy]" - trigger: "[glob pattern]" - safety: "[glob pattern]" - compare_to: "base" # optional - instructions_file: "path/to/instructions.md" -``` - -### Step 6: Verify the Policy - -After creating the policy: - -1. **Check the YAML syntax** - Ensure valid YAML formatting -2. **Test trigger patterns** - Verify patterns match intended files -3. **Review instructions** - Ensure they're clear and actionable -4. 
**Check for conflicts** - Ensure the policy doesn't conflict with existing ones - -## Example Policies - -### Update Documentation on Config Changes -```yaml -- name: "Update install guide on config changes" - trigger: "app/config/**/*" - safety: "docs/install_guide.md" - instructions: | - Configuration files have been modified. Please review docs/install_guide.md - and update it if any installation instructions need to change based on the - new configuration. -``` - -### Security Review for Auth Code -```yaml -- name: "Security review for authentication changes" - trigger: - - "src/auth/**/*" - - "src/security/**/*" - safety: - - "SECURITY.md" - - "docs/security_audit.md" - instructions: | - Authentication or security code has been changed. Please: - 1. Review for hardcoded credentials or secrets - 2. Check input validation on user inputs - 3. Verify access control logic is correct - 4. Update security documentation if needed -``` - -### API Documentation Sync -```yaml -- name: "API documentation update" - trigger: "src/api/**/*.py" - safety: "docs/api/**/*.md" - instructions: | - API code has changed. Please verify that API documentation in docs/api/ - is up to date with the code changes. Pay special attention to: - - New or changed endpoints - - Modified request/response schemas - - Updated authentication requirements -``` - -## Output Format - -### .deepwork.policy.yml -Create or update this file at the project root with the new policy entry. - -## Quality Criteria - -- Asked structured questions to understand user requirements -- Policy name is clear and descriptive -- Trigger patterns accurately match the intended files -- Safety patterns prevent unnecessary triggering -- Instructions are actionable and specific -- YAML is valid and properly formatted - -## Context - -Policies are evaluated automatically when you finish working on a task. The system: -1. 
Determines which files have changed based on each policy's `compare_to` setting: - - `base` (default): Files changed since the branch diverged from main/master - - `default_tip`: Files different from the current main/master branch - - `prompt`: Files changed since the last prompt submission -2. Checks if any changes match policy trigger patterns -3. Skips policies where safety patterns also matched -4. Prompts you with instructions for any triggered policies - -You can mark a policy as addressed by including `✓ Policy Name` in your response (replace Policy Name with the actual policy name). This tells the system you've already handled that policy's requirements. - - -## Inputs - -### User Parameters - -Please gather the following information from the user: -- **policy_purpose**: What guideline or constraint should this policy enforce? - - -## Work Branch Management - -All work for this job should be done on a dedicated work branch: - -1. **Check current branch**: - - If already on a work branch for this job (format: `deepwork/deepwork_policy-[instance]-[date]`), continue using it - - If on main/master, create a new work branch - -2. **Create work branch** (if needed): - ```bash - git checkout -b deepwork/deepwork_policy-[instance]-$(date +%Y%m%d) - ``` - Replace `[instance]` with a descriptive identifier (e.g., `acme`, `q1-launch`, etc.) - -## Output Requirements - -Create the following output(s): -- `.deepwork.policy.yml` -Ensure all outputs are: -- Well-formatted and complete -- Ready for review or use by subsequent steps - -## Completion - -After completing this step: - -1. **Verify outputs**: Confirm all required files have been created - -2. **Inform the user**: - - The define command is complete - - Outputs created: .deepwork.policy.yml - - This command can be run again anytime to make further changes - -## Command Complete - -This is a standalone command that can be run anytime. The outputs are ready for use. 
- -Consider: -- Reviewing the outputs -- Running `deepwork sync` if job definitions were changed -- Re-running this command later if further changes are needed - ---- - -## Context Files - -- Job definition: `.deepwork/jobs/deepwork_policy/job.yml` -- Step instructions: `.deepwork/jobs/deepwork_policy/steps/define.md` \ No newline at end of file diff --git a/.claude/commands/deepwork_rules.define.md b/.claude/commands/deepwork_rules.define.md new file mode 100644 index 00000000..148247f2 --- /dev/null +++ b/.claude/commands/deepwork_rules.define.md @@ -0,0 +1,339 @@ +--- +description: Create a new rule file in .deepwork/rules/ +--- + +# deepwork_rules.define + +**Standalone command** in the **deepwork_rules** job - can be run anytime + +**Summary**: Rules enforcement for AI agent sessions + +## Job Overview + +Manages rules that automatically trigger when certain files change during an AI agent session. +Rules help ensure that code changes follow team guidelines, documentation is updated, +and architectural decisions are respected. + +Rules are stored as individual markdown files with YAML frontmatter in the `.deepwork/rules/` +directory. Each rule file specifies: +- Detection mode: trigger/safety, set (bidirectional), or pair (directional) +- Patterns: Glob patterns for matching files, with optional variable capture +- Instructions: Markdown content describing what the agent should do + +Example use cases: +- Update installation docs when configuration files change +- Require security review when authentication code is modified +- Ensure API documentation stays in sync with API code +- Enforce source/test file pairing + + + +## Instructions + +# Define Rule + +## Objective + +Create a new rule file in the `.deepwork/rules/` directory to enforce team guidelines, documentation requirements, or other constraints when specific files change. + +## Task + +Guide the user through defining a new rule by asking structured questions. 
**Do not create the rule without first understanding what they want to enforce.** + +**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. + +### Step 1: Understand the Rule Purpose + +Start by asking structured questions to understand what the user wants to enforce: + +1. **What guideline or constraint should this rule enforce?** + - What situation triggers the need for action? + - What files or directories, when changed, should trigger this rule? + - Examples: "When config files change", "When API code changes", "When database schema changes" + +2. **What action should be taken?** + - What should the agent do when the rule triggers? + - Update documentation? Perform a security review? Update tests? + - Is there a specific file or process that needs attention? + +3. **Are there any "safety" conditions?** + - Are there files that, if also changed, mean the rule doesn't need to fire? 
+ - For example: If config changes AND install_guide.md changes, assume docs are already updated + - This prevents redundant prompts when the user has already done the right thing + +### Step 2: Choose the Detection Mode + +Help the user select the appropriate detection mode: + +**Trigger/Safety Mode** (most common): +- Fires when trigger patterns match AND no safety patterns match +- Use for: "When X changes, check Y" rules +- Example: When config changes, verify install docs + +**Set Mode** (bidirectional correspondence): +- Fires when files that should change together don't all change +- Use for: Source/test pairing, model/migration sync +- Example: `src/foo.py` and `tests/foo_test.py` should change together + +**Pair Mode** (directional correspondence): +- Fires when a trigger file changes but expected files don't +- Changes to expected files alone do NOT trigger +- Use for: API code requires documentation updates (but docs can update independently) + +### Step 3: Define the Patterns + +Help the user define glob patterns for files. 
+ +**Common patterns:** +- `src/**/*.py` - All Python files in src directory (recursive) +- `app/config/**/*` - All files in app/config directory +- `*.md` - All markdown files in root +- `src/api/**/*` - All files in the API directory +- `migrations/**/*.sql` - All SQL migrations + +**Variable patterns (for set/pair modes):** +- `src/{path}.py` - Captures path variable (e.g., `foo/bar` from `src/foo/bar.py`) +- `tests/{path}_test.py` - Uses same path variable in corresponding file +- `{name}` matches single segment, `{path}` matches multiple segments + +**Pattern syntax:** +- `*` - Matches any characters within a single path segment +- `**` - Matches any characters across multiple path segments (recursive) +- `?` - Matches a single character + +### Step 4: Choose the Comparison Mode (Optional) + +The `compare_to` field controls what baseline is used when detecting "changed files": + +**Options:** +- `base` (default) - Compares to the base of the current branch (merge-base with main/master). Best for feature branches. +- `default_tip` - Compares to the current tip of the default branch. Useful for seeing difference from production. +- `prompt` - Compares to the state at the start of each prompt. For rules about very recent changes. + +Most rules should use the default (`base`) and don't need to specify `compare_to`. + +### Step 5: Write the Instructions + +Create clear, actionable instructions for what the agent should do when the rule fires. 
+ +**Good instructions include:** +- What to check or review +- What files might need updating +- Specific actions to take +- Quality criteria for completion + +**Template variables available in instructions:** +- `{trigger_files}` - Files that triggered the rule +- `{expected_files}` - Expected corresponding files (for set/pair modes) + +### Step 6: Create the Rule File + +Create a new file in `.deepwork/rules/` with a kebab-case filename: + +**File Location**: `.deepwork/rules/{rule-name}.md` + +**Format for Trigger/Safety Mode:** +```markdown +--- +name: Friendly Name for the Rule +trigger: "glob/pattern/**/*" # or array: ["pattern1", "pattern2"] +safety: "optional/pattern" # optional, or array +compare_to: base # optional: "base" (default), "default_tip", or "prompt" +--- +Instructions for the agent when this rule fires. + +Multi-line markdown content is supported. +``` + +**Format for Set Mode (bidirectional):** +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +Modified: {trigger_files} +Expected: {expected_files} +``` + +**Format for Pair Mode (directional):** +```markdown +--- +name: API Documentation +pair: + trigger: api/{path}.py + expects: docs/api/{path}.md +--- +API code requires documentation updates. + +Changed API: {trigger_files} +Update docs: {expected_files} +``` + +### Step 7: Verify the Rule + +After creating the rule: + +1. **Check the YAML frontmatter** - Ensure valid YAML formatting +2. **Test trigger patterns** - Verify patterns match intended files +3. **Review instructions** - Ensure they're clear and actionable +4. 
**Check for conflicts** - Ensure the rule doesn't conflict with existing ones + +## Example Rules + +### Update Documentation on Config Changes +`.deepwork/rules/config-docs.md`: +```markdown +--- +name: Update Install Guide on Config Changes +trigger: app/config/**/* +safety: docs/install_guide.md +--- +Configuration files have been modified. Please review docs/install_guide.md +and update it if any installation instructions need to change based on the +new configuration. +``` + +### Security Review for Auth Code +`.deepwork/rules/security-review.md`: +```markdown +--- +name: Security Review for Authentication Changes +trigger: + - src/auth/**/* + - src/security/**/* +safety: + - SECURITY.md + - docs/security_audit.md +--- +Authentication or security code has been changed. Please: + +1. Review for hardcoded credentials or secrets +2. Check input validation on user inputs +3. Verify access control logic is correct +4. Update security documentation if needed +``` + +### Source/Test Pairing +`.deepwork/rules/source-test-pairing.md`: +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +When modifying source code, ensure corresponding tests are updated. +When adding tests, ensure they test actual source code. + +Modified: {trigger_files} +Expected: {expected_files} +``` + +### API Documentation Sync +`.deepwork/rules/api-docs.md`: +```markdown +--- +name: API Documentation Update +pair: + trigger: src/api/{path}.py + expects: docs/api/{path}.md +--- +API code has changed. Please verify that API documentation in docs/api/ +is up to date with the code changes. 
Pay special attention to: + +- New or changed endpoints +- Modified request/response schemas +- Updated authentication requirements + +Changed API: {trigger_files} +Update: {expected_files} +``` + +## Output Format + +### .deepwork/rules/{rule-name}.md +Create a new file with the rule definition using YAML frontmatter and markdown body. + +## Quality Criteria + +- Asked structured questions to understand user requirements +- Rule name is clear and descriptive (used in promise tags) +- Correct detection mode selected for the use case +- Patterns accurately match the intended files +- Safety patterns prevent unnecessary triggering (if applicable) +- Instructions are actionable and specific +- YAML frontmatter is valid + +## Context + +Rules are evaluated automatically when the agent finishes a task. The system: +1. Determines which files have changed based on each rule's `compare_to` setting +2. Evaluates rules based on their detection mode (trigger/safety, set, or pair) +3. Skips rules where the correspondence is satisfied (for set/pair) or safety matched +4. Prompts you with instructions for any triggered rules + +You can mark a rule as addressed by including `Rule Name` in your response (replace Rule Name with the actual rule name from the `name` field). This tells the system you've already handled that rule's requirements. + + +## Inputs + +### User Parameters + +Please gather the following information from the user: +- **rule_purpose**: What guideline or constraint should this rule enforce? + + +## Work Branch Management + +All work for this job should be done on a dedicated work branch: + +1. **Check current branch**: + - If already on a work branch for this job (format: `deepwork/deepwork_rules-[instance]-[date]`), continue using it + - If on main/master, create a new work branch + +2. 
**Create work branch** (if needed): + ```bash + git checkout -b deepwork/deepwork_rules-[instance]-$(date +%Y%m%d) + ``` + Replace `[instance]` with a descriptive identifier (e.g., `acme`, `q1-launch`, etc.) + +## Output Requirements + +Create the following output(s): +- `.deepwork/rules/{rule-name}.md` +Ensure all outputs are: +- Well-formatted and complete +- Ready for review or use by subsequent steps + +## Completion + +After completing this step: + +1. **Verify outputs**: Confirm all required files have been created + +2. **Inform the user**: + - The define command is complete + - Outputs created: .deepwork/rules/{rule-name}.md + - This command can be run again anytime to make further changes + +## Command Complete + +This is a standalone command that can be run anytime. The outputs are ready for use. + +Consider: +- Reviewing the outputs +- Running `deepwork sync` if job definitions were changed +- Re-running this command later if further changes are needed + +--- + +## Context Files + +- Job definition: `.deepwork/jobs/deepwork_rules/job.yml` +- Step instructions: `.deepwork/jobs/deepwork_rules/steps/define.md` \ No newline at end of file diff --git a/.claude/commands/update.job.md b/.claude/commands/update.job.md index 1d2af384..9698eecf 100644 --- a/.claude/commands/update.job.md +++ b/.claude/commands/update.job.md @@ -38,7 +38,7 @@ hooks: ## Job Overview A workflow for maintaining standard jobs bundled with DeepWork. Standard jobs -(like `deepwork_jobs` and `deepwork_policy`) are source-controlled in +(like `deepwork_jobs` and `deepwork_rules`) are source-controlled in `src/deepwork/standard_jobs/` and must be edited there—never in `.deepwork/jobs/` or `.claude/commands/` directly. @@ -82,7 +82,7 @@ Standard jobs exist in THREE locations, but only ONE is the source of truth: #### 1. 
Identify the Standard Job to Update From conversation context, determine: -- Which standard job needs updating (e.g., `deepwork_jobs`, `deepwork_policy`) +- Which standard job needs updating (e.g., `deepwork_jobs`, `deepwork_rules`) - What changes are needed (job.yml, step instructions, hooks, etc.) Current standard jobs: diff --git a/.claude/settings.json b/.claude/settings.json index 4b7a20e6..d2fd4875 100644 --- a/.claude/settings.json +++ b/.claude/settings.json @@ -97,7 +97,7 @@ "hooks": [ { "type": "command", - "command": ".deepwork/jobs/deepwork_policy/hooks/user_prompt_submit.sh" + "command": ".deepwork/jobs/deepwork_rules/hooks/user_prompt_submit.sh" } ] } @@ -108,7 +108,7 @@ "hooks": [ { "type": "command", - "command": ".deepwork/jobs/deepwork_policy/hooks/policy_stop_hook.sh" + "command": "python -m deepwork.hooks.rules_check" } ] } diff --git a/.deepwork.policy.yml b/.deepwork.policy.yml deleted file mode 100644 index f2721da3..00000000 --- a/.deepwork.policy.yml +++ /dev/null @@ -1,71 +0,0 @@ -- name: "README Accuracy" - trigger: "src/**/*" - safety: "README.md" - instructions: | - Source code in src/ has been modified. Please review README.md for accuracy: - 1. Verify project overview still reflects current functionality - 2. Check that usage examples are still correct - 3. Ensure installation/setup instructions remain valid - 4. Update any sections that reference changed code - -- name: "Architecture Documentation Accuracy" - trigger: "src/**/*" - safety: "doc/architecture.md" - instructions: | - Source code in src/ has been modified. Please review doc/architecture.md for accuracy: - 1. Verify the documented architecture matches the current implementation - 2. Check that file paths and directory structures are still correct - 3. Ensure component descriptions reflect actual behavior - 4. 
Update any diagrams or flows that may have changed - -- name: "Standard Jobs Source of Truth" - trigger: - - ".deepwork/jobs/deepwork_jobs/**/*" - - ".deepwork/jobs/deepwork_policy/**/*" - safety: - - "src/deepwork/standard_jobs/deepwork_jobs/**/*" - - "src/deepwork/standard_jobs/deepwork_policy/**/*" - instructions: | - You modified files in `.deepwork/jobs/deepwork_jobs/` or `.deepwork/jobs/deepwork_policy/`. - - **These are installed copies, NOT the source of truth!** - - Standard jobs (deepwork_jobs, deepwork_policy) must be edited in their source location: - - Source: `src/deepwork/standard_jobs/[job_name]/` - - Installed copy: `.deepwork/jobs/[job_name]/` (DO NOT edit directly) - - **Required action:** - 1. Revert your changes to `.deepwork/jobs/deepwork_*/` - 2. Make the same changes in `src/deepwork/standard_jobs/[job_name]/` - 3. Run `deepwork install --platform claude` to sync changes - 4. Verify the changes propagated correctly - - See CLAUDE.md section "CRITICAL: Editing Standard Jobs" for details. - -- name: "Version and Changelog Update" - trigger: "src/**/*" - safety: - - "pyproject.toml" - - "CHANGELOG.md" - instructions: | - Source code in src/ has been modified. **You MUST evaluate whether version and changelog updates are needed.** - - **Evaluate the changes:** - 1. Is this a bug fix, new feature, breaking change, or internal refactor? - 2. Does this change affect the public API or user-facing behavior? - 3. Would users need to know about this change when upgrading? - - **If version update is needed:** - 1. Update the `version` field in `pyproject.toml` following semantic versioning: - - PATCH (0.1.x): Bug fixes, minor internal changes - - MINOR (0.x.0): New features, non-breaking changes - - MAJOR (x.0.0): Breaking changes - 2. 
Add an entry to `CHANGELOG.md` under an appropriate version header: - - Use categories: Added, Changed, Fixed, Removed, Deprecated, Security - - Include a clear, user-facing description of what changed - - Follow the Keep a Changelog format - - **If NO version update is needed** (e.g., tests only, comments, internal refactoring with no behavior change): - - Explicitly state why no version bump is required - - **This policy requires explicit action** - either update both files or justify why no update is needed. \ No newline at end of file diff --git a/.deepwork/.gitignore b/.deepwork/.gitignore index eed09d08..0ef10e54 100644 --- a/.deepwork/.gitignore +++ b/.deepwork/.gitignore @@ -1,3 +1,3 @@ # DeepWork temporary files -# These files are used for policy evaluation during sessions +# These files are used for rules evaluation during sessions .last_work_tree diff --git a/.deepwork/jobs/add_platform/job.yml b/.deepwork/jobs/add_platform/job.yml index 07544743..cca6d637 100644 --- a/.deepwork/jobs/add_platform/job.yml +++ b/.deepwork/jobs/add_platform/job.yml @@ -130,7 +130,7 @@ steps: 2. Running `deepwork install --platform ` completes without errors 3. Expected command files are created in the platform's command directory 4. Command file content matches the templates and job definitions - 5. Established DeepWork jobs (deepwork_jobs, deepwork_policy) are installed correctly + 5. Established DeepWork jobs (deepwork_jobs, deepwork_rules) are installed correctly 6. The platform can be used alongside existing platforms without conflicts If ALL criteria are met, include `✓ Quality Criteria Met`. 
diff --git a/.deepwork/jobs/add_platform/steps/verify.md b/.deepwork/jobs/add_platform/steps/verify.md index c4d35ffc..f3afe15a 100644 --- a/.deepwork/jobs/add_platform/steps/verify.md +++ b/.deepwork/jobs/add_platform/steps/verify.md @@ -52,7 +52,7 @@ Ensure the implementation step is complete: - `deepwork_jobs.define.md` exists (or equivalent for the platform) - `deepwork_jobs.implement.md` exists - `deepwork_jobs.refine.md` exists - - `deepwork_policy.define.md` exists + - `deepwork_rules.define.md` exists - All expected step commands exist 4. **Validate command file content** @@ -82,7 +82,7 @@ Ensure the implementation step is complete: - `deepwork install --platform ` completes without errors - All expected command files are created: - deepwork_jobs.define, implement, refine - - deepwork_policy.define + - deepwork_rules.define - Any other standard job commands - Command file content is correct: - Matches platform's expected format diff --git a/.deepwork/jobs/deepwork_jobs/job.yml b/.deepwork/jobs/deepwork_jobs/job.yml index e1afa5ee..e95aa2c0 100644 --- a/.deepwork/jobs/deepwork_jobs/job.yml +++ b/.deepwork/jobs/deepwork_jobs/job.yml @@ -77,9 +77,9 @@ steps: 6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"? 7. **Sync Complete**: Has `deepwork sync` been run successfully? 8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`? - 9. **Policies Considered**: Have you thought about whether policies would benefit this job? - - If relevant policies were identified, did you explain them and offer to run `/deepwork_policy.define`? - - Not every job needs policies - only suggest when genuinely helpful. + 9. **Rules Considered**: Have you thought about whether rules would benefit this job? + - If relevant rules were identified, did you explain them and offer to run `/deepwork_rules.define`? + - Not every job needs rules - only suggest when genuinely helpful. 
If ANY criterion is not met, continue working to address it. If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response. diff --git a/.deepwork/jobs/deepwork_jobs/steps/implement.md b/.deepwork/jobs/deepwork_jobs/steps/implement.md index a3a790f6..7771eaee 100644 --- a/.deepwork/jobs/deepwork_jobs/steps/implement.md +++ b/.deepwork/jobs/deepwork_jobs/steps/implement.md @@ -130,19 +130,19 @@ This will: After running `deepwork sync`, look at the "To use the new commands" section in the output. **Relay these exact reload instructions to the user** so they know how to pick up the new commands. Don't just reference the sync output - tell them directly what they need to do (e.g., "Type 'exit' then run 'claude --resume'" for Claude Code, or "Run '/memory refresh'" for Gemini CLI). -### Step 7: Consider Policies for the New Job +### Step 7: Consider Rules for the New Job -After implementing the job, consider whether there are **policies** that would help enforce quality or consistency when working with this job's domain. +After implementing the job, consider whether there are **rules** that would help enforce quality or consistency when working with this job's domain. -**What are policies?** +**What are rules?** -Policies are automated guardrails defined in `.deepwork.policy.yml` that trigger when certain files change during an AI session. They help ensure: +Rules are automated guardrails stored as markdown files in `.deepwork/rules/` that trigger when certain files change during an AI session. They help ensure: - Documentation stays in sync with code - Team guidelines are followed - Architectural decisions are respected - Quality standards are maintained -**When to suggest policies:** +**When to suggest rules:** Think about the job you just implemented and ask: - Does this job produce outputs that other files depend on? 
@@ -150,28 +150,28 @@ Think about the job you just implemented and ask: - Are there quality checks or reviews that should happen when certain files in this domain change? - Could changes to the job's output files impact other parts of the project? -**Examples of policies that might make sense:** +**Examples of rules that might make sense:** -| Job Type | Potential Policy | -|----------|------------------| +| Job Type | Potential Rule | +|----------|----------------| | API Design | "Update API docs when endpoint definitions change" | | Database Schema | "Review migrations when schema files change" | | Competitive Research | "Update strategy docs when competitor analysis changes" | | Feature Development | "Update changelog when feature files change" | | Configuration Management | "Update install guide when config files change" | -**How to offer policy creation:** +**How to offer rule creation:** -If you identify one or more policies that would benefit the user, explain: -1. **What the policy would do** - What triggers it and what action it prompts +If you identify one or more rules that would benefit the user, explain: +1. **What the rule would do** - What triggers it and what action it prompts 2. **Why it would help** - How it prevents common mistakes or keeps things in sync 3. **What files it would watch** - The trigger patterns Then ask the user: -> "Would you like me to create this policy for you? I can run `/deepwork_policy.define` to set it up." +> "Would you like me to create this rule for you? I can run `/deepwork_rules.define` to set it up." -If the user agrees, invoke the `/deepwork_policy.define` command to guide them through creating the policy. +If the user agrees, invoke the `/deepwork_rules.define` command to guide them through creating the rule. 
**Example dialogue:** @@ -180,15 +180,15 @@ Based on the competitive_research job you just created, I noticed that when competitor analysis files change, it would be helpful to remind you to update your strategy documentation. -I'd suggest a policy like: +I'd suggest a rule like: - **Name**: "Update strategy when competitor analysis changes" - **Trigger**: `**/positioning_report.md` - **Action**: Prompt to review and update `docs/strategy.md` -Would you like me to create this policy? I can run `/deepwork_policy.define` to set it up. +Would you like me to create this rule? I can run `/deepwork_rules.define` to set it up. ``` -**Note:** Not every job needs policies. Only suggest them when they would genuinely help maintain consistency or quality. Don't force policies where they don't make sense. +**Note:** Not every job needs rules. Only suggest them when they would genuinely help maintain consistency or quality. Don't force rules where they don't make sense. ## Example Implementation @@ -222,8 +222,8 @@ Before marking this step complete, ensure: - [ ] `deepwork sync` executed successfully - [ ] Commands generated in platform directory - [ ] User informed to follow reload instructions from `deepwork sync` -- [ ] Considered whether policies would benefit this job (Step 7) -- [ ] If policies suggested, offered to run `/deepwork_policy.define` +- [ ] Considered whether rules would benefit this job (Step 7) +- [ ] If rules suggested, offered to run `/deepwork_rules.define` ## Quality Criteria @@ -235,4 +235,4 @@ Before marking this step complete, ensure: - Steps with user inputs explicitly use "ask structured questions" phrasing - Sync completed successfully - Commands available for use -- Thoughtfully considered relevant policies for the job domain +- Thoughtfully considered relevant rules for the job domain diff --git a/.deepwork/jobs/deepwork_policy/hooks/global_hooks.yml b/.deepwork/jobs/deepwork_policy/hooks/global_hooks.yml deleted file mode 100644 index 
0e024fc7..00000000
--- a/.deepwork/jobs/deepwork_policy/hooks/global_hooks.yml
+++ /dev/null
@@ -1,8 +0,0 @@
-# DeepWork Policy Hooks Configuration
-# Maps Claude Code lifecycle events to hook scripts
-
-UserPromptSubmit:
-  - user_prompt_submit.sh
-
-Stop:
-  - policy_stop_hook.sh
diff --git a/.deepwork/jobs/deepwork_policy/hooks/policy_stop_hook.sh b/.deepwork/jobs/deepwork_policy/hooks/policy_stop_hook.sh
deleted file mode 100755
index b12d456c..00000000
--- a/.deepwork/jobs/deepwork_policy/hooks/policy_stop_hook.sh
+++ /dev/null
@@ -1,56 +0,0 @@
-#!/bin/bash
-# policy_stop_hook.sh - Evaluates policies when the agent stops
-#
-# This script is called as a Claude Code Stop hook. It:
-# 1. Evaluates policies from .deepwork.policy.yml
-# 2. Computes changed files based on each policy's compare_to setting
-# 3. Checks for tags in the conversation transcript
-# 4. Returns JSON to block stop if policies need attention
-
-set -e
-
-# Check if policy file exists
-if [ ! -f .deepwork.policy.yml ]; then
-    # No policies defined, nothing to do
-    exit 0
-fi
-
-# Read the hook input JSON from stdin
-HOOK_INPUT=""
-if [ ! -t 0 ]; then
-    HOOK_INPUT=$(cat)
-fi
-
-# Extract transcript_path from the hook input JSON using jq
-# Claude Code passes: {"session_id": "...", "transcript_path": "...", ...}
-TRANSCRIPT_PATH=""
-if [ -n "${HOOK_INPUT}" ]; then
-    TRANSCRIPT_PATH=$(echo "${HOOK_INPUT}" | jq -r '.transcript_path // empty' 2>/dev/null || echo "")
-fi
-
-# Extract conversation text from the JSONL transcript
-# The transcript is JSONL format - each line is a JSON object
-# We need to extract the text content from assistant messages
-conversation_context=""
-if [ -n "${TRANSCRIPT_PATH}" ] && [ -f "${TRANSCRIPT_PATH}" ]; then
-    # Extract text content from all assistant messages in the transcript
-    # Each line is a JSON object; we extract .message.content[].text for assistant messages
-    conversation_context=$(cat "${TRANSCRIPT_PATH}" | \
-        grep -E '"role"\s*:\s*"assistant"' | \
-        jq -r '.message.content // [] | map(select(.type == "text")) | map(.text) | join("\n")' 2>/dev/null | \
-        tr -d '\0' || echo "")
-fi
-
-# Call the Python evaluator
-# The Python module handles:
-#   - Parsing the policy file
-#   - Computing changed files based on each policy's compare_to setting
-#   - Matching changed files against triggers/safety patterns
-#   - Checking for promise tags in the conversation context
-#   - Generating appropriate JSON output
-result=$(echo "${conversation_context}" | python -m deepwork.hooks.evaluate_policies \
-    --policy-file .deepwork.policy.yml \
-    2>/dev/null || echo '{}')
-
-# Output the result (JSON for Claude Code hooks)
-echo "${result}"
diff --git a/.deepwork/jobs/deepwork_policy/job.yml b/.deepwork/jobs/deepwork_policy/job.yml
deleted file mode 100644
index 777894ed..00000000
--- a/.deepwork/jobs/deepwork_policy/job.yml
+++ /dev/null
@@ -1,37 +0,0 @@
-name: deepwork_policy
-version: "0.2.0"
-summary: "Policy enforcement for AI agent sessions"
-description: |
-  Manages policies that automatically trigger when certain files change during an AI agent session.
- Policies help ensure that code changes follow team guidelines, documentation is updated, - and architectural decisions are respected. - - Policies are defined in a `.deepwork.policy.yml` file at the root of your project. Each policy - specifies: - - Trigger patterns: Glob patterns for files that, when changed, should trigger the policy - - Safety patterns: Glob patterns for files that, if also changed, mean the policy doesn't need to fire - - Instructions: What the agent should do when the policy triggers - - Example use cases: - - Update installation docs when configuration files change - - Require security review when authentication code is modified - - Ensure API documentation stays in sync with API code - - Remind developers to update changelogs - -changelog: - - version: "0.1.0" - changes: "Initial version" - - version: "0.2.0" - changes: "Standardized on 'ask structured questions' phrasing for user input" - -steps: - - id: define - name: "Define Policy" - description: "Create or update policy entries in .deepwork.policy.yml" - instructions_file: steps/define.md - inputs: - - name: policy_purpose - description: "What guideline or constraint should this policy enforce?" - outputs: - - .deepwork.policy.yml - dependencies: [] diff --git a/.deepwork/jobs/deepwork_policy/steps/define.md b/.deepwork/jobs/deepwork_policy/steps/define.md deleted file mode 100644 index 302eda7f..00000000 --- a/.deepwork/jobs/deepwork_policy/steps/define.md +++ /dev/null @@ -1,198 +0,0 @@ -# Define Policy - -## Objective - -Create or update policy entries in the `.deepwork.policy.yml` file to enforce team guidelines, documentation requirements, or other constraints when specific files change. - -## Task - -Guide the user through defining a new policy by asking structured questions. **Do not create the policy without first understanding what they want to enforce.** - -**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. 
This provides a better user experience with clear options and guided choices. - -### Step 1: Understand the Policy Purpose - -Start by asking structured questions to understand what the user wants to enforce: - -1. **What guideline or constraint should this policy enforce?** - - What situation triggers the need for action? - - What files or directories, when changed, should trigger this policy? - - Examples: "When config files change", "When API code changes", "When database schema changes" - -2. **What action should be taken?** - - What should the agent do when the policy triggers? - - Update documentation? Perform a security review? Update tests? - - Is there a specific file or process that needs attention? - -3. **Are there any "safety" conditions?** - - Are there files that, if also changed, mean the policy doesn't need to fire? - - For example: If config changes AND install_guide.md changes, assume docs are already updated - - This prevents redundant prompts when the user has already done the right thing - -### Step 2: Define the Trigger Patterns - -Help the user define glob patterns for files that should trigger the policy: - -**Common patterns:** -- `src/**/*.py` - All Python files in src directory (recursive) -- `app/config/**/*` - All files in app/config directory -- `*.md` - All markdown files in root -- `src/api/**/*` - All files in the API directory -- `migrations/**/*.sql` - All SQL migrations - -**Pattern syntax:** -- `*` - Matches any characters within a single path segment -- `**` - Matches any characters across multiple path segments (recursive) -- `?` - Matches a single character - -### Step 3: Define Safety Patterns (Optional) - -If there are files that, when also changed, mean the policy shouldn't fire: - -**Examples:** -- Policy: "Update install guide when config changes" - - Trigger: `app/config/**/*` - - Safety: `docs/install_guide.md` (if already updated, don't prompt) - -- Policy: "Security review for auth changes" - - Trigger: 
`src/auth/**/*` - - Safety: `SECURITY.md`, `docs/security_review.md` - -### Step 3b: Choose the Comparison Mode (Optional) - -The `compare_to` field controls what baseline is used when detecting "changed files": - -**Options:** -- `base` (default) - Compares to the base of the current branch (merge-base with main/master). This is the most common choice for feature branches, as it shows all changes made on the branch. -- `default_tip` - Compares to the current tip of the default branch (main/master). Useful when you want to see the difference from what's currently in production. -- `prompt` - Compares to the state at the start of each prompt. Useful for policies that should only fire based on changes made during a single agent response. - -**When to use each:** -- **base**: Best for most policies. "Did this branch change config files?" → trigger docs review -- **default_tip**: For policies about what's different from production/main -- **prompt**: For policies that should only consider very recent changes within the current session - -Most policies should use the default (`base`) and don't need to specify `compare_to`. - -### Step 4: Write the Instructions - -Create clear, actionable instructions for what the agent should do when the policy fires. - -**Good instructions include:** -- What to check or review -- What files might need updating -- Specific actions to take -- Quality criteria for completion - -**Example:** -``` -Configuration files have changed. Please: -1. Review docs/install_guide.md for accuracy -2. Update any installation steps that reference changed config -3. Verify environment variable documentation is current -4. Test that installation instructions still work -``` - -### Step 5: Create the Policy Entry - -Create or update `.deepwork.policy.yml` in the project root. 
- -**File Location**: `.deepwork.policy.yml` (root of project) - -**Format**: -```yaml -- name: "[Friendly name for the policy]" - trigger: "[glob pattern]" # or array: ["pattern1", "pattern2"] - safety: "[glob pattern]" # optional, or array - compare_to: "base" # optional: "base" (default), "default_tip", or "prompt" - instructions: | - [Multi-line instructions for the agent...] -``` - -**Alternative with instructions_file**: -```yaml -- name: "[Friendly name for the policy]" - trigger: "[glob pattern]" - safety: "[glob pattern]" - compare_to: "base" # optional - instructions_file: "path/to/instructions.md" -``` - -### Step 6: Verify the Policy - -After creating the policy: - -1. **Check the YAML syntax** - Ensure valid YAML formatting -2. **Test trigger patterns** - Verify patterns match intended files -3. **Review instructions** - Ensure they're clear and actionable -4. **Check for conflicts** - Ensure the policy doesn't conflict with existing ones - -## Example Policies - -### Update Documentation on Config Changes -```yaml -- name: "Update install guide on config changes" - trigger: "app/config/**/*" - safety: "docs/install_guide.md" - instructions: | - Configuration files have been modified. Please review docs/install_guide.md - and update it if any installation instructions need to change based on the - new configuration. -``` - -### Security Review for Auth Code -```yaml -- name: "Security review for authentication changes" - trigger: - - "src/auth/**/*" - - "src/security/**/*" - safety: - - "SECURITY.md" - - "docs/security_audit.md" - instructions: | - Authentication or security code has been changed. Please: - 1. Review for hardcoded credentials or secrets - 2. Check input validation on user inputs - 3. Verify access control logic is correct - 4. 
Update security documentation if needed -``` - -### API Documentation Sync -```yaml -- name: "API documentation update" - trigger: "src/api/**/*.py" - safety: "docs/api/**/*.md" - instructions: | - API code has changed. Please verify that API documentation in docs/api/ - is up to date with the code changes. Pay special attention to: - - New or changed endpoints - - Modified request/response schemas - - Updated authentication requirements -``` - -## Output Format - -### .deepwork.policy.yml -Create or update this file at the project root with the new policy entry. - -## Quality Criteria - -- Asked structured questions to understand user requirements -- Policy name is clear and descriptive -- Trigger patterns accurately match the intended files -- Safety patterns prevent unnecessary triggering -- Instructions are actionable and specific -- YAML is valid and properly formatted - -## Context - -Policies are evaluated automatically when you finish working on a task. The system: -1. Determines which files have changed based on each policy's `compare_to` setting: - - `base` (default): Files changed since the branch diverged from main/master - - `default_tip`: Files different from the current main/master branch - - `prompt`: Files changed since the last prompt submission -2. Checks if any changes match policy trigger patterns -3. Skips policies where safety patterns also matched -4. Prompts you with instructions for any triggered policies - -You can mark a policy as addressed by including `✓ Policy Name` in your response (replace Policy Name with the actual policy name). This tells the system you've already handled that policy's requirements. 
diff --git a/.deepwork/jobs/deepwork_policy/hooks/capture_prompt_work_tree.sh b/.deepwork/jobs/deepwork_rules/hooks/capture_prompt_work_tree.sh
similarity index 100%
rename from .deepwork/jobs/deepwork_policy/hooks/capture_prompt_work_tree.sh
rename to .deepwork/jobs/deepwork_rules/hooks/capture_prompt_work_tree.sh
diff --git a/.deepwork/jobs/deepwork_rules/hooks/global_hooks.yml b/.deepwork/jobs/deepwork_rules/hooks/global_hooks.yml
new file mode 100644
index 00000000..a310d31a
--- /dev/null
+++ b/.deepwork/jobs/deepwork_rules/hooks/global_hooks.yml
@@ -0,0 +1,8 @@
+# DeepWork Rules Hooks Configuration
+# Maps lifecycle events to hook scripts or Python modules
+
+UserPromptSubmit:
+  - user_prompt_submit.sh
+
+Stop:
+  - module: deepwork.hooks.rules_check
diff --git a/.deepwork/jobs/deepwork_policy/hooks/user_prompt_submit.sh b/.deepwork/jobs/deepwork_rules/hooks/user_prompt_submit.sh
similarity index 100%
rename from .deepwork/jobs/deepwork_policy/hooks/user_prompt_submit.sh
rename to .deepwork/jobs/deepwork_rules/hooks/user_prompt_submit.sh
diff --git a/.deepwork/jobs/deepwork_rules/job.yml b/.deepwork/jobs/deepwork_rules/job.yml
new file mode 100644
index 00000000..af540bc4
--- /dev/null
+++ b/.deepwork/jobs/deepwork_rules/job.yml
@@ -0,0 +1,39 @@
+name: deepwork_rules
+version: "0.3.0"
+summary: "Rules enforcement for AI agent sessions"
+description: |
+  Manages rules that automatically trigger when certain files change during an AI agent session.
+  Rules help ensure that code changes follow team guidelines, documentation is updated,
+  and architectural decisions are respected.
+
+  Rules are stored as individual markdown files with YAML frontmatter in the `.deepwork/rules/`
+  directory. Each rule file specifies:
+  - Detection mode: trigger/safety, set (bidirectional), or pair (directional)
+  - Patterns: Glob patterns for matching files, with optional variable capture
+  - Instructions: Markdown content describing what the agent should do
+
+  Example use cases:
+  - Update installation docs when configuration files change
+  - Require security review when authentication code is modified
+  - Ensure API documentation stays in sync with API code
+  - Enforce source/test file pairing
+
+changelog:
+  - version: "0.1.0"
+    changes: "Initial version"
+  - version: "0.2.0"
+    changes: "Standardized on 'ask structured questions' phrasing for user input"
+  - version: "0.3.0"
+    changes: "Migrated to v2 format - individual markdown files in .deepwork/rules/"
+
+steps:
+  - id: define
+    name: "Define Rule"
+    description: "Create a new rule file in .deepwork/rules/"
+    instructions_file: steps/define.md
+    inputs:
+      - name: rule_purpose
+        description: "What guideline or constraint should this rule enforce?"
+    outputs:
+      - .deepwork/rules/{rule-name}.md
+    dependencies: []
diff --git a/.deepwork/jobs/deepwork_rules/rules/.gitkeep b/.deepwork/jobs/deepwork_rules/rules/.gitkeep
new file mode 100644
index 00000000..429162b4
--- /dev/null
+++ b/.deepwork/jobs/deepwork_rules/rules/.gitkeep
@@ -0,0 +1,13 @@
+# This directory contains example rule templates.
+# Copy and customize these files to create your own rules.
+#
+# Rule files use YAML frontmatter in markdown format:
+#
+# ---
+# name: Rule Name
+# trigger: "pattern/**/*"
+# safety: "optional/pattern"
+# ---
+# Instructions in markdown here.
+#
+# See doc/rules_syntax.md for full documentation.
diff --git a/.deepwork/jobs/deepwork_rules/rules/api-documentation-sync.md.example b/.deepwork/jobs/deepwork_rules/rules/api-documentation-sync.md.example
new file mode 100644
index 00000000..427da7ae
--- /dev/null
+++ b/.deepwork/jobs/deepwork_rules/rules/api-documentation-sync.md.example
@@ -0,0 +1,10 @@
+---
+name: API Documentation Sync
+trigger: src/api/**/*
+safety: docs/api/**/*.md
+---
+API code has changed. Please verify that API documentation is up to date:
+
+- New or changed endpoints
+- Modified request/response schemas
+- Updated authentication requirements
diff --git a/.deepwork/jobs/deepwork_rules/rules/readme-documentation.md.example b/.deepwork/jobs/deepwork_rules/rules/readme-documentation.md.example
new file mode 100644
index 00000000..6be90c83
--- /dev/null
+++ b/.deepwork/jobs/deepwork_rules/rules/readme-documentation.md.example
@@ -0,0 +1,10 @@
+---
+name: README Documentation
+trigger: src/**/*
+safety: README.md
+---
+Source code has been modified. Please review README.md for accuracy:
+
+1. Verify the project overview reflects current functionality
+2. Check that usage examples are still correct
+3. Ensure installation/setup instructions remain valid
diff --git a/.deepwork/jobs/deepwork_rules/rules/security-review.md.example b/.deepwork/jobs/deepwork_rules/rules/security-review.md.example
new file mode 100644
index 00000000..abce3194
--- /dev/null
+++ b/.deepwork/jobs/deepwork_rules/rules/security-review.md.example
@@ -0,0 +1,11 @@
+---
+name: Security Review for Auth Changes
+trigger:
+  - src/auth/**/*
+  - src/security/**/*
+---
+Authentication or security code has been changed. Please:
+
+1. Review for hardcoded credentials or secrets
+2. Check input validation on user inputs
+3. Verify access control logic is correct
diff --git a/.deepwork/jobs/deepwork_rules/rules/source-test-pairing.md.example b/.deepwork/jobs/deepwork_rules/rules/source-test-pairing.md.example
new file mode 100644
index 00000000..3ebd6968
--- /dev/null
+++ b/.deepwork/jobs/deepwork_rules/rules/source-test-pairing.md.example
@@ -0,0 +1,13 @@
+---
+name: Source/Test Pairing
+set:
+  - src/{path}.py
+  - tests/{path}_test.py
+---
+Source and test files should change together.
+
+When modifying source code, ensure corresponding tests are updated.
+When adding tests, ensure they test actual source code.
+
+Modified source: {trigger_files}
+Expected tests: {expected_files}
diff --git a/.deepwork/jobs/deepwork_rules/steps/define.md b/.deepwork/jobs/deepwork_rules/steps/define.md
new file mode 100644
index 00000000..1e38a5e6
--- /dev/null
+++ b/.deepwork/jobs/deepwork_rules/steps/define.md
@@ -0,0 +1,249 @@
+# Define Rule
+
+## Objective
+
+Create a new rule file in the `.deepwork/rules/` directory to enforce team guidelines, documentation requirements, or other constraints when specific files change.
+
+## Task
+
+Guide the user through defining a new rule by asking structured questions. **Do not create the rule without first understanding what they want to enforce.**
+
+**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices.
+
+### Step 1: Understand the Rule Purpose
+
+Start by asking structured questions to understand what the user wants to enforce:
+
+1. **What guideline or constraint should this rule enforce?**
+   - What situation triggers the need for action?
+   - What files or directories, when changed, should trigger this rule?
+   - Examples: "When config files change", "When API code changes", "When database schema changes"
+
+2. **What action should be taken?**
+   - What should the agent do when the rule triggers?
+   - Update documentation? Perform a security review? Update tests?
+   - Is there a specific file or process that needs attention?
+
+3. **Are there any "safety" conditions?**
+   - Are there files that, if also changed, mean the rule doesn't need to fire?
+   - For example: If config changes AND install_guide.md changes, assume docs are already updated
+   - This prevents redundant prompts when the user has already done the right thing
+
+### Step 2: Choose the Detection Mode
+
+Help the user select the appropriate detection mode:
+
+**Trigger/Safety Mode** (most common):
+- Fires when trigger patterns match AND no safety patterns match
+- Use for: "When X changes, check Y" rules
+- Example: When config changes, verify install docs
+
+**Set Mode** (bidirectional correspondence):
+- Fires when files that should change together don't all change
+- Use for: Source/test pairing, model/migration sync
+- Example: `src/foo.py` and `tests/foo_test.py` should change together
+
+**Pair Mode** (directional correspondence):
+- Fires when a trigger file changes but expected files don't
+- Changes to expected files alone do NOT trigger
+- Use for: API code requires documentation updates (but docs can update independently)
+
+### Step 3: Define the Patterns
+
+Help the user define glob patterns for files.
+
+**Common patterns:**
+- `src/**/*.py` - All Python files in src directory (recursive)
+- `app/config/**/*` - All files in app/config directory
+- `*.md` - All markdown files in root
+- `src/api/**/*` - All files in the API directory
+- `migrations/**/*.sql` - All SQL migrations
+
+**Variable patterns (for set/pair modes):**
+- `src/{path}.py` - Captures path variable (e.g., `foo/bar` from `src/foo/bar.py`)
+- `tests/{path}_test.py` - Uses same path variable in corresponding file
+- `{name}` matches single segment, `{path}` matches multiple segments
+
+**Pattern syntax:**
+- `*` - Matches any characters within a single path segment
+- `**` - Matches any characters across multiple path segments (recursive)
+- `?` - Matches a single character
+
+### Step 4: Choose the Comparison Mode (Optional)
+
+The `compare_to` field controls what baseline is used when detecting "changed files":
+
+**Options:**
+- `base` (default) - Compares to the base of the current branch (merge-base with main/master). Best for feature branches.
+- `default_tip` - Compares to the current tip of the default branch. Useful for seeing difference from production.
+- `prompt` - Compares to the state at the start of each prompt. For rules about very recent changes.
+
+Most rules should use the default (`base`) and don't need to specify `compare_to`.
+
+### Step 5: Write the Instructions
+
+Create clear, actionable instructions for what the agent should do when the rule fires.
+ +**Good instructions include:** +- What to check or review +- What files might need updating +- Specific actions to take +- Quality criteria for completion + +**Template variables available in instructions:** +- `{trigger_files}` - Files that triggered the rule +- `{expected_files}` - Expected corresponding files (for set/pair modes) + +### Step 6: Create the Rule File + +Create a new file in `.deepwork/rules/` with a kebab-case filename: + +**File Location**: `.deepwork/rules/{rule-name}.md` + +**Format for Trigger/Safety Mode:** +```markdown +--- +name: Friendly Name for the Rule +trigger: "glob/pattern/**/*" # or array: ["pattern1", "pattern2"] +safety: "optional/pattern" # optional, or array +compare_to: base # optional: "base" (default), "default_tip", or "prompt" +--- +Instructions for the agent when this rule fires. + +Multi-line markdown content is supported. +``` + +**Format for Set Mode (bidirectional):** +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +Modified: {trigger_files} +Expected: {expected_files} +``` + +**Format for Pair Mode (directional):** +```markdown +--- +name: API Documentation +pair: + trigger: api/{path}.py + expects: docs/api/{path}.md +--- +API code requires documentation updates. + +Changed API: {trigger_files} +Update docs: {expected_files} +``` + +### Step 7: Verify the Rule + +After creating the rule: + +1. **Check the YAML frontmatter** - Ensure valid YAML formatting +2. **Test trigger patterns** - Verify patterns match intended files +3. **Review instructions** - Ensure they're clear and actionable +4. 
**Check for conflicts** - Ensure the rule doesn't conflict with existing ones + +## Example Rules + +### Update Documentation on Config Changes +`.deepwork/rules/config-docs.md`: +```markdown +--- +name: Update Install Guide on Config Changes +trigger: app/config/**/* +safety: docs/install_guide.md +--- +Configuration files have been modified. Please review docs/install_guide.md +and update it if any installation instructions need to change based on the +new configuration. +``` + +### Security Review for Auth Code +`.deepwork/rules/security-review.md`: +```markdown +--- +name: Security Review for Authentication Changes +trigger: + - src/auth/**/* + - src/security/**/* +safety: + - SECURITY.md + - docs/security_audit.md +--- +Authentication or security code has been changed. Please: + +1. Review for hardcoded credentials or secrets +2. Check input validation on user inputs +3. Verify access control logic is correct +4. Update security documentation if needed +``` + +### Source/Test Pairing +`.deepwork/rules/source-test-pairing.md`: +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +When modifying source code, ensure corresponding tests are updated. +When adding tests, ensure they test actual source code. + +Modified: {trigger_files} +Expected: {expected_files} +``` + +### API Documentation Sync +`.deepwork/rules/api-docs.md`: +```markdown +--- +name: API Documentation Update +pair: + trigger: src/api/{path}.py + expects: docs/api/{path}.md +--- +API code has changed. Please verify that API documentation in docs/api/ +is up to date with the code changes. 
Pay special attention to: + +- New or changed endpoints +- Modified request/response schemas +- Updated authentication requirements + +Changed API: {trigger_files} +Update: {expected_files} +``` + +## Output Format + +### .deepwork/rules/{rule-name}.md +Create a new file with the rule definition using YAML frontmatter and markdown body. + +## Quality Criteria + +- Asked structured questions to understand user requirements +- Rule name is clear and descriptive (used in promise tags) +- Correct detection mode selected for the use case +- Patterns accurately match the intended files +- Safety patterns prevent unnecessary triggering (if applicable) +- Instructions are actionable and specific +- YAML frontmatter is valid + +## Context + +Rules are evaluated automatically when the agent finishes a task. The system: +1. Determines which files have changed based on each rule's `compare_to` setting +2. Evaluates rules based on their detection mode (trigger/safety, set, or pair) +3. Skips rules where the correspondence is satisfied (for set/pair) or safety matched +4. Prompts you with instructions for any triggered rules + +You can mark a rule as addressed by including `Rule Name` in your response (replace Rule Name with the actual rule name from the `name` field). This tells the system you've already handled that rule's requirements. diff --git a/.deepwork/jobs/update/job.yml b/.deepwork/jobs/update/job.yml index 0c6e2b6e..4f8ab339 100644 --- a/.deepwork/jobs/update/job.yml +++ b/.deepwork/jobs/update/job.yml @@ -3,7 +3,7 @@ version: "1.1.0" summary: "Update standard jobs in src/ and sync to installed locations" description: | A workflow for maintaining standard jobs bundled with DeepWork. Standard jobs - (like `deepwork_jobs` and `deepwork_policy`) are source-controlled in + (like `deepwork_jobs` and `deepwork_rules`) are source-controlled in `src/deepwork/standard_jobs/` and must be edited there—never in `.deepwork/jobs/` or `.claude/commands/` directly. 
diff --git a/.deepwork/jobs/update/steps/job.md b/.deepwork/jobs/update/steps/job.md
index 0c7f70ab..b226b4f6 100644
--- a/.deepwork/jobs/update/steps/job.md
+++ b/.deepwork/jobs/update/steps/job.md
@@ -25,7 +25,7 @@ Standard jobs exist in THREE locations, but only ONE is the source of truth:
 #### 1. Identify the Standard Job to Update
 
 From conversation context, determine:
-- Which standard job needs updating (e.g., `deepwork_jobs`, `deepwork_policy`)
+- Which standard job needs updating (e.g., `deepwork_jobs`, `deepwork_rules`)
 - What changes are needed (job.yml, step instructions, hooks, etc.)
 
 Current standard jobs:
diff --git a/.deepwork/rules/architecture-documentation-accuracy.md b/.deepwork/rules/architecture-documentation-accuracy.md
new file mode 100644
index 00000000..42f74f88
--- /dev/null
+++ b/.deepwork/rules/architecture-documentation-accuracy.md
@@ -0,0 +1,10 @@
+---
+name: Architecture Documentation Accuracy
+trigger: src/**/*
+safety: doc/architecture.md
+---
+Source code in src/ has been modified. Please review doc/architecture.md for accuracy:
+1. Verify the documented architecture matches the current implementation
+2. Check that file paths and directory structures are still correct
+3. Ensure component descriptions reflect actual behavior
+4. Update any diagrams or flows that may have changed
diff --git a/.deepwork/rules/manual-test-command-action.md b/.deepwork/rules/manual-test-command-action.md
new file mode 100644
index 00000000..966ab2de
--- /dev/null
+++ b/.deepwork/rules/manual-test-command-action.md
@@ -0,0 +1,19 @@
+---
+name: "Manual Test: Command Action"
+trigger: manual_tests/test_command_action/test_command_action.txt
+action:
+  command: echo "$(date '+%Y-%m-%d %H:%M:%S') - Command triggered by edit to {file}" >> manual_tests/test_command_action/test_command_action_log.txt
+  run_for: each_match
+compare_to: prompt
+---
+
+# Manual Test: Command Action
+
+This rule automatically appends a timestamped log entry when the
+test file is edited. No agent prompt is shown - the command runs
+automatically.
+
+## This tests:
+
+The command action feature where rules can execute shell commands
+instead of prompting the agent. The command should be idempotent.
diff --git a/.deepwork/rules/manual-test-multi-safety.md b/.deepwork/rules/manual-test-multi-safety.md
new file mode 100644
index 00000000..4ce978cb
--- /dev/null
+++ b/.deepwork/rules/manual-test-multi-safety.md
@@ -0,0 +1,25 @@
+---
+name: "Manual Test: Multi Safety"
+trigger: manual_tests/test_multi_safety/test_multi_safety.py
+safety:
+  - manual_tests/test_multi_safety/test_multi_safety_changelog.md
+  - manual_tests/test_multi_safety/test_multi_safety_version.txt
+compare_to: prompt
+---
+
+# Manual Test: Multiple Safety Patterns
+
+You changed the source file without updating version info!
+
+**Changed:** `{trigger_files}`
+
+## What to do:
+
+1. Update the changelog: `manual_tests/test_multi_safety/test_multi_safety_changelog.md`
+2. And/or update the version: `manual_tests/test_multi_safety/test_multi_safety_version.txt`
+3. Or acknowledge with `Manual Test: Multi Safety`
+
+## This tests:
+
+Trigger/safety mode with MULTIPLE safety patterns. The rule is
+suppressed if ANY of the safety files are also edited.
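Reviewer note: the trigger/safety behavior that the multi-safety test above describes can be sketched in Python roughly as follows. This is a minimal sketch, not DeepWork's implementation: the helper name `rule_fires` is hypothetical, changed files are assumed to arrive as repo-relative paths, and `fnmatch` is a stand-in matcher whose `*` also crosses `/`, unlike the glob syntax the rules document.

```python
from fnmatch import fnmatchcase


def rule_fires(changed_files, triggers, safeties=()):
    """Hypothetical helper: True if any trigger matches and no safety matches."""

    def any_match(patterns):
        return any(
            fnmatchcase(path, pattern)
            for path in changed_files
            for pattern in patterns
        )

    # Fire only when a trigger file changed and none of the safety files did.
    return any_match(triggers) and not any_match(safeties)


# Paths taken from the manual-test-multi-safety rule above.
base = "manual_tests/test_multi_safety/"
triggers = [base + "test_multi_safety.py"]
safeties = [
    base + "test_multi_safety_changelog.md",
    base + "test_multi_safety_version.txt",
]

# Source edited alone: the rule should fire.
fires_alone = rule_fires([base + "test_multi_safety.py"], triggers, safeties)

# Source edited together with ANY one safety file: the rule is suppressed.
fires_with_changelog = rule_fires(
    [base + "test_multi_safety.py", base + "test_multi_safety_changelog.md"],
    triggers,
    safeties,
)
```

Under these assumptions `fires_alone` is true and `fires_with_changelog` is false, which mirrors the "suppressed if ANY of the safety files are also edited" wording.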
diff --git a/.deepwork/rules/manual-test-pair-mode.md b/.deepwork/rules/manual-test-pair-mode.md
new file mode 100644
index 00000000..9c2379bf
--- /dev/null
+++ b/.deepwork/rules/manual-test-pair-mode.md
@@ -0,0 +1,26 @@
+---
+name: "Manual Test: Pair Mode"
+pair:
+  trigger: manual_tests/test_pair_mode/test_pair_mode_trigger.py
+  expects: manual_tests/test_pair_mode/test_pair_mode_expected.md
+compare_to: prompt
+---
+
+# Manual Test: Pair Mode (Directional Correspondence)
+
+API code changed without documentation update!
+
+**Changed:** `{trigger_files}`
+**Expected:** `{expected_files}`
+
+## What to do:
+
+1. Update the API documentation in `test_pair_mode_expected.md`
+2. Or acknowledge with `Manual Test: Pair Mode`
+
+## This tests:
+
+The "pair" detection mode where there's a ONE-WAY relationship.
+When the trigger file changes, the expected file must also change.
+BUT the expected file can change independently (docs can be updated
+without requiring code changes).
diff --git a/.deepwork/rules/manual-test-set-mode.md b/.deepwork/rules/manual-test-set-mode.md
new file mode 100644
index 00000000..abe504ec
--- /dev/null
+++ b/.deepwork/rules/manual-test-set-mode.md
@@ -0,0 +1,26 @@
+---
+name: "Manual Test: Set Mode"
+set:
+  - manual_tests/test_set_mode/test_set_mode_source.py
+  - manual_tests/test_set_mode/test_set_mode_test.py
+compare_to: prompt
+---
+
+# Manual Test: Set Mode (Bidirectional Correspondence)
+
+Source and test files must change together!
+
+**Changed:** `{trigger_files}`
+**Missing:** `{expected_files}`
+
+## What to do:
+
+1. If you changed the source file, update the corresponding test file
+2. If you changed the test file, ensure the source file reflects those changes
+3. Or acknowledge with `Manual Test: Set Mode`
+
+## This tests:
+
+The "set" detection mode where files in a set must ALL change together.
+This is bidirectional - the rule fires regardless of which file in the set
+was edited first.
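Reviewer note: the difference between the pair and set modes tested above can be illustrated with a short Python sketch. It assumes exact file paths only (no `{path}` or `{name}` variable expansion), and the function names `set_mode_missing` and `pair_mode_missing` are hypothetical, chosen for illustration rather than taken from DeepWork.

```python
def set_mode_missing(changed_files, members):
    """Set mode: if only some members changed, return the ones that did not."""
    changed = set(changed_files)
    touched = [m for m in members if m in changed]
    if not touched or len(touched) == len(members):
        return []  # nothing in the set changed, or everything changed together
    return [m for m in members if m not in changed]


def pair_mode_missing(changed_files, trigger, expects):
    """Pair mode: only a changed trigger makes the expected files mandatory."""
    changed = set(changed_files)
    if trigger not in changed:
        return []  # expected files may change on their own
    return [e for e in expects if e not in changed]


# Paths taken from the two manual-test rules above.
source = "manual_tests/test_set_mode/test_set_mode_source.py"
test = "manual_tests/test_set_mode/test_set_mode_test.py"
pair_trigger = "manual_tests/test_pair_mode/test_pair_mode_trigger.py"
pair_expected = "manual_tests/test_pair_mode/test_pair_mode_expected.md"

# Set mode is bidirectional: editing either file alone reports the other.
set_source_only = set_mode_missing([source], [source, test])
set_test_only = set_mode_missing([test], [source, test])

# Pair mode is directional: only a trigger edit reports the expected file.
pair_trigger_only = pair_mode_missing([pair_trigger], pair_trigger, [pair_expected])
pair_docs_only = pair_mode_missing([pair_expected], pair_trigger, [pair_expected])
```

A non-empty result corresponds to the rule firing, with the returned paths playing the role of `{expected_files}`. Under these assumptions only `pair_docs_only` comes back empty.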
diff --git a/.deepwork/rules/manual-test-trigger-safety.md b/.deepwork/rules/manual-test-trigger-safety.md
new file mode 100644
index 00000000..b144a2a0
--- /dev/null
+++ b/.deepwork/rules/manual-test-trigger-safety.md
@@ -0,0 +1,21 @@
+---
+name: "Manual Test: Trigger Safety"
+trigger: manual_tests/test_trigger_safety_mode/test_trigger_safety_mode.py
+safety: manual_tests/test_trigger_safety_mode/test_trigger_safety_mode_doc.md
+compare_to: prompt
+---
+
+# Manual Test: Trigger/Safety Mode
+
+You edited `{trigger_files}` without updating the documentation.
+
+## What to do:
+
+1. Review the changes in the source file
+2. Update `manual_tests/test_trigger_safety_mode/test_trigger_safety_mode_doc.md` to reflect changes
+3. Or acknowledge this is intentional with `Manual Test: Trigger Safety`
+
+## This tests:
+
+The basic trigger/safety detection mode where editing the trigger file
+causes the rule to fire UNLESS the safety file is also edited.
diff --git a/.deepwork/rules/readme-accuracy.md b/.deepwork/rules/readme-accuracy.md
new file mode 100644
index 00000000..8284142b
--- /dev/null
+++ b/.deepwork/rules/readme-accuracy.md
@@ -0,0 +1,10 @@
+---
+name: README Accuracy
+trigger: src/**/*
+safety: README.md
+---
+Source code in src/ has been modified. Please review README.md for accuracy:
+1. Verify project overview still reflects current functionality
+2. Check that usage examples are still correct
+3. Ensure installation/setup instructions remain valid
+4. Update any sections that reference changed code
diff --git a/.deepwork/rules/standard-jobs-source-of-truth.md b/.deepwork/rules/standard-jobs-source-of-truth.md
new file mode 100644
index 00000000..3698489d
--- /dev/null
+++ b/.deepwork/rules/standard-jobs-source-of-truth.md
@@ -0,0 +1,24 @@
+---
+name: Standard Jobs Source of Truth
+trigger:
+  - .deepwork/jobs/deepwork_jobs/**/*
+  - .deepwork/jobs/deepwork_rules/**/*
+safety:
+  - src/deepwork/standard_jobs/deepwork_jobs/**/*
+  - src/deepwork/standard_jobs/deepwork_rules/**/*
+---
+You modified files in `.deepwork/jobs/deepwork_jobs/` or `.deepwork/jobs/deepwork_rules/`.
+
+**These are installed copies, NOT the source of truth!**
+
+Standard jobs (deepwork_jobs, deepwork_rules) must be edited in their source location:
+- Source: `src/deepwork/standard_jobs/[job_name]/`
+- Installed copy: `.deepwork/jobs/[job_name]/` (DO NOT edit directly)
+
+**Required action:**
+1. Revert your changes to `.deepwork/jobs/deepwork_*/`
+2. Make the same changes in `src/deepwork/standard_jobs/[job_name]/`
+3. Run `deepwork install --platform claude` to sync changes
+4. Verify the changes propagated correctly
+
+See CLAUDE.md section "CRITICAL: Editing Standard Jobs" for details.
diff --git a/.deepwork/rules/version-and-changelog-update.md b/.deepwork/rules/version-and-changelog-update.md
new file mode 100644
index 00000000..58e35088
--- /dev/null
+++ b/.deepwork/rules/version-and-changelog-update.md
@@ -0,0 +1,28 @@
+---
+name: Version and Changelog Update
+trigger: src/**/*
+safety:
+  - pyproject.toml
+  - CHANGELOG.md
+---
+Source code in src/ has been modified. **You MUST evaluate whether version and changelog updates are needed.**
+
+**Evaluate the changes:**
+1. Is this a bug fix, new feature, breaking change, or internal refactor?
+2. Does this change affect the public API or user-facing behavior?
+3. Would users need to know about this change when upgrading?
+
+**If version update is needed:**
+1. Update the `version` field in `pyproject.toml` following semantic versioning:
+   - PATCH (0.1.x): Bug fixes, minor internal changes
+   - MINOR (0.x.0): New features, non-breaking changes
+   - MAJOR (x.0.0): Breaking changes
+2. Add an entry to `CHANGELOG.md` under an appropriate version header:
+   - Use categories: Added, Changed, Fixed, Removed, Deprecated, Security
+   - Include a clear, user-facing description of what changed
+   - Follow the Keep a Changelog format
+
+**If NO version update is needed** (e.g., tests only, comments, internal refactoring with no behavior change):
+- Explicitly state why no version bump is required
+
+**This rule requires explicit action** - either update both files or justify why no update is needed.
diff --git a/.gemini/commands/add_platform/verify.toml b/.gemini/commands/add_platform/verify.toml
index 1ee56ab8..acfd9671 100644
--- a/.gemini/commands/add_platform/verify.toml
+++ b/.gemini/commands/add_platform/verify.toml
@@ -96,7 +96,7 @@ Ensure the implementation step is complete:
    - `deepwork_jobs.define.md` exists (or equivalent for the platform)
    - `deepwork_jobs.implement.md` exists
    - `deepwork_jobs.refine.md` exists
-   - `deepwork_policy.define.md` exists
+   - `deepwork_rules.define.md` exists
    - All expected step commands exist
 
 4. 
**Validate command file content** @@ -126,7 +126,7 @@ Ensure the implementation step is complete: - `deepwork install --platform ` completes without errors - All expected command files are created: - deepwork_jobs.define, implement, refine - - deepwork_policy.define + - deepwork_rules.define - Any other standard job commands - Command file content is correct: - Matches platform's expected format diff --git a/.gemini/commands/deepwork_jobs/implement.toml b/.gemini/commands/deepwork_jobs/implement.toml index 3e922243..4cc5a989 100644 --- a/.gemini/commands/deepwork_jobs/implement.toml +++ b/.gemini/commands/deepwork_jobs/implement.toml @@ -168,19 +168,19 @@ This will: After running `deepwork sync`, look at the "To use the new commands" section in the output. **Relay these exact reload instructions to the user** so they know how to pick up the new commands. Don't just reference the sync output - tell them directly what they need to do (e.g., "Type 'exit' then run 'claude --resume'" for Claude Code, or "Run '/memory refresh'" for Gemini CLI). -### Step 7: Consider Policies for the New Job +### Step 7: Consider Rules for the New Job -After implementing the job, consider whether there are **policies** that would help enforce quality or consistency when working with this job's domain. +After implementing the job, consider whether there are **rules** that would help enforce quality or consistency when working with this job's domain. -**What are policies?** +**What are rules?** -Policies are automated guardrails defined in `.deepwork.policy.yml` that trigger when certain files change during an AI session. They help ensure: +Rules are automated guardrails stored as markdown files in `.deepwork/rules/` that trigger when certain files change during an AI session. 
They help ensure: - Documentation stays in sync with code - Team guidelines are followed - Architectural decisions are respected - Quality standards are maintained -**When to suggest policies:** +**When to suggest rules:** Think about the job you just implemented and ask: - Does this job produce outputs that other files depend on? @@ -188,28 +188,28 @@ Think about the job you just implemented and ask: - Are there quality checks or reviews that should happen when certain files in this domain change? - Could changes to the job's output files impact other parts of the project? -**Examples of policies that might make sense:** +**Examples of rules that might make sense:** -| Job Type | Potential Policy | -|----------|------------------| +| Job Type | Potential Rule | +|----------|----------------| | API Design | "Update API docs when endpoint definitions change" | | Database Schema | "Review migrations when schema files change" | | Competitive Research | "Update strategy docs when competitor analysis changes" | | Feature Development | "Update changelog when feature files change" | | Configuration Management | "Update install guide when config files change" | -**How to offer policy creation:** +**How to offer rule creation:** -If you identify one or more policies that would benefit the user, explain: -1. **What the policy would do** - What triggers it and what action it prompts +If you identify one or more rules that would benefit the user, explain: +1. **What the rule would do** - What triggers it and what action it prompts 2. **Why it would help** - How it prevents common mistakes or keeps things in sync 3. **What files it would watch** - The trigger patterns Then ask the user: -> "Would you like me to create this policy for you? I can run `/deepwork_policy.define` to set it up." +> "Would you like me to create this rule for you? I can run `/deepwork_rules.define` to set it up." 
-If the user agrees, invoke the `/deepwork_policy.define` command to guide them through creating the policy. +If the user agrees, invoke the `/deepwork_rules.define` command to guide them through creating the rule. **Example dialogue:** @@ -218,15 +218,15 @@ Based on the competitive_research job you just created, I noticed that when competitor analysis files change, it would be helpful to remind you to update your strategy documentation. -I'd suggest a policy like: +I'd suggest a rule like: - **Name**: "Update strategy when competitor analysis changes" - **Trigger**: `**/positioning_report.md` - **Action**: Prompt to review and update `docs/strategy.md` -Would you like me to create this policy? I can run `/deepwork_policy.define` to set it up. +Would you like me to create this rule? I can run `/deepwork_rules.define` to set it up. ``` -**Note:** Not every job needs policies. Only suggest them when they would genuinely help maintain consistency or quality. Don't force policies where they don't make sense. +**Note:** Not every job needs rules. Only suggest them when they would genuinely help maintain consistency or quality. Don't force rules where they don't make sense. 
## Example Implementation @@ -260,8 +260,8 @@ Before marking this step complete, ensure: - [ ] `deepwork sync` executed successfully - [ ] Commands generated in platform directory - [ ] User informed to follow reload instructions from `deepwork sync` -- [ ] Considered whether policies would benefit this job (Step 7) -- [ ] If policies suggested, offered to run `/deepwork_policy.define` +- [ ] Considered whether rules would benefit this job (Step 7) +- [ ] If rules suggested, offered to run `/deepwork_rules.define` ## Quality Criteria @@ -273,7 +273,7 @@ Before marking this step complete, ensure: - Steps with user inputs explicitly use "ask structured questions" phrasing - Sync completed successfully - Commands available for use -- Thoughtfully considered relevant policies for the job domain +- Thoughtfully considered relevant rules for the job domain ## Inputs diff --git a/.gemini/commands/deepwork_policy/define.toml b/.gemini/commands/deepwork_policy/define.toml deleted file mode 100644 index ca45a47f..00000000 --- a/.gemini/commands/deepwork_policy/define.toml +++ /dev/null @@ -1,295 +0,0 @@ -# deepwork_policy:define -# -# Create or update policy entries in .deepwork.policy.yml -# -# Generated by DeepWork - do not edit manually - -description = "Create or update policy entries in .deepwork.policy.yml" - -prompt = """ -# deepwork_policy:define - -**Standalone command** in the **deepwork_policy** job - can be run anytime - -**Summary**: Policy enforcement for AI agent sessions - -## Job Overview - -Manages policies that automatically trigger when certain files change during an AI agent session. -Policies help ensure that code changes follow team guidelines, documentation is updated, -and architectural decisions are respected. - -Policies are defined in a `.deepwork.policy.yml` file at the root of your project. 
Each policy -specifies: -- Trigger patterns: Glob patterns for files that, when changed, should trigger the policy -- Safety patterns: Glob patterns for files that, if also changed, mean the policy doesn't need to fire -- Instructions: What the agent should do when the policy triggers - -Example use cases: -- Update installation docs when configuration files change -- Require security review when authentication code is modified -- Ensure API documentation stays in sync with API code -- Remind developers to update changelogs - - - -## Instructions - -# Define Policy - -## Objective - -Create or update policy entries in the `.deepwork.policy.yml` file to enforce team guidelines, documentation requirements, or other constraints when specific files change. - -## Task - -Guide the user through defining a new policy by asking structured questions. **Do not create the policy without first understanding what they want to enforce.** - -**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. - -### Step 1: Understand the Policy Purpose - -Start by asking structured questions to understand what the user wants to enforce: - -1. **What guideline or constraint should this policy enforce?** - - What situation triggers the need for action? - - What files or directories, when changed, should trigger this policy? - - Examples: "When config files change", "When API code changes", "When database schema changes" - -2. **What action should be taken?** - - What should the agent do when the policy triggers? - - Update documentation? Perform a security review? Update tests? - - Is there a specific file or process that needs attention? - -3. **Are there any "safety" conditions?** - - Are there files that, if also changed, mean the policy doesn't need to fire? 
- - For example: If config changes AND install_guide.md changes, assume docs are already updated - - This prevents redundant prompts when the user has already done the right thing - -### Step 2: Define the Trigger Patterns - -Help the user define glob patterns for files that should trigger the policy: - -**Common patterns:** -- `src/**/*.py` - All Python files in src directory (recursive) -- `app/config/**/*` - All files in app/config directory -- `*.md` - All markdown files in root -- `src/api/**/*` - All files in the API directory -- `migrations/**/*.sql` - All SQL migrations - -**Pattern syntax:** -- `*` - Matches any characters within a single path segment -- `**` - Matches any characters across multiple path segments (recursive) -- `?` - Matches a single character - -### Step 3: Define Safety Patterns (Optional) - -If there are files that, when also changed, mean the policy shouldn't fire: - -**Examples:** -- Policy: "Update install guide when config changes" - - Trigger: `app/config/**/*` - - Safety: `docs/install_guide.md` (if already updated, don't prompt) - -- Policy: "Security review for auth changes" - - Trigger: `src/auth/**/*` - - Safety: `SECURITY.md`, `docs/security_review.md` - -### Step 3b: Choose the Comparison Mode (Optional) - -The `compare_to` field controls what baseline is used when detecting "changed files": - -**Options:** -- `base` (default) - Compares to the base of the current branch (merge-base with main/master). This is the most common choice for feature branches, as it shows all changes made on the branch. -- `default_tip` - Compares to the current tip of the default branch (main/master). Useful when you want to see the difference from what's currently in production. -- `prompt` - Compares to the state at the start of each prompt. Useful for policies that should only fire based on changes made during a single agent response. - -**When to use each:** -- **base**: Best for most policies. "Did this branch change config files?" 
→ trigger docs review -- **default_tip**: For policies about what's different from production/main -- **prompt**: For policies that should only consider very recent changes within the current session - -Most policies should use the default (`base`) and don't need to specify `compare_to`. - -### Step 4: Write the Instructions - -Create clear, actionable instructions for what the agent should do when the policy fires. - -**Good instructions include:** -- What to check or review -- What files might need updating -- Specific actions to take -- Quality criteria for completion - -**Example:** -``` -Configuration files have changed. Please: -1. Review docs/install_guide.md for accuracy -2. Update any installation steps that reference changed config -3. Verify environment variable documentation is current -4. Test that installation instructions still work -``` - -### Step 5: Create the Policy Entry - -Create or update `.deepwork.policy.yml` in the project root. - -**File Location**: `.deepwork.policy.yml` (root of project) - -**Format**: -```yaml -- name: "[Friendly name for the policy]" - trigger: "[glob pattern]" # or array: ["pattern1", "pattern2"] - safety: "[glob pattern]" # optional, or array - compare_to: "base" # optional: "base" (default), "default_tip", or "prompt" - instructions: | - [Multi-line instructions for the agent...] -``` - -**Alternative with instructions_file**: -```yaml -- name: "[Friendly name for the policy]" - trigger: "[glob pattern]" - safety: "[glob pattern]" - compare_to: "base" # optional - instructions_file: "path/to/instructions.md" -``` - -### Step 6: Verify the Policy - -After creating the policy: - -1. **Check the YAML syntax** - Ensure valid YAML formatting -2. **Test trigger patterns** - Verify patterns match intended files -3. **Review instructions** - Ensure they're clear and actionable -4. 
**Check for conflicts** - Ensure the policy doesn't conflict with existing ones - -## Example Policies - -### Update Documentation on Config Changes -```yaml -- name: "Update install guide on config changes" - trigger: "app/config/**/*" - safety: "docs/install_guide.md" - instructions: | - Configuration files have been modified. Please review docs/install_guide.md - and update it if any installation instructions need to change based on the - new configuration. -``` - -### Security Review for Auth Code -```yaml -- name: "Security review for authentication changes" - trigger: - - "src/auth/**/*" - - "src/security/**/*" - safety: - - "SECURITY.md" - - "docs/security_audit.md" - instructions: | - Authentication or security code has been changed. Please: - 1. Review for hardcoded credentials or secrets - 2. Check input validation on user inputs - 3. Verify access control logic is correct - 4. Update security documentation if needed -``` - -### API Documentation Sync -```yaml -- name: "API documentation update" - trigger: "src/api/**/*.py" - safety: "docs/api/**/*.md" - instructions: | - API code has changed. Please verify that API documentation in docs/api/ - is up to date with the code changes. Pay special attention to: - - New or changed endpoints - - Modified request/response schemas - - Updated authentication requirements -``` - -## Output Format - -### .deepwork.policy.yml -Create or update this file at the project root with the new policy entry. - -## Quality Criteria - -- Asked structured questions to understand user requirements -- Policy name is clear and descriptive -- Trigger patterns accurately match the intended files -- Safety patterns prevent unnecessary triggering -- Instructions are actionable and specific -- YAML is valid and properly formatted - -## Context - -Policies are evaluated automatically when you finish working on a task. The system: -1. 
Determines which files have changed based on each policy's `compare_to` setting: - - `base` (default): Files changed since the branch diverged from main/master - - `default_tip`: Files different from the current main/master branch - - `prompt`: Files changed since the last prompt submission -2. Checks if any changes match policy trigger patterns -3. Skips policies where safety patterns also matched -4. Prompts you with instructions for any triggered policies - -You can mark a policy as addressed by including `✓ Policy Name` in your response (replace Policy Name with the actual policy name). This tells the system you've already handled that policy's requirements. - - -## Inputs - -### User Parameters - -Please gather the following information from the user: -- **policy_purpose**: What guideline or constraint should this policy enforce? - - -## Work Branch Management - -All work for this job should be done on a dedicated work branch: - -1. **Check current branch**: - - If already on a work branch for this job (format: `deepwork/deepwork_policy-[instance]-[date]`), continue using it - - If on main/master, create a new work branch - -2. **Create work branch** (if needed): - ```bash - git checkout -b deepwork/deepwork_policy-[instance]-$(date +%Y%m%d) - ``` - Replace `[instance]` with a descriptive identifier (e.g., `acme`, `q1-launch`, etc.) - -## Output Requirements - -Create the following output(s): -- `.deepwork.policy.yml` - -Ensure all outputs are: -- Well-formatted and complete -- Ready for review or use by subsequent steps - -## Completion - -After completing this step: - -1. **Verify outputs**: Confirm all required files have been created - -2. **Inform the user**: - - The define command is complete - - Outputs created: .deepwork.policy.yml - - This command can be run again anytime to make further changes - -## Command Complete - -This is a standalone command that can be run anytime. The outputs are ready for use. 
- -Consider: -- Reviewing the outputs -- Running `deepwork sync` if job definitions were changed -- Re-running this command later if further changes are needed - ---- - -## Context Files - -- Job definition: `.deepwork/jobs/deepwork_policy/job.yml` -- Step instructions: `.deepwork/jobs/deepwork_policy/steps/define.md` -""" \ No newline at end of file diff --git a/.gemini/commands/deepwork_rules/define.toml b/.gemini/commands/deepwork_rules/define.toml new file mode 100644 index 00000000..28d6d5b4 --- /dev/null +++ b/.gemini/commands/deepwork_rules/define.toml @@ -0,0 +1,346 @@ +# deepwork_rules:define +# +# Create a new rule file in .deepwork/rules/ +# +# Generated by DeepWork - do not edit manually + +description = "Create a new rule file in .deepwork/rules/" + +prompt = """ +# deepwork_rules:define + +**Standalone command** in the **deepwork_rules** job - can be run anytime + +**Summary**: Rules enforcement for AI agent sessions + +## Job Overview + +Manages rules that automatically trigger when certain files change during an AI agent session. +Rules help ensure that code changes follow team guidelines, documentation is updated, +and architectural decisions are respected. + +Rules are stored as individual markdown files with YAML frontmatter in the `.deepwork/rules/` +directory. 
Each rule file specifies: +- Detection mode: trigger/safety, set (bidirectional), or pair (directional) +- Patterns: Glob patterns for matching files, with optional variable capture +- Instructions: Markdown content describing what the agent should do + +Example use cases: +- Update installation docs when configuration files change +- Require security review when authentication code is modified +- Ensure API documentation stays in sync with API code +- Enforce source/test file pairing + + + +## Instructions + +# Define Rule + +## Objective + +Create a new rule file in the `.deepwork/rules/` directory to enforce team guidelines, documentation requirements, or other constraints when specific files change. + +## Task + +Guide the user through defining a new rule by asking structured questions. **Do not create the rule without first understanding what they want to enforce.** + +**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. + +### Step 1: Understand the Rule Purpose + +Start by asking structured questions to understand what the user wants to enforce: + +1. **What guideline or constraint should this rule enforce?** + - What situation triggers the need for action? + - What files or directories, when changed, should trigger this rule? + - Examples: "When config files change", "When API code changes", "When database schema changes" + +2. **What action should be taken?** + - What should the agent do when the rule triggers? + - Update documentation? Perform a security review? Update tests? + - Is there a specific file or process that needs attention? + +3. **Are there any "safety" conditions?** + - Are there files that, if also changed, mean the rule doesn't need to fire? 
+ - For example: If config changes AND install_guide.md changes, assume docs are already updated + - This prevents redundant prompts when the user has already done the right thing + +### Step 2: Choose the Detection Mode + +Help the user select the appropriate detection mode: + +**Trigger/Safety Mode** (most common): +- Fires when trigger patterns match AND no safety patterns match +- Use for: "When X changes, check Y" rules +- Example: When config changes, verify install docs + +**Set Mode** (bidirectional correspondence): +- Fires when files that should change together don't all change +- Use for: Source/test pairing, model/migration sync +- Example: `src/foo.py` and `tests/foo_test.py` should change together + +**Pair Mode** (directional correspondence): +- Fires when a trigger file changes but expected files don't +- Changes to expected files alone do NOT trigger +- Use for: API code requires documentation updates (but docs can update independently) + +### Step 3: Define the Patterns + +Help the user define glob patterns for files. 
+ +**Common patterns:** +- `src/**/*.py` - All Python files in src directory (recursive) +- `app/config/**/*` - All files in app/config directory +- `*.md` - All markdown files in root +- `src/api/**/*` - All files in the API directory +- `migrations/**/*.sql` - All SQL migrations + +**Variable patterns (for set/pair modes):** +- `src/{path}.py` - Captures path variable (e.g., `foo/bar` from `src/foo/bar.py`) +- `tests/{path}_test.py` - Uses same path variable in corresponding file +- `{name}` matches single segment, `{path}` matches multiple segments + +**Pattern syntax:** +- `*` - Matches any characters within a single path segment +- `**` - Matches any characters across multiple path segments (recursive) +- `?` - Matches a single character + +### Step 4: Choose the Comparison Mode (Optional) + +The `compare_to` field controls what baseline is used when detecting "changed files": + +**Options:** +- `base` (default) - Compares to the base of the current branch (merge-base with main/master). Best for feature branches. +- `default_tip` - Compares to the current tip of the default branch. Useful for seeing difference from production. +- `prompt` - Compares to the state at the start of each prompt. For rules about very recent changes. + +Most rules should use the default (`base`) and don't need to specify `compare_to`. + +### Step 5: Write the Instructions + +Create clear, actionable instructions for what the agent should do when the rule fires. 
+ +**Good instructions include:** +- What to check or review +- What files might need updating +- Specific actions to take +- Quality criteria for completion + +**Template variables available in instructions:** +- `{trigger_files}` - Files that triggered the rule +- `{expected_files}` - Expected corresponding files (for set/pair modes) + +### Step 6: Create the Rule File + +Create a new file in `.deepwork/rules/` with a kebab-case filename: + +**File Location**: `.deepwork/rules/{rule-name}.md` + +**Format for Trigger/Safety Mode:** +```markdown +--- +name: Friendly Name for the Rule +trigger: "glob/pattern/**/*" # or array: ["pattern1", "pattern2"] +safety: "optional/pattern" # optional, or array +compare_to: base # optional: "base" (default), "default_tip", or "prompt" +--- +Instructions for the agent when this rule fires. + +Multi-line markdown content is supported. +``` + +**Format for Set Mode (bidirectional):** +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +Modified: {trigger_files} +Expected: {expected_files} +``` + +**Format for Pair Mode (directional):** +```markdown +--- +name: API Documentation +pair: + trigger: api/{path}.py + expects: docs/api/{path}.md +--- +API code requires documentation updates. + +Changed API: {trigger_files} +Update docs: {expected_files} +``` + +### Step 7: Verify the Rule + +After creating the rule: + +1. **Check the YAML frontmatter** - Ensure valid YAML formatting +2. **Test trigger patterns** - Verify patterns match intended files +3. **Review instructions** - Ensure they're clear and actionable +4. 
**Check for conflicts** - Ensure the rule doesn't conflict with existing ones + +## Example Rules + +### Update Documentation on Config Changes +`.deepwork/rules/config-docs.md`: +```markdown +--- +name: Update Install Guide on Config Changes +trigger: app/config/**/* +safety: docs/install_guide.md +--- +Configuration files have been modified. Please review docs/install_guide.md +and update it if any installation instructions need to change based on the +new configuration. +``` + +### Security Review for Auth Code +`.deepwork/rules/security-review.md`: +```markdown +--- +name: Security Review for Authentication Changes +trigger: + - src/auth/**/* + - src/security/**/* +safety: + - SECURITY.md + - docs/security_audit.md +--- +Authentication or security code has been changed. Please: + +1. Review for hardcoded credentials or secrets +2. Check input validation on user inputs +3. Verify access control logic is correct +4. Update security documentation if needed +``` + +### Source/Test Pairing +`.deepwork/rules/source-test-pairing.md`: +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +When modifying source code, ensure corresponding tests are updated. +When adding tests, ensure they test actual source code. + +Modified: {trigger_files} +Expected: {expected_files} +``` + +### API Documentation Sync +`.deepwork/rules/api-docs.md`: +```markdown +--- +name: API Documentation Update +pair: + trigger: src/api/{path}.py + expects: docs/api/{path}.md +--- +API code has changed. Please verify that API documentation in docs/api/ +is up to date with the code changes. 
Pay special attention to: + +- New or changed endpoints +- Modified request/response schemas +- Updated authentication requirements + +Changed API: {trigger_files} +Update: {expected_files} +``` + +## Output Format + +### .deepwork/rules/{rule-name}.md +Create a new file with the rule definition using YAML frontmatter and markdown body. + +## Quality Criteria + +- Asked structured questions to understand user requirements +- Rule name is clear and descriptive (used in promise tags) +- Correct detection mode selected for the use case +- Patterns accurately match the intended files +- Safety patterns prevent unnecessary triggering (if applicable) +- Instructions are actionable and specific +- YAML frontmatter is valid + +## Context + +Rules are evaluated automatically when the agent finishes a task. The system: +1. Determines which files have changed based on each rule's `compare_to` setting +2. Evaluates rules based on their detection mode (trigger/safety, set, or pair) +3. Skips rules where the correspondence is satisfied (for set/pair) or safety matched +4. Prompts you with instructions for any triggered rules + +You can mark a rule as addressed by including `Rule Name` in your response (replace Rule Name with the actual rule name from the `name` field). This tells the system you've already handled that rule's requirements. + + +## Inputs + +### User Parameters + +Please gather the following information from the user: +- **rule_purpose**: What guideline or constraint should this rule enforce? + + +## Work Branch Management + +All work for this job should be done on a dedicated work branch: + +1. **Check current branch**: + - If already on a work branch for this job (format: `deepwork/deepwork_rules-[instance]-[date]`), continue using it + - If on main/master, create a new work branch + +2. 
**Create work branch** (if needed): + ```bash + git checkout -b deepwork/deepwork_rules-[instance]-$(date +%Y%m%d) + ``` + Replace `[instance]` with a descriptive identifier (e.g., `acme`, `q1-launch`, etc.) + +## Output Requirements + +Create the following output(s): +- `.deepwork/rules/{rule-name}.md` + +Ensure all outputs are: +- Well-formatted and complete +- Ready for review or use by subsequent steps + +## Completion + +After completing this step: + +1. **Verify outputs**: Confirm all required files have been created + +2. **Inform the user**: + - The define command is complete + - Outputs created: .deepwork/rules/{rule-name}.md + - This command can be run again anytime to make further changes + +## Command Complete + +This is a standalone command that can be run anytime. The outputs are ready for use. + +Consider: +- Reviewing the outputs +- Running `deepwork sync` if job definitions were changed +- Re-running this command later if further changes are needed + +--- + +## Context Files + +- Job definition: `.deepwork/jobs/deepwork_rules/job.yml` +- Step instructions: `.deepwork/jobs/deepwork_rules/steps/define.md` +""" \ No newline at end of file diff --git a/.gemini/commands/update/job.toml b/.gemini/commands/update/job.toml index 474171d9..c38490e5 100644 --- a/.gemini/commands/update/job.toml +++ b/.gemini/commands/update/job.toml @@ -16,7 +16,7 @@ prompt = """ ## Job Overview A workflow for maintaining standard jobs bundled with DeepWork. Standard jobs -(like `deepwork_jobs` and `deepwork_policy`) are source-controlled in +(like `deepwork_jobs` and `deepwork_rules`) are source-controlled in `src/deepwork/standard_jobs/` and must be edited there—never in `.deepwork/jobs/` or `.claude/commands/` directly. @@ -60,7 +60,7 @@ Standard jobs exist in THREE locations, but only ONE is the source of truth: #### 1. 
Identify the Standard Job to Update From conversation context, determine: -- Which standard job needs updating (e.g., `deepwork_jobs`, `deepwork_policy`) +- Which standard job needs updating (e.g., `deepwork_jobs`, `deepwork_rules`) - What changes are needed (job.yml, step instructions, hooks, etc.) Current standard jobs: diff --git a/CHANGELOG.md b/CHANGELOG.md index 2fb45116..41243448 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,24 +5,44 @@ All notable changes to DeepWork will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.4.0] - 2026-01-16 + +### Added +- Rules system v2 with frontmatter markdown format in `.deepwork/rules/` + - Detection modes: trigger/safety (default), set (bidirectional), pair (directional) + - Action types: prompt (show instructions), command (run idempotent commands) + - Variable pattern matching with `{path}` (multi-segment) and `{name}` (single-segment) + - Queue system in `.deepwork/tmp/rules/queue/` for state tracking and deduplication +- New core modules: + - `pattern_matcher.py`: Variable pattern matching with regex-based capture + - `rules_queue.py`: Queue system for rule state persistence + - `command_executor.py`: Command action execution with variable substitution +- Updated `rules_check.py` hook to use v2 system with queue-based deduplication + +### Changed +- Documentation updated with v2 rules examples and configuration + +### Removed +- v1 rules format (`.deepwork.rules.yml`) - now only v2 frontmatter markdown format is supported + ## [0.3.0] - 2026-01-16 ### Added - Cross-platform hook wrapper system for writing hooks once and running on multiple platforms - `wrapper.py`: Normalizes input/output between Claude Code and Gemini CLI - `claude_hook.sh` and `gemini_hook.sh`: Platform-specific shell wrappers - - `policy_check.py`: Cross-platform policy evaluation hook + - 
`rules_check.py`: Cross-platform rule evaluation hook - Platform documentation in `doc/platforms/` with hook references and learnings - Claude Code platform documentation (`doc/platforms/claude/`) - `update.job` for maintaining standard jobs (#41) - `make_new_job.sh` script and templates directory for job scaffolding (#37) -- Default policy template file created during `deepwork install` (#42) +- Default rules template file created during `deepwork install` (#42) - Full e2e test suite: define → implement → execute workflow (#45) - Automated tests for all shell scripts and hook wrappers (#40) ### Changed - Standardized on "ask structured questions" phrasing across all jobs (#48) -- deepwork_jobs bumped to v0.5.0, deepwork_policy to v0.2.0 +- deepwork_jobs bumped to v0.5.0, deepwork_rules to v0.2.0 ### Fixed - Stop hooks now properly return blocking JSON (#38) @@ -31,7 +51,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [0.1.1] - 2026-01-15 ### Added -- `compare_to` option in policy system for flexible change detection (#34) +- `compare_to` option in rules system for flexible change detection (#34) - `base` (default): Compare to merge-base with default branch - `default_tip`: Two-dot diff against default branch tip - `prompt`: Compare to state captured at prompt submission @@ -43,27 +63,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Supplementary markdown file support for job steps (#19) - Browser automation capability consideration in job definition (#32) - Platform-specific reload instructions in adapters (#31) -- Version and changelog update policy to enforce version tracking on src changes +- Version and changelog update rule to enforce version tracking on src changes - Added claude and copilot to CLA allowlist (#26) ### Changed -- Moved git diff logic into evaluate_policies.py for per-policy handling (#34) +- Moved git diff logic into evaluate_rules.py for per-rule handling (#34) 
- Renamed `capture_work_tree.sh` to `capture_prompt_work_tree.sh` (#34) - Updated README with PyPI install instructions using pipx, uv, and pip (#22) - Updated deepwork_jobs job version to 0.2.0 ### Fixed -- Stop hooks now correctly return blocking JSON when policies fire +- Stop hooks now correctly return blocking JSON when rules fire - Added shell script tests to verify stop hook blocking behavior ### Removed - `refine` step (replaced by `learn` command) (#27) -- `get_changed_files.sh` hook (logic moved to Python policy evaluator) (#34) +- `get_changed_files.sh` hook (logic moved to Python rule evaluator) (#34) ## [0.1.0] - Initial Release Initial version. +[0.4.0]: https://github.com/anthropics/deepwork/releases/tag/0.4.0 [0.3.0]: https://github.com/anthropics/deepwork/releases/tag/0.3.0 [0.1.1]: https://github.com/anthropics/deepwork/releases/tag/0.1.1 [0.1.0]: https://github.com/anthropics/deepwork/releases/tag/0.1.0 diff --git a/README.md b/README.md index 33319968..96816677 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,7 @@ DeepWork is a tool for defining and executing multi-step workflows with AI codin | OpenCode | Planned | Markdown | No | | GitHub Copilot CLI | Planned | Markdown | No (tool permissions only) | -> **Tip:** New to DeepWork? Claude Code has the most complete feature support, including quality validation hooks and automated policies. For browser automation, Claude in Chrome (Anthropic's browser extension) works well with DeepWork workflows. +> **Tip:** New to DeepWork? Claude Code has the most complete feature support, including quality validation hooks and automated rules. For browser automation, Claude in Chrome (Anthropic's browser extension) works well with DeepWork workflows. ## Easy Installation In your Agent CLI (ex. 
`claude`), ask: @@ -61,8 +61,7 @@ This will: - Create `.deepwork/` directory structure - Generate core DeepWork jobs - Install DeepWork jobs for your AI assistant -- Configure hooks for your AI assistant to enable policies -- Create a `.deepwork.policy.yml` template file with example policies +- Configure hooks for your AI assistant to enable rules ## Quick Start @@ -178,6 +177,10 @@ DeepWork follows a **Git-native, installation-only** design: your-project/ ├── .deepwork/ │ ├── config.yml # Platform configuration +│ ├── rules/ # Rule definitions (v2 format) +│ │ └── rule-name.md # Individual rule files +│ ├── tmp/ # Temporary state (gitignored) +│ │ └── rules/queue/ # Rule evaluation queue │ └── jobs/ # Job definitions │ └── job_name/ │ ├── job.yml # Job metadata @@ -208,11 +211,16 @@ deepwork/ │ ├── core/ # Core functionality │ │ ├── parser.py # Job definition parsing │ │ ├── detector.py # Platform detection -│ │ └── generator.py # Skill file generation +│ │ ├── generator.py # Skill file generation +│ │ ├── rules_parser.py # Rule parsing +│ │ ├── pattern_matcher.py # Variable pattern matching +│ │ ├── rules_queue.py # Rule state queue +│ │ └── command_executor.py # Command action execution │ ├── hooks/ # Cross-platform hook wrappers │ │ ├── wrapper.py # Input/output normalization -│ │ ├── claude_hook.sh # Claude Code adapter -│ │ └── gemini_hook.sh # Gemini CLI adapter +│ │ ├── rules_check.py # Rule evaluation hook +│ │ ├── claude_hook.sh # Claude Code adapter +│ │ └── gemini_hook.sh # Gemini CLI adapter │ ├── templates/ # Jinja2 templates │ │ ├── claude/ # Claude Code templates │ │ └── gemini/ # Gemini CLI templates @@ -227,34 +235,50 @@ deepwork/ ## Features -### 📋 Job Definition +### Job Definition Define structured, multi-step workflows where each step has clear requirements and produces specific results. - **Dependency Management**: Explicitly link steps with automatic sequence handling and cycle detection. 
- **Artifact Passing**: Seamlessly use file outputs from one step as inputs for future steps. - **Dynamic Inputs**: Support for both fixed file references and interactive user parameters. - **Human-Readable YAML**: Simple, declarative job definitions that are easy to version and maintain. -### 🌿 Git-Native Workflow +### Git-Native Workflow Maintain a clean repository with automatic branch management and isolation. - **Automatic Branching**: Every job execution happens on a dedicated work branch (e.g., `deepwork/my-job-2024`). - **Namespace Isolation**: Run multiple concurrent jobs or instances without versioning conflicts. - **Full Traceability**: All AI-generated changes, logs, and artifacts are tracked natively in your Git history. -### 🛡️ Automated Policies -Enforce project standards and best practices without manual oversight. Policies monitor file changes and automatically prompt your AI assistant to follow specific guidelines when relevant code is modified. -- **Automatic Triggers**: Detect when specific files or directories are changed to fire relevant policies. +### Automated Rules +Enforce project standards and best practices without manual oversight. Rules monitor file changes and automatically prompt your AI assistant to follow specific guidelines when relevant code is modified. +- **Automatic Triggers**: Detect when specific files or directories are changed to fire relevant rules. +- **File Correspondence**: Define bidirectional (set) or directional (pair) relationships between files. +- **Command Actions**: Run idempotent commands (formatters, linters) automatically when files change. - **Contextual Guidance**: Instructions are injected directly into the AI's workflow at the right moment. -- **Common Use Cases**: Keep documentation in sync, enforce security reviews, or automate changelog updates. 
-**Example Policy**: -```yaml -# Enforce documentation updates when config changes -- name: "Update docs on config changes" - trigger: "app/config/**/*" - instructions: "Configuration files changed. Please update docs/install_guide.md." +**Example Rule** (`.deepwork/rules/source-test-pairing.md`): +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +When source files change, corresponding test files should also change. +Please create or update tests for the modified source files. +``` + +**Example Command Rule** (`.deepwork/rules/format-python.md`): +```markdown +--- +name: Format Python +trigger: "**/*.py" +action: + command: "ruff format {file}" + run_for: each_match +--- ``` -### 🚀 Multi-Platform Support +### Multi-Platform Support Generate native commands and skills tailored for your AI coding assistant. - **Native Integration**: Works directly with the skill/command formats of supported agents. - **Context-Aware**: Skills include all necessary context (instructions, inputs, and dependencies) for the AI. diff --git a/claude.md b/claude.md index 34a4c011..9141a2b9 100644 --- a/claude.md +++ b/claude.md @@ -184,7 +184,7 @@ my-project/ ## CRITICAL: Editing Standard Jobs -**Standard jobs** (like `deepwork_jobs` and `deepwork_policy`) are bundled with DeepWork and installed to user projects. They exist in THREE locations: +**Standard jobs** (like `deepwork_jobs` and `deepwork_rules`) are bundled with DeepWork and installed to user projects. They exist in THREE locations: 1. **Source of truth**: `src/deepwork/standard_jobs/[job_name]/` - The canonical source files 2. **Installed copy**: `.deepwork/jobs/[job_name]/` - Installed by `deepwork install` @@ -209,7 +209,7 @@ Instead, follow this workflow: Standard jobs are defined in `src/deepwork/standard_jobs/`. 
Currently: - `deepwork_jobs` - Core job management commands (define, implement, refine) -- `deepwork_policy` - Policy enforcement system +- `deepwork_rules` - Rules enforcement system If a job exists in `src/deepwork/standard_jobs/`, it is a standard job and MUST be edited there. diff --git a/doc/architecture.md b/doc/architecture.md index 6ddf971c..29400973 100644 --- a/doc/architecture.md +++ b/doc/architecture.md @@ -46,15 +46,17 @@ deepwork/ # DeepWork tool repository │ │ ├── detector.py # AI platform detection │ │ ├── generator.py # Command file generation │ │ ├── parser.py # Job definition parsing -│ │ ├── policy_parser.py # Policy definition parsing -│ │ └── hooks_syncer.py # Hook syncing to platforms +│ │ ├── rules_parser.py # Rule definition parsing +│ │ ├── pattern_matcher.py # Variable pattern matching for rules +│ │ ├── rules_queue.py # Rule state queue system +│ │ ├── command_executor.py # Command action execution +│ │ └── hooks_syncer.py # Hook syncing to platforms │ ├── hooks/ # Hook system and cross-platform wrappers │ │ ├── __init__.py │ │ ├── wrapper.py # Cross-platform input/output normalization │ │ ├── claude_hook.sh # Shell wrapper for Claude Code │ │ ├── gemini_hook.sh # Shell wrapper for Gemini CLI -│ │ ├── policy_check.py # Cross-platform policy evaluation hook -│ │ └── evaluate_policies.py # Legacy policy evaluation CLI +│ │ └── rules_check.py # Cross-platform rule evaluation hook │ ├── templates/ # Command templates for each platform │ │ ├── claude/ │ │ │ └── command-job-step.md.jinja @@ -64,18 +66,17 @@ deepwork/ # DeepWork tool repository │ │ ├── deepwork_jobs/ │ │ │ ├── job.yml │ │ │ └── steps/ -│ │ └── deepwork_policy/ # Policy management job +│ │ └── deepwork_rules/ # Rule management job │ │ ├── job.yml │ │ ├── steps/ │ │ │ └── define.md │ │ └── hooks/ # Hook scripts │ │ ├── global_hooks.yml │ │ ├── user_prompt_submit.sh -│ │ ├── capture_prompt_work_tree.sh -│ │ └── policy_stop_hook.sh +│ │ └── capture_prompt_work_tree.sh │ ├── 
schemas/ # Definition schemas │ │ ├── job_schema.py -│ │ └── policy_schema.py +│ │ └── rules_schema.py │ └── utils/ │ ├── fs.py │ ├── git.py @@ -120,9 +121,10 @@ def install(platform: str): # Inject core job definitions inject_deepwork_jobs(".deepwork/jobs/") - # Create default policy template (if not exists) - if not exists(".deepwork.policy.yml"): - copy_template("default_policy.yml", ".deepwork.policy.yml") + # Create rules directory with example templates (if not exists) + if not exists(".deepwork/rules/"): + create_directory(".deepwork/rules/") + copy_example_rules(".deepwork/rules/") # Update config (supports multiple platforms) config = load_yaml(".deepwork/config.yml") or {} @@ -281,31 +283,35 @@ my-project/ # User's project (target) │ ├── deepwork_jobs.define.md # Core DeepWork commands │ ├── deepwork_jobs.implement.md │ ├── deepwork_jobs.refine.md -│ ├── deepwork_policy.define.md # Policy management +│ ├── deepwork_rules.define.md # Rule management │ ├── competitive_research.identify_competitors.md │ └── ... ├── .deepwork/ # DeepWork configuration │ ├── config.yml # Platform config -│ ├── .gitignore # Ignores .last_work_tree +│ ├── .gitignore # Ignores tmp/ directory +│ ├── rules/ # Rule definitions (v2 format) +│ │ ├── source-test-pairing.md +│ │ ├── format-python.md +│ │ └── api-docs.md +│ ├── tmp/ # Temporary state (gitignored) +│ │ └── rules/queue/ # Rule evaluation queue │ └── jobs/ # Job definitions │ ├── deepwork_jobs/ # Core job for managing jobs │ │ ├── job.yml │ │ └── steps/ -│ ├── deepwork_policy/ # Policy management job +│ ├── deepwork_rules/ # Rule management job │ │ ├── job.yml │ │ ├── steps/ │ │ │ └── define.md │ │ └── hooks/ # Hook scripts (installed from standard_jobs) │ │ ├── global_hooks.yml │ │ ├── user_prompt_submit.sh -│ │ ├── capture_prompt_work_tree.sh -│ │ └── policy_stop_hook.sh +│ │ └── capture_prompt_work_tree.sh │ ├── competitive_research/ │ │ ├── job.yml # Job metadata │ │ └── steps/ │ └── ad_campaign/ │ └── ... 
-├── .deepwork.policy.yml # Policy definitions (project root) ├── (rest of user's project files) └── README.md ``` @@ -994,63 +1000,131 @@ Github Actions are used for all CI/CD tasks. --- -## Policies +## Rules -Policies are automated enforcement rules that trigger based on file changes during an AI agent session. They help ensure that: +Rules are automated enforcement mechanisms that trigger based on file changes during an AI agent session. They help ensure that: - Documentation stays in sync with code changes - Security reviews happen when sensitive code is modified - Team guidelines are followed automatically +- File correspondences are maintained (e.g., source/test pairing) -### Policy Configuration File +### Rules System v2 (Frontmatter Markdown) -Policies are defined in `.deepwork.policy.yml` at the project root: +Rules are defined as individual markdown files in `.deepwork/rules/`: +``` +.deepwork/rules/ +├── source-test-pairing.md +├── format-python.md +└── api-docs.md +``` + +Each rule file uses YAML frontmatter with a markdown body for instructions: + +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +When source files change, corresponding test files should also change. +Please create or update tests for the modified source files. +``` + +### Detection Modes + +Rules support three detection modes: + +**1. Trigger/Safety (default)** - Fire when trigger matches but safety doesn't: +```yaml +--- +name: Update install guide +trigger: "app/config/**/*" +safety: "docs/install_guide.md" +--- +``` + +**2. Set (bidirectional)** - Enforce file correspondence in both directions: +```yaml +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +``` +Uses variable patterns like `{path}` (multi-segment) and `{name}` (single-segment) for matching. + +**3. 
Pair (directional)** - Trigger requires corresponding files, but not vice versa: +```yaml +--- +name: API Documentation +pair: + trigger: src/api/{name}.py + expects: docs/api/{name}.md +--- +``` + +### Action Types + +**1. Prompt (default)** - Show instructions to the agent: +```yaml +--- +name: Security Review +trigger: "src/auth/**/*" +--- +Please check for hardcoded credentials and validate input. +``` + +**2. Command** - Run an idempotent command: ```yaml -- name: "Update install guide on config changes" - trigger: "app/config/**/*" - safety: "docs/install_guide.md" - instructions: | - Configuration files have been modified. Please review docs/install_guide.md - and update it if any installation instructions need to change. - -- name: "Security review for auth changes" - trigger: - - "src/auth/**/*" - - "src/security/**/*" - safety: - - "SECURITY.md" - - "docs/security_audit.md" - instructions: | - Authentication or security code has been changed. Please: - 1. Check for hardcoded credentials - 2. Verify input validation - 3. Review access control logic +--- +name: Format Python +trigger: "**/*.py" +action: + command: "ruff format {file}" + run_for: each_match # or "all_matches" +--- ``` -### Policy Evaluation Flow +### Rule Evaluation Flow 1. **Session Start**: When a Claude Code session begins, the baseline git state is captured 2. **Agent Works**: The AI agent performs tasks, potentially modifying files -3. **Session Stop**: When the agent finishes: - - Changed files are detected by comparing against the baseline - - Each policy is evaluated: - - If any changed file matches a `trigger` pattern AND - - No changed file matches a `safety` pattern AND - - The agent hasn't marked it with a `` tag - - → The policy fires - - If policies fire, Claude is prompted to address them -4. **Promise Tags**: Agents can mark policies as addressed by including `✓ Policy Name` in their response +3. 
**Session Stop**: When the agent finishes (after_agent event): + - Changed files are detected based on `compare_to` setting (base, default_tip, or prompt) + - Each rule is evaluated based on its detection mode + - Queue entries are created in `.deepwork/tmp/rules/queue/` for deduplication + - For command actions: commands are executed, results tracked + - For prompt actions: if rule fires and not already promised, agent is prompted +4. **Promise Tags**: Agents can mark rules as addressed by including `✓ Rule Name` in their response + +### Queue System + +Rule state is tracked in `.deepwork/tmp/rules/queue/` with files named `{hash}.{status}.json`: +- `queued` - Detected, awaiting evaluation +- `passed` - Rule satisfied (promise found or command succeeded) +- `failed` - Rule not satisfied +- `skipped` - Safety pattern matched + +This prevents re-prompting for the same rule violation within a session. ### Hook Integration -Policies are implemented using Claude Code's hooks system. The `deepwork_policy` standard job includes: +The v2 rules system uses the cross-platform hook wrapper: ``` -.deepwork/jobs/deepwork_policy/hooks/ -├── global_hooks.yml # Maps lifecycle events to scripts -├── user_prompt_submit.sh # Captures baseline at each prompt -├── capture_prompt_work_tree.sh # Creates git state snapshot for compare_to: prompt -└── policy_stop_hook.sh # Evaluates policies on stop (calls Python evaluator) +src/deepwork/hooks/ +├── wrapper.py # Cross-platform input/output normalization +├── rules_check.py # Rule evaluation hook (v2) +├── claude_hook.sh # Claude Code shell wrapper +└── gemini_hook.sh # Gemini CLI shell wrapper +``` + +Hooks are called via the shell wrappers: +```bash +claude_hook.sh deepwork.hooks.rules_check ``` The hooks are installed to `.claude/settings.json` during `deepwork sync`: @@ -1058,11 +1132,8 @@ The hooks are installed to `.claude/settings.json` during `deepwork sync`: ```json { "hooks": { - "UserPromptSubmit": [ - {"matcher": "", "hooks": 
[{"type": "command", "command": ".deepwork/jobs/deepwork_policy/hooks/user_prompt_submit.sh"}]} - ], "Stop": [ - {"matcher": "", "hooks": [{"type": "command", "command": ".deepwork/jobs/deepwork_policy/hooks/policy_stop_hook.sh"}]} + {"matcher": "", "hooks": [{"type": "command", "command": "python -m deepwork.hooks.rules_check"}]} ] } } @@ -1118,34 +1189,34 @@ def my_hook(input: HookInput) -> HookOutput: See `doc/platforms/` for detailed platform-specific hook documentation. -### Policy Schema +### Rule Schema -Policies are validated against a JSON Schema: +Rules are validated against a JSON Schema: ```yaml -- name: string # Required: Friendly name for the policy +- name: string # Required: Friendly name for the rule trigger: string|array # Required: Glob pattern(s) for triggering files safety: string|array # Optional: Glob pattern(s) for safety files instructions: string # Required (unless instructions_file): What to do instructions_file: string # Alternative: Path to instructions file ``` -### Defining Policies +### Defining Rules -Use the `/deepwork_policy.define` command to interactively create policies: +Use the `/deepwork_rules.define` command to interactively create rules: ``` -User: /deepwork_policy.define +User: /deepwork_rules.define -Claude: I'll help you define a new policy. What guideline or constraint - should this policy enforce? +Claude: I'll help you define a new rule. What guideline or constraint + should this rule enforce? User: When API code changes, the API documentation should be updated Claude: Got it. Let me ask a few questions... 
[Interactive dialog to define trigger, safety, and instructions] -Claude: ✓ Created policy "API documentation update" in .deepwork.policy.yml +Claude: Created rule "API documentation update" in .deepwork/rules/api-documentation.md ``` --- diff --git a/doc/platforms/gemini/hooks.md b/doc/platforms/gemini/hooks.md index b9103a6f..e8cc11d9 100644 --- a/doc/platforms/gemini/hooks.md +++ b/doc/platforms/gemini/hooks.md @@ -34,11 +34,11 @@ Hooks are configured in `settings.json` at various levels: "matcher": "*", "hooks": [ { - "name": "policy-check", + "name": "rules-check", "type": "command", - "command": ".gemini/hooks/policy_check.sh", + "command": ".gemini/hooks/rules_check.sh", "timeout": 60000, - "description": "Evaluates DeepWork policies" + "description": "Evaluates DeepWork rules" } ] } @@ -264,7 +264,7 @@ Block the agent from completing: ```json { "decision": "deny", - "reason": "Policy X requires attention before completing" + "reason": "Rule X requires attention before completing" } ``` @@ -287,7 +287,7 @@ Block tool execution: ```json { "decision": "deny", - "reason": "Security policy violation" + "reason": "Security rule violation" } ``` diff --git a/doc/rules_syntax.md b/doc/rules_syntax.md new file mode 100644 index 00000000..f4c3ae83 --- /dev/null +++ b/doc/rules_syntax.md @@ -0,0 +1,567 @@ +# Rules Configuration Syntax + +This document describes the syntax for rule files in the `.deepwork/rules/` directory. 
+ +## Directory Structure + +Rules are stored as individual markdown files with YAML frontmatter: + +``` +.deepwork/ +└── rules/ + ├── readme-accuracy.md + ├── source-test-pairing.md + ├── api-documentation.md + └── python-formatting.md +``` + +Each file has: +- **Frontmatter**: YAML configuration between `---` delimiters +- **Body**: Instructions (for prompt actions) or description (for command actions) + +This structure enables code files to reference rules: +```python +# Read the rule `.deepwork/rules/source-test-pairing.md` before editing +class AuthService: + ... +``` + +## Quick Reference + +### Simple Trigger with Prompt + +`.deepwork/rules/readme-accuracy.md`: +```markdown +--- +name: README Accuracy +trigger: src/**/* +safety: README.md +--- +Source code changed. Please verify README.md is accurate. + +Check that: +- All public APIs are documented +- Examples are up to date +- Installation instructions are correct +``` + +### Correspondence Set (bidirectional) + +`.deepwork/rules/source-test-pairing.md`: +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +When modifying source code, ensure corresponding tests are updated. +When adding tests, ensure they test actual source code. +``` + +### Correspondence Pair (directional) + +`.deepwork/rules/api-documentation.md`: +```markdown +--- +name: API Documentation +pair: + trigger: api/{path}.py + expects: docs/api/{path}.md +--- +API changes require documentation updates. + +When modifying an API endpoint, update its documentation to reflect: +- Parameter changes +- Response format changes +- New error conditions +``` + +### Command Action + +`.deepwork/rules/python-formatting.md`: +```markdown +--- +name: Python Formatting +trigger: "**/*.py" +action: + command: ruff format {file} +--- +Automatically formats Python files using ruff. 
+ +This rule runs `ruff format` on any changed Python files to ensure +consistent code style across the codebase. +``` + +## Rule Structure + +Every rule has two orthogonal aspects: + +### Detection Mode + +How the rule decides when to fire: + +| Mode | Field | Description | +|------|-------|-------------| +| **Trigger/Safety** | `trigger`, `safety` | Fire when trigger matches and safety doesn't | +| **Set** | `set` | Fire when file correspondence is incomplete (bidirectional) | +| **Pair** | `pair` | Fire when file correspondence is incomplete (directional) | + +### Action Type + +What happens when the rule fires: + +| Type | Field | Description | +|------|-------|-------------| +| **Prompt** (default) | (markdown body) | Show instructions to the agent | +| **Command** | `action.command` | Run an idempotent command | + +## Detection Modes + +### Trigger/Safety Mode + +The simplest detection mode. Fires when changed files match `trigger` patterns and no changed files match `safety` patterns. + +```yaml +--- +name: Security Review +trigger: + - src/auth/**/* + - src/crypto/**/* +safety: SECURITY.md +compare_to: base +--- +``` + +### Set Mode (Bidirectional Correspondence) + +Defines files that should change together. If ANY file in a correspondence group changes, ALL related files should also change. + +```yaml +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +``` + +**How it works:** + +1. A file changes that matches one pattern in the set +2. System extracts the variable portions (e.g., `{path}`) +3. System generates expected files by substituting into other patterns +4. If ALL expected files also changed: rule is satisfied (no trigger) +5. 
If ANY expected file is missing: rule fires + +If `src/auth/login.py` changes: +- Extracts `{path}` = `auth/login` +- Expects `tests/auth/login_test.py` to also change +- If test didn't change, fires with instructions + +If `tests/auth/login_test.py` changes: +- Extracts `{path}` = `auth/login` +- Expects `src/auth/login.py` to also change +- If source didn't change, fires with instructions + +### Pair Mode (Directional Correspondence) + +Defines directional relationships. Changes to trigger files require corresponding expected files to change, but not vice versa. + +```yaml +--- +name: API Documentation +pair: + trigger: api/{module}/{name}.py + expects: docs/api/{module}/{name}.md +--- +``` + +Can specify multiple expected patterns: + +```yaml +--- +pair: + trigger: api/{path}.py + expects: + - docs/api/{path}.md + - schemas/{path}.json +--- +``` + +If `api/users/create.py` changes: +- Expects `docs/api/users/create.md` to also change +- If doc didn't change, fires with instructions + +If `docs/api/users/create.md` changes alone: +- No trigger (documentation can be updated independently) + +## Action Types + +### Prompt Action (Default) + +The markdown body after frontmatter serves as instructions shown to the agent. This is the default when no `action` field is specified. + +**Template Variables in Instructions:** + +| Variable | Description | +|----------|-------------| +| `{trigger_file}` | The file that triggered the rule | +| `{trigger_files}` | All files that matched trigger patterns | +| `{expected_files}` | Expected corresponding files (for sets/pairs) | + +### Command Action + +Runs an idempotent command instead of prompting the agent. 
+ +```yaml +--- +name: Python Formatting +trigger: "**/*.py" +safety: "*.pyi" +action: + command: ruff format {file} + run_for: each_match +--- +``` + +**Template Variables in Commands:** + +| Variable | Description | Available When | +|----------|-------------|----------------| +| `{file}` | Single file path | `run_for: each_match` | +| `{files}` | Space-separated file paths | `run_for: all_matches` | +| `{repo_root}` | Repository root directory | Always | + +**Idempotency Requirement:** + +Commands should be idempotent--running them multiple times produces the same result. Lint formatters like `black`, `ruff format`, and `prettier` are good examples: they produce consistent output regardless of how many times they run. + +## Pattern Syntax + +### Basic Glob Patterns + +Standard glob patterns work in `trigger` and `safety` fields: + +| Pattern | Matches | +|---------|---------| +| `*.py` | Python files in current directory | +| `**/*.py` | Python files in any directory | +| `src/**/*` | All files under src/ | +| `test_*.py` | Files starting with `test_` | +| `*.{js,ts}` | JavaScript and TypeScript files | + +### Variable Patterns + +Variable patterns use `{name}` syntax to capture path segments: + +| Pattern | Captures | Example Match | +|---------|----------|---------------| +| `src/{path}.py` | `{path}` = multi-segment path | `src/foo/bar.py` -> `path=foo/bar` | +| `src/{name}.py` | `{name}` = single segment | `src/utils.py` -> `name=utils` | +| `{module}/{name}.py` | Both variables | `auth/login.py` -> `module=auth, name=login` | + +**Variable Naming Conventions:** + +- `{path}` - Conventional name for multi-segment captures (`**/*`) +- `{name}` - Conventional name for single-segment captures (`*`) +- Custom names allowed: `{module}`, `{component}`, etc. 
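The capture behaviour in the table above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DeepWork's actual matcher: the helper names `pattern_to_regex` and `match_pattern` are hypothetical, and the sketch assumes that only a variable named `path` spans multiple segments while every other variable captures exactly one.

```python
import re

# Hypothetical helpers for illustration only; not DeepWork's real API.
_VARIABLE = re.compile(r"\{(\w+)\}")


def pattern_to_regex(pattern: str) -> str:
    """Translate a {var} pattern into a regex with named groups."""
    parts = []
    pos = 0
    for m in _VARIABLE.finditer(pattern):
        # Literal text between variables is escaped verbatim.
        parts.append(re.escape(pattern[pos:m.start()]))
        name = m.group(1)
        if name == "path":
            # Assumption: {path} may span several path segments.
            parts.append(f"(?P<{name}>.+)")
        else:
            # Assumption: any other variable captures a single segment.
            parts.append(f"(?P<{name}>[^/]+)")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return "".join(parts)


def match_pattern(pattern: str, filepath: str):
    """Return the captured variables, or None when the path does not match."""
    m = re.fullmatch(pattern_to_regex(pattern), filepath)
    return m.groupdict() if m else None
```

With these assumptions, `match_pattern("src/{name}.py", "src/foo/bar.py")` returns `None`, because `{name}` is not allowed to cross a `/`.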
+ +**Multi-Segment vs Single-Segment:** + +By default, `{path}` matches multiple path segments and `{name}` matches one: + +```yaml +# {path} matches: foo, foo/bar, foo/bar/baz +- "src/{path}.py" # src/foo.py, src/foo/bar.py, src/a/b/c.py + +# {name} matches only single segment +- "src/{name}.py" # src/foo.py (NOT src/foo/bar.py) +``` + +To explicitly control this, use `{**name}` for multi-segment or `{*name}` for single: + +```yaml +- "src/{**module}/index.py" # src/foo/bar/index.py -> module=foo/bar +- "src/{*component}.py" # src/Button.py -> component=Button +``` + +## Field Reference + +### name (required) + +Human-friendly name for the rule. Displayed in promise tags and output. + +```yaml +--- +name: Source/Test Pairing +--- +``` + +### File Naming + +Rule files are named using kebab-case with `.md` extension: +- `readme-accuracy.md` +- `source-test-pairing.md` +- `api-documentation.md` + +The filename serves as the rule's identifier in the queue system. + +### trigger + +File patterns that cause the rule to fire (trigger/safety mode). Can be string or array. + +```yaml +--- +trigger: src/**/*.py +--- + +--- +trigger: + - src/**/*.py + - lib/**/*.py +--- +``` + +### safety (optional) + +File patterns that suppress the rule. If ANY changed file matches a safety pattern, the rule does not fire. + +```yaml +--- +safety: CHANGELOG.md +--- + +--- +safety: + - CHANGELOG.md + - docs/**/* +--- +``` + +### set + +List of patterns defining bidirectional file relationships (set mode). + +```yaml +--- +set: + - src/{path}.py + - tests/{path}_test.py +--- +``` + +### pair + +Object with `trigger` and `expects` patterns for directional relationships (pair mode). + +```yaml +--- +pair: + trigger: api/{path}.py + expects: docs/api/{path}.md +--- + +--- +pair: + trigger: api/{path}.py + expects: + - docs/api/{path}.md + - schemas/{path}.json +--- +``` + +### action (optional) + +Specifies a command to run instead of prompting. 
+ +```yaml +--- +action: + command: ruff format {file} + run_for: each_match # or all_matches +--- +``` + +### compare_to (optional) + +Determines the baseline for detecting file changes. + +| Value | Description | +|-------|-------------| +| `base` (default) | Compare to merge-base with default branch | +| `default_tip` | Compare to current tip of default branch | +| `prompt` | Compare to state at last prompt submission | + +```yaml +--- +compare_to: prompt +--- +``` + +## Complete Examples + +### Example 1: Test Coverage Rule + +`.deepwork/rules/test-coverage.md`: +```markdown +--- +name: Test Coverage +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source code was modified without corresponding test updates. + +Modified source: {trigger_file} +Expected test: {expected_files} + +Please either: +1. Add/update tests for the changed code +2. Explain why tests are not needed +``` + +### Example 2: Documentation Sync + +`.deepwork/rules/api-documentation-sync.md`: +```markdown +--- +name: API Documentation Sync +pair: + trigger: src/api/{module}/{endpoint}.py + expects: + - docs/api/{module}/{endpoint}.md + - openapi/{module}.yaml +--- +API endpoint changed. Please update: +- Documentation: {expected_files} +- Ensure OpenAPI spec is current +``` + +### Example 3: Auto-formatting Pipeline + +`.deepwork/rules/python-black-formatting.md`: +```markdown +--- +name: Python Black Formatting +trigger: "**/*.py" +safety: + - "**/*.pyi" + - "**/migrations/**" +action: + command: black {file} + run_for: each_match +--- +Formats Python files using Black. 
+ +Excludes: +- Type stub files (*.pyi) +- Database migration files +``` + +### Example 4: Multi-file Correspondence + +`.deepwork/rules/full-stack-feature-sync.md`: +```markdown +--- +name: Full Stack Feature Sync +set: + - backend/api/{feature}/routes.py + - backend/api/{feature}/models.py + - frontend/src/api/{feature}.ts + - frontend/src/components/{feature}/**/* +--- +Feature files should be updated together across the stack. + +When modifying a feature, ensure: +- Backend routes are updated +- Backend models are updated +- Frontend API client is updated +- Frontend components are updated +``` + +### Example 5: Conditional Safety + +`.deepwork/rules/version-bump-required.md`: +```markdown +--- +name: Version Bump Required +trigger: + - src/**/*.py + - pyproject.toml +safety: + - pyproject.toml + - CHANGELOG.md +--- +Code changes detected. Before merging, ensure: +- Version is bumped in pyproject.toml (if needed) +- CHANGELOG.md is updated + +This rule is suppressed if you've already modified pyproject.toml +or CHANGELOG.md, as that indicates you're handling versioning. +``` + +## Promise Tags + +When a rule fires but should be dismissed, use promise tags in the conversation. The tag content should be human-readable, using the rule's `name` field: + +``` +Source/Test Pairing +API Documentation Sync +``` + +The friendly name makes promise tags easy to read when displayed in the conversation. The system matches promise tags to rules using case-insensitive comparison of the `name` field. + +## Validation + +Rule files are validated on load. 
Common errors: + +**Invalid frontmatter:** +``` +Error: .deepwork/rules/my-rule.md - invalid YAML frontmatter +``` + +**Missing required field:** +``` +Error: .deepwork/rules/my-rule.md - must have 'trigger', 'set', or 'pair' +``` + +**Invalid pattern:** +``` +Error: .deepwork/rules/test-coverage.md - invalid pattern "src/{path" - unclosed brace +``` + +**Conflicting fields:** +``` +Error: .deepwork/rules/my-rule.md - has both 'trigger' and 'set' - use one or the other +``` + +**Empty body:** +``` +Error: .deepwork/rules/my-rule.md - instruction rules require markdown body +``` + +## Referencing Rules in Code + +A key benefit of the `.deepwork/rules/` folder structure is that code files can reference rules directly: + +```python +# Read `.deepwork/rules/source-test-pairing.md` before editing this file + +class UserService: + """Service for user management.""" + pass +``` + +```typescript +// This file is governed by `.deepwork/rules/api-documentation.md` +// Any changes here require corresponding documentation updates + +export async function createUser(data: UserInput): Promise { + // ... +} +``` + +This helps AI agents and human developers understand which rules apply to specific files. diff --git a/doc/rules_system_design.md b/doc/rules_system_design.md new file mode 100644 index 00000000..24e296b5 --- /dev/null +++ b/doc/rules_system_design.md @@ -0,0 +1,547 @@ +# Rules System Design + +## Overview + +The deepwork rules system enables automated enforcement of development standards during AI-assisted coding sessions. This document describes the architecture for the next-generation rules system with support for: + +1. **File correspondence matching** (sets and pairs) +2. **Idempotent command execution** +3. **Stateful evaluation with queue-based processing** +4. 
**Efficient agent output management** + +## Core Concepts + +### Rule Structure + +Every rule has two orthogonal aspects: + +**Detection Mode** - How the rule decides when to fire: + +| Mode | Field | Description | +|------|-------|-------------| +| **Trigger/Safety** | `trigger`, `safety` | Fire when trigger matches and safety doesn't | +| **Set** | `set` | Fire when file correspondence is incomplete (bidirectional) | +| **Pair** | `pair` | Fire when file correspondence is incomplete (directional) | + +**Action Type** - What happens when the rule fires: + +| Type | Field | Description | +|------|-------|-------------| +| **Prompt** (default) | (markdown body) | Show instructions to the agent | +| **Command** | `action.command` | Run an idempotent command | + +### Detection Modes + +**Trigger/Safety Mode** +- Simplest mode: fire when files match `trigger` and none match `safety` +- Good for general checks like "source changed, verify README" + +**Set Mode (Bidirectional Correspondence)** +- Define N patterns that share a common variable path +- If ANY file matching one pattern changes, ALL corresponding files should change +- Example: Source files and their tests + +**Pair Mode (Directional Correspondence)** +- Define a trigger pattern and one or more expected patterns +- Changes to trigger files require corresponding expected files to also change +- Changes to expected files alone do not trigger the rule +- Example: API code requires documentation updates + +### Pattern Variables + +Patterns use `{name}` syntax for capturing variable path segments: + +``` +src/{path}.py # {path} captures everything between src/ and .py +tests/{path}_test.py # {path} must match the same value +``` + +Special variable names: +- `{path}` - Matches any path segments (equivalent to `**/*`) +- `{name}` - Matches a single path segment (equivalent to `*`) +- `{**}` - Explicit multi-segment wildcard +- `{*}` - Explicit single-segment wildcard + +### Action Types + +**Prompt Action 
(default)** +The markdown body of the rule file serves as instructions shown to the agent. + +**Command Action** +```yaml +action: + command: "ruff format {file}" + run_for: each_match +``` + +Command actions should be idempotent—running them multiple times produces the same result. Lint formatters like `black`, `ruff format`, and `prettier` are good examples. + +## Architecture + +### Component Overview + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ Rules System │ +├─────────────────────────────────────────────────────────────────┤ +│ │ +│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ +│ │ Detector │───▶│ Queue │◀───│ Evaluator │ │ +│ │ │ │ │ │ │ │ +│ │ - Watch files│ │ .deepwork/ │ │ - Process │ │ +│ │ - Match rules│ │ tmp/rules/ │ │ queued │ │ +│ │ - Create │ │ queue/ │ │ - Run action │ │ +│ │ entries │ │ │ │ - Update │ │ +│ └──────────────┘ └──────────────┘ │ status │ │ +│ └──────────────┘ │ +│ │ +│ ┌──────────────┐ ┌──────────────┐ │ +│ │ Matcher │ │ Resolver │ │ +│ │ │ │ │ │ +│ │ - Pattern │ │ - Variable │ │ +│ │ matching │ │ extraction │ │ +│ │ - Glob │ │ - Path │ │ +│ │ expansion │ │ generation │ │ +│ └──────────────┘ └──────────────┘ │ +│ │ +└─────────────────────────────────────────────────────────────────┘ +``` + +### Detector + +The detector identifies when rules should be evaluated: + +1. **Trigger Detection**: Monitors for file changes that match rule triggers +2. **Deduplication**: Computes a hash to avoid re-processing identical triggers +3. 
**Queue Entry Creation**: Creates entries for the evaluator to process + +**Trigger Hash Computation**: +```python +hash_input = f"{rule_name}:{sorted(trigger_files)}:{baseline_ref}" +trigger_hash = sha256(hash_input.encode()).hexdigest()[:12] +``` + +The baseline_ref varies by `compare_to` mode: +- `base`: merge-base commit hash +- `default_tip`: remote tip commit hash +- `prompt`: timestamp of last prompt submission + +### Queue + +The queue persists rule trigger state in `.deepwork/tmp/rules/queue/`: + +``` +.deepwork/tmp/rules/queue/ +├── {hash}.queued.json # Detected, awaiting evaluation +├── {hash}.passed.json # Evaluated, rule satisfied +├── {hash}.failed.json # Evaluated, rule not satisfied +└── {hash}.skipped.json # Safety pattern matched, skipped +``` + +**Queue Entry Schema**: +```json +{ + "rule_name": "string", + "trigger_hash": "string", + "status": "queued|passed|failed|skipped", + "created_at": "ISO8601 timestamp", + "evaluated_at": "ISO8601 timestamp or null", + "baseline_ref": "string", + "trigger_files": ["array", "of", "files"], + "expected_files": ["array", "of", "files"], + "matched_files": ["array", "of", "files"], + "action_result": { + "type": "prompt|command", + "output": "string or null", + "exit_code": "number or null" + } +} +``` + +**Queue Cleanup**: +Since `.deepwork/tmp/` is gitignored, queue entries are transient local state. No aggressive cleanup is required—entries can accumulate without causing issues. The directory can be safely deleted at any time to reset state. + +### Evaluator + +The evaluator processes queued entries: + +1. **Load Entry**: Read queued entry from disk +2. **Verify Still Relevant**: Re-check that trigger conditions still apply +3. **Execute Action**: + - For prompts: Format message and return to hook system + - For commands: Execute command, verify idempotency +4. **Update Status**: Mark as passed, failed, or skipped +5. 
**Report Results**: Return appropriate response to caller + +### Matcher + +Pattern matching with variable extraction: + +**Algorithm**: +```python +def match_pattern(pattern: str, filepath: str) -> dict[str, str] | None: + """ + Match filepath against pattern, extracting variables. + + Returns dict of {variable_name: captured_value} or None if no match. + """ + # Convert pattern to regex with named groups + # {path} -> (?P<path>.+) + # {name} -> (?P<name>[^/]+) + # Literal parts are escaped + regex = pattern_to_regex(pattern) + match = re.fullmatch(regex, filepath) + if match: + return match.groupdict() + return None +``` + +**Pattern Compilation**: +```python +def pattern_to_regex(pattern: str) -> str: + """Convert pattern with {var} placeholders to regex.""" + result = [] + for segment in parse_pattern(pattern): + if segment.is_variable: + if segment.name in ('path', '**'): + result.append(f'(?P<{segment.name}>.+)') + else: + result.append(f'(?P<{segment.name}>[^/]+)') + else: + result.append(re.escape(segment.value)) + return ''.join(result) +``` + +### Resolver + +Generates expected filepaths from patterns and captured variables: + +```python +def resolve_pattern(pattern: str, variables: dict[str, str]) -> str: + """ + Substitute variables into pattern to generate filepath. + + Example: + resolve_pattern("tests/{path}_test.py", {"path": "foo/bar"}) + -> "tests/foo/bar_test.py" + """ + result = pattern + for name, value in variables.items(): + result = result.replace(f'{{{name}}}', value) + return result +``` + +## Evaluation Flow + +### Standard Instruction Rule + +``` +1. Detector: File changes detected +2. Detector: Check each rule's trigger patterns +3. Detector: For matching rule, compute trigger hash +4. Detector: If hash not in queue, create .queued entry +5. Evaluator: Process queued entry +6. Evaluator: Check safety patterns against changed files +7. Evaluator: If safety matches, mark .skipped +8. Evaluator: If no safety match, return instructions to agent +9.
Agent: Addresses rule, includes promise tag +10. Evaluator: On next check, mark .passed (promise found) +``` + +### Correspondence Rule (Set) + +``` +1. Detector: File src/foo/bar.py changed +2. Matcher: Matches pattern "src/{path}.py" with {path}="foo/bar" +3. Resolver: Generate expected files from other patterns: + - "tests/{path}_test.py" -> "tests/foo/bar_test.py" +4. Detector: Check if tests/foo/bar_test.py also changed +5. Detector: If yes, mark .skipped (correspondence satisfied) +6. Detector: If no, create .queued entry +7. Evaluator: Return instructions prompting for test update +``` + +### Correspondence Rule (Pair) + +``` +1. Detector: File api/users.py changed (trigger pattern) +2. Matcher: Matches "api/{path}.py" with {path}="users" +3. Resolver: Generate expected: "docs/api/users.md" +4. Detector: Check if docs/api/users.md also changed +5. Detector: If yes, mark .skipped +6. Detector: If no, create .queued entry +7. Evaluator: Return instructions + +Note: If only docs/api/users.md changed (not api/users.py), +the pair rule does NOT trigger (directional). +``` + +### Command Rule + +``` +1. Detector: Python file changed, matches "**/*.py" +2. Detector: Create .queued entry for format rule +3. Evaluator: Execute "ruff format {file}" +4. Evaluator: Run git diff to check for changes +5. Evaluator: If changes made, re-run command (idempotency check) +6. Evaluator: If no additional changes, mark .passed +7. Evaluator: If changes keep occurring, mark .failed, alert user +``` + +## Agent Output Management + +### Problem + +When many rules trigger, the agent receives excessive output, degrading performance. + +### Solution + +**1. Output Batching** +Group related rules into concise sections: + +``` +The following rules require attention: + +## Source/Test Pairing +src/auth/login.py → tests/auth/login_test.py +src/api/users.py → tests/api/users_test.py + +## API Documentation +api/users.py → docs/api/users.md + +## README Accuracy +Source files changed.
Verify README.md is accurate. +``` + +**2. Grouped by Rule Name** +Multiple violations of the same rule are grouped together under a single heading, keeping output compact. + +**3. Minimal Decoration** +Avoid excessive formatting, numbering, or emphasis. Use simple arrow notation for correspondence violations. + +## State Persistence + +### Directory Structure + +``` +.deepwork/ +├── rules/ # Rule definitions (frontmatter markdown) +│ ├── readme-accuracy.md +│ ├── source-test-pairing.md +│ ├── api-documentation.md +│ └── python-formatting.md +├── tmp/ # GITIGNORED - transient state +│ └── rules/ +│ ├── queue/ # Queue entries +│ │ ├── abc123.queued.json +│ │ └── def456.passed.json +│ ├── baselines/ # Cached baseline states +│ │ └── prompt_1705420800.json +│ └── cache/ # Pattern matching cache +│ └── patterns.json +└── rules_state.json # Session state summary +``` + +**Important:** The entire `.deepwork/tmp/` directory is gitignored. All queue entries, baselines, and caches are local transient state that is not committed. This means cleanup is not critical—files can accumulate and will be naturally cleaned when the directory is deleted or the repo is re-cloned. + +### Rule File Format + +Each rule is a markdown file with YAML frontmatter: + +```markdown +--- +name: README Accuracy +trigger: src/**/*.py +safety: README.md +--- +Instructions shown to the agent when this rule fires. + +These can be multi-line with full markdown formatting. +``` + +This format enables: +1. Code files to reference rules in comments +2. Human-readable rule documentation +3. Easy editing with any markdown editor +4. Clear separation of configuration and content + +### Baseline Management + +For `compare_to: prompt`, baselines are captured at prompt submission: + +```json +{ + "timestamp": "2024-01-16T12:00:00Z", + "commit": "abc123", + "staged_files": ["file1.py", "file2.py"], + "untracked_files": ["file3.py"] +} +``` + +Multiple baselines can exist for different prompts in a session. 
+ +### Queue Lifecycle + +``` + ┌─────────┐ + │ Created │ + │ .queued │ + └────┬────┘ + │ + ┌─────────────┼─────────────┐ + │ │ │ + ▼ ▼ ▼ + ┌─────────┐ ┌─────────┐ ┌─────────┐ + │ .passed │ │ .failed │ │.skipped │ + └─────────┘ └─────────┘ └─────────┘ +``` + +Terminal states persist in `.deepwork/tmp/` (gitignored) until manually cleared or the directory is deleted. + +## Error Handling + +### Pattern Errors + +Invalid patterns are caught at rule load time: + +```python +class PatternError(RulesError): + """Invalid pattern syntax.""" + pass + +# Validation +def validate_pattern(pattern: str) -> None: + # Check for unbalanced braces + # Check for invalid variable names + # Check for unsupported syntax +``` + +### Command Errors + +Command execution errors are captured and reported: + +```json +{ + "status": "failed", + "action_result": { + "type": "command", + "command": "ruff format {file}", + "exit_code": 1, + "stdout": "", + "stderr": "error: invalid syntax in foo.py:10" + } +} +``` + +### Queue Corruption + +If queue entries become corrupted: +1. Log error with entry details +2. Remove corrupted entry +3. Re-detect triggers on next evaluation + +## Configuration + +### Rule Files + +Rules are stored in `.deepwork/rules/` as individual markdown files with YAML frontmatter. See `doc/rules_syntax.md` for complete syntax documentation. + +**Loading Order:** +1. All `.md` files in `.deepwork/rules/` are loaded +2. Files are processed in alphabetical order +3. 
Filename (without extension) becomes rule identifier + +**Rule Discovery:** +```python +def load_rules(rules_dir: Path) -> list[Rule]: + """Load all rules from the rules directory.""" + rules = [] + for path in sorted(rules_dir.glob("*.md")): + rule = parse_rule_file(path) + rule.name = path.stem # filename without .md + rules.append(rule) + return rules +``` + +### System Configuration + +In `.deepwork/config.yml`: + +```yaml +rules: + enabled: true + rules_dir: .deepwork/rules # Can be customized +``` + +## Performance Considerations + +### Caching + +- Pattern compilation is cached per-session +- Baseline diffs are cached by commit hash +- Queue lookups use hash-based O(1) access + +### Lazy Evaluation + +- Patterns only compiled when needed +- File lists only computed for triggered rules +- Instructions only loaded when rule fires + +### Parallel Processing + +- Multiple queue entries can be processed in parallel +- Command actions can run concurrently (with file locking) +- Pattern matching is parallelized across rules + +## Migration from Legacy System + +The legacy system used a single `.deepwork.rules.yml` file with array of rules. The new system uses individual markdown files in `.deepwork/rules/`. + +**Breaking Changes:** +- Single YAML file replaced with folder of markdown files +- Rule `name` field replaced with filename +- `instructions` / `instructions_file` replaced with markdown body +- New features: sets, pairs, commands, queue-based state + +**No backwards compatibility is provided.** Existing `.deepwork.rules.yml` files must be converted manually. + +**Conversion Example:** + +Old format (`.deepwork.rules.yml`): +```yaml +- name: "README Accuracy" + trigger: "src/**/*" + safety: "README.md" + instructions: | + Please verify README.md is accurate. +``` + +New format (`.deepwork/rules/readme-accuracy.md`): +```markdown +--- +trigger: src/**/* +safety: README.md +--- +Please verify README.md is accurate. 
+``` + +## Security Considerations + +### Command Execution + +- Commands run in sandboxed subprocess +- No shell expansion (arguments passed as array) +- Working directory is always repo root +- Environment variables are filtered + +### Queue File Permissions + +- Queue directory: 700 (owner only) +- Queue files: 600 (owner only) +- No sensitive data in queue entries + +### Input Validation + +- All rule files validated against schema +- Pattern variables sanitized before use +- File paths normalized and validated diff --git a/manual_tests/README.md b/manual_tests/README.md new file mode 100644 index 00000000..7baaddb4 --- /dev/null +++ b/manual_tests/README.md @@ -0,0 +1,66 @@ +# Manual Hook/Rule Tests for Claude + +This directory contains files designed to manually test different types of deepwork rules/hooks. +Each test must verify BOTH that the rule fires when it should AND does not fire when it shouldn't. + +## How to Run These Tests + +**The best way to run these tests is as sub-agents using a fast model (e.g., haiku).** + +This approach works because: +1. Sub-agents run in isolated contexts where changes can be detected +2. The Stop hook evaluates rules when the sub-agent completes +3. Using a fast model keeps test iterations quick and cheap + +After each sub-agent returns, run the hook to verify: +```bash +echo '{}' | python -m deepwork.hooks.rules_check +``` + +Then revert changes before the next test: +```bash +git checkout -- manual_tests/ +``` + +## Test Matrix + +Each test has two cases: one where the rule SHOULD fire, and one where it should NOT. 
+ +| Test | Should Fire | Should NOT Fire | Rule Name | +|------|-------------|-----------------|-----------| +| **Trigger/Safety** | Edit `.py` only | Edit `.py` AND `_doc.md` | Manual Test: Trigger Safety | +| **Set Mode** | Edit `_source.py` only | Edit `_source.py` AND `_test.py` | Manual Test: Set Mode | +| **Pair Mode** | Edit `_trigger.py` only | Edit `_trigger.py` AND `_expected.md` | Manual Test: Pair Mode | +| **Pair Mode (reverse)** | — | Edit `_expected.md` only (should NOT fire) | Manual Test: Pair Mode | +| **Command Action** | Edit `.txt` → log appended | — (always runs) | Manual Test: Command Action | +| **Multi Safety** | Edit `.py` only | Edit `.py` AND any safety file | Manual Test: Multi Safety | + +## Test Results Tracking + +| Test Case | Fires When Should | Does NOT Fire When Shouldn't | +|-----------|:-----------------:|:----------------------------:| +| Trigger/Safety | ☐ | ☐ | +| Set Mode | ☐ | ☐ | +| Pair Mode (forward) | ☐ | ☐ | +| Pair Mode (reverse - expected only) | — | ☐ | +| Command Action | ☐ | — | +| Multi Safety | ☐ | ☐ | + +## Test Folders + +| Folder | Rule Type | Description | +|--------|-----------|-------------| +| `test_trigger_safety_mode/` | Trigger/Safety | Basic conditional: fires unless safety file also edited | +| `test_set_mode/` | Set (Bidirectional) | Files must change together (either direction) | +| `test_pair_mode/` | Pair (Directional) | One-way: trigger requires expected, but not vice versa | +| `test_command_action/` | Command Action | Automatically runs command on file change | +| `test_multi_safety/` | Multiple Safety | Fires unless ANY of the safety files also edited | + +## Corresponding Rules + +Rules are defined in `.deepwork/rules/`: +- `manual-test-trigger-safety.md` +- `manual-test-set-mode.md` +- `manual-test-pair-mode.md` +- `manual-test-command-action.md` +- `manual-test-multi-safety.md` diff --git a/manual_tests/test_command_action/test_command_action.txt 
b/manual_tests/test_command_action/test_command_action.txt new file mode 100644 index 00000000..f32315ab --- /dev/null +++ b/manual_tests/test_command_action/test_command_action.txt @@ -0,0 +1,25 @@ +MANUAL TEST: Command Action Rule + +=== WHAT THIS TESTS === +Tests the "command action" feature where a rule automatically +runs a shell command instead of prompting the agent. + +=== HOW TO TRIGGER === +Edit this file (add text, modify content, etc.) + +=== EXPECTED BEHAVIOR === +When this file is edited, the rule automatically runs a command +that appends a timestamped line to test_command_action_log.txt + +The command is idempotent: running it multiple times produces +consistent results (a log entry is appended). + +=== RULE LOCATION === +.deepwork/rules/manual-test-command-action.md + +=== LOG FILE === +Check test_command_action_log.txt for command execution results. + +--- +Edit below this line to trigger the command: +--- diff --git a/manual_tests/test_command_action/test_command_action_log.txt b/manual_tests/test_command_action/test_command_action_log.txt new file mode 100644 index 00000000..1ca155ed --- /dev/null +++ b/manual_tests/test_command_action/test_command_action_log.txt @@ -0,0 +1,3 @@ +# Command Action Log +# Lines below are added automatically when test_command_action.txt is edited +# --- diff --git a/manual_tests/test_multi_safety/test_multi_safety.py b/manual_tests/test_multi_safety/test_multi_safety.py new file mode 100644 index 00000000..40cd981c --- /dev/null +++ b/manual_tests/test_multi_safety/test_multi_safety.py @@ -0,0 +1,43 @@ +""" +MANUAL TEST: Multiple Safety Patterns + +=== WHAT THIS TESTS === +Tests trigger/safety mode with MULTIPLE safety patterns: +- Rule fires when this file is edited alone +- Rule is suppressed if ANY of the safety files are also edited: + - test_multi_safety_changelog.md + - test_multi_safety_version.txt + +=== TEST CASE 1: Rule SHOULD fire === +1. Edit this file (add a comment below the marker) +2. 
Do NOT edit any safety files +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. Expected: "Manual Test: Multi Safety" appears in output + +=== TEST CASE 2: Rule should NOT fire (changelog edited) === +1. Edit this file (add a comment below the marker) +2. ALSO edit test_multi_safety_changelog.md +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. Expected: "Manual Test: Multi Safety" does NOT appear + +=== TEST CASE 3: Rule should NOT fire (version edited) === +1. Edit this file (add a comment below the marker) +2. ALSO edit test_multi_safety_version.txt +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. Expected: "Manual Test: Multi Safety" does NOT appear + +=== RULE LOCATION === +.deepwork/rules/manual-test-multi-safety.md +""" + + +VERSION = "1.0.0" + + +def get_version(): + """Return the current version.""" + return VERSION + + +# Edit below this line to trigger the rule +# ------------------------------------------- diff --git a/manual_tests/test_multi_safety/test_multi_safety_changelog.md b/manual_tests/test_multi_safety/test_multi_safety_changelog.md new file mode 100644 index 00000000..d0a6e4f9 --- /dev/null +++ b/manual_tests/test_multi_safety/test_multi_safety_changelog.md @@ -0,0 +1,16 @@ +# Changelog (Multi-Safety Test) + +## What This File Does + +This is one of the "safety" files for the multi-safety test. +Editing this file suppresses the rule when the source is edited. + +## Changelog + +### v1.0.0 +- Initial release + +--- + +Edit below this line to suppress the multi-safety rule: + diff --git a/manual_tests/test_multi_safety/test_multi_safety_version.txt b/manual_tests/test_multi_safety/test_multi_safety_version.txt new file mode 100644 index 00000000..b9cf607d --- /dev/null +++ b/manual_tests/test_multi_safety/test_multi_safety_version.txt @@ -0,0 +1,10 @@ +Multi-Safety Version File + +This is one of the "safety" files for the multi-safety test. +Editing this file suppresses the rule when the source is edited. 
+ +Current Version: 1.0.0 + +--- +Edit below this line to suppress the multi-safety rule: +--- diff --git a/manual_tests/test_pair_mode/test_pair_mode_expected.md b/manual_tests/test_pair_mode/test_pair_mode_expected.md new file mode 100644 index 00000000..b4f286bd --- /dev/null +++ b/manual_tests/test_pair_mode/test_pair_mode_expected.md @@ -0,0 +1,31 @@ +# API Documentation (Pair Mode Expected File) + +## What This File Does + +This is the "expected" file in a pair mode rule. + +## Pair Mode Behavior + +- When `test_pair_mode_trigger.py` changes, this file MUST also change +- When THIS file changes alone, NO rule fires (docs can update independently) + +## API Reference + +### `api_endpoint()` + +Returns a status response. + +**Returns:** `{"status": "ok", "message": "API response"}` + +--- + +## Testing Instructions + +1. To TRIGGER the rule: Edit only `test_pair_mode_trigger.py` +2. To verify ONE-WAY: Edit only this file (rule should NOT fire) +3. To SATISFY the rule: Edit both files together + +--- + +Edit below this line (editing here alone should NOT trigger the rule): + diff --git a/manual_tests/test_pair_mode/test_pair_mode_trigger.py b/manual_tests/test_pair_mode/test_pair_mode_trigger.py new file mode 100644 index 00000000..369dd18a --- /dev/null +++ b/manual_tests/test_pair_mode/test_pair_mode_trigger.py @@ -0,0 +1,47 @@ +""" +MANUAL TEST: Pair Mode (Directional Correspondence) + +=== WHAT THIS TESTS === +Tests the "pair" detection mode where there's a ONE-WAY relationship: +- This file is the TRIGGER +- test_pair_mode_expected.md is the EXPECTED file +- When THIS file changes, the expected file MUST also change +- But the expected file CAN change independently (no rule fires) + +=== TEST CASE 1: Rule SHOULD fire === +1. Edit this file (add a comment below the marker) +2. Do NOT edit test_pair_mode_expected.md +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. 
Expected: "Manual Test: Pair Mode" appears in output + +=== TEST CASE 2: Rule should NOT fire (both edited) === +1. Edit this file (add a comment below the marker) +2. ALSO edit test_pair_mode_expected.md +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. Expected: "Manual Test: Pair Mode" does NOT appear + +=== TEST CASE 3: Rule should NOT fire (expected only) === +1. Do NOT edit this file +2. Edit ONLY test_pair_mode_expected.md +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. Expected: "Manual Test: Pair Mode" does NOT appear + (This verifies the ONE-WAY nature of pair mode) + +=== RULE LOCATION === +.deepwork/rules/manual-test-pair-mode.md +""" + + +def api_endpoint(): + """ + An API endpoint that requires documentation. + + This simulates an API file where changes require + documentation updates, but docs can be updated + independently (for typos, clarifications, etc.) + """ + return {"status": "ok", "message": "API response"} + + +# Edit below this line to trigger the rule +# ------------------------------------------- diff --git a/manual_tests/test_set_mode/test_set_mode_source.py b/manual_tests/test_set_mode/test_set_mode_source.py new file mode 100644 index 00000000..6649e424 --- /dev/null +++ b/manual_tests/test_set_mode/test_set_mode_source.py @@ -0,0 +1,40 @@ +""" +MANUAL TEST: Set Mode (Bidirectional Correspondence) + +=== WHAT THIS TESTS === +Tests the "set" detection mode where files must change together: +- This source file and test_set_mode_test.py are in a "set" +- If EITHER file changes, the OTHER must also change +- This is BIDIRECTIONAL (works in both directions) + +=== TEST CASE 1: Rule SHOULD fire === +1. Edit this file (add a comment below the marker) +2. Do NOT edit test_set_mode_test.py +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. Expected: "Manual Test: Set Mode" appears in output + +=== TEST CASE 2: Rule should NOT fire === +1. Edit this file (add a comment below the marker) +2. 
ALSO edit test_set_mode_test.py +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. Expected: "Manual Test: Set Mode" does NOT appear + +=== RULE LOCATION === +.deepwork/rules/manual-test-set-mode.md +""" + + +class Calculator: + """A simple calculator for testing set mode.""" + + def add(self, a: int, b: int) -> int: + """Add two numbers.""" + return a + b + + def subtract(self, a: int, b: int) -> int: + """Subtract b from a.""" + return a - b + + +# Edit below this line to trigger the rule +# ------------------------------------------- diff --git a/manual_tests/test_set_mode/test_set_mode_test.py b/manual_tests/test_set_mode/test_set_mode_test.py new file mode 100644 index 00000000..3ef349e4 --- /dev/null +++ b/manual_tests/test_set_mode/test_set_mode_test.py @@ -0,0 +1,37 @@ +""" +MANUAL TEST: Set Mode - Test File (Bidirectional Correspondence) + +=== WHAT THIS TESTS === +This is the TEST file for the set mode test. +It must change together with test_set_mode_source.py. + +=== HOW TO TRIGGER === +Option A: Edit this file alone (without test_set_mode_source.py) +Option B: Edit test_set_mode_source.py alone (without this file) + +=== EXPECTED BEHAVIOR === +- Edit this file alone -> Rule fires, expects source file to also change +- Edit source file alone -> Rule fires, expects this file to also change +- Edit BOTH files -> Rule is satisfied (no fire) + +=== RULE LOCATION === +.deepwork/rules/manual-test-set-mode.md +""" + +from test_set_mode_source import Calculator + + +def test_add(): + """Test the add method.""" + calc = Calculator() + assert calc.add(2, 3) == 5 + + +def test_subtract(): + """Test the subtract method.""" + calc = Calculator() + assert calc.subtract(5, 3) == 2 + + +# Edit below this line to trigger the rule +# ------------------------------------------- diff --git a/manual_tests/test_trigger_safety_mode/test_trigger_safety_mode.py b/manual_tests/test_trigger_safety_mode/test_trigger_safety_mode.py new file mode 100644 index 
00000000..68bf59b0 --- /dev/null +++ b/manual_tests/test_trigger_safety_mode/test_trigger_safety_mode.py @@ -0,0 +1,32 @@ +""" +MANUAL TEST: Trigger/Safety Mode Rule + +=== WHAT THIS TESTS === +Tests the basic trigger/safety detection mode where: +- Rule FIRES when this file is edited alone +- Rule is SUPPRESSED when test_trigger_safety_mode_doc.md is also edited + +=== TEST CASE 1: Rule SHOULD fire === +1. Edit this file (add a comment below the marker) +2. Do NOT edit test_trigger_safety_mode_doc.md +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. Expected: "Manual Test: Trigger Safety" appears in output + +=== TEST CASE 2: Rule should NOT fire === +1. Edit this file (add a comment below the marker) +2. ALSO edit test_trigger_safety_mode_doc.md +3. Run: echo '{}' | python -m deepwork.hooks.rules_check +4. Expected: "Manual Test: Trigger Safety" does NOT appear + +=== RULE LOCATION === +.deepwork/rules/manual-test-trigger-safety.md +""" + + +def example_function(): + """An example function to demonstrate the trigger.""" + return "Hello from trigger safety test" + + +# Edit below this line to trigger the rule +# ------------------------------------------- diff --git a/manual_tests/test_trigger_safety_mode/test_trigger_safety_mode_doc.md b/manual_tests/test_trigger_safety_mode/test_trigger_safety_mode_doc.md new file mode 100644 index 00000000..625cf0b5 --- /dev/null +++ b/manual_tests/test_trigger_safety_mode/test_trigger_safety_mode_doc.md @@ -0,0 +1,20 @@ +# Documentation for Trigger Safety Test + +## What This File Does + +This is the "safety" file for the trigger/safety mode test. + +## How It Works + +When this file is edited ALONGSIDE `test_trigger_safety_mode.py`, +the trigger/safety rule is suppressed (does not fire). + +## Testing + +1. To TRIGGER the rule: Edit only `test_trigger_safety_mode.py` +2. 
To SUPPRESS the rule: Edit both files together + +--- + +Edit below this line to suppress the trigger/safety rule: + diff --git a/pyproject.toml b/pyproject.toml index f3d38afd..d84e3edb 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "deepwork" -version = "0.3.0" +version = "0.4.0" description = "Framework for enabling AI agents to perform complex, multi-step work tasks" readme = "README.md" requires-python = ">=3.11" diff --git a/src/deepwork/cli/install.py b/src/deepwork/cli/install.py index d65c5a4e..1eaca748 100644 --- a/src/deepwork/cli/install.py +++ b/src/deepwork/cli/install.py @@ -73,9 +73,9 @@ def _inject_deepwork_jobs(jobs_dir: Path, project_path: Path) -> None: _inject_standard_job("deepwork_jobs", jobs_dir, project_path) -def _inject_deepwork_policy(jobs_dir: Path, project_path: Path) -> None: +def _inject_deepwork_rules(jobs_dir: Path, project_path: Path) -> None: """ - Inject the deepwork_policy job definition into the project. + Inject the deepwork_rules job definition into the project. 
Args: jobs_dir: Path to .deepwork/jobs directory @@ -84,7 +84,7 @@ def _inject_deepwork_policy(jobs_dir: Path, project_path: Path) -> None: Raises: InstallError: If injection fails """ - _inject_standard_job("deepwork_policy", jobs_dir, project_path) + _inject_standard_job("deepwork_rules", jobs_dir, project_path) def _create_deepwork_gitignore(deepwork_dir: Path) -> None: @@ -98,7 +98,7 @@ def _create_deepwork_gitignore(deepwork_dir: Path) -> None: """ gitignore_path = deepwork_dir / ".gitignore" gitignore_content = """# DeepWork temporary files -# These files are used for policy evaluation during sessions +# These files are used for rules evaluation during sessions .last_work_tree """ @@ -113,44 +113,83 @@ def _create_deepwork_gitignore(deepwork_dir: Path) -> None: gitignore_path.write_text(gitignore_content) -def _create_default_policy_file(project_path: Path) -> bool: +def _create_rules_directory(project_path: Path) -> bool: """ - Create a default policy file template in the project root. + Create the v2 rules directory structure with example templates. - Only creates the file if it doesn't already exist. + Creates .deepwork/rules/ with example rule files that users can customize. + Only creates the directory if it doesn't already exist. 
Args: project_path: Path to the project root Returns: - True if the file was created, False if it already existed + True if the directory was created, False if it already existed """ - policy_file = project_path / ".deepwork.policy.yml" + rules_dir = project_path / ".deepwork" / "rules" - if policy_file.exists(): + if rules_dir.exists(): return False - # Copy the template from the templates directory - template_path = Path(__file__).parent.parent / "templates" / "default_policy.yml" + # Create the rules directory + ensure_dir(rules_dir) - if template_path.exists(): - shutil.copy(template_path, policy_file) - else: - # Fallback: create a minimal template inline - policy_file.write_text( - """# DeepWork Policy Configuration -# -# Policies are automated guardrails that trigger when specific files change. -# Use /deepwork_policy.define to create new policies interactively. -# -# Format: -# - name: "Policy name" -# trigger: "glob/pattern/**/*" -# safety: "optional/pattern/**/*" -# instructions: | -# Instructions for the AI agent... + # Copy example rule templates from the deepwork_rules standard job + example_rules_dir = Path(__file__).parent.parent / "standard_jobs" / "deepwork_rules" / "rules" + + if example_rules_dir.exists(): + # Copy all .example files + for example_file in example_rules_dir.glob("*.md.example"): + dest_file = rules_dir / example_file.name + shutil.copy(example_file, dest_file) + + # Create a README file explaining the rules system + readme_content = """# DeepWork Rules + +Rules are automated guardrails that trigger when specific files change during +AI agent sessions. They help ensure documentation stays current, security reviews +happen, and team guidelines are followed. + +## Getting Started + +1. Copy an example file and rename it (remove the `.example` suffix): + ``` + cp readme-documentation.md.example readme-documentation.md + ``` + +2. Edit the file to match your project's patterns + +3. 
The rule will automatically trigger when matching files change + +## Rule Format + +Rules use YAML frontmatter in markdown files: + +```markdown +--- +name: Rule Name +trigger: "pattern/**/*" +safety: "optional/pattern" +--- +Instructions in markdown here. +``` + +## Detection Modes + +- **trigger/safety**: Fire when trigger matches, unless safety also matches +- **set**: Bidirectional file correspondence (e.g., source + test) +- **pair**: Directional correspondence (e.g., API code -> docs) + +## Documentation + +See `doc/rules_syntax.md` in the DeepWork repository for full syntax documentation. + +## Creating Rules Interactively + +Use `/deepwork_rules.define` to create new rules with guidance. """ - ) + readme_path = rules_dir / "README.md" + readme_path.write_text(readme_content) return True @@ -271,17 +310,17 @@ def _install_deepwork(platform_name: str | None, project_path: Path) -> None: # Step 3b: Inject standard jobs (core job definitions) console.print("[yellow]→[/yellow] Installing core job definitions...") _inject_deepwork_jobs(jobs_dir, project_path) - _inject_deepwork_policy(jobs_dir, project_path) + _inject_deepwork_rules(jobs_dir, project_path) # Step 3c: Create .gitignore for temporary files _create_deepwork_gitignore(deepwork_dir) console.print(" [green]✓[/green] Created .deepwork/.gitignore") - # Step 3d: Create default policy file template - if _create_default_policy_file(project_path): - console.print(" [green]✓[/green] Created .deepwork.policy.yml template") + # Step 3d: Create rules directory with v2 templates + if _create_rules_directory(project_path): + console.print(" [green]✓[/green] Created .deepwork/rules/ with example templates") else: - console.print(" [dim]•[/dim] .deepwork.policy.yml already exists") + console.print(" [dim]•[/dim] .deepwork/rules/ already exists") # Step 4: Load or create config.yml console.print("[yellow]→[/yellow] Updating configuration...") diff --git a/src/deepwork/core/command_executor.py 
b/src/deepwork/core/command_executor.py new file mode 100644 index 00000000..629b4f50 --- /dev/null +++ b/src/deepwork/core/command_executor.py @@ -0,0 +1,173 @@ +"""Execute command actions for rules.""" + +import shlex +import subprocess +from dataclasses import dataclass +from pathlib import Path + +from deepwork.core.rules_parser import CommandAction + + +@dataclass +class CommandResult: + """Result of executing a command.""" + + success: bool + exit_code: int + stdout: str + stderr: str + command: str # The actual command that was run + + +def substitute_command_variables( + command_template: str, + file: str | None = None, + files: list[str] | None = None, + repo_root: Path | None = None, +) -> str: + """ + Substitute template variables in a command string. + + Variables: + - {file} - Single file path + - {files} - Space-separated file paths + - {repo_root} - Repository root directory + + Args: + command_template: Command string with {var} placeholders + file: Single file path (for run_for: each_match) + files: List of file paths (for run_for: all_matches) + repo_root: Repository root path + + Returns: + Command string with variables substituted + """ + result = command_template + + if file is not None: + # Quote file path to prevent command injection + result = result.replace("{file}", shlex.quote(file)) + + if files is not None: + # Quote each file path individually + quoted_files = " ".join(shlex.quote(f) for f in files) + result = result.replace("{files}", quoted_files) + + if repo_root is not None: + result = result.replace("{repo_root}", shlex.quote(str(repo_root))) + + return result + + +def execute_command( + command: str, + cwd: Path | None = None, + timeout: int = 60, +) -> CommandResult: + """ + Execute a command and capture output. 
+ + Args: + command: Command string to execute + cwd: Working directory (defaults to current directory) + timeout: Timeout in seconds + + Returns: + CommandResult with execution details + """ + try: + # Run command as shell to support pipes, etc. + result = subprocess.run( + command, + shell=True, + cwd=cwd, + capture_output=True, + text=True, + timeout=timeout, + ) + + return CommandResult( + success=result.returncode == 0, + exit_code=result.returncode, + stdout=result.stdout, + stderr=result.stderr, + command=command, + ) + + except subprocess.TimeoutExpired: + return CommandResult( + success=False, + exit_code=-1, + stdout="", + stderr=f"Command timed out after {timeout} seconds", + command=command, + ) + except Exception as e: + return CommandResult( + success=False, + exit_code=-1, + stdout="", + stderr=str(e), + command=command, + ) + + +def run_command_action( + action: CommandAction, + trigger_files: list[str], + repo_root: Path | None = None, +) -> list[CommandResult]: + """ + Run a command action for the given trigger files. 
+ + Args: + action: CommandAction configuration + trigger_files: Files that triggered the rule + repo_root: Repository root path + + Returns: + List of CommandResult (one per command execution) + """ + results: list[CommandResult] = [] + + if action.run_for == "each_match": + # Run command for each file individually + for file_path in trigger_files: + command = substitute_command_variables( + action.command, + file=file_path, + repo_root=repo_root, + ) + result = execute_command(command, cwd=repo_root) + results.append(result) + + elif action.run_for == "all_matches": + # Run command once with all files + command = substitute_command_variables( + action.command, + files=trigger_files, + repo_root=repo_root, + ) + result = execute_command(command, cwd=repo_root) + results.append(result) + + return results + + +def all_commands_succeeded(results: list[CommandResult]) -> bool: + """Check if all command executions succeeded.""" + return all(r.success for r in results) + + +def format_command_errors(results: list[CommandResult]) -> str: + """Format error messages from failed commands.""" + errors: list[str] = [] + for result in results: + if not result.success: + msg = f"Command failed: {result.command}\n" + if result.stderr: + msg += f"Error: {result.stderr}\n" + if result.exit_code != 0: + msg += f"Exit code: {result.exit_code}\n" + errors.append(msg) + return "\n".join(errors) diff --git a/src/deepwork/core/hooks_syncer.py b/src/deepwork/core/hooks_syncer.py index 65257ec2..5df2e74f 100644 --- a/src/deepwork/core/hooks_syncer.py +++ b/src/deepwork/core/hooks_syncer.py @@ -19,27 +19,42 @@ class HooksSyncError(Exception): class HookEntry: """Represents a single hook entry for a lifecycle event.""" - script: str # Script filename job_name: str # Job that provides this hook job_dir: Path # Full path to job directory + script: str | None = None # Script filename (if script-based hook) + module: str | None = None # Python module (if module-based hook) - def 
get_script_path(self, project_path: Path) -> str: + def get_command(self, project_path: Path) -> str: """ - Get the script path relative to project root. + Get the command to run this hook. Args: project_path: Path to project root Returns: - Relative path to script from project root + Command string to execute """ - # Script path is: .deepwork/jobs/{job_name}/hooks/{script} - script_path = self.job_dir / "hooks" / self.script - try: - return str(script_path.relative_to(project_path)) - except ValueError: - # If not relative, return the full path - return str(script_path) + if self.module: + # Python module - run directly with python -m + return f"python -m {self.module}" + elif self.script: + # Script path is: .deepwork/jobs/{job_name}/hooks/{script} + script_path = self.job_dir / "hooks" / self.script + try: + return str(script_path.relative_to(project_path)) + except ValueError: + # If not relative, return the full path + return str(script_path) + else: + raise ValueError("HookEntry must have either script or module") + + +@dataclass +class HookSpec: + """Specification for a single hook (either script or module).""" + + script: str | None = None + module: str | None = None @dataclass @@ -48,7 +63,7 @@ class JobHooks: job_name: str job_dir: Path - hooks: dict[str, list[str]] = field(default_factory=dict) # event -> [scripts] + hooks: dict[str, list[HookSpec]] = field(default_factory=dict) # event -> [HookSpec] @classmethod def from_job_dir(cls, job_dir: Path) -> "JobHooks | None": @@ -74,13 +89,23 @@ def from_job_dir(cls, job_dir: Path) -> "JobHooks | None": if not data or not isinstance(data, dict): return None - # Parse hooks - each key is an event, value is list of scripts - hooks: dict[str, list[str]] = {} - for event, scripts in data.items(): - if isinstance(scripts, list): - hooks[event] = [str(s) for s in scripts] - elif isinstance(scripts, str): - hooks[event] = [scripts] + # Parse hooks - each key is an event, value is list of scripts or module specs + 
hooks: dict[str, list[HookSpec]] = {} + for event, entries in data.items(): + if not isinstance(entries, list): + entries = [entries] + + hook_specs: list[HookSpec] = [] + for entry in entries: + if isinstance(entry, str): + # Simple script filename + hook_specs.append(HookSpec(script=entry)) + elif isinstance(entry, dict) and "module" in entry: + # Python module specification + hook_specs.append(HookSpec(module=entry["module"])) + + if hook_specs: + hooks[event] = hook_specs if not hooks: return None @@ -134,17 +159,18 @@ def merge_hooks_for_platform( merged: dict[str, list[dict[str, Any]]] = {} for job_hooks in job_hooks_list: - for event, scripts in job_hooks.hooks.items(): + for event, hook_specs in job_hooks.hooks.items(): if event not in merged: merged[event] = [] - for script in scripts: + for spec in hook_specs: entry = HookEntry( - script=script, job_name=job_hooks.job_name, job_dir=job_hooks.job_dir, + script=spec.script, + module=spec.module, ) - script_path = entry.get_script_path(project_path) + command = entry.get_command(project_path) # Create hook configuration for Claude Code format hook_config = { @@ -152,13 +178,13 @@ def merge_hooks_for_platform( "hooks": [ { "type": "command", - "command": script_path, + "command": command, } ], } # Check if this hook is already present (avoid duplicates) - if not _hook_already_present(merged[event], script_path): + if not _hook_already_present(merged[event], command): merged[event].append(hook_config) return merged diff --git a/src/deepwork/core/pattern_matcher.py b/src/deepwork/core/pattern_matcher.py new file mode 100644 index 00000000..c82ec723 --- /dev/null +++ b/src/deepwork/core/pattern_matcher.py @@ -0,0 +1,271 @@ +"""Pattern matching with variable extraction for rule file correspondence.""" + +import re +from dataclasses import dataclass +from fnmatch import fnmatch + + +class PatternError(Exception): + """Exception raised for invalid pattern syntax.""" + + pass + + +@dataclass +class MatchResult: + 
"""Result of matching a file against a pattern.""" + + matched: bool + variables: dict[str, str] # Captured variable values + + @classmethod + def no_match(cls) -> "MatchResult": + return cls(matched=False, variables={}) + + @classmethod + def match(cls, variables: dict[str, str] | None = None) -> "MatchResult": + return cls(matched=True, variables=variables or {}) + + +def validate_pattern(pattern: str) -> None: + """ + Validate pattern syntax. + + Raises: + PatternError: If pattern has invalid syntax + """ + # Check for unbalanced braces + brace_depth = 0 + for i, char in enumerate(pattern): + if char == "{": + brace_depth += 1 + elif char == "}": + brace_depth -= 1 + if brace_depth < 0: + raise PatternError(f"Unmatched closing brace at position {i}") + + if brace_depth > 0: + raise PatternError("Unclosed brace in pattern") + + # Extract and validate variable names + var_pattern = r"\{([^}]*)\}" + seen_vars: set[str] = set() + + for match in re.finditer(var_pattern, pattern): + var_name = match.group(1) + + # Check for empty variable name + if not var_name: + raise PatternError("Empty variable name in pattern") + + # Strip leading ** or * for validation + clean_name = var_name.lstrip("*") + if not clean_name: + # Just {*} or {**} is valid + continue + + # Check for invalid characters in variable name + if "/" in clean_name or "\\" in clean_name: + raise PatternError(f"Invalid character in variable name: {var_name}") + + # Check for duplicates (use clean name for comparison) + if clean_name in seen_vars: + raise PatternError(f"Duplicate variable: {clean_name}") + seen_vars.add(clean_name) + + +def pattern_to_regex(pattern: str) -> tuple[str, list[str]]: + """ + Convert a pattern with {var} placeholders to a regex. 
+ + Variables: + - {path} or {**name} - Matches multiple path segments (.+) + - {name} or {*name} - Matches single path segment ([^/]+) + + Args: + pattern: Pattern string like "src/{path}.py" + + Returns: + Tuple of (regex_pattern, list_of_variable_names) + + Raises: + PatternError: If pattern has invalid syntax + """ + validate_pattern(pattern) + + # Normalize path separators + pattern = pattern.replace("\\", "/") + + result: list[str] = [] + var_names: list[str] = [] + pos = 0 + + # Parse pattern segments + while pos < len(pattern): + # Look for next variable + brace_start = pattern.find("{", pos) + + if brace_start == -1: + # No more variables, escape the rest + result.append(re.escape(pattern[pos:])) + break + + # Escape literal part before variable + if brace_start > pos: + result.append(re.escape(pattern[pos:brace_start])) + + # Find end of variable + brace_end = pattern.find("}", brace_start) + if brace_end == -1: + raise PatternError("Unclosed brace in pattern") + + var_spec = pattern[brace_start + 1 : brace_end] + + # Determine variable type and name + if var_spec.startswith("**"): + # Explicit multi-segment: {**name} + var_name = var_spec[2:] or "path" + regex_part = f"(?P<{re.escape(var_name)}>.+)" + elif var_spec.startswith("*"): + # Explicit single-segment: {*name} + var_name = var_spec[1:] or "name" + regex_part = f"(?P<{re.escape(var_name)}>[^/]+)" + elif var_spec == "path": + # Conventional multi-segment + var_name = "path" + regex_part = "(?P<path>.+)" + else: + # Default single-segment (including custom names) + var_name = var_spec + regex_part = f"(?P<{re.escape(var_name)}>[^/]+)" + + result.append(regex_part) + var_names.append(var_name) + pos = brace_end + 1 + + return "^" + "".join(result) + "$", var_names + + +def match_pattern(pattern: str, filepath: str) -> MatchResult: + """ + Match a filepath against a pattern, extracting variables.
+ + Args: + pattern: Pattern with {var} placeholders + filepath: File path to match + + Returns: + MatchResult with matched=True and captured variables, or matched=False + """ + # Normalize path separators + filepath = filepath.replace("\\", "/") + + try: + regex, _ = pattern_to_regex(pattern) + except PatternError: + return MatchResult.no_match() + + match = re.fullmatch(regex, filepath) + if match: + return MatchResult.match(match.groupdict()) + return MatchResult.no_match() + + +def resolve_pattern(pattern: str, variables: dict[str, str]) -> str: + """ + Substitute variables into a pattern to generate a filepath. + + Args: + pattern: Pattern with {var} placeholders + variables: Dict of variable name -> value + + Returns: + Resolved filepath string + """ + result = pattern + for name, value in variables.items(): + # Handle both {name} and {*name} / {**name} forms + result = result.replace(f"{{{name}}}", value) + result = result.replace(f"{{*{name}}}", value) + result = result.replace(f"{{**{name}}}", value) + return result + + +def matches_glob(file_path: str, pattern: str) -> bool: + """ + Match a file path against a glob pattern, supporting ** for recursive matching. + + This is for simple glob patterns without variable capture. + + Args: + file_path: File path to check + pattern: Glob pattern (supports *, **, ?) 
+ + Returns: + True if matches + """ + # Normalize path separators + file_path = file_path.replace("\\", "/") + pattern = pattern.replace("\\", "/") + + # Handle ** patterns (recursive directory matching) + if "**" in pattern: + # Split pattern by ** + parts = pattern.split("**") + + if len(parts) == 2: + prefix, suffix = parts[0], parts[1] + + # Remove leading/trailing slashes from suffix + suffix = suffix.lstrip("/") + + # Check if prefix matches the start of the path + if prefix: + prefix = prefix.rstrip("/") + if not file_path.startswith(prefix + "/") and file_path != prefix: + return False + # Get the remaining path after prefix + remaining = file_path[len(prefix) :].lstrip("/") + else: + remaining = file_path + + # If no suffix, any remaining path matches + if not suffix: + return True + + # Check if suffix matches the end of any remaining path segment + remaining_parts = remaining.split("/") + for i in range(len(remaining_parts)): + test_path = "/".join(remaining_parts[i:]) + if fnmatch(test_path, suffix): + return True + # Also try just the filename + if fnmatch(remaining_parts[-1], suffix): + return True + + return False + + # Simple pattern without ** + return fnmatch(file_path, pattern) + + +def matches_any_pattern(file_path: str, patterns: list[str]) -> bool: + """ + Check if a file path matches any of the given glob patterns. 
+ + Args: + file_path: File path to check (relative path) + patterns: List of glob patterns to match against + + Returns: + True if the file matches any pattern + """ + for pattern in patterns: + if matches_glob(file_path, pattern): + return True + return False + + +def has_variables(pattern: str) -> bool: + """Check if a pattern contains variable placeholders.""" + return "{" in pattern and "}" in pattern diff --git a/src/deepwork/core/policy_parser.py b/src/deepwork/core/policy_parser.py deleted file mode 100644 index b6ade990..00000000 --- a/src/deepwork/core/policy_parser.py +++ /dev/null @@ -1,295 +0,0 @@ -"""Policy definition parser.""" - -from dataclasses import dataclass, field -from fnmatch import fnmatch -from pathlib import Path -from typing import Any - -import yaml - -from deepwork.schemas.policy_schema import POLICY_SCHEMA -from deepwork.utils.validation import ValidationError, validate_against_schema - - -class PolicyParseError(Exception): - """Exception raised for policy parsing errors.""" - - pass - - -# Valid compare_to values -COMPARE_TO_VALUES = frozenset({"base", "default_tip", "prompt"}) -DEFAULT_COMPARE_TO = "base" - - -@dataclass -class Policy: - """Represents a single policy definition.""" - - name: str - triggers: list[str] # Normalized to list - safety: list[str] = field(default_factory=list) # Normalized to list, empty if not specified - instructions: str = "" # Resolved content (either inline or from file) - compare_to: str = DEFAULT_COMPARE_TO # What to compare against: base, default_tip, or prompt - - @classmethod - def from_dict(cls, data: dict[str, Any], base_dir: Path | None = None) -> "Policy": - """ - Create Policy from dictionary. 
- - Args: - data: Parsed YAML data for a single policy - base_dir: Base directory for resolving instructions_file paths - - Returns: - Policy instance - - Raises: - PolicyParseError: If instructions cannot be resolved - """ - # Normalize trigger to list - trigger = data["trigger"] - triggers = [trigger] if isinstance(trigger, str) else list(trigger) - - # Normalize safety to list (empty if not present) - safety_data = data.get("safety", []) - safety = [safety_data] if isinstance(safety_data, str) else list(safety_data) - - # Resolve instructions - if "instructions" in data: - instructions = data["instructions"] - elif "instructions_file" in data: - if base_dir is None: - raise PolicyParseError( - f"Policy '{data['name']}' uses instructions_file but no base_dir provided" - ) - instructions_path = base_dir / data["instructions_file"] - if not instructions_path.exists(): - raise PolicyParseError( - f"Policy '{data['name']}' instructions file not found: {instructions_path}" - ) - try: - instructions = instructions_path.read_text() - except Exception as e: - raise PolicyParseError( - f"Policy '{data['name']}' failed to read instructions file: {e}" - ) from e - else: - # Schema should catch this, but be defensive - raise PolicyParseError( - f"Policy '{data['name']}' must have either 'instructions' or 'instructions_file'" - ) - - # Get compare_to (defaults to DEFAULT_COMPARE_TO) - compare_to = data.get("compare_to", DEFAULT_COMPARE_TO) - - return cls( - name=data["name"], - triggers=triggers, - safety=safety, - instructions=instructions, - compare_to=compare_to, - ) - - -def matches_pattern(file_path: str, patterns: list[str]) -> bool: - """ - Check if a file path matches any of the given glob patterns. 
- - Args: - file_path: File path to check (relative path) - patterns: List of glob patterns to match against - - Returns: - True if the file matches any pattern - """ - for pattern in patterns: - if _matches_glob(file_path, pattern): - return True - return False - - -def _matches_glob(file_path: str, pattern: str) -> bool: - """ - Match a file path against a glob pattern, supporting ** for recursive matching. - - Args: - file_path: File path to check - pattern: Glob pattern (supports *, **, ?) - - Returns: - True if matches - """ - # Normalize path separators - file_path = file_path.replace("\\", "/") - pattern = pattern.replace("\\", "/") - - # Handle ** patterns (recursive directory matching) - if "**" in pattern: - # Split pattern by ** - parts = pattern.split("**") - - if len(parts) == 2: - prefix, suffix = parts[0], parts[1] - - # Remove leading/trailing slashes from suffix - suffix = suffix.lstrip("/") - - # Check if prefix matches the start of the path - if prefix: - prefix = prefix.rstrip("/") - if not file_path.startswith(prefix + "/") and file_path != prefix: - return False - # Get the remaining path after prefix - remaining = file_path[len(prefix) :].lstrip("/") - else: - remaining = file_path - - # If no suffix, any remaining path matches - if not suffix: - return True - - # Check if suffix matches the end of any remaining path segment - # For pattern "src/**/*.py", suffix is "*.py" - # We need to match *.py against the filename portion - remaining_parts = remaining.split("/") - for i in range(len(remaining_parts)): - test_path = "/".join(remaining_parts[i:]) - if fnmatch(test_path, suffix): - return True - # Also try just the filename - if fnmatch(remaining_parts[-1], suffix): - return True - - return False - - # Simple pattern without ** - return fnmatch(file_path, pattern) - - -def evaluate_policy(policy: Policy, changed_files: list[str]) -> bool: - """ - Evaluate whether a policy should fire based on changed files. 
- - A policy fires if: - - At least one changed file matches a trigger pattern - - AND no changed file matches a safety pattern - - Args: - policy: Policy to evaluate - changed_files: List of changed file paths (relative) - - Returns: - True if the policy should fire - """ - # Check if any trigger matches - trigger_matched = False - for file_path in changed_files: - if matches_pattern(file_path, policy.triggers): - trigger_matched = True - break - - if not trigger_matched: - return False - - # Check if any safety pattern matches - if policy.safety: - for file_path in changed_files: - if matches_pattern(file_path, policy.safety): - # Safety file was also changed, don't fire - return False - - return True - - -def evaluate_policies( - policies: list[Policy], - changed_files: list[str], - promised_policies: set[str] | None = None, -) -> list[Policy]: - """ - Evaluate which policies should fire. - - Args: - policies: List of policies to evaluate - changed_files: List of changed file paths (relative) - promised_policies: Set of policy names that have been marked as addressed - via <promise> tags (these are skipped) - - Returns: - List of policies that should fire (trigger matches, no safety match, not promised) - """ - if promised_policies is None: - promised_policies = set() - - fired_policies = [] - for policy in policies: - # Skip if already promised/addressed - if policy.name in promised_policies: - continue - - if evaluate_policy(policy, changed_files): - fired_policies.append(policy) - - return fired_policies - - -def parse_policy_file(policy_path: Path | str, base_dir: Path | None = None) -> list[Policy]: - """ - Parse policy definitions from a YAML file. - - Args: - policy_path: Path to .deepwork.policy.yml file - base_dir: Base directory for resolving instructions_file paths. - Defaults to the directory containing the policy file.
- - Returns: - List of parsed Policy objects - - Raises: - PolicyParseError: If parsing fails or validation errors occur - """ - policy_path = Path(policy_path) - - if not policy_path.exists(): - raise PolicyParseError(f"Policy file does not exist: {policy_path}") - - if not policy_path.is_file(): - raise PolicyParseError(f"Policy path is not a file: {policy_path}") - - # Default base_dir to policy file's directory - if base_dir is None: - base_dir = policy_path.parent - - # Load YAML (policies are stored as a list, not a dict) - try: - with open(policy_path, encoding="utf-8") as f: - policy_data = yaml.safe_load(f) - except yaml.YAMLError as e: - raise PolicyParseError(f"Failed to parse policy YAML: {e}") from e - except OSError as e: - raise PolicyParseError(f"Failed to read policy file: {e}") from e - - # Handle empty file or null content - if policy_data is None: - return [] - - # Validate it's a list (schema expects array) - if not isinstance(policy_data, list): - raise PolicyParseError( - f"Policy file must contain a list of policies, got {type(policy_data).__name__}" - ) - - # Validate against schema - try: - validate_against_schema(policy_data, POLICY_SCHEMA) - except ValidationError as e: - raise PolicyParseError(f"Policy definition validation failed: {e}") from e - - # Parse into dataclasses - policies = [] - for policy_item in policy_data: - policy = Policy.from_dict(policy_item, base_dir) - policies.append(policy) - - return policies diff --git a/src/deepwork/core/rules_parser.py b/src/deepwork/core/rules_parser.py new file mode 100644 index 00000000..1de83a6c --- /dev/null +++ b/src/deepwork/core/rules_parser.py @@ -0,0 +1,511 @@ +"""Rule definition parser (v2 - frontmatter markdown format).""" + +from dataclasses import dataclass, field +from enum import Enum +from pathlib import Path +from typing import Any + +import yaml + +from deepwork.core.pattern_matcher import ( + has_variables, + match_pattern, + matches_any_pattern, + resolve_pattern, +) 
+from deepwork.schemas.rules_schema import RULES_FRONTMATTER_SCHEMA +from deepwork.utils.validation import ValidationError, validate_against_schema + + +class RulesParseError(Exception): + """Exception raised for rule parsing errors.""" + + pass + + +class DetectionMode(Enum): + """How the rule detects when to fire.""" + + TRIGGER_SAFETY = "trigger_safety" # Fire when trigger matches, safety doesn't + SET = "set" # Bidirectional file correspondence + PAIR = "pair" # Directional file correspondence + + +class ActionType(Enum): + """What happens when the rule fires.""" + + PROMPT = "prompt" # Show instructions to agent (default) + COMMAND = "command" # Run an idempotent command + + +# Valid compare_to values +COMPARE_TO_VALUES = frozenset({"base", "default_tip", "prompt"}) +DEFAULT_COMPARE_TO = "base" + + +@dataclass +class CommandAction: + """Configuration for command action.""" + + command: str # Command template (supports {file}, {files}, {repo_root}) + run_for: str = "each_match" # "each_match" or "all_matches" + + +@dataclass +class PairConfig: + """Configuration for pair detection mode.""" + + trigger: str # Pattern that triggers + expects: list[str] # Patterns for expected corresponding files + + +@dataclass +class Rule: + """Represents a single rule definition (v2 format).""" + + # Identity + name: str # Human-friendly name (displayed in promise tags) + filename: str # Filename without .md extension (used for queue) + + # Detection mode (exactly one must be set) + detection_mode: DetectionMode + triggers: list[str] = field(default_factory=list) # For TRIGGER_SAFETY mode + safety: list[str] = field(default_factory=list) # For TRIGGER_SAFETY mode + set_patterns: list[str] = field(default_factory=list) # For SET mode + pair_config: PairConfig | None = None # For PAIR mode + + # Action type + action_type: ActionType = ActionType.PROMPT + instructions: str = "" # For PROMPT action (markdown body) + command_action: CommandAction | None = None # For COMMAND action + 
+ # Common options + compare_to: str = DEFAULT_COMPARE_TO + + @classmethod + def from_frontmatter( + cls, + frontmatter: dict[str, Any], + markdown_body: str, + filename: str, + ) -> "Rule": + """ + Create Rule from parsed frontmatter and markdown body. + + Args: + frontmatter: Parsed YAML frontmatter + markdown_body: Markdown content after frontmatter + filename: Filename without .md extension + + Returns: + Rule instance + + Raises: + RulesParseError: If validation fails + """ + # Get name (required) + name = frontmatter.get("name", "") + if not name: + raise RulesParseError(f"Rule '{filename}' missing required 'name' field") + + # Determine detection mode + has_trigger = "trigger" in frontmatter + has_set = "set" in frontmatter + has_pair = "pair" in frontmatter + + mode_count = sum([has_trigger, has_set, has_pair]) + if mode_count == 0: + raise RulesParseError(f"Rule '{name}' must have 'trigger', 'set', or 'pair'") + if mode_count > 1: + raise RulesParseError(f"Rule '{name}' has multiple detection modes - use only one") + + # Parse based on detection mode + detection_mode: DetectionMode + triggers: list[str] = [] + safety: list[str] = [] + set_patterns: list[str] = [] + pair_config: PairConfig | None = None + + if has_trigger: + detection_mode = DetectionMode.TRIGGER_SAFETY + trigger = frontmatter["trigger"] + triggers = [trigger] if isinstance(trigger, str) else list(trigger) + safety_data = frontmatter.get("safety", []) + safety = [safety_data] if isinstance(safety_data, str) else list(safety_data) + + elif has_set: + detection_mode = DetectionMode.SET + set_patterns = list(frontmatter["set"]) + if len(set_patterns) < 2: + raise RulesParseError(f"Rule '{name}' set requires at least 2 patterns") + + elif has_pair: + detection_mode = DetectionMode.PAIR + pair_data = frontmatter["pair"] + expects = pair_data["expects"] + expects_list = [expects] if isinstance(expects, str) else list(expects) + pair_config = PairConfig( + trigger=pair_data["trigger"], + 
expects=expects_list, + ) + + # Determine action type + action_type: ActionType + command_action: CommandAction | None = None + + if "action" in frontmatter: + action_type = ActionType.COMMAND + action_data = frontmatter["action"] + command_action = CommandAction( + command=action_data["command"], + run_for=action_data.get("run_for", "each_match"), + ) + else: + action_type = ActionType.PROMPT + # Markdown body is the instructions + if not markdown_body.strip(): + raise RulesParseError(f"Rule '{name}' with prompt action requires markdown body") + + # Get compare_to + compare_to = frontmatter.get("compare_to", DEFAULT_COMPARE_TO) + + return cls( + name=name, + filename=filename, + detection_mode=detection_mode, + triggers=triggers, + safety=safety, + set_patterns=set_patterns, + pair_config=pair_config, + action_type=action_type, + instructions=markdown_body.strip(), + command_action=command_action, + compare_to=compare_to, + ) + + +def parse_frontmatter_file(filepath: Path) -> tuple[dict[str, Any], str]: + """ + Parse a markdown file with YAML frontmatter. 
+ + Args: + filepath: Path to .md file + + Returns: + Tuple of (frontmatter_dict, markdown_body) + + Raises: + RulesParseError: If parsing fails + """ + try: + content = filepath.read_text(encoding="utf-8") + except OSError as e: + raise RulesParseError(f"Failed to read rule file: {e}") from e + + # Split frontmatter from body + if not content.startswith("---"): + raise RulesParseError( + f"Rule file '{filepath.name}' must start with '---' frontmatter delimiter" + ) + + # Find end of frontmatter + end_marker = content.find("\n---", 3) + if end_marker == -1: + raise RulesParseError( + f"Rule file '{filepath.name}' missing closing '---' frontmatter delimiter" + ) + + frontmatter_str = content[4:end_marker] # Skip initial "---\n" + markdown_body = content[end_marker + 4 :] # Skip "\n---\n" or "\n---" + + # Parse YAML frontmatter + try: + frontmatter = yaml.safe_load(frontmatter_str) + except yaml.YAMLError as e: + raise RulesParseError(f"Invalid YAML frontmatter in '{filepath.name}': {e}") from e + + if frontmatter is None: + frontmatter = {} + + if not isinstance(frontmatter, dict): + raise RulesParseError( + f"Frontmatter in '{filepath.name}' must be a mapping, got {type(frontmatter).__name__}" + ) + + return frontmatter, markdown_body + + +def parse_rule_file(filepath: Path) -> Rule: + """ + Parse a single rule from a frontmatter markdown file. 
+ + Args: + filepath: Path to .md file in .deepwork/rules/ + + Returns: + Parsed Rule object + + Raises: + RulesParseError: If parsing or validation fails + """ + if not filepath.exists(): + raise RulesParseError(f"Rule file does not exist: {filepath}") + + if not filepath.is_file(): + raise RulesParseError(f"Rule path is not a file: {filepath}") + + frontmatter, markdown_body = parse_frontmatter_file(filepath) + + # Validate against schema + try: + validate_against_schema(frontmatter, RULES_FRONTMATTER_SCHEMA) + except ValidationError as e: + raise RulesParseError(f"Rule '{filepath.name}' validation failed: {e}") from e + + # Create Rule object + filename = filepath.stem # filename without .md extension + return Rule.from_frontmatter(frontmatter, markdown_body, filename) + + +def load_rules_from_directory(rules_dir: Path) -> list[Rule]: + """ + Load all rules from a directory. + + Args: + rules_dir: Path to .deepwork/rules/ directory + + Returns: + List of parsed Rule objects (sorted by filename) + + Raises: + RulesParseError: If any rule file fails to parse + """ + if not rules_dir.exists(): + return [] + + if not rules_dir.is_dir(): + raise RulesParseError(f"Rules path is not a directory: {rules_dir}") + + rules = [] + for filepath in sorted(rules_dir.glob("*.md")): + rule = parse_rule_file(filepath) + rules.append(rule) + + return rules + + +# ============================================================================= +# Evaluation Logic +# ============================================================================= + + +def evaluate_trigger_safety( + rule: Rule, + changed_files: list[str], +) -> bool: + """ + Evaluate a trigger/safety mode rule. 
+ + Returns True if rule should fire: + - At least one changed file matches a trigger pattern + - AND no changed file matches a safety pattern + """ + # Check if any trigger matches + trigger_matched = False + for file_path in changed_files: + if matches_any_pattern(file_path, rule.triggers): + trigger_matched = True + break + + if not trigger_matched: + return False + + # Check if any safety pattern matches + if rule.safety: + for file_path in changed_files: + if matches_any_pattern(file_path, rule.safety): + return False + + return True + + +def evaluate_set_correspondence( + rule: Rule, + changed_files: list[str], +) -> tuple[bool, list[str], list[str]]: + """ + Evaluate a set (bidirectional correspondence) rule. + + Returns: + Tuple of (should_fire, trigger_files, missing_files) + - should_fire: True if correspondence is incomplete + - trigger_files: Files that triggered (matched a pattern) + - missing_files: Expected files that didn't change + """ + trigger_files: list[str] = [] + missing_files: list[str] = [] + changed_set = set(changed_files) + + for file_path in changed_files: + # Check each pattern in the set + for pattern in rule.set_patterns: + result = match_pattern(pattern, file_path) + if result.matched: + trigger_files.append(file_path) + + # Check if all other corresponding files also changed + for other_pattern in rule.set_patterns: + if other_pattern == pattern: + continue + + if has_variables(other_pattern): + expected = resolve_pattern(other_pattern, result.variables) + else: + expected = other_pattern + + if expected not in changed_set: + if expected not in missing_files: + missing_files.append(expected) + + break # Only match one pattern per file + + # Rule fires if there are trigger files with missing correspondences + should_fire = len(trigger_files) > 0 and len(missing_files) > 0 + return should_fire, trigger_files, missing_files + + +def evaluate_pair_correspondence( + rule: Rule, + changed_files: list[str], +) -> tuple[bool, list[str], 
list[str]]: + """ + Evaluate a pair (directional correspondence) rule. + + Only trigger-side changes require corresponding expected files. + Expected-side changes alone do not trigger. + + Returns: + Tuple of (should_fire, trigger_files, missing_files) + """ + if rule.pair_config is None: + return False, [], [] + + trigger_files: list[str] = [] + missing_files: list[str] = [] + changed_set = set(changed_files) + + trigger_pattern = rule.pair_config.trigger + expects_patterns = rule.pair_config.expects + + for file_path in changed_files: + # Only check trigger pattern (directional) + result = match_pattern(trigger_pattern, file_path) + if result.matched: + trigger_files.append(file_path) + + # Check if all expected files also changed + for expects_pattern in expects_patterns: + if has_variables(expects_pattern): + expected = resolve_pattern(expects_pattern, result.variables) + else: + expected = expects_pattern + + if expected not in changed_set: + if expected not in missing_files: + missing_files.append(expected) + + should_fire = len(trigger_files) > 0 and len(missing_files) > 0 + return should_fire, trigger_files, missing_files + + +@dataclass +class RuleEvaluationResult: + """Result of evaluating a single rule.""" + + rule: Rule + should_fire: bool + trigger_files: list[str] = field(default_factory=list) + missing_files: list[str] = field(default_factory=list) # For set/pair modes + + +def evaluate_rule(rule: Rule, changed_files: list[str]) -> RuleEvaluationResult: + """ + Evaluate whether a rule should fire based on changed files. 
+ + Args: + rule: Rule to evaluate + changed_files: List of changed file paths (relative) + + Returns: + RuleEvaluationResult with evaluation details + """ + if rule.detection_mode == DetectionMode.TRIGGER_SAFETY: + should_fire = evaluate_trigger_safety(rule, changed_files) + trigger_files = ( + [f for f in changed_files if matches_any_pattern(f, rule.triggers)] + if should_fire + else [] + ) + return RuleEvaluationResult( + rule=rule, + should_fire=should_fire, + trigger_files=trigger_files, + ) + + elif rule.detection_mode == DetectionMode.SET: + should_fire, trigger_files, missing_files = evaluate_set_correspondence(rule, changed_files) + return RuleEvaluationResult( + rule=rule, + should_fire=should_fire, + trigger_files=trigger_files, + missing_files=missing_files, + ) + + elif rule.detection_mode == DetectionMode.PAIR: + should_fire, trigger_files, missing_files = evaluate_pair_correspondence( + rule, changed_files + ) + return RuleEvaluationResult( + rule=rule, + should_fire=should_fire, + trigger_files=trigger_files, + missing_files=missing_files, + ) + + return RuleEvaluationResult(rule=rule, should_fire=False) + + +def evaluate_rules( + rules: list[Rule], + changed_files: list[str], + promised_rules: set[str] | None = None, +) -> list[RuleEvaluationResult]: + """ + Evaluate which rules should fire. 
+ + Args: + rules: List of rules to evaluate + changed_files: List of changed file paths (relative) + promised_rules: Set of rule names that have been marked as addressed + via <promise> tags (case-insensitive) + + Returns: + List of RuleEvaluationResult for rules that should fire + """ + if promised_rules is None: + promised_rules = set() + + # Normalize promised names for case-insensitive comparison + promised_lower = {name.lower() for name in promised_rules} + + results = [] + for rule in rules: + # Skip if already promised/addressed (case-insensitive) + if rule.name.lower() in promised_lower: + continue + + result = evaluate_rule(rule, changed_files) + if result.should_fire: + results.append(result) + + return results diff --git a/src/deepwork/core/rules_queue.py b/src/deepwork/core/rules_queue.py new file mode 100644 index 00000000..4f49a4fe --- /dev/null +++ b/src/deepwork/core/rules_queue.py @@ -0,0 +1,321 @@ +"""Queue system for tracking rule state in .deepwork/tmp/rules/queue/.""" + +import hashlib +import json +from dataclasses import asdict, dataclass, field +from datetime import UTC, datetime +from enum import Enum +from pathlib import Path +from typing import Any + + +class QueueEntryStatus(Enum): + """Status of a queue entry.""" + + QUEUED = "queued" # Detected, awaiting evaluation + PASSED = "passed" # Evaluated, rule satisfied (promise found or action succeeded) + FAILED = "failed" # Evaluated, rule not satisfied + SKIPPED = "skipped" # Safety pattern matched, skipped + + +@dataclass +class ActionResult: + """Result of executing a rule action.""" + + type: str # "prompt" or "command" + output: str | None = None # Command stdout or prompt message shown + exit_code: int | None = None # Command exit code (None for prompt) + + +@dataclass +class QueueEntry: + """A single entry in the rules queue.""" + + # Identity + rule_name: str # Human-friendly name + rule_file: str # Filename (e.g., "source-test-pairing.md") + trigger_hash: str # Hash for deduplication + + #
State + status: QueueEntryStatus = QueueEntryStatus.QUEUED + created_at: str = "" # ISO8601 timestamp + evaluated_at: str | None = None # ISO8601 timestamp + + # Context + baseline_ref: str = "" # Commit hash or timestamp used as baseline + trigger_files: list[str] = field(default_factory=list) + expected_files: list[str] = field(default_factory=list) # For set/pair modes + matched_files: list[str] = field(default_factory=list) # Files that also changed + + # Result + action_result: ActionResult | None = None + + def __post_init__(self) -> None: + if not self.created_at: + self.created_at = datetime.now(UTC).isoformat() + + def to_dict(self) -> dict[str, Any]: + """Convert to dictionary for JSON serialization.""" + data = asdict(self) + data["status"] = self.status.value + if self.action_result: + data["action_result"] = asdict(self.action_result) + return data + + @classmethod + def from_dict(cls, data: dict[str, Any]) -> "QueueEntry": + """Create from dictionary.""" + action_result = None + if data.get("action_result"): + action_result = ActionResult(**data["action_result"]) + + return cls( + rule_name=data.get("rule_name", data.get("policy_name", "")), + rule_file=data.get("rule_file", data.get("policy_file", "")), + trigger_hash=data["trigger_hash"], + status=QueueEntryStatus(data["status"]), + created_at=data.get("created_at", ""), + evaluated_at=data.get("evaluated_at"), + baseline_ref=data.get("baseline_ref", ""), + trigger_files=data.get("trigger_files", []), + expected_files=data.get("expected_files", []), + matched_files=data.get("matched_files", []), + action_result=action_result, + ) + + +def compute_trigger_hash( + rule_name: str, + trigger_files: list[str], + baseline_ref: str, +) -> str: + """ + Compute a hash for deduplication. 
+ + The hash is based on: + - Rule name + - Sorted list of trigger files + - Baseline reference (commit hash or timestamp) + + Returns: + 12-character hex hash + """ + hash_input = f"{rule_name}:{sorted(trigger_files)}:{baseline_ref}" + return hashlib.sha256(hash_input.encode()).hexdigest()[:12] + + +class RulesQueue: + """ + Manages the rules queue in .deepwork/tmp/rules/queue/. + + Queue entries are stored as JSON files named {hash}.{status}.json + """ + + def __init__(self, queue_dir: Path | None = None): + """ + Initialize the queue. + + Args: + queue_dir: Path to queue directory. Defaults to .deepwork/tmp/rules/queue/ + """ + if queue_dir is None: + queue_dir = Path(".deepwork/tmp/rules/queue") + self.queue_dir = queue_dir + + def _ensure_dir(self) -> None: + """Ensure queue directory exists.""" + self.queue_dir.mkdir(parents=True, exist_ok=True) + + def _get_entry_path(self, trigger_hash: str, status: QueueEntryStatus) -> Path: + """Get path for an entry file.""" + return self.queue_dir / f"{trigger_hash}.{status.value}.json" + + def _find_entry_path(self, trigger_hash: str) -> Path | None: + """Find existing entry file for a hash (any status).""" + for status in QueueEntryStatus: + path = self._get_entry_path(trigger_hash, status) + if path.exists(): + return path + return None + + def has_entry(self, trigger_hash: str) -> bool: + """Check if an entry exists for this hash.""" + return self._find_entry_path(trigger_hash) is not None + + def get_entry(self, trigger_hash: str) -> QueueEntry | None: + """Get an entry by hash.""" + path = self._find_entry_path(trigger_hash) + if path is None: + return None + + try: + with open(path, encoding="utf-8") as f: + data = json.load(f) + return QueueEntry.from_dict(data) + except (json.JSONDecodeError, OSError, KeyError): + return None + + def create_entry( + self, + rule_name: str, + rule_file: str, + trigger_files: list[str], + baseline_ref: str, + expected_files: list[str] | None = None, + ) -> QueueEntry | None: + 
""" + Create a new queue entry if one doesn't already exist. + + Args: + rule_name: Human-friendly rule name + rule_file: Rule filename (e.g., "source-test-pairing.md") + trigger_files: Files that triggered the rule + baseline_ref: Baseline reference for change detection + expected_files: Expected corresponding files (for set/pair) + + Returns: + Created QueueEntry, or None if entry already exists + """ + trigger_hash = compute_trigger_hash(rule_name, trigger_files, baseline_ref) + + # Check if already exists + if self.has_entry(trigger_hash): + return None + + self._ensure_dir() + + entry = QueueEntry( + rule_name=rule_name, + rule_file=rule_file, + trigger_hash=trigger_hash, + status=QueueEntryStatus.QUEUED, + baseline_ref=baseline_ref, + trigger_files=trigger_files, + expected_files=expected_files or [], + ) + + path = self._get_entry_path(trigger_hash, QueueEntryStatus.QUEUED) + with open(path, "w", encoding="utf-8") as f: + json.dump(entry.to_dict(), f, indent=2) + + return entry + + def update_status( + self, + trigger_hash: str, + new_status: QueueEntryStatus, + action_result: ActionResult | None = None, + ) -> bool: + """ + Update the status of an entry. + + This renames the file to reflect the new status. 
+ + Args: + trigger_hash: Hash of the entry to update + new_status: New status + action_result: Optional result of action execution + + Returns: + True if updated, False if entry not found + """ + old_path = self._find_entry_path(trigger_hash) + if old_path is None: + return False + + # Load existing entry + try: + with open(old_path, encoding="utf-8") as f: + data = json.load(f) + except (json.JSONDecodeError, OSError): + return False + + # Update fields + data["status"] = new_status.value + data["evaluated_at"] = datetime.now(UTC).isoformat() + if action_result: + data["action_result"] = asdict(action_result) + + # Write to new path + new_path = self._get_entry_path(trigger_hash, new_status) + + # If status didn't change, just update in place + if old_path == new_path: + with open(new_path, "w", encoding="utf-8") as f: + json.dump(data, f, indent=2) + else: + # Write new file then delete old + with open(new_path, "w", encoding="utf-8") as f: + json.dump(data, f, indent=2) + old_path.unlink() + + return True + + def get_queued_entries(self) -> list[QueueEntry]: + """Get all entries with QUEUED status.""" + if not self.queue_dir.exists(): + return [] + + entries = [] + for path in self.queue_dir.glob("*.queued.json"): + try: + with open(path, encoding="utf-8") as f: + data = json.load(f) + entries.append(QueueEntry.from_dict(data)) + except (json.JSONDecodeError, OSError, KeyError): + continue + + return entries + + def get_all_entries(self) -> list[QueueEntry]: + """Get all entries regardless of status.""" + if not self.queue_dir.exists(): + return [] + + entries = [] + for path in self.queue_dir.glob("*.json"): + try: + with open(path, encoding="utf-8") as f: + data = json.load(f) + entries.append(QueueEntry.from_dict(data)) + except (json.JSONDecodeError, OSError, KeyError): + continue + + return entries + + def clear(self) -> int: + """ + Clear all entries from the queue. 
+ + Returns: + Number of entries removed + """ + if not self.queue_dir.exists(): + return 0 + + count = 0 + for path in self.queue_dir.glob("*.json"): + try: + path.unlink() + count += 1 + except OSError: + continue + + return count + + def remove_entry(self, trigger_hash: str) -> bool: + """ + Remove an entry by hash. + + Returns: + True if removed, False if not found + """ + path = self._find_entry_path(trigger_hash) + if path is None: + return False + + try: + path.unlink() + return True + except OSError: + return False diff --git a/src/deepwork/hooks/README.md b/src/deepwork/hooks/README.md index 7cf51559..9c3dd887 100644 --- a/src/deepwork/hooks/README.md +++ b/src/deepwork/hooks/README.md @@ -16,8 +16,7 @@ The hook system provides: - Cross-platform compatibility 3. **Hook implementations**: - - `policy_check.py` - Evaluates DeepWork policies on `after_agent` events - - `evaluate_policies.py` - Legacy Claude-specific policy evaluation + - `rules_check.py` - Evaluates DeepWork rules on `after_agent` events ## Usage @@ -33,7 +32,7 @@ The hook system provides: "hooks": [ { "type": "command", - "command": "path/to/claude_hook.sh deepwork.hooks.policy_check" + "command": "path/to/claude_hook.sh deepwork.hooks.rules_check" } ] } @@ -52,7 +51,7 @@ The hook system provides: "hooks": [ { "type": "command", - "command": "path/to/gemini_hook.sh deepwork.hooks.policy_check" + "command": "path/to/gemini_hook.sh deepwork.hooks.rules_check" } ] } @@ -179,5 +178,4 @@ pytest tests/shell_script_tests/test_hook_wrappers.py -v | `wrapper.py` | Cross-platform input/output normalization | | `claude_hook.sh` | Shell wrapper for Claude Code | | `gemini_hook.sh` | Shell wrapper for Gemini CLI | -| `policy_check.py` | Cross-platform policy evaluation hook | -| `evaluate_policies.py` | Legacy Claude-specific policy evaluation | +| `rules_check.py` | Cross-platform rule evaluation hook | diff --git a/src/deepwork/hooks/__init__.py b/src/deepwork/hooks/__init__.py index 277080b6..c64dcfc4 
100644 --- a/src/deepwork/hooks/__init__.py +++ b/src/deepwork/hooks/__init__.py @@ -1,4 +1,4 @@ -"""DeepWork hooks package for policy enforcement and lifecycle events. +"""DeepWork hooks package for rules enforcement and lifecycle events. This package provides: @@ -8,8 +8,7 @@ - gemini_hook.sh: Shell wrapper for Gemini CLI hooks 2. Hook implementations: - - policy_check.py: Evaluates policies on after_agent events - - evaluate_policies.py: Legacy policy evaluation (Claude-specific) + - rules_check.py: Evaluates rules on after_agent events Usage with wrapper system: # Register hook in .claude/settings.json: @@ -18,7 +17,7 @@ "Stop": [{ "hooks": [{ "type": "command", - "command": ".deepwork/hooks/claude_hook.sh deepwork.hooks.policy_check" + "command": ".deepwork/hooks/claude_hook.sh deepwork.hooks.rules_check" }] }] } @@ -30,7 +29,7 @@ "AfterAgent": [{ "hooks": [{ "type": "command", - "command": ".gemini/hooks/gemini_hook.sh deepwork.hooks.policy_check" + "command": ".gemini/hooks/gemini_hook.sh deepwork.hooks.rules_check" }] }] } diff --git a/src/deepwork/hooks/claude_hook.sh b/src/deepwork/hooks/claude_hook.sh index b9c4fd39..7e13ad44 100755 --- a/src/deepwork/hooks/claude_hook.sh +++ b/src/deepwork/hooks/claude_hook.sh @@ -9,7 +9,7 @@ # claude_hook.sh # # Example: -# claude_hook.sh deepwork.hooks.policy_check +# claude_hook.sh deepwork.hooks.rules_check # # The Python module should implement a main() function that: # 1. 
Calls deepwork.hooks.wrapper.run_hook() with a hook function @@ -31,7 +31,7 @@ PYTHON_MODULE="${1:-}" if [ -z "${PYTHON_MODULE}" ]; then echo "Usage: claude_hook.sh " >&2 - echo "Example: claude_hook.sh deepwork.hooks.policy_check" >&2 + echo "Example: claude_hook.sh deepwork.hooks.rules_check" >&2 exit 1 fi diff --git a/src/deepwork/hooks/evaluate_policies.py b/src/deepwork/hooks/evaluate_policies.py deleted file mode 100644 index 07ac3845..00000000 --- a/src/deepwork/hooks/evaluate_policies.py +++ /dev/null @@ -1,376 +0,0 @@ -""" -Policy evaluation module for DeepWork hooks. - -This module is called by the policy_stop_hook.sh script to evaluate which policies -should fire based on changed files and conversation context. - -Usage: - python -m deepwork.hooks.evaluate_policies \ - --policy-file .deepwork.policy.yml - -The conversation context is read from stdin and checked for tags -that indicate policies have already been addressed. - -Changed files are computed based on each policy's compare_to setting: -- base: Compare to merge-base with default branch (default) -- default_tip: Two-dot diff against default branch tip -- prompt: Compare to state captured at prompt submission - -Output is JSON suitable for Claude Code Stop hooks: - {"decision": "block", "reason": "..."} # Block stop, policies need attention - {} # No policies fired, allow stop -""" - -import argparse -import json -import re -import subprocess -import sys -from pathlib import Path - -from deepwork.core.policy_parser import ( - Policy, - PolicyParseError, - evaluate_policy, - parse_policy_file, -) - - -def get_default_branch() -> str: - """ - Get the default branch name (main or master). - - Returns: - Default branch name, or "main" if cannot be determined. 
- """ - # Try to get the default branch from remote HEAD - try: - result = subprocess.run( - ["git", "symbolic-ref", "refs/remotes/origin/HEAD"], - capture_output=True, - text=True, - check=True, - ) - # Output is like "refs/remotes/origin/main" - return result.stdout.strip().split("/")[-1] - except subprocess.CalledProcessError: - pass - - # Try common default branch names - for branch in ["main", "master"]: - try: - subprocess.run( - ["git", "rev-parse", "--verify", f"origin/{branch}"], - capture_output=True, - check=True, - ) - return branch - except subprocess.CalledProcessError: - continue - - # Fall back to main - return "main" - - -def get_changed_files_base() -> list[str]: - """ - Get files changed relative to the base of the current branch. - - This finds the merge-base between the current branch and the default branch, - then returns all files changed since that point. - - Returns: - List of changed file paths. - """ - default_branch = get_default_branch() - - try: - # Get the merge-base (where current branch diverged from default) - result = subprocess.run( - ["git", "merge-base", "HEAD", f"origin/{default_branch}"], - capture_output=True, - text=True, - check=True, - ) - merge_base = result.stdout.strip() - - # Stage all changes so they appear in diff - subprocess.run(["git", "add", "-A"], capture_output=True, check=False) - - # Get files changed since merge-base (including staged) - result = subprocess.run( - ["git", "diff", "--name-only", merge_base, "HEAD"], - capture_output=True, - text=True, - check=True, - ) - committed_files = set(result.stdout.strip().split("\n")) if result.stdout.strip() else set() - - # Also get staged changes not yet committed - result = subprocess.run( - ["git", "diff", "--name-only", "--cached"], - capture_output=True, - text=True, - check=False, - ) - staged_files = set(result.stdout.strip().split("\n")) if result.stdout.strip() else set() - - # Get untracked files - result = subprocess.run( - ["git", "ls-files", 
"--others", "--exclude-standard"], - capture_output=True, - text=True, - check=False, - ) - untracked_files = set(result.stdout.strip().split("\n")) if result.stdout.strip() else set() - - all_files = committed_files | staged_files | untracked_files - return sorted([f for f in all_files if f]) - - except subprocess.CalledProcessError: - return [] - - -def get_changed_files_default_tip() -> list[str]: - """ - Get files changed compared to the tip of the default branch. - - This does a two-dot diff: what's different between HEAD and origin/default. - - Returns: - List of changed file paths. - """ - default_branch = get_default_branch() - - try: - # Stage all changes so they appear in diff - subprocess.run(["git", "add", "-A"], capture_output=True, check=False) - - # Two-dot diff against default branch tip - result = subprocess.run( - ["git", "diff", "--name-only", f"origin/{default_branch}..HEAD"], - capture_output=True, - text=True, - check=True, - ) - committed_files = set(result.stdout.strip().split("\n")) if result.stdout.strip() else set() - - # Also get staged changes not yet committed - result = subprocess.run( - ["git", "diff", "--name-only", "--cached"], - capture_output=True, - text=True, - check=False, - ) - staged_files = set(result.stdout.strip().split("\n")) if result.stdout.strip() else set() - - # Get untracked files - result = subprocess.run( - ["git", "ls-files", "--others", "--exclude-standard"], - capture_output=True, - text=True, - check=False, - ) - untracked_files = set(result.stdout.strip().split("\n")) if result.stdout.strip() else set() - - all_files = committed_files | staged_files | untracked_files - return sorted([f for f in all_files if f]) - - except subprocess.CalledProcessError: - return [] - - -def get_changed_files_prompt() -> list[str]: - """ - Get files changed since the prompt was submitted. - - This compares against the baseline captured by capture_prompt_work_tree.sh. - - Returns: - List of changed file paths. 
- """ - baseline_path = Path(".deepwork/.last_work_tree") - - try: - # Stage all changes so we can see them with --cached - subprocess.run(["git", "add", "-A"], capture_output=True, check=False) - - # Get all staged files (includes what was just staged) - result = subprocess.run( - ["git", "diff", "--name-only", "--cached"], - capture_output=True, - text=True, - check=False, - ) - current_files = set(result.stdout.strip().split("\n")) if result.stdout.strip() else set() - current_files = {f for f in current_files if f} - - if baseline_path.exists(): - # Read baseline and find new files - baseline_files = set(baseline_path.read_text().strip().split("\n")) - baseline_files = {f for f in baseline_files if f} - # Return files that are in current but not in baseline - new_files = current_files - baseline_files - return sorted(new_files) - else: - # No baseline, return all current changes - return sorted(current_files) - - except (subprocess.CalledProcessError, OSError): - return [] - - -def get_changed_files_for_mode(mode: str) -> list[str]: - """ - Get changed files for a specific compare_to mode. - - Args: - mode: One of 'base', 'default_tip', or 'prompt' - - Returns: - List of changed file paths. - """ - if mode == "base": - return get_changed_files_base() - elif mode == "default_tip": - return get_changed_files_default_tip() - elif mode == "prompt": - return get_changed_files_prompt() - else: - # Unknown mode, fall back to base - return get_changed_files_base() - - -def extract_promise_tags(text: str) -> set[str]: - """ - Extract policy names from <promise> tags in text.
- - Supported format: - - <promise>✓ Policy Name</promise> - - Args: - text: Text to search for promise tags - - Returns: - Set of policy names that have been promised/addressed - """ - # Match <promise>✓ Policy Name</promise> and extract the policy name - pattern = r"<promise>✓\s*([^<]+)</promise>" - matches = re.findall(pattern, text, re.IGNORECASE | re.DOTALL) - return {m.strip() for m in matches} - - -def format_policy_message(policies: list) -> str: - """ - Format triggered policies into a message for the agent. - - Args: - policies: List of Policy objects that fired - - Returns: - Formatted message with all policy instructions - """ - lines = ["## DeepWork Policies Triggered", ""] - lines.append( - "Comply with the following policies. " - "To mark a policy as addressed, include `<promise>✓ Policy Name</promise>` " - "in your response (replace Policy Name with the actual policy name)." - ) - lines.append("") - - for policy in policies: - lines.append(f"### Policy: {policy.name}") - lines.append("") - lines.append(policy.instructions.strip()) - lines.append("") - - return "\n".join(lines) - - -def main() -> None: - """Main entry point for policy evaluation CLI.""" - parser = argparse.ArgumentParser( - description="Evaluate DeepWork policies based on changed files" - ) - parser.add_argument( - "--policy-file", - type=str, - required=True, - help="Path to .deepwork.policy.yml file", - ) - - args = parser.parse_args() - - # Check if policy file exists - policy_path = Path(args.policy_file) - if not policy_path.exists(): - # No policy file, nothing to evaluate - print("{}") - return - - # Read conversation context from stdin (if available) - conversation_context = "" - if not sys.stdin.isatty(): - try: - conversation_context = sys.stdin.read() - except Exception: - pass - - # Extract promise tags from conversation - promised_policies = extract_promise_tags(conversation_context) - - # Parse policies - try: - policies = parse_policy_file(policy_path) - except PolicyParseError as e: - # Log error to stderr, return empty result - print(f"Error
parsing policy file: {e}", file=sys.stderr) - print("{}") - return - - if not policies: - # No policies defined - print("{}") - return - - # Group policies by compare_to mode to minimize git calls - policies_by_mode: dict[str, list[Policy]] = {} - for policy in policies: - mode = policy.compare_to - if mode not in policies_by_mode: - policies_by_mode[mode] = [] - policies_by_mode[mode].append(policy) - - # Get changed files for each mode and evaluate policies - fired_policies: list[Policy] = [] - for mode, mode_policies in policies_by_mode.items(): - changed_files = get_changed_files_for_mode(mode) - if not changed_files: - continue - - for policy in mode_policies: - # Skip if already promised - if policy.name in promised_policies: - continue - # Evaluate this policy - if evaluate_policy(policy, changed_files): - fired_policies.append(policy) - - if not fired_policies: - # No policies fired - print("{}") - return - - # Format output for Claude Code Stop hooks - # Use "decision": "block" to prevent Claude from stopping - message = format_policy_message(fired_policies) - result = { - "decision": "block", - "reason": message, - } - - print(json.dumps(result)) - - -if __name__ == "__main__": - main() diff --git a/src/deepwork/hooks/gemini_hook.sh b/src/deepwork/hooks/gemini_hook.sh index add66dfc..a2bb09da 100755 --- a/src/deepwork/hooks/gemini_hook.sh +++ b/src/deepwork/hooks/gemini_hook.sh @@ -9,7 +9,7 @@ # gemini_hook.sh # # Example: -# gemini_hook.sh deepwork.hooks.policy_check +# gemini_hook.sh deepwork.hooks.rules_check # # The Python module should implement a main() function that: # 1. 
Calls deepwork.hooks.wrapper.run_hook() with a hook function @@ -31,7 +31,7 @@ PYTHON_MODULE="${1:-}" if [ -z "${PYTHON_MODULE}" ]; then echo "Usage: gemini_hook.sh " >&2 - echo "Example: gemini_hook.sh deepwork.hooks.policy_check" >&2 + echo "Example: gemini_hook.sh deepwork.hooks.rules_check" >&2 exit 1 fi diff --git a/src/deepwork/hooks/policy_check.py b/src/deepwork/hooks/rules_check.py similarity index 51% rename from src/deepwork/hooks/policy_check.py rename to src/deepwork/hooks/rules_check.py index 287852bd..1d43d12e 100644 --- a/src/deepwork/hooks/policy_check.py +++ b/src/deepwork/hooks/rules_check.py @@ -1,15 +1,17 @@ """ -Policy check hook for DeepWork. +Rules check hook for DeepWork (v2). -This hook evaluates policies when the agent finishes (after_agent event). +This hook evaluates rules when the agent finishes (after_agent event). It uses the wrapper system for cross-platform compatibility. +Rule files are loaded from .deepwork/rules/ directory as frontmatter markdown files. 
+ Usage (via shell wrapper): - claude_hook.sh deepwork.hooks.policy_check - gemini_hook.sh deepwork.hooks.policy_check + claude_hook.sh deepwork.hooks.rules_check + gemini_hook.sh deepwork.hooks.rules_check Or directly with platform environment variable: - DEEPWORK_HOOK_PLATFORM=claude python -m deepwork.hooks.policy_check + DEEPWORK_HOOK_PLATFORM=claude python -m deepwork.hooks.rules_check """ from __future__ import annotations @@ -21,11 +23,25 @@ import sys from pathlib import Path -from deepwork.core.policy_parser import ( - Policy, - PolicyParseError, - evaluate_policy, - parse_policy_file, +from deepwork.core.command_executor import ( + all_commands_succeeded, + format_command_errors, + run_command_action, +) +from deepwork.core.rules_parser import ( + ActionType, + DetectionMode, + Rule, + RuleEvaluationResult, + RulesParseError, + evaluate_rules, + load_rules_from_directory, +) +from deepwork.core.rules_queue import ( + ActionResult, + QueueEntryStatus, + RulesQueue, + compute_trigger_hash, ) from deepwork.hooks.wrapper import ( HookInput, @@ -63,6 +79,41 @@ def get_default_branch() -> str: return "main" +def get_baseline_ref(mode: str) -> str: + """Get the baseline reference for a compare_to mode.""" + if mode == "base": + try: + default_branch = get_default_branch() + result = subprocess.run( + ["git", "merge-base", "HEAD", f"origin/{default_branch}"], + capture_output=True, + text=True, + check=True, + ) + return result.stdout.strip() + except subprocess.CalledProcessError: + return "base" + elif mode == "default_tip": + try: + default_branch = get_default_branch() + result = subprocess.run( + ["git", "rev-parse", f"origin/{default_branch}"], + capture_output=True, + text=True, + check=True, + ) + return result.stdout.strip() + except subprocess.CalledProcessError: + return "default_tip" + elif mode == "prompt": + baseline_path = Path(".deepwork/.last_work_tree") + if baseline_path.exists(): + # Use file modification time as reference + return 
str(int(baseline_path.stat().st_mtime)) + return "prompt" + return mode + + def get_changed_files_base() -> list[str]: """Get files changed relative to branch base.""" default_branch = get_default_branch() @@ -188,8 +239,15 @@ def get_changed_files_for_mode(mode: str) -> list[str]: def extract_promise_tags(text: str) -> set[str]: - """Extract policy names from <promise> tags in text.""" - pattern = r"<promise>✓\s*([^<]+)</promise>" + """ + Extract rule names from <promise> tags in text. + + Supports both: + - <promise>Rule Name</promise> + - <promise>✓ Rule Name</promise> + """ + # Match with or without checkmark + pattern = r"<promise>(?:✓\s*)?([^<]+)</promise>" matches = re.findall(pattern, text, re.IGNORECASE | re.DOTALL) return {m.strip() for m in matches} @@ -247,39 +305,63 @@ def extract_conversation_from_transcript(transcript_path: str, platform: Platfor return "" -def format_policy_message(policies: list[Policy]) -> str: - """Format triggered policies into a message for the agent.""" - lines = ["## DeepWork Policies Triggered", ""] +def format_rules_message(results: list[RuleEvaluationResult]) -> str: + """ + Format triggered rules into a concise message for the agent. + + Groups rules by name and uses minimal formatting. + """ + lines = ["## DeepWork Rules Triggered", ""] lines.append( - "Comply with the following policies. " - "To mark a policy as addressed, include `<promise>✓ Policy Name</promise>` " - "in your response (replace Policy Name with the actual policy name)." + "Comply with the following rules. " + "To mark a rule as addressed, include `<promise>Rule Name</promise>` " + "in your response."
) lines.append("") - for policy in policies: - lines.append(f"### Policy: {policy.name}") - lines.append("") - lines.append(policy.instructions.strip()) + # Group results by rule name + by_name: dict[str, list[RuleEvaluationResult]] = {} + for result in results: + name = result.rule.name + if name not in by_name: + by_name[name] = [] + by_name[name].append(result) + + for name, rule_results in by_name.items(): + rule = rule_results[0].rule + lines.append(f"## {name}") lines.append("") + # For set/pair modes, show the correspondence violations concisely + if rule.detection_mode in (DetectionMode.SET, DetectionMode.PAIR): + for result in rule_results: + for trigger_file in result.trigger_files: + for missing_file in result.missing_files: + lines.append(f"{trigger_file} -> {missing_file}") + lines.append("") + + # Show instructions + if rule.instructions: + lines.append(rule.instructions.strip()) + lines.append("") + return "\n".join(lines) -def policy_check_hook(hook_input: HookInput) -> HookOutput: +def rules_check_hook(hook_input: HookInput) -> HookOutput: """ - Main hook logic for policy evaluation. + Main hook logic for rules evaluation (v2). - This is called for after_agent events to check if policies need attention + This is called for after_agent events to check if rules need attention before allowing the agent to complete. 
""" # Only process after_agent events if hook_input.event != NormalizedEvent.AFTER_AGENT: return HookOutput() - # Check if policy file exists - policy_path = Path(".deepwork.policy.yml") - if not policy_path.exists(): + # Check if rules directory exists + rules_dir = Path(".deepwork/rules") + if not rules_dir.exists(): return HookOutput() # Extract conversation context from transcript @@ -287,50 +369,133 @@ def policy_check_hook(hook_input: HookInput) -> HookOutput: hook_input.transcript_path, hook_input.platform ) - # Extract promise tags - promised_policies = extract_promise_tags(conversation_context) + # Extract promise tags (case-insensitive) + promised_rules = extract_promise_tags(conversation_context) - # Parse policies + # Load rules try: - policies = parse_policy_file(policy_path) - except PolicyParseError as e: - print(f"Error parsing policy file: {e}", file=sys.stderr) + rules = load_rules_from_directory(rules_dir) + except RulesParseError as e: + print(f"Error loading rules: {e}", file=sys.stderr) return HookOutput() - if not policies: + if not rules: return HookOutput() - # Group policies by compare_to mode - policies_by_mode: dict[str, list[Policy]] = {} - for policy in policies: - mode = policy.compare_to - if mode not in policies_by_mode: - policies_by_mode[mode] = [] - policies_by_mode[mode].append(policy) - - # Evaluate policies - fired_policies: list[Policy] = [] - for mode, mode_policies in policies_by_mode.items(): + # Initialize queue + queue = RulesQueue() + + # Group rules by compare_to mode + rules_by_mode: dict[str, list[Rule]] = {} + for rule in rules: + mode = rule.compare_to + if mode not in rules_by_mode: + rules_by_mode[mode] = [] + rules_by_mode[mode].append(rule) + + # Evaluate rules and collect results + prompt_results: list[RuleEvaluationResult] = [] + command_errors: list[str] = [] + + for mode, mode_rules in rules_by_mode.items(): changed_files = get_changed_files_for_mode(mode) if not changed_files: continue - for policy in 
mode_policies: - if policy.name in promised_policies: - continue - if evaluate_policy(policy, changed_files): - fired_policies.append(policy) + baseline_ref = get_baseline_ref(mode) - if not fired_policies: - return HookOutput() + # Evaluate which rules fire + results = evaluate_rules(mode_rules, changed_files, promised_rules) + + for result in results: + rule = result.rule + + # Compute trigger hash for queue deduplication + trigger_hash = compute_trigger_hash( + rule.name, + result.trigger_files, + baseline_ref, + ) + + # Check if already in queue (passed/skipped) + existing = queue.get_entry(trigger_hash) + if existing and existing.status in ( + QueueEntryStatus.PASSED, + QueueEntryStatus.SKIPPED, + ): + continue - # Format message and return blocking response - message = format_policy_message(fired_policies) - return HookOutput(decision="block", reason=message) + # Create queue entry if new + if not existing: + queue.create_entry( + rule_name=rule.name, + rule_file=f"{rule.filename}.md", + trigger_files=result.trigger_files, + baseline_ref=baseline_ref, + expected_files=result.missing_files, + ) + + # Handle based on action type + if rule.action_type == ActionType.COMMAND: + # Run command action + if rule.command_action: + repo_root = Path.cwd() + cmd_results = run_command_action( + rule.command_action, + result.trigger_files, + repo_root, + ) + + if all_commands_succeeded(cmd_results): + # Command succeeded, mark as passed + queue.update_status( + trigger_hash, + QueueEntryStatus.PASSED, + ActionResult( + type="command", + output=cmd_results[0].stdout if cmd_results else None, + exit_code=0, + ), + ) + else: + # Command failed + error_msg = format_command_errors(cmd_results) + command_errors.append(f"## {rule.name}\n{error_msg}") + queue.update_status( + trigger_hash, + QueueEntryStatus.FAILED, + ActionResult( + type="command", + output=error_msg, + exit_code=cmd_results[0].exit_code if cmd_results else -1, + ), + ) + + elif rule.action_type == 
ActionType.PROMPT: + # Collect for prompt output + prompt_results.append(result) + + # Build response + messages: list[str] = [] + + # Add command errors if any + if command_errors: + messages.append("## Command Rule Errors\n") + messages.extend(command_errors) + messages.append("") + + # Add prompt rules if any + if prompt_results: + messages.append(format_rules_message(prompt_results)) + + if messages: + return HookOutput(decision="block", reason="\n".join(messages)) + + return HookOutput() def main() -> None: - """Entry point for the policy check hook.""" + """Entry point for the rules check hook.""" # Determine platform from environment platform_str = os.environ.get("DEEPWORK_HOOK_PLATFORM", "claude") try: @@ -339,7 +504,7 @@ def main() -> None: platform = Platform.CLAUDE # Run the hook with the wrapper - exit_code = run_hook(policy_check_hook, platform) + exit_code = run_hook(rules_check_hook, platform) sys.exit(exit_code) diff --git a/src/deepwork/hooks/wrapper.py b/src/deepwork/hooks/wrapper.py index 4733b5fb..ef20899c 100644 --- a/src/deepwork/hooks/wrapper.py +++ b/src/deepwork/hooks/wrapper.py @@ -358,7 +358,6 @@ def run_hook( output_json = denormalize_output(hook_output, platform, hook_input.event) write_stdout(output_json) - # Return exit code based on decision - if hook_output.decision in ("block", "deny"): - return 2 + # Always return 0 when using JSON output format + # The decision field in the JSON controls blocking behavior return 0 diff --git a/src/deepwork/schemas/policy_schema.py b/src/deepwork/schemas/policy_schema.py deleted file mode 100644 index 5aa6ae89..00000000 --- a/src/deepwork/schemas/policy_schema.py +++ /dev/null @@ -1,78 +0,0 @@ -"""JSON Schema definition for policy definitions.""" - -from typing import Any - -# JSON Schema for .deepwork.policy.yml files -# Policies are defined as an array of policy objects -POLICY_SCHEMA: dict[str, Any] = { - "$schema": "http://json-schema.org/draft-07/schema#", - "type": "array", - "description": 
"List of policies that trigger based on file changes", - "items": { - "type": "object", - "required": ["name", "trigger"], - "properties": { - "name": { - "type": "string", - "minLength": 1, - "description": "Friendly name for the policy", - }, - "trigger": { - "oneOf": [ - { - "type": "string", - "minLength": 1, - "description": "Glob pattern for files that trigger this policy", - }, - { - "type": "array", - "items": {"type": "string", "minLength": 1}, - "minItems": 1, - "description": "List of glob patterns for files that trigger this policy", - }, - ], - "description": "Glob pattern(s) for files that, if changed, should trigger this policy", - }, - "safety": { - "oneOf": [ - { - "type": "string", - "minLength": 1, - "description": "Glob pattern for safety files", - }, - { - "type": "array", - "items": {"type": "string", "minLength": 1}, - "description": "List of glob patterns for safety files", - }, - ], - "description": "Glob pattern(s) for files that, if also changed, mean the policy doesn't need to trigger", - }, - "instructions": { - "type": "string", - "minLength": 1, - "description": "Instructions to give the agent when this policy triggers", - }, - "instructions_file": { - "type": "string", - "minLength": 1, - "description": "Path to a file containing instructions (alternative to inline instructions)", - }, - "compare_to": { - "type": "string", - "enum": ["base", "default_tip", "prompt"], - "description": ( - "What to compare against when detecting changed files. " - "'base' (default) compares to the base of the current branch. " - "'default_tip' compares to the tip of the default branch. " - "'prompt' compares to the state at the start of the prompt." 
- ), - }, - }, - "oneOf": [ - {"required": ["instructions"]}, - {"required": ["instructions_file"]}, - ], - "additionalProperties": False, - }, -} diff --git a/src/deepwork/schemas/rules_schema.py b/src/deepwork/schemas/rules_schema.py new file mode 100644 index 00000000..3112dd0f --- /dev/null +++ b/src/deepwork/schemas/rules_schema.py @@ -0,0 +1,103 @@ +"""JSON Schema definition for rule definitions (v2 - frontmatter format).""" + +from typing import Any + +# Pattern for string or array of strings +STRING_OR_ARRAY: dict[str, Any] = { + "oneOf": [ + {"type": "string", "minLength": 1}, + {"type": "array", "items": {"type": "string", "minLength": 1}, "minItems": 1}, + ] +} + +# JSON Schema for rule frontmatter (YAML between --- delimiters) +# Rules are stored as individual .md files in .deepwork/rules/ +RULES_FRONTMATTER_SCHEMA: dict[str, Any] = { + "$schema": "http://json-schema.org/draft-07/schema#", + "type": "object", + "required": ["name"], + "properties": { + "name": { + "type": "string", + "minLength": 1, + "description": "Human-friendly name for the rule (displayed in promise tags)", + }, + # Detection mode: trigger/safety (mutually exclusive with set/pair) + "trigger": { + **STRING_OR_ARRAY, + "description": "Glob pattern(s) for files that trigger this rule", + }, + "safety": { + **STRING_OR_ARRAY, + "description": "Glob pattern(s) that suppress the rule if changed", + }, + # Detection mode: set (bidirectional correspondence) + "set": { + "type": "array", + "items": {"type": "string", "minLength": 1}, + "minItems": 2, + "description": "Patterns defining bidirectional file correspondence", + }, + # Detection mode: pair (directional correspondence) + "pair": { + "type": "object", + "required": ["trigger", "expects"], + "properties": { + "trigger": { + "type": "string", + "minLength": 1, + "description": "Pattern that triggers the rule", + }, + "expects": { + **STRING_OR_ARRAY, + "description": "Pattern(s) for expected corresponding files", + }, + }, + 
"additionalProperties": False, + "description": "Directional file correspondence (trigger -> expects)", + }, + # Action type: command (default is prompt using markdown body) + "action": { + "type": "object", + "required": ["command"], + "properties": { + "command": { + "type": "string", + "minLength": 1, + "description": "Command to run (supports {file}, {files}, {repo_root})", + }, + "run_for": { + "type": "string", + "enum": ["each_match", "all_matches"], + "default": "each_match", + "description": "Run command for each file or all files at once", + }, + }, + "additionalProperties": False, + "description": "Command action to run instead of prompting", + }, + # Common options + "compare_to": { + "type": "string", + "enum": ["base", "default_tip", "prompt"], + "default": "base", + "description": "Baseline for detecting file changes", + }, + }, + "additionalProperties": False, + # Detection mode must be exactly one of: trigger, set, or pair + "oneOf": [ + { + "required": ["trigger"], + "not": {"anyOf": [{"required": ["set"]}, {"required": ["pair"]}]}, + }, + { + "required": ["set"], + "not": {"anyOf": [{"required": ["trigger"]}, {"required": ["pair"]}]}, + }, + { + "required": ["pair"], + "not": {"anyOf": [{"required": ["trigger"]}, {"required": ["set"]}]}, + }, + ], +} diff --git a/src/deepwork/standard_jobs/deepwork_jobs/job.yml b/src/deepwork/standard_jobs/deepwork_jobs/job.yml index e1afa5ee..e95aa2c0 100644 --- a/src/deepwork/standard_jobs/deepwork_jobs/job.yml +++ b/src/deepwork/standard_jobs/deepwork_jobs/job.yml @@ -77,9 +77,9 @@ steps: 6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"? 7. **Sync Complete**: Has `deepwork sync` been run successfully? 8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`? - 9. **Policies Considered**: Have you thought about whether policies would benefit this job? 
- - If relevant policies were identified, did you explain them and offer to run `/deepwork_policy.define`? - - Not every job needs policies - only suggest when genuinely helpful. + 9. **Rules Considered**: Have you thought about whether rules would benefit this job? + - If relevant rules were identified, did you explain them and offer to run `/deepwork_rules.define`? + - Not every job needs rules - only suggest when genuinely helpful. If ANY criterion is not met, continue working to address it. If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response. diff --git a/src/deepwork/standard_jobs/deepwork_jobs/steps/implement.md b/src/deepwork/standard_jobs/deepwork_jobs/steps/implement.md index a3a790f6..7771eaee 100644 --- a/src/deepwork/standard_jobs/deepwork_jobs/steps/implement.md +++ b/src/deepwork/standard_jobs/deepwork_jobs/steps/implement.md @@ -130,19 +130,19 @@ This will: After running `deepwork sync`, look at the "To use the new commands" section in the output. **Relay these exact reload instructions to the user** so they know how to pick up the new commands. Don't just reference the sync output - tell them directly what they need to do (e.g., "Type 'exit' then run 'claude --resume'" for Claude Code, or "Run '/memory refresh'" for Gemini CLI). -### Step 7: Consider Policies for the New Job +### Step 7: Consider Rules for the New Job -After implementing the job, consider whether there are **policies** that would help enforce quality or consistency when working with this job's domain. +After implementing the job, consider whether there are **rules** that would help enforce quality or consistency when working with this job's domain. -**What are policies?** +**What are rules?** -Policies are automated guardrails defined in `.deepwork.policy.yml` that trigger when certain files change during an AI session. 
They help ensure: +Rules are automated guardrails stored as markdown files in `.deepwork/rules/` that trigger when certain files change during an AI session. They help ensure: - Documentation stays in sync with code - Team guidelines are followed - Architectural decisions are respected - Quality standards are maintained -**When to suggest policies:** +**When to suggest rules:** Think about the job you just implemented and ask: - Does this job produce outputs that other files depend on? @@ -150,28 +150,28 @@ Think about the job you just implemented and ask: - Are there quality checks or reviews that should happen when certain files in this domain change? - Could changes to the job's output files impact other parts of the project? -**Examples of policies that might make sense:** +**Examples of rules that might make sense:** -| Job Type | Potential Policy | -|----------|------------------| +| Job Type | Potential Rule | +|----------|----------------| | API Design | "Update API docs when endpoint definitions change" | | Database Schema | "Review migrations when schema files change" | | Competitive Research | "Update strategy docs when competitor analysis changes" | | Feature Development | "Update changelog when feature files change" | | Configuration Management | "Update install guide when config files change" | -**How to offer policy creation:** +**How to offer rule creation:** -If you identify one or more policies that would benefit the user, explain: -1. **What the policy would do** - What triggers it and what action it prompts +If you identify one or more rules that would benefit the user, explain: +1. **What the rule would do** - What triggers it and what action it prompts 2. **Why it would help** - How it prevents common mistakes or keeps things in sync 3. **What files it would watch** - The trigger patterns Then ask the user: -> "Would you like me to create this policy for you? I can run `/deepwork_policy.define` to set it up." 
+> "Would you like me to create this rule for you? I can run `/deepwork_rules.define` to set it up." -If the user agrees, invoke the `/deepwork_policy.define` command to guide them through creating the policy. +If the user agrees, invoke the `/deepwork_rules.define` command to guide them through creating the rule. **Example dialogue:** @@ -180,15 +180,15 @@ Based on the competitive_research job you just created, I noticed that when competitor analysis files change, it would be helpful to remind you to update your strategy documentation. -I'd suggest a policy like: +I'd suggest a rule like: - **Name**: "Update strategy when competitor analysis changes" - **Trigger**: `**/positioning_report.md` - **Action**: Prompt to review and update `docs/strategy.md` -Would you like me to create this policy? I can run `/deepwork_policy.define` to set it up. +Would you like me to create this rule? I can run `/deepwork_rules.define` to set it up. ``` -**Note:** Not every job needs policies. Only suggest them when they would genuinely help maintain consistency or quality. Don't force policies where they don't make sense. +**Note:** Not every job needs rules. Only suggest them when they would genuinely help maintain consistency or quality. Don't force rules where they don't make sense. 
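To make the trigger idea in Step 7 concrete, here is a minimal sketch of how a trigger/safety rule such as the `**/positioning_report.md` example might be evaluated against a list of changed files. It is not DeepWork's actual matcher: `glob_to_regex` and `rule_fires` are hypothetical helpers, the glob semantics (`*` within one segment, `**` across segments, `?` for one character) follow the pattern syntax described in the define step, and using `docs/strategy.md` as a safety pattern is an assumption made for illustration.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern[str]:
    """Translate a glob using *, ** and ? into a compiled regex (illustrative only)."""
    parts: list[str] = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # anything within a single path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")  # exactly one non-slash character
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # literal character
            i += 1
    return re.compile("".join(parts))


def rule_fires(changed: list[str], trigger: list[str], safety: list[str]) -> bool:
    """True when some changed file matches a trigger and none matches a safety pattern."""

    def any_match(patterns: list[str]) -> bool:
        return any(glob_to_regex(p).fullmatch(f) for p in patterns for f in changed)

    return any_match(trigger) and not any_match(safety)


# Hypothetical inputs mirroring the example dialogue.
trigger = ["**/positioning_report.md"]
safety = ["docs/strategy.md"]  # assumed safety pattern, not part of the dialogue
fires = rule_fires(["research/positioning_report.md"], trigger, safety)
suppressed = rule_fires(["research/positioning_report.md", "docs/strategy.md"], trigger, safety)
```

Under these assumptions the first call reports that the rule fires, while the second is suppressed because the strategy document changed in the same session.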
 ## Example Implementation
@@ -222,8 +222,8 @@ Before marking this step complete, ensure:
 - [ ] `deepwork sync` executed successfully
 - [ ] Commands generated in platform directory
 - [ ] User informed to follow reload instructions from `deepwork sync`
-- [ ] Considered whether policies would benefit this job (Step 7)
-- [ ] If policies suggested, offered to run `/deepwork_policy.define`
+- [ ] Considered whether rules would benefit this job (Step 7)
+- [ ] If rules suggested, offered to run `/deepwork_rules.define`
 
 ## Quality Criteria
 
@@ -235,4 +235,4 @@ Before marking this step complete, ensure:
 - Steps with user inputs explicitly use "ask structured questions" phrasing
 - Sync completed successfully
 - Commands available for use
-- Thoughtfully considered relevant policies for the job domain
+- Thoughtfully considered relevant rules for the job domain
diff --git a/src/deepwork/standard_jobs/deepwork_policy/hooks/global_hooks.yml b/src/deepwork/standard_jobs/deepwork_policy/hooks/global_hooks.yml
deleted file mode 100644
index 0e024fc7..00000000
--- a/src/deepwork/standard_jobs/deepwork_policy/hooks/global_hooks.yml
+++ /dev/null
@@ -1,8 +0,0 @@
-# DeepWork Policy Hooks Configuration
-# Maps Claude Code lifecycle events to hook scripts
-
-UserPromptSubmit:
-  - user_prompt_submit.sh
-
-Stop:
-  - policy_stop_hook.sh
diff --git a/src/deepwork/standard_jobs/deepwork_policy/hooks/policy_stop_hook.sh b/src/deepwork/standard_jobs/deepwork_policy/hooks/policy_stop_hook.sh
deleted file mode 100755
index b12d456c..00000000
--- a/src/deepwork/standard_jobs/deepwork_policy/hooks/policy_stop_hook.sh
+++ /dev/null
@@ -1,56 +0,0 @@
-#!/bin/bash
-# policy_stop_hook.sh - Evaluates policies when the agent stops
-#
-# This script is called as a Claude Code Stop hook. It:
-# 1. Evaluates policies from .deepwork.policy.yml
-# 2. Computes changed files based on each policy's compare_to setting
-# 3. Checks for <promise> tags in the conversation transcript
-# 4.
Returns JSON to block stop if policies need attention - -set -e - -# Check if policy file exists -if [ ! -f .deepwork.policy.yml ]; then - # No policies defined, nothing to do - exit 0 -fi - -# Read the hook input JSON from stdin -HOOK_INPUT="" -if [ ! -t 0 ]; then - HOOK_INPUT=$(cat) -fi - -# Extract transcript_path from the hook input JSON using jq -# Claude Code passes: {"session_id": "...", "transcript_path": "...", ...} -TRANSCRIPT_PATH="" -if [ -n "${HOOK_INPUT}" ]; then - TRANSCRIPT_PATH=$(echo "${HOOK_INPUT}" | jq -r '.transcript_path // empty' 2>/dev/null || echo "") -fi - -# Extract conversation text from the JSONL transcript -# The transcript is JSONL format - each line is a JSON object -# We need to extract the text content from assistant messages -conversation_context="" -if [ -n "${TRANSCRIPT_PATH}" ] && [ -f "${TRANSCRIPT_PATH}" ]; then - # Extract text content from all assistant messages in the transcript - # Each line is a JSON object; we extract .message.content[].text for assistant messages - conversation_context=$(cat "${TRANSCRIPT_PATH}" | \ - grep -E '"role"\s*:\s*"assistant"' | \ - jq -r '.message.content // [] | map(select(.type == "text")) | map(.text) | join("\n")' 2>/dev/null | \ - tr -d '\0' || echo "") -fi - -# Call the Python evaluator -# The Python module handles: -# - Parsing the policy file -# - Computing changed files based on each policy's compare_to setting -# - Matching changed files against triggers/safety patterns -# - Checking for promise tags in the conversation context -# - Generating appropriate JSON output -result=$(echo "${conversation_context}" | python -m deepwork.hooks.evaluate_policies \ - --policy-file .deepwork.policy.yml \ - 2>/dev/null || echo '{}') - -# Output the result (JSON for Claude Code hooks) -echo "${result}" diff --git a/src/deepwork/standard_jobs/deepwork_policy/job.yml b/src/deepwork/standard_jobs/deepwork_policy/job.yml deleted file mode 100644 index 777894ed..00000000 --- 
a/src/deepwork/standard_jobs/deepwork_policy/job.yml +++ /dev/null @@ -1,37 +0,0 @@ -name: deepwork_policy -version: "0.2.0" -summary: "Policy enforcement for AI agent sessions" -description: | - Manages policies that automatically trigger when certain files change during an AI agent session. - Policies help ensure that code changes follow team guidelines, documentation is updated, - and architectural decisions are respected. - - Policies are defined in a `.deepwork.policy.yml` file at the root of your project. Each policy - specifies: - - Trigger patterns: Glob patterns for files that, when changed, should trigger the policy - - Safety patterns: Glob patterns for files that, if also changed, mean the policy doesn't need to fire - - Instructions: What the agent should do when the policy triggers - - Example use cases: - - Update installation docs when configuration files change - - Require security review when authentication code is modified - - Ensure API documentation stays in sync with API code - - Remind developers to update changelogs - -changelog: - - version: "0.1.0" - changes: "Initial version" - - version: "0.2.0" - changes: "Standardized on 'ask structured questions' phrasing for user input" - -steps: - - id: define - name: "Define Policy" - description: "Create or update policy entries in .deepwork.policy.yml" - instructions_file: steps/define.md - inputs: - - name: policy_purpose - description: "What guideline or constraint should this policy enforce?" 
- outputs: - - .deepwork.policy.yml - dependencies: [] diff --git a/src/deepwork/standard_jobs/deepwork_policy/steps/define.md b/src/deepwork/standard_jobs/deepwork_policy/steps/define.md deleted file mode 100644 index 302eda7f..00000000 --- a/src/deepwork/standard_jobs/deepwork_policy/steps/define.md +++ /dev/null @@ -1,198 +0,0 @@ -# Define Policy - -## Objective - -Create or update policy entries in the `.deepwork.policy.yml` file to enforce team guidelines, documentation requirements, or other constraints when specific files change. - -## Task - -Guide the user through defining a new policy by asking structured questions. **Do not create the policy without first understanding what they want to enforce.** - -**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. - -### Step 1: Understand the Policy Purpose - -Start by asking structured questions to understand what the user wants to enforce: - -1. **What guideline or constraint should this policy enforce?** - - What situation triggers the need for action? - - What files or directories, when changed, should trigger this policy? - - Examples: "When config files change", "When API code changes", "When database schema changes" - -2. **What action should be taken?** - - What should the agent do when the policy triggers? - - Update documentation? Perform a security review? Update tests? - - Is there a specific file or process that needs attention? - -3. **Are there any "safety" conditions?** - - Are there files that, if also changed, mean the policy doesn't need to fire? 
- - For example: If config changes AND install_guide.md changes, assume docs are already updated - - This prevents redundant prompts when the user has already done the right thing - -### Step 2: Define the Trigger Patterns - -Help the user define glob patterns for files that should trigger the policy: - -**Common patterns:** -- `src/**/*.py` - All Python files in src directory (recursive) -- `app/config/**/*` - All files in app/config directory -- `*.md` - All markdown files in root -- `src/api/**/*` - All files in the API directory -- `migrations/**/*.sql` - All SQL migrations - -**Pattern syntax:** -- `*` - Matches any characters within a single path segment -- `**` - Matches any characters across multiple path segments (recursive) -- `?` - Matches a single character - -### Step 3: Define Safety Patterns (Optional) - -If there are files that, when also changed, mean the policy shouldn't fire: - -**Examples:** -- Policy: "Update install guide when config changes" - - Trigger: `app/config/**/*` - - Safety: `docs/install_guide.md` (if already updated, don't prompt) - -- Policy: "Security review for auth changes" - - Trigger: `src/auth/**/*` - - Safety: `SECURITY.md`, `docs/security_review.md` - -### Step 3b: Choose the Comparison Mode (Optional) - -The `compare_to` field controls what baseline is used when detecting "changed files": - -**Options:** -- `base` (default) - Compares to the base of the current branch (merge-base with main/master). This is the most common choice for feature branches, as it shows all changes made on the branch. -- `default_tip` - Compares to the current tip of the default branch (main/master). Useful when you want to see the difference from what's currently in production. -- `prompt` - Compares to the state at the start of each prompt. Useful for policies that should only fire based on changes made during a single agent response. - -**When to use each:** -- **base**: Best for most policies. "Did this branch change config files?" 
→ trigger docs review -- **default_tip**: For policies about what's different from production/main -- **prompt**: For policies that should only consider very recent changes within the current session - -Most policies should use the default (`base`) and don't need to specify `compare_to`. - -### Step 4: Write the Instructions - -Create clear, actionable instructions for what the agent should do when the policy fires. - -**Good instructions include:** -- What to check or review -- What files might need updating -- Specific actions to take -- Quality criteria for completion - -**Example:** -``` -Configuration files have changed. Please: -1. Review docs/install_guide.md for accuracy -2. Update any installation steps that reference changed config -3. Verify environment variable documentation is current -4. Test that installation instructions still work -``` - -### Step 5: Create the Policy Entry - -Create or update `.deepwork.policy.yml` in the project root. - -**File Location**: `.deepwork.policy.yml` (root of project) - -**Format**: -```yaml -- name: "[Friendly name for the policy]" - trigger: "[glob pattern]" # or array: ["pattern1", "pattern2"] - safety: "[glob pattern]" # optional, or array - compare_to: "base" # optional: "base" (default), "default_tip", or "prompt" - instructions: | - [Multi-line instructions for the agent...] -``` - -**Alternative with instructions_file**: -```yaml -- name: "[Friendly name for the policy]" - trigger: "[glob pattern]" - safety: "[glob pattern]" - compare_to: "base" # optional - instructions_file: "path/to/instructions.md" -``` - -### Step 6: Verify the Policy - -After creating the policy: - -1. **Check the YAML syntax** - Ensure valid YAML formatting -2. **Test trigger patterns** - Verify patterns match intended files -3. **Review instructions** - Ensure they're clear and actionable -4. 
**Check for conflicts** - Ensure the policy doesn't conflict with existing ones - -## Example Policies - -### Update Documentation on Config Changes -```yaml -- name: "Update install guide on config changes" - trigger: "app/config/**/*" - safety: "docs/install_guide.md" - instructions: | - Configuration files have been modified. Please review docs/install_guide.md - and update it if any installation instructions need to change based on the - new configuration. -``` - -### Security Review for Auth Code -```yaml -- name: "Security review for authentication changes" - trigger: - - "src/auth/**/*" - - "src/security/**/*" - safety: - - "SECURITY.md" - - "docs/security_audit.md" - instructions: | - Authentication or security code has been changed. Please: - 1. Review for hardcoded credentials or secrets - 2. Check input validation on user inputs - 3. Verify access control logic is correct - 4. Update security documentation if needed -``` - -### API Documentation Sync -```yaml -- name: "API documentation update" - trigger: "src/api/**/*.py" - safety: "docs/api/**/*.md" - instructions: | - API code has changed. Please verify that API documentation in docs/api/ - is up to date with the code changes. Pay special attention to: - - New or changed endpoints - - Modified request/response schemas - - Updated authentication requirements -``` - -## Output Format - -### .deepwork.policy.yml -Create or update this file at the project root with the new policy entry. - -## Quality Criteria - -- Asked structured questions to understand user requirements -- Policy name is clear and descriptive -- Trigger patterns accurately match the intended files -- Safety patterns prevent unnecessary triggering -- Instructions are actionable and specific -- YAML is valid and properly formatted - -## Context - -Policies are evaluated automatically when you finish working on a task. The system: -1. 
Determines which files have changed based on each policy's `compare_to` setting:
-   - `base` (default): Files changed since the branch diverged from main/master
-   - `default_tip`: Files different from the current main/master branch
-   - `prompt`: Files changed since the last prompt submission
-2. Checks if any changes match policy trigger patterns
-3. Skips policies where safety patterns also matched
-4. Prompts you with instructions for any triggered policies
-
-You can mark a policy as addressed by including `<promise>✓ Policy Name</promise>` in your response (replace Policy Name with the actual policy name). This tells the system you've already handled that policy's requirements.
diff --git a/src/deepwork/standard_jobs/deepwork_policy/hooks/capture_prompt_work_tree.sh b/src/deepwork/standard_jobs/deepwork_rules/hooks/capture_prompt_work_tree.sh
similarity index 100%
rename from src/deepwork/standard_jobs/deepwork_policy/hooks/capture_prompt_work_tree.sh
rename to src/deepwork/standard_jobs/deepwork_rules/hooks/capture_prompt_work_tree.sh
diff --git a/src/deepwork/standard_jobs/deepwork_rules/hooks/global_hooks.yml b/src/deepwork/standard_jobs/deepwork_rules/hooks/global_hooks.yml
new file mode 100644
index 00000000..a310d31a
--- /dev/null
+++ b/src/deepwork/standard_jobs/deepwork_rules/hooks/global_hooks.yml
@@ -0,0 +1,8 @@
+# DeepWork Rules Hooks Configuration
+# Maps lifecycle events to hook scripts or Python modules
+
+UserPromptSubmit:
+  - user_prompt_submit.sh
+
+Stop:
+  - module: deepwork.hooks.rules_check
diff --git a/src/deepwork/standard_jobs/deepwork_policy/hooks/user_prompt_submit.sh b/src/deepwork/standard_jobs/deepwork_rules/hooks/user_prompt_submit.sh
similarity index 100%
rename from src/deepwork/standard_jobs/deepwork_policy/hooks/user_prompt_submit.sh
rename to src/deepwork/standard_jobs/deepwork_rules/hooks/user_prompt_submit.sh
diff --git a/src/deepwork/standard_jobs/deepwork_rules/job.yml b/src/deepwork/standard_jobs/deepwork_rules/job.yml
new file mode 100644
index 00000000..af540bc4 --- /dev/null +++ b/src/deepwork/standard_jobs/deepwork_rules/job.yml @@ -0,0 +1,39 @@ +name: deepwork_rules +version: "0.3.0" +summary: "Rules enforcement for AI agent sessions" +description: | + Manages rules that automatically trigger when certain files change during an AI agent session. + Rules help ensure that code changes follow team guidelines, documentation is updated, + and architectural decisions are respected. + + Rules are stored as individual markdown files with YAML frontmatter in the `.deepwork/rules/` + directory. Each rule file specifies: + - Detection mode: trigger/safety, set (bidirectional), or pair (directional) + - Patterns: Glob patterns for matching files, with optional variable capture + - Instructions: Markdown content describing what the agent should do + + Example use cases: + - Update installation docs when configuration files change + - Require security review when authentication code is modified + - Ensure API documentation stays in sync with API code + - Enforce source/test file pairing + +changelog: + - version: "0.1.0" + changes: "Initial version" + - version: "0.2.0" + changes: "Standardized on 'ask structured questions' phrasing for user input" + - version: "0.3.0" + changes: "Migrated to v2 format - individual markdown files in .deepwork/rules/" + +steps: + - id: define + name: "Define Rule" + description: "Create a new rule file in .deepwork/rules/" + instructions_file: steps/define.md + inputs: + - name: rule_purpose + description: "What guideline or constraint should this rule enforce?" + outputs: + - .deepwork/rules/{rule-name}.md + dependencies: [] diff --git a/src/deepwork/standard_jobs/deepwork_rules/rules/.gitkeep b/src/deepwork/standard_jobs/deepwork_rules/rules/.gitkeep new file mode 100644 index 00000000..429162b4 --- /dev/null +++ b/src/deepwork/standard_jobs/deepwork_rules/rules/.gitkeep @@ -0,0 +1,13 @@ +# This directory contains example rule templates. 
+# Copy and customize these files to create your own rules. +# +# Rule files use YAML frontmatter in markdown format: +# +# --- +# name: Rule Name +# trigger: "pattern/**/*" +# safety: "optional/pattern" +# --- +# Instructions in markdown here. +# +# See doc/rules_syntax.md for full documentation. diff --git a/src/deepwork/standard_jobs/deepwork_rules/rules/api-documentation-sync.md.example b/src/deepwork/standard_jobs/deepwork_rules/rules/api-documentation-sync.md.example new file mode 100644 index 00000000..427da7ae --- /dev/null +++ b/src/deepwork/standard_jobs/deepwork_rules/rules/api-documentation-sync.md.example @@ -0,0 +1,10 @@ +--- +name: API Documentation Sync +trigger: src/api/**/* +safety: docs/api/**/*.md +--- +API code has changed. Please verify that API documentation is up to date: + +- New or changed endpoints +- Modified request/response schemas +- Updated authentication requirements diff --git a/src/deepwork/standard_jobs/deepwork_rules/rules/readme-documentation.md.example b/src/deepwork/standard_jobs/deepwork_rules/rules/readme-documentation.md.example new file mode 100644 index 00000000..6be90c83 --- /dev/null +++ b/src/deepwork/standard_jobs/deepwork_rules/rules/readme-documentation.md.example @@ -0,0 +1,10 @@ +--- +name: README Documentation +trigger: src/**/* +safety: README.md +--- +Source code has been modified. Please review README.md for accuracy: + +1. Verify the project overview reflects current functionality +2. Check that usage examples are still correct +3. 
Ensure installation/setup instructions remain valid diff --git a/src/deepwork/standard_jobs/deepwork_rules/rules/security-review.md.example b/src/deepwork/standard_jobs/deepwork_rules/rules/security-review.md.example new file mode 100644 index 00000000..abce3194 --- /dev/null +++ b/src/deepwork/standard_jobs/deepwork_rules/rules/security-review.md.example @@ -0,0 +1,11 @@ +--- +name: Security Review for Auth Changes +trigger: + - src/auth/**/* + - src/security/**/* +--- +Authentication or security code has been changed. Please: + +1. Review for hardcoded credentials or secrets +2. Check input validation on user inputs +3. Verify access control logic is correct diff --git a/src/deepwork/standard_jobs/deepwork_rules/rules/source-test-pairing.md.example b/src/deepwork/standard_jobs/deepwork_rules/rules/source-test-pairing.md.example new file mode 100644 index 00000000..3ebd6968 --- /dev/null +++ b/src/deepwork/standard_jobs/deepwork_rules/rules/source-test-pairing.md.example @@ -0,0 +1,13 @@ +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +When modifying source code, ensure corresponding tests are updated. +When adding tests, ensure they test actual source code. + +Modified source: {trigger_files} +Expected tests: {expected_files} diff --git a/src/deepwork/standard_jobs/deepwork_rules/steps/define.md b/src/deepwork/standard_jobs/deepwork_rules/steps/define.md new file mode 100644 index 00000000..1e38a5e6 --- /dev/null +++ b/src/deepwork/standard_jobs/deepwork_rules/steps/define.md @@ -0,0 +1,249 @@ +# Define Rule + +## Objective + +Create a new rule file in the `.deepwork/rules/` directory to enforce team guidelines, documentation requirements, or other constraints when specific files change. + +## Task + +Guide the user through defining a new rule by asking structured questions. 
**Do not create the rule without first understanding what they want to enforce.** + +**Important**: Use the AskUserQuestion tool to ask structured questions when gathering information from the user. This provides a better user experience with clear options and guided choices. + +### Step 1: Understand the Rule Purpose + +Start by asking structured questions to understand what the user wants to enforce: + +1. **What guideline or constraint should this rule enforce?** + - What situation triggers the need for action? + - What files or directories, when changed, should trigger this rule? + - Examples: "When config files change", "When API code changes", "When database schema changes" + +2. **What action should be taken?** + - What should the agent do when the rule triggers? + - Update documentation? Perform a security review? Update tests? + - Is there a specific file or process that needs attention? + +3. **Are there any "safety" conditions?** + - Are there files that, if also changed, mean the rule doesn't need to fire? 
+ - For example: If config changes AND install_guide.md changes, assume docs are already updated + - This prevents redundant prompts when the user has already done the right thing + +### Step 2: Choose the Detection Mode + +Help the user select the appropriate detection mode: + +**Trigger/Safety Mode** (most common): +- Fires when trigger patterns match AND no safety patterns match +- Use for: "When X changes, check Y" rules +- Example: When config changes, verify install docs + +**Set Mode** (bidirectional correspondence): +- Fires when files that should change together don't all change +- Use for: Source/test pairing, model/migration sync +- Example: `src/foo.py` and `tests/foo_test.py` should change together + +**Pair Mode** (directional correspondence): +- Fires when a trigger file changes but expected files don't +- Changes to expected files alone do NOT trigger +- Use for: API code requires documentation updates (but docs can update independently) + +### Step 3: Define the Patterns + +Help the user define glob patterns for files. 
+ +**Common patterns:** +- `src/**/*.py` - All Python files in src directory (recursive) +- `app/config/**/*` - All files in app/config directory +- `*.md` - All markdown files in root +- `src/api/**/*` - All files in the API directory +- `migrations/**/*.sql` - All SQL migrations + +**Variable patterns (for set/pair modes):** +- `src/{path}.py` - Captures path variable (e.g., `foo/bar` from `src/foo/bar.py`) +- `tests/{path}_test.py` - Uses same path variable in corresponding file +- `{name}` matches single segment, `{path}` matches multiple segments + +**Pattern syntax:** +- `*` - Matches any characters within a single path segment +- `**` - Matches any characters across multiple path segments (recursive) +- `?` - Matches a single character + +### Step 4: Choose the Comparison Mode (Optional) + +The `compare_to` field controls what baseline is used when detecting "changed files": + +**Options:** +- `base` (default) - Compares to the base of the current branch (merge-base with main/master). Best for feature branches. +- `default_tip` - Compares to the current tip of the default branch. Useful for seeing difference from production. +- `prompt` - Compares to the state at the start of each prompt. For rules about very recent changes. + +Most rules should use the default (`base`) and don't need to specify `compare_to`. + +### Step 5: Write the Instructions + +Create clear, actionable instructions for what the agent should do when the rule fires. 
+ +**Good instructions include:** +- What to check or review +- What files might need updating +- Specific actions to take +- Quality criteria for completion + +**Template variables available in instructions:** +- `{trigger_files}` - Files that triggered the rule +- `{expected_files}` - Expected corresponding files (for set/pair modes) + +### Step 6: Create the Rule File + +Create a new file in `.deepwork/rules/` with a kebab-case filename: + +**File Location**: `.deepwork/rules/{rule-name}.md` + +**Format for Trigger/Safety Mode:** +```markdown +--- +name: Friendly Name for the Rule +trigger: "glob/pattern/**/*" # or array: ["pattern1", "pattern2"] +safety: "optional/pattern" # optional, or array +compare_to: base # optional: "base" (default), "default_tip", or "prompt" +--- +Instructions for the agent when this rule fires. + +Multi-line markdown content is supported. +``` + +**Format for Set Mode (bidirectional):** +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +Modified: {trigger_files} +Expected: {expected_files} +``` + +**Format for Pair Mode (directional):** +```markdown +--- +name: API Documentation +pair: + trigger: api/{path}.py + expects: docs/api/{path}.md +--- +API code requires documentation updates. + +Changed API: {trigger_files} +Update docs: {expected_files} +``` + +### Step 7: Verify the Rule + +After creating the rule: + +1. **Check the YAML frontmatter** - Ensure valid YAML formatting +2. **Test trigger patterns** - Verify patterns match intended files +3. **Review instructions** - Ensure they're clear and actionable +4. 
**Check for conflicts** - Ensure the rule doesn't conflict with existing ones + +## Example Rules + +### Update Documentation on Config Changes +`.deepwork/rules/config-docs.md`: +```markdown +--- +name: Update Install Guide on Config Changes +trigger: app/config/**/* +safety: docs/install_guide.md +--- +Configuration files have been modified. Please review docs/install_guide.md +and update it if any installation instructions need to change based on the +new configuration. +``` + +### Security Review for Auth Code +`.deepwork/rules/security-review.md`: +```markdown +--- +name: Security Review for Authentication Changes +trigger: + - src/auth/**/* + - src/security/**/* +safety: + - SECURITY.md + - docs/security_audit.md +--- +Authentication or security code has been changed. Please: + +1. Review for hardcoded credentials or secrets +2. Check input validation on user inputs +3. Verify access control logic is correct +4. Update security documentation if needed +``` + +### Source/Test Pairing +`.deepwork/rules/source-test-pairing.md`: +```markdown +--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. + +When modifying source code, ensure corresponding tests are updated. +When adding tests, ensure they test actual source code. + +Modified: {trigger_files} +Expected: {expected_files} +``` + +### API Documentation Sync +`.deepwork/rules/api-docs.md`: +```markdown +--- +name: API Documentation Update +pair: + trigger: src/api/{path}.py + expects: docs/api/{path}.md +--- +API code has changed. Please verify that API documentation in docs/api/ +is up to date with the code changes. 
Pay special attention to: + +- New or changed endpoints +- Modified request/response schemas +- Updated authentication requirements + +Changed API: {trigger_files} +Update: {expected_files} +``` + +## Output Format + +### .deepwork/rules/{rule-name}.md +Create a new file with the rule definition using YAML frontmatter and markdown body. + +## Quality Criteria + +- Asked structured questions to understand user requirements +- Rule name is clear and descriptive (used in promise tags) +- Correct detection mode selected for the use case +- Patterns accurately match the intended files +- Safety patterns prevent unnecessary triggering (if applicable) +- Instructions are actionable and specific +- YAML frontmatter is valid + +## Context + +Rules are evaluated automatically when the agent finishes a task. The system: +1. Determines which files have changed based on each rule's `compare_to` setting +2. Evaluates rules based on their detection mode (trigger/safety, set, or pair) +3. Skips rules where the correspondence is satisfied (for set/pair) or safety matched +4. Prompts you with instructions for any triggered rules + +You can mark a rule as addressed by including `Rule Name` in your response (replace Rule Name with the actual rule name from the `name` field). This tells the system you've already handled that rule's requirements. diff --git a/src/deepwork/templates/default_policy.yml b/src/deepwork/templates/default_policy.yml deleted file mode 100644 index 2f895bde..00000000 --- a/src/deepwork/templates/default_policy.yml +++ /dev/null @@ -1,53 +0,0 @@ -# DeepWork Policy Configuration -# -# Policies are automated guardrails that trigger when specific files change. -# They help ensure documentation stays current, security reviews happen, etc. -# -# Use /deepwork_policy.define to create new policies interactively. 
-# -# Format: -# - name: "Friendly name for the policy" -# trigger: "glob/pattern/**/*" # or array: ["pattern1", "pattern2"] -# safety: "pattern/**/*" # optional - if these also changed, skip the policy -# compare_to: "base" # optional: "base" (default), "default_tip", or "prompt" -# instructions: | -# Multi-line instructions for the AI agent... -# -# Example policies (uncomment and customize): -# -# - name: "README Documentation" -# trigger: "src/**/*" -# safety: "README.md" -# instructions: | -# Source code has been modified. Please review README.md for accuracy: -# 1. Verify the project overview reflects current functionality -# 2. Check that usage examples are still correct -# 3. Ensure installation/setup instructions remain valid -# -# - name: "API Documentation Sync" -# trigger: "src/api/**/*" -# safety: "docs/api/**/*.md" -# instructions: | -# API code has changed. Please verify that API documentation is up to date: -# - New or changed endpoints -# - Modified request/response schemas -# - Updated authentication requirements -# -# - name: "Security Review for Auth Changes" -# trigger: -# - "src/auth/**/*" -# - "src/security/**/*" -# instructions: | -# Authentication or security code has been changed. Please: -# 1. Review for hardcoded credentials or secrets -# 2. Check input validation on user inputs -# 3. Verify access control logic is correct -# -# - name: "Test Coverage for New Code" -# trigger: "src/**/*.py" -# safety: "tests/**/*.py" -# instructions: | -# New source code was added. Please ensure appropriate test coverage: -# 1. Add unit tests for new functions/methods -# 2. Update integration tests if behavior changed -# 3. 
Verify all new code paths are tested diff --git a/tests/fixtures/policies/empty_policy.yml b/tests/fixtures/policies/empty_policy.yml deleted file mode 100644 index c8faa07a..00000000 --- a/tests/fixtures/policies/empty_policy.yml +++ /dev/null @@ -1 +0,0 @@ -# Empty policy file diff --git a/tests/fixtures/policies/instructions/security_review.md b/tests/fixtures/policies/instructions/security_review.md deleted file mode 100644 index b64978bc..00000000 --- a/tests/fixtures/policies/instructions/security_review.md +++ /dev/null @@ -1,8 +0,0 @@ -## Security Review Required - -Authentication code has been modified. Please: - -1. Check for hardcoded credentials -2. Verify input validation -3. Review access control logic -4. Update security documentation diff --git a/tests/fixtures/policies/invalid_missing_instructions.yml b/tests/fixtures/policies/invalid_missing_instructions.yml deleted file mode 100644 index 6c47934a..00000000 --- a/tests/fixtures/policies/invalid_missing_instructions.yml +++ /dev/null @@ -1,2 +0,0 @@ -- name: "Invalid policy" - trigger: "src/**/*" diff --git a/tests/fixtures/policies/invalid_missing_trigger.yml b/tests/fixtures/policies/invalid_missing_trigger.yml deleted file mode 100644 index a5c89493..00000000 --- a/tests/fixtures/policies/invalid_missing_trigger.yml +++ /dev/null @@ -1,3 +0,0 @@ -- name: "Invalid policy" - safety: "some/file.md" - instructions: "This policy is missing a trigger" diff --git a/tests/fixtures/policies/multiple_policies.yml b/tests/fixtures/policies/multiple_policies.yml deleted file mode 100644 index da292317..00000000 --- a/tests/fixtures/policies/multiple_policies.yml +++ /dev/null @@ -1,21 +0,0 @@ -- name: "Update install guide on config changes" - trigger: "app/config/**/*" - safety: "docs/install_guide.md" - instructions: "Update docs/install_guide.md if needed." 
- -- name: "Security review for auth changes" - trigger: - - "src/auth/**/*" - - "src/security/**/*" - safety: - - "SECURITY.md" - - "docs/security_review.md" - instructions: | - Authentication or security code has changed. - Please ensure: - 1. No secrets are exposed - 2. Security review documentation is updated - -- name: "API documentation update" - trigger: "src/api/**/*.py" - instructions: "API code changed. Update API documentation." diff --git a/tests/fixtures/policies/policy_with_instructions_file.yml b/tests/fixtures/policies/policy_with_instructions_file.yml deleted file mode 100644 index 267bfc66..00000000 --- a/tests/fixtures/policies/policy_with_instructions_file.yml +++ /dev/null @@ -1,3 +0,0 @@ -- name: "Security review" - trigger: "src/auth/**/*" - instructions_file: "instructions/security_review.md" diff --git a/tests/fixtures/policies/valid_policy.yml b/tests/fixtures/policies/valid_policy.yml deleted file mode 100644 index a2b0b6be..00000000 --- a/tests/fixtures/policies/valid_policy.yml +++ /dev/null @@ -1,6 +0,0 @@ -- name: "Update install guide on config changes" - trigger: "app/config/**/*" - safety: "docs/install_guide.md" - instructions: | - Configuration files have changed. Please review docs/install_guide.md - and update it if the installation instructions need to change. 
diff --git a/tests/integration/test_install_flow.py b/tests/integration/test_install_flow.py index 4f42353f..23037f65 100644 --- a/tests/integration/test_install_flow.py +++ b/tests/integration/test_install_flow.py @@ -152,8 +152,8 @@ def test_install_is_idempotent(self, mock_claude_project: Path) -> None: assert (claude_dir / "deepwork_jobs.define.md").exists() assert (claude_dir / "deepwork_jobs.learn.md").exists() - def test_install_creates_policy_template(self, mock_claude_project: Path) -> None: - """Test that install creates a policy template file.""" + def test_install_creates_rules_directory(self, mock_claude_project: Path) -> None: + """Test that install creates the v2 rules directory with example templates.""" runner = CliRunner() result = runner.invoke( @@ -163,34 +163,38 @@ def test_install_creates_policy_template(self, mock_claude_project: Path) -> Non ) assert result.exit_code == 0 - assert ".deepwork.policy.yml template" in result.output + assert ".deepwork/rules/ with example templates" in result.output - # Verify policy file was created - policy_file = mock_claude_project / ".deepwork.policy.yml" - assert policy_file.exists() + # Verify rules directory was created + rules_dir = mock_claude_project / ".deepwork" / "rules" + assert rules_dir.exists() - # Verify it's the template (has comment header, no active policies) - content = policy_file.read_text() - assert "# DeepWork Policy Configuration" in content - assert "# Use /deepwork_policy.define" in content + # Verify README was created + readme_file = rules_dir / "README.md" + assert readme_file.exists() + content = readme_file.read_text() + assert "DeepWork Rules" in content + assert "YAML frontmatter" in content - # Verify it does NOT contain deepwork-specific policies - assert "Standard Jobs Source of Truth" not in content - assert "Version and Changelog Update" not in content - assert "pyproject.toml" not in content + # Verify example templates were copied + example_files = 
list(rules_dir.glob("*.md.example")) + assert len(example_files) >= 1 # At least one example template - def test_install_preserves_existing_policy_file(self, mock_claude_project: Path) -> None: - """Test that install doesn't overwrite existing policy file.""" + def test_install_preserves_existing_rules_directory(self, mock_claude_project: Path) -> None: + """Test that install doesn't overwrite existing rules directory.""" runner = CliRunner() - # Create a custom policy file before install - policy_file = mock_claude_project / ".deepwork.policy.yml" - custom_content = """- name: "My Custom Policy" - trigger: "src/**/*" - instructions: | - Custom instructions here. + # Create a custom rules directory before install + rules_dir = mock_claude_project / ".deepwork" / "rules" + rules_dir.mkdir(parents=True) + custom_rule = rules_dir / "my-custom-rule.md" + custom_content = """--- +name: My Custom Rule +trigger: "src/**/*" +--- +Custom instructions here. """ - policy_file.write_text(custom_content) + custom_rule.write_text(custom_content) result = runner.invoke( cli, @@ -199,10 +203,10 @@ def test_install_preserves_existing_policy_file(self, mock_claude_project: Path) ) assert result.exit_code == 0 - assert ".deepwork.policy.yml already exists" in result.output + assert ".deepwork/rules/ already exists" in result.output # Verify original content is preserved - assert policy_file.read_text() == custom_content + assert custom_rule.read_text() == custom_content class TestCLIEntryPoint: diff --git a/tests/shell_script_tests/README.md b/tests/shell_script_tests/README.md index 95bf0468..76cd8f05 100644 --- a/tests/shell_script_tests/README.md +++ b/tests/shell_script_tests/README.md @@ -1,14 +1,14 @@ # Shell Script Tests -Automated tests for DeepWork shell scripts, with a focus on validating Claude Code hooks JSON response formats. +Automated tests for DeepWork shell scripts and hooks, with a focus on validating Claude Code hooks JSON response formats. 
-## Scripts Tested +## Hooks and Scripts Tested -| Script | Type | Description | -|--------|------|-------------| -| `policy_stop_hook.sh` | Stop Hook | Evaluates policies and blocks agent stop if policies are triggered | +| Hook/Script | Type | Description | +|-------------|------|-------------| +| `deepwork.hooks.rules_check` | Stop Hook (Python) | Evaluates rules and blocks agent stop if rules are triggered | | `user_prompt_submit.sh` | UserPromptSubmit Hook | Captures work tree state when user submits a prompt | -| `capture_prompt_work_tree.sh` | Helper | Records current git state for `compare_to: prompt` policies | +| `capture_prompt_work_tree.sh` | Helper | Records current git state for `compare_to: prompt` rules | | `make_new_job.sh` | Utility | Creates directory structure for new DeepWork jobs | ## Claude Code Hooks JSON Format @@ -38,7 +38,7 @@ Hook scripts must return valid JSON responses. The tests enforce these formats: uv run pytest tests/shell_script_tests/ -v # Run tests for a specific script -uv run pytest tests/shell_script_tests/test_policy_stop_hook.py -v +uv run pytest tests/shell_script_tests/test_rules_stop_hook.py -v # Run with coverage uv run pytest tests/shell_script_tests/ --cov=src/deepwork @@ -49,10 +49,10 @@ uv run pytest tests/shell_script_tests/ --cov=src/deepwork ``` tests/shell_script_tests/ ├── conftest.py # Shared fixtures and helpers -├── test_policy_stop_hook.py # Stop hook blocking/allowing tests +├── test_hooks.py # Consolidated hook tests (JSON format, exit codes) +├── test_rules_stop_hook.py # Stop hook blocking/allowing tests ├── test_user_prompt_submit.py # Prompt submission hook tests ├── test_capture_prompt_work_tree.py # Work tree capture tests -├── test_hooks_json_format.py # JSON format validation tests └── test_make_new_job.py # Job directory creation tests ``` @@ -63,8 +63,8 @@ Available in `conftest.py`: | Fixture | Description | |---------|-------------| | `git_repo` | Basic git repo with initial commit | -| 
`git_repo_with_policy` | Git repo with a Python file policy | -| `policy_hooks_dir` | Path to policy hooks scripts | +| `git_repo_with_rule` | Git repo with a Python file rule | +| `rules_hooks_dir` | Path to rules hooks scripts | | `jobs_scripts_dir` | Path to job management scripts | ## Adding New Tests diff --git a/tests/shell_script_tests/conftest.py b/tests/shell_script_tests/conftest.py index 085cf2ff..3ac15822 100644 --- a/tests/shell_script_tests/conftest.py +++ b/tests/shell_script_tests/conftest.py @@ -23,8 +23,8 @@ def git_repo(tmp_path: Path) -> Path: @pytest.fixture -def git_repo_with_policy(tmp_path: Path) -> Path: - """Create a git repo with policy that will fire.""" +def git_repo_with_rule(tmp_path: Path) -> Path: + """Create a git repo with rule that will fire.""" repo = Repo.init(tmp_path) readme = tmp_path / "README.md" @@ -32,38 +32,54 @@ def git_repo_with_policy(tmp_path: Path) -> Path: repo.index.add(["README.md"]) repo.index.commit("Initial commit") - # Policy that triggers on any Python file - policy_file = tmp_path / ".deepwork.policy.yml" - policy_file.write_text( - """- name: "Python File Policy" - trigger: "**/*.py" - compare_to: prompt - instructions: | - Review Python files for quality. + # Create v2 rules directory and file + rules_dir = tmp_path / ".deepwork" / "rules" + rules_dir.mkdir(parents=True, exist_ok=True) + + # Rule that triggers on any Python file (v2 format) + rule_file = rules_dir / "python-file-rule.md" + rule_file.write_text( + """--- +name: Python File Rule +trigger: "**/*.py" +compare_to: prompt +--- +Review Python files for quality. 
""" ) # Empty baseline so new files trigger deepwork_dir = tmp_path / ".deepwork" - deepwork_dir.mkdir(exist_ok=True) (deepwork_dir / ".last_work_tree").write_text("") return tmp_path @pytest.fixture -def policy_hooks_dir() -> Path: - """Return the path to the policy hooks scripts directory.""" +def rules_hooks_dir() -> Path: + """Return the path to the rules hooks scripts directory.""" return ( Path(__file__).parent.parent.parent / "src" / "deepwork" / "standard_jobs" - / "deepwork_policy" + / "deepwork_rules" / "hooks" ) +@pytest.fixture +def hooks_dir() -> Path: + """Return the path to the main hooks directory (platform wrappers).""" + return Path(__file__).parent.parent.parent / "src" / "deepwork" / "hooks" + + +@pytest.fixture +def src_dir() -> Path: + """Return the path to the src directory for PYTHONPATH.""" + return Path(__file__).parent.parent.parent / "src" + + @pytest.fixture def jobs_scripts_dir() -> Path: """Return the path to the jobs scripts directory.""" diff --git a/tests/shell_script_tests/test_capture_prompt_work_tree.py b/tests/shell_script_tests/test_capture_prompt_work_tree.py index 4f187b13..6f0435b1 100644 --- a/tests/shell_script_tests/test_capture_prompt_work_tree.py +++ b/tests/shell_script_tests/test_capture_prompt_work_tree.py @@ -1,7 +1,7 @@ """Tests for capture_prompt_work_tree.sh helper script. This script captures the git work tree state for use with -compare_to: prompt policies. It should: +compare_to: prompt rules. It should: 1. Create .deepwork directory if needed 2. Stage all changes with git add -A 3. 
Record changed files to .deepwork/.last_work_tree @@ -35,36 +35,36 @@ def run_capture_script(script_path: Path, cwd: Path) -> tuple[str, str, int]: class TestCapturePromptWorkTreeBasic: """Basic functionality tests for capture_prompt_work_tree.sh.""" - def test_exits_successfully(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_exits_successfully(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the script exits with code 0.""" - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) assert code == 0, f"Expected exit code 0, got {code}. stderr: {stderr}" - def test_creates_deepwork_directory(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_creates_deepwork_directory(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the script creates .deepwork directory.""" deepwork_dir = git_repo / ".deepwork" assert not deepwork_dir.exists(), "Precondition: .deepwork should not exist" - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) assert code == 0, f"Script failed with stderr: {stderr}" assert deepwork_dir.exists(), "Script should create .deepwork directory" - def test_creates_last_work_tree_file(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_creates_last_work_tree_file(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the script creates .last_work_tree file.""" - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) work_tree_file = git_repo / ".deepwork" / ".last_work_tree" assert code == 0, f"Script failed with stderr: {stderr}" assert work_tree_file.exists(), 
"Script should create .last_work_tree file" - def test_empty_repo_produces_empty_file(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_empty_repo_produces_empty_file(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that a clean repo produces an empty work tree file.""" - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) # Clean repo should have empty or minimal content @@ -75,7 +75,7 @@ def test_empty_repo_produces_empty_file(self, policy_hooks_dir: Path, git_repo: class TestCapturePromptWorkTreeFileTracking: """Tests for file tracking behavior in capture_prompt_work_tree.sh.""" - def test_captures_staged_files(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_captures_staged_files(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that staged files are captured.""" # Create and stage a file new_file = git_repo / "staged.py" @@ -83,7 +83,7 @@ def test_captures_staged_files(self, policy_hooks_dir: Path, git_repo: Path) -> repo = Repo(git_repo) repo.index.add(["staged.py"]) - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) work_tree_file = git_repo / ".deepwork" / ".last_work_tree" @@ -92,13 +92,13 @@ def test_captures_staged_files(self, policy_hooks_dir: Path, git_repo: Path) -> assert code == 0, f"Script failed with stderr: {stderr}" assert "staged.py" in content, "Staged file should be in work tree" - def test_captures_unstaged_changes(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_captures_unstaged_changes(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that unstaged changes are captured (after staging by script).""" # Create an unstaged file unstaged = git_repo / "unstaged.py" 
unstaged.write_text("# Unstaged file\n") - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) work_tree_file = git_repo / ".deepwork" / ".last_work_tree" @@ -107,14 +107,14 @@ def test_captures_unstaged_changes(self, policy_hooks_dir: Path, git_repo: Path) assert code == 0, f"Script failed with stderr: {stderr}" assert "unstaged.py" in content, "Unstaged file should be captured" - def test_captures_files_in_subdirectories(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_captures_files_in_subdirectories(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that files in subdirectories are captured.""" # Create files in nested directories src_dir = git_repo / "src" / "components" src_dir.mkdir(parents=True) (src_dir / "button.py").write_text("# Button component\n") - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) work_tree_file = git_repo / ".deepwork" / ".last_work_tree" @@ -124,10 +124,10 @@ def test_captures_files_in_subdirectories(self, policy_hooks_dir: Path, git_repo assert "src/components/button.py" in content, "Nested file should be captured" def test_captures_multiple_files( - self, policy_hooks_dir: Path, git_repo_with_changes: Path + self, rules_hooks_dir: Path, git_repo_with_changes: Path ) -> None: """Test that multiple files are captured.""" - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo_with_changes) work_tree_file = git_repo_with_changes / ".deepwork" / ".last_work_tree" @@ -137,14 +137,14 @@ def test_captures_multiple_files( assert "modified.py" in content, "Modified file should be captured" assert 
"src/main.py" in content, "File in src/ should be captured" - def test_file_list_is_sorted_and_unique(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_file_list_is_sorted_and_unique(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the file list is sorted and deduplicated.""" # Create multiple files (git_repo / "z_file.py").write_text("# Z file\n") (git_repo / "a_file.py").write_text("# A file\n") (git_repo / "m_file.py").write_text("# M file\n") - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) work_tree_file = git_repo / ".deepwork" / ".last_work_tree" @@ -161,7 +161,7 @@ def test_file_list_is_sorted_and_unique(self, policy_hooks_dir: Path, git_repo: class TestCapturePromptWorkTreeGitStates: """Tests for handling various git states in capture_prompt_work_tree.sh.""" - def test_handles_deleted_files(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_handles_deleted_files(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that deleted files are handled gracefully.""" # Create and commit a file, then delete it to_delete = git_repo / "to_delete.py" @@ -173,12 +173,12 @@ def test_handles_deleted_files(self, policy_hooks_dir: Path, git_repo: Path) -> # Now delete it to_delete.unlink() - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) assert code == 0, f"Script should handle deletions. 
stderr: {stderr}" - def test_handles_renamed_files(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_handles_renamed_files(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that renamed files are tracked.""" # Create and commit a file old_name = git_repo / "old_name.py" @@ -191,7 +191,7 @@ def test_handles_renamed_files(self, policy_hooks_dir: Path, git_repo: Path) -> new_name = git_repo / "new_name.py" old_name.rename(new_name) - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) work_tree_file = git_repo / ".deepwork" / ".last_work_tree" @@ -201,13 +201,13 @@ def test_handles_renamed_files(self, policy_hooks_dir: Path, git_repo: Path) -> # Both old (deleted) and new should appear as changes assert "new_name.py" in content, "New filename should be captured" - def test_handles_modified_files(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_handles_modified_files(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that modified committed files are tracked.""" # Modify an existing committed file readme = git_repo / "README.md" readme.write_text("# Modified content\n") - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) work_tree_file = git_repo / ".deepwork" / ".last_work_tree" @@ -220,17 +220,17 @@ def test_handles_modified_files(self, policy_hooks_dir: Path, git_repo: Path) -> class TestCapturePromptWorkTreeIdempotence: """Tests for idempotent behavior of capture_prompt_work_tree.sh.""" - def test_multiple_runs_succeed(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_multiple_runs_succeed(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the script can be run multiple times.""" - script_path = 
policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" for i in range(3): stdout, stderr, code = run_capture_script(script_path, git_repo) assert code == 0, f"Run {i + 1} failed with stderr: {stderr}" - def test_updates_on_new_changes(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_updates_on_new_changes(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that subsequent runs capture new changes.""" - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" # First run run_capture_script(script_path, git_repo) @@ -246,12 +246,12 @@ def test_updates_on_new_changes(self, policy_hooks_dir: Path, git_repo: Path) -> assert "new_file.py" in content, "New file should be captured" - def test_existing_deepwork_dir_not_error(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_existing_deepwork_dir_not_error(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that existing .deepwork directory is not an error.""" # Pre-create the directory (git_repo / ".deepwork").mkdir() - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" stdout, stderr, code = run_capture_script(script_path, git_repo) assert code == 0, f"Should handle existing .deepwork dir. stderr: {stderr}" diff --git a/tests/shell_script_tests/test_hook_wrappers.py b/tests/shell_script_tests/test_hook_wrappers.py deleted file mode 100644 index 7b3b1436..00000000 --- a/tests/shell_script_tests/test_hook_wrappers.py +++ /dev/null @@ -1,311 +0,0 @@ -"""Tests for the platform hook wrapper shell scripts. - -These tests verify that claude_hook.sh and gemini_hook.sh correctly -invoke Python hooks and handle input/output. 
-""" - -import json -import os -import subprocess -from pathlib import Path - -import pytest - - -@pytest.fixture -def hooks_dir() -> Path: - """Return the path to the hooks directory.""" - return Path(__file__).parent.parent.parent / "src" / "deepwork" / "hooks" - - -@pytest.fixture -def src_dir() -> Path: - """Return the path to the src directory for PYTHONPATH.""" - return Path(__file__).parent.parent.parent / "src" - - -def run_hook_script( - script_path: Path, - python_module: str, - hook_input: dict, - platform: str, - src_dir: Path, -) -> tuple[str, str, int]: - """ - Run a hook wrapper script with the given input. - - Args: - script_path: Path to the wrapper script (claude_hook.sh or gemini_hook.sh) - python_module: Python module to invoke - hook_input: JSON input to pass via stdin - platform: Platform identifier for env var - src_dir: Path to src directory for PYTHONPATH - - Returns: - Tuple of (stdout, stderr, return_code) - """ - env = os.environ.copy() - env["PYTHONPATH"] = str(src_dir) - env["DEEPWORK_HOOK_PLATFORM"] = platform - - result = subprocess.run( - ["bash", str(script_path), python_module], - capture_output=True, - text=True, - input=json.dumps(hook_input), - env=env, - ) - - return result.stdout, result.stderr, result.returncode - - -class TestClaudeHookWrapper: - """Tests for claude_hook.sh wrapper script.""" - - def test_script_exists_and_is_executable(self, hooks_dir: Path) -> None: - """Test that the Claude hook script exists and is executable.""" - script_path = hooks_dir / "claude_hook.sh" - assert script_path.exists(), "claude_hook.sh should exist" - assert os.access(script_path, os.X_OK), "claude_hook.sh should be executable" - - def test_usage_error_without_module(self, hooks_dir: Path, src_dir: Path) -> None: - """Test that script shows usage error when no module provided.""" - script_path = hooks_dir / "claude_hook.sh" - env = os.environ.copy() - env["PYTHONPATH"] = str(src_dir) - - result = subprocess.run( - ["bash", 
str(script_path)], - capture_output=True, - text=True, - env=env, - ) - - assert result.returncode == 1 - assert "Usage:" in result.stderr - - def test_sets_platform_environment_variable(self, hooks_dir: Path, src_dir: Path) -> None: - """Test that the script sets DEEPWORK_HOOK_PLATFORM correctly.""" - # Create a simple test module that outputs the platform env var - # We'll use a Python one-liner via -c - script_path = hooks_dir / "claude_hook.sh" - env = os.environ.copy() - env["PYTHONPATH"] = str(src_dir) - - # We can't easily test this without a real module, so we'll verify - # the script exists and has the right content - content = script_path.read_text() - assert 'DEEPWORK_HOOK_PLATFORM="claude"' in content - - -class TestGeminiHookWrapper: - """Tests for gemini_hook.sh wrapper script.""" - - def test_script_exists_and_is_executable(self, hooks_dir: Path) -> None: - """Test that the Gemini hook script exists and is executable.""" - script_path = hooks_dir / "gemini_hook.sh" - assert script_path.exists(), "gemini_hook.sh should exist" - assert os.access(script_path, os.X_OK), "gemini_hook.sh should be executable" - - def test_usage_error_without_module(self, hooks_dir: Path, src_dir: Path) -> None: - """Test that script shows usage error when no module provided.""" - script_path = hooks_dir / "gemini_hook.sh" - env = os.environ.copy() - env["PYTHONPATH"] = str(src_dir) - - result = subprocess.run( - ["bash", str(script_path)], - capture_output=True, - text=True, - env=env, - ) - - assert result.returncode == 1 - assert "Usage:" in result.stderr - - def test_sets_platform_environment_variable(self, hooks_dir: Path, src_dir: Path) -> None: - """Test that the script sets DEEPWORK_HOOK_PLATFORM correctly.""" - script_path = hooks_dir / "gemini_hook.sh" - content = script_path.read_text() - assert 'DEEPWORK_HOOK_PLATFORM="gemini"' in content - - -class TestHookWrapperIntegration: - """Integration tests for hook wrappers with actual Python hooks.""" - - 
@pytest.fixture - def test_hook_module(self, tmp_path: Path) -> tuple[Path, str]: - """Create a temporary test hook module.""" - module_dir = tmp_path / "test_hooks" - module_dir.mkdir(parents=True) - - # Create __init__.py - (module_dir / "__init__.py").write_text("") - - # Create the hook module - hook_code = ''' -"""Test hook module.""" -import os -import sys - -from deepwork.hooks.wrapper import ( - HookInput, - HookOutput, - NormalizedEvent, - Platform, - run_hook, -) - - -def test_hook(hook_input: HookInput) -> HookOutput: - """Test hook that blocks for after_agent events.""" - if hook_input.event == NormalizedEvent.AFTER_AGENT: - return HookOutput(decision="block", reason="Test block reason") - return HookOutput() - - -def main() -> None: - platform_str = os.environ.get("DEEPWORK_HOOK_PLATFORM", "claude") - try: - platform = Platform(platform_str) - except ValueError: - platform = Platform.CLAUDE - - exit_code = run_hook(test_hook, platform) - sys.exit(exit_code) - - -if __name__ == "__main__": - main() -''' - (module_dir / "test_hook.py").write_text(hook_code) - - return tmp_path, "test_hooks.test_hook" - - def test_claude_wrapper_with_stop_event( - self, - hooks_dir: Path, - src_dir: Path, - test_hook_module: tuple[Path, str], - ) -> None: - """Test Claude wrapper processes Stop event correctly.""" - tmp_path, module_name = test_hook_module - script_path = hooks_dir / "claude_hook.sh" - - hook_input = { - "session_id": "test123", - "hook_event_name": "Stop", - "cwd": "/project", - } - - env = os.environ.copy() - env["PYTHONPATH"] = f"{src_dir}:{tmp_path}" - - result = subprocess.run( - ["bash", str(script_path), module_name], - capture_output=True, - text=True, - input=json.dumps(hook_input), - env=env, - ) - - assert result.returncode == 2, f"Expected exit code 2 for blocking. 
stderr: {result.stderr}" - - output = json.loads(result.stdout.strip()) - assert output["decision"] == "block" - assert "Test block reason" in output["reason"] - - def test_gemini_wrapper_with_afteragent_event( - self, - hooks_dir: Path, - src_dir: Path, - test_hook_module: tuple[Path, str], - ) -> None: - """Test Gemini wrapper processes AfterAgent event correctly.""" - tmp_path, module_name = test_hook_module - script_path = hooks_dir / "gemini_hook.sh" - - hook_input = { - "session_id": "test456", - "hook_event_name": "AfterAgent", - "cwd": "/project", - } - - env = os.environ.copy() - env["PYTHONPATH"] = f"{src_dir}:{tmp_path}" - - result = subprocess.run( - ["bash", str(script_path), module_name], - capture_output=True, - text=True, - input=json.dumps(hook_input), - env=env, - ) - - assert result.returncode == 2, f"Expected exit code 2 for blocking. stderr: {result.stderr}" - - output = json.loads(result.stdout.strip()) - # Gemini should get "deny" instead of "block" - assert output["decision"] == "deny" - assert "Test block reason" in output["reason"] - - def test_non_blocking_event( - self, - hooks_dir: Path, - src_dir: Path, - test_hook_module: tuple[Path, str], - ) -> None: - """Test that non-blocking events return exit code 0.""" - tmp_path, module_name = test_hook_module - script_path = hooks_dir / "claude_hook.sh" - - # SessionStart is not blocked by the test hook - hook_input = { - "session_id": "test789", - "hook_event_name": "SessionStart", - "cwd": "/project", - } - - env = os.environ.copy() - env["PYTHONPATH"] = f"{src_dir}:{tmp_path}" - - result = subprocess.run( - ["bash", str(script_path), module_name], - capture_output=True, - text=True, - input=json.dumps(hook_input), - env=env, - ) - - assert result.returncode == 0, f"Expected exit code 0. 
stderr: {result.stderr}" - output = json.loads(result.stdout.strip()) - assert output == {} or output.get("decision", "") not in ("block", "deny") - - -class TestPolicyCheckHook: - """Tests for the policy_check hook module.""" - - def test_module_imports(self) -> None: - """Test that the policy_check module can be imported.""" - from deepwork.hooks import policy_check - - assert hasattr(policy_check, "main") - assert hasattr(policy_check, "policy_check_hook") - - def test_hook_function_returns_output(self) -> None: - """Test that policy_check_hook returns a HookOutput.""" - from deepwork.hooks.policy_check import policy_check_hook - from deepwork.hooks.wrapper import HookInput, HookOutput, NormalizedEvent, Platform - - # Create a minimal hook input - hook_input = HookInput( - platform=Platform.CLAUDE, - event=NormalizedEvent.BEFORE_PROMPT, # Not after_agent, so no blocking - session_id="test", - ) - - output = policy_check_hook(hook_input) - - assert isinstance(output, HookOutput) - # Should not block for before_prompt event - assert output.decision != "block" diff --git a/tests/shell_script_tests/test_hooks.py b/tests/shell_script_tests/test_hooks.py new file mode 100644 index 00000000..4f6f8e32 --- /dev/null +++ b/tests/shell_script_tests/test_hooks.py @@ -0,0 +1,746 @@ +"""Tests for hook shell scripts and JSON format compliance. 
+ +# ****************************************************************************** +# *** CRITICAL CONTRACT TESTS *** +# ****************************************************************************** +# +# These tests verify the EXACT format required by Claude Code hooks as +# documented in: doc/platforms/claude/hooks_system.md +# +# DO NOT MODIFY these tests without first consulting the official Claude Code +# documentation at: https://docs.anthropic.com/en/docs/claude-code/hooks +# +# Hook Contract Summary: +# - Exit code 0: Success, stdout parsed as JSON +# - Exit code 2: Blocking error, stderr shown (NOT used for JSON format) +# - Allow response: {} (empty JSON object) +# - Block response: {"decision": "block", "reason": "..."} +# +# CRITICAL: Hooks using JSON output format MUST return exit code 0. +# The "decision" field in the JSON controls blocking behavior, NOT the exit code. +# +# ****************************************************************************** + +Claude Code hooks have specific JSON response formats that must be followed: + +Stop hooks (hooks.after_agent): + - {} - Allow stop (empty object) + - {"decision": "block", "reason": "..."} - Block stop with reason + +UserPromptSubmit hooks (hooks.before_prompt): + - {} - No response needed (empty object) + - No output - Also acceptable + +BeforeTool hooks (hooks.before_tool): + - {} - Allow tool execution + - {"decision": "block", "reason": "..."} - Block tool execution + +All hooks: + - Must return valid JSON if producing output + - Must not contain non-JSON output on stdout (stderr is ok) + - Exit code 0 indicates success +""" + +import json +import os +import subprocess +import tempfile +from pathlib import Path + +import pytest +from git import Repo + +from .conftest import run_shell_script + +# ============================================================================= +# Helper Functions +# ============================================================================= + + +def 
run_rules_hook_script( + script_path: Path, + cwd: Path, + hook_input: dict | None = None, +) -> tuple[str, str, int]: + """Run a rules hook script and return its output.""" + return run_shell_script(script_path, cwd, hook_input=hook_input) + + +def run_rules_check_module( + cwd: Path, + hook_input: dict | None = None, + src_dir: Path | None = None, +) -> tuple[str, str, int]: + """Run the rules_check Python module directly and return its output.""" + env = os.environ.copy() + env["DEEPWORK_HOOK_PLATFORM"] = "claude" + if src_dir: + env["PYTHONPATH"] = str(src_dir) + + stdin_data = json.dumps(hook_input) if hook_input else "" + + result = subprocess.run( + ["python", "-m", "deepwork.hooks.rules_check"], + cwd=cwd, + capture_output=True, + text=True, + input=stdin_data, + env=env, + ) + + return result.stdout, result.stderr, result.returncode + + +def run_platform_wrapper_script( + script_path: Path, + python_module: str, + hook_input: dict, + src_dir: Path, +) -> tuple[str, str, int]: + """ + Run a platform hook wrapper script with the given input. + + Args: + script_path: Path to the wrapper script (claude_hook.sh or gemini_hook.sh) + python_module: Python module to invoke + hook_input: JSON input to pass via stdin + src_dir: Path to src directory for PYTHONPATH + + Returns: + Tuple of (stdout, stderr, return_code) + """ + env = os.environ.copy() + env["PYTHONPATH"] = str(src_dir) + + result = subprocess.run( + ["bash", str(script_path), python_module], + capture_output=True, + text=True, + input=json.dumps(hook_input), + env=env, + ) + + return result.stdout, result.stderr, result.returncode + + +def validate_json_output(output: str) -> dict | None: + """ + Validate that output is valid JSON or empty. 
+ + Args: + output: The stdout from a hook script + + Returns: + Parsed JSON dict, or None if empty/no output + + Raises: + AssertionError: If output is invalid JSON + """ + stripped = output.strip() + + if not stripped: + return None + + try: + result = json.loads(stripped) + assert isinstance(result, dict), "Hook output must be a JSON object" + return result + except json.JSONDecodeError as e: + pytest.fail(f"Invalid JSON output: {stripped!r}. Error: {e}") + + +# ****************************************************************************** +# *** DO NOT EDIT THIS FUNCTION! *** +# As documented in doc/platforms/claude/hooks_system.md, Stop hooks must return: +# - {} (empty object) to allow +# - {"decision": "block", "reason": "..."} to block +# Any other format will cause undefined behavior in Claude Code. +# ****************************************************************************** +def validate_stop_hook_response(response: dict | None) -> None: + """ + Validate a Stop hook response follows Claude Code format. 
+ + Args: + response: Parsed JSON response or None + + Raises: + AssertionError: If response format is invalid + """ + if response is None: + # No output is acceptable for stop hooks + return + + if response == {}: + # Empty object means allow stop + return + + # Must have decision and reason for blocking + assert "decision" in response, ( + f"Stop hook blocking response must have 'decision' key: {response}" + ) + assert response["decision"] == "block", ( + f"Stop hook decision must be 'block', got: {response['decision']}" + ) + assert "reason" in response, f"Stop hook blocking response must have 'reason' key: {response}" + assert isinstance(response["reason"], str), f"Stop hook reason must be a string: {response}" + + # Reason should not be empty when blocking + assert response["reason"].strip(), "Stop hook blocking reason should not be empty" + + +def validate_prompt_hook_response(response: dict | None) -> None: + """ + Validate a UserPromptSubmit hook response. + + Args: + response: Parsed JSON response or None + + Raises: + AssertionError: If response format is invalid + """ + if response is None: + # No output is acceptable + return + + # Empty object or valid JSON object is fine + assert isinstance(response, dict), f"Prompt hook output must be a JSON object: {response}" + + +# ============================================================================= +# Platform Wrapper Script Tests +# ============================================================================= + + +class TestClaudeHookWrapper: + """Tests for claude_hook.sh wrapper script.""" + + def test_script_exists_and_is_executable(self, hooks_dir: Path) -> None: + """Test that the Claude hook script exists and is executable.""" + script_path = hooks_dir / "claude_hook.sh" + assert script_path.exists(), "claude_hook.sh should exist" + assert os.access(script_path, os.X_OK), "claude_hook.sh should be executable" + + def test_usage_error_without_module(self, hooks_dir: Path, src_dir: Path) -> None: + 
"""Test that script shows usage error when no module provided.""" + script_path = hooks_dir / "claude_hook.sh" + env = os.environ.copy() + env["PYTHONPATH"] = str(src_dir) + + result = subprocess.run( + ["bash", str(script_path)], + capture_output=True, + text=True, + env=env, + ) + + assert result.returncode == 1 + assert "Usage:" in result.stderr + + def test_sets_platform_environment_variable(self, hooks_dir: Path, src_dir: Path) -> None: + """Test that the script sets DEEPWORK_HOOK_PLATFORM correctly.""" + script_path = hooks_dir / "claude_hook.sh" + content = script_path.read_text() + assert 'DEEPWORK_HOOK_PLATFORM="claude"' in content + + +class TestGeminiHookWrapper: + """Tests for gemini_hook.sh wrapper script.""" + + def test_script_exists_and_is_executable(self, hooks_dir: Path) -> None: + """Test that the Gemini hook script exists and is executable.""" + script_path = hooks_dir / "gemini_hook.sh" + assert script_path.exists(), "gemini_hook.sh should exist" + assert os.access(script_path, os.X_OK), "gemini_hook.sh should be executable" + + def test_usage_error_without_module(self, hooks_dir: Path, src_dir: Path) -> None: + """Test that script shows usage error when no module provided.""" + script_path = hooks_dir / "gemini_hook.sh" + env = os.environ.copy() + env["PYTHONPATH"] = str(src_dir) + + result = subprocess.run( + ["bash", str(script_path)], + capture_output=True, + text=True, + env=env, + ) + + assert result.returncode == 1 + assert "Usage:" in result.stderr + + def test_sets_platform_environment_variable(self, hooks_dir: Path, src_dir: Path) -> None: + """Test that the script sets DEEPWORK_HOOK_PLATFORM correctly.""" + script_path = hooks_dir / "gemini_hook.sh" + content = script_path.read_text() + assert 'DEEPWORK_HOOK_PLATFORM="gemini"' in content + + +# ============================================================================= +# Rules Hook Script Tests +# ============================================================================= + + 
+class TestRulesStopHook: + """Tests for rules stop hook (deepwork.hooks.rules_check) JSON format compliance.""" + + def test_allow_response_is_empty_json(self, src_dir: Path, git_repo: Path) -> None: + """Test that allow response is empty JSON object.""" + stdout, stderr, code = run_rules_check_module(git_repo, src_dir=src_dir) + + response = validate_json_output(stdout) + validate_stop_hook_response(response) + + if response is not None: + assert response == {}, f"Allow response should be empty: {response}" + + def test_block_response_has_required_fields( + self, src_dir: Path, git_repo_with_rule: Path + ) -> None: + """Test that block response has decision and reason.""" + # Create a file that triggers the rule + py_file = git_repo_with_rule / "test.py" + py_file.write_text("# Python file\n") + repo = Repo(git_repo_with_rule) + repo.index.add(["test.py"]) + + stdout, stderr, code = run_rules_check_module(git_repo_with_rule, src_dir=src_dir) + + response = validate_json_output(stdout) + validate_stop_hook_response(response) + + # Should be blocking + assert response is not None, "Expected blocking response" + assert response.get("decision") == "block", "Expected block decision" + assert "reason" in response, "Expected reason field" + + def test_block_reason_contains_rule_info(self, src_dir: Path, git_repo_with_rule: Path) -> None: + """Test that block reason contains rule information.""" + py_file = git_repo_with_rule / "test.py" + py_file.write_text("# Python file\n") + repo = Repo(git_repo_with_rule) + repo.index.add(["test.py"]) + + stdout, stderr, code = run_rules_check_module(git_repo_with_rule, src_dir=src_dir) + + response = validate_json_output(stdout) + + assert response is not None, "Expected blocking response" + reason = response.get("reason", "") + + # Should contain useful rule information + assert "Rule" in reason or "rule" in reason, f"Reason should mention rule: {reason}" + + def test_no_extraneous_keys_in_response(self, src_dir: Path, 
git_repo_with_rule: Path) -> None: + """Test that response only contains expected keys.""" + py_file = git_repo_with_rule / "test.py" + py_file.write_text("# Python file\n") + repo = Repo(git_repo_with_rule) + repo.index.add(["test.py"]) + + stdout, stderr, code = run_rules_check_module(git_repo_with_rule, src_dir=src_dir) + + response = validate_json_output(stdout) + + if response and response != {}: + # Only decision and reason are valid keys for stop hooks + valid_keys = {"decision", "reason"} + actual_keys = set(response.keys()) + assert actual_keys <= valid_keys, ( + f"Unexpected keys in response: {actual_keys - valid_keys}" + ) + + def test_output_is_single_line_json(self, src_dir: Path, git_repo_with_rule: Path) -> None: + """Test that JSON output is single-line (no pretty printing).""" + py_file = git_repo_with_rule / "test.py" + py_file.write_text("# Python file\n") + repo = Repo(git_repo_with_rule) + repo.index.add(["test.py"]) + + stdout, stderr, code = run_rules_check_module(git_repo_with_rule, src_dir=src_dir) + + # Remove trailing newline and check for internal newlines + output = stdout.strip() + if output: + # JSON output should ideally be single line + # Multiple lines could indicate print statements or logging + lines = output.split("\n") + # Only the last line should be JSON + json_line = lines[-1] + # Verify the JSON is parseable + json.loads(json_line) + + +class TestUserPromptSubmitHook: + """Tests for user_prompt_submit.sh JSON format compliance.""" + + def test_output_is_valid_json_or_empty(self, rules_hooks_dir: Path, git_repo: Path) -> None: + """Test that output is valid JSON or empty.""" + script_path = rules_hooks_dir / "user_prompt_submit.sh" + stdout, stderr, code = run_rules_hook_script(script_path, git_repo) + + response = validate_json_output(stdout) + validate_prompt_hook_response(response) + + def test_does_not_block_prompt_submission(self, rules_hooks_dir: Path, git_repo: Path) -> None: + """Test that hook does not block prompt 
submission.""" + script_path = rules_hooks_dir / "user_prompt_submit.sh" + stdout, stderr, code = run_rules_hook_script(script_path, git_repo) + + response = validate_json_output(stdout) + + # UserPromptSubmit hooks should not block + if response: + assert response.get("decision") != "block", ( + "UserPromptSubmit hook should not return block decision" + ) + + +class TestHooksWithTranscript: + """Tests for hook JSON format when using transcript input.""" + + def test_stop_hook_with_transcript_input(self, src_dir: Path, git_repo_with_rule: Path) -> None: + """Test stop hook JSON format when transcript is provided.""" + py_file = git_repo_with_rule / "test.py" + py_file.write_text("# Python file\n") + repo = Repo(git_repo_with_rule) + repo.index.add(["test.py"]) + + # Create mock transcript + with tempfile.NamedTemporaryFile(mode="w", suffix=".jsonl", delete=False) as f: + transcript_path = f.name + f.write( + json.dumps( + { + "role": "assistant", + "message": {"content": [{"type": "text", "text": "Hello"}]}, + } + ) + ) + f.write("\n") + + try: + hook_input = {"transcript_path": transcript_path} + stdout, stderr, code = run_rules_check_module( + git_repo_with_rule, hook_input, src_dir=src_dir + ) + + response = validate_json_output(stdout) + validate_stop_hook_response(response) + + finally: + os.unlink(transcript_path) + + def test_stop_hook_with_promise_returns_empty( + self, src_dir: Path, git_repo_with_rule: Path + ) -> None: + """Test that promised rules return empty JSON.""" + py_file = git_repo_with_rule / "test.py" + py_file.write_text("# Python file\n") + repo = Repo(git_repo_with_rule) + repo.index.add(["test.py"]) + + # Create transcript with promise tag + with tempfile.NamedTemporaryFile(mode="w", suffix=".jsonl", delete=False) as f: + transcript_path = f.name + f.write( + json.dumps( + { + "role": "assistant", + "message": { + "content": [ + { + "type": "text", + "text": "<promise>Python File Rule</promise>", + } + ] + }, + } + ) + ) + f.write("\n") + + try: + hook_input 
= {"transcript_path": transcript_path} + stdout, stderr, code = run_rules_check_module( + git_repo_with_rule, hook_input, src_dir=src_dir + ) + + response = validate_json_output(stdout) + validate_stop_hook_response(response) + + # Should be empty (allow) because rule was promised + if response is not None: + assert response == {}, f"Expected empty response: {response}" + + finally: + os.unlink(transcript_path) + + +# ****************************************************************************** +# *** DO NOT EDIT THESE EXIT CODE TESTS! *** +# ****************************************************************************** +# +# As documented in doc/platforms/claude/hooks_system.md: +# +# | Exit Code | Meaning | Behavior | +# |-----------|-----------------|-----------------------------------| +# | 0 | Success | stdout parsed as JSON | +# | 2 | Blocking error | stderr shown, operation blocked | +# | Other | Warning | stderr logged, continues | +# +# CRITICAL: Hooks using JSON output format MUST return exit code 0. +# The "decision" field in the JSON controls blocking behavior, NOT the exit code. +# +# Example valid outputs: +# Exit 0 + stdout: {} -> Allow +# Exit 0 + stdout: {"decision": "block", "reason": "..."} -> Block +# Exit 0 + stdout: {"decision": "deny", "reason": "..."} -> Block (Gemini) +# +# See: https://docs.anthropic.com/en/docs/claude-code/hooks +# ****************************************************************************** + + +class TestHookExitCodes: + """Tests for hook exit codes. + + CRITICAL: These tests verify the documented Claude Code hook contract. + All hooks MUST exit 0 when using JSON output format. + """ + + def test_stop_hook_exits_zero_on_allow(self, src_dir: Path, git_repo: Path) -> None: + """Test that stop hook exits 0 when allowing. + + DO NOT CHANGE THIS TEST - it verifies the documented hook contract. + """ + stdout, stderr, code = run_rules_check_module(git_repo, src_dir=src_dir) + + assert code == 0, f"Allow should exit 0. 
stderr: {stderr}" + + def test_stop_hook_exits_zero_on_block(self, src_dir: Path, git_repo_with_rule: Path) -> None: + """Test that stop hook exits 0 even when blocking. + + DO NOT CHANGE THIS TEST - it verifies the documented hook contract. + Blocking is communicated via JSON {"decision": "block"}, NOT via exit code. + """ + py_file = git_repo_with_rule / "test.py" + py_file.write_text("# Python file\n") + repo = Repo(git_repo_with_rule) + repo.index.add(["test.py"]) + + stdout, stderr, code = run_rules_check_module(git_repo_with_rule, src_dir=src_dir) + + # Hooks should exit 0 and communicate via JSON + assert code == 0, f"Block should still exit 0. stderr: {stderr}" + + def test_user_prompt_hook_exits_zero(self, rules_hooks_dir: Path, git_repo: Path) -> None: + """Test that user prompt hook always exits 0. + + DO NOT CHANGE THIS TEST - it verifies the documented hook contract. + """ + script_path = rules_hooks_dir / "user_prompt_submit.sh" + stdout, stderr, code = run_rules_hook_script(script_path, git_repo) + + assert code == 0, f"User prompt hook should exit 0. stderr: {stderr}" + + def test_capture_script_exits_zero(self, rules_hooks_dir: Path, git_repo: Path) -> None: + """Test that capture script exits 0. + + DO NOT CHANGE THIS TEST - it verifies the documented hook contract. + """ + script_path = rules_hooks_dir / "capture_prompt_work_tree.sh" + stdout, stderr, code = run_rules_hook_script(script_path, git_repo) + + assert code == 0, f"Capture script should exit 0. 
stderr: {stderr}" + + +# ============================================================================= +# Integration Tests +# ============================================================================= + + +class TestHookWrapperIntegration: + """Integration tests for hook wrappers with actual Python hooks.""" + + @pytest.fixture + def test_hook_module(self, tmp_path: Path) -> tuple[Path, str]: + """Create a temporary test hook module.""" + module_dir = tmp_path / "test_hooks" + module_dir.mkdir(parents=True) + + # Create __init__.py + (module_dir / "__init__.py").write_text("") + + # Create the hook module + hook_code = ''' +"""Test hook module.""" +import os +import sys + +from deepwork.hooks.wrapper import ( + HookInput, + HookOutput, + NormalizedEvent, + Platform, + run_hook, +) + + +def test_hook(hook_input: HookInput) -> HookOutput: + """Test hook that blocks for after_agent events.""" + if hook_input.event == NormalizedEvent.AFTER_AGENT: + return HookOutput(decision="block", reason="Test block reason") + return HookOutput() + + +def main() -> None: + platform_str = os.environ.get("DEEPWORK_HOOK_PLATFORM", "claude") + try: + platform = Platform(platform_str) + except ValueError: + platform = Platform.CLAUDE + + exit_code = run_hook(test_hook, platform) + sys.exit(exit_code) + + +if __name__ == "__main__": + main() +''' + (module_dir / "test_hook.py").write_text(hook_code) + + return tmp_path, "test_hooks.test_hook" + + def test_claude_wrapper_with_stop_event( + self, + hooks_dir: Path, + src_dir: Path, + test_hook_module: tuple[Path, str], + ) -> None: + """Test Claude wrapper processes Stop event correctly.""" + tmp_path, module_name = test_hook_module + script_path = hooks_dir / "claude_hook.sh" + + hook_input = { + "session_id": "test123", + "hook_event_name": "Stop", + "cwd": "/project", + } + + env = os.environ.copy() + env["PYTHONPATH"] = f"{src_dir}:{tmp_path}" + + result = subprocess.run( + ["bash", str(script_path), module_name], + 
capture_output=True, + text=True, + input=json.dumps(hook_input), + env=env, + ) + + # Exit code 0 even when blocking - the JSON decision field controls behavior + assert result.returncode == 0, f"Expected exit code 0. stderr: {result.stderr}" + + output = json.loads(result.stdout.strip()) + assert output["decision"] == "block" + assert "Test block reason" in output["reason"] + + def test_gemini_wrapper_with_afteragent_event( + self, + hooks_dir: Path, + src_dir: Path, + test_hook_module: tuple[Path, str], + ) -> None: + """Test Gemini wrapper processes AfterAgent event correctly.""" + tmp_path, module_name = test_hook_module + script_path = hooks_dir / "gemini_hook.sh" + + hook_input = { + "session_id": "test456", + "hook_event_name": "AfterAgent", + "cwd": "/project", + } + + env = os.environ.copy() + env["PYTHONPATH"] = f"{src_dir}:{tmp_path}" + + result = subprocess.run( + ["bash", str(script_path), module_name], + capture_output=True, + text=True, + input=json.dumps(hook_input), + env=env, + ) + + # Exit code 0 even when blocking - the JSON decision field controls behavior + assert result.returncode == 0, f"Expected exit code 0. 
stderr: {result.stderr}" + + output = json.loads(result.stdout.strip()) + # Gemini should get "deny" instead of "block" + assert output["decision"] == "deny" + assert "Test block reason" in output["reason"] + + def test_non_blocking_event( + self, + hooks_dir: Path, + src_dir: Path, + test_hook_module: tuple[Path, str], + ) -> None: + """Test that non-blocking events return exit code 0.""" + tmp_path, module_name = test_hook_module + script_path = hooks_dir / "claude_hook.sh" + + # SessionStart is not blocked by the test hook + hook_input = { + "session_id": "test789", + "hook_event_name": "SessionStart", + "cwd": "/project", + } + + env = os.environ.copy() + env["PYTHONPATH"] = f"{src_dir}:{tmp_path}" + + result = subprocess.run( + ["bash", str(script_path), module_name], + capture_output=True, + text=True, + input=json.dumps(hook_input), + env=env, + ) + + assert result.returncode == 0, f"Expected exit code 0. stderr: {result.stderr}" + output = json.loads(result.stdout.strip()) + assert output == {} or output.get("decision", "") not in ("block", "deny") + + +# ============================================================================= +# Python Module Tests +# ============================================================================= + + +class TestRulesCheckModule: + """Tests for the rules_check hook module.""" + + def test_module_imports(self) -> None: + """Test that the rules_check module can be imported.""" + from deepwork.hooks import rules_check + + assert hasattr(rules_check, "main") + assert hasattr(rules_check, "rules_check_hook") + + def test_hook_function_returns_output(self) -> None: + """Test that rules_check_hook returns a HookOutput.""" + from deepwork.hooks.rules_check import rules_check_hook + from deepwork.hooks.wrapper import HookInput, HookOutput, NormalizedEvent, Platform + + # Create a minimal hook input + hook_input = HookInput( + platform=Platform.CLAUDE, + event=NormalizedEvent.BEFORE_PROMPT, # Not after_agent, so no blocking + 
session_id="test", + ) + + output = rules_check_hook(hook_input) + + assert isinstance(output, HookOutput) + # Should not block for before_prompt event + assert output.decision != "block" diff --git a/tests/shell_script_tests/test_hooks_json_format.py b/tests/shell_script_tests/test_hooks_json_format.py deleted file mode 100644 index 14de1b21..00000000 --- a/tests/shell_script_tests/test_hooks_json_format.py +++ /dev/null @@ -1,363 +0,0 @@ -"""Tests for Claude Code hooks JSON format validation. - -Claude Code hooks have specific JSON response formats that must be followed: - -Stop hooks (hooks.after_agent): - - {} - Allow stop (empty object) - - {"decision": "block", "reason": "..."} - Block stop with reason - -UserPromptSubmit hooks (hooks.before_prompt): - - {} - No response needed (empty object) - - No output - Also acceptable - -BeforeTool hooks (hooks.before_tool): - - {} - Allow tool execution - - {"decision": "block", "reason": "..."} - Block tool execution - -All hooks: - - Must return valid JSON if producing output - - Must not contain non-JSON output on stdout (stderr is ok) - - Exit code 0 indicates success -""" - -import json -import os -import tempfile -from pathlib import Path - -import pytest -from git import Repo - -from .conftest import run_shell_script - - -def run_hook_script( - script_path: Path, - cwd: Path, - hook_input: dict | None = None, -) -> tuple[str, str, int]: - """Run a hook script and return its output.""" - return run_shell_script(script_path, cwd, hook_input=hook_input) - - -def validate_json_output(output: str) -> dict | None: - """ - Validate that output is valid JSON or empty. 
- - Args: - output: The stdout from a hook script - - Returns: - Parsed JSON dict, or None if empty/no output - - Raises: - AssertionError: If output is invalid JSON - """ - stripped = output.strip() - - if not stripped: - return None - - try: - result = json.loads(stripped) - assert isinstance(result, dict), "Hook output must be a JSON object" - return result - except json.JSONDecodeError as e: - pytest.fail(f"Invalid JSON output: {stripped!r}. Error: {e}") - - -def validate_stop_hook_response(response: dict | None) -> None: - """ - Validate a Stop hook response follows Claude Code format. - - Args: - response: Parsed JSON response or None - - Raises: - AssertionError: If response format is invalid - """ - if response is None: - # No output is acceptable for stop hooks - return - - if response == {}: - # Empty object means allow stop - return - - # Must have decision and reason for blocking - assert "decision" in response, ( - f"Stop hook blocking response must have 'decision' key: {response}" - ) - assert response["decision"] == "block", ( - f"Stop hook decision must be 'block', got: {response['decision']}" - ) - assert "reason" in response, f"Stop hook blocking response must have 'reason' key: {response}" - assert isinstance(response["reason"], str), f"Stop hook reason must be a string: {response}" - - # Reason should not be empty when blocking - assert response["reason"].strip(), "Stop hook blocking reason should not be empty" - - -def validate_prompt_hook_response(response: dict | None) -> None: - """ - Validate a UserPromptSubmit hook response. 
- - Args: - response: Parsed JSON response or None - - Raises: - AssertionError: If response format is invalid - """ - if response is None: - # No output is acceptable - return - - # Empty object or valid JSON object is fine - assert isinstance(response, dict), f"Prompt hook output must be a JSON object: {response}" - - -class TestPolicyStopHookJsonFormat: - """Tests specifically for policy_stop_hook.sh JSON format compliance.""" - - def test_allow_response_is_empty_json(self, policy_hooks_dir: Path, git_repo: Path) -> None: - """Test that allow response is empty JSON object.""" - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo) - - response = validate_json_output(stdout) - validate_stop_hook_response(response) - - if response is not None: - assert response == {}, f"Allow response should be empty: {response}" - - def test_block_response_has_required_fields( - self, policy_hooks_dir: Path, git_repo_with_policy: Path - ) -> None: - """Test that block response has decision and reason.""" - # Create a file that triggers the policy - py_file = git_repo_with_policy / "test.py" - py_file.write_text("# Python file\n") - repo = Repo(git_repo_with_policy) - repo.index.add(["test.py"]) - - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo_with_policy) - - response = validate_json_output(stdout) - validate_stop_hook_response(response) - - # Should be blocking - assert response is not None, "Expected blocking response" - assert response.get("decision") == "block", "Expected block decision" - assert "reason" in response, "Expected reason field" - - def test_block_reason_contains_policy_info( - self, policy_hooks_dir: Path, git_repo_with_policy: Path - ) -> None: - """Test that block reason contains policy information.""" - py_file = git_repo_with_policy / "test.py" - py_file.write_text("# Python file\n") - repo = Repo(git_repo_with_policy) - 
repo.index.add(["test.py"]) - - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo_with_policy) - - response = validate_json_output(stdout) - - assert response is not None, "Expected blocking response" - reason = response.get("reason", "") - - # Should contain useful policy information - assert "Policy" in reason or "policy" in reason, f"Reason should mention policy: {reason}" - - def test_no_extraneous_keys_in_response( - self, policy_hooks_dir: Path, git_repo_with_policy: Path - ) -> None: - """Test that response only contains expected keys.""" - py_file = git_repo_with_policy / "test.py" - py_file.write_text("# Python file\n") - repo = Repo(git_repo_with_policy) - repo.index.add(["test.py"]) - - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo_with_policy) - - response = validate_json_output(stdout) - - if response and response != {}: - # Only decision and reason are valid keys for stop hooks - valid_keys = {"decision", "reason"} - actual_keys = set(response.keys()) - assert actual_keys <= valid_keys, ( - f"Unexpected keys in response: {actual_keys - valid_keys}" - ) - - def test_output_is_single_line_json( - self, policy_hooks_dir: Path, git_repo_with_policy: Path - ) -> None: - """Test that JSON output is single-line (no pretty printing).""" - py_file = git_repo_with_policy / "test.py" - py_file.write_text("# Python file\n") - repo = Repo(git_repo_with_policy) - repo.index.add(["test.py"]) - - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo_with_policy) - - # Remove trailing newline and check for internal newlines - output = stdout.strip() - if output: - # JSON output should ideally be single line - # Multiple lines could indicate print statements or logging - lines = output.split("\n") - # Only the last line should be JSON - json_line = lines[-1] - # Verify 
the JSON is parseable - json.loads(json_line) - - -class TestUserPromptSubmitHookJsonFormat: - """Tests for user_prompt_submit.sh JSON format compliance.""" - - def test_output_is_valid_json_or_empty(self, policy_hooks_dir: Path, git_repo: Path) -> None: - """Test that output is valid JSON or empty.""" - script_path = policy_hooks_dir / "user_prompt_submit.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo) - - response = validate_json_output(stdout) - validate_prompt_hook_response(response) - - def test_does_not_block_prompt_submission(self, policy_hooks_dir: Path, git_repo: Path) -> None: - """Test that hook does not block prompt submission.""" - script_path = policy_hooks_dir / "user_prompt_submit.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo) - - response = validate_json_output(stdout) - - # UserPromptSubmit hooks should not block - if response: - assert response.get("decision") != "block", ( - "UserPromptSubmit hook should not return block decision" - ) - - -class TestHooksJsonFormatWithTranscript: - """Tests for hook JSON format when using transcript input.""" - - def test_stop_hook_with_transcript_input( - self, policy_hooks_dir: Path, git_repo_with_policy: Path - ) -> None: - """Test stop hook JSON format when transcript is provided.""" - py_file = git_repo_with_policy / "test.py" - py_file.write_text("# Python file\n") - repo = Repo(git_repo_with_policy) - repo.index.add(["test.py"]) - - # Create mock transcript - with tempfile.NamedTemporaryFile(mode="w", suffix=".jsonl", delete=False) as f: - transcript_path = f.name - f.write( - json.dumps( - { - "role": "assistant", - "message": {"content": [{"type": "text", "text": "Hello"}]}, - } - ) - ) - f.write("\n") - - try: - script_path = policy_hooks_dir / "policy_stop_hook.sh" - hook_input = {"transcript_path": transcript_path} - stdout, stderr, code = run_hook_script(script_path, git_repo_with_policy, hook_input) - - response = validate_json_output(stdout) - 
validate_stop_hook_response(response) - - finally: - os.unlink(transcript_path) - - def test_stop_hook_with_promise_returns_empty( - self, policy_hooks_dir: Path, git_repo_with_policy: Path - ) -> None: - """Test that promised policies return empty JSON.""" - py_file = git_repo_with_policy / "test.py" - py_file.write_text("# Python file\n") - repo = Repo(git_repo_with_policy) - repo.index.add(["test.py"]) - - # Create transcript with promise tag - with tempfile.NamedTemporaryFile(mode="w", suffix=".jsonl", delete=False) as f: - transcript_path = f.name - f.write( - json.dumps( - { - "role": "assistant", - "message": { - "content": [ - { - "type": "text", - "text": "✓ Python File Policy", - } - ] - }, - } - ) - ) - f.write("\n") - - try: - script_path = policy_hooks_dir / "policy_stop_hook.sh" - hook_input = {"transcript_path": transcript_path} - stdout, stderr, code = run_hook_script(script_path, git_repo_with_policy, hook_input) - - response = validate_json_output(stdout) - validate_stop_hook_response(response) - - # Should be empty (allow) because policy was promised - if response is not None: - assert response == {}, f"Expected empty response: {response}" - - finally: - os.unlink(transcript_path) - - -class TestHooksExitCodes: - """Tests for hook script exit codes.""" - - def test_stop_hook_exits_zero_on_allow(self, policy_hooks_dir: Path, git_repo: Path) -> None: - """Test that stop hook exits 0 when allowing.""" - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo) - - assert code == 0, f"Allow should exit 0. 
stderr: {stderr}" - - def test_stop_hook_exits_zero_on_block( - self, policy_hooks_dir: Path, git_repo_with_policy: Path - ) -> None: - """Test that stop hook exits 0 even when blocking.""" - py_file = git_repo_with_policy / "test.py" - py_file.write_text("# Python file\n") - repo = Repo(git_repo_with_policy) - repo.index.add(["test.py"]) - - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo_with_policy) - - # Hooks should exit 0 and communicate via JSON - assert code == 0, f"Block should still exit 0. stderr: {stderr}" - - def test_user_prompt_hook_exits_zero(self, policy_hooks_dir: Path, git_repo: Path) -> None: - """Test that user prompt hook always exits 0.""" - script_path = policy_hooks_dir / "user_prompt_submit.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo) - - assert code == 0, f"User prompt hook should exit 0. stderr: {stderr}" - - def test_capture_script_exits_zero(self, policy_hooks_dir: Path, git_repo: Path) -> None: - """Test that capture script exits 0.""" - script_path = policy_hooks_dir / "capture_prompt_work_tree.sh" - stdout, stderr, code = run_hook_script(script_path, git_repo) - - assert code == 0, f"Capture script should exit 0. stderr: {stderr}" diff --git a/tests/shell_script_tests/test_policy_stop_hook.py b/tests/shell_script_tests/test_policy_stop_hook.py deleted file mode 100644 index 07a2d221..00000000 --- a/tests/shell_script_tests/test_policy_stop_hook.py +++ /dev/null @@ -1,287 +0,0 @@ -"""Tests for policy_stop_hook.sh shell script. - -These tests verify that the policy stop hook correctly outputs JSON -to block or allow the stop event in Claude Code. 
-""" - -import json -import os -import tempfile -from pathlib import Path - -import pytest -from git import Repo - -from .conftest import run_shell_script - - -@pytest.fixture -def git_repo_with_src_policy(tmp_path: Path) -> Path: - """Create a git repo with a policy file that triggers on src/** changes.""" - repo = Repo.init(tmp_path) - - readme = tmp_path / "README.md" - readme.write_text("# Test Project\n") - repo.index.add(["README.md"]) - repo.index.commit("Initial commit") - - # Use compare_to: prompt since test repos don't have origin remote - policy_file = tmp_path / ".deepwork.policy.yml" - policy_file.write_text( - """- name: "Test Policy" - trigger: "src/**/*" - compare_to: prompt - instructions: | - This is a test policy that fires when src/ files change. - Please address this policy. -""" - ) - - # Empty baseline means all current files are "new" - deepwork_dir = tmp_path / ".deepwork" - deepwork_dir.mkdir(exist_ok=True) - (deepwork_dir / ".last_work_tree").write_text("") - - return tmp_path - - -def run_stop_hook( - script_path: Path, - cwd: Path, - hook_input: dict | None = None, -) -> tuple[str, str, int]: - """Run the policy_stop_hook.sh script and return its output.""" - return run_shell_script(script_path, cwd, hook_input=hook_input) - - -class TestPolicyStopHookBlocking: - """Tests for policy_stop_hook.sh blocking behavior.""" - - def test_outputs_block_json_when_policy_fires( - self, policy_hooks_dir: Path, git_repo_with_src_policy: Path - ) -> None: - """Test that the hook outputs blocking JSON when a policy fires.""" - # Create a file that triggers the policy - src_dir = git_repo_with_src_policy / "src" - src_dir.mkdir(exist_ok=True) - (src_dir / "main.py").write_text("# New file\n") - - # Stage the change - repo = Repo(git_repo_with_src_policy) - repo.index.add(["src/main.py"]) - - # Run the stop hook - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_stop_hook(script_path, git_repo_with_src_policy) - - # 
Parse the output as JSON - output = stdout.strip() - assert output, f"Expected JSON output but got empty string. stderr: {stderr}" - - try: - result = json.loads(output) - except json.JSONDecodeError as e: - pytest.fail(f"Output is not valid JSON: {output!r}. Error: {e}") - - # Verify the JSON has the blocking structure - assert "decision" in result, f"Expected 'decision' key in JSON: {result}" - assert result["decision"] == "block", f"Expected decision='block', got: {result}" - assert "reason" in result, f"Expected 'reason' key in JSON: {result}" - assert "Test Policy" in result["reason"], f"Policy name not in reason: {result}" - - def test_outputs_empty_json_when_no_policy_fires( - self, policy_hooks_dir: Path, git_repo_with_src_policy: Path - ) -> None: - """Test that the hook outputs empty JSON when no policy fires.""" - # Don't create any files that would trigger the policy - # (policy triggers on src/** but we haven't created anything in src/) - - # Run the stop hook - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_stop_hook(script_path, git_repo_with_src_policy) - - # Parse the output as JSON - output = stdout.strip() - assert output, f"Expected JSON output but got empty string. stderr: {stderr}" - - try: - result = json.loads(output) - except json.JSONDecodeError as e: - pytest.fail(f"Output is not valid JSON: {output!r}. Error: {e}") - - # Should be empty JSON (no blocking) - assert result == {}, f"Expected empty JSON when no policies fire, got: {result}" - - def test_exits_early_when_no_policy_file(self, policy_hooks_dir: Path, git_repo: Path) -> None: - """Test that the hook exits cleanly when no policy file exists.""" - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_stop_hook(script_path, git_repo) - - # Should exit with code 0 and produce no output (or empty) - assert code == 0, f"Expected exit code 0, got {code}. 
stderr: {stderr}" - # No output is fine when there's no policy file - output = stdout.strip() - if output: - # If there is output, it should be valid JSON - try: - result = json.loads(output) - assert result == {}, f"Expected empty JSON, got: {result}" - except json.JSONDecodeError: - # Empty or no output is acceptable - pass - - def test_respects_promise_tags( - self, policy_hooks_dir: Path, git_repo_with_src_policy: Path - ) -> None: - """Test that promised policies are not re-triggered.""" - # Create a file that triggers the policy - src_dir = git_repo_with_src_policy / "src" - src_dir.mkdir(exist_ok=True) - (src_dir / "main.py").write_text("# New file\n") - - # Stage the change - repo = Repo(git_repo_with_src_policy) - repo.index.add(["src/main.py"]) - - # Create a mock transcript with the promise tag - with tempfile.NamedTemporaryFile(mode="w", suffix=".jsonl", delete=False) as f: - transcript_path = f.name - # Write a mock assistant message with the promise tag - f.write( - json.dumps( - { - "role": "assistant", - "message": { - "content": [ - { - "type": "text", - "text": "I've addressed the policy. ✓ Test Policy", - } - ] - }, - } - ) - ) - f.write("\n") - - try: - # Run the stop hook with transcript path - script_path = policy_hooks_dir / "policy_stop_hook.sh" - hook_input = {"transcript_path": transcript_path} - stdout, stderr, code = run_stop_hook(script_path, git_repo_with_src_policy, hook_input) - - # Parse the output - output = stdout.strip() - assert output, f"Expected JSON output. 
stderr: {stderr}" - - result = json.loads(output) - - # Should be empty JSON because the policy was promised - assert result == {}, f"Expected empty JSON when policy is promised, got: {result}" - finally: - os.unlink(transcript_path) - - def test_safety_pattern_prevents_firing(self, policy_hooks_dir: Path, tmp_path: Path) -> None: - """Test that safety patterns prevent policies from firing.""" - # Initialize git repo - repo = Repo.init(tmp_path) - - readme = tmp_path / "README.md" - readme.write_text("# Test Project\n") - repo.index.add(["README.md"]) - repo.index.commit("Initial commit") - - # Create a policy with a safety pattern - # Use compare_to: prompt since test repos don't have origin remote - policy_file = tmp_path / ".deepwork.policy.yml" - policy_file.write_text( - """- name: "Documentation Policy" - trigger: "src/**/*" - safety: "docs/**/*" - compare_to: prompt - instructions: | - Update documentation when changing source files. -""" - ) - - # Create .deepwork directory with empty baseline - deepwork_dir = tmp_path / ".deepwork" - deepwork_dir.mkdir(exist_ok=True) - (deepwork_dir / ".last_work_tree").write_text("") - - # Create both trigger and safety files - src_dir = tmp_path / "src" - src_dir.mkdir(exist_ok=True) - (src_dir / "main.py").write_text("# Source file\n") - - docs_dir = tmp_path / "docs" - docs_dir.mkdir(exist_ok=True) - (docs_dir / "api.md").write_text("# API docs\n") - - # Stage both changes so they appear in git diff --cached - repo.index.add(["src/main.py", "docs/api.md"]) - - # Run the stop hook - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_stop_hook(script_path, tmp_path) - - # Parse the output - output = stdout.strip() - assert output, f"Expected JSON output. 
stderr: {stderr}" - - result = json.loads(output) - - # Should be empty JSON because safety pattern matched - assert result == {}, f"Expected empty JSON when safety pattern matches, got: {result}" - - -class TestPolicyStopHookJsonFormat: - """Tests for the JSON output format of policy_stop_hook.sh.""" - - def test_json_has_correct_structure( - self, policy_hooks_dir: Path, git_repo_with_src_policy: Path - ) -> None: - """Test that blocking JSON has the correct Claude Code structure.""" - # Create a file that triggers the policy - src_dir = git_repo_with_src_policy / "src" - src_dir.mkdir(exist_ok=True) - (src_dir / "main.py").write_text("# New file\n") - - repo = Repo(git_repo_with_src_policy) - repo.index.add(["src/main.py"]) - - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_stop_hook(script_path, git_repo_with_src_policy) - - result = json.loads(stdout.strip()) - - # Verify exact structure expected by Claude Code - assert set(result.keys()) == { - "decision", - "reason", - }, f"Unexpected keys in JSON: {result.keys()}" - assert result["decision"] == "block" - assert isinstance(result["reason"], str) - assert len(result["reason"]) > 0 - - def test_reason_contains_policy_instructions( - self, policy_hooks_dir: Path, git_repo_with_src_policy: Path - ) -> None: - """Test that the reason includes the policy instructions.""" - src_dir = git_repo_with_src_policy / "src" - src_dir.mkdir(exist_ok=True) - (src_dir / "main.py").write_text("# New file\n") - - repo = Repo(git_repo_with_src_policy) - repo.index.add(["src/main.py"]) - - script_path = policy_hooks_dir / "policy_stop_hook.sh" - stdout, stderr, code = run_stop_hook(script_path, git_repo_with_src_policy) - - result = json.loads(stdout.strip()) - - # Check that the reason contains the policy content - reason = result["reason"] - assert "DeepWork Policies Triggered" in reason - assert "Test Policy" in reason - assert "test policy that fires" in reason diff --git 
a/tests/shell_script_tests/test_rules_stop_hook.py b/tests/shell_script_tests/test_rules_stop_hook.py new file mode 100644 index 00000000..9aeb3306 --- /dev/null +++ b/tests/shell_script_tests/test_rules_stop_hook.py @@ -0,0 +1,299 @@ +"""Tests for the rules stop hook (deepwork.hooks.rules_check). + +These tests verify that the rules stop hook correctly outputs JSON +to block or allow the stop event in Claude Code. +""" + +import json +import os +import subprocess +import tempfile +from pathlib import Path + +import pytest +from git import Repo + + +@pytest.fixture +def git_repo_with_src_rule(tmp_path: Path) -> Path: + """Create a git repo with a v2 rule file that triggers on src/** changes.""" + repo = Repo.init(tmp_path) + + readme = tmp_path / "README.md" + readme.write_text("# Test Project\n") + repo.index.add(["README.md"]) + repo.index.commit("Initial commit") + + # Create v2 rules directory and file + rules_dir = tmp_path / ".deepwork" / "rules" + rules_dir.mkdir(parents=True, exist_ok=True) + + # Use compare_to: prompt since test repos don't have origin remote + rule_file = rules_dir / "test-rule.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: "src/**/*" +compare_to: prompt +--- +This is a test rule that fires when src/ files change. +Please address this rule. 
+""" + ) + + # Empty baseline means all current files are "new" + deepwork_dir = tmp_path / ".deepwork" + (deepwork_dir / ".last_work_tree").write_text("") + + return tmp_path + + +def run_stop_hook( + cwd: Path, + hook_input: dict | None = None, + src_dir: Path | None = None, +) -> tuple[str, str, int]: + """Run the rules_check module and return its output.""" + env = os.environ.copy() + env["DEEPWORK_HOOK_PLATFORM"] = "claude" + if src_dir: + env["PYTHONPATH"] = str(src_dir) + + stdin_data = json.dumps(hook_input) if hook_input else "" + + result = subprocess.run( + ["python", "-m", "deepwork.hooks.rules_check"], + cwd=cwd, + capture_output=True, + text=True, + input=stdin_data, + env=env, + ) + + return result.stdout, result.stderr, result.returncode + + +class TestRulesStopHookBlocking: + """Tests for rules stop hook blocking behavior.""" + + def test_outputs_block_json_when_rule_fires( + self, src_dir: Path, git_repo_with_src_rule: Path + ) -> None: + """Test that the hook outputs blocking JSON when a rule fires.""" + # Create a file that triggers the rule + test_src_dir = git_repo_with_src_rule / "src" + test_src_dir.mkdir(exist_ok=True) + (test_src_dir / "main.py").write_text("# New file\n") + + # Stage the change + repo = Repo(git_repo_with_src_rule) + repo.index.add(["src/main.py"]) + + # Run the stop hook + stdout, stderr, code = run_stop_hook(git_repo_with_src_rule, src_dir=src_dir) + + # Parse the output as JSON + output = stdout.strip() + assert output, f"Expected JSON output but got empty string. stderr: {stderr}" + + try: + result = json.loads(output) + except json.JSONDecodeError as e: + pytest.fail(f"Output is not valid JSON: {output!r}. 
Error: {e}") + + # Verify the JSON has the blocking structure + assert "decision" in result, f"Expected 'decision' key in JSON: {result}" + assert result["decision"] == "block", f"Expected decision='block', got: {result}" + assert "reason" in result, f"Expected 'reason' key in JSON: {result}" + assert "Test Rule" in result["reason"], f"Rule name not in reason: {result}" + + def test_outputs_empty_json_when_no_rule_fires( + self, src_dir: Path, git_repo_with_src_rule: Path + ) -> None: + """Test that the hook outputs empty JSON when no rule fires.""" + # Don't create any files that would trigger the rule + # (rule triggers on src/** but we haven't created anything in src/) + + # Run the stop hook + stdout, stderr, code = run_stop_hook(git_repo_with_src_rule, src_dir=src_dir) + + # Parse the output as JSON + output = stdout.strip() + assert output, f"Expected JSON output but got empty string. stderr: {stderr}" + + try: + result = json.loads(output) + except json.JSONDecodeError as e: + pytest.fail(f"Output is not valid JSON: {output!r}. Error: {e}") + + # Should be empty JSON (no blocking) + assert result == {}, f"Expected empty JSON when no rules fire, got: {result}" + + def test_exits_early_when_no_rules_dir(self, src_dir: Path, git_repo: Path) -> None: + """Test that the hook exits cleanly when no rules directory exists.""" + stdout, stderr, code = run_stop_hook(git_repo, src_dir=src_dir) + + # Should exit with code 0 and produce no output (or empty) + assert code == 0, f"Expected exit code 0, got {code}. 
stderr: {stderr}" + # No output is fine when there's no rules directory + output = stdout.strip() + if output: + # If there is output, it should be valid JSON + try: + result = json.loads(output) + assert result == {}, f"Expected empty JSON, got: {result}" + except json.JSONDecodeError: + # Empty or no output is acceptable + pass + + def test_respects_promise_tags(self, src_dir: Path, git_repo_with_src_rule: Path) -> None: + """Test that promised rules are not re-triggered.""" + # Create a file that triggers the rule + test_src_dir = git_repo_with_src_rule / "src" + test_src_dir.mkdir(exist_ok=True) + (test_src_dir / "main.py").write_text("# New file\n") + + # Stage the change + repo = Repo(git_repo_with_src_rule) + repo.index.add(["src/main.py"]) + + # Create a mock transcript with the promise tag + with tempfile.NamedTemporaryFile(mode="w", suffix=".jsonl", delete=False) as f: + transcript_path = f.name + # Write a mock assistant message with the promise tag + f.write( + json.dumps( + { + "role": "assistant", + "message": { + "content": [ + { + "type": "text", + "text": "I've addressed the rule. Test Rule", + } + ] + }, + } + ) + ) + f.write("\n") + + try: + # Run the stop hook with transcript path + hook_input = {"transcript_path": transcript_path, "hook_event_name": "Stop"} + stdout, stderr, code = run_stop_hook( + git_repo_with_src_rule, hook_input, src_dir=src_dir + ) + + # Parse the output + output = stdout.strip() + assert output, f"Expected JSON output. 
stderr: {stderr}" + + result = json.loads(output) + + # Should be empty JSON because the rule was promised + assert result == {}, f"Expected empty JSON when rule is promised, got: {result}" + finally: + os.unlink(transcript_path) + + def test_safety_pattern_prevents_firing(self, src_dir: Path, tmp_path: Path) -> None: + """Test that safety patterns prevent rules from firing.""" + # Initialize git repo + repo = Repo.init(tmp_path) + + readme = tmp_path / "README.md" + readme.write_text("# Test Project\n") + repo.index.add(["README.md"]) + repo.index.commit("Initial commit") + + # Create v2 rule with a safety pattern + rules_dir = tmp_path / ".deepwork" / "rules" + rules_dir.mkdir(parents=True, exist_ok=True) + + rule_file = rules_dir / "documentation-rule.md" + rule_file.write_text( + """--- +name: Documentation Rule +trigger: "src/**/*" +safety: "docs/**/*" +compare_to: prompt +--- +Update documentation when changing source files. +""" + ) + + # Create .deepwork directory with empty baseline + deepwork_dir = tmp_path / ".deepwork" + (deepwork_dir / ".last_work_tree").write_text("") + + # Create both trigger and safety files + test_src_dir = tmp_path / "src" + test_src_dir.mkdir(exist_ok=True) + (test_src_dir / "main.py").write_text("# Source file\n") + + docs_dir = tmp_path / "docs" + docs_dir.mkdir(exist_ok=True) + (docs_dir / "api.md").write_text("# API docs\n") + + # Stage both changes so they appear in git diff --cached + repo.index.add(["src/main.py", "docs/api.md"]) + + # Run the stop hook + stdout, stderr, code = run_stop_hook(tmp_path, src_dir=src_dir) + + # Parse the output + output = stdout.strip() + assert output, f"Expected JSON output. 
stderr: {stderr}" + + result = json.loads(output) + + # Should be empty JSON because safety pattern matched + assert result == {}, f"Expected empty JSON when safety pattern matches, got: {result}" + + +class TestRulesStopHookJsonFormat: + """Tests for the JSON output format of the rules stop hook.""" + + def test_json_has_correct_structure(self, src_dir: Path, git_repo_with_src_rule: Path) -> None: + """Test that blocking JSON has the correct Claude Code structure.""" + # Create a file that triggers the rule + test_src_dir = git_repo_with_src_rule / "src" + test_src_dir.mkdir(exist_ok=True) + (test_src_dir / "main.py").write_text("# New file\n") + + repo = Repo(git_repo_with_src_rule) + repo.index.add(["src/main.py"]) + + stdout, stderr, code = run_stop_hook(git_repo_with_src_rule, src_dir=src_dir) + + result = json.loads(stdout.strip()) + + # Verify exact structure expected by Claude Code + assert set(result.keys()) == { + "decision", + "reason", + }, f"Unexpected keys in JSON: {result.keys()}" + assert result["decision"] == "block" + assert isinstance(result["reason"], str) + assert len(result["reason"]) > 0 + + def test_reason_contains_rule_instructions( + self, src_dir: Path, git_repo_with_src_rule: Path + ) -> None: + """Test that the reason includes the rule instructions.""" + test_src_dir = git_repo_with_src_rule / "src" + test_src_dir.mkdir(exist_ok=True) + (test_src_dir / "main.py").write_text("# New file\n") + + repo = Repo(git_repo_with_src_rule) + repo.index.add(["src/main.py"]) + + stdout, stderr, code = run_stop_hook(git_repo_with_src_rule, src_dir=src_dir) + + result = json.loads(stdout.strip()) + + # Check that the reason contains the rule content + reason = result["reason"] + assert "DeepWork Rules Triggered" in reason + assert "Test Rule" in reason + assert "test rule that fires" in reason diff --git a/tests/shell_script_tests/test_user_prompt_submit.py b/tests/shell_script_tests/test_user_prompt_submit.py index b503727b..3f1b655e 100644 --- 
a/tests/shell_script_tests/test_user_prompt_submit.py +++ b/tests/shell_script_tests/test_user_prompt_submit.py @@ -28,34 +28,34 @@ def run_user_prompt_submit_hook( class TestUserPromptSubmitHookExecution: """Tests for user_prompt_submit.sh execution behavior.""" - def test_exits_successfully(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_exits_successfully(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the hook exits with code 0.""" - script_path = policy_hooks_dir / "user_prompt_submit.sh" + script_path = rules_hooks_dir / "user_prompt_submit.sh" stdout, stderr, code = run_user_prompt_submit_hook(script_path, git_repo) assert code == 0, f"Expected exit code 0, got {code}. stderr: {stderr}" - def test_creates_deepwork_directory(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_creates_deepwork_directory(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the hook creates .deepwork directory if it doesn't exist.""" deepwork_dir = git_repo / ".deepwork" assert not deepwork_dir.exists(), "Precondition: .deepwork should not exist" - script_path = policy_hooks_dir / "user_prompt_submit.sh" + script_path = rules_hooks_dir / "user_prompt_submit.sh" stdout, stderr, code = run_user_prompt_submit_hook(script_path, git_repo) assert code == 0, f"Script failed with stderr: {stderr}" assert deepwork_dir.exists(), "Hook should create .deepwork directory" - def test_creates_last_work_tree_file(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_creates_last_work_tree_file(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the hook creates .deepwork/.last_work_tree file.""" - script_path = policy_hooks_dir / "user_prompt_submit.sh" + script_path = rules_hooks_dir / "user_prompt_submit.sh" stdout, stderr, code = run_user_prompt_submit_hook(script_path, git_repo) work_tree_file = git_repo / ".deepwork" / ".last_work_tree" assert code == 0, f"Script failed with stderr: {stderr}" assert 
work_tree_file.exists(), "Hook should create .last_work_tree file" - def test_captures_staged_changes(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_captures_staged_changes(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the hook captures staged file changes.""" # Create and stage a new file new_file = git_repo / "new_file.py" @@ -63,7 +63,7 @@ def test_captures_staged_changes(self, policy_hooks_dir: Path, git_repo: Path) - repo = Repo(git_repo) repo.index.add(["new_file.py"]) - script_path = policy_hooks_dir / "user_prompt_submit.sh" + script_path = rules_hooks_dir / "user_prompt_submit.sh" stdout, stderr, code = run_user_prompt_submit_hook(script_path, git_repo) assert code == 0, f"Script failed with stderr: {stderr}" @@ -72,13 +72,13 @@ def test_captures_staged_changes(self, policy_hooks_dir: Path, git_repo: Path) - content = work_tree_file.read_text() assert "new_file.py" in content, "Staged file should be captured" - def test_captures_untracked_files(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_captures_untracked_files(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the hook captures untracked files.""" # Create an untracked file (don't stage it) untracked = git_repo / "untracked.txt" untracked.write_text("untracked content\n") - script_path = policy_hooks_dir / "user_prompt_submit.sh" + script_path = rules_hooks_dir / "user_prompt_submit.sh" stdout, stderr, code = run_user_prompt_submit_hook(script_path, git_repo) assert code == 0, f"Script failed with stderr: {stderr}" @@ -99,9 +99,9 @@ class TestUserPromptSubmitHookJsonOutput: Either is acceptable; invalid JSON is NOT acceptable. 
""" - def test_output_is_empty_or_valid_json(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_output_is_empty_or_valid_json(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that output is either empty or valid JSON.""" - script_path = policy_hooks_dir / "user_prompt_submit.sh" + script_path = rules_hooks_dir / "user_prompt_submit.sh" stdout, stderr, code = run_user_prompt_submit_hook(script_path, git_repo) output = stdout.strip() @@ -114,9 +114,9 @@ def test_output_is_empty_or_valid_json(self, policy_hooks_dir: Path, git_repo: P except json.JSONDecodeError as e: pytest.fail(f"Output is not valid JSON: {output!r}. Error: {e}") - def test_does_not_block_prompt(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_does_not_block_prompt(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the hook does not return a blocking response.""" - script_path = policy_hooks_dir / "user_prompt_submit.sh" + script_path = rules_hooks_dir / "user_prompt_submit.sh" stdout, stderr, code = run_user_prompt_submit_hook(script_path, git_repo) output = stdout.strip() @@ -135,18 +135,18 @@ def test_does_not_block_prompt(self, policy_hooks_dir: Path, git_repo: Path) -> class TestUserPromptSubmitHookIdempotence: """Tests for idempotent behavior of user_prompt_submit.sh.""" - def test_multiple_runs_succeed(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def test_multiple_runs_succeed(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that the hook can be run multiple times successfully.""" - script_path = policy_hooks_dir / "user_prompt_submit.sh" + script_path = rules_hooks_dir / "user_prompt_submit.sh" # Run multiple times for i in range(3): stdout, stderr, code = run_user_prompt_submit_hook(script_path, git_repo) assert code == 0, f"Run {i + 1} failed with stderr: {stderr}" - def test_updates_work_tree_on_new_changes(self, policy_hooks_dir: Path, git_repo: Path) -> None: + def 
test_updates_work_tree_on_new_changes(self, rules_hooks_dir: Path, git_repo: Path) -> None: """Test that subsequent runs update the work tree state.""" - script_path = policy_hooks_dir / "user_prompt_submit.sh" + script_path = rules_hooks_dir / "user_prompt_submit.sh" repo = Repo(git_repo) # First run - capture initial state diff --git a/tests/unit/test_command_executor.py b/tests/unit/test_command_executor.py new file mode 100644 index 00000000..77d7b320 --- /dev/null +++ b/tests/unit/test_command_executor.py @@ -0,0 +1,197 @@ +"""Tests for command executor (CMD-5.x from test_scenarios.md).""" + +from pathlib import Path + +from deepwork.core.command_executor import ( + CommandResult, + all_commands_succeeded, + execute_command, + format_command_errors, + run_command_action, + substitute_command_variables, +) +from deepwork.core.rules_parser import CommandAction + + +class TestSubstituteCommandVariables: + """Tests for command variable substitution.""" + + def test_single_file_substitution(self) -> None: + """Substitute {file} variable.""" + result = substitute_command_variables( + "ruff format {file}", + file="src/main.py", + ) + assert result == "ruff format src/main.py" + + def test_multiple_files_substitution(self) -> None: + """Substitute {files} variable.""" + result = substitute_command_variables( + "eslint --fix {files}", + files=["a.js", "b.js", "c.js"], + ) + assert result == "eslint --fix a.js b.js c.js" + + def test_repo_root_substitution(self) -> None: + """Substitute {repo_root} variable.""" + result = substitute_command_variables( + "cd {repo_root} && pytest", + repo_root=Path("/home/user/project"), + ) + assert result == "cd /home/user/project && pytest" + + def test_all_variables(self) -> None: + """Substitute all variables together.""" + result = substitute_command_variables( + "{repo_root}/scripts/process.sh {file} {files}", + file="main.py", + files=["a.py", "b.py"], + repo_root=Path("/project"), + ) + assert result == 
"/project/scripts/process.sh main.py a.py b.py" + + +class TestExecuteCommand: + """Tests for command execution.""" + + def test_successful_command(self) -> None: + """CMD-5.3.1: Exit code 0 - success.""" + result = execute_command("echo hello") + assert result.success is True + assert result.exit_code == 0 + assert "hello" in result.stdout + + def test_failed_command(self) -> None: + """CMD-5.3.2: Exit code 1 - failure.""" + result = execute_command("exit 1") + assert result.success is False + assert result.exit_code == 1 + + def test_command_timeout(self) -> None: + """CMD-5.3.3: Command timeout.""" + result = execute_command("sleep 10", timeout=1) + assert result.success is False + assert "timed out" in result.stderr.lower() + + def test_command_not_found(self) -> None: + """CMD-5.3.4: Command not found.""" + result = execute_command("nonexistent_command_12345") + assert result.success is False + # Different systems return different error messages + assert result.exit_code != 0 or "not found" in result.stderr.lower() + + +class TestRunCommandActionEachMatch: + """Tests for run_for: each_match mode (CMD-5.1.x).""" + + def test_single_file(self) -> None: + """CMD-5.1.1: Single file triggers single command.""" + action = CommandAction(command="echo {file}", run_for="each_match") + results = run_command_action(action, ["src/main.py"]) + + assert len(results) == 1 + assert results[0].command == "echo src/main.py" + assert results[0].success is True + + def test_multiple_files(self) -> None: + """CMD-5.1.2: Multiple files trigger command for each.""" + action = CommandAction(command="echo {file}", run_for="each_match") + results = run_command_action(action, ["src/a.py", "src/b.py"]) + + assert len(results) == 2 + assert results[0].command == "echo src/a.py" + assert results[1].command == "echo src/b.py" + + def test_no_files(self) -> None: + """CMD-5.1.3: No files - no command run.""" + action = CommandAction(command="echo {file}", run_for="each_match") + results = 
run_command_action(action, [])
+
+        assert len(results) == 0
+
+
+class TestRunCommandActionAllMatches:
+    """Tests for run_for: all_matches mode (CMD-5.2.x)."""
+
+    def test_multiple_files_single_command(self) -> None:
+        """CMD-5.2.1: Multiple files in single command."""
+        action = CommandAction(command="echo {files}", run_for="all_matches")
+        results = run_command_action(action, ["a.js", "b.js", "c.js"])
+
+        assert len(results) == 1
+        assert results[0].command == "echo a.js b.js c.js"
+        assert results[0].success is True
+
+    def test_single_file_single_command(self) -> None:
+        """CMD-5.2.2: Single file in single command."""
+        action = CommandAction(command="echo {files}", run_for="all_matches")
+        results = run_command_action(action, ["a.js"])
+
+        assert len(results) == 1
+        assert results[0].command == "echo a.js"
+
+
+class TestAllCommandsSucceeded:
+    """Tests for all_commands_succeeded helper."""
+
+    def test_all_success(self) -> None:
+        """All commands succeeded."""
+        results = [
+            CommandResult(success=True, exit_code=0, stdout="ok", stderr="", command="echo 1"),
+            CommandResult(success=True, exit_code=0, stdout="ok", stderr="", command="echo 2"),
+        ]
+        assert all_commands_succeeded(results) is True
+
+    def test_one_failure(self) -> None:
+        """One command failed."""
+        results = [
+            CommandResult(success=True, exit_code=0, stdout="ok", stderr="", command="echo 1"),
+            CommandResult(success=False, exit_code=1, stdout="", stderr="error", command="exit 1"),
+        ]
+        assert all_commands_succeeded(results) is False
+
+    def test_empty_list(self) -> None:
+        """Empty list is considered success."""
+        assert all_commands_succeeded([]) is True
+
+
+class TestFormatCommandErrors:
+    """Tests for format_command_errors helper."""
+
+    def test_single_error(self) -> None:
+        """Format single error."""
+        results = [
+            CommandResult(
+                success=False,
+                exit_code=1,
+                stdout="",
+                stderr="Something went wrong",
+                command="failing_cmd",
+            ),
+        ]
+        output = format_command_errors(results)
+
assert "failing_cmd" in output
+        assert "Something went wrong" in output
+        assert "Exit code: 1" in output
+
+    def test_multiple_errors(self) -> None:
+        """Format multiple errors."""
+        results = [
+            CommandResult(success=False, exit_code=1, stdout="", stderr="Error 1", command="cmd1"),
+            CommandResult(success=False, exit_code=2, stdout="", stderr="Error 2", command="cmd2"),
+        ]
+        output = format_command_errors(results)
+        assert "cmd1" in output
+        assert "Error 1" in output
+        assert "cmd2" in output
+        assert "Error 2" in output
+
+    def test_ignores_success(self) -> None:
+        """Ignore successful commands."""
+        results = [
+            CommandResult(success=True, exit_code=0, stdout="ok", stderr="", command="good_cmd"),
+            CommandResult(success=False, exit_code=1, stdout="", stderr="bad", command="bad_cmd"),
+        ]
+        output = format_command_errors(results)
+        assert "good_cmd" not in output
+        assert "bad_cmd" in output
diff --git a/tests/unit/test_evaluate_policies.py b/tests/unit/test_evaluate_policies.py
deleted file mode 100644
index 03f1a26a..00000000
--- a/tests/unit/test_evaluate_policies.py
+++ /dev/null
@@ -1,101 +0,0 @@
-"""Tests for the hooks evaluate_policies module."""
-
-from deepwork.core.policy_parser import Policy
-from deepwork.hooks.evaluate_policies import extract_promise_tags, format_policy_message
-
-
-class TestExtractPromiseTags:
-    """Tests for extract_promise_tags function."""
-
-    def test_extracts_policy_name_from_promise(self) -> None:
-        """Test extracting policy name from promise tag body."""
-        text = "✓ Update Docs"
-        result = extract_promise_tags(text)
-        assert result == {"Update Docs"}
-
-    def test_extracts_multiple_promises(self) -> None:
-        """Test extracting multiple promise tags."""
-        text = """
-        I've addressed the policies.
-        ✓ Update Docs
-        ✓ Security Review
-        """
-        result = extract_promise_tags(text)
-        assert result == {"Update Docs", "Security Review"}
-
-    def test_case_insensitive(self) -> None:
-        """Test that promise tag matching is case insensitive."""
-        text = "✓ Test Policy"
-        result = extract_promise_tags(text)
-        assert result == {"Test Policy"}
-
-    def test_returns_empty_set_for_no_promises(self) -> None:
-        """Test that empty set is returned when no promises found."""
-        text = "This is just some regular text without any promise tags."
-        result = extract_promise_tags(text)
-        assert result == set()
-
-    def test_strips_whitespace_from_policy_name(self) -> None:
-        """Test that whitespace is stripped from extracted policy names."""
-        text = "✓ Policy With Spaces "
-        result = extract_promise_tags(text)
-        assert result == {"Policy With Spaces"}
-
-
-class TestFormatPolicyMessage:
-    """Tests for format_policy_message function."""
-
-    def test_formats_single_policy(self) -> None:
-        """Test formatting a single policy."""
-        policies = [
-            Policy(
-                name="Test Policy",
-                triggers=["src/*"],
-                safety=[],
-                instructions="Please update the documentation.",
-            )
-        ]
-        result = format_policy_message(policies)
-
-        assert "## DeepWork Policies Triggered" in result
-        assert "### Policy: Test Policy" in result
-        assert "Please update the documentation." in result
-        assert "✓ Policy Name" in result
-
-    def test_formats_multiple_policies(self) -> None:
-        """Test formatting multiple policies."""
-        policies = [
-            Policy(
-                name="Policy 1",
-                triggers=["src/*"],
-                safety=[],
-                instructions="Do thing 1.",
-            ),
-            Policy(
-                name="Policy 2",
-                triggers=["test/*"],
-                safety=[],
-                instructions="Do thing 2.",
-            ),
-        ]
-        result = format_policy_message(policies)
-
-        assert "### Policy: Policy 1" in result
-        assert "### Policy: Policy 2" in result
-        assert "Do thing 1." in result
-        assert "Do thing 2."
in result
-
-    def test_strips_instruction_whitespace(self) -> None:
-        """Test that instruction whitespace is stripped."""
-        policies = [
-            Policy(
-                name="Test",
-                triggers=["*"],
-                safety=[],
-                instructions=" \n Instructions here \n ",
-            )
-        ]
-        result = format_policy_message(policies)
-
-        # Should be stripped but present
-        assert "Instructions here" in result
diff --git a/tests/unit/test_hook_wrapper.py b/tests/unit/test_hook_wrapper.py
index 4332c914..fd1a51d9 100644
--- a/tests/unit/test_hook_wrapper.py
+++ b/tests/unit/test_hook_wrapper.py
@@ -1,5 +1,22 @@
 """Tests for the hook wrapper module.
 
+# ******************************************************************************
+# *** CRITICAL CONTRACT TESTS ***
+# ******************************************************************************
+#
+# These tests verify the EXACT format required by Claude Code hooks as
+# documented in: doc/platforms/claude/hooks_system.md
+#
+# Hook JSON Contract Summary:
+# - Allow response: {} (empty JSON object)
+# - Block response: {"decision": "block", "reason": "..."} (Claude Code)
+# - Block response: {"decision": "deny", "reason": "..."} (Gemini CLI)
+#
+# DO NOT MODIFY these tests without first consulting the official Claude Code
+# documentation at: https://docs.anthropic.com/en/docs/claude-code/hooks
+#
+# ******************************************************************************
+
 These tests verify that the hook wrapper correctly normalizes input/output
 between different AI CLI platforms (Claude Code, Gemini CLI).
 """
@@ -157,18 +174,34 @@ def test_empty_input(self) -> None:
         assert hook_input.tool_name == ""
 
 
+# ******************************************************************************
+# *** DO NOT EDIT THESE OUTPUT FORMAT TESTS!
***
+# As documented in doc/platforms/claude/hooks_system.md, hook responses must be:
+# - {} (empty object) to allow
+# - {"decision": "block", "reason": "..."} to block (Claude Code)
+# - {"decision": "deny", "reason": "..."} to block (Gemini CLI)
+# Any other format may cause undefined behavior.
+# See: https://docs.anthropic.com/en/docs/claude-code/hooks
+# ******************************************************************************
 class TestHookOutput:
     """Tests for HookOutput denormalization."""
 
     def test_empty_output_produces_empty_json(self) -> None:
-        """Test that empty HookOutput produces empty dict."""
+        """Test that empty HookOutput produces empty dict.
+
+        DO NOT CHANGE THIS TEST - it verifies the documented hook contract.
+        """
         output = HookOutput()
         result = output.to_dict(Platform.CLAUDE, NormalizedEvent.AFTER_AGENT)
 
         assert result == {}
 
     def test_block_decision_claude(self) -> None:
-        """Test blocking output for Claude."""
+        """Test blocking output for Claude.
+
+        DO NOT CHANGE THIS TEST - it verifies the documented hook contract.
+        Claude Code expects {"decision": "block", "reason": "..."} to block.
+        """
         output = HookOutput(decision="block", reason="Must complete X first")
         result = output.to_dict(Platform.CLAUDE, NormalizedEvent.AFTER_AGENT)
 
@@ -176,7 +209,11 @@ def test_block_decision_claude(self) -> None:
         assert result["reason"] == "Must complete X first"
 
     def test_block_decision_gemini_converts_to_deny(self) -> None:
-        """Test that 'block' is converted to 'deny' for Gemini."""
+        """Test that 'block' is converted to 'deny' for Gemini.
+
+        DO NOT CHANGE THIS TEST - it verifies the documented hook contract.
+        Gemini CLI expects {"decision": "deny", "reason": "..."} to block.
+        """
         output = HookOutput(decision="block", reason="Must complete X first")
         result = output.to_dict(Platform.GEMINI, NormalizedEvent.AFTER_AGENT)
 
@@ -288,11 +325,22 @@ def test_invalid_json(self) -> None:
         assert hook_input.session_id == ""
 
 
+# ******************************************************************************
+# *** DO NOT EDIT THESE JSON OUTPUT TESTS! ***
+# As documented in doc/platforms/claude/hooks_system.md, hook JSON output must:
+# - Be valid JSON
+# - Return {} for allow
+# - Return {"decision": "block", "reason": "..."} for block
+# See: https://docs.anthropic.com/en/docs/claude-code/hooks
+# ******************************************************************************
 class TestDenormalizeOutput:
     """Tests for the denormalize_output function."""
 
     def test_produces_valid_json(self) -> None:
-        """Test that output is valid JSON."""
+        """Test that output is valid JSON.
+
+        DO NOT CHANGE THIS TEST - it verifies the documented hook contract.
+        """
         output = HookOutput(decision="block", reason="test")
         json_str = denormalize_output(output, Platform.CLAUDE, NormalizedEvent.AFTER_AGENT)
 
@@ -301,7 +349,10 @@ def test_produces_valid_json(self) -> None:
         assert parsed["decision"] == "block"
 
     def test_empty_output_produces_empty_object(self) -> None:
-        """Test that empty output produces '{}'."""
+        """Test that empty output produces '{}' (allow response).
+
+        DO NOT CHANGE THIS TEST - it verifies the documented hook contract.
+        """
         output = HookOutput()
         json_str = denormalize_output(output, Platform.CLAUDE, NormalizedEvent.AFTER_AGENT)
 
@@ -389,11 +440,31 @@ def test_common_tools_map_to_same_normalized_name(self) -> None:
         assert TOOL_TO_NORMALIZED[Platform.GEMINI][gemini_tool] == tool
 
 
+# ******************************************************************************
+# *** DO NOT EDIT THESE INTEGRATION TESTS!
***
+# ******************************************************************************
+#
+# These tests verify the complete input/output flow for both Claude Code and
+# Gemini CLI, following the hook contracts documented in:
+# doc/platforms/claude/hooks_system.md
+#
+# Claude Code contract:
+# - Block: {"decision": "block", "reason": "..."}
+#
+# Gemini CLI contract:
+# - Block: {"decision": "deny", "reason": "..."}
+#
+# The "block" vs "deny" terminology is a platform difference, not a bug.
+# See: https://docs.anthropic.com/en/docs/claude-code/hooks
+# ******************************************************************************
 class TestIntegration:
     """Integration tests for the full normalization flow."""
 
     def test_claude_stop_hook_flow(self) -> None:
-        """Test complete flow for Claude Stop hook."""
+        """Test complete flow for Claude Stop hook.
+
+        DO NOT CHANGE THIS TEST - it verifies the documented hook contract.
+        """
         # Input from Claude
         raw_input = json.dumps(
             {
@@ -409,17 +480,20 @@ def test_claude_stop_hook_flow(self) -> None:
         assert hook_input.event == NormalizedEvent.AFTER_AGENT
 
         # Process (would call hook function here)
-        hook_output = HookOutput(decision="block", reason="Policy X requires attention")
+        hook_output = HookOutput(decision="block", reason="Rule X requires attention")
 
         # Denormalize
         output_json = denormalize_output(hook_output, Platform.CLAUDE, hook_input.event)
         result = json.loads(output_json)
 
         assert result["decision"] == "block"
-        assert "Policy X" in result["reason"]
+        assert "Rule X" in result["reason"]
 
     def test_gemini_afteragent_hook_flow(self) -> None:
-        """Test complete flow for Gemini AfterAgent hook."""
+        """Test complete flow for Gemini AfterAgent hook.
+
+        DO NOT CHANGE THIS TEST - it verifies the documented hook contract.
+        """
         # Input from Gemini
         raw_input = json.dumps(
             {
@@ -436,7 +510,7 @@ def test_gemini_afteragent_hook_flow(self) -> None:
         assert hook_input.event == NormalizedEvent.AFTER_AGENT
 
         # Process (would call hook function here)
-        hook_output = HookOutput(decision="block", reason="Policy Y requires attention")
+        hook_output = HookOutput(decision="block", reason="Rule Y requires attention")
 
         # Denormalize
         output_json = denormalize_output(hook_output, Platform.GEMINI, hook_input.event)
@@ -444,10 +518,14 @@ def test_gemini_afteragent_hook_flow(self) -> None:
 
         # Gemini should get "deny" instead of "block"
         assert result["decision"] == "deny"
-        assert "Policy Y" in result["reason"]
+        assert "Rule Y" in result["reason"]
 
     def test_cross_platform_same_hook_logic(self) -> None:
-        """Test that the same hook logic produces correct output for both platforms."""
+        """Test that the same hook logic produces correct output for both platforms.
+
+        DO NOT CHANGE THIS TEST - it verifies the documented hook contract.
+        The "block" vs "deny" platform difference is intentional.
+        """
 
         def sample_hook(hook_input: HookInput) -> HookOutput:
             """Sample hook that blocks if event is after_agent."""
diff --git a/tests/unit/test_hooks_syncer.py b/tests/unit/test_hooks_syncer.py
index 0a1b1c0c..abaca222 100644
--- a/tests/unit/test_hooks_syncer.py
+++ b/tests/unit/test_hooks_syncer.py
@@ -6,6 +6,7 @@
 from deepwork.core.adapters import ClaudeAdapter
 from deepwork.core.hooks_syncer import (
     HookEntry,
+    HookSpec,
     JobHooks,
     collect_job_hooks,
     merge_hooks_for_platform,
@@ -16,19 +17,33 @@
 class TestHookEntry:
     """Tests for HookEntry dataclass."""
 
-    def test_get_script_path_relative(self, temp_dir: Path) -> None:
-        """Test getting relative script path."""
+    def test_get_command_for_script(self, temp_dir: Path) -> None:
+        """Test getting command for a script hook."""
         job_dir = temp_dir / ".deepwork" / "jobs" / "test_job"
         job_dir.mkdir(parents=True)
 
         entry = HookEntry(
+            job_name="test_job",
+            job_dir=job_dir,
             script="test_hook.sh",
+        )
+
+        cmd = entry.get_command(temp_dir)
+        assert cmd == ".deepwork/jobs/test_job/hooks/test_hook.sh"
+
+    def test_get_command_for_module(self, temp_dir: Path) -> None:
+        """Test getting command for a module hook."""
+        job_dir = temp_dir / ".deepwork" / "jobs" / "test_job"
+        job_dir.mkdir(parents=True)
+
+        entry = HookEntry(
             job_name="test_job",
             job_dir=job_dir,
+            module="deepwork.hooks.rules_check",
         )
 
-        path = entry.get_script_path(temp_dir)
-        assert path == ".deepwork/jobs/test_job/hooks/test_hook.sh"
+        cmd = entry.get_command(temp_dir)
+        assert cmd == "python -m deepwork.hooks.rules_check"
 
 
 class TestJobHooks:
@@ -47,7 +62,7 @@ def test_from_job_dir_with_hooks(self, temp_dir: Path) -> None:
 UserPromptSubmit:
   - capture.sh
 Stop:
-  - policy_check.sh
+  - rules_check.sh
   - cleanup.sh
 """
         )
@@ -56,8 +71,35 @@ def test_from_job_dir_with_hooks(self, temp_dir: Path) -> None:
 
         assert result is not None
         assert result.job_name == "test_job"
-        assert result.hooks["UserPromptSubmit"] == ["capture.sh"]
-        assert result.hooks["Stop"] ==
["policy_check.sh", "cleanup.sh"]
+        assert len(result.hooks["UserPromptSubmit"]) == 1
+        assert result.hooks["UserPromptSubmit"][0].script == "capture.sh"
+        assert len(result.hooks["Stop"]) == 2
+        assert result.hooks["Stop"][0].script == "rules_check.sh"
+        assert result.hooks["Stop"][1].script == "cleanup.sh"
+
+    def test_from_job_dir_with_module_hooks(self, temp_dir: Path) -> None:
+        """Test loading module-based hooks from job directory."""
+        job_dir = temp_dir / "test_job"
+        hooks_dir = job_dir / "hooks"
+        hooks_dir.mkdir(parents=True)
+
+        # Create global_hooks.yml with module format
+        hooks_file = hooks_dir / "global_hooks.yml"
+        hooks_file.write_text(
+            """
+UserPromptSubmit:
+  - capture.sh
+Stop:
+  - module: deepwork.hooks.rules_check
+"""
+        )
+
+        result = JobHooks.from_job_dir(job_dir)
+
+        assert result is not None
+        assert result.hooks["UserPromptSubmit"][0].script == "capture.sh"
+        assert result.hooks["Stop"][0].module == "deepwork.hooks.rules_check"
+        assert result.hooks["Stop"][0].script is None
 
     def test_from_job_dir_no_hooks_file(self, temp_dir: Path) -> None:
         """Test returns None when no hooks file exists."""
@@ -91,7 +133,8 @@ def test_from_job_dir_single_script_as_string(self, temp_dir: Path) -> None:
         result = JobHooks.from_job_dir(job_dir)
 
         assert result is not None
-        assert result.hooks["Stop"] == ["cleanup.sh"]
+        assert len(result.hooks["Stop"]) == 1
+        assert result.hooks["Stop"][0].script == "cleanup.sh"
 
 
 class TestCollectJobHooks:
@@ -143,12 +186,15 @@ def test_merges_hooks_from_multiple_jobs(self, temp_dir: Path) -> None:
             JobHooks(
                 job_name="job1",
                 job_dir=job1_dir,
-                hooks={"Stop": ["hook1.sh"]},
+                hooks={"Stop": [HookSpec(script="hook1.sh")]},
             ),
             JobHooks(
                 job_name="job2",
                 job_dir=job2_dir,
-                hooks={"Stop": ["hook2.sh"], "UserPromptSubmit": ["capture.sh"]},
+                hooks={
+                    "Stop": [HookSpec(script="hook2.sh")],
+                    "UserPromptSubmit": [HookSpec(script="capture.sh")],
+                },
             ),
         ]
 
@@ -169,7 +215,7 @@ def test_avoids_duplicate_hooks(self, temp_dir: Path) ->
None:
             JobHooks(
                 job_name="job1",
                 job_dir=job_dir,
-                hooks={"Stop": ["hook.sh", "hook.sh"]},
+                hooks={"Stop": [HookSpec(script="hook.sh"), HookSpec(script="hook.sh")]},
             ),
         ]
 
@@ -197,7 +243,7 @@ def test_syncs_hooks_via_adapter(self, temp_dir: Path) -> None:
             JobHooks(
                 job_name="test_job",
                 job_dir=job_dir,
-                hooks={"Stop": ["test_hook.sh"]},
+                hooks={"Stop": [HookSpec(script="test_hook.sh")]},
             ),
         ]
 
@@ -250,7 +296,7 @@ def test_merges_with_existing_settings(self, temp_dir: Path) -> None:
             JobHooks(
                 job_name="test_job",
                 job_dir=job_dir,
-                hooks={"Stop": ["new_hook.sh"]},
+                hooks={"Stop": [HookSpec(script="new_hook.sh")]},
             ),
         ]
 
diff --git a/tests/unit/test_pattern_matcher.py b/tests/unit/test_pattern_matcher.py
new file mode 100644
index 00000000..69d73e7e
--- /dev/null
+++ b/tests/unit/test_pattern_matcher.py
@@ -0,0 +1,205 @@
+"""Tests for pattern matching with variable extraction."""
+
+import pytest
+
+from deepwork.core.pattern_matcher import (
+    PatternError,
+    match_pattern,
+    matches_any_pattern,
+    matches_glob,
+    resolve_pattern,
+    validate_pattern,
+)
+
+
+class TestBasicGlobPatterns:
+    """Tests for basic glob pattern matching (PM-1.1.x from test_scenarios.md)."""
+
+    def test_exact_match(self) -> None:
+        """PM-1.1.1: Exact match."""
+        assert matches_glob("README.md", "README.md")
+
+    def test_exact_no_match(self) -> None:
+        """PM-1.1.2: Exact no match (case sensitive)."""
+        assert not matches_glob("readme.md", "README.md")
+
+    def test_single_wildcard(self) -> None:
+        """PM-1.1.3: Single wildcard."""
+        assert matches_glob("main.py", "*.py")
+
+    def test_single_wildcard_nested(self) -> None:
+        """PM-1.1.4: Single wildcard - fnmatch matches nested paths too.
+
+        Note: Standard fnmatch does match across directory separators.
+        Use **/*.py pattern to explicitly require directory prefixes.
+        """
+        # fnmatch's * matches any character including /
+        # This is different from shell glob behavior
+        assert matches_glob("src/main.py", "*.py")
+
+    def test_double_wildcard(self) -> None:
+        """PM-1.1.5: Double wildcard matches nested paths."""
+        assert matches_glob("src/main.py", "**/*.py")
+
+    def test_double_wildcard_deep(self) -> None:
+        """PM-1.1.6: Double wildcard matches deeply nested paths."""
+        assert matches_glob("src/a/b/c/main.py", "**/*.py")
+
+    def test_double_wildcard_root(self) -> None:
+        """PM-1.1.7: Double wildcard matches root-level files."""
+        assert matches_glob("main.py", "**/*.py")
+
+    def test_directory_prefix(self) -> None:
+        """PM-1.1.8: Directory prefix matching."""
+        assert matches_glob("src/foo.py", "src/**/*")
+
+    def test_directory_prefix_deep(self) -> None:
+        """PM-1.1.9: Directory prefix matching deeply nested."""
+        assert matches_glob("src/a/b/c.py", "src/**/*")
+
+    def test_directory_no_match(self) -> None:
+        """PM-1.1.10: Directory prefix no match."""
+        assert not matches_glob("lib/foo.py", "src/**/*")
+
+    def test_brace_expansion_ts(self) -> None:
+        """PM-1.1.11: Brace expansion - not supported by fnmatch.
+
+        Note: Python's fnmatch doesn't support brace expansion.
+        Use matches_any_pattern with multiple patterns instead.
+        """
+        # fnmatch doesn't support {a,b} syntax
+        assert not matches_glob("app.ts", "*.{js,ts}")
+        # Use matches_any_pattern for multiple extensions
+        assert matches_any_pattern("app.ts", ["*.ts", "*.js"])
+
+    def test_brace_expansion_js(self) -> None:
+        """PM-1.1.12: Brace expansion - not supported by fnmatch."""
+        assert not matches_glob("app.js", "*.{js,ts}")
+        assert matches_any_pattern("app.js", ["*.ts", "*.js"])
+
+    def test_brace_expansion_no_match(self) -> None:
+        """PM-1.1.13: Brace expansion no match."""
+        # Neither {a,b} syntax nor multiple patterns match
+        assert not matches_glob("app.py", "*.{js,ts}")
+        assert not matches_any_pattern("app.py", ["*.ts", "*.js"])
+
+
+class TestVariablePatterns:
+    """Tests for variable pattern matching and extraction (PM-1.2.x)."""
+
+    def test_single_var_path(self) -> None:
+        """PM-1.2.1: Single variable captures nested path."""
+        result = match_pattern("src/{path}.py", "src/foo/bar.py")
+        assert result.matched
+        assert result.variables == {"path": "foo/bar"}
+
+    def test_single_var_name(self) -> None:
+        """PM-1.2.2: Single variable name (non-path)."""
+        result = match_pattern("src/{name}.py", "src/utils.py")
+        assert result.matched
+        assert result.variables == {"name": "utils"}
+
+    def test_name_no_nested(self) -> None:
+        """PM-1.2.3: {name} doesn't match nested paths (single segment)."""
+        result = match_pattern("src/{name}.py", "src/foo/bar.py")
+        # {name} only captures single segment, not nested paths
+        assert not result.matched
+
+    def test_two_variables(self) -> None:
+        """PM-1.2.4: Two variables in pattern."""
+        result = match_pattern("{dir}/{name}.py", "src/main.py")
+        assert result.matched
+        assert result.variables == {"dir": "src", "name": "main"}
+
+    def test_prefix_and_suffix(self) -> None:
+        """PM-1.2.5: Prefix and suffix around variable."""
+        result = match_pattern("test_{name}_test.py", "test_foo_test.py")
+        assert result.matched
+        assert result.variables == {"name": "foo"}
+
+    def
test_nested_path_variable(self) -> None:
+        """PM-1.2.6: Nested path in middle."""
+        result = match_pattern("src/{path}/index.py", "src/a/b/index.py")
+        assert result.matched
+        assert result.variables == {"path": "a/b"}
+
+    def test_explicit_multi_segment(self) -> None:
+        """PM-1.2.7: Explicit {**mod} for multi-segment."""
+        result = match_pattern("src/{**mod}/main.py", "src/a/b/c/main.py")
+        assert result.matched
+        assert result.variables == {"mod": "a/b/c"}
+
+    def test_explicit_single_segment(self) -> None:
+        """PM-1.2.8: Explicit {*name} for single segment."""
+        result = match_pattern("src/{*name}.py", "src/utils.py")
+        assert result.matched
+        assert result.variables == {"name": "utils"}
+
+    def test_mixed_explicit(self) -> None:
+        """PM-1.2.9: Mixed explicit single and multi."""
+        result = match_pattern("{*dir}/{**path}.py", "src/a/b/c.py")
+        assert result.matched
+        assert result.variables == {"dir": "src", "path": "a/b/c"}
+
+
+class TestPatternResolution:
+    """Tests for pattern resolution / substitution (PM-1.3.x)."""
+
+    def test_simple_substitution(self) -> None:
+        """PM-1.3.1: Simple variable substitution."""
+        result = resolve_pattern("tests/{path}_test.py", {"path": "foo"})
+        assert result == "tests/foo_test.py"
+
+    def test_nested_path_substitution(self) -> None:
+        """PM-1.3.2: Nested path substitution."""
+        result = resolve_pattern("tests/{path}_test.py", {"path": "a/b/c"})
+        assert result == "tests/a/b/c_test.py"
+
+    def test_multiple_vars_substitution(self) -> None:
+        """PM-1.3.3: Multiple variables substitution."""
+        result = resolve_pattern("{dir}/test_{name}.py", {"dir": "tests", "name": "foo"})
+        assert result == "tests/test_foo.py"
+
+
+class TestPatternValidation:
+    """Tests for pattern syntax validation (SV-8.3.x)."""
+
+    def test_unclosed_brace(self) -> None:
+        """SV-8.3.1: Unclosed brace."""
+        with pytest.raises(PatternError, match="Unclosed brace|unclosed brace"):
+            validate_pattern("src/{path.py")
+
+    def test_empty_variable(self) -> None:
+        """SV-8.3.2: Empty variable name."""
+        with pytest.raises(PatternError, match="[Ee]mpty variable name"):
+            validate_pattern("src/{}.py")
+
+    def test_invalid_chars_in_var(self) -> None:
+        """SV-8.3.3: Invalid characters in variable name."""
+        with pytest.raises(PatternError, match="[Ii]nvalid"):
+            validate_pattern("src/{path/name}.py")
+
+    def test_duplicate_variable(self) -> None:
+        """SV-8.3.4: Duplicate variable name."""
+        with pytest.raises(PatternError, match="[Dd]uplicate"):
+            validate_pattern("{path}/{path}.py")
+
+
+class TestMatchesAnyPattern:
+    """Tests for matches_any_pattern function."""
+
+    def test_matches_first_pattern(self) -> None:
+        """Match against first of multiple patterns."""
+        assert matches_any_pattern("file.py", ["*.py", "*.js"])
+
+    def test_matches_second_pattern(self) -> None:
+        """Match against second of multiple patterns."""
+        assert matches_any_pattern("file.js", ["*.py", "*.js"])
+
+    def test_no_match(self) -> None:
+        """No match in any pattern."""
+        assert not matches_any_pattern("file.txt", ["*.py", "*.js"])
+
+    def test_empty_patterns(self) -> None:
+        """Empty patterns list never matches."""
+        assert not matches_any_pattern("file.py", [])
diff --git a/tests/unit/test_policy_parser.py b/tests/unit/test_policy_parser.py
deleted file mode 100644
index 80eedbb1..00000000
--- a/tests/unit/test_policy_parser.py
+++ /dev/null
@@ -1,404 +0,0 @@
-"""Tests for policy definition parser."""
-
-from pathlib import Path
-
-import pytest
-
-from deepwork.core.policy_parser import (
-    DEFAULT_COMPARE_TO,
-    Policy,
-    PolicyParseError,
-    evaluate_policies,
-    evaluate_policy,
-    matches_pattern,
-    parse_policy_file,
-)
-
-
-class TestPolicy:
-    """Tests for Policy dataclass."""
-
-    def test_from_dict_with_inline_instructions(self) -> None:
-        """Test creating policy from dict with inline instructions."""
-        data = {
-            "name": "Test Policy",
-            "trigger": "src/**/*",
-            "safety": "docs/readme.md",
-            "instructions": "Do something",
-        }
-        policy =
Policy.from_dict(data)
-
-        assert policy.name == "Test Policy"
-        assert policy.triggers == ["src/**/*"]
-        assert policy.safety == ["docs/readme.md"]
-        assert policy.instructions == "Do something"
-
-    def test_from_dict_normalizes_trigger_string_to_list(self) -> None:
-        """Test that trigger string is normalized to list."""
-        data = {
-            "name": "Test",
-            "trigger": "*.py",
-            "instructions": "Check it",
-        }
-        policy = Policy.from_dict(data)
-
-        assert policy.triggers == ["*.py"]
-
-    def test_from_dict_preserves_trigger_list(self) -> None:
-        """Test that trigger list is preserved."""
-        data = {
-            "name": "Test",
-            "trigger": ["*.py", "*.js"],
-            "instructions": "Check it",
-        }
-        policy = Policy.from_dict(data)
-
-        assert policy.triggers == ["*.py", "*.js"]
-
-    def test_from_dict_normalizes_safety_string_to_list(self) -> None:
-        """Test that safety string is normalized to list."""
-        data = {
-            "name": "Test",
-            "trigger": "src/*",
-            "safety": "docs/README.md",
-            "instructions": "Check it",
-        }
-        policy = Policy.from_dict(data)
-
-        assert policy.safety == ["docs/README.md"]
-
-    def test_from_dict_safety_defaults_to_empty_list(self) -> None:
-        """Test that missing safety defaults to empty list."""
-        data = {
-            "name": "Test",
-            "trigger": "src/*",
-            "instructions": "Check it",
-        }
-        policy = Policy.from_dict(data)
-
-        assert policy.safety == []
-
-    def test_from_dict_with_instructions_file(self, temp_dir: Path) -> None:
-        """Test creating policy from dict with instructions_file."""
-        # Create instructions file
-        instructions_file = temp_dir / "instructions.md"
-        instructions_file.write_text("# Instructions\nDo this and that.")
-
-        data = {
-            "name": "Test Policy",
-            "trigger": "src/*",
-            "instructions_file": "instructions.md",
-        }
-        policy = Policy.from_dict(data, base_dir=temp_dir)
-
-        assert policy.instructions == "# Instructions\nDo this and that."
-
-    def test_from_dict_instructions_file_not_found(self, temp_dir: Path) -> None:
-        """Test error when instructions_file doesn't exist."""
-        data = {
-            "name": "Test Policy",
-            "trigger": "src/*",
-            "instructions_file": "nonexistent.md",
-        }
-
-        with pytest.raises(PolicyParseError, match="instructions file not found"):
-            Policy.from_dict(data, base_dir=temp_dir)
-
-    def test_from_dict_instructions_file_without_base_dir(self) -> None:
-        """Test error when instructions_file used without base_dir."""
-        data = {
-            "name": "Test Policy",
-            "trigger": "src/*",
-            "instructions_file": "instructions.md",
-        }
-
-        with pytest.raises(PolicyParseError, match="no base_dir provided"):
-            Policy.from_dict(data, base_dir=None)
-
-    def test_from_dict_compare_to_defaults_to_base(self) -> None:
-        """Test that compare_to defaults to 'base'."""
-        data = {
-            "name": "Test",
-            "trigger": "src/*",
-            "instructions": "Check it",
-        }
-        policy = Policy.from_dict(data)
-
-        assert policy.compare_to == DEFAULT_COMPARE_TO
-        assert policy.compare_to == "base"
-
-    def test_from_dict_compare_to_explicit_base(self) -> None:
-        """Test explicit compare_to: base."""
-        data = {
-            "name": "Test",
-            "trigger": "src/*",
-            "instructions": "Check it",
-            "compare_to": "base",
-        }
-        policy = Policy.from_dict(data)
-
-        assert policy.compare_to == "base"
-
-    def test_from_dict_compare_to_default_tip(self) -> None:
-        """Test compare_to: default_tip."""
-        data = {
-            "name": "Test",
-            "trigger": "src/*",
-            "instructions": "Check it",
-            "compare_to": "default_tip",
-        }
-        policy = Policy.from_dict(data)
-
-        assert policy.compare_to == "default_tip"
-
-    def test_from_dict_compare_to_prompt(self) -> None:
-        """Test compare_to: prompt."""
-        data = {
-            "name": "Test",
-            "trigger": "src/*",
-            "instructions": "Check it",
-            "compare_to": "prompt",
-        }
-        policy = Policy.from_dict(data)
-
-        assert policy.compare_to == "prompt"
-
-
-class TestMatchesPattern:
-    """Tests for matches_pattern function."""
-
-    def
test_simple_glob_match(self) -> None: - """Test simple glob pattern matching.""" - assert matches_pattern("file.py", ["*.py"]) - assert not matches_pattern("file.js", ["*.py"]) - - def test_directory_glob_match(self) -> None: - """Test directory pattern matching.""" - assert matches_pattern("src/file.py", ["src/*"]) - assert not matches_pattern("test/file.py", ["src/*"]) - - def test_recursive_glob_match(self) -> None: - """Test recursive ** pattern matching.""" - assert matches_pattern("src/deep/nested/file.py", ["src/**/*.py"]) - assert matches_pattern("src/file.py", ["src/**/*.py"]) - assert not matches_pattern("test/file.py", ["src/**/*.py"]) - - def test_multiple_patterns(self) -> None: - """Test matching against multiple patterns.""" - patterns = ["*.py", "*.js"] - assert matches_pattern("file.py", patterns) - assert matches_pattern("file.js", patterns) - assert not matches_pattern("file.txt", patterns) - - def test_config_directory_pattern(self) -> None: - """Test pattern like app/config/**/*.""" - assert matches_pattern("app/config/settings.py", ["app/config/**/*"]) - assert matches_pattern("app/config/nested/deep.yml", ["app/config/**/*"]) - assert not matches_pattern("app/other/file.py", ["app/config/**/*"]) - - -class TestEvaluatePolicy: - """Tests for evaluate_policy function.""" - - def test_fires_when_trigger_matches(self) -> None: - """Test policy fires when trigger matches.""" - policy = Policy( - name="Test", - triggers=["src/**/*.py"], - safety=[], - instructions="Check it", - ) - changed_files = ["src/main.py", "README.md"] - - assert evaluate_policy(policy, changed_files) is True - - def test_does_not_fire_when_no_trigger_match(self) -> None: - """Test policy doesn't fire when no trigger matches.""" - policy = Policy( - name="Test", - triggers=["src/**/*.py"], - safety=[], - instructions="Check it", - ) - changed_files = ["test/main.py", "README.md"] - - assert evaluate_policy(policy, changed_files) is False - - def 
test_does_not_fire_when_safety_matches(self) -> None: - """Test policy doesn't fire when safety file is also changed.""" - policy = Policy( - name="Test", - triggers=["app/config/**/*"], - safety=["docs/install_guide.md"], - instructions="Update docs", - ) - changed_files = ["app/config/settings.py", "docs/install_guide.md"] - - assert evaluate_policy(policy, changed_files) is False - - def test_fires_when_trigger_matches_but_safety_doesnt(self) -> None: - """Test policy fires when trigger matches but safety doesn't.""" - policy = Policy( - name="Test", - triggers=["app/config/**/*"], - safety=["docs/install_guide.md"], - instructions="Update docs", - ) - changed_files = ["app/config/settings.py", "app/main.py"] - - assert evaluate_policy(policy, changed_files) is True - - def test_multiple_safety_patterns(self) -> None: - """Test policy with multiple safety patterns.""" - policy = Policy( - name="Test", - triggers=["src/auth/**/*"], - safety=["SECURITY.md", "docs/security_review.md"], - instructions="Security review", - ) - - # Should not fire if any safety file is changed - assert evaluate_policy(policy, ["src/auth/login.py", "SECURITY.md"]) is False - assert evaluate_policy(policy, ["src/auth/login.py", "docs/security_review.md"]) is False - - # Should fire if no safety files changed - assert evaluate_policy(policy, ["src/auth/login.py"]) is True - - -class TestEvaluatePolicies: - """Tests for evaluate_policies function.""" - - def test_returns_fired_policies(self) -> None: - """Test that evaluate_policies returns all fired policies.""" - policies = [ - Policy( - name="Policy 1", - triggers=["src/**/*"], - safety=[], - instructions="Do 1", - ), - Policy( - name="Policy 2", - triggers=["test/**/*"], - safety=[], - instructions="Do 2", - ), - ] - changed_files = ["src/main.py", "test/test_main.py"] - - fired = evaluate_policies(policies, changed_files) - - assert len(fired) == 2 - assert fired[0].name == "Policy 1" - assert fired[1].name == "Policy 2" - - def 
test_skips_promised_policies(self) -> None: - """Test that promised policies are skipped.""" - policies = [ - Policy( - name="Policy 1", - triggers=["src/**/*"], - safety=[], - instructions="Do 1", - ), - Policy( - name="Policy 2", - triggers=["src/**/*"], - safety=[], - instructions="Do 2", - ), - ] - changed_files = ["src/main.py"] - promised = {"Policy 1"} - - fired = evaluate_policies(policies, changed_files, promised) - - assert len(fired) == 1 - assert fired[0].name == "Policy 2" - - def test_returns_empty_when_no_policies_fire(self) -> None: - """Test returns empty list when no policies fire.""" - policies = [ - Policy( - name="Policy 1", - triggers=["src/**/*"], - safety=[], - instructions="Do 1", - ), - ] - changed_files = ["test/test_main.py"] - - fired = evaluate_policies(policies, changed_files) - - assert len(fired) == 0 - - -class TestParsePolicyFile: - """Tests for parse_policy_file function.""" - - def test_parses_valid_policy_file(self, fixtures_dir: Path) -> None: - """Test parsing a valid policy file.""" - policy_file = fixtures_dir / "policies" / "valid_policy.yml" - policies = parse_policy_file(policy_file) - - assert len(policies) == 1 - assert policies[0].name == "Update install guide on config changes" - assert policies[0].triggers == ["app/config/**/*"] - assert policies[0].safety == ["docs/install_guide.md"] - assert "Configuration files have changed" in policies[0].instructions - - def test_parses_multiple_policies(self, fixtures_dir: Path) -> None: - """Test parsing a file with multiple policies.""" - policy_file = fixtures_dir / "policies" / "multiple_policies.yml" - policies = parse_policy_file(policy_file) - - assert len(policies) == 3 - assert policies[0].name == "Update install guide on config changes" - assert policies[1].name == "Security review for auth changes" - assert policies[2].name == "API documentation update" - - # Check that arrays are parsed correctly - assert policies[1].triggers == ["src/auth/**/*", 
"src/security/**/*"] - assert policies[1].safety == ["SECURITY.md", "docs/security_review.md"] - - def test_parses_policy_with_instructions_file(self, fixtures_dir: Path) -> None: - """Test parsing a policy with instructions_file.""" - policy_file = fixtures_dir / "policies" / "policy_with_instructions_file.yml" - policies = parse_policy_file(policy_file) - - assert len(policies) == 1 - assert "Security Review Required" in policies[0].instructions - assert "hardcoded credentials" in policies[0].instructions - - def test_empty_policy_file_returns_empty_list(self, fixtures_dir: Path) -> None: - """Test that empty policy file returns empty list.""" - policy_file = fixtures_dir / "policies" / "empty_policy.yml" - policies = parse_policy_file(policy_file) - - assert policies == [] - - def test_raises_for_missing_trigger(self, fixtures_dir: Path) -> None: - """Test error when policy is missing trigger.""" - policy_file = fixtures_dir / "policies" / "invalid_missing_trigger.yml" - - with pytest.raises(PolicyParseError, match="validation failed"): - parse_policy_file(policy_file) - - def test_raises_for_missing_instructions(self, fixtures_dir: Path) -> None: - """Test error when policy is missing both instructions and instructions_file.""" - policy_file = fixtures_dir / "policies" / "invalid_missing_instructions.yml" - - with pytest.raises(PolicyParseError, match="validation failed"): - parse_policy_file(policy_file) - - def test_raises_for_nonexistent_file(self, temp_dir: Path) -> None: - """Test error when policy file doesn't exist.""" - policy_file = temp_dir / "nonexistent.yml" - - with pytest.raises(PolicyParseError, match="does not exist"): - parse_policy_file(policy_file) - - def test_raises_for_directory_path(self, temp_dir: Path) -> None: - """Test error when path is a directory.""" - with pytest.raises(PolicyParseError, match="is not a file"): - parse_policy_file(temp_dir) diff --git a/tests/unit/test_rules_parser.py b/tests/unit/test_rules_parser.py new file 
mode 100644 index 00000000..4aedea67 --- /dev/null +++ b/tests/unit/test_rules_parser.py @@ -0,0 +1,733 @@ +"""Tests for rule definition parser.""" + +from pathlib import Path + +from deepwork.core.pattern_matcher import matches_any_pattern as matches_pattern +from deepwork.core.rules_parser import ( + DetectionMode, + PairConfig, + Rule, + evaluate_rule, + evaluate_rules, + load_rules_from_directory, +) + + +class TestMatchesPattern: + """Tests for matches_pattern function.""" + + def test_simple_glob_match(self) -> None: + """Test simple glob pattern matching.""" + assert matches_pattern("file.py", ["*.py"]) + assert not matches_pattern("file.js", ["*.py"]) + + def test_directory_glob_match(self) -> None: + """Test directory pattern matching.""" + assert matches_pattern("src/file.py", ["src/*"]) + assert not matches_pattern("test/file.py", ["src/*"]) + + def test_recursive_glob_match(self) -> None: + """Test recursive ** pattern matching.""" + assert matches_pattern("src/deep/nested/file.py", ["src/**/*.py"]) + assert matches_pattern("src/file.py", ["src/**/*.py"]) + assert not matches_pattern("test/file.py", ["src/**/*.py"]) + + def test_multiple_patterns(self) -> None: + """Test matching against multiple patterns.""" + patterns = ["*.py", "*.js"] + assert matches_pattern("file.py", patterns) + assert matches_pattern("file.js", patterns) + assert not matches_pattern("file.txt", patterns) + + def test_config_directory_pattern(self) -> None: + """Test pattern like app/config/**/*.""" + assert matches_pattern("app/config/settings.py", ["app/config/**/*"]) + assert matches_pattern("app/config/nested/deep.yml", ["app/config/**/*"]) + assert not matches_pattern("app/other/file.py", ["app/config/**/*"]) + + +class TestEvaluateRule: + """Tests for evaluate_rule function.""" + + def test_fires_when_trigger_matches(self) -> None: + """Test rule fires when trigger matches.""" + rule = Rule( + name="Test", + filename="test", + detection_mode=DetectionMode.TRIGGER_SAFETY, + 
triggers=["src/**/*.py"], + safety=[], + instructions="Check it", + ) + changed_files = ["src/main.py", "README.md"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + + def test_does_not_fire_when_no_trigger_match(self) -> None: + """Test rule doesn't fire when no trigger matches.""" + rule = Rule( + name="Test", + filename="test", + detection_mode=DetectionMode.TRIGGER_SAFETY, + triggers=["src/**/*.py"], + safety=[], + instructions="Check it", + ) + changed_files = ["test/main.py", "README.md"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + def test_does_not_fire_when_safety_matches(self) -> None: + """Test rule doesn't fire when safety file is also changed.""" + rule = Rule( + name="Test", + filename="test", + detection_mode=DetectionMode.TRIGGER_SAFETY, + triggers=["app/config/**/*"], + safety=["docs/install_guide.md"], + instructions="Update docs", + ) + changed_files = ["app/config/settings.py", "docs/install_guide.md"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + def test_fires_when_trigger_matches_but_safety_doesnt(self) -> None: + """Test rule fires when trigger matches but safety doesn't.""" + rule = Rule( + name="Test", + filename="test", + detection_mode=DetectionMode.TRIGGER_SAFETY, + triggers=["app/config/**/*"], + safety=["docs/install_guide.md"], + instructions="Update docs", + ) + changed_files = ["app/config/settings.py", "app/main.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + + def test_multiple_safety_patterns(self) -> None: + """Test rule with multiple safety patterns.""" + rule = Rule( + name="Test", + filename="test", + detection_mode=DetectionMode.TRIGGER_SAFETY, + triggers=["src/auth/**/*"], + safety=["SECURITY.md", "docs/security_review.md"], + instructions="Security review", + ) + + # Should not fire if any safety file is changed + result1 = evaluate_rule(rule, 
["src/auth/login.py", "SECURITY.md"]) + assert result1.should_fire is False + result2 = evaluate_rule(rule, ["src/auth/login.py", "docs/security_review.md"]) + assert result2.should_fire is False + + # Should fire if no safety files changed + result3 = evaluate_rule(rule, ["src/auth/login.py"]) + assert result3.should_fire is True + + +class TestEvaluateRules: + """Tests for evaluate_rules function.""" + + def test_returns_fired_rules(self) -> None: + """Test that evaluate_rules returns all fired rules.""" + rules = [ + Rule( + name="Rule 1", + filename="rule1", + detection_mode=DetectionMode.TRIGGER_SAFETY, + triggers=["src/**/*"], + safety=[], + instructions="Do 1", + ), + Rule( + name="Rule 2", + filename="rule2", + detection_mode=DetectionMode.TRIGGER_SAFETY, + triggers=["test/**/*"], + safety=[], + instructions="Do 2", + ), + ] + changed_files = ["src/main.py", "test/test_main.py"] + + fired = evaluate_rules(rules, changed_files) + + assert len(fired) == 2 + assert fired[0].rule.name == "Rule 1" + assert fired[1].rule.name == "Rule 2" + + def test_skips_promised_rules(self) -> None: + """Test that promised rules are skipped.""" + rules = [ + Rule( + name="Rule 1", + filename="rule1", + detection_mode=DetectionMode.TRIGGER_SAFETY, + triggers=["src/**/*"], + safety=[], + instructions="Do 1", + ), + Rule( + name="Rule 2", + filename="rule2", + detection_mode=DetectionMode.TRIGGER_SAFETY, + triggers=["src/**/*"], + safety=[], + instructions="Do 2", + ), + ] + changed_files = ["src/main.py"] + promised = {"Rule 1"} + + fired = evaluate_rules(rules, changed_files, promised) + + assert len(fired) == 1 + assert fired[0].rule.name == "Rule 2" + + def test_returns_empty_when_no_rules_fire(self) -> None: + """Test returns empty list when no rules fire.""" + rules = [ + Rule( + name="Rule 1", + filename="rule1", + detection_mode=DetectionMode.TRIGGER_SAFETY, + triggers=["src/**/*"], + safety=[], + instructions="Do 1", + ), + ] + changed_files = ["test/test_main.py"] + + 
fired = evaluate_rules(rules, changed_files) + + assert len(fired) == 0 + + +class TestLoadRulesFromDirectory: + """Tests for load_rules_from_directory function.""" + + def test_loads_rules_from_directory(self, temp_dir: Path) -> None: + """Test loading rules from a directory.""" + rules_dir = temp_dir / "rules" + rules_dir.mkdir() + + # Create a rule file + rule_file = rules_dir / "test-rule.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: "src/**/*" +--- +Please check the source files. +""" + ) + + rules = load_rules_from_directory(rules_dir) + + assert len(rules) == 1 + assert rules[0].name == "Test Rule" + assert rules[0].triggers == ["src/**/*"] + assert rules[0].detection_mode == DetectionMode.TRIGGER_SAFETY + assert "check the source files" in rules[0].instructions + + def test_loads_multiple_rules(self, temp_dir: Path) -> None: + """Test loading multiple rules.""" + rules_dir = temp_dir / "rules" + rules_dir.mkdir() + + # Create rule files + (rules_dir / "rule1.md").write_text( + """--- +name: Rule 1 +trigger: "src/**/*" +--- +Instructions for rule 1. +""" + ) + (rules_dir / "rule2.md").write_text( + """--- +name: Rule 2 +trigger: "test/**/*" +--- +Instructions for rule 2. 
+""" + ) + + rules = load_rules_from_directory(rules_dir) + + assert len(rules) == 2 + names = {r.name for r in rules} + assert names == {"Rule 1", "Rule 2"} + + def test_returns_empty_for_empty_directory(self, temp_dir: Path) -> None: + """Test that empty directory returns empty list.""" + rules_dir = temp_dir / "rules" + rules_dir.mkdir() + + rules = load_rules_from_directory(rules_dir) + + assert rules == [] + + def test_returns_empty_for_nonexistent_directory(self, temp_dir: Path) -> None: + """Test that nonexistent directory returns empty list.""" + rules_dir = temp_dir / "nonexistent" + + rules = load_rules_from_directory(rules_dir) + + assert rules == [] + + def test_loads_rule_with_set_detection_mode(self, temp_dir: Path) -> None: + """Test loading a rule with set detection mode.""" + rules_dir = temp_dir / "rules" + rules_dir.mkdir() + + rule_file = rules_dir / "source-test-pairing.md" + rule_file.write_text( + """--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test files should change together. +""" + ) + + rules = load_rules_from_directory(rules_dir) + + assert len(rules) == 1 + assert rules[0].name == "Source/Test Pairing" + assert rules[0].detection_mode == DetectionMode.SET + assert rules[0].set_patterns == ["src/{path}.py", "tests/{path}_test.py"] + + def test_loads_rule_with_pair_detection_mode(self, temp_dir: Path) -> None: + """Test loading a rule with pair detection mode.""" + rules_dir = temp_dir / "rules" + rules_dir.mkdir() + + rule_file = rules_dir / "api-docs.md" + rule_file.write_text( + """--- +name: API Documentation +pair: + trigger: src/api/{name}.py + expects: docs/api/{name}.md +--- +API code requires documentation. 
+""" + ) + + rules = load_rules_from_directory(rules_dir) + + assert len(rules) == 1 + assert rules[0].name == "API Documentation" + assert rules[0].detection_mode == DetectionMode.PAIR + assert rules[0].pair_config is not None + assert rules[0].pair_config.trigger == "src/api/{name}.py" + assert rules[0].pair_config.expects == ["docs/api/{name}.md"] + + def test_loads_rule_with_command_action(self, temp_dir: Path) -> None: + """Test loading a rule with command action.""" + rules_dir = temp_dir / "rules" + rules_dir.mkdir() + + rule_file = rules_dir / "format-python.md" + rule_file.write_text( + """--- +name: Format Python +trigger: "**/*.py" +action: + command: "ruff format {file}" + run_for: each_match +--- +""" + ) + + rules = load_rules_from_directory(rules_dir) + + assert len(rules) == 1 + assert rules[0].name == "Format Python" + from deepwork.core.rules_parser import ActionType + + assert rules[0].action_type == ActionType.COMMAND + assert rules[0].command_action is not None + assert rules[0].command_action.command == "ruff format {file}" + assert rules[0].command_action.run_for == "each_match" + + +class TestCorrespondenceSets: + """Tests for set correspondence evaluation (CS-3.x from test_scenarios.md).""" + + def test_both_changed_no_fire(self) -> None: + """CS-3.1.1: Both source and test changed - no fire.""" + rule = Rule( + name="Source/Test Pairing", + filename="source-test-pairing", + detection_mode=DetectionMode.SET, + set_patterns=["src/{path}.py", "tests/{path}_test.py"], + instructions="Update tests", + ) + changed_files = ["src/foo.py", "tests/foo_test.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + def test_only_source_fires(self) -> None: + """CS-3.1.2: Only source changed - fires.""" + rule = Rule( + name="Source/Test Pairing", + filename="source-test-pairing", + detection_mode=DetectionMode.SET, + set_patterns=["src/{path}.py", "tests/{path}_test.py"], + instructions="Update tests", + ) + 
changed_files = ["src/foo.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + assert "src/foo.py" in result.trigger_files + assert "tests/foo_test.py" in result.missing_files + + def test_only_test_fires(self) -> None: + """CS-3.1.3: Only test changed - fires.""" + rule = Rule( + name="Source/Test Pairing", + filename="source-test-pairing", + detection_mode=DetectionMode.SET, + set_patterns=["src/{path}.py", "tests/{path}_test.py"], + instructions="Update source", + ) + changed_files = ["tests/foo_test.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + assert "tests/foo_test.py" in result.trigger_files + assert "src/foo.py" in result.missing_files + + def test_nested_both_no_fire(self) -> None: + """CS-3.1.4: Nested paths - both changed.""" + rule = Rule( + name="Source/Test Pairing", + filename="source-test-pairing", + detection_mode=DetectionMode.SET, + set_patterns=["src/{path}.py", "tests/{path}_test.py"], + instructions="Update tests", + ) + changed_files = ["src/a/b.py", "tests/a/b_test.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + def test_nested_only_source_fires(self) -> None: + """CS-3.1.5: Nested paths - only source.""" + rule = Rule( + name="Source/Test Pairing", + filename="source-test-pairing", + detection_mode=DetectionMode.SET, + set_patterns=["src/{path}.py", "tests/{path}_test.py"], + instructions="Update tests", + ) + changed_files = ["src/a/b.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + assert "tests/a/b_test.py" in result.missing_files + + def test_unrelated_file_no_fire(self) -> None: + """CS-3.1.6: Unrelated file - no fire.""" + rule = Rule( + name="Source/Test Pairing", + filename="source-test-pairing", + detection_mode=DetectionMode.SET, + set_patterns=["src/{path}.py", "tests/{path}_test.py"], + instructions="Update tests", + ) + changed_files = ["docs/readme.md"] + + 
result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + def test_source_plus_unrelated_fires(self) -> None: + """CS-3.1.7: Source + unrelated - fires.""" + rule = Rule( + name="Source/Test Pairing", + filename="source-test-pairing", + detection_mode=DetectionMode.SET, + set_patterns=["src/{path}.py", "tests/{path}_test.py"], + instructions="Update tests", + ) + changed_files = ["src/foo.py", "docs/readme.md"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + + def test_both_plus_unrelated_no_fire(self) -> None: + """CS-3.1.8: Both + unrelated - no fire.""" + rule = Rule( + name="Source/Test Pairing", + filename="source-test-pairing", + detection_mode=DetectionMode.SET, + set_patterns=["src/{path}.py", "tests/{path}_test.py"], + instructions="Update tests", + ) + changed_files = ["src/foo.py", "tests/foo_test.py", "docs/readme.md"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + +class TestThreePatternSets: + """Tests for three-pattern set correspondence (CS-3.2.x).""" + + def test_all_three_no_fire(self) -> None: + """CS-3.2.1: All three files changed - no fire.""" + rule = Rule( + name="Model/Schema/Migration", + filename="model-schema-migration", + detection_mode=DetectionMode.SET, + set_patterns=[ + "models/{name}.py", + "schemas/{name}.py", + "migrations/{name}.sql", + ], + instructions="Update all related files", + ) + changed_files = ["models/user.py", "schemas/user.py", "migrations/user.sql"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + def test_two_of_three_fires(self) -> None: + """CS-3.2.2: Two of three - fires (missing migration).""" + rule = Rule( + name="Model/Schema/Migration", + filename="model-schema-migration", + detection_mode=DetectionMode.SET, + set_patterns=[ + "models/{name}.py", + "schemas/{name}.py", + "migrations/{name}.sql", + ], + instructions="Update all related files", + ) + changed_files 
= ["models/user.py", "schemas/user.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + assert "migrations/user.sql" in result.missing_files + + def test_one_of_three_fires(self) -> None: + """CS-3.2.3: One of three - fires (missing 2).""" + rule = Rule( + name="Model/Schema/Migration", + filename="model-schema-migration", + detection_mode=DetectionMode.SET, + set_patterns=[ + "models/{name}.py", + "schemas/{name}.py", + "migrations/{name}.sql", + ], + instructions="Update all related files", + ) + changed_files = ["models/user.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + assert len(result.missing_files) == 2 + assert "schemas/user.py" in result.missing_files + assert "migrations/user.sql" in result.missing_files + + def test_different_names_fire_both(self) -> None: + """CS-3.2.4: Different names - both incomplete.""" + rule = Rule( + name="Model/Schema/Migration", + filename="model-schema-migration", + detection_mode=DetectionMode.SET, + set_patterns=[ + "models/{name}.py", + "schemas/{name}.py", + "migrations/{name}.sql", + ], + instructions="Update all related files", + ) + changed_files = ["models/user.py", "schemas/order.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + # Both trigger because each is incomplete + assert ( + "models/user.py" in result.trigger_files or "schemas/order.py" in result.trigger_files + ) + + +class TestCorrespondencePairs: + """Tests for pair correspondence evaluation (CP-4.x from test_scenarios.md).""" + + def test_both_changed_no_fire(self) -> None: + """CP-4.1.1: Both trigger and expected changed - no fire.""" + rule = Rule( + name="API Documentation", + filename="api-documentation", + detection_mode=DetectionMode.PAIR, + pair_config=PairConfig( + trigger="api/{path}.py", + expects=["docs/api/{path}.md"], + ), + instructions="Update API docs", + ) + changed_files = ["api/users.py", "docs/api/users.md"] + + 
result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + def test_only_trigger_fires(self) -> None: + """CP-4.1.2: Only trigger changed - fires.""" + rule = Rule( + name="API Documentation", + filename="api-documentation", + detection_mode=DetectionMode.PAIR, + pair_config=PairConfig( + trigger="api/{path}.py", + expects=["docs/api/{path}.md"], + ), + instructions="Update API docs", + ) + changed_files = ["api/users.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + assert "api/users.py" in result.trigger_files + assert "docs/api/users.md" in result.missing_files + + def test_only_expected_no_fire(self) -> None: + """CP-4.1.3: Only expected changed - no fire (directional).""" + rule = Rule( + name="API Documentation", + filename="api-documentation", + detection_mode=DetectionMode.PAIR, + pair_config=PairConfig( + trigger="api/{path}.py", + expects=["docs/api/{path}.md"], + ), + instructions="Update API docs", + ) + changed_files = ["docs/api/users.md"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + def test_trigger_plus_unrelated_fires(self) -> None: + """CP-4.1.4: Trigger + unrelated - fires.""" + rule = Rule( + name="API Documentation", + filename="api-documentation", + detection_mode=DetectionMode.PAIR, + pair_config=PairConfig( + trigger="api/{path}.py", + expects=["docs/api/{path}.md"], + ), + instructions="Update API docs", + ) + changed_files = ["api/users.py", "README.md"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + + def test_expected_plus_unrelated_no_fire(self) -> None: + """CP-4.1.5: Expected + unrelated - no fire.""" + rule = Rule( + name="API Documentation", + filename="api-documentation", + detection_mode=DetectionMode.PAIR, + pair_config=PairConfig( + trigger="api/{path}.py", + expects=["docs/api/{path}.md"], + ), + instructions="Update API docs", + ) + changed_files = ["docs/api/users.md", 
"README.md"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + +class TestMultiExpectsPairs: + """Tests for multi-expects pair correspondence (CP-4.2.x).""" + + def test_all_three_no_fire(self) -> None: + """CP-4.2.1: All three changed - no fire.""" + rule = Rule( + name="API Full Documentation", + filename="api-full-documentation", + detection_mode=DetectionMode.PAIR, + pair_config=PairConfig( + trigger="api/{path}.py", + expects=["docs/api/{path}.md", "openapi/{path}.yaml"], + ), + instructions="Update API docs and OpenAPI", + ) + changed_files = ["api/users.py", "docs/api/users.md", "openapi/users.yaml"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False + + def test_trigger_plus_one_expect_fires(self) -> None: + """CP-4.2.2: Trigger + one expect - fires (missing openapi).""" + rule = Rule( + name="API Full Documentation", + filename="api-full-documentation", + detection_mode=DetectionMode.PAIR, + pair_config=PairConfig( + trigger="api/{path}.py", + expects=["docs/api/{path}.md", "openapi/{path}.yaml"], + ), + instructions="Update API docs and OpenAPI", + ) + changed_files = ["api/users.py", "docs/api/users.md"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + assert "openapi/users.yaml" in result.missing_files + + def test_only_trigger_fires_missing_both(self) -> None: + """CP-4.2.3: Only trigger - fires (missing both).""" + rule = Rule( + name="API Full Documentation", + filename="api-full-documentation", + detection_mode=DetectionMode.PAIR, + pair_config=PairConfig( + trigger="api/{path}.py", + expects=["docs/api/{path}.md", "openapi/{path}.yaml"], + ), + instructions="Update API docs and OpenAPI", + ) + changed_files = ["api/users.py"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is True + assert len(result.missing_files) == 2 + assert "docs/api/users.md" in result.missing_files + assert "openapi/users.yaml" in 
result.missing_files + + def test_both_expects_only_no_fire(self) -> None: + """CP-4.2.4: Both expects only - no fire.""" + rule = Rule( + name="API Full Documentation", + filename="api-full-documentation", + detection_mode=DetectionMode.PAIR, + pair_config=PairConfig( + trigger="api/{path}.py", + expects=["docs/api/{path}.md", "openapi/{path}.yaml"], + ), + instructions="Update API docs and OpenAPI", + ) + changed_files = ["docs/api/users.md", "openapi/users.yaml"] + + result = evaluate_rule(rule, changed_files) + assert result.should_fire is False diff --git a/tests/unit/test_rules_queue.py b/tests/unit/test_rules_queue.py new file mode 100644 index 00000000..8c35d06d --- /dev/null +++ b/tests/unit/test_rules_queue.py @@ -0,0 +1,349 @@ +"""Tests for rules queue system (QS-6.x from test_scenarios.md).""" + +from pathlib import Path + +import pytest + +from deepwork.core.rules_queue import ( + ActionResult, + QueueEntry, + QueueEntryStatus, + RulesQueue, + compute_trigger_hash, +) + + +class TestComputeTriggerHash: + """Tests for hash calculation (QS-6.2.x).""" + + def test_same_everything_same_hash(self) -> None: + """QS-6.2.1: Same rule, files, baseline - same hash.""" + hash1 = compute_trigger_hash("RuleA", ["a.py"], "commit1") + hash2 = compute_trigger_hash("RuleA", ["a.py"], "commit1") + assert hash1 == hash2 + + def test_different_files_different_hash(self) -> None: + """QS-6.2.2: Different files - different hash.""" + hash1 = compute_trigger_hash("RuleA", ["a.py"], "commit1") + hash2 = compute_trigger_hash("RuleA", ["b.py"], "commit1") + assert hash1 != hash2 + + def test_different_baseline_different_hash(self) -> None: + """QS-6.2.3: Different baseline - different hash.""" + hash1 = compute_trigger_hash("RuleA", ["a.py"], "commit1") + hash2 = compute_trigger_hash("RuleA", ["a.py"], "commit2") + assert hash1 != hash2 + + def test_different_rule_different_hash(self) -> None: + """QS-6.2.4: Different rule - different hash.""" + hash1 = 
compute_trigger_hash("RuleA", ["a.py"], "commit1") + hash2 = compute_trigger_hash("RuleB", ["a.py"], "commit1") + assert hash1 != hash2 + + def test_file_order_independent(self) -> None: + """File order should not affect hash (sorted internally).""" + hash1 = compute_trigger_hash("RuleA", ["a.py", "b.py"], "commit1") + hash2 = compute_trigger_hash("RuleA", ["b.py", "a.py"], "commit1") + assert hash1 == hash2 + + +class TestQueueEntry: + """Tests for QueueEntry dataclass.""" + + def test_to_dict_and_from_dict(self) -> None: + """Round-trip serialization.""" + entry = QueueEntry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_hash="abc123", + status=QueueEntryStatus.QUEUED, + baseline_ref="commit1", + trigger_files=["src/main.py"], + expected_files=["tests/main_test.py"], + ) + + data = entry.to_dict() + restored = QueueEntry.from_dict(data) + + assert restored.rule_name == entry.rule_name + assert restored.rule_file == entry.rule_file + assert restored.trigger_hash == entry.trigger_hash + assert restored.status == entry.status + assert restored.trigger_files == entry.trigger_files + assert restored.expected_files == entry.expected_files + + def test_with_action_result(self) -> None: + """Serialization with action result.""" + entry = QueueEntry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_hash="abc123", + action_result=ActionResult(type="command", output="ok", exit_code=0), + ) + + data = entry.to_dict() + restored = QueueEntry.from_dict(data) + + assert restored.action_result is not None + assert restored.action_result.type == "command" + assert restored.action_result.exit_code == 0 + + +class TestRulesQueue: + """Tests for RulesQueue class (QS-6.1.x, QS-6.3.x).""" + + @pytest.fixture + def queue(self, tmp_path: Path) -> RulesQueue: + """Create a queue with temp directory.""" + return RulesQueue(tmp_path / "queue") + + def test_create_entry(self, queue: RulesQueue) -> None: + """QS-6.1.1: Create new queue entry.""" + entry = 
queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + + assert entry is not None + assert entry.status == QueueEntryStatus.QUEUED + assert entry.rule_name == "Test Rule" + + def test_create_duplicate_returns_none(self, queue: RulesQueue) -> None: + """QS-6.1.6: Re-trigger same files returns None.""" + entry1 = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + entry2 = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + + assert entry1 is not None + assert entry2 is None # Duplicate + + def test_create_different_files_new_entry(self, queue: RulesQueue) -> None: + """QS-6.1.7: Different files create new entry.""" + entry1 = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/a.py"], + baseline_ref="commit1", + ) + entry2 = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/b.py"], # Different file + baseline_ref="commit1", + ) + + assert entry1 is not None + assert entry2 is not None + + def test_has_entry(self, queue: RulesQueue) -> None: + """Check if entry exists.""" + entry = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + assert entry is not None + + assert queue.has_entry(entry.trigger_hash) is True + assert queue.has_entry("nonexistent") is False + + def test_get_entry(self, queue: RulesQueue) -> None: + """Retrieve entry by hash.""" + entry = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + assert entry is not None + + retrieved = queue.get_entry(entry.trigger_hash) + assert retrieved is not None + assert retrieved.rule_name == "Test Rule" + + 
def test_get_nonexistent_entry(self, queue: RulesQueue) -> None: + """Get nonexistent entry returns None.""" + assert queue.get_entry("nonexistent") is None + + def test_update_status_to_passed(self, queue: RulesQueue) -> None: + """QS-6.1.3: Update status to passed.""" + entry = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + assert entry is not None + + success = queue.update_status(entry.trigger_hash, QueueEntryStatus.PASSED) + assert success is True + + updated = queue.get_entry(entry.trigger_hash) + assert updated is not None + assert updated.status == QueueEntryStatus.PASSED + assert updated.evaluated_at is not None + + def test_update_status_to_failed(self, queue: RulesQueue) -> None: + """QS-6.1.5: Update status to failed.""" + entry = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + assert entry is not None + + action_result = ActionResult(type="command", output="error", exit_code=1) + success = queue.update_status(entry.trigger_hash, QueueEntryStatus.FAILED, action_result) + assert success is True + + updated = queue.get_entry(entry.trigger_hash) + assert updated is not None + assert updated.status == QueueEntryStatus.FAILED + assert updated.action_result is not None + assert updated.action_result.exit_code == 1 + + def test_update_status_to_skipped(self, queue: RulesQueue) -> None: + """QS-6.1.2: Update status to skipped (safety suppression).""" + entry = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + assert entry is not None + + success = queue.update_status(entry.trigger_hash, QueueEntryStatus.SKIPPED) + assert success is True + + updated = queue.get_entry(entry.trigger_hash) + assert updated is not None + assert updated.status == QueueEntryStatus.SKIPPED + + def 
test_update_nonexistent_returns_false(self, queue: RulesQueue) -> None: + """Update nonexistent entry returns False.""" + success = queue.update_status("nonexistent", QueueEntryStatus.PASSED) + assert success is False + + def test_get_queued_entries(self, queue: RulesQueue) -> None: + """Get only queued entries.""" + # Create multiple entries with different statuses + entry1 = queue.create_entry( + rule_name="Rule 1", + rule_file="rule1.md", + trigger_files=["a.py"], + baseline_ref="commit1", + ) + entry2 = queue.create_entry( + rule_name="Rule 2", + rule_file="rule2.md", + trigger_files=["b.py"], + baseline_ref="commit1", + ) + assert entry1 is not None + assert entry2 is not None + + # Update one to passed + queue.update_status(entry1.trigger_hash, QueueEntryStatus.PASSED) + + # Get queued only + queued = queue.get_queued_entries() + assert len(queued) == 1 + assert queued[0].rule_name == "Rule 2" + + def test_get_all_entries(self, queue: RulesQueue) -> None: + """Get all entries regardless of status.""" + entry1 = queue.create_entry( + rule_name="Rule 1", + rule_file="rule1.md", + trigger_files=["a.py"], + baseline_ref="commit1", + ) + entry2 = queue.create_entry( + rule_name="Rule 2", + rule_file="rule2.md", + trigger_files=["b.py"], + baseline_ref="commit1", + ) + assert entry1 is not None + assert entry2 is not None + + queue.update_status(entry1.trigger_hash, QueueEntryStatus.PASSED) + + all_entries = queue.get_all_entries() + assert len(all_entries) == 2 + + def test_remove_entry(self, queue: RulesQueue) -> None: + """Remove entry by hash.""" + entry = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + assert entry is not None + + removed = queue.remove_entry(entry.trigger_hash) + assert removed is True + assert queue.has_entry(entry.trigger_hash) is False + + def test_remove_nonexistent_returns_false(self, queue: RulesQueue) -> None: + """Remove nonexistent entry returns 
False.""" + removed = queue.remove_entry("nonexistent") + assert removed is False + + def test_clear(self, queue: RulesQueue) -> None: + """Clear all entries.""" + queue.create_entry( + rule_name="Rule 1", + rule_file="rule1.md", + trigger_files=["a.py"], + baseline_ref="commit1", + ) + queue.create_entry( + rule_name="Rule 2", + rule_file="rule2.md", + trigger_files=["b.py"], + baseline_ref="commit1", + ) + + count = queue.clear() + assert count == 2 + assert len(queue.get_all_entries()) == 0 + + def test_clear_empty_queue(self, queue: RulesQueue) -> None: + """Clear empty queue returns 0.""" + count = queue.clear() + assert count == 0 + + def test_file_structure(self, queue: RulesQueue) -> None: + """Verify queue files are named correctly.""" + entry = queue.create_entry( + rule_name="Test Rule", + rule_file="test-rule.md", + trigger_files=["src/main.py"], + baseline_ref="commit1", + ) + assert entry is not None + + # Check file exists with correct naming + expected_file = queue.queue_dir / f"{entry.trigger_hash}.queued.json" + assert expected_file.exists() + + # Update status and check file renamed + queue.update_status(entry.trigger_hash, QueueEntryStatus.PASSED) + assert not expected_file.exists() + passed_file = queue.queue_dir / f"{entry.trigger_hash}.passed.json" + assert passed_file.exists() diff --git a/tests/unit/test_schema_validation.py b/tests/unit/test_schema_validation.py new file mode 100644 index 00000000..fc921ec8 --- /dev/null +++ b/tests/unit/test_schema_validation.py @@ -0,0 +1,323 @@ +"""Tests for schema validation (SV-8.x from test_scenarios.md).""" + +from pathlib import Path + +import pytest + +from deepwork.core.rules_parser import RulesParseError, parse_rule_file + + +class TestRequiredFields: + """Tests for required field validation (SV-8.1.x).""" + + def test_missing_name(self, tmp_path: Path) -> None: + """SV-8.1.1: Missing name field.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +trigger: "src/**/*" +--- 
+Instructions here. +""" + ) + + with pytest.raises(RulesParseError, match="name"): + parse_rule_file(rule_file) + + def test_missing_detection_mode(self, tmp_path: Path) -> None: + """SV-8.1.2: Missing trigger, set, or pair.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +--- +Instructions here. +""" + ) + + with pytest.raises(RulesParseError): + parse_rule_file(rule_file) + + def test_missing_markdown_body(self, tmp_path: Path) -> None: + """SV-8.1.3: Missing markdown body (for prompt action).""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: "src/**/*" +--- +""" + ) + + with pytest.raises(RulesParseError, match="markdown body|instructions"): + parse_rule_file(rule_file) + + def test_set_requires_two_patterns(self, tmp_path: Path) -> None: + """SV-8.1.4: Set requires at least 2 patterns. + + Note: Schema validation catches this before rule parser. + """ + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +set: + - src/{path}.py +--- +Instructions here. +""" + ) + + # Schema validation will fail due to minItems: 2 + with pytest.raises(RulesParseError): + parse_rule_file(rule_file) + + +class TestMutuallyExclusiveFields: + """Tests for mutually exclusive field validation (SV-8.2.x).""" + + def test_both_trigger_and_set(self, tmp_path: Path) -> None: + """SV-8.2.1: Both trigger and set is invalid.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: "src/**/*" +set: + - src/{path}.py + - tests/{path}_test.py +--- +Instructions here. 
+""" + ) + + with pytest.raises(RulesParseError): + parse_rule_file(rule_file) + + def test_both_trigger_and_pair(self, tmp_path: Path) -> None: + """SV-8.2.2: Both trigger and pair is invalid.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: "src/**/*" +pair: + trigger: api/{path}.py + expects: docs/{path}.md +--- +Instructions here. +""" + ) + + with pytest.raises(RulesParseError): + parse_rule_file(rule_file) + + def test_all_detection_modes(self, tmp_path: Path) -> None: + """SV-8.2.3: All three detection modes is invalid.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: "src/**/*" +set: + - src/{path}.py + - tests/{path}_test.py +pair: + trigger: api/{path}.py + expects: docs/{path}.md +--- +Instructions here. +""" + ) + + with pytest.raises(RulesParseError): + parse_rule_file(rule_file) + + +class TestValueValidation: + """Tests for value validation (SV-8.4.x).""" + + def test_invalid_compare_to(self, tmp_path: Path) -> None: + """SV-8.4.1: Invalid compare_to value.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: "src/**/*" +compare_to: yesterday +--- +Instructions here. 
+""" + ) + + with pytest.raises(RulesParseError): + parse_rule_file(rule_file) + + def test_invalid_run_for(self, tmp_path: Path) -> None: + """SV-8.4.2: Invalid run_for value.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: "**/*.py" +action: + command: "ruff format {file}" + run_for: first_match +--- +""" + ) + + with pytest.raises(RulesParseError): + parse_rule_file(rule_file) + + +class TestValidRules: + """Tests for valid rule parsing.""" + + def test_valid_trigger_safety_rule(self, tmp_path: Path) -> None: + """Valid trigger/safety rule parses successfully.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: "src/**/*" +safety: README.md +--- +Please check the code. +""" + ) + + rule = parse_rule_file(rule_file) + assert rule.name == "Test Rule" + assert rule.triggers == ["src/**/*"] + assert rule.safety == ["README.md"] + + def test_valid_set_rule(self, tmp_path: Path) -> None: + """Valid set rule parses successfully.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Source/Test Pairing +set: + - src/{path}.py + - tests/{path}_test.py +--- +Source and test should change together. +""" + ) + + rule = parse_rule_file(rule_file) + assert rule.name == "Source/Test Pairing" + assert len(rule.set_patterns) == 2 + + def test_valid_pair_rule(self, tmp_path: Path) -> None: + """Valid pair rule parses successfully.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: API Documentation +pair: + trigger: api/{module}.py + expects: docs/api/{module}.md +--- +API changes need documentation. 
+""" + ) + + rule = parse_rule_file(rule_file) + assert rule.name == "API Documentation" + assert rule.pair_config is not None + assert rule.pair_config.trigger == "api/{module}.py" + assert rule.pair_config.expects == ["docs/api/{module}.md"] + + def test_valid_command_rule(self, tmp_path: Path) -> None: + """Valid command rule parses successfully.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Format Python +trigger: "**/*.py" +action: + command: "ruff format {file}" + run_for: each_match +--- +""" + ) + + rule = parse_rule_file(rule_file) + assert rule.name == "Format Python" + assert rule.command_action is not None + assert rule.command_action.command == "ruff format {file}" + assert rule.command_action.run_for == "each_match" + + def test_valid_compare_to_values(self, tmp_path: Path) -> None: + """Valid compare_to values parse successfully.""" + for compare_to in ["base", "default_tip", "prompt"]: + rule_file = tmp_path / "test.md" + rule_file.write_text( + f"""--- +name: Test Rule +trigger: "src/**/*" +compare_to: {compare_to} +--- +Instructions here. +""" + ) + + rule = parse_rule_file(rule_file) + assert rule.compare_to == compare_to + + def test_multiple_triggers(self, tmp_path: Path) -> None: + """Multiple triggers as array parses successfully.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: + - src/**/*.py + - lib/**/*.py +--- +Instructions here. +""" + ) + + rule = parse_rule_file(rule_file) + assert rule.triggers == ["src/**/*.py", "lib/**/*.py"] + + def test_multiple_safety_patterns(self, tmp_path: Path) -> None: + """Multiple safety patterns as array parses successfully.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +trigger: src/**/* +safety: + - README.md + - CHANGELOG.md +--- +Instructions here. 
+""" + ) + + rule = parse_rule_file(rule_file) + assert rule.safety == ["README.md", "CHANGELOG.md"] + + def test_multiple_expects(self, tmp_path: Path) -> None: + """Multiple expects patterns parses successfully.""" + rule_file = tmp_path / "test.md" + rule_file.write_text( + """--- +name: Test Rule +pair: + trigger: api/{module}.py + expects: + - docs/api/{module}.md + - openapi/{module}.yaml +--- +Instructions here. +""" + ) + + rule = parse_rule_file(rule_file) + assert rule.pair_config is not None + assert rule.pair_config.expects == ["docs/api/{module}.md", "openapi/{module}.yaml"] diff --git a/uv.lock b/uv.lock index c4091ca4..cd4110a3 100644 --- a/uv.lock +++ b/uv.lock @@ -126,7 +126,7 @@ toml = [ [[package]] name = "deepwork" -version = "0.3.0" +version = "0.4.0" source = { editable = "." } dependencies = [ { name = "click" },