diff --git a/.claude/commands/AGENTS.md b/.claude/commands/AGENTS.md
deleted file mode 100644
index 8e7ac73c..00000000
--- a/.claude/commands/AGENTS.md
+++ /dev/null
@@ -1,5 +0,0 @@
-# Auto-Generated Commands
-
-All command files in this directory are auto-generated by `deepwork-sync` and should not be manually edited. Changes will be overwritten on the next sync.
-
-If you need to modify a command, update the source configuration and re-run the sync process.
diff --git a/.claude/commands/add_platform.md b/.claude/commands/add_platform.md
new file mode 100644
index 00000000..e23ea1f2
--- /dev/null
+++ b/.claude/commands/add_platform.md
@@ -0,0 +1,75 @@
+---
+description: Add a new AI platform to DeepWork with adapter, templates, and tests
+---
+
+# add_platform
+
+You are executing the **add_platform** job: add a new AI platform to DeepWork with adapter, templates, and tests.
+
+A workflow for adding support for a new AI platform (like Cursor, Windsurf, etc.) to DeepWork.
+
+This job guides you through four phases:
+1. **Research**: Capture the platform's CLI configuration and hooks system documentation
+2. **Add Capabilities**: Update the job schema and adapters with any new hook events
+3. **Implement**: Create the platform adapter, templates, tests (100% coverage), and README updates
+4. **Verify**: Ensure installation works correctly and produces expected files
+
+The workflow ensures consistency across all supported platforms and maintains
+comprehensive test coverage for new functionality.
+
+**Important Notes**:
+- Only hooks available on slash command definitions should be captured
+- Each existing adapter must be updated when new hooks are added (typically with null values)
+- Tests must achieve 100% coverage for any new functionality
+- Installation verification confirms the platform integrates correctly with existing jobs
+
+
+## Available Steps
+
+This job has 4 steps:
+
+### research
+**Research Platform Documentation**: Capture CLI configuration and hooks system documentation for the new platform
+- Command: `uw.add_platform.research`
+### add_capabilities
+**Add Hook Capabilities**: Update job schema and adapters with any new hook events the platform supports
+- Command: `uw.add_platform.add_capabilities`
+- Requires: research
+### implement
+**Implement Platform Support**: Add platform adapter, templates, tests with 100% coverage, and README documentation
+- Command: `uw.add_platform.implement`
+- Requires: research, add_capabilities
+### verify
+**Verify Installation**: Set up platform directories and verify deepwork install works correctly
+- Command: `uw.add_platform.verify`
+- Requires: implement
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and run through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/add_platform`
+
+2. **Identify the starting step** based on intent:
+ - research: Capture CLI configuration and hooks system documentation for the new platform
+ - add_capabilities: Update job schema and adapters with any new hook events the platform supports
+ - implement: Add platform adapter, templates, tests with 100% coverage, and README documentation
+ - verify: Set up platform directories and verify deepwork install works correctly
+
+3. **Run the workflow** starting from the identified step:
+ - Invoke the starting step using the Skill tool
+ - When that step completes, **automatically continue** to the next step in the workflow
+ - Continue until the workflow is complete or the user intervenes
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Use AskUserQuestion to let them choose
+
+**Critical**:
+- You MUST invoke each step using the Skill tool. Do not copy/paste step instructions.
+- After each step completes, check if there's a next step and invoke it automatically.
+- The workflow continues until all dependent steps are complete.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/add_platform/job.yml`
\ No newline at end of file
diff --git a/.claude/commands/deepwork_jobs.learn.md b/.claude/commands/deepwork_jobs.learn.md
index b1eb339c..d8910651 100644
--- a/.claude/commands/deepwork_jobs.learn.md
+++ b/.claude/commands/deepwork_jobs.learn.md
@@ -9,10 +9,8 @@ hooks:
## Quality Criteria
- Verify the learning process meets ALL quality criteria before completing:
-
- 1. **Conversation Analyzed**: Did you review the conversation for DeepWork job executions?
- 2. **Confusion Identified**: Did you identify points of confusion, errors, or inefficiencies?
+ 1. **Conversation Analyzed**: Did the agent review the conversation for DeepWork job executions?
+ 2. **Confusion Identified**: Did the agent identify points of confusion, errors, or inefficiencies?
3. **Instructions Improved**: Were job instructions updated to address identified issues?
4. **Instructions Concise**: Are instructions free of redundancy and unnecessary verbosity?
5. **Shared Content Extracted**: Is lengthy/duplicated content extracted into referenced files?
@@ -22,10 +20,6 @@ hooks:
9. **Generalizable Separated**: Are generalizable improvements in instructions, not AGENTS.md?
10. **Sync Complete**: Has `deepwork sync` been run if instructions were modified?
- If ANY criterion is not met, continue working to address it.
- If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
-
-
## Instructions
Review the conversation and determine if ALL quality criteria above have been satisfied.
@@ -34,8 +28,8 @@ hooks:
If the agent has included `✓ Quality Criteria Met` in their response AND
all criteria appear to be met, respond with: {"ok": true}
- If criteria are NOT met AND the promise tag is missing, respond with:
- {"ok": false, "reason": "Continue working. [specific feedback on what's wrong]"}
+ If criteria are NOT met OR the promise tag is missing, respond with:
+ {"ok": false, "reason": "**AGENT: TAKE ACTION** - [which criteria failed and why]"}
---
# deepwork_jobs.learn
@@ -386,10 +380,9 @@ Ensure all outputs are:
This step uses an iterative quality validation loop. After completing your work, stop hook(s) will evaluate whether the outputs meet quality criteria. If criteria are not met, you will be prompted to continue refining.
### Quality Criteria
-Verify the learning process meets ALL quality criteria before completing:
-1. **Conversation Analyzed**: Did you review the conversation for DeepWork job executions?
-2. **Confusion Identified**: Did you identify points of confusion, errors, or inefficiencies?
+1. **Conversation Analyzed**: Did the agent review the conversation for DeepWork job executions?
+2. **Confusion Identified**: Did the agent identify points of confusion, errors, or inefficiencies?
3. **Instructions Improved**: Were job instructions updated to address identified issues?
4. **Instructions Concise**: Are instructions free of redundancy and unnecessary verbosity?
5. **Shared Content Extracted**: Is lengthy/duplicated content extracted into referenced files?
@@ -399,9 +392,6 @@ Verify the learning process meets ALL quality criteria before completing:
9. **Generalizable Separated**: Are generalizable improvements in instructions, not AGENTS.md?
10. **Sync Complete**: Has `deepwork sync` been run if instructions were modified?
-If ANY criterion is not met, continue working to address it.
-If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
-
### Completion Promise
diff --git a/.claude/commands/deepwork_jobs.md b/.claude/commands/deepwork_jobs.md
new file mode 100644
index 00000000..2d9b911c
--- /dev/null
+++ b/.claude/commands/deepwork_jobs.md
@@ -0,0 +1,63 @@
+---
+description: DeepWork job management commands
+---
+
+# deepwork_jobs
+
+You are executing the **deepwork_jobs** job: DeepWork job management commands.
+
+Core commands for managing DeepWork jobs. These commands help you define new multi-step
+workflows and learn from running them.
+
+The `define` command guides you through an interactive process to create a new job by
+asking structured questions about your workflow, understanding each step's inputs and outputs,
+and generating all necessary files.
+
+The `learn` command reflects on conversations where DeepWork jobs were run, identifies
+confusion or inefficiencies, and improves job instructions. It also captures bespoke
+learnings specific to the current run into AGENTS.md files in the working folder.
+
+
+## Available Steps
+
+This job has 3 steps:
+
+### define
+**Define Job Specification**: Create the job.yml specification file by understanding workflow requirements
+- Command: `uw.deepwork_jobs.define`
+### implement
+**Implement Job Steps**: Generate instruction files for each step based on the job.yml specification
+- Command: `uw.deepwork_jobs.implement`
+- Requires: define
+### learn
+**Learn from Job Execution**: Reflect on conversation to improve job instructions and capture learnings
+- Command: `deepwork_jobs.learn`
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and run through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/deepwork_jobs`
+
+2. **Identify the starting step** based on intent:
+ - define: Create the job.yml specification file by understanding workflow requirements
+ - implement: Generate instruction files for each step based on the job.yml specification
+ - learn: Reflect on conversation to improve job instructions and capture learnings
+
+3. **Run the workflow** starting from the identified step:
+ - Invoke the starting step using the Skill tool
+ - When that step completes, **automatically continue** to the next step in the workflow
+ - Continue until the workflow is complete or the user intervenes
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Use AskUserQuestion to let them choose
+
+**Critical**:
+- You MUST invoke each step using the Skill tool. Do not copy/paste step instructions.
+- After each step completes, check if there's a next step and invoke it automatically.
+- The workflow continues until all dependent steps are complete.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/deepwork_jobs/job.yml`
\ No newline at end of file
diff --git a/.claude/commands/deepwork_rules.md b/.claude/commands/deepwork_rules.md
new file mode 100644
index 00000000..7535c6e7
--- /dev/null
+++ b/.claude/commands/deepwork_rules.md
@@ -0,0 +1,59 @@
+---
+description: Rules enforcement for AI agent sessions
+---
+
+# deepwork_rules
+
+You are executing the **deepwork_rules** job: rules enforcement for AI agent sessions.
+
+Manages rules that automatically trigger when certain files change during an AI agent session.
+Rules help ensure that code changes follow team guidelines, documentation is updated,
+and architectural decisions are respected.
+
+Rules are stored as individual markdown files with YAML frontmatter in the `.deepwork/rules/`
+directory. Each rule file specifies:
+- Detection mode: trigger/safety, set (bidirectional), or pair (directional)
+- Patterns: Glob patterns for matching files, with optional variable capture
+- Instructions: Markdown content describing what the agent should do
+
+Example use cases:
+- Update installation docs when configuration files change
+- Require security review when authentication code is modified
+- Ensure API documentation stays in sync with API code
+- Enforce source/test file pairing
+
+
+## Available Steps
+
+This job has 1 step:
+
+### define
+**Define Rule**: Create a new rule file in .deepwork/rules/
+- Command: `uw.deepwork_rules.define`
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and run through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/deepwork_rules`
+
+2. **Identify the starting step** based on intent:
+ - define: Create a new rule file in .deepwork/rules/
+
+3. **Run the workflow** starting from the identified step:
+ - Invoke the starting step using the Skill tool
+ - When that step completes, **automatically continue** to the next step in the workflow
+ - Continue until the workflow is complete or the user intervenes
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Use AskUserQuestion to let them choose
+
+**Critical**:
+- You MUST invoke each step using the Skill tool. Do not copy/paste step instructions.
+- After each step completes, check if there's a next step and invoke it automatically.
+- The workflow continues until all dependent steps are complete.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/deepwork_rules/job.yml`
\ No newline at end of file
diff --git a/.claude/commands/update.md b/.claude/commands/update.md
new file mode 100644
index 00000000..3c7b2705
--- /dev/null
+++ b/.claude/commands/update.md
@@ -0,0 +1,57 @@
+---
+description: Update standard jobs in src/ and sync to installed locations
+---
+
+# update
+
+You are executing the **update** job: update standard jobs in src/ and sync to installed locations.
+
+A workflow for maintaining standard jobs bundled with DeepWork. Standard jobs
+(like `deepwork_jobs` and `deepwork_rules`) are source-controlled in
+`src/deepwork/standard_jobs/` and must be edited there—never in `.deepwork/jobs/`
+or `.claude/commands/` directly.
+
+This job guides you through:
+1. Identifying which standard job(s) to update from conversation context
+2. Making changes in the correct source location (`src/deepwork/standard_jobs/[job_name]/`)
+3. Running `deepwork install` to propagate changes to `.deepwork/` and command directories
+4. Verifying the sync completed successfully
+
+Use this job whenever you need to modify job.yml files, step instructions, or hooks
+for any standard job in the DeepWork repository.
+
+
+## Available Steps
+
+This job has 1 step:
+
+### job
+**Update Standard Job**: Edit standard job source files and sync to installed locations
+- Command: `uw.update.job`
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and run through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/update`
+
+2. **Identify the starting step** based on intent:
+ - job: Edit standard job source files and sync to installed locations
+
+3. **Run the workflow** starting from the identified step:
+ - Invoke the starting step using the Skill tool
+ - When that step completes, **automatically continue** to the next step in the workflow
+ - Continue until the workflow is complete or the user intervenes
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Use AskUserQuestion to let them choose
+
+**Critical**:
+- You MUST invoke each step using the Skill tool. Do not copy/paste step instructions.
+- After each step completes, check if there's a next step and invoke it automatically.
+- The workflow continues until all dependent steps are complete.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/update/job.yml`
\ No newline at end of file
diff --git a/.claude/commands/add_platform.add_capabilities.md b/.claude/commands/uw.add_platform.add_capabilities.md
similarity index 87%
rename from .claude/commands/add_platform.add_capabilities.md
rename to .claude/commands/uw.add_platform.add_capabilities.md
index dac48f9f..f63224d7 100644
--- a/.claude/commands/add_platform.add_capabilities.md
+++ b/.claude/commands/uw.add_platform.add_capabilities.md
@@ -5,10 +5,6 @@ hooks:
- hooks:
- type: prompt
prompt: |
- You must evaluate whether Claude has met all the below quality criteria for the request.
-
- ## Quality Criteria
-
Verify the capability additions meet ALL criteria:
1. Any new hooks from the platform (for slash commands only) are added to src/deepwork/schemas/job_schema.py
2. All existing adapters in src/deepwork/adapters.py are updated with the new hook fields
@@ -20,17 +16,6 @@ hooks:
If ALL criteria are met, include `✓ Quality Criteria Met`.
-
- ## Instructions
-
- Review the conversation and determine if ALL quality criteria above have been satisfied.
- Look for evidence that each criterion has been addressed.
-
- If the agent has included `✓ Quality Criteria Met` in their response AND
- all criteria appear to be met, respond with: {"ok": true}
-
- If criteria are NOT met AND the promise tag is missing, respond with:
- {"ok": false, "reason": "Continue working. [specific feedback on what's wrong]"}
---
# add_platform.add_capabilities
@@ -241,17 +226,6 @@ Ensure all outputs are:
This step uses an iterative quality validation loop. After completing your work, stop hook(s) will evaluate whether the outputs meet quality criteria. If criteria are not met, you will be prompted to continue refining.
-### Quality Criteria
-Verify the capability additions meet ALL criteria:
-1. Any new hooks from the platform (for slash commands only) are added to src/deepwork/schemas/job_schema.py
-2. All existing adapters in src/deepwork/adapters.py are updated with the new hook fields
- (set to None/null if the platform doesn't support that hook)
-3. Only hooks available on slash command definitions are added (not general CLI hooks)
-4. job_schema.py remains valid Python with no syntax errors
-5. adapters.py remains consistent - all adapters have the same hook fields
-6. If no new hooks are needed, document why in a comment
-
-If ALL criteria are met, include `✓ Quality Criteria Met`.
### Completion Promise
diff --git a/.claude/commands/add_platform.implement.md b/.claude/commands/uw.add_platform.implement.md
similarity index 89%
rename from .claude/commands/add_platform.implement.md
rename to .claude/commands/uw.add_platform.implement.md
index 098fb55e..3e1e99c8 100644
--- a/.claude/commands/add_platform.implement.md
+++ b/.claude/commands/uw.add_platform.implement.md
@@ -7,10 +7,6 @@ hooks:
command: ".deepwork/jobs/add_platform/hooks/run_tests.sh"
- type: prompt
prompt: |
- You must evaluate whether Claude has met all the below quality criteria for the request.
-
- ## Quality Criteria
-
Verify the implementation meets ALL criteria:
1. Platform adapter class is added to src/deepwork/adapters.py
2. Templates exist in src/deepwork/templates// with appropriate command structure
@@ -24,17 +20,6 @@ hooks:
If ALL criteria are met, include `✓ Quality Criteria Met`.
-
- ## Instructions
-
- Review the conversation and determine if ALL quality criteria above have been satisfied.
- Look for evidence that each criterion has been addressed.
-
- If the agent has included `✓ Quality Criteria Met` in their response AND
- all criteria appear to be met, respond with: {"ok": true}
-
- If criteria are NOT met AND the promise tag is missing, respond with:
- {"ok": false, "reason": "Continue working. [specific feedback on what's wrong]"}
---
# add_platform.implement
@@ -339,23 +324,10 @@ Ensure all outputs are:
This step uses an iterative quality validation loop. After completing your work, stop hook(s) will evaluate whether the outputs meet quality criteria. If criteria are not met, you will be prompted to continue refining.
+
**Validation Script**: `.deepwork/jobs/add_platform/hooks/run_tests.sh`
The validation script will be executed automatically when you attempt to complete this step.
-### Quality Criteria (2)
-Verify the implementation meets ALL criteria:
-1. Platform adapter class is added to src/deepwork/adapters.py
-2. Templates exist in src/deepwork/templates// with appropriate command structure
-3. Tests exist for all new functionality
-4. Test coverage is 100% for new code (run: uv run pytest --cov)
-5. All tests pass
-6. README.md is updated with:
- - New platform listed in supported platforms
- - Installation instructions for the platform
- - Any platform-specific notes
-
-If ALL criteria are met, include `✓ Quality Criteria Met`.
-
### Completion Promise
diff --git a/.claude/commands/add_platform.research.md b/.claude/commands/uw.add_platform.research.md
similarity index 87%
rename from .claude/commands/add_platform.research.md
rename to .claude/commands/uw.add_platform.research.md
index 3fd6ec01..d8bf5be4 100644
--- a/.claude/commands/add_platform.research.md
+++ b/.claude/commands/uw.add_platform.research.md
@@ -5,10 +5,6 @@ hooks:
- hooks:
- type: prompt
prompt: |
- You must evaluate whether Claude has met all the below quality criteria for the request.
-
- ## Quality Criteria
-
Verify the research output meets ALL criteria:
1. Both files exist in doc/platforms//: cli_configuration.md and hooks_system.md
2. Each file has a comment at the top with:
@@ -21,17 +17,6 @@ hooks:
If ALL criteria are met, include `✓ Quality Criteria Met`.
-
- ## Instructions
-
- Review the conversation and determine if ALL quality criteria above have been satisfied.
- Look for evidence that each criterion has been addressed.
-
- If the agent has included `✓ Quality Criteria Met` in their response AND
- all criteria appear to be met, respond with: {"ok": true}
-
- If criteria are NOT met AND the promise tag is missing, respond with:
- {"ok": false, "reason": "Continue working. [specific feedback on what's wrong]"}
---
# add_platform.research
@@ -287,18 +272,6 @@ Ensure all outputs are:
This step uses an iterative quality validation loop. After completing your work, stop hook(s) will evaluate whether the outputs meet quality criteria. If criteria are not met, you will be prompted to continue refining.
-### Quality Criteria
-Verify the research output meets ALL criteria:
-1. Both files exist in doc/platforms//: cli_configuration.md and hooks_system.md
-2. Each file has a comment at the top with:
- - Last updated date
- - Source URL where the documentation was obtained
-3. cli_configuration.md covers how the platform's CLI is configured
-4. hooks_system.md covers hooks available for slash command definitions ONLY
-5. No extraneous documentation (only these two specific topics)
-6. Documentation is comprehensive enough to implement the platform
-
-If ALL criteria are met, include `✓ Quality Criteria Met`.
### Completion Promise
diff --git a/.claude/commands/add_platform.verify.md b/.claude/commands/uw.add_platform.verify.md
similarity index 85%
rename from .claude/commands/add_platform.verify.md
rename to .claude/commands/uw.add_platform.verify.md
index d937b537..d7c3a61e 100644
--- a/.claude/commands/add_platform.verify.md
+++ b/.claude/commands/uw.add_platform.verify.md
@@ -5,10 +5,6 @@ hooks:
- hooks:
- type: prompt
prompt: |
- You must evaluate whether Claude has met all the below quality criteria for the request.
-
- ## Quality Criteria
-
Verify the installation meets ALL criteria:
1. Platform-specific directories/files are added to the deepwork repo as needed
2. Running `deepwork install --platform ` completes without errors
@@ -19,17 +15,6 @@ hooks:
If ALL criteria are met, include `✓ Quality Criteria Met`.
-
- ## Instructions
-
- Review the conversation and determine if ALL quality criteria above have been satisfied.
- Look for evidence that each criterion has been addressed.
-
- If the agent has included `✓ Quality Criteria Met` in their response AND
- all criteria appear to be met, respond with: {"ok": true}
-
- If criteria are NOT met AND the promise tag is missing, respond with:
- {"ok": false, "reason": "Continue working. [specific feedback on what's wrong]"}
---
# add_platform.verify
@@ -212,16 +197,6 @@ No specific files are output by this command.
This step uses an iterative quality validation loop. After completing your work, stop hook(s) will evaluate whether the outputs meet quality criteria. If criteria are not met, you will be prompted to continue refining.
-### Quality Criteria
-Verify the installation meets ALL criteria:
-1. Platform-specific directories/files are added to the deepwork repo as needed
-2. Running `deepwork install --platform ` completes without errors
-3. Expected command files are created in the platform's command directory
-4. Command file content matches the templates and job definitions
-5. Established DeepWork jobs (deepwork_jobs, deepwork_rules) are installed correctly
-6. The platform can be used alongside existing platforms without conflicts
-
-If ALL criteria are met, include `✓ Quality Criteria Met`.
### Completion Promise
diff --git a/.claude/commands/deepwork_jobs.define.md b/.claude/commands/uw.deepwork_jobs.define.md
similarity index 93%
rename from .claude/commands/deepwork_jobs.define.md
rename to .claude/commands/uw.deepwork_jobs.define.md
index 4de4eaf0..6902fe1b 100644
--- a/.claude/commands/deepwork_jobs.define.md
+++ b/.claude/commands/uw.deepwork_jobs.define.md
@@ -9,10 +9,8 @@ hooks:
## Quality Criteria
- Verify the job.yml output meets ALL quality criteria before completing:
-
- 1. **User Understanding**: Did you fully understand the user's workflow by asking structured questions?
- 2. **Structured Questions Used**: Did you ask structured questions (using the AskUserQuestion tool) to gather user input?
+ 1. **User Understanding**: Did the agent fully understand the user's workflow by asking structured questions?
+ 2. **Structured Questions Used**: Did the agent ask structured questions (using the AskUserQuestion tool) to gather user input?
3. **Clear Inputs/Outputs**: Does every step have clearly defined inputs and outputs?
4. **Logical Dependencies**: Do step dependencies make sense and avoid circular references?
5. **Concise Summary**: Is the summary under 200 characters and descriptive?
@@ -20,10 +18,6 @@ hooks:
7. **Valid Schema**: Does the job.yml follow the required schema (name, version, summary, steps)?
8. **File Created**: Has the job.yml file been created in `.deepwork/jobs/[job_name]/job.yml`?
- If ANY criterion is not met, continue working to address it.
- If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
-
-
## Instructions
Review the conversation and determine if ALL quality criteria above have been satisfied.
@@ -32,8 +26,8 @@ hooks:
If the agent has included `✓ Quality Criteria Met` in their response AND
all criteria appear to be met, respond with: {"ok": true}
- If criteria are NOT met AND the promise tag is missing, respond with:
- {"ok": false, "reason": "Continue working. [specific feedback on what's wrong]"}
+ If criteria are NOT met OR the promise tag is missing, respond with:
+ {"ok": false, "reason": "**AGENT: TAKE ACTION** - [which criteria failed and why]"}
---
# deepwork_jobs.define
@@ -427,10 +421,9 @@ Ensure all outputs are:
This step uses an iterative quality validation loop. After completing your work, stop hook(s) will evaluate whether the outputs meet quality criteria. If criteria are not met, you will be prompted to continue refining.
### Quality Criteria
-Verify the job.yml output meets ALL quality criteria before completing:
-1. **User Understanding**: Did you fully understand the user's workflow by asking structured questions?
-2. **Structured Questions Used**: Did you ask structured questions (using the AskUserQuestion tool) to gather user input?
+1. **User Understanding**: Did the agent fully understand the user's workflow by asking structured questions?
+2. **Structured Questions Used**: Did the agent ask structured questions (using the AskUserQuestion tool) to gather user input?
3. **Clear Inputs/Outputs**: Does every step have clearly defined inputs and outputs?
4. **Logical Dependencies**: Do step dependencies make sense and avoid circular references?
5. **Concise Summary**: Is the summary under 200 characters and descriptive?
@@ -438,9 +431,6 @@ Verify the job.yml output meets ALL quality criteria before completing:
7. **Valid Schema**: Does the job.yml follow the required schema (name, version, summary, steps)?
8. **File Created**: Has the job.yml file been created in `.deepwork/jobs/[job_name]/job.yml`?
-If ANY criterion is not met, continue working to address it.
-If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
-
### Completion Promise
diff --git a/.claude/commands/deepwork_jobs.implement.md b/.claude/commands/uw.deepwork_jobs.implement.md
similarity index 92%
rename from .claude/commands/deepwork_jobs.implement.md
rename to .claude/commands/uw.deepwork_jobs.implement.md
index 7c224679..2ad241f7 100644
--- a/.claude/commands/deepwork_jobs.implement.md
+++ b/.claude/commands/uw.deepwork_jobs.implement.md
@@ -9,8 +9,6 @@ hooks:
## Quality Criteria
- Verify the implementation meets ALL quality criteria before completing:
-
1. **Directory Structure**: Is `.deepwork/jobs/[job_name]/` created correctly?
2. **Complete Instructions**: Are ALL step instruction files complete (not stubs or placeholders)?
3. **Specific & Actionable**: Are instructions tailored to each step's purpose, not generic?
@@ -19,13 +17,7 @@ hooks:
6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"?
7. **Sync Complete**: Has `deepwork sync` been run successfully?
8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`?
- 9. **Rules Considered**: Have you thought about whether rules would benefit this job?
- - If relevant rules were identified, did you explain them and offer to run `/deepwork_rules.define`?
- - Not every job needs rules - only suggest when genuinely helpful.
-
- If ANY criterion is not met, continue working to address it.
- If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
-
+ 9. **Rules Considered**: Has the agent thought about whether rules would benefit this job? If relevant rules were identified, did they explain them and offer to run `/deepwork_rules.define`? Not every job needs rules - only suggest when genuinely helpful.
## Instructions
@@ -35,8 +27,8 @@ hooks:
If the agent has included `✓ Quality Criteria Met` in their response AND
all criteria appear to be met, respond with: {"ok": true}
- If criteria are NOT met AND the promise tag is missing, respond with:
- {"ok": false, "reason": "Continue working. [specific feedback on what's wrong]"}
+ If criteria are NOT met OR the promise tag is missing, respond with:
+ {"ok": false, "reason": "**AGENT: TAKE ACTION** - [which criteria failed and why]"}
---
# deepwork_jobs.implement
@@ -345,7 +337,6 @@ Ensure all outputs are:
This step uses an iterative quality validation loop. After completing your work, stop hook(s) will evaluate whether the outputs meet quality criteria. If criteria are not met, you will be prompted to continue refining.
### Quality Criteria
-Verify the implementation meets ALL quality criteria before completing:
1. **Directory Structure**: Is `.deepwork/jobs/[job_name]/` created correctly?
2. **Complete Instructions**: Are ALL step instruction files complete (not stubs or placeholders)?
@@ -355,12 +346,7 @@ Verify the implementation meets ALL quality criteria before completing:
6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"?
7. **Sync Complete**: Has `deepwork sync` been run successfully?
8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`?
-9. **Rules Considered**: Have you thought about whether rules would benefit this job?
- - If relevant rules were identified, did you explain them and offer to run `/deepwork_rules.define`?
- - Not every job needs rules - only suggest when genuinely helpful.
-
-If ANY criterion is not met, continue working to address it.
-If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
+9. **Rules Considered**: Has the agent thought about whether rules would benefit this job? If relevant rules were identified, did they explain them and offer to run `/deepwork_rules.define`? Not every job needs rules - only suggest when genuinely helpful.
### Completion Promise
diff --git a/.claude/commands/deepwork_rules.define.md b/.claude/commands/uw.deepwork_rules.define.md
similarity index 100%
rename from .claude/commands/deepwork_rules.define.md
rename to .claude/commands/uw.deepwork_rules.define.md
diff --git a/.claude/commands/update.job.md b/.claude/commands/uw.update.job.md
similarity index 83%
rename from .claude/commands/update.job.md
rename to .claude/commands/uw.update.job.md
index 9698eecf..6a30da15 100644
--- a/.claude/commands/update.job.md
+++ b/.claude/commands/uw.update.job.md
@@ -5,10 +5,6 @@ hooks:
- hooks:
- type: prompt
prompt: |
- You must evaluate whether Claude has met all the below quality criteria for the request.
-
- ## Quality Criteria
-
Verify the update process completed successfully:
1. Changes were made in src/deepwork/standard_jobs/[job_name]/ (NOT in .deepwork/jobs/)
2. `deepwork install --platform claude` was run
@@ -16,17 +12,6 @@ hooks:
4. Command files in .claude/commands/ were regenerated
If ALL criteria are met, include `✓ Quality Criteria Met`.
-
- ## Instructions
-
- Review the conversation and determine if ALL quality criteria above have been satisfied.
- Look for evidence that each criterion has been addressed.
-
- If the agent has included `✓ Quality Criteria Met` in their response AND
- all criteria appear to be met, respond with: {"ok": true}
-
- If criteria are NOT met AND the promise tag is missing, respond with:
- {"ok": false, "reason": "Continue working. [specific feedback on what's wrong]"}
---
# update.job
@@ -160,13 +145,6 @@ No specific files are output by this command.
This step uses an iterative quality validation loop. After completing your work, stop hook(s) will evaluate whether the outputs meet quality criteria. If criteria are not met, you will be prompted to continue refining.
-### Quality Criteria
-Verify the update process completed successfully:
-1. Changes were made in src/deepwork/standard_jobs/[job_name]/ (NOT in .deepwork/jobs/)
-2. `deepwork install --platform claude` was run
-3. Files in .deepwork/jobs/ match the source files
-4. Command files in .claude/commands/ were regenerated
-If ALL criteria are met, include `✓ Quality Criteria Met`.
### Completion Promise
diff --git a/.deepwork/jobs/deepwork_jobs/job.yml b/.deepwork/jobs/deepwork_jobs/job.yml
index e95aa2c0..d311ab29 100644
--- a/.deepwork/jobs/deepwork_jobs/job.yml
+++ b/.deepwork/jobs/deepwork_jobs/job.yml
@@ -36,22 +36,15 @@ steps:
outputs:
- job.yml
dependencies: []
- hooks:
- after_agent:
- - prompt: |
- Verify the job.yml output meets ALL quality criteria before completing:
-
- 1. **User Understanding**: Did you fully understand the user's workflow by asking structured questions?
- 2. **Structured Questions Used**: Did you ask structured questions (using the AskUserQuestion tool) to gather user input?
- 3. **Clear Inputs/Outputs**: Does every step have clearly defined inputs and outputs?
- 4. **Logical Dependencies**: Do step dependencies make sense and avoid circular references?
- 5. **Concise Summary**: Is the summary under 200 characters and descriptive?
- 6. **Rich Description**: Does the description provide enough context for future refinement?
- 7. **Valid Schema**: Does the job.yml follow the required schema (name, version, summary, steps)?
- 8. **File Created**: Has the job.yml file been created in `.deepwork/jobs/[job_name]/job.yml`?
-
- If ANY criterion is not met, continue working to address it.
- If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
+ quality_criteria:
+ - "**User Understanding**: Did the agent fully understand the user's workflow by asking structured questions?"
+ - "**Structured Questions Used**: Did the agent ask structured questions (using the AskUserQuestion tool) to gather user input?"
+ - "**Clear Inputs/Outputs**: Does every step have clearly defined inputs and outputs?"
+ - "**Logical Dependencies**: Do step dependencies make sense and avoid circular references?"
+ - "**Concise Summary**: Is the summary under 200 characters and descriptive?"
+ - "**Rich Description**: Does the description provide enough context for future refinement?"
+ - "**Valid Schema**: Does the job.yml follow the required schema (name, version, summary, steps)?"
+ - "**File Created**: Has the job.yml file been created in `.deepwork/jobs/[job_name]/job.yml`?"
- id: implement
name: "Implement Job Steps"
@@ -64,51 +57,36 @@ steps:
- steps/
dependencies:
- define
- hooks:
- after_agent:
- - prompt: |
- Verify the implementation meets ALL quality criteria before completing:
-
- 1. **Directory Structure**: Is `.deepwork/jobs/[job_name]/` created correctly?
- 2. **Complete Instructions**: Are ALL step instruction files complete (not stubs or placeholders)?
- 3. **Specific & Actionable**: Are instructions tailored to each step's purpose, not generic?
- 4. **Output Examples**: Does each instruction file show what good output looks like?
- 5. **Quality Criteria**: Does each instruction file define quality criteria for its outputs?
- 6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"?
- 7. **Sync Complete**: Has `deepwork sync` been run successfully?
- 8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`?
- 9. **Rules Considered**: Have you thought about whether rules would benefit this job?
- - If relevant rules were identified, did you explain them and offer to run `/deepwork_rules.define`?
- - Not every job needs rules - only suggest when genuinely helpful.
-
- If ANY criterion is not met, continue working to address it.
- If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
+ quality_criteria:
+ - "**Directory Structure**: Is `.deepwork/jobs/[job_name]/` created correctly?"
+ - "**Complete Instructions**: Are ALL step instruction files complete (not stubs or placeholders)?"
+ - "**Specific & Actionable**: Are instructions tailored to each step's purpose, not generic?"
+ - "**Output Examples**: Does each instruction file show what good output looks like?"
+ - "**Quality Criteria**: Does each instruction file define quality criteria for its outputs?"
+ - "**Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase \"ask structured questions\"?"
+ - "**Sync Complete**: Has `deepwork sync` been run successfully?"
+ - "**Commands Available**: Are the slash-commands generated in `.claude/commands/`?"
+ - "**Rules Considered**: Has the agent thought about whether rules would benefit this job? If relevant rules were identified, did they explain them and offer to run `/deepwork_rules.define`? Not every job needs rules; only suggest when genuinely helpful."
- id: learn
name: "Learn from Job Execution"
description: "Reflect on conversation to improve job instructions and capture learnings"
instructions_file: steps/learn.md
+ exposed: true
inputs:
- name: job_name
description: "Name of the job that was run (optional - will auto-detect from conversation)"
outputs:
- AGENTS.md
dependencies: []
- hooks:
- after_agent:
- - prompt: |
- Verify the learning process meets ALL quality criteria before completing:
-
- 1. **Conversation Analyzed**: Did you review the conversation for DeepWork job executions?
- 2. **Confusion Identified**: Did you identify points of confusion, errors, or inefficiencies?
- 3. **Instructions Improved**: Were job instructions updated to address identified issues?
- 4. **Instructions Concise**: Are instructions free of redundancy and unnecessary verbosity?
- 5. **Shared Content Extracted**: Is lengthy/duplicated content extracted into referenced files?
- 6. **Bespoke Learnings Captured**: Were run-specific learnings added to AGENTS.md?
- 7. **File References Used**: Do AGENTS.md entries reference other files where appropriate?
- 8. **Working Folder Correct**: Is AGENTS.md in the correct working folder for the job?
- 9. **Generalizable Separated**: Are generalizable improvements in instructions, not AGENTS.md?
- 10. **Sync Complete**: Has `deepwork sync` been run if instructions were modified?
-
- If ANY criterion is not met, continue working to address it.
- If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
+ quality_criteria:
+ - "**Conversation Analyzed**: Did the agent review the conversation for DeepWork job executions?"
+ - "**Confusion Identified**: Did the agent identify points of confusion, errors, or inefficiencies?"
+ - "**Instructions Improved**: Were job instructions updated to address identified issues?"
+ - "**Instructions Concise**: Are instructions free of redundancy and unnecessary verbosity?"
+ - "**Shared Content Extracted**: Is lengthy/duplicated content extracted into referenced files?"
+ - "**Bespoke Learnings Captured**: Were run-specific learnings added to AGENTS.md?"
+ - "**File References Used**: Do AGENTS.md entries reference other files where appropriate?"
+ - "**Working Folder Correct**: Is AGENTS.md in the correct working folder for the job?"
+ - "**Generalizable Separated**: Are generalizable improvements in instructions, not AGENTS.md?"
+ - "**Sync Complete**: Has `deepwork sync` been run if instructions were modified?"
diff --git a/.gemini/commands/add_platform/index.toml b/.gemini/commands/add_platform/index.toml
new file mode 100644
index 00000000..b46ed1fc
--- /dev/null
+++ b/.gemini/commands/add_platform/index.toml
@@ -0,0 +1,84 @@
+# add_platform
+#
+# Add a new AI platform to DeepWork with adapter, templates, and tests
+#
+# Generated by DeepWork - do not edit manually
+
+description = "Add a new AI platform to DeepWork with adapter, templates, and tests"
+
+prompt = """
+# add_platform
+
+You are executing the **add_platform** job: add a new AI platform to DeepWork with adapter, templates, and tests.
+
+A workflow for adding support for a new AI platform (like Cursor, Windsurf, etc.) to DeepWork.
+
+This job guides you through four phases:
+1. **Research**: Capture the platform's CLI configuration and hooks system documentation
+2. **Add Capabilities**: Update the job schema and adapters with any new hook events
+3. **Implement**: Create the platform adapter, templates, tests (100% coverage), and README updates
+4. **Verify**: Ensure installation works correctly and produces expected files
+
+The workflow ensures consistency across all supported platforms and maintains
+comprehensive test coverage for new functionality.
+
+**Important Notes**:
+- Only hooks available on slash command definitions should be captured
+- Each existing adapter must be updated when new hooks are added (typically with null values)
+- Tests must achieve 100% coverage for any new functionality
+- Installation verification confirms the platform integrates correctly with existing jobs
+
+
+## Available Steps
+
+This job has 4 steps:
+
+### research
+**Research Platform Documentation**: Capture CLI configuration and hooks system documentation for the new platform
+- Command: `/add_platform:uw.research`
+### add_capabilities
+**Add Hook Capabilities**: Update job schema and adapters with any new hook events the platform supports
+- Command: `/add_platform:uw.add_capabilities`
+- Requires: research
+### implement
+**Implement Platform Support**: Add platform adapter, templates, tests with 100% coverage, and README documentation
+- Command: `/add_platform:uw.implement`
+- Requires: research, add_capabilities
+### verify
+**Verify Installation**: Set up platform directories and verify deepwork install works correctly
+- Command: `/add_platform:uw.verify`
+- Requires: implement
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and guide the user through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/add_platform`
+
+2. **Identify the starting step** based on intent:
+ - research: Capture CLI configuration and hooks system documentation for the new platform
+ - add_capabilities: Update job schema and adapters with any new hook events the platform supports
+ - implement: Add platform adapter, templates, tests with 100% coverage, and README documentation
+ - verify: Set up platform directories and verify deepwork install works correctly
+
+3. **Guide the user through the workflow**:
+ - Tell the user to run the starting step command
+ - After each step completes, guide them to the next step in the workflow
+ - Continue until the workflow is complete
+
+ Step commands:
+ - research: `/add_platform:uw.research`
+ - add_capabilities: `/add_platform:uw.add_capabilities`
+ - implement: `/add_platform:uw.implement`
+ - verify: `/add_platform:uw.verify`
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Let them choose
+
+**Note**: Gemini CLI requires manual command invocation. After each step completes, remind the user to run the next step command.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/add_platform/job.yml`
+"""
\ No newline at end of file
diff --git a/.gemini/commands/add_platform/add_capabilities.toml b/.gemini/commands/add_platform/uw.add_capabilities.toml
similarity index 100%
rename from .gemini/commands/add_platform/add_capabilities.toml
rename to .gemini/commands/add_platform/uw.add_capabilities.toml
diff --git a/.gemini/commands/add_platform/implement.toml b/.gemini/commands/add_platform/uw.implement.toml
similarity index 100%
rename from .gemini/commands/add_platform/implement.toml
rename to .gemini/commands/add_platform/uw.implement.toml
diff --git a/.gemini/commands/add_platform/research.toml b/.gemini/commands/add_platform/uw.research.toml
similarity index 100%
rename from .gemini/commands/add_platform/research.toml
rename to .gemini/commands/add_platform/uw.research.toml
diff --git a/.gemini/commands/add_platform/verify.toml b/.gemini/commands/add_platform/uw.verify.toml
similarity index 100%
rename from .gemini/commands/add_platform/verify.toml
rename to .gemini/commands/add_platform/uw.verify.toml
diff --git a/.gemini/commands/deepwork_jobs/index.toml b/.gemini/commands/deepwork_jobs/index.toml
new file mode 100644
index 00000000..7ab8bcf4
--- /dev/null
+++ b/.gemini/commands/deepwork_jobs/index.toml
@@ -0,0 +1,71 @@
+# deepwork_jobs
+#
+# DeepWork job management commands
+#
+# Generated by DeepWork - do not edit manually
+
+description = "DeepWork job management commands"
+
+prompt = """
+# deepwork_jobs
+
+You are executing the **deepwork_jobs** job: DeepWork job management commands.
+
+Core commands for managing DeepWork jobs. These commands help you define new multi-step
+workflows and learn from running them.
+
+The `define` command guides you through an interactive process to create a new job by
+asking structured questions about your workflow, understanding each step's inputs and outputs,
+and generating all necessary files.
+
+The `learn` command reflects on conversations where DeepWork jobs were run, identifies
+confusion or inefficiencies, and improves job instructions. It also captures bespoke
+learnings specific to the current run into AGENTS.md files in the working folder.
+
+
+## Available Steps
+
+This job has 3 steps:
+
+### define
+**Define Job Specification**: Create the job.yml specification file by understanding workflow requirements
+- Command: `/deepwork_jobs:uw.define`
+### implement
+**Implement Job Steps**: Generate instruction files for each step based on the job.yml specification
+- Command: `/deepwork_jobs:uw.implement`
+- Requires: define
+### learn
+**Learn from Job Execution**: Reflect on conversation to improve job instructions and capture learnings
+- Command: `/deepwork_jobs:learn`
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and guide the user through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/deepwork_jobs`
+
+2. **Identify the starting step** based on intent:
+ - define: Create the job.yml specification file by understanding workflow requirements
+ - implement: Generate instruction files for each step based on the job.yml specification
+ - learn: Reflect on conversation to improve job instructions and capture learnings
+
+3. **Guide the user through the workflow**:
+ - Tell the user to run the starting step command
+ - After each step completes, guide them to the next step in the workflow
+ - Continue until the workflow is complete
+
+ Step commands:
+ - define: `/deepwork_jobs:uw.define`
+ - implement: `/deepwork_jobs:uw.implement`
+ - learn: `/deepwork_jobs:learn`
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Let them choose
+
+**Note**: Gemini CLI requires manual command invocation. After each step completes, remind the user to run the next step command.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/deepwork_jobs/job.yml`
+"""
\ No newline at end of file
diff --git a/.gemini/commands/deepwork_jobs/learn.toml b/.gemini/commands/deepwork_jobs/learn.toml
index eb33d6df..3cd2aaad 100644
--- a/.gemini/commands/deepwork_jobs/learn.toml
+++ b/.gemini/commands/deepwork_jobs/learn.toml
@@ -351,6 +351,25 @@ Ensure all outputs are:
- Well-formatted and complete
- Ready for review or use by subsequent steps
+## Quality Validation
+
+This step has quality criteria that should be verified before completion.
+
+### Quality Criteria
+
+1. **Conversation Analyzed**: Did the agent review the conversation for DeepWork job executions?
+2. **Confusion Identified**: Did the agent identify points of confusion, errors, or inefficiencies?
+3. **Instructions Improved**: Were job instructions updated to address identified issues?
+4. **Instructions Concise**: Are instructions free of redundancy and unnecessary verbosity?
+5. **Shared Content Extracted**: Is lengthy/duplicated content extracted into referenced files?
+6. **Bespoke Learnings Captured**: Were run-specific learnings added to AGENTS.md?
+7. **File References Used**: Do AGENTS.md entries reference other files where appropriate?
+8. **Working Folder Correct**: Is AGENTS.md in the correct working folder for the job?
+9. **Generalizable Separated**: Are generalizable improvements in instructions, not AGENTS.md?
+10. **Sync Complete**: Has `deepwork sync` been run if instructions were modified?
+
+**Note**: Gemini CLI does not support automated validation hooks. Please manually verify the criteria above before proceeding.
+
## Completion
After completing this step:
diff --git a/.gemini/commands/deepwork_jobs/refine.toml b/.gemini/commands/deepwork_jobs/refine.toml
deleted file mode 100644
index 19523289..00000000
--- a/.gemini/commands/deepwork_jobs/refine.toml
+++ /dev/null
@@ -1,538 +0,0 @@
-# deepwork_jobs:refine
-#
-# Modify an existing job definition
-#
-# Generated by DeepWork - do not edit manually
-
-description = "Modify an existing job definition"
-
-prompt = """
-# deepwork_jobs:refine
-
-**Standalone command** in the **deepwork_jobs** job - can be run anytime
-
-**Summary**: DeepWork job management commands
-
-## Job Overview
-
-Core commands for managing DeepWork jobs. These commands help you define new multi-step
-workflows and refine existing ones.
-
-The `define` command guides you through an interactive process to create a new job by
-asking detailed questions about your workflow, understanding each step's inputs and outputs,
-and generating all necessary files.
-
-The `refine` command helps you modify existing jobs safely by understanding what you want
-to change, validating the impact, and ensuring consistency across your workflow.
-
-
-
-## Instructions
-
-# Refine Existing DeepWork Job
-
-## Objective
-
-Help the user modify an existing DeepWork job definition by understanding what they want to change and ensuring the modifications maintain consistency and validity.
-
-## Task
-
-Guide the user through refining a job by first understanding their existing job, then what they want to change, and finally implementing those changes safely.
-
-### Step 1: Select and Load Job
-
-1. **List available jobs**
- - Scan `.deepwork/jobs/` directory for installed jobs
- - Display installed jobs with versions and descriptions
- - Ask which job to refine
-
-2. **Load job definition**
- - Read `.deepwork/jobs/[job_name]/job.yml`
- - Parse and understand the current structure
-
-3. **Show current structure**
- - Display all steps with their names and descriptions
- - Show the dependency flow
- - Highlight key inputs and outputs
-
-### Step 2: Understand Desired Changes
-
-Ask the user what they want to change:
-
-**Change Options:**
-1. Add a new step to the workflow
-2. Modify step instructions
-3. Change step inputs/outputs
-4. Update dependencies between steps
-5. Update job metadata (description, version)
-6. Remove a step
-7. Add or modify stop hooks (quality validation)
-
-**For each change, ask clarifying questions:**
-- Why do they want to make this change?
-- How will it improve the workflow?
-- Are there dependencies or side effects to consider?
-
-### Step 3: Make Changes Safely
-
-Based on the user's selection:
-
-#### Adding a Step
-
-1. **Gather step details** (same as define command)
- - What does this step accomplish?
- - What are the inputs? User parameters or file inputs?
- - What outputs does it produce?
- - What are the dependencies?
-
-2. **Determine placement**
- - Where should it go in the workflow?
- - Before which existing step?
- - Or at the end?
-
-3. **Validate placement**
- - Can't depend on later steps if inserted early
- - File inputs must come from dependencies
- - Check for circular dependencies
-
-4. **Update files**
- - Update `job.yml` with new step
- - Create step instructions file in `steps/[step_id].md`
- - Prepare changelog entry describing the addition
-
-#### Modifying Step Instructions
-
-1. **Select step to modify**
- - Show list of steps
- - Ask which one to modify
-
-2. **Understand the change**
- - What's not working with current instructions?
- - What should be different?
- - Show current instructions
-
-3. **Update instructions**
- - Modify `.deepwork/jobs/[job_name]/steps/[step_id].md`
- - Keep the same structure (Objective, Task, Process, Output Format, Quality Criteria)
- - Prepare changelog entry describing the modification
-
-#### Changing Inputs/Outputs
-
-1. **Select step to modify**
-2. **Show current inputs and outputs**
-3. **Understand the change**
- - Adding or removing?
- - Why is this needed?
-
-4. **Validate impact**
- - If removing output: check if other steps depend on it (BREAKING CHANGE)
- - If adding file input: ensure from_step is in dependencies
- - If removing input: ensure it's not critical
-
-5. **Update job.yml**
- - Prepare changelog entry describing the input/output changes
-
-#### Updating Dependencies
-
-1. **Select step to modify**
-2. **Show current dependency graph**
-3. **Understand the change**
- - Adding or removing dependency?
- - Why?
-
-4. **Validate**
- - Check for circular dependencies
- - Ensure all file inputs have matching dependencies
- - Ensure dependency chain makes logical sense
-
-5. **Update job.yml**
- - Prepare changelog entry describing the dependency changes
-
-#### Updating Metadata
-
-1. **Ask what to change**
- - Description?
- - Version?
-
-2. **If version change, explain semantic versioning**
- - Major (x.0.0): Breaking changes (removing steps, removing outputs)
- - Minor (0.x.0): New features, backwards compatible (adding steps)
- - Patch (0.0.x): Bug fixes, improvements
-
-3. **Update job.yml**
- - Prepare changelog entry describing the metadata changes
-
-#### Removing a Step
-
-1. **Select step to remove**
-
-2. **CRITICAL: Validate safety**
- - Check if other steps depend on this step
- - Check if other steps use outputs from this step
- - If dependencies exist: **WARN** and suggest updating dependents first
- - This is a BREAKING CHANGE - requires major version bump
-
-3. **If safe to remove**
- - Remove from `job.yml`
- - Delete step instructions file
- - Suggest version bump
- - Prepare changelog entry describing the removal
-
-#### Adding or Modifying Stop Hooks
-
-Stop hooks provide quality validation loops that ensure step outputs meet criteria before completing.
-
-1. **Select step to modify**
- - Show list of steps
- - Ask which one to add/modify hooks for
-
-2. **Understand the need**
- - What quality criteria should be validated?
- - Is the output subjective (use prompt hook) or objective (use script hook)?
- - Should validation happen automatically or only on specific conditions?
-
-3. **Choose hook type**
-
- **Prompt hooks** (recommended for most cases):
- - Best for subjective quality criteria
- - AI evaluates the output against criteria
- - Example: "Verify the report is comprehensive and well-organized"
- ```yaml
- stop_hooks:
- - prompt: |
- Verify the output meets criteria:
- 1. Contains all required sections
- 2. Analysis is thorough
- 3. Recommendations are actionable
- ```
-
- **Prompt file hooks**:
- - For reusable or complex validation criteria
- - Stores criteria in a separate markdown file
- ```yaml
- stop_hooks:
- - prompt_file: hooks/quality_check.md
- ```
-
- **Script hooks**:
- - For objective, programmatic validation
- - Best for tests, linting, format checking
- ```yaml
- stop_hooks:
- - script: hooks/run_tests.sh
- ```
-
-4. **Multiple hooks can be combined**
- ```yaml
- stop_hooks:
- - script: hooks/lint.sh # First: objective checks
- - prompt: "Verify content quality" # Then: subjective review
- ```
-
-5. **Update files**
- - Add/modify `stop_hooks` array in job.yml
- - Create hook files if using prompt_file or script types
- - Update step instructions to match quality criteria
- - Prepare changelog entry
-
-6. **Encourage prompt-based hooks**
- - They leverage the AI's ability to understand context
- - More flexible than rigid script checks
- - Can evaluate nuanced quality criteria
-
-### Step 4: Update Changelog
-
-After making changes to the job.yml:
-
-1. **Add a changelog entry**
- - Add a new entry to the `changelog` array in the job.yml
- - Use the new version number
- - List all changes made in this refinement
-
-2. **Changelog entry format**:
- ```yaml
- - version: "[new_version]"
- changes: "[Description of all changes in this version]"
- ```
-
-3. **Example changelog entries**:
- - "Added step: validate_positioning"
- - "Modified step instructions for research_competitors to improve clarity and add quality criteria"
- - "Removed step: duplicate_analysis (consolidated into comparative_analysis)"
- - "Updated dependencies: positioning_recommendations now depends on validate_positioning"
- - "Changed output filename: comparison_matrix.md → comparison_table.md"
- - "Added step: validate_positioning; Updated dependencies for positioning_recommendations"
-
-### Step 5: Validate Changes
-
-After updating the changelog:
-
-1. **Review the updated structure**
- - Show the complete updated workflow
- - Highlight what changed
- - Check for consistency
-
-2. **Validate job definition**
- - No circular dependencies
- - All file inputs match dependencies
- - All step IDs unique
- - All outputs defined
-
-3. **Confirm version update**
- - Ensure version was bumped appropriately
- - Breaking changes? Major version
- - New features? Minor version
- - Improvements? Patch version
-
-### Step 6: Sync and Reload
-
-1. **Run sync**
- ```bash
- deepwork sync
- ```
-
-2. **Reload commands**
- - Instruct user to follow the reload instructions shown by `deepwork sync`
-
-3. **Provide summary**
- - Show what changed
- - List updated files
- - Display the changelog entry
- - Explain next steps
-
-## Safe Modification Patterns
-
-Guide users toward safe changes:
-
-### Safe Changes (Always OK)
-- Adding steps at the end with new outputs
-- Updating step instructions (content only)
-- Changing job description
-- Adding user inputs to a step
-- Adding outputs (not removing)
-- Patch version bumps
-
-### Potentially Breaking Changes (Warn User)
-- Removing steps
-- Removing outputs
-- Removing inputs
-- Changing dependencies
-- Renaming outputs (other steps may reference them)
-- Requires major or minor version bump
-
-### Always Validate
-- Circular dependencies after any dependency change
-- File inputs match dependencies after any change
-- Step IDs remain unique
-- At least one output per step
-
-## Example Dialog
-
-```
-Claude: I'll help you refine an existing job. Here are your installed jobs:
-
-1. competitive_research v1.0.0 - Systematic competitive analysis workflow
-2. blog_post v1.0.0 - Blog post creation workflow
-
-Which job would you like to refine?
-
-User: 1
-
-Claude: Loading competitive_research v1.0.0...
-
-Current structure:
-- Step 1: identify_competitors - Identify 5-7 competitors in target market
-- Step 2: research_competitors - Deep research on each competitor
-- Step 3: comparative_analysis - Side-by-side comparison matrix
-- Step 4: positioning_recommendations - Strategic positioning report
-
-What would you like to change?
-1. Add a new step
-2. Modify step instructions
-3. Change inputs/outputs
-4. Update dependencies
-5. Update job metadata
-6. Remove a step
-
-User: 1
-
-Claude: Great! Let's add a new step. First, help me understand:
-
-1. What should this new step accomplish?
-2. Where should it fit in the workflow?
- - Should it go between existing steps or at the end?
-
-User: I want to add a step that validates our product positioning before the final recommendations. It should go between step 3 and step 4.
-
-Claude: Excellent idea. So this would be:
-- Step 4: validate_positioning
-- Step 5: positioning_recommendations (renumbered)
-
-For this validation step:
-1. What inputs does it need? I assume it needs the comparison_matrix.md from step 3?
-2. What should it output?
-3. What exactly should this validation check for?
-
-User: Yes, it needs the comparison matrix. It should output a validation_report.md that identifies gaps in our positioning.
-
-Claude: Perfect. Let me confirm the details:
-
-**New Step: validate_positioning**
-- Name: "Validate Product Positioning"
-- Description: "Identify gaps and opportunities in current positioning"
-- Inputs: comparison_matrix.md (from comparative_analysis step)
-- Outputs: validation_report.md
-- Dependencies: comparative_analysis
-- Position: Between step 3 and 4
-
-Should the positioning_recommendations step also use the validation_report.md as input?
-
-User: Yes, definitely
-
-Claude: Great! So I'll also update the positioning_recommendations step to include validation_report.md as an input and add validate_positioning as a dependency.
-
-This is a new feature, so I'll bump the version to 1.1.0 (minor version).
-
-Creating the new step... ✓
-
-Updated files:
-- .deepwork/jobs/competitive_research/job.yml (added step, updated dependencies, version → 1.1.0, updated changelog)
-- .deepwork/jobs/competitive_research/steps/validate_positioning.md (created)
-
-Changelog entry added:
-```yaml
-- version: "1.1.0"
- changes: "Added step: validate_positioning between comparative_analysis and positioning_recommendations; Updated dependencies for positioning_recommendations"
-```
-
-Now run:
-```bash
-deepwork sync
-```
-
-Then follow the reload instructions shown by `deepwork sync` and you'll have:
-- /competitive_research.validate_positioning (new!)
-- All other commands updated
-```
-
-## Validation Rules
-
-Enforce the same rules as job definition:
-- Job name: lowercase, underscores
-- Version: semantic versioning
-- Step IDs: unique within job
-- Dependencies: must reference existing step IDs
-- File inputs: `from_step` must be in dependencies
-- At least one output per step
-- No circular dependencies
-
-## Error Handling
-
-If issues arise, provide clear guidance:
-- **Dependency conflict**: "Step X depends on step Y which you're trying to remove. You must update step X first, or remove both steps."
-- **Circular dependency**: "Adding this dependency would create a cycle: A → B → C → A. Please choose a different dependency structure."
-- **Missing file input**: "Step X requires file.md from step Y, but Y is not in its dependencies. I'll add Y to the dependencies."
-- **Breaking change**: "Removing this output is a breaking change. Other steps depend on it. I recommend against this change unless you update the dependent steps first."
-
-## Changelog Entry Format
-
-Instead of creating a separate refinement_summary.md file, add the changes directly to the job.yml changelog section. This creates a permanent version history within the job definition itself.
-
-**Location**: `.deepwork/jobs/[job_name]/job.yml`
-
-**Add to the `changelog` array**:
-
-```yaml
-changelog:
- - version: "1.0.0"
- changes: "Initial job creation"
- - version: "[new_version]"
- changes: "[Concise description of all changes in this version]"
-```
-
-**Guidelines for changelog entries**:
-- Be concise but descriptive
-- Use action verbs (Added, Modified, Removed, Updated, Changed, Fixed)
-- Reference specific step names when relevant
-- For breaking changes, prefix with "BREAKING:"
-- If multiple changes, separate with semicolons or use clear phrasing
-
-**Examples**:
-- "Added step: validate_positioning between comparative_analysis and positioning_recommendations"
-- "Modified step instructions for research_competitors to improve clarity and add quality criteria"
-- "Removed step: duplicate_analysis (consolidated into comparative_analysis)"
-- "Updated dependencies: positioning_recommendations now depends on validate_positioning"
-- "Changed output filename: comparison_matrix.md → comparison_table.md"
-- "BREAKING: Removed output file shared_data.json from identify_competitors step"
-- "Fixed circular dependency between steps A and B"
-- "Updated job description to reflect new validation phase"
-- "Added validate_positioning step; Updated dependencies for positioning_recommendations"
-
-## Quality Criteria
-
-- Changes maintain job consistency
-- Dependencies are logically valid
-- Version bump follows semantic versioning
-- No circular dependencies introduced
-- User understands impact of changes
-- Breaking changes are clearly communicated
-
-
-## Inputs
-
-### User Parameters
-
-Please gather the following information from the user:
-- **job_name**: Name of the job to refine
-
-
-## Work Branch Management
-
-All work for this job should be done on a dedicated work branch:
-
-1. **Check current branch**:
- - If already on a work branch for this job (format: `deepwork/deepwork_jobs-[instance]-[date]`), continue using it
- - If on main/master, create a new work branch
-
-2. **Create work branch** (if needed):
- ```bash
- git checkout -b deepwork/deepwork_jobs-[instance]-$(date +%Y%m%d)
- ```
- Replace `[instance]` with a descriptive identifier (e.g., `acme`, `q1-launch`, etc.)
-
-## Output Requirements
-
-Create the following output(s):
-- `job.yml`
-
-Ensure all outputs are:
-- Well-formatted and complete
-- Ready for review or use by subsequent steps
-
-## Completion
-
-After completing this step:
-
-1. **Verify outputs**: Confirm all required files have been created
-
-2. **Inform the user**:
- - The refine command is complete
- - Outputs created: job.yml
- - This command can be run again anytime to make further changes
-
-## Command Complete
-
-This is a standalone command that can be run anytime. The outputs are ready for use.
-
-Consider:
-- Reviewing the outputs
-- Running `deepwork sync` if job definitions were changed
-- Re-running this command later if further changes are needed
-
----
-
-## Context Files
-
-- Job definition: `.deepwork/jobs/deepwork_jobs/job.yml`
-- Step instructions: `.deepwork/jobs/deepwork_jobs/steps/refine.md`
-"""
\ No newline at end of file
diff --git a/.gemini/commands/deepwork_jobs/define.toml b/.gemini/commands/deepwork_jobs/uw.define.toml
similarity index 93%
rename from .gemini/commands/deepwork_jobs/define.toml
rename to .gemini/commands/deepwork_jobs/uw.define.toml
index d59b3e15..13784b61 100644
--- a/.gemini/commands/deepwork_jobs/define.toml
+++ b/.gemini/commands/deepwork_jobs/uw.define.toml
@@ -394,6 +394,23 @@ Ensure all outputs are:
- Well-formatted and complete
- Ready for review or use by subsequent steps
+## Quality Validation
+
+This step has quality criteria that should be verified before completion.
+
+### Quality Criteria
+
+1. **User Understanding**: Did the agent fully understand the user's workflow by asking structured questions?
+2. **Structured Questions Used**: Did the agent ask structured questions (using the AskUserQuestion tool) to gather user input?
+3. **Clear Inputs/Outputs**: Does every step have clearly defined inputs and outputs?
+4. **Logical Dependencies**: Do step dependencies make sense and avoid circular references?
+5. **Concise Summary**: Is the summary under 200 characters and descriptive?
+6. **Rich Description**: Does the description provide enough context for future refinement?
+7. **Valid Schema**: Does the job.yml follow the required schema (name, version, summary, steps)?
+8. **File Created**: Has the job.yml file been created in `.deepwork/jobs/[job_name]/job.yml`?
+
+**Note**: Gemini CLI does not support automated validation hooks. Please manually verify the criteria above before proceeding.
+
## Completion
After completing this step:
diff --git a/.gemini/commands/deepwork_jobs/implement.toml b/.gemini/commands/deepwork_jobs/uw.implement.toml
similarity index 91%
rename from .gemini/commands/deepwork_jobs/implement.toml
rename to .gemini/commands/deepwork_jobs/uw.implement.toml
index 4cc5a989..4170a65f 100644
--- a/.gemini/commands/deepwork_jobs/implement.toml
+++ b/.gemini/commands/deepwork_jobs/uw.implement.toml
@@ -309,6 +309,24 @@ Ensure all outputs are:
- Well-formatted and complete
- Ready for review or use by subsequent steps
+## Quality Validation
+
+This step has quality criteria that should be verified before completion.
+
+### Quality Criteria
+
+1. **Directory Structure**: Is `.deepwork/jobs/[job_name]/` created correctly?
+2. **Complete Instructions**: Are ALL step instruction files complete (not stubs or placeholders)?
+3. **Specific & Actionable**: Are instructions tailored to each step's purpose, not generic?
+4. **Output Examples**: Does each instruction file show what good output looks like?
+5. **Quality Criteria**: Does each instruction file define quality criteria for its outputs?
+6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"?
+7. **Sync Complete**: Has `deepwork sync` been run successfully?
+8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`?
+9. **Rules Considered**: Has the agent thought about whether rules would benefit this job? If relevant rules were identified, did they explain them and offer to run `/deepwork_rules.define`? Not every job needs rules; only suggest them when genuinely helpful.
+
+**Note**: Gemini CLI does not support automated validation hooks. Please manually verify the criteria above before proceeding.
+
## Completion
After completing this step:
diff --git a/.gemini/commands/deepwork_rules/index.toml b/.gemini/commands/deepwork_rules/index.toml
new file mode 100644
index 00000000..b3c6a0f1
--- /dev/null
+++ b/.gemini/commands/deepwork_rules/index.toml
@@ -0,0 +1,65 @@
+# deepwork_rules
+#
+# Rules enforcement for AI agent sessions
+#
+# Generated by DeepWork - do not edit manually
+
+description = "Rules enforcement for AI agent sessions"
+
+prompt = """
+# deepwork_rules
+
+You are executing the **deepwork_rules** job. Rules enforcement for AI agent sessions
+
+Manages rules that automatically trigger when certain files change during an AI agent session.
+Rules help ensure that code changes follow team guidelines, documentation is updated,
+and architectural decisions are respected.
+
+Rules are stored as individual markdown files with YAML frontmatter in the `.deepwork/rules/`
+directory. Each rule file specifies:
+- Detection mode: trigger/safety, set (bidirectional), or pair (directional)
+- Patterns: Glob patterns for matching files, with optional variable capture
+- Instructions: Markdown content describing what the agent should do
+
+Example use cases:
+- Update installation docs when configuration files change
+- Require security review when authentication code is modified
+- Ensure API documentation stays in sync with API code
+- Enforce source/test file pairing
+
+
+## Available Steps
+
+This job has 1 step(s):
+
+### define
+**Define Rule**: Create a new rule file in .deepwork/rules/
+- Command: `/deepwork_rules:uw.define`
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and guide the user through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/deepwork_rules`
+
+2. **Identify the starting step** based on intent:
+ - define: Create a new rule file in .deepwork/rules/
+
+3. **Guide the user through the workflow**:
+ - Tell the user to run the starting step command
+ - After each step completes, guide them to the next step in the workflow
+ - Continue until the workflow is complete
+
+ Step commands:
+ - define: `/deepwork_rules:uw.define`
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Let them choose
+
+**Note**: Gemini CLI requires manual command invocation. After each step completes, remind the user to run the next step command.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/deepwork_rules/job.yml`
+"""
\ No newline at end of file
diff --git a/.gemini/commands/deepwork_rules/define.toml b/.gemini/commands/deepwork_rules/uw.define.toml
similarity index 100%
rename from .gemini/commands/deepwork_rules/define.toml
rename to .gemini/commands/deepwork_rules/uw.define.toml
diff --git a/.gemini/commands/update/index.toml b/.gemini/commands/update/index.toml
new file mode 100644
index 00000000..f564b188
--- /dev/null
+++ b/.gemini/commands/update/index.toml
@@ -0,0 +1,63 @@
+# update
+#
+# Update standard jobs in src/ and sync to installed locations
+#
+# Generated by DeepWork - do not edit manually
+
+description = "Update standard jobs in src/ and sync to installed locations"
+
+prompt = """
+# update
+
+You are executing the **update** job. Update standard jobs in src/ and sync to installed locations
+
+A workflow for maintaining standard jobs bundled with DeepWork. Standard jobs
+(like `deepwork_jobs` and `deepwork_rules`) are source-controlled in
+`src/deepwork/standard_jobs/` and must be edited there—never in `.deepwork/jobs/`
+or `.claude/commands/` directly.
+
+This job guides you through:
+1. Identifying which standard job(s) to update from conversation context
+2. Making changes in the correct source location (`src/deepwork/standard_jobs/[job_name]/`)
+3. Running `deepwork install` to propagate changes to `.deepwork/` and command directories
+4. Verifying the sync completed successfully
+
+Use this job whenever you need to modify job.yml files, step instructions, or hooks
+for any standard job in the DeepWork repository.
+
+
+## Available Steps
+
+This job has 1 step(s):
+
+### job
+**Update Standard Job**: Edit standard job source files and sync to installed locations
+- Command: `/update:uw.job`
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and guide the user through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/update`
+
+2. **Identify the starting step** based on intent:
+ - job: Edit standard job source files and sync to installed locations
+
+3. **Guide the user through the workflow**:
+ - Tell the user to run the starting step command
+ - After each step completes, guide them to the next step in the workflow
+ - Continue until the workflow is complete
+
+ Step commands:
+ - job: `/update:uw.job`
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Let them choose
+
+**Note**: Gemini CLI requires manual command invocation. After each step completes, remind the user to run the next step command.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/update/job.yml`
+"""
\ No newline at end of file
diff --git a/.gemini/commands/update/job.toml b/.gemini/commands/update/uw.job.toml
similarity index 100%
rename from .gemini/commands/update/job.toml
rename to .gemini/commands/update/uw.job.toml
diff --git a/manual_tests/test_multi_safety/test_multi_safety.py b/manual_tests/test_multi_safety/test_multi_safety.py
index 40cd981c..27734025 100644
--- a/manual_tests/test_multi_safety/test_multi_safety.py
+++ b/manual_tests/test_multi_safety/test_multi_safety.py
@@ -30,7 +30,6 @@
.deepwork/rules/manual-test-multi-safety.md
"""
-
VERSION = "1.0.0"
diff --git a/src/deepwork/core/adapters.py b/src/deepwork/core/adapters.py
index 9e8e27cd..0aa6c27a 100644
--- a/src/deepwork/core/adapters.py
+++ b/src/deepwork/core/adapters.py
@@ -56,6 +56,7 @@ class AgentAdapter(ABC):
config_dir: ClassVar[str]
commands_dir: ClassVar[str] = "commands"
command_template: ClassVar[str] = "command-job-step.md.jinja"
+ meta_command_template: ClassVar[str] = "command-job-meta.md.jinja"
# Instructions for reloading commands after sync (shown to users)
# Subclasses should override with platform-specific instructions.
@@ -153,20 +154,38 @@ def get_commands_dir(self, project_root: Path | None = None) -> Path:
raise AdapterError("No project root specified")
return root / self.config_dir / self.commands_dir
- def get_command_filename(self, job_name: str, step_id: str) -> str:
+ def get_meta_command_filename(self, job_name: str) -> str:
"""
- Get the filename for a command.
+ Get the filename for a job's meta-command.
+ The meta-command is the primary user interface for a job.
+ Can be overridden for different file formats.
+
+ Args:
+ job_name: Name of the job
+
+ Returns:
+ Meta-command filename (e.g., "job_name.md")
+ """
+ return f"{job_name}.md"
+
+ def get_step_command_filename(self, job_name: str, step_id: str, exposed: bool = False) -> str:
+ """
+ Get the filename for a step command.
+
+ Step commands are hidden by default (uw. prefix) unless exposed=True.
Can be overridden for different file formats (e.g., TOML for Gemini).
Args:
job_name: Name of the job
step_id: ID of the step
+ exposed: If True, command is visible (no uw. prefix). Default: False.
Returns:
- Command filename (e.g., "job_name.step_id.md")
+ Command filename (e.g., "uw.job_name.step_id.md" or "job_name.step_id.md" if exposed)
"""
- return f"{job_name}.{step_id}.md"
+ prefix = "" if exposed else "uw."
+ return f"{prefix}{job_name}.{step_id}.md"
def detect(self, project_root: Path | None = None) -> bool:
"""
@@ -346,6 +365,7 @@ class GeminiAdapter(AgentAdapter):
display_name = "Gemini CLI"
config_dir = ".gemini"
command_template = "command-job-step.toml.jinja"
+ meta_command_template = "command-job-meta.toml.jinja"
# Gemini CLI can reload with /memory refresh
reload_instructions: ClassVar[str] = (
@@ -356,21 +376,39 @@ class GeminiAdapter(AgentAdapter):
# Hooks are global/project-level in settings.json, not per-command
hook_name_mapping: ClassVar[dict[CommandLifecycleHook, str]] = {}
- def get_command_filename(self, job_name: str, step_id: str) -> str:
+ def get_meta_command_filename(self, job_name: str) -> str:
+ """
+ Get the filename for a Gemini job's meta-command.
+
+ Gemini uses TOML files and colon namespacing via subdirectories.
+ For job "my_job", creates: my_job/index.toml
+
+ Args:
+ job_name: Name of the job
+
+ Returns:
+ Meta-command filename path (e.g., "my_job/index.toml")
+ """
+ return f"{job_name}/index.toml"
+
+ def get_step_command_filename(self, job_name: str, step_id: str, exposed: bool = False) -> str:
"""
- Get the filename for a Gemini command.
+ Get the filename for a Gemini step command.
Gemini uses TOML files and colon namespacing via subdirectories.
- For job "my_job" and step "step_one", creates: my_job/step_one.toml
+ Step commands are hidden by default (uw. prefix) unless exposed=True.
+ For job "my_job" and step "step_one", creates: my_job/uw.step_one.toml
Args:
job_name: Name of the job
step_id: ID of the step
+ exposed: If True, command is visible (no uw. prefix). Default: False.
Returns:
- Command filename path (e.g., "my_job/step_one.toml")
+ Command filename path (e.g., "my_job/uw.step_one.toml" or "my_job/step_one.toml" if exposed)
"""
- return f"{job_name}/{step_id}.toml"
+ prefix = "" if exposed else "uw."
+ return f"{job_name}/{prefix}{step_id}.toml"
def sync_hooks(self, project_path: Path, hooks: dict[str, list[dict[str, Any]]]) -> int:
"""
diff --git a/src/deepwork/core/generator.py b/src/deepwork/core/generator.py
index 380ab5b5..8531f76c 100644
--- a/src/deepwork/core/generator.py
+++ b/src/deepwork/core/generator.py
@@ -199,8 +199,115 @@ def _build_step_context(
"is_standalone": is_standalone,
"hooks": hooks, # New: all hooks by platform event name
"stop_hooks": stop_hooks, # Backward compat: after_agent hooks only
+ "quality_criteria": step.quality_criteria, # Declarative criteria with framing
}
+ def _build_meta_command_context(
+ self, job: JobDefinition, adapter: AgentAdapter
+ ) -> dict[str, Any]:
+ """
+ Build template context for a job's meta-command.
+
+ Args:
+ job: Job definition
+ adapter: Agent adapter for platform-specific configuration
+
+ Returns:
+ Template context dictionary
+ """
+ # Build step info for the meta-command
+ steps_info = []
+ for step in job.steps:
+ command_filename = adapter.get_step_command_filename(job.name, step.id, step.exposed)
+ # Extract just the command name (without path and extension)
+ # For Claude: uw.job_name.step_id.md -> uw.job_name.step_id
+ # For Gemini: job_name/uw.step_id.toml -> job_name:uw.step_id
+ if adapter.name == "gemini":
+ # Gemini uses colon for namespacing: job_name:step_id or job_name:uw.step_id
+ parts = command_filename.replace(".toml", "").split("/")
+ command_name = ":".join(parts)
+ else:
+ # Claude uses dot for namespacing: uw.job_name.step_id
+ command_name = command_filename.replace(".md", "")
+
+ steps_info.append(
+ {
+ "id": step.id,
+ "name": step.name,
+ "description": step.description,
+ "command_name": command_name,
+ "dependencies": step.dependencies,
+ "exposed": step.exposed,
+ }
+ )
+
+ return {
+ "job_name": job.name,
+ "job_version": job.version,
+ "job_summary": job.summary,
+ "job_description": job.description,
+ "total_steps": len(job.steps),
+ "steps": steps_info,
+ }
+
+ def generate_meta_command(
+ self,
+ job: JobDefinition,
+ adapter: AgentAdapter,
+ output_dir: Path | str,
+ ) -> Path:
+ """
+ Generate the meta-command file for a job.
+
+ The meta-command is the primary user interface for a job, routing
+ user intent to the appropriate step.
+
+ Args:
+ job: Job definition
+ adapter: Agent adapter for the target platform
+ output_dir: Directory to write command file to
+
+ Returns:
+ Path to generated meta-command file
+
+ Raises:
+ GeneratorError: If generation fails
+ """
+ output_dir = Path(output_dir)
+
+ # Create commands subdirectory if needed
+ commands_dir = output_dir / adapter.commands_dir
+ commands_dir.mkdir(parents=True, exist_ok=True)
+
+ # Build context
+ context = self._build_meta_command_context(job, adapter)
+
+ # Load and render template
+ env = self._get_jinja_env(adapter)
+ try:
+ template = env.get_template(adapter.meta_command_template)
+ except TemplateNotFound as e:
+ raise GeneratorError(f"Meta-command template not found: {e}") from e
+
+ try:
+ rendered = template.render(**context)
+ except Exception as e:
+ raise GeneratorError(f"Meta-command template rendering failed: {e}") from e
+
+ # Write meta-command file
+ command_filename = adapter.get_meta_command_filename(job.name)
+ command_path = commands_dir / command_filename
+
+ # Ensure parent directories exist (for Gemini's job_name/index.toml structure)
+ command_path.parent.mkdir(parents=True, exist_ok=True)
+
+ try:
+ safe_write(command_path, rendered)
+ except Exception as e:
+ raise GeneratorError(f"Failed to write meta-command file: {e}") from e
+
+ return command_path
+
def generate_step_command(
self,
job: JobDefinition,
@@ -250,10 +357,13 @@ def generate_step_command(
except Exception as e:
raise GeneratorError(f"Template rendering failed: {e}") from e
- # Write command file
- command_filename = adapter.get_command_filename(job.name, step.id)
+ # Write command file (hidden by default unless step.exposed is True)
+ command_filename = adapter.get_step_command_filename(job.name, step.id, step.exposed)
command_path = commands_dir / command_filename
+ # Ensure parent directories exist (for Gemini's job_name/step_id.toml structure)
+ command_path.parent.mkdir(parents=True, exist_ok=True)
+
try:
safe_write(command_path, rendered)
except Exception as e:
@@ -268,7 +378,7 @@ def generate_all_commands(
output_dir: Path | str,
) -> list[Path]:
"""
- Generate slash-command files for all steps in a job.
+ Generate all command files for a job: meta-command and step commands.
Args:
job: Job definition
@@ -276,13 +386,18 @@ def generate_all_commands(
output_dir: Directory to write command files to
Returns:
- List of paths to generated command files
+ List of paths to generated command files (meta-command first, then steps)
Raises:
GeneratorError: If generation fails
"""
command_paths = []
+ # Generate meta-command first (job-level entry point)
+ meta_command_path = self.generate_meta_command(job, adapter, output_dir)
+ command_paths.append(meta_command_path)
+
+ # Generate step commands (hidden by default unless step.exposed is True)
for step in job.steps:
command_path = self.generate_step_command(job, step, adapter, output_dir)
command_paths.append(command_path)
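The filename-to-command-name mapping in `_build_meta_command_context` above can be sketched as follows (an assumed simplification of the branch in the diff):

```python
def command_name_from_filename(filename: str, platform: str) -> str:
    """Derive the slash-command name from a generated command filename."""
    if platform == "gemini":
        # Gemini namespaces with colons: job_name/uw.step.toml -> job_name:uw.step
        return ":".join(filename.removesuffix(".toml").split("/"))
    # Claude namespaces with dots: uw.job_name.step.md -> uw.job_name.step
    return filename.removesuffix(".md")
```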
diff --git a/src/deepwork/core/parser.py b/src/deepwork/core/parser.py
index 42fba818..3585d02e 100644
--- a/src/deepwork/core/parser.py
+++ b/src/deepwork/core/parser.py
@@ -108,6 +108,12 @@ class Step:
# Event names: after_agent, before_tool, before_prompt
hooks: dict[str, list[HookAction]] = field(default_factory=dict)
+ # If true, step command is visible (no uw. prefix). Default: false (hidden).
+ exposed: bool = False
+
+ # Declarative quality criteria rendered with standard evaluation framing
+ quality_criteria: list[str] = field(default_factory=list)
+
@property
def stop_hooks(self) -> list[HookAction]:
"""
@@ -144,6 +150,8 @@ def from_dict(cls, data: dict[str, Any]) -> "Step":
outputs=data["outputs"],
dependencies=data.get("dependencies", []),
hooks=hooks,
+ exposed=data.get("exposed", False),
+ quality_criteria=data.get("quality_criteria", []),
)
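The new `Step` fields default safely when absent from `job.yml`; a stripped-down sketch of the parsing behavior (illustrative, not the full `Step` class):

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Step:
    id: str
    # Hidden by default: command files get the uw. prefix unless exposed is true
    exposed: bool = False
    # Declarative criteria rendered into the Stop-hook prompt
    quality_criteria: list[str] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "Step":
        return cls(
            id=data["id"],
            exposed=data.get("exposed", False),
            quality_criteria=data.get("quality_criteria", []),
        )
```

Both keys are optional in the schema, so existing job definitions parse unchanged.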
diff --git a/src/deepwork/schemas/job_schema.py b/src/deepwork/schemas/job_schema.py
index 4127abba..29c4df1f 100644
--- a/src/deepwork/schemas/job_schema.py
+++ b/src/deepwork/schemas/job_schema.py
@@ -203,6 +203,19 @@
"description": "DEPRECATED: Use hooks.after_agent instead. Stop hooks for quality validation loops.",
"items": HOOK_ACTION_SCHEMA,
},
+ "exposed": {
+ "type": "boolean",
+ "description": "If true, step command is visible (no uw. prefix). Default: false (hidden).",
+ "default": False,
+ },
+ "quality_criteria": {
+ "type": "array",
+ "description": "Declarative quality criteria. Rendered with standard evaluation framing.",
+ "items": {
+ "type": "string",
+ "minLength": 1,
+ },
+ },
},
"additionalProperties": False,
},
diff --git a/src/deepwork/standard_jobs/deepwork_jobs/job.yml b/src/deepwork/standard_jobs/deepwork_jobs/job.yml
index e95aa2c0..d311ab29 100644
--- a/src/deepwork/standard_jobs/deepwork_jobs/job.yml
+++ b/src/deepwork/standard_jobs/deepwork_jobs/job.yml
@@ -36,22 +36,15 @@ steps:
outputs:
- job.yml
dependencies: []
- hooks:
- after_agent:
- - prompt: |
- Verify the job.yml output meets ALL quality criteria before completing:
-
- 1. **User Understanding**: Did you fully understand the user's workflow by asking structured questions?
- 2. **Structured Questions Used**: Did you ask structured questions (using the AskUserQuestion tool) to gather user input?
- 3. **Clear Inputs/Outputs**: Does every step have clearly defined inputs and outputs?
- 4. **Logical Dependencies**: Do step dependencies make sense and avoid circular references?
- 5. **Concise Summary**: Is the summary under 200 characters and descriptive?
- 6. **Rich Description**: Does the description provide enough context for future refinement?
- 7. **Valid Schema**: Does the job.yml follow the required schema (name, version, summary, steps)?
- 8. **File Created**: Has the job.yml file been created in `.deepwork/jobs/[job_name]/job.yml`?
-
- If ANY criterion is not met, continue working to address it.
- If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
+ quality_criteria:
+ - "**User Understanding**: Did the agent fully understand the user's workflow by asking structured questions?"
+ - "**Structured Questions Used**: Did the agent ask structured questions (using the AskUserQuestion tool) to gather user input?"
+ - "**Clear Inputs/Outputs**: Does every step have clearly defined inputs and outputs?"
+ - "**Logical Dependencies**: Do step dependencies make sense and avoid circular references?"
+ - "**Concise Summary**: Is the summary under 200 characters and descriptive?"
+ - "**Rich Description**: Does the description provide enough context for future refinement?"
+ - "**Valid Schema**: Does the job.yml follow the required schema (name, version, summary, steps)?"
+ - "**File Created**: Has the job.yml file been created in `.deepwork/jobs/[job_name]/job.yml`?"
- id: implement
name: "Implement Job Steps"
@@ -64,51 +57,36 @@ steps:
- steps/
dependencies:
- define
- hooks:
- after_agent:
- - prompt: |
- Verify the implementation meets ALL quality criteria before completing:
-
- 1. **Directory Structure**: Is `.deepwork/jobs/[job_name]/` created correctly?
- 2. **Complete Instructions**: Are ALL step instruction files complete (not stubs or placeholders)?
- 3. **Specific & Actionable**: Are instructions tailored to each step's purpose, not generic?
- 4. **Output Examples**: Does each instruction file show what good output looks like?
- 5. **Quality Criteria**: Does each instruction file define quality criteria for its outputs?
- 6. **Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase "ask structured questions"?
- 7. **Sync Complete**: Has `deepwork sync` been run successfully?
- 8. **Commands Available**: Are the slash-commands generated in `.claude/commands/`?
- 9. **Rules Considered**: Have you thought about whether rules would benefit this job?
- - If relevant rules were identified, did you explain them and offer to run `/deepwork_rules.define`?
- - Not every job needs rules - only suggest when genuinely helpful.
-
- If ANY criterion is not met, continue working to address it.
- If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
+ quality_criteria:
+ - "**Directory Structure**: Is `.deepwork/jobs/[job_name]/` created correctly?"
+ - "**Complete Instructions**: Are ALL step instruction files complete (not stubs or placeholders)?"
+ - "**Specific & Actionable**: Are instructions tailored to each step's purpose, not generic?"
+ - "**Output Examples**: Does each instruction file show what good output looks like?"
+ - "**Quality Criteria**: Does each instruction file define quality criteria for its outputs?"
+ - "**Ask Structured Questions**: Do step instructions that gather user input explicitly use the phrase \"ask structured questions\"?"
+ - "**Sync Complete**: Has `deepwork sync` been run successfully?"
+ - "**Commands Available**: Are the slash-commands generated in `.claude/commands/`?"
+ - "**Rules Considered**: Has the agent thought about whether rules would benefit this job? If relevant rules were identified, did they explain them and offer to run `/deepwork_rules.define`? Not every job needs rules - only suggest when genuinely helpful."
- id: learn
name: "Learn from Job Execution"
description: "Reflect on conversation to improve job instructions and capture learnings"
instructions_file: steps/learn.md
+ exposed: true
inputs:
- name: job_name
description: "Name of the job that was run (optional - will auto-detect from conversation)"
outputs:
- AGENTS.md
dependencies: []
- hooks:
- after_agent:
- - prompt: |
- Verify the learning process meets ALL quality criteria before completing:
-
- 1. **Conversation Analyzed**: Did you review the conversation for DeepWork job executions?
- 2. **Confusion Identified**: Did you identify points of confusion, errors, or inefficiencies?
- 3. **Instructions Improved**: Were job instructions updated to address identified issues?
- 4. **Instructions Concise**: Are instructions free of redundancy and unnecessary verbosity?
- 5. **Shared Content Extracted**: Is lengthy/duplicated content extracted into referenced files?
- 6. **Bespoke Learnings Captured**: Were run-specific learnings added to AGENTS.md?
- 7. **File References Used**: Do AGENTS.md entries reference other files where appropriate?
- 8. **Working Folder Correct**: Is AGENTS.md in the correct working folder for the job?
- 9. **Generalizable Separated**: Are generalizable improvements in instructions, not AGENTS.md?
- 10. **Sync Complete**: Has `deepwork sync` been run if instructions were modified?
-
- If ANY criterion is not met, continue working to address it.
- If ALL criteria are satisfied, include `✓ Quality Criteria Met` in your response.
+ quality_criteria:
+ - "**Conversation Analyzed**: Did the agent review the conversation for DeepWork job executions?"
+ - "**Confusion Identified**: Did the agent identify points of confusion, errors, or inefficiencies?"
+ - "**Instructions Improved**: Were job instructions updated to address identified issues?"
+ - "**Instructions Concise**: Are instructions free of redundancy and unnecessary verbosity?"
+ - "**Shared Content Extracted**: Is lengthy/duplicated content extracted into referenced files?"
+ - "**Bespoke Learnings Captured**: Were run-specific learnings added to AGENTS.md?"
+ - "**File References Used**: Do AGENTS.md entries reference other files where appropriate?"
+ - "**Working Folder Correct**: Is AGENTS.md in the correct working folder for the job?"
+ - "**Generalizable Separated**: Are generalizable improvements in instructions, not AGENTS.md?"
+ - "**Sync Complete**: Has `deepwork sync` been run if instructions were modified?"
diff --git a/src/deepwork/templates/claude/command-job-meta.md.jinja b/src/deepwork/templates/claude/command-job-meta.md.jinja
new file mode 100644
index 00000000..66de2041
--- /dev/null
+++ b/src/deepwork/templates/claude/command-job-meta.md.jinja
@@ -0,0 +1,53 @@
+---
+description: {{ job_summary }}
+---
+
+# {{ job_name }}
+
+You are executing the **{{ job_name }}** job. {{ job_summary }}
+
+{% if job_description %}
+{{ job_description }}
+{% endif %}
+
+## Available Steps
+
+This job has {{ total_steps }} step(s):
+
+{% for step in steps %}
+### {{ step.id }}
+**{{ step.name }}**: {{ step.description }}
+- Command: `{{ step.command_name }}`
+{% if step.dependencies %}
+- Requires: {{ step.dependencies | join(', ') }}
+{% endif %}
+{% endfor %}
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and run through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/{{ job_name }}`
+
+2. **Identify the starting step** based on intent:
+{% for step in steps %}
+ - {{ step.id }}: {{ step.description }}
+{% endfor %}
+
+3. **Run the workflow** starting from the identified step:
+ - Invoke the starting step using the Skill tool
+ - When that step completes, **automatically continue** to the next step in the workflow
+ - Continue until the workflow is complete or the user intervenes
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Use AskUserQuestion to let them choose
+
+**Critical**:
+- You MUST invoke each step using the Skill tool. Do not copy/paste step instructions.
+- After each step completes, check if there's a next step and invoke it automatically.
+- The workflow continues until all dependent steps are complete.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/{{ job_name }}/job.yml`
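The meta-command template above can be exercised with a quick jinja2 render. This is a sketch, not DeepWork's actual generator code: the variable names (`job_name`, `total_steps`, `steps`) mirror the template, but the inline template string and the dict-shaped steps are stand-ins for illustration.

```python
# Sketch: rendering a cut-down version of the meta-command template.
# Variable names follow the template above; values are made up.
from jinja2 import Template

template = Template(
    "# {{ job_name }}\n\n"
    "This job has {{ total_steps }} step(s):\n"
    "{% for step in steps %}- {{ step.id }}: {{ step.description }}\n{% endfor %}"
)

rendered = template.render(
    job_name="fruits",
    total_steps=2,
    steps=[
        {"id": "identify", "description": "Identify fruits"},
        {"id": "classify", "description": "Classify fruits"},
    ],
)
print(rendered)
```

The real template additionally emits per-step `Requires:` lines only when `step.dependencies` is non-empty, which is why the `{% if %}` guards exist in the file.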
diff --git a/src/deepwork/templates/claude/command-job-step.md.jinja b/src/deepwork/templates/claude/command-job-step.md.jinja
index ff7c2837..5793dc3f 100644
--- a/src/deepwork/templates/claude/command-job-step.md.jinja
+++ b/src/deepwork/templates/claude/command-job-step.md.jinja
@@ -1,23 +1,19 @@
---
description: {{ step_description }}
-{% if hooks %}
+{% if quality_criteria or hooks %}
hooks:
-{% for event_name, event_hooks in hooks.items() %}
- {{ event_name }}:
+{% if quality_criteria %}
+ Stop:
- hooks:
-{% for hook in event_hooks %}
-{% if hook.type == "script" %}
- - type: command
- command: ".deepwork/jobs/{{ job_name }}/{{ hook.path }}"
-{% else %}
- type: prompt
prompt: |
-{% if event_name == "Stop" %}
You must evaluate whether Claude has met all the below quality criteria for the request.
## Quality Criteria
- {{ hook.content | indent(12) }}
+{% for criterion in quality_criteria %}
+ {{ loop.index }}. {{ criterion }}
+{% endfor %}
## Instructions
@@ -27,13 +23,24 @@ hooks:
If the agent has included `✓ Quality Criteria Met` in their response AND
all criteria appear to be met, respond with: {"ok": true}
- If criteria are NOT met AND the promise tag is missing, respond with:
- {"ok": false, "reason": "Continue working. [specific feedback on what's wrong]"}
+ If criteria are NOT met OR the promise tag is missing, respond with:
+ {"ok": false, "reason": "**AGENT: TAKE ACTION** - [which criteria failed and why]"}
+{% endif %}
+{% for event_name, event_hooks in hooks.items() %}
+{% if not (event_name == "Stop" and quality_criteria) %}
+ {{ event_name }}:
+ - hooks:
+{% for hook in event_hooks %}
+{% if hook.type == "script" %}
+ - type: command
+ command: ".deepwork/jobs/{{ job_name }}/{{ hook.path }}"
{% else %}
+ - type: prompt
+ prompt: |
{{ hook.content | indent(12) }}
{% endif %}
-{% endif %}
{% endfor %}
+{% endif %}
{% endfor %}
{% endif %}
---
@@ -122,20 +129,24 @@ Ensure all outputs are:
No specific files are output by this command.
{% endif %}
-{% if stop_hooks %}
+{% if quality_criteria or stop_hooks %}
## Quality Validation Loop
This step uses an iterative quality validation loop. After completing your work, stop hook(s) will evaluate whether the outputs meet quality criteria. If criteria are not met, you will be prompted to continue refining.
+{% if quality_criteria %}
+### Quality Criteria
+
+{% for criterion in quality_criteria %}
+{{ loop.index }}. {{ criterion }}
+{% endfor %}
+{% endif %}
+
{% for hook in stop_hooks %}
{% if hook.type == "script" %}
**Validation Script**: `.deepwork/jobs/{{ job_name }}/{{ hook.path }}`
The validation script will be executed automatically when you attempt to complete this step.
-{% else %}
-### Quality Criteria{% if stop_hooks | length > 1 %} ({{ loop.index }}){% endif %}
-
-{{ hook.content }}
{% endif %}
{% endfor %}
diff --git a/src/deepwork/templates/gemini/command-job-meta.toml.jinja b/src/deepwork/templates/gemini/command-job-meta.toml.jinja
new file mode 100644
index 00000000..d0ffefa1
--- /dev/null
+++ b/src/deepwork/templates/gemini/command-job-meta.toml.jinja
@@ -0,0 +1,61 @@
+# {{ job_name }}
+#
+# {{ job_summary }}
+#
+# Generated by DeepWork - do not edit manually
+
+description = "{{ job_summary | replace('"', '\\"') }}"
+
+prompt = """
+# {{ job_name }}
+
+You are executing the **{{ job_name }}** job. {{ job_summary }}
+
+{% if job_description %}
+{{ job_description }}
+{% endif %}
+
+## Available Steps
+
+This job has {{ total_steps }} step(s):
+
+{% for step in steps %}
+### {{ step.id }}
+**{{ step.name }}**: {{ step.description }}
+- Command: `/{{ step.command_name }}`
+{% if step.dependencies %}
+- Requires: {{ step.dependencies | join(', ') }}
+{% endif %}
+{% endfor %}
+
+## Instructions
+
+This is a **multi-step workflow**. Determine the starting point and guide the user through the steps in sequence.
+
+1. **Analyze user intent** from the text that follows `/{{ job_name }}`
+
+2. **Identify the starting step** based on intent:
+{% for step in steps %}
+ - {{ step.id }}: {{ step.description }}
+{% endfor %}
+
+3. **Guide the user through the workflow**:
+ - Tell the user to run the starting step command
+ - After each step completes, guide them to the next step in the workflow
+ - Continue until the workflow is complete
+
+ Step commands:
+{% for step in steps %}
+ - {{ step.id }}: `/{{ step.command_name }}`
+{% endfor %}
+
+4. **If intent is ambiguous**, ask the user which step to start from:
+ - Present the available steps as numbered options
+ - Let them choose
+
+**Note**: Gemini CLI requires manual command invocation. After each step completes, remind the user to run the next step command.
+
+## Context Files
+
+- Job definition: `.deepwork/jobs/{{ job_name }}/job.yml`
+"""
diff --git a/src/deepwork/templates/gemini/command-job-step.toml.jinja b/src/deepwork/templates/gemini/command-job-step.toml.jinja
index bb5d69be..47081b8f 100644
--- a/src/deepwork/templates/gemini/command-job-step.toml.jinja
+++ b/src/deepwork/templates/gemini/command-job-step.toml.jinja
@@ -92,18 +92,18 @@ Ensure all outputs are:
No specific files are output by this command.
{% endif %}
-{% if stop_hooks %}
+{% if quality_criteria or stop_hooks %}
## Quality Validation
This step has quality criteria that should be verified before completion.
-{% for hook in stop_hooks %}
-{% if hook.type != "script" %}
-### Quality Criteria{% if stop_hooks | length > 1 %} ({{ loop.index }}){% endif %}
+{% if quality_criteria %}
+### Quality Criteria
-{{ hook.content }}
-{% endif %}
+{% for criterion in quality_criteria %}
+{{ loop.index }}. {{ criterion }}
{% endfor %}
+{% endif %}
**Note**: Gemini CLI does not support automated validation hooks. Please manually verify the criteria above before proceeding.
diff --git a/tests/e2e/test_claude_code_integration.py b/tests/e2e/test_claude_code_integration.py
index 0f75fac5..5becbbec 100644
--- a/tests/e2e/test_claude_code_integration.py
+++ b/tests/e2e/test_claude_code_integration.py
@@ -94,12 +94,14 @@ def test_generate_fruits_commands_in_temp_project(self) -> None:
command_paths = generator.generate_all_commands(job, adapter, commands_dir)
- # Validate commands were generated
- assert len(command_paths) == 2
+ # Validate commands were generated (meta + steps)
+ assert len(command_paths) == 3 # 1 meta + 2 steps
- identify_cmd = commands_dir / "commands" / "fruits.identify.md"
- classify_cmd = commands_dir / "commands" / "fruits.classify.md"
+ meta_cmd = commands_dir / "commands" / "fruits.md"
+ identify_cmd = commands_dir / "commands" / "uw.fruits.identify.md"
+ classify_cmd = commands_dir / "commands" / "uw.fruits.classify.md"
+ assert meta_cmd.exists()
assert identify_cmd.exists()
assert classify_cmd.exists()
@@ -127,7 +129,8 @@ def test_command_structure_matches_claude_code_expectations(self) -> None:
adapter = ClaudeAdapter()
generator.generate_all_commands(job, adapter, commands_dir)
- identify_cmd = commands_dir / "commands" / "fruits.identify.md"
+ # Step commands now have uw. prefix
+ identify_cmd = commands_dir / "commands" / "uw.fruits.identify.md"
content = identify_cmd.read_text()
# Claude Code expects specific sections
@@ -152,13 +155,14 @@ def test_dependency_chain_in_commands(self) -> None:
adapter = ClaudeAdapter()
generator.generate_all_commands(job, adapter, commands_dir)
+ # Step commands now have uw. prefix
# First step should have no prerequisites
- identify_cmd = commands_dir / "commands" / "fruits.identify.md"
+ identify_cmd = commands_dir / "commands" / "uw.fruits.identify.md"
identify_content = identify_cmd.read_text()
assert "## Prerequisites" not in identify_content
# Second step should reference first step
- classify_cmd = commands_dir / "commands" / "fruits.classify.md"
+ classify_cmd = commands_dir / "commands" / "uw.fruits.classify.md"
classify_content = classify_cmd.read_text()
assert "## Prerequisites" in classify_content
assert "identify" in classify_content.lower()
diff --git a/tests/fixtures/jobs/exposed_step_job/job.yml b/tests/fixtures/jobs/exposed_step_job/job.yml
new file mode 100644
index 00000000..d480daeb
--- /dev/null
+++ b/tests/fixtures/jobs/exposed_step_job/job.yml
@@ -0,0 +1,29 @@
+name: exposed_job
+version: "0.1.0"
+summary: "A job with exposed and hidden steps for testing"
+description: |
+ A test job demonstrating the exposed step feature.
+
+ This job has two steps: one hidden by default and one explicitly exposed.
+
+changelog:
+ - version: "0.1.0"
+ changes: "Initial version"
+
+steps:
+ - id: hidden_step
+ name: "Hidden Step"
+ description: "A step that is hidden by default"
+ instructions_file: steps/hidden_step.md
+ outputs:
+ - hidden_output.md
+ dependencies: []
+
+ - id: exposed_step
+ name: "Exposed Step"
+ description: "A step that is explicitly exposed"
+ instructions_file: steps/exposed_step.md
+ exposed: true
+ outputs:
+ - exposed_output.md
+ dependencies: []
diff --git a/tests/fixtures/jobs/exposed_step_job/steps/exposed_step.md b/tests/fixtures/jobs/exposed_step_job/steps/exposed_step.md
new file mode 100644
index 00000000..309cba54
--- /dev/null
+++ b/tests/fixtures/jobs/exposed_step_job/steps/exposed_step.md
@@ -0,0 +1,7 @@
+# Exposed Step Instructions
+
+This step is explicitly exposed (visible command).
+
+## Task
+
+Perform a task that can be run directly by users.
diff --git a/tests/fixtures/jobs/exposed_step_job/steps/hidden_step.md b/tests/fixtures/jobs/exposed_step_job/steps/hidden_step.md
new file mode 100644
index 00000000..89dc0674
--- /dev/null
+++ b/tests/fixtures/jobs/exposed_step_job/steps/hidden_step.md
@@ -0,0 +1,7 @@
+# Hidden Step Instructions
+
+This step is hidden by default (has uw. prefix).
+
+## Task
+
+Perform a task that produces output.
diff --git a/tests/integration/test_fruits_workflow.py b/tests/integration/test_fruits_workflow.py
index 168c94d4..ee80380b 100644
--- a/tests/integration/test_fruits_workflow.py
+++ b/tests/integration/test_fruits_workflow.py
@@ -80,11 +80,14 @@ def test_fruits_command_generation(self, fixtures_dir: Path, temp_dir: Path) ->
command_paths = generator.generate_all_commands(job, adapter, commands_dir)
- assert len(command_paths) == 2
+ # Now includes meta-command + step commands
+ assert len(command_paths) == 3 # 1 meta + 2 steps
# Verify command files exist
- identify_cmd = commands_dir / "commands" / "fruits.identify.md"
- classify_cmd = commands_dir / "commands" / "fruits.classify.md"
+ meta_cmd = commands_dir / "commands" / "fruits.md"
+ identify_cmd = commands_dir / "commands" / "uw.fruits.identify.md"
+ classify_cmd = commands_dir / "commands" / "uw.fruits.classify.md"
+ assert meta_cmd.exists()
assert identify_cmd.exists()
assert classify_cmd.exists()
@@ -100,7 +103,8 @@ def test_fruits_identify_command_content(self, fixtures_dir: Path, temp_dir: Pat
generator.generate_all_commands(job, adapter, commands_dir)
- identify_cmd = commands_dir / "commands" / "fruits.identify.md"
+ # Step commands now have uw. prefix
+ identify_cmd = commands_dir / "commands" / "uw.fruits.identify.md"
content = identify_cmd.read_text()
# Check header
@@ -130,7 +134,8 @@ def test_fruits_classify_command_content(self, fixtures_dir: Path, temp_dir: Pat
generator.generate_all_commands(job, adapter, commands_dir)
- classify_cmd = commands_dir / "commands" / "fruits.classify.md"
+ # Step commands now have uw. prefix
+ classify_cmd = commands_dir / "commands" / "uw.fruits.classify.md"
content = classify_cmd.read_text()
# Check header
diff --git a/tests/integration/test_full_workflow.py b/tests/integration/test_full_workflow.py
index 270c91c7..7d05634b 100644
--- a/tests/integration/test_full_workflow.py
+++ b/tests/integration/test_full_workflow.py
@@ -27,10 +27,17 @@ def test_parse_and_generate_workflow(self, fixtures_dir: Path, temp_dir: Path) -
command_paths = generator.generate_all_commands(job, adapter, commands_dir)
- assert len(command_paths) == 4
+ # Now includes meta-command + step commands
+ assert len(command_paths) == 5 # 1 meta + 4 steps
- # Verify all command files exist and have correct content
- for i, command_path in enumerate(command_paths):
+ # First command is the meta-command
+ assert command_paths[0].exists()
+ meta_content = command_paths[0].read_text()
+ assert f"# {job.name}" in meta_content
+ assert "Available Steps" in meta_content
+
+ # Verify all step command files exist and have correct content
+ for i, command_path in enumerate(command_paths[1:]): # Skip meta-command
assert command_path.exists()
content = command_path.read_text()
@@ -56,10 +63,11 @@ def test_simple_job_workflow(self, fixtures_dir: Path, temp_dir: Path) -> None:
command_paths = generator.generate_all_commands(job, adapter, commands_dir)
- assert len(command_paths) == 1
+ # Now includes meta-command + step commands
+ assert len(command_paths) == 2 # 1 meta + 1 step
- # Verify command content
- content = command_paths[0].read_text()
+ # Verify step command content (skip meta-command at index 0)
+ content = command_paths[1].read_text()
assert "# simple_job.single_step" in content
# Single step with no dependencies is treated as standalone
assert "Standalone command" in content
@@ -78,19 +86,21 @@ def test_command_generation_with_dependencies(self, fixtures_dir: Path, temp_dir
command_paths = generator.generate_all_commands(job, adapter, commands_dir)
+ # command_paths[0] is meta-command, steps start at index 1
+
# Check first step (no prerequisites)
- step1_content = command_paths[0].read_text()
+ step1_content = command_paths[1].read_text()
assert "## Prerequisites" not in step1_content
assert "/competitive_research.primary_research" in step1_content # Next step
# Check second step (has prerequisites and next step)
- step2_content = command_paths[1].read_text()
+ step2_content = command_paths[2].read_text()
assert "## Prerequisites" in step2_content
assert "/competitive_research.identify_competitors" in step2_content
assert "/competitive_research.secondary_research" in step2_content # Next step
# Check last step (has prerequisites, no next step)
- step4_content = command_paths[3].read_text()
+ step4_content = command_paths[4].read_text()
assert "## Prerequisites" in step4_content
assert "## Workflow Complete" in step4_content
assert "## Next Step" not in step4_content
@@ -107,15 +117,17 @@ def test_command_generation_with_file_inputs(self, fixtures_dir: Path, temp_dir:
command_paths = generator.generate_all_commands(job, adapter, commands_dir)
+ # command_paths[0] is meta-command, steps start at index 1
+
# Check step with file input
- step2_content = command_paths[1].read_text() # primary_research
+ step2_content = command_paths[2].read_text() # primary_research (index 2)
assert "## Inputs" in step2_content
assert "### Required Files" in step2_content
assert "competitors.md" in step2_content
assert "from step `identify_competitors`" in step2_content
# Check step with multiple file inputs
- step4_content = command_paths[3].read_text() # comparative_report
+ step4_content = command_paths[4].read_text() # comparative_report (index 4)
assert "primary_research.md" in step4_content
assert "secondary_research.md" in step4_content
@@ -131,8 +143,10 @@ def test_command_generation_with_user_inputs(self, fixtures_dir: Path, temp_dir:
command_paths = generator.generate_all_commands(job, adapter, commands_dir)
+ # command_paths[0] is meta-command, steps start at index 1
+
# Check step with user inputs
- step1_content = command_paths[0].read_text() # identify_competitors
+ step1_content = command_paths[1].read_text() # identify_competitors (index 1)
assert "## Inputs" in step1_content
assert "### User Parameters" in step1_content
assert "market_segment" in step1_content
diff --git a/tests/integration/test_install_flow.py b/tests/integration/test_install_flow.py
index 23037f65..66bcdc20 100644
--- a/tests/integration/test_install_flow.py
+++ b/tests/integration/test_install_flow.py
@@ -41,11 +41,20 @@ def test_install_with_claude(self, mock_claude_project: Path) -> None:
# Verify core commands were created
claude_dir = mock_claude_project / ".claude" / "commands"
- assert (claude_dir / "deepwork_jobs.define.md").exists()
+ # Meta-command
+ assert (claude_dir / "deepwork_jobs.md").exists()
+ # Hidden step command (uw. prefix)
+ assert (claude_dir / "uw.deepwork_jobs.define.md").exists()
+ # Exposed step command (no uw. prefix - learn has exposed: true)
assert (claude_dir / "deepwork_jobs.learn.md").exists()
- # Verify command content
- define_command = (claude_dir / "deepwork_jobs.define.md").read_text()
+ # Verify meta-command content
+ meta_command = (claude_dir / "deepwork_jobs.md").read_text()
+ assert "# deepwork_jobs" in meta_command
+ assert "Available Steps" in meta_command
+
+ # Verify hidden step command content
+ define_command = (claude_dir / "uw.deepwork_jobs.define.md").read_text()
assert "# deepwork_jobs.define" in define_command
assert "Define Job Specification" in define_command
@@ -106,11 +115,15 @@ def test_install_with_multiple_platforms_auto_detect(
# Verify commands were created for both platforms
claude_dir = mock_multi_platform_project / ".claude" / "commands"
- assert (claude_dir / "deepwork_jobs.define.md").exists()
+ # Meta-command and hidden step commands
+ assert (claude_dir / "deepwork_jobs.md").exists()
+ assert (claude_dir / "uw.deepwork_jobs.define.md").exists()
# Gemini uses job_name/step_id.toml structure
gemini_dir = mock_multi_platform_project / ".gemini" / "commands"
- assert (gemini_dir / "deepwork_jobs" / "define.toml").exists()
+ # Meta-command (index.toml) and hidden step commands
+ assert (gemini_dir / "deepwork_jobs" / "index.toml").exists()
+ assert (gemini_dir / "deepwork_jobs" / "uw.define.toml").exists()
def test_install_with_specified_platform_when_missing(self, mock_git_repo: Path) -> None:
"""Test that install fails when specified platform is not present."""
@@ -149,7 +162,9 @@ def test_install_is_idempotent(self, mock_claude_project: Path) -> None:
assert (deepwork_dir / "config.yml").exists()
claude_dir = mock_claude_project / ".claude" / "commands"
- assert (claude_dir / "deepwork_jobs.define.md").exists()
+ # Meta-command and step commands
+ assert (claude_dir / "deepwork_jobs.md").exists()
+ assert (claude_dir / "uw.deepwork_jobs.define.md").exists()
assert (claude_dir / "deepwork_jobs.learn.md").exists()
def test_install_creates_rules_directory(self, mock_claude_project: Path) -> None:
diff --git a/tests/unit/test_adapters.py b/tests/unit/test_adapters.py
index a1d06f32..d052f001 100644
--- a/tests/unit/test_adapters.py
+++ b/tests/unit/test_adapters.py
@@ -119,11 +119,27 @@ def test_get_commands_dir_raises_without_root(self) -> None:
with pytest.raises(AdapterError, match="No project root specified"):
adapter.get_commands_dir()
- def test_get_command_filename(self) -> None:
- """Test get_command_filename."""
+ def test_get_meta_command_filename(self) -> None:
+ """Test get_meta_command_filename."""
adapter = ClaudeAdapter()
- result = adapter.get_command_filename("my_job", "step_one")
+ result = adapter.get_meta_command_filename("my_job")
+
+ assert result == "my_job.md"
+
+ def test_get_step_command_filename_hidden_by_default(self) -> None:
+ """Test get_step_command_filename returns hidden filename by default."""
+ adapter = ClaudeAdapter()
+
+ result = adapter.get_step_command_filename("my_job", "step_one")
+
+ assert result == "uw.my_job.step_one.md"
+
+ def test_get_step_command_filename_exposed(self) -> None:
+ """Test get_step_command_filename returns visible filename when exposed."""
+ adapter = ClaudeAdapter()
+
+ result = adapter.get_step_command_filename("my_job", "step_one", exposed=True)
assert result == "my_job.step_one.md"
@@ -246,22 +262,41 @@ def test_get_commands_dir_raises_without_root(self) -> None:
with pytest.raises(AdapterError, match="No project root specified"):
adapter.get_commands_dir()
- def test_get_command_filename(self) -> None:
- """Test get_command_filename returns TOML with subdirectory."""
+ def test_get_meta_command_filename(self) -> None:
+ """Test get_meta_command_filename returns index.toml in subdirectory."""
adapter = GeminiAdapter()
- result = adapter.get_command_filename("my_job", "step_one")
+ result = adapter.get_meta_command_filename("my_job")
+
+ # Gemini uses subdirectories with index.toml for meta-commands
+ assert result == "my_job/index.toml"
+
+ def test_get_step_command_filename_hidden_by_default(self) -> None:
+ """Test get_step_command_filename returns hidden TOML with subdirectory."""
+ adapter = GeminiAdapter()
+
+ result = adapter.get_step_command_filename("my_job", "step_one")
# Gemini uses subdirectories for namespacing (colon becomes path)
+ # Hidden steps have uw. prefix
+ assert result == "my_job/uw.step_one.toml"
+
+ def test_get_step_command_filename_exposed(self) -> None:
+ """Test get_step_command_filename returns visible TOML when exposed."""
+ adapter = GeminiAdapter()
+
+ result = adapter.get_step_command_filename("my_job", "step_one", exposed=True)
+
+ # Exposed steps have no uw. prefix
assert result == "my_job/step_one.toml"
- def test_get_command_filename_with_underscores(self) -> None:
- """Test get_command_filename with underscores in names."""
+ def test_get_step_command_filename_with_underscores(self) -> None:
+ """Test get_step_command_filename with underscores in names."""
adapter = GeminiAdapter()
- result = adapter.get_command_filename("competitive_research", "identify_competitors")
+ result = adapter.get_step_command_filename("competitive_research", "identify_competitors")
- assert result == "competitive_research/identify_competitors.toml"
+ assert result == "competitive_research/uw.identify_competitors.toml"
def test_hook_name_mapping_is_empty(self) -> None:
"""Test that Gemini has no command-level hooks."""
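The adapter tests above pin down the filename scheme: meta-commands are unprefixed (`my_job.md`, `my_job/index.toml`), step commands get a `uw.` prefix unless `exposed=True`. The scheme can be sketched as free functions — stand-ins for the `ClaudeAdapter` / `GeminiAdapter` methods, not their actual implementation:

```python
# Sketch of the filename scheme the adapter tests assert against.
def claude_step_command_filename(job: str, step: str, exposed: bool = False) -> str:
    # Hidden (default) steps carry the "uw." prefix; exposed steps do not.
    prefix = "" if exposed else "uw."
    return f"{prefix}{job}.{step}.md"

def gemini_step_command_filename(job: str, step: str, exposed: bool = False) -> str:
    # Gemini namespaces via subdirectories, so the prefix goes on the file.
    prefix = "" if exposed else "uw."
    return f"{job}/{prefix}{step}.toml"

print(claude_step_command_filename("my_job", "step_one"))
print(gemini_step_command_filename("my_job", "step_one", exposed=True))
```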
diff --git a/tests/unit/test_generator.py b/tests/unit/test_generator.py
index f83d20a2..839cd6fd 100644
--- a/tests/unit/test_generator.py
+++ b/tests/unit/test_generator.py
@@ -36,7 +36,7 @@ def test_init_raises_for_missing_templates_dir(self, temp_dir: Path) -> None:
CommandGenerator(nonexistent)
def test_generate_step_command_simple_job(self, fixtures_dir: Path, temp_dir: Path) -> None:
- """Test generating command for simple job step."""
+ """Test generating command for simple job step (hidden by default)."""
job_dir = fixtures_dir / "jobs" / "simple_job"
job = parse_job_definition(job_dir)
@@ -46,7 +46,8 @@ def test_generate_step_command_simple_job(self, fixtures_dir: Path, temp_dir: Pa
command_path = generator.generate_step_command(job, job.steps[0], adapter, temp_dir)
assert command_path.exists()
- assert command_path.name == "simple_job.single_step.md"
+ # Step commands are hidden by default (uw. prefix)
+ assert command_path.name == "uw.simple_job.single_step.md"
content = command_path.read_text()
assert "# simple_job.single_step" in content
@@ -176,7 +177,7 @@ def test_generate_step_command_raises_for_missing_instructions(
instructions_file.write_text(original_content)
def test_generate_all_commands(self, fixtures_dir: Path, temp_dir: Path) -> None:
- """Test generating commands for all steps in a job."""
+ """Test generating commands for all steps in a job (meta + hidden steps)."""
job_dir = fixtures_dir / "jobs" / "complex_job"
job = parse_job_definition(job_dir)
@@ -185,15 +186,78 @@ def test_generate_all_commands(self, fixtures_dir: Path, temp_dir: Path) -> None
command_paths = generator.generate_all_commands(job, adapter, temp_dir)
- assert len(command_paths) == 4
+ # Now includes meta-command plus step commands
+ assert len(command_paths) == 5 # 1 meta + 4 steps
assert all(p.exists() for p in command_paths)
- # Check filenames
+ # Check filenames - meta-command first, then hidden step commands
expected_names = [
- "competitive_research.identify_competitors.md",
- "competitive_research.primary_research.md",
- "competitive_research.secondary_research.md",
- "competitive_research.comparative_report.md",
+ "competitive_research.md", # Meta-command
+ "uw.competitive_research.identify_competitors.md", # Hidden steps
+ "uw.competitive_research.primary_research.md",
+ "uw.competitive_research.secondary_research.md",
+ "uw.competitive_research.comparative_report.md",
+ ]
+ actual_names = [p.name for p in command_paths]
+ assert actual_names == expected_names
+
+ def test_generate_meta_command(self, fixtures_dir: Path, temp_dir: Path) -> None:
+ """Test generating meta-command for a job."""
+ job_dir = fixtures_dir / "jobs" / "complex_job"
+ job = parse_job_definition(job_dir)
+
+ generator = CommandGenerator()
+ adapter = ClaudeAdapter()
+
+ meta_command_path = generator.generate_meta_command(job, adapter, temp_dir)
+
+ assert meta_command_path.exists()
+ assert meta_command_path.name == "competitive_research.md"
+
+ content = meta_command_path.read_text()
+ # Check meta-command content
+ assert "# competitive_research" in content
+ assert "Available Steps" in content
+ assert "identify_competitors" in content
+ assert "primary_research" in content
+ assert "Skill tool" in content
+
+ def test_generate_step_command_exposed_step(self, fixtures_dir: Path, temp_dir: Path) -> None:
+ """Test generating command for exposed step (no uw. prefix)."""
+ job_dir = fixtures_dir / "jobs" / "exposed_step_job"
+ job = parse_job_definition(job_dir)
+
+ generator = CommandGenerator()
+ adapter = ClaudeAdapter()
+
+ # Generate the exposed step (index 1)
+ command_path = generator.generate_step_command(job, job.steps[1], adapter, temp_dir)
+
+ assert command_path.exists()
+ # Exposed step should NOT have uw. prefix
+ assert command_path.name == "exposed_job.exposed_step.md"
+
+ def test_generate_all_commands_with_exposed_steps(
+ self, fixtures_dir: Path, temp_dir: Path
+ ) -> None:
+ """Test generating all commands with mix of hidden and exposed steps."""
+ job_dir = fixtures_dir / "jobs" / "exposed_step_job"
+ job = parse_job_definition(job_dir)
+
+ generator = CommandGenerator()
+ adapter = ClaudeAdapter()
+
+ command_paths = generator.generate_all_commands(job, adapter, temp_dir)
+
+ # Meta-command + 2 steps (1 hidden, 1 exposed)
+ assert len(command_paths) == 3
+ assert all(p.exists() for p in command_paths)
+
+ # Check filenames - hidden step has uw. prefix, exposed doesn't
+ expected_names = [
+ "exposed_job.md", # Meta-command
+ "uw.exposed_job.hidden_step.md", # Hidden step
+ "exposed_job.exposed_step.md", # Exposed step (no uw. prefix)
]
actual_names = [p.name for p in command_paths]
assert actual_names == expected_names
diff --git a/tests/unit/test_parser.py b/tests/unit/test_parser.py
index 11e06665..cfdba8fc 100644
--- a/tests/unit/test_parser.py
+++ b/tests/unit/test_parser.py
@@ -86,6 +86,33 @@ def test_from_dict_with_inputs(self) -> None:
assert step.inputs[1].is_file_input()
assert step.dependencies == ["step0"]
+ def test_from_dict_exposed_default_false(self) -> None:
+ """Test that exposed defaults to False."""
+ data = {
+ "id": "step1",
+ "name": "Step 1",
+ "description": "First step",
+ "instructions_file": "steps/step1.md",
+ "outputs": ["output.md"],
+ }
+ step = Step.from_dict(data)
+
+ assert step.exposed is False
+
+ def test_from_dict_exposed_true(self) -> None:
+ """Test creating step with exposed=True."""
+ data = {
+ "id": "step1",
+ "name": "Step 1",
+ "description": "First step",
+ "instructions_file": "steps/step1.md",
+ "outputs": ["output.md"],
+ "exposed": True,
+ }
+ step = Step.from_dict(data)
+
+ assert step.exposed is True
+
class TestJobDefinition:
"""Tests for JobDefinition dataclass."""
@@ -284,6 +311,19 @@ def test_parses_file_inputs(self, fixtures_dir: Path) -> None:
assert step.inputs[0].file == "competitors.md"
assert step.inputs[0].from_step == "identify_competitors"
+ def test_parses_exposed_steps(self, fixtures_dir: Path) -> None:
+ """Test parsing job with exposed and hidden steps."""
+ job_dir = fixtures_dir / "jobs" / "exposed_step_job"
+ job = parse_job_definition(job_dir)
+
+ assert len(job.steps) == 2
+ # First step is hidden by default
+ assert job.steps[0].id == "hidden_step"
+ assert job.steps[0].exposed is False
+ # Second step is explicitly exposed
+ assert job.steps[1].id == "exposed_step"
+ assert job.steps[1].exposed is True
+
def test_raises_for_missing_directory(self, temp_dir: Path) -> None:
"""Test parsing fails for missing directory."""
nonexistent = temp_dir / "nonexistent"
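The `exposed` behavior the parser tests above exercise — absent from the YAML means hidden — can be modeled with a minimal stand-in dataclass (not DeepWork's actual `Step`, whose full field set isn't shown in this diff):

```python
# Sketch: `exposed` defaulting, as asserted by the parser tests above.
from dataclasses import dataclass

@dataclass
class StepSketch:
    id: str
    exposed: bool = False

    @classmethod
    def from_dict(cls, data: dict) -> "StepSketch":
        # A missing "exposed" key falls back to False, so steps hide by default.
        return cls(id=data["id"], exposed=bool(data.get("exposed", False)))

hidden = StepSketch.from_dict({"id": "hidden_step"})
shown = StepSketch.from_dict({"id": "exposed_step", "exposed": True})
print(hidden.exposed, shown.exposed)
```

This mirrors the `exposed_step_job` fixture: `hidden_step` omits the key and renders with the `uw.` prefix, while `exposed_step` sets `exposed: true` and renders without it.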