src/deepwork/standard_jobs/deepwork_jobs/job.yml
Lines changed: 40 additions & 45 deletions
@@ -54,9 +54,9 @@ steps:
     reviews:
       - run_each: job.yml
         quality_criteria:
-          "Intermediate Deliverables": "Does the job break out across the logical steps such that there are reviewable intermediate deliverables?"
+          "Intermediate Deliverables": "The job breaks out across logical steps with reviewable intermediate deliverables."
           "Reviews": |
-            Are there reviews defined for each step? Do particularly critical documents have their own reviews?
+            Reviews are defined for each step. Particularly critical documents have their own reviews.
             Note that the reviewers do not have transcript access, so if the criteria are about the conversation,
             then add a `.deepwork/tmp/[step_summary].md` step output file so the agent has a communication channel to the reviewer.
@@ -78,13 +78,13 @@ steps:
       - run_each: step_instruction_files
         additional_review_guidance: "Read the job.yml file in the same job directory for context on how this instruction file fits into the larger workflow."
         quality_criteria:
-          "Complete Instructions": "Is the instruction file complete (no stubs or placeholders)?"
-          "Specific & Actionable": "Are instructions tailored to the step's purpose, not generic?"
-          "Output Examples": "Does the instruction file show what good output looks like? This can be either template examples, or negative examples of what not to do. Only required if the step has ouputs"
-          "Quality Criteria": "Does the instruction file define quality criteria for its outputs?"
-          "Ask Structured Questions": "If this step gathers user input, do instructions explicitly use the phrase 'ask structured questions'? If the step has no user inputs, this criterion passes automatically."
-          "Prompt Engineering": "Does the instruction file follow Anthropic's best practices for prompt engineering?"
-          "No Redundant Info": "Does the instruction file avoid duplicating information that belongs in the job.yml's common_job_info_provided_to_all_steps_at_runtime section? Shared context (project background, terminology, conventions) should be in common_job_info, not repeated in each step."
+          "Complete Instructions": "The instruction file is complete (no stubs or placeholders)."
+          "Specific & Actionable": "Instructions are tailored to the step's purpose, not generic."
+          "Output Examples": "The instruction file shows what good output looks like. This can be either template examples, or negative examples of what not to do. Only required if the step has outputs."
+          "Quality Criteria": "The instruction file defines quality criteria for its outputs."
+          "Ask Structured Questions": "If this step gathers user input, instructions explicitly use the phrase 'ask structured questions'. If the step has no user inputs, this criterion passes automatically."
+          "Prompt Engineering": "The instruction file follows Anthropic's best practices for prompt engineering."
+          "No Redundant Info": "The instruction file avoids duplicating information that belongs in the job.yml's common_job_info_provided_to_all_steps_at_runtime section. Shared context (project background, terminology, conventions) is in common_job_info, not repeated in each step."
 
   - id: test
     name: "Test the New Workflow"
@@ -106,11 +106,11 @@ steps:
     reviews:
       - run_each: step
         quality_criteria:
-          "Workflow Invoked": "Was the new workflow actually run on the user's test case via MCP?"
-          "Output Critiqued": "Did the agent identify up to 3 top issues with the output?"
-          "User Feedback Gathered": "Did the agent ask the user about each issue and gather additional feedback?"
-          "Corrections Made": "Were all requested corrections applied to the output?"
-          "User Satisfied": "Did the user confirm the output meets their needs?"
+          "Workflow Invoked": "The new workflow was actually run on the user's test case via MCP."
+          "Output Critiqued": "The agent identified up to 3 top issues with the output."
+          "User Feedback Gathered": "The agent asked the user about each issue and gathered additional feedback."
+          "Corrections Made": "All requested corrections were applied to the output."
+          "User Satisfied": "The user confirmed the output meets their needs."
 
   - id: iterate
     name: "Iterate on Workflow Design"
@@ -170,14 +170,14 @@ steps:
     reviews:
       - run_each: step
         quality_criteria:
-          "Conversation Analyzed": "Did the agent review the conversation for DeepWork job executions?"
-          "Confusion Identified": "Did the agent identify points of confusion, errors, or inefficiencies?"
-          "Instructions Improved": "Were job instructions updated to address identified issues?"
-          "Instructions Concise": "Are instructions free of redundancy and unnecessary verbosity?"
-          "Shared Content Extracted": "Is lengthy/duplicated content extracted into referenced files?"
-          "Bespoke Learnings Captured": "Were run-specific learnings added to AGENTS.md?"
-          "File References Used": "Do AGENTS.md entries reference other files where appropriate?"
-          "Working Folder Correct": "Is AGENTS.md in the correct working folder for the job?"
+          "Conversation Analyzed": "The agent reviewed the conversation for DeepWork job executions."
+          "Confusion Identified": "The agent identified points of confusion, errors, or inefficiencies."
+          "Instructions Improved": "Job instructions were updated to address identified issues."
+          "Instructions Concise": "Instructions are free of redundancy and unnecessary verbosity."
+          "Shared Content Extracted": "Lengthy/duplicated content is extracted into referenced files."
+          "Bespoke Learnings Captured": "Run-specific learnings were added to AGENTS.md."
+          "File References Used": "AGENTS.md entries reference other files where appropriate."
+          "Working Folder Correct": "AGENTS.md is in the correct working folder for the job."
 
   - id: fix_settings
     name: "Fix Settings Files"
@@ -193,15 +193,14 @@ steps:
     reviews:
       - run_each: step
         quality_criteria:
-          "DeepWork Skills Removed": "Are `Skill(...)` entries matching jobs in `.deepwork/jobs/` removed?"
-          "Non-DeepWork Skills Preserved": "Are skills NOT matching DeepWork jobs left intact?"
-          "Stale make_new_job.sh Removed": "Are stale `Bash(...)` permissions referencing `.deepwork/jobs/deepwork_jobs/make_new_job.sh` removed?"
-          "Rules Hooks Removed": "Are all DeepWork Rules hooks and permissions removed?"
-          "Duplicate Hooks Removed": "Are duplicate hook entries consolidated or removed?"
[...]
-          "Orphaned Steps Fixed": "For jobs with no workflows, is there a single workflow (named after the job) containing all steps? For jobs with existing workflows, does each orphan get its own workflow (named after the step)?"
-          "Valid YAML": "Are all job.yml files valid YAML?"
+          "Exposed Field Addressed": "`exposed: true` fields are removed or noted as deprecated."
+          "Stop Hooks Migrated": "`stop_hooks` are migrated to `hooks.after_agent` format."
+          "Removed Steps Cleaned": "References to removed steps (like `review_job_spec`) are updated."
+          "Orphaned Steps Fixed": "For jobs with no workflows, there is a single workflow (named after the job) containing all steps. For jobs with existing workflows, each orphan gets its own workflow (named after the step)."
+          "job.ymls are readable": "Calling `get_workflows` from the Deepwork tool shows all expected jobs. If any are missing, its YML is likely bad."
 
   - id: errata
     name: "Clean Up Errata"
@@ -244,13 +243,9 @@ steps:
       - fix_jobs
     reviews:
       - run_each: step
-        additional_review_guidance: "Check the .deepwork/jobs/ directory and .claude/skills/ directory to verify the cleanup was done correctly."
+        additional_review_guidance: "You should do this in a small number of turns - tee up every data request you need in your first call. Do not invoke sub-agents."
         quality_criteria:
-          "Legacy Job Skills Removed": "Are legacy skill folders for each job removed from `.claude/skills/` and `.gemini/skills/`?"
-          "Deepwork Skill Preserved": "Does the `deepwork` skill folder still exist in `.claude/skills/deepwork/`?"
src/deepwork/standard_jobs/deepwork_jobs/steps/define.md
Lines changed: 7 additions & 7 deletions
@@ -203,18 +203,18 @@ For final outputs, reviews let you make sure the output meets the user's expecta
 
 **Reviews format:**
 
-Each review specifies `run_each` (what to review) and `quality_criteria` (a map of criterion name to question):
+Each review specifies `run_each` (what to review) and `quality_criteria` (a map of criterion name to a statement describing the expected state after the step completes, NOT a question):
 
 ```yaml
 reviews:
   - run_each: step  # Review all outputs together
     quality_criteria:
-      "Consistent Style": "Do all files follow the same structure?"
-      "Complete Coverage": "Are all required topics covered?"
+      "Consistent Style": "All files follow the same structure."
+      "Complete Coverage": "All required topics are covered."
   - run_each: report_files  # Review each file in a 'files'-type output individually
     quality_criteria:
-      "Well Written": "Is the content clear and well-organized?"
-      "Data-Backed": "Are claims supported by data?"
+      "Well Written": "Content is clear and well-organized."
+      "Data-Backed": "Claims are supported by data."
 ```
 
 **`run_each` options:**
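The rule introduced in this hunk, that each `quality_criteria` value is a statement of expected state rather than a question, could be checked mechanically. The following is a minimal sketch, not part of the PR: it assumes the `reviews` list has already been parsed from YAML into plain Python dicts, and the function name `find_question_criteria` is hypothetical. It treats any `?` in a criterion as a sign that the text is still phrased as a question, which is only a heuristic.

```python
def find_question_criteria(reviews):
    """Return (review_index, criterion_name) pairs whose text contains a question mark.

    Hypothetical helper: `reviews` is assumed to be the already-parsed
    `reviews` list of one step (a list of dicts).
    """
    flagged = []
    for index, review in enumerate(reviews):
        # A review without quality_criteria contributes nothing.
        criteria = review.get("quality_criteria") or {}
        for name, text in criteria.items():
            if "?" in str(text):
                flagged.append((index, name))
    return flagged


reviews = [
    {
        "run_each": "step",
        "quality_criteria": {
            "Consistent Style": "All files follow the same structure.",
            "Complete Coverage": "Are all required topics covered?",
        },
    },
    {
        "run_each": "report_files",
        "quality_criteria": {"Data-Backed": "Claims are supported by data."},
    },
]

print(find_question_criteria(reviews))  # [(0, 'Complete Coverage')]
```

A check like this only finds candidates; rewording a flagged criterion into a statement still needs a human or agent pass.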
@@ -229,11 +229,11 @@ reviews:
   - run_each: report_files
     additional_review_guidance: "Read the comparison_matrix.md file for context on whether claims in the report are supported by the analysis data."
     quality_criteria:
-      "Data-Backed": "Are recommendations supported by the competitive analysis data?"
+      "Data-Backed": "Recommendations are supported by the competitive analysis data."
   - run_each: step_instruction_files
     additional_review_guidance: "Read the job.yml file in the same job directory for context on how this instruction file fits into the larger workflow."
     quality_criteria:
-      "Complete Instructions": "Is the instruction file complete?"
+      "Complete Instructions": "The instruction file is complete."
src/deepwork/standard_jobs/deepwork_jobs/steps/fix_jobs.md
Lines changed: 6 additions & 6 deletions
@@ -223,15 +223,15 @@ steps:
 
 ### Step 7: Migrate `quality_criteria` to `reviews`
 
-The flat `quality_criteria` field on steps has been replaced by the `reviews` array. Each review specifies `run_each` (what to review) and `quality_criteria` as a map of criterion name to question.
+The flat `quality_criteria` field on steps has been replaced by the `reviews` array. Each review specifies `run_each` (what to review) and `quality_criteria` as a map of criterion name to a statement describing the expected state (not a question).
 
 **Before (deprecated):**
 ```yaml
 steps:
   - id: my_step
     quality_criteria:
-      - "**Complete**: Is the output complete?"
-      - "**Accurate**: Is the data accurate?"
+      - "**Complete**: The output is complete."
+      - "**Accurate**: The data is accurate."
 ```
 
 **After (current format):**
@@ -241,13 +241,13 @@ steps:
     reviews:
       - run_each: step
         quality_criteria:
-          "Complete": "Is the output complete?"
-          "Accurate": "Is the data accurate?"
+          "Complete": "The output is complete."
+          "Accurate": "The data is accurate."
 ```
 
 **Migration rules:**
 
-1. **Parse the old format**: Each string typically follows `**Name**: Question` format. Extract the name (bold text) as the map key and the question as the value.
+1. **Parse the old format**: Each string typically follows `**Name**: Question/Statement` format. Extract the name (bold text) as the map key and convert the value to a statement of expected state (not a question).
 2. **Choose `run_each`**: Default to `step` (reviews all outputs together). If the step has a single primary output, consider using that output name instead.
 3. **For steps with no quality_criteria**: Use `reviews: []`
 4. **Remove the old field**: Delete the `quality_criteria` array entirely after migration.
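The four migration rules above can be sketched in Python. This is an illustrative sketch under stated assumptions, not code from the PR: the step is assumed to be an already-parsed dict, and `migrate_step`, `OLD_ENTRY`, and the `Criterion N` fallback name are hypothetical. Turning a question into a statement is not automated here; the sketch only reports which criteria still need rewording.

```python
import re

# Rule 1: old entries typically look like "**Name**: text".
OLD_ENTRY = re.compile(r"^\*\*(?P<name>.+?)\*\*:\s*(?P<text>.+)$", re.DOTALL)


def migrate_step(step, run_each="step"):
    """Return (migrated_step, names_needing_rewrite) for one parsed step dict."""
    # Rule 4: the flat quality_criteria field is dropped from the result.
    migrated = {key: value for key, value in step.items() if key != "quality_criteria"}
    old_entries = step.get("quality_criteria") or []
    if not old_entries:
        # Rule 3: steps with no quality_criteria get an empty reviews list.
        migrated.setdefault("reviews", [])
        return migrated, []

    criteria = {}
    needs_rewrite = []
    for position, entry in enumerate(old_entries, start=1):
        match = OLD_ENTRY.match(entry.strip())
        if match:
            name = match.group("name").strip()
            text = match.group("text").strip()
        else:
            # Hypothetical fallback for entries without the bold-name prefix.
            name, text = f"Criterion {position}", entry.strip()
        criteria[name] = text
        # Questions and unparsed entries are left for manual rewording.
        if "?" in text or not match:
            needs_rewrite.append(name)

    # Rule 2: default to `step`, which reviews all outputs together.
    review = {"run_each": run_each, "quality_criteria": criteria}
    migrated["reviews"] = list(step.get("reviews", [])) + [review]
    return migrated, needs_rewrite


old_step = {
    "id": "my_step",
    "quality_criteria": [
        "**Complete**: Is the output complete?",
        "**Accurate**: The data is accurate.",
    ],
}
new_step, todo = migrate_step(old_step)
print(new_step["reviews"])
print(todo)  # ['Complete']
```

Returning the names that still read as questions keeps the mechanical part (restructuring) separate from the judgment part (rewording each criterion as an expected state).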
src/deepwork/standard_jobs/deepwork_jobs/steps/learn.md
Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,7 @@ For each generalizable learning:
    - Include helpful examples
    - Clarify ambiguous instructions
    - Update quality criteria if needed
-   - If you identify problems in the outcomes of steps, those usually should be reflected in an update to the `reviews` for that step in `job.yml` (adjusting criteria names, questions, or `run_each` targeting)
+   - If you identify problems in the outcomes of steps, those usually should be reflected in an update to the `reviews` for that step in `job.yml` (adjusting criteria names, statements, or `run_each` targeting)
 
 3. **Keep instructions concise**
    - Avoid redundancy - don't repeat the same guidance in multiple places