Commit e5dca7b

more improvements
1 parent 9f74cdb commit e5dca7b

10 files changed: 77 additions, 82 deletions

flake.lock

Lines changed: 3 additions & 3 deletions
Generated file; diff not rendered by default.

src/deepwork/standard_jobs/deepwork_jobs/job.yml

Lines changed: 40 additions & 45 deletions
@@ -54,9 +54,9 @@ steps:
5454
reviews:
5555
- run_each: job.yml
5656
quality_criteria:
57-
"Intermediate Deliverables": "Does the job break out across the logical steps such that there are reviewable intermediate deliverables?"
57+
"Intermediate Deliverables": "The job breaks out across logical steps with reviewable intermediate deliverables."
5858
"Reviews": |
59-
Are there reviews defined for each step? Do particularly critical documents have their own reviews?
59+
Reviews are defined for each step. Particularly critical documents have their own reviews.
6060
Note that the reviewers do not have transcript access, so if the criteria are about the conversation,
6161
then add a `.deepwork/tmp/[step_summary].md` step output file so the agent has a communication channel to the reviewer.
6262
@@ -78,13 +78,13 @@ steps:
7878
- run_each: step_instruction_files
7979
additional_review_guidance: "Read the job.yml file in the same job directory for context on how this instruction file fits into the larger workflow."
8080
quality_criteria:
81-
"Complete Instructions": "Is the instruction file complete (no stubs or placeholders)?"
82-
"Specific & Actionable": "Are instructions tailored to the step's purpose, not generic?"
83-
"Output Examples": "Does the instruction file show what good output looks like? This can be either template examples, or negative examples of what not to do. Only required if the step has ouputs"
84-
"Quality Criteria": "Does the instruction file define quality criteria for its outputs?"
85-
"Ask Structured Questions": "If this step gathers user input, do instructions explicitly use the phrase 'ask structured questions'? If the step has no user inputs, this criterion passes automatically."
86-
"Prompt Engineering": "Does the instruction file follow Anthropic's best practices for prompt engineering?"
87-
"No Redundant Info": "Does the instruction file avoid duplicating information that belongs in the job.yml's common_job_info_provided_to_all_steps_at_runtime section? Shared context (project background, terminology, conventions) should be in common_job_info, not repeated in each step."
81+
"Complete Instructions": "The instruction file is complete (no stubs or placeholders)."
82+
"Specific & Actionable": "Instructions are tailored to the step's purpose, not generic."
83+
"Output Examples": "The instruction file shows what good output looks like. This can be either template examples, or negative examples of what not to do. Only required if the step has outputs."
84+
"Quality Criteria": "The instruction file defines quality criteria for its outputs."
85+
"Ask Structured Questions": "If this step gathers user input, instructions explicitly use the phrase 'ask structured questions'. If the step has no user inputs, this criterion passes automatically."
86+
"Prompt Engineering": "The instruction file follows Anthropic's best practices for prompt engineering."
87+
"No Redundant Info": "The instruction file avoids duplicating information that belongs in the job.yml's common_job_info_provided_to_all_steps_at_runtime section. Shared context (project background, terminology, conventions) is in common_job_info, not repeated in each step."
8888

8989
- id: test
9090
name: "Test the New Workflow"
@@ -106,11 +106,11 @@ steps:
106106
reviews:
107107
- run_each: step
108108
quality_criteria:
109-
"Workflow Invoked": "Was the new workflow actually run on the user's test case via MCP?"
110-
"Output Critiqued": "Did the agent identify up to 3 top issues with the output?"
111-
"User Feedback Gathered": "Did the agent ask the user about each issue and gather additional feedback?"
112-
"Corrections Made": "Were all requested corrections applied to the output?"
113-
"User Satisfied": "Did the user confirm the output meets their needs?"
109+
"Workflow Invoked": "The new workflow was actually run on the user's test case via MCP."
110+
"Output Critiqued": "The agent identified up to 3 top issues with the output."
111+
"User Feedback Gathered": "The agent asked the user about each issue and gathered additional feedback."
112+
"Corrections Made": "All requested corrections were applied to the output."
113+
"User Satisfied": "The user confirmed the output meets their needs."
114114

115115
- id: iterate
116116
name: "Iterate on Workflow Design"
@@ -170,14 +170,14 @@ steps:
170170
reviews:
171171
- run_each: step
172172
quality_criteria:
173-
"Conversation Analyzed": "Did the agent review the conversation for DeepWork job executions?"
174-
"Confusion Identified": "Did the agent identify points of confusion, errors, or inefficiencies?"
175-
"Instructions Improved": "Were job instructions updated to address identified issues?"
176-
"Instructions Concise": "Are instructions free of redundancy and unnecessary verbosity?"
177-
"Shared Content Extracted": "Is lengthy/duplicated content extracted into referenced files?"
178-
"Bespoke Learnings Captured": "Were run-specific learnings added to AGENTS.md?"
179-
"File References Used": "Do AGENTS.md entries reference other files where appropriate?"
180-
"Working Folder Correct": "Is AGENTS.md in the correct working folder for the job?"
173+
"Conversation Analyzed": "The agent reviewed the conversation for DeepWork job executions."
174+
"Confusion Identified": "The agent identified points of confusion, errors, or inefficiencies."
175+
"Instructions Improved": "Job instructions were updated to address identified issues."
176+
"Instructions Concise": "Instructions are free of redundancy and unnecessary verbosity."
177+
"Shared Content Extracted": "Lengthy/duplicated content is extracted into referenced files."
178+
"Bespoke Learnings Captured": "Run-specific learnings were added to AGENTS.md."
179+
"File References Used": "AGENTS.md entries reference other files where appropriate."
180+
"Working Folder Correct": "AGENTS.md is in the correct working folder for the job."
181181

182182
- id: fix_settings
183183
name: "Fix Settings Files"
@@ -193,15 +193,14 @@ steps:
193193
reviews:
194194
- run_each: step
195195
quality_criteria:
196-
"DeepWork Skills Removed": "Are `Skill(...)` entries matching jobs in `.deepwork/jobs/` removed?"
197-
"Non-DeepWork Skills Preserved": "Are skills NOT matching DeepWork jobs left intact?"
198-
"Stale make_new_job.sh Removed": "Are stale `Bash(...)` permissions referencing `.deepwork/jobs/deepwork_jobs/make_new_job.sh` removed?"
199-
"Rules Hooks Removed": "Are all DeepWork Rules hooks and permissions removed?"
200-
"Duplicate Hooks Removed": "Are duplicate hook entries consolidated or removed?"
201-
"Hardcoded Paths Removed": "Are user-specific hardcoded paths (like `/Users/*/...`) removed?"
202-
"Deprecated Commands Removed": "Are deprecated commands like `deepwork hook *` removed?"
203-
"Valid JSON": "Is settings.json still valid JSON after modifications?"
204-
"Backup Created": "Was a backup of the original settings created before modifications?"
196+
"DeepWork Skills Removed": "`Skill(...)` entries matching jobs in `.deepwork/jobs/` are removed."
197+
"Non-DeepWork Skills Preserved": "Skills NOT matching DeepWork jobs are left intact."
198+
"Stale make_new_job.sh Removed": "Stale `Bash(...)` permissions referencing `.deepwork/jobs/deepwork_jobs/make_new_job.sh` are removed."
199+
"Rules Hooks Removed": "All DeepWork Rules hooks and permissions are removed."
200+
"Duplicate Hooks Removed": "Duplicate hook entries are consolidated or removed."
201+
"Hardcoded Paths Removed": "User-specific hardcoded paths (like `/Users/*/...`) are removed."
202+
"Deprecated Commands Removed": "Deprecated commands like `deepwork hook *` are removed."
203+
"Backup Created": "A backup of the original settings was created before modifications."
205204

206205
- id: fix_jobs
207206
name: "Fix Job Definitions"
@@ -225,11 +224,11 @@ steps:
225224
- run_each: step
226225
additional_review_guidance: "Read the .claude/settings.json file for context on what settings were cleaned up in the prior step."
227226
quality_criteria:
228-
"Exposed Field Addressed": "Are `exposed: true` fields removed or noted as deprecated?"
229-
"Stop Hooks Migrated": "Are `stop_hooks` migrated to `hooks.after_agent` format?"
230-
"Removed Steps Cleaned": "Are references to removed steps (like `review_job_spec`) updated?"
231-
"Orphaned Steps Fixed": "For jobs with no workflows, is there a single workflow (named after the job) containing all steps? For jobs with existing workflows, does each orphan get its own workflow (named after the step)?"
232-
"Valid YAML": "Are all job.yml files valid YAML?"
227+
"Exposed Field Addressed": "`exposed: true` fields are removed or noted as deprecated."
228+
"Stop Hooks Migrated": "`stop_hooks` are migrated to `hooks.after_agent` format."
229+
"Removed Steps Cleaned": "References to removed steps (like `review_job_spec`) are updated."
230+
"Orphaned Steps Fixed": "For jobs with no workflows, there is a single workflow (named after the job) containing all steps. For jobs with existing workflows, each orphan gets its own workflow (named after the step)."
231+
"job.ymls are readable": "Calling `get_workflows` from the Deepwork tool shows all expected jobs. If any are missing, its YML is likely bad."
233232

234233
- id: errata
235234
name: "Clean Up Errata"
@@ -244,13 +243,9 @@ steps:
244243
- fix_jobs
245244
reviews:
246245
- run_each: step
247-
additional_review_guidance: "Check the .deepwork/jobs/ directory and .claude/skills/ directory to verify the cleanup was done correctly."
246+
additional_review_guidance: "You should do this in a small number of turns - tee up every data request you need in your first call. Do not invoke sub-agents."
248247
quality_criteria:
249-
"Legacy Job Skills Removed": "Are legacy skill folders for each job removed from `.claude/skills/` and `.gemini/skills/`?"
250-
"Deepwork Skill Preserved": "Does the `deepwork` skill folder still exist in `.claude/skills/deepwork/`?"
251-
"Temp Files Cleaned": "Are `.deepwork/tmp/` contents cleaned appropriately?"
252-
"Rules Folder Removed": "Is `.deepwork/rules/` folder backed up and removed (fully deprecated)?"
253-
"Rules Job Removed": "Is `.deepwork/jobs/deepwork_rules/` removed if present?"
254-
"Config Version Updated": "Is `.deepwork/config.yml` using current version format?"
255-
"DeepWork Re-installed": "Was `deepwork install` run after cleanup, and does it complete without errors?"
256-
"Git Status Clean": "Are changes ready to be committed (no untracked garbage files)?"
248+
"Legacy Job Skills Removed": "Legacy skill folders for each job are removed from `.claude/skills/` and `.gemini/skills/`."
249+
"Deepwork Skill Preserved": "The `deepwork` skill folder still exists in `.claude/skills/deepwork/`."
250+
"Rules Folder Removed": "`.deepwork/rules/` folder is gone."
251+
"Rules Job Removed": "`.deepwork/jobs/deepwork_rules/` is gone."

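The errata criteria that remain after this change are all plain filesystem states, so a reviewer could confirm them in one pass. The following is a minimal sketch, assuming the paths quoted in the criteria are relative to the project root; `check_errata_cleanup` is a hypothetical helper for illustration, not part of DeepWork.

```python
from pathlib import Path


def check_errata_cleanup(project_root: Path) -> dict[str, bool]:
    """Map each path-based errata criterion to whether it currently holds.

    Hypothetical helper: the criterion names and paths are copied from the
    job.yml diff above; everything else is an assumption.
    """
    return {
        # The `deepwork` skill folder must still exist.
        "Deepwork Skill Preserved": (
            project_root / ".claude" / "skills" / "deepwork"
        ).is_dir(),
        # The deprecated rules folder must be gone.
        "Rules Folder Removed": not (project_root / ".deepwork" / "rules").exists(),
        # The deprecated rules job must be gone.
        "Rules Job Removed": not (
            project_root / ".deepwork" / "jobs" / "deepwork_rules"
        ).exists(),
    }
```

"Legacy Job Skills Removed" is left out because it depends on the list of job names, which this sketch does not try to discover.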
src/deepwork/standard_jobs/deepwork_jobs/research_report_job_best_practices.md

Lines changed: 6 additions & 6 deletions
@@ -150,16 +150,16 @@ reviews:
150150
# Content review - is the analysis sound?
151151
- run_each: final_report.md
152152
quality_criteria:
153-
"Claims Cited": "Is every factual claim backed by a specific source or query from the dataroom?"
154-
"Questions Answered": "Are all research questions from the scoping document addressed?"
155-
"Depth": "Does the analysis go beyond surface-level observations to root causes or actionable insights?"
153+
"Claims Cited": "Every factual claim is backed by a specific source or query from the dataroom."
154+
"Questions Answered": "All research questions from the scoping document are addressed."
155+
"Depth": "The analysis goes beyond surface-level observations to root causes or actionable insights."
156156

157157
# Presentation review - is the output polished?
158158
- run_each: final_report.md
159159
quality_criteria:
160-
"Readable Flow": "Does the document flow logically for someone reading it without prior context?"
161-
"Audience Fit": "Is the language and detail level appropriate for the intended audience?"
162-
"Visual Quality": "Do all charts, tables, and figures render correctly and add value?"
160+
"Readable Flow": "The document flows logically for someone reading it without prior context."
161+
"Audience Fit": "The language and detail level are appropriate for the intended audience."
162+
"Visual Quality": "All charts, tables, and figures render correctly and add value."
163163
```
164164
165165
### Capability Considerations

src/deepwork/standard_jobs/deepwork_jobs/steps/define.md

Lines changed: 7 additions & 7 deletions
@@ -203,18 +203,18 @@ For final outputs, reviews let you make sure the output meets the user's expecta
203203

204204
**Reviews format:**
205205

206-
Each review specifies `run_each` (what to review) and `quality_criteria` (a map of criterion name to question):
206+
Each review specifies `run_each` (what to review) and `quality_criteria` (a map of criterion name to a statement describing the expected state after the step completes — NOT a question):
207207

208208
```yaml
209209
reviews:
210210
- run_each: step # Review all outputs together
211211
quality_criteria:
212-
"Consistent Style": "Do all files follow the same structure?"
213-
"Complete Coverage": "Are all required topics covered?"
212+
"Consistent Style": "All files follow the same structure."
213+
"Complete Coverage": "All required topics are covered."
214214
- run_each: report_files # Review each file in a 'files'-type output individually
215215
quality_criteria:
216-
"Well Written": "Is the content clear and well-organized?"
217-
"Data-Backed": "Are claims supported by data?"
216+
"Well Written": "Content is clear and well-organized."
217+
"Data-Backed": "Claims are supported by data."
218218
```
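The statement-not-question rule introduced here lends itself to a mechanical check. The sketch below assumes the `reviews` list has already been parsed from `job.yml` into plain Python lists and dicts. `find_review_problems` is a hypothetical name, and treating any `?` as a sign of a question is only a heuristic.

```python
from typing import Any


def find_review_problems(reviews: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems found in a parsed `reviews` list.

    Hypothetical lint helper; it only checks the shape described in the
    text: `run_each` plus a `quality_criteria` map of name -> statement.
    """
    problems: list[str] = []
    for index, review in enumerate(reviews):
        if not review.get("run_each"):
            problems.append(f"reviews[{index}]: missing run_each")
        criteria = review.get("quality_criteria")
        if not isinstance(criteria, dict):
            problems.append(f"reviews[{index}]: quality_criteria must be a map")
            continue
        for name, text in criteria.items():
            if not isinstance(text, str) or not text.strip():
                problems.append(f"reviews[{index}].{name}: value must be a non-empty string")
            elif "?" in text:
                # Heuristic: a question mark suggests the old question style.
                problems.append(f"reviews[{index}].{name}: phrased as a question, expected a statement")
    return problems
```

An empty return value means every criterion at least looks like a statement. It does not judge whether the statement is a good one.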
219219
220220
**`run_each` options:**
@@ -229,11 +229,11 @@ reviews:
229229
- run_each: report_files
230230
additional_review_guidance: "Read the comparison_matrix.md file for context on whether claims in the report are supported by the analysis data."
231231
quality_criteria:
232-
"Data-Backed": "Are recommendations supported by the competitive analysis data?"
232+
"Data-Backed": "Recommendations are supported by the competitive analysis data."
233233
- run_each: step_instruction_files
234234
additional_review_guidance: "Read the job.yml file in the same job directory for context on how this instruction file fits into the larger workflow."
235235
quality_criteria:
236-
"Complete Instructions": "Is the instruction file complete?"
236+
"Complete Instructions": "The instruction file is complete."
237237
```
238238

239239
**When to use `additional_review_guidance`:**

src/deepwork/standard_jobs/deepwork_jobs/steps/fix_jobs.md

Lines changed: 6 additions & 6 deletions
@@ -223,15 +223,15 @@ steps:
223223

224224
### Step 7: Migrate `quality_criteria` to `reviews`
225225

226-
The flat `quality_criteria` field on steps has been replaced by the `reviews` array. Each review specifies `run_each` (what to review) and `quality_criteria` as a map of criterion name to question.
226+
The flat `quality_criteria` field on steps has been replaced by the `reviews` array. Each review specifies `run_each` (what to review) and `quality_criteria` as a map of criterion name to a statement describing the expected state (not a question).
227227

228228
**Before (deprecated):**
229229
```yaml
230230
steps:
231231
- id: my_step
232232
quality_criteria:
233-
- "**Complete**: Is the output complete?"
234-
- "**Accurate**: Is the data accurate?"
233+
- "**Complete**: The output is complete."
234+
- "**Accurate**: The data is accurate."
235235
```
236236

237237
**After (current format):**
@@ -241,13 +241,13 @@ steps:
241241
reviews:
242242
- run_each: step
243243
quality_criteria:
244-
"Complete": "Is the output complete?"
245-
"Accurate": "Is the data accurate?"
244+
"Complete": "The output is complete."
245+
"Accurate": "The data is accurate."
246246
```
247247

248248
**Migration rules:**
249249

250-
1. **Parse the old format**: Each string typically follows `**Name**: Question` format. Extract the name (bold text) as the map key and the question as the value.
250+
1. **Parse the old format**: Each string typically follows `**Name**: Question/Statement` format. Extract the name (bold text) as the map key and convert the value to a statement of expected state (not a question).
251251
2. **Choose `run_each`**: Default to `step` (reviews all outputs together). If the step has a single primary output, consider using that output name instead.
252252
3. **For steps with no quality_criteria**: Use `reviews: []`
253253
4. **Remove the old field**: Delete the `quality_criteria` array entirely after migration.
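The migration rules above can be sketched roughly as follows, assuming each step is already a parsed dict. `migrate_quality_criteria` and the fallback key `Criterion N` are hypothetical, and appending to any existing `reviews` is an assumption the rules do not cover. Rewording questions into statements (rule 1) is not automated and still needs a manual pass.

```python
import re

# Matches the old "**Name**: text" item format described in rule 1.
_OLD_ITEM = re.compile(r"^\*\*(?P<name>.+?)\*\*:\s*(?P<text>.+)$", re.DOTALL)


def migrate_quality_criteria(step: dict) -> dict:
    """Return a copy of `step` with flat `quality_criteria` moved into `reviews`.

    Hypothetical sketch: values are copied verbatim, so question-style
    text must still be reworded into statements by hand.
    """
    # Rule 4: the old field is dropped from the migrated step.
    migrated = {key: value for key, value in step.items() if key != "quality_criteria"}
    old_items = step.get("quality_criteria")
    if not old_items:
        # Rule 3: steps with no criteria get an empty reviews list.
        migrated.setdefault("reviews", [])
        return migrated

    criteria: dict[str, str] = {}
    for position, item in enumerate(old_items, start=1):
        match = _OLD_ITEM.match(item.strip())
        if match:
            # Rule 1: bold text becomes the key, the rest becomes the value.
            criteria[match["name"].strip()] = match["text"].strip()
        else:
            # Assumption: items without a bold name get a placeholder key.
            criteria[f"Criterion {position}"] = item.strip()

    # Rule 2: default to reviewing all outputs together.
    new_review = {"run_each": "step", "quality_criteria": criteria}
    migrated["reviews"] = list(step.get("reviews", [])) + [new_review]
    return migrated
```

Returning a new dict rather than mutating the input keeps the original step available for comparison while the migrated file is reviewed.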

src/deepwork/standard_jobs/deepwork_jobs/steps/implement.md

Lines changed: 3 additions & 3 deletions
@@ -66,9 +66,9 @@ If a step in the job.yml has `reviews` defined, the generated instruction file s
6666
reviews:
6767
- run_each: research_notes.md
6868
quality_criteria:
69-
"Sufficient Data": "Does each competitor have at least 3 data points?"
70-
"Sources Cited": "Are sources cited for key claims?"
71-
"Current Information": "Is the information current (within last year)?"
69+
"Sufficient Data": "Each competitor has at least 3 data points."
70+
"Sources Cited": "Sources are cited for key claims."
71+
"Current Information": "Information is current (within last year)."
7272
```
7373
7474
**The instruction file should include:**

src/deepwork/standard_jobs/deepwork_jobs/steps/iterate.md

Lines changed: 4 additions & 4 deletions
@@ -86,15 +86,15 @@ Review and update quality reviews in two places:
8686
reviews:
8787
- run_each: step
8888
quality_criteria:
89-
"Formatted Correctly": "Is the report formatted correctly?"
89+
"Formatted Correctly": "The report is formatted correctly."
9090

9191
# After
9292
reviews:
9393
- run_each: report.md
9494
quality_criteria:
95-
"Distinct Colors": "Does the report use distinct colors for each data series in charts?"
96-
"Readable Tables": "Do tables have sufficient padding and font size for readability?"
97-
"Clear Summary": "Is the executive summary understandable by non-technical readers?"
95+
"Distinct Colors": "The report uses distinct colors for each data series in charts."
96+
"Readable Tables": "Tables have sufficient padding and font size for readability."
97+
"Clear Summary": "The executive summary is understandable by non-technical readers."
9898
```
9999
100100
### Step 5: Consider Alternative Tools

src/deepwork/standard_jobs/deepwork_jobs/steps/learn.md

Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,7 @@ For each generalizable learning:
8888
- Include helpful examples
8989
- Clarify ambiguous instructions
9090
- Update quality criteria if needed
91-
- If you identify problems in the outcomes of steps, those usually should be reflected in an update to the `reviews` for that step in `job.yml` (adjusting criteria names, questions, or `run_each` targeting)
91+
- If you identify problems in the outcomes of steps, those usually should be reflected in an update to the `reviews` for that step in `job.yml` (adjusting criteria names, statements, or `run_each` targeting)
9292

9393
3. **Keep instructions concise**
9494
- Avoid redundancy - don't repeat the same guidance in multiple places
