diff --git a/doc/learning_agents_architecture.md b/doc/learning_agents_architecture.md new file mode 100644 index 00000000..770018c4 --- /dev/null +++ b/doc/learning_agents_architecture.md @@ -0,0 +1,303 @@ +# LearningAgents Architecture + +## Overview + +LearningAgents are auto-improving AI sub-agents that accumulate domain knowledge across sessions. They are implemented as a Claude Code plugin (`learning_agents`) that adds hooks, skills, and agents to enable a closed-loop learning cycle: use an agent → track its sessions → identify mistakes → investigate root causes → incorporate learnings back into the agent. + +This design is inspired by the "experts" system from PR #192 but restructured as a standalone Claude Code plugin with session-level issue tracking and automated learning workflows. + +## Core Concepts + +### LearningAgent +A sub-agent with a persistent knowledge base that improves over time. Defined in `.deepwork/learning-agents//` with structured expertise, topics, and learnings. Each LearningAgent gets a corresponding `.claude/agents/` file that dynamically loads its current knowledge at invocation time. + +### Learning Cycle +The feedback loop that makes agents improve: +1. **Use** — Agent is invoked via Task tool during normal work +2. **Track** — Post-Task hook records the session for later review +3. **Identify** — Transcript is reviewed for issues/mistakes +4. **Investigate** — Root causes are determined from transcript evidence +5. **Incorporate** — Learnings are folded back into the agent's knowledge base + +### Session Logs +Temporary per-session records in `.deepwork/tmp/agent_sessions/` that track which LearningAgents were used and flag sessions needing learning. These are transient working files, not permanent records. 
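The flag-file convention described under Session Logs can be sketched as a small lookup. This is only an illustration: `list_pending_sessions` and the `SESSIONS_ROOT` variable are hypothetical names, and the sketch assumes the `.deepwork/tmp/agent_sessions/` layout described in this document.

```shell
#!/bin/bash
# Sketch: list session folders that still carry the learning flag.
# SESSIONS_ROOT and list_pending_sessions are hypothetical names for
# illustration; they are not part of the plugin.
SESSIONS_ROOT="${SESSIONS_ROOT:-.deepwork/tmp/agent_sessions}"

list_pending_sessions() {
  # No session directory means nothing is pending.
  [ -d "$SESSIONS_ROOT" ] || return 0
  # Each flag file marks one session/agent folder awaiting a learning cycle.
  find "$SESSIONS_ROOT" -type f -name needs_learning_as_of_timestamp \
    -exec dirname {} \; | sort
}
```

Calling `list_pending_sessions` from the repository root prints one folder per line, or nothing when no session is flagged.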
+ +## Plugin Structure + +``` +learning_agents/ +├── .claude-plugin/ +│ └── plugin.json # Plugin manifest +├── skills/ +│ ├── learning-agents/SKILL.md # Dispatch skill (user-facing) +│ ├── create-agent/SKILL.md # Create new LearningAgent +│ ├── learn/SKILL.md # Run learning cycle on all pending sessions +│ ├── identify/SKILL.md # [hidden] Find issues in a session transcript +│ ├── report-issue/SKILL.md # [hidden] Create an issue file +│ ├── investigate-issues/SKILL.md # [hidden] Research issue root causes +│ └── incorporate-learnings/SKILL.md # [hidden] Integrate learnings into agent +├── scripts/ +│ └── create_agent.sh # Setup script for new LearningAgent scaffolding +├── hooks/ +│ ├── hooks.json # Hook configuration +│ ├── post_task.sh # After Task: track LearningAgent usage +│ └── session_stop.sh # On Stop: suggest learning cycle if needed +├── agents/ +│ └── learning-agent-expert.md # LearningAgentExpert — knows how LearningAgents work +└── doc/ + ├── learning_agent_file_structure.md # Structure of .deepwork/learning-agents/ + ├── learning_agent_post_task_reminder.md # Reminder shown after LearningAgent use + ├── issue_yml_format.md # Issue file schema + └── learning_log_folder_structure.md # Structure of .deepwork/tmp/agent_sessions/ +``` + +## Data Layout + +### LearningAgent Definition (persistent, Git-tracked) + +``` +.deepwork/learning-agents// +├── core-knowledge.md # Core expertise in second person (required) +├── topics/ +│ └── .md # Detailed reference docs (frontmatter + body) +├── learnings/ +│ └── .md # Experience-based insights (frontmatter + body) +└── additional_learning_guidelines/ # Per-agent learning cycle customization + ├── README.md + ├── issue_identification.md # Extra guidance for identify step + ├── issue_investigation.md # Extra guidance for investigate step + └── learning_from_issues.md # Extra guidance for incorporate step +``` + +See `learning_agent_file_structure.md` for full schema details. 
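As a rough illustration of the layout above, a scaffold check might look like the following. `check_agent_dir` is a hypothetical helper, not something the plugin ships; it treats only `core-knowledge.md` as mandatory because that is the one file marked required here, and merely warns about the other directories.

```shell
#!/bin/bash
# Sketch: sanity-check a LearningAgent directory against the documented layout.
# check_agent_dir is a hypothetical helper name used for illustration only.
check_agent_dir() {
  local dir="$1" sub
  # Warn about the optional directories, but do not fail on them.
  for sub in topics learnings additional_learning_guidelines; do
    [ -d "$dir/$sub" ] || echo "warning: missing directory $dir/$sub" >&2
  done
  # core-knowledge.md is the only file the layout marks as required.
  if [ ! -f "$dir/core-knowledge.md" ]; then
    echo "error: missing required file $dir/core-knowledge.md" >&2
    return 1
  fi
  return 0
}
```

The function returns non-zero only when the required file is absent, so it can gate a script with `check_agent_dir "$dir" || exit 1`.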
+ +### Generated Agent File + +``` +.claude/agents/.md +``` + +Created by `create-agent`. Uses Claude Code's `!`command`` dynamic context injection to include the agent's current `core-knowledge.md` and a topic index at invocation time. Topics are listed as an index (filename + name from frontmatter) rather than included in full, keeping the agent prompt focused. Learnings are referenced by directory path with a note about their purpose — the agent can read individual files as needed. The `!`command`` syntax runs shell commands before the content is sent to Claude — the command output replaces the placeholder. + +Example structure: +```markdown +--- +name: +description: "" +--- + +# Core Knowledge + +!`cat .deepwork/learning-agents//core-knowledge.md` + +# Topics + +Located in `.deepwork/learning-agents//topics/` + +!`for f in .deepwork/learning-agents//topics/*.md; do [ -f "$f" ] || continue; desc=$(awk '/^---/{c++; next} c==1 && /^name:/{sub(/^name: *"?/,""); sub(/"$/,""); print; exit}' "$f"); echo "- $(basename "$f"): $desc"; done` + +# Learnings + +Learnings are incident post-mortems from past agent sessions capturing mistakes, root causes, and generalizable insights. Review them before starting work to avoid repeating past mistakes. Located in `.deepwork/learning-agents//learnings/`. +``` + +### Session Logs (transient, gitignored) + +``` +.deepwork/tmp/agent_sessions/// +├── needs_learning_as_of_timestamp # Flag file (body = ISO 8601 timestamp) +├── learning_last_performed_timestamp # When learning was last run on this conversation +├── agent_used # Body = LearningAgent folder name +└── .issue.yml # Issues found during learning +``` + +See `learning_log_folder_structure.md` for full details. + +## Hooks + +### PostToolUse → Task (post_task.sh) + +Fires after every `Task` tool call. The script: + +1. Extracts the agent name from the Task's `tool_input` +2. Checks if a matching folder exists in `.deepwork/learning-agents/` — if not, exits silently +3. 
Creates `.deepwork/tmp/agent_sessions///needs_learning_as_of_timestamp` with current timestamp +4. Creates `.deepwork/tmp/agent_sessions///agent_used` with the agent name +5. Outputs the post-task reminder as feedback (contents of `learning_agent_post_task_reminder.md`) + +**Hook config:** +```json +{ + "PostToolUse": [ + { + "matcher": "Task", + "hooks": [ + { + "type": "command", + "command": "${CLAUDE_PLUGIN_ROOT}/hooks/post_task.sh" + } + ] + } + ] +} +``` + +### Stop (session_stop.sh) + +Fires when the main agent finishes responding. The script: + +1. Searches for any `needs_learning_as_of_timestamp` files under `.deepwork/tmp/agent_sessions/` +2. If none found, exits silently (exit 0, no output) +3. If found, outputs feedback suggesting a learning cycle — does NOT block (exit 0 with stdout) + +**Hook config:** +```json +{ + "Stop": [ + { + "matcher": "", + "hooks": [ + { + "type": "command", + "command": "${CLAUDE_PLUGIN_ROOT}/hooks/session_stop.sh" + } + ] + } + ] +} +``` + +## Skills + +### User-Facing Skills + +#### learning-agents (dispatch) +Entry point skill. Parses user intent and dispatches to the appropriate sub-skill: +- `/learning-agents create ` → invokes `create-agent` +- `/learning-agents learn` → invokes `learn` +- `/learning-agents report_issue
` → invokes `report-issue` + +#### create-agent +Creates a new LearningAgent. The skill first invokes a setup script, then guides the user through filling in the content. + +**Step 1 — Setup script (`scripts/create_agent.sh`)** + +The skill invokes a bundled shell script that handles all the boilerplate: + +1. Creates the LearningAgent directory structure: + ``` + .deepwork/learning-agents// + ├── core-knowledge.md # Stubbed with TODO placeholder + ├── topics/ # Empty directory + ├── learnings/ # Empty directory + └── additional_learning_guidelines/ + ├── README.md # Explains each file's purpose + ├── issue_identification.md # Empty (customize for identify step) + ├── issue_investigation.md # Empty (customize for investigate step) + └── learning_from_issues.md # Empty (customize for incorporate step) + ``` + +2. Creates the Claude Code agent frontmatter file at `.claude/agents/.md` with: + - TODO entries for `name` and `description` in the frontmatter (to be filled in by the agent in step 2) + - Body that uses `!`command`` dynamic includes to pull content from the LearningAgent directory at invocation time. Topics are included as an index (filename + name), not in full. Learnings are referenced by directory path: + ```markdown + --- + name: TODO + description: "TODO" + --- + + # Core Knowledge + + !`cat .deepwork/learning-agents//core-knowledge.md` + + # Topics + + Located in `.deepwork/learning-agents//topics/` + + !`for f in .deepwork/learning-agents//topics/*.md; do ...extract name from frontmatter...; echo "- $filename: $name"; done` + + # Learnings + + Learnings are incident post-mortems from past agent sessions. Located in `.deepwork/learning-agents//learnings/`. + ``` + + This means the agent file never needs regeneration — it always reflects the latest knowledge at invocation time. Topics are indexed rather than fully included to keep the prompt focused. 
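The elided topic-index loop in the generated agent file can be written out as a standalone function for testing outside Claude Code. This is a sketch: `topic_index` is a hypothetical wrapper name, while the `awk` expression is the one from the example agent file shown earlier in this document.

```shell
#!/bin/bash
# Sketch: print "- <filename>: <name>" for every topic file in a directory.
# topic_index is a hypothetical wrapper; the awk program mirrors the one in
# the example agent file (first name: field inside the leading frontmatter).
topic_index() {
  local topics_dir="$1" f name
  for f in "$topics_dir"/*.md; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$f" ] || continue
    name=$(awk '/^---/{c++; next} c==1 && /^name:/{sub(/^name: *"?/,""); sub(/"$/,""); print; exit}' "$f")
    echo "- $(basename "$f"): $name"
  done
}
```

Running it against an empty `topics/` directory prints nothing, which matches the scaffold's initial state.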
+ +**Step 2 — Fill in agent identity** + +After the script runs, the skill prompts the user to describe what the agent is an expert in, then: +- Updates `core-knowledge.md` with the agent's expertise +- Updates the `.claude/agents/.md` frontmatter (`name` and `description`) to reflect the agent's domain + +**Step 3 — Seed initial knowledge** + +Fills in key files in the LearningAgent directory — initial topics and/or learnings if the user provides seed knowledge about the domain. + +#### learn +Runs the full learning cycle on all sessions needing it. Workflow: +1. Uses `!`find ...`` to inject a list of all paths containing a `needs_learning_as_of_timestamp` file into the prompt +2. For each such folder, spawns a Task with the `LearningAgentExpert` agent using **Sonnet model** to run the `identify` skill +3. After identification completes, spawns a Task with the `LearningAgentExpert` to run `investigate-issues` then `incorporate-learnings` in sequence + +### Hidden Skills (used by LearningAgentExpert during learning) + +#### identify +Reviews a session transcript to find issues. Takes the session/agent_id folder path as an argument. +- Uses `!`cat ...`` to inject `learning_last_performed_timestamp` value into the prompt +- Reads the transcript and identifies mistakes, underperformance, or knowledge gaps +- Calls `report-issue` for each issue found +- If `learning_last_performed_timestamp` exists, starts scanning from that point + +#### report-issue +Creates an `.issue.yml` file in the session's agent log folder. Sets initial status to `identified` with the issue description and observed timestamps. See `issue_yml_format.md` for the schema. + +#### investigate-issues +Processes all `identified` issues in a session folder: +1. Finds issues with status `identified` (includes example bash command in skill prompt) +2. Uses `!`cat $0/agent_used`` to inject the agent name, then `!`cat`` to load the agent's instructions — avoiding extra round trips +3. 
Reads the agent's full expertise and knowledge +4. For each issue: reads relevant transcript sections, determines root cause, updates status to `investigated` with `investigation_report` + +#### incorporate-learnings +Integrates investigated issues into the LearningAgent: +1. Finds issues with status `investigated` (includes example bash command) +2. For each issue, takes a learning action — one of: + - **Update core knowledge**: Modify `core-knowledge.md` to address the knowledge gap + - **Add a learning**: Create a new file in `learnings/` with the insight + - **Add/update a topic**: Create or update a file in `topics/` with reference docs + - **Update existing learning**: Amend an existing learning with new evidence +3. Updates issue status to `learned` +4. After all issues processed, deletes `needs_learning_as_of_timestamp` +5. Updates `learning_last_performed_timestamp` in the session log folder to current time + +## Agents + +### LearningAgentExpert + +A standard (non-learning) agent defined in the plugin. Its prompt dynamically includes all the LearningAgent documentation via `!`cat ...``: + +- `learning_agent_file_structure.md` +- `issue_yml_format.md` +- `learning_log_folder_structure.md` +- `learning_agent_post_task_reminder.md` + +This agent is used by the `learn` skill to execute `identify`, `investigate-issues`, and `incorporate-learnings` sub-skills. It understands the full LearningAgent system and can work with any LearningAgent's files. + +This is a **normal** agent (not a LearningAgent) because it ships with the plugin and should not evolve per-repo — it gets updated with the package. + +## Design Decisions + +### Why a Plugin (not a DeepWork Job) +LearningAgents are a cross-cutting concern that enhances any agent usage, not a specific multi-step workflow. A plugin can install hooks, skills, and agents in one unit. Jobs are for repeatable multi-step work products. 
+ +### Why `!`command`` Dynamic Includes +Static agent files go stale as learnings accumulate. By using Claude Code's `!`command`` dynamic context injection, the agent file always reflects the latest knowledge without requiring regeneration after every learning cycle. The shell commands run before content is sent to Claude, so Claude receives the actual data. + +### Why Session-Level Issue Tracking +Issues are tied to specific transcripts for evidence. Storing them alongside session logs keeps the evidence linkage clear. Once learnings are incorporated, the session logs can be cleaned up without losing the persistent knowledge (which lives in the agent's `learnings/` directory). + +### Why Sonnet for Learning Tasks +The `learn` skill spawns identification tasks using the Sonnet model. Transcript review is high-volume, pattern-matching work that doesn't require the most capable model. This keeps learning cycles fast and cost-effective. + +### Why Hidden Skills +Skills like `identify`, `report-issue`, `investigate-issues`, and `incorporate-learnings` are implementation details of the learning cycle. They're invoked by the `learn` skill via Task delegation, not directly by users. Hiding them keeps the user-facing skill surface clean. 
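The cleanup mentioned under session-level issue tracking is not specified in this design. As one possible reading, a folder could be considered safe to remove once its flag file is gone and every issue file has reached `learned`. The helper name `is_session_folder_done` and that criterion are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: decide whether a session/agent log folder has been fully processed.
# is_session_folder_done and its criterion are assumptions, not plugin code.
is_session_folder_done() {
  local dir="$1" issue
  # A remaining flag file means a learning cycle is still pending.
  [ ! -e "$dir/needs_learning_as_of_timestamp" ] || return 1
  for issue in "$dir"/*.issue.yml; do
    [ -f "$issue" ] || continue
    # Any issue not yet at status "learned" still needs its evidence.
    grep -q '^status: *learned *$' "$issue" || return 1
  done
  return 0
}
```

The function only reports status and deletes nothing, so a caller can print candidate folders for review before removing anything.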
diff --git a/learning_agents/.claude-plugin/plugin.json b/learning_agents/.claude-plugin/plugin.json new file mode 100644 index 00000000..f88fd19e --- /dev/null +++ b/learning_agents/.claude-plugin/plugin.json @@ -0,0 +1,9 @@ +{ + "name": "learning-agents", + "description": "Auto-improving AI sub-agents that learn from their mistakes across sessions", + "version": "0.1.0", + "author": { + "name": "DeepWork" + }, + "repository": "https://github.com/Unsupervisedcom/deepwork" +} diff --git a/learning_agents/agents/learning-agent-expert.md b/learning_agents/agents/learning-agent-expert.md new file mode 100644 index 00000000..115b926b --- /dev/null +++ b/learning_agents/agents/learning-agent-expert.md @@ -0,0 +1,46 @@ +--- +name: learning-agent-expert +description: "Expert on the LearningAgent system. Operates on LearningAgent files: identifies issues in transcripts, investigates root causes, and incorporates learnings into agent definitions." +--- + +# LearningAgent System Expert + +You are the meta-expert that operates on LearningAgent files. You understand the file structure, issue format, learning log lifecycle, and how to improve agents based on session transcripts. + +## Reference Documentation + +### Agent File Structure + +!`cat ${CLAUDE_PLUGIN_ROOT}/doc/learning_agent_file_structure.md` + +### Issue File Format + +!`cat ${CLAUDE_PLUGIN_ROOT}/doc/issue_yml_format.md` + +### Learning Log Folder Structure + +!`cat ${CLAUDE_PLUGIN_ROOT}/doc/learning_log_folder_structure.md` + +### Post-Task Reminder + +!`cat ${CLAUDE_PLUGIN_ROOT}/doc/learning_agent_post_task_reminder.md` + +## Your Role + +You are invoked by the learning cycle skills (`identify`, `investigate-issues`, `incorporate-learnings`) to process session transcripts and improve LearningAgent definitions. You have deep knowledge of: + +1. **How LearningAgent files are structured** — `core-knowledge.md`, `topics/`, `learnings/` directories +2. 
**The issue lifecycle** — `identified` -> `investigated` -> `learned` +3. **What makes a good learning** — specific, actionable, grounded in evidence from transcripts +4. **How to update agent expertise** — when to add topics vs learnings vs amend `core-knowledge.md` + +When processing transcripts, focus on: +- **Concrete mistakes**: Wrong outputs, incorrect assumptions, missed edge cases +- **Knowledge gaps**: Areas where the agent lacked necessary domain knowledge +- **Pattern failures**: Repeated errors suggesting a systemic issue +- **Missed context**: Information available but not utilized + +Avoid: +- Reporting trivial issues (typos, minor formatting) +- Creating learnings for one-off environmental issues (network timeouts, etc.) +- Duplicating existing learnings already in the agent's knowledge base diff --git a/learning_agents/doc/issue_yml_format.md b/learning_agents/doc/issue_yml_format.md new file mode 100644 index 00000000..a425cab8 --- /dev/null +++ b/learning_agents/doc/issue_yml_format.md @@ -0,0 +1,57 @@ +# Issue File Format + +Issue files track problems observed in LearningAgent sessions. They are stored in the session's agent log folder and processed during learning cycles. + +## Filename + +``` +.issue.yml +``` + +Use dashes and keep names brief but descriptive. Examples: +- `wrong-retry-strategy.issue.yml` +- `missed-edge-case-in-validation.issue.yml` +- `hallucinated-api-endpoint.issue.yml` + +## Fields + +```yaml +status: identified +seen_at_timestamps: + - "2025-01-15T14:32:00Z" + - "2025-01-15T14:45:00Z" +issue_description: | + Freeform text explaining the thing that went wrong. + This describes the PROBLEM, not the cause. +investigation_report: | + Freeform text explaining the root cause of the reported issue. + Should include specific line numbers of key evidence in the transcript. + NOT present when the issue is first created. 
+``` + +### status + +Tracks the issue through the learning lifecycle: + +| Status | Meaning | +|--------|---------| +| `identified` | Issue observed but not yet researched further | +| `investigated` | Root cause understood; we know why it happened | +| `learned` | Learning has been incorporated into the LearningAgent | + +### seen_at_timestamps + +Array of ISO 8601 timestamps at which the issue **manifested** (not the timestamps of root-cause lines). These are either: +- The exact timestamps of the transcript lines where the issue appears, if reviewing transcript files +- The current time, if reporting the issue in real time + +### issue_description + +Freeform text describing the observable problem. Focus on **what went wrong**, not why. Be specific enough that someone reading this can understand the failure without seeing the transcript. + +### investigation_report + +Freeform text explaining the **root cause** of the issue. Added during the `investigate-issues` step (not present when the issue is first created). Should include: +- Specific line numbers from the transcript as evidence +- Why the agent behaved incorrectly +- What knowledge gap or instruction deficiency caused the issue diff --git a/learning_agents/doc/learning_agent_file_structure.md b/learning_agents/doc/learning_agent_file_structure.md new file mode 100644 index 00000000..1c360a22 --- /dev/null +++ b/learning_agents/doc/learning_agent_file_structure.md @@ -0,0 +1,117 @@ +# LearningAgent File Structure + +All LearningAgents live in `.deepwork/learning-agents/`. Each agent has its own subdirectory named with dashes (e.g., `rails-activejob`). 
+ +## Directory Layout + +``` +.deepwork/learning-agents/ +└── / + ├── core-knowledge.md # Core expertise (required) + ├── topics/ # Detailed topic documentation + │ └── .md + ├── learnings/ # Experience-based insights + │ └── .md + └── additional_learning_guidelines/ # Per-agent learning cycle customization + ├── README.md # Explains each file's purpose + ├── issue_identification.md # Extra guidance for identify step + ├── issue_investigation.md # Extra guidance for investigate step + └── learning_from_issues.md # Extra guidance for incorporate step +``` + +## core-knowledge.md + +The agent's core expertise, written in second person ("You should...") because this text becomes the agent's system instructions. Structure it as: + +1. Identity statement ("You are an expert on...") +2. Core concepts and terminology +3. Common patterns and best practices +4. Pitfalls to avoid +5. Decision frameworks + +This file is plain markdown (no frontmatter needed). It is dynamically included into the agent's prompt at invocation time via the `.claude/agents/.md` file. + +The agent's discovery description (used to decide whether to invoke this agent) lives in the `.claude/agents/.md` frontmatter `description` field, not in this directory. + +The folder name is the source of truth for the agent's name. + +> **Note**: `learning_last_performed_timestamp` is tracked per-conversation in the session log folder (`.deepwork/tmp/agent_sessions/`), not here. See `learning_log_folder_structure.md`. + +## Topics vs Learnings + +Topics and learnings serve different purposes: + +- **Topics** are conceptual reference material about areas within the agent's domain. They document *how things work* — patterns, APIs, conventions, decision frameworks. Think of them as chapters in a reference manual. +- **Learnings** are detailed post-mortems of specific incidents where the agent made mistakes and something was learned. They document *what went wrong and why* — like debugging war stories. 
They're suited for complex experiences (e.g., a multi-step debugging session that uncovered a subtle interaction) where the narrative context matters for understanding the lesson. + +Both are available to the agent at invocation time (topics as an index in the prompt, learnings by directory path), but they serve different retrieval needs: topics answer "how should I approach X?" while learnings answer "what went wrong last time I tried X?" + +## topics/ Directory + +Reference documentation on conceptual areas within the agent's domain. Each topic is a markdown file with YAML frontmatter: + +```markdown +--- +name: Retry Handling +keywords: + - retry + - exponential backoff + - dead letter queue +last_updated: 2025-01-15 +--- + +Detailed documentation about retry handling... +``` + +**Frontmatter fields:** +- `name` (required): Human-readable topic name +- `keywords` (optional): Topic-specific search terms. Avoid broad domain terms. +- `last_updated` (optional): Date in YYYY-MM-DD format + +## learnings/ Directory + +Incident post-mortems from real agent sessions. Each learning captures a specific mistake or failure, what investigation revealed, and the generalizable insight. These are most valuable for complex experiences — multi-step debugging sessions, subtle misunderstandings, or surprising interactions — where the full narrative context helps the agent avoid repeating the same mistake. + +Each learning is a markdown file with YAML frontmatter: + +```markdown +--- +name: Job errors not going to Sentry +last_updated: 2025-01-20 +summarized_result: | + Brief 1-3 sentence summary of the key takeaway. +--- + +## Context +What was happening when the issue occurred. + +## Investigation +What was discovered during investigation. + +## Resolution +How the issue was resolved. + +## Key Takeaway +The generalizable insight that should inform future behavior. 
+``` + +**Frontmatter fields:** +- `name` (required): Descriptive name of the learning +- `last_updated` (optional): Date in YYYY-MM-DD format +- `summarized_result` (optional but recommended): 1-3 sentence summary + +## additional_learning_guidelines/ Directory + +Per-agent customization for the learning cycle. Each file is automatically included (via `!`cat``) in the corresponding hidden skill's Context section. Files are empty by default — add markdown content to guide learning behavior for this specific agent. + +- **`issue_identification.md`** — Included in the `identify` skill. Use to specify what kinds of issues matter most, what to ignore, or domain-specific mistake signals. +- **`issue_investigation.md`** — Included in the `investigate-issues` skill. Use to guide root cause analysis — common root causes, which knowledge areas to check first, investigation heuristics. +- **`learning_from_issues.md`** — Included in the `incorporate-learnings` skill. Use to guide how learnings are integrated — preferences for topics vs learnings, naming conventions, or areas of core-knowledge to keep concise. + +See the `README.md` in the directory for a quick reference. + +## Naming Conventions + +- **Folder names**: Use dashes (`rails-activejob`, `data-pipeline`) +- **Agent names in Claude Code**: Match folder names (`rails-activejob`) +- **Topic/learning filenames**: Use dashes, descriptive names (`retry-handling.md`) diff --git a/learning_agents/doc/learning_agent_post_task_reminder.md b/learning_agents/doc/learning_agent_post_task_reminder.md new file mode 100644 index 00000000..9308edd6 --- /dev/null +++ b/learning_agents/doc/learning_agent_post_task_reminder.md @@ -0,0 +1,5 @@ +If you need anything else related to the same topics from that agent, `resume` the same task rather than starting a new one — this preserves context and produces better results. + +If the agent made a mistake or underperformed your expectations, either: +1. 
**Resume** the conversation explaining the issue and asking for corrected output/action, or +2. Invoke `/learning-agents:report_issue
` diff --git a/learning_agents/doc/learning_log_folder_structure.md b/learning_agents/doc/learning_log_folder_structure.md new file mode 100644 index 00000000..1fac7fd1 --- /dev/null +++ b/learning_agents/doc/learning_log_folder_structure.md @@ -0,0 +1,53 @@ +# Learning Log Folder Structure + +Session-level agent interaction logs are stored in `.deepwork/tmp/agent_sessions/`. These directories track which LearningAgents were used and what issues were found during learning cycles. + +## Directory Layout + +``` +.deepwork/tmp/agent_sessions/ +└── / + └── / + ├── needs_learning_as_of_timestamp # Flag: learning needed (auto-created by hook) + ├── learning_last_performed_timestamp # When learning was last run on this conversation + ├── agent_used # Name of the LearningAgent (auto-created by hook) + └── .issue.yml # Issue files (created during learning cycle) +``` + +## Files + +### needs_learning_as_of_timestamp + +Created automatically by the post-Task hook whenever a LearningAgent is used. The file body contains a single ISO 8601 timestamp indicating when the agent was last invoked. This file serves as a flag: its presence means the session transcript has not yet been processed for learnings. + +Deleted by `incorporate-learnings` after all issues in the folder have been processed. + +### learning_last_performed_timestamp + +Updated by `incorporate-learnings` after processing issues in this conversation. Contains a single ISO 8601 timestamp. Used by the `identify` skill to skip already-processed portions of the transcript — if this file exists, identification starts scanning from that point forward rather than re-reading the entire transcript. + +### agent_used + +Created automatically by the post-Task hook. Contains the name of the LearningAgent that was used in this session (matching the folder name under `.deepwork/learning-agents/`). 
This links the session's agent_id back to the LearningAgent definition so learning skills can look up the agent's instructions and knowledge. + +### *.issue.yml + +Issue files created by the `identify` and `report-issue` skills. See `issue_yml_format.md` for the full schema. These files progress through statuses: `identified` → `investigated` → `learned`. + +## Lifecycle + +1. **Agent used**: Post-Task hook creates `needs_learning_as_of_timestamp` and `agent_used` +2. **Session ends**: Stop hook detects `needs_learning_as_of_timestamp` files and suggests running a learning cycle +3. **Learning cycle** (`/learning-agents learn`): + a. `identify` reads transcripts and creates `*.issue.yml` files with status `identified` + b. `investigate-issues` researches each issue and updates status to `investigated` + c. `incorporate-learnings` integrates learnings into the agent and updates status to `learned` + d. `needs_learning_as_of_timestamp` is deleted + e. `learning_last_performed_timestamp` is updated in this folder + +## Notes + +- The `session_id` comes from Claude Code's session identifier +- The `agent_id` is the unique agent ID assigned by Claude Code when spawning a Task +- The `.deepwork/tmp/` directory is intended for transient working files and can be gitignored +- Transcript files referenced by issues are Claude Code's own session transcripts (typically at `~/.claude/projects/.../sessions//transcript.jsonl`) diff --git a/learning_agents/hooks/hooks.json b/learning_agents/hooks/hooks.json new file mode 100644 index 00000000..b93c16a4 --- /dev/null +++ b/learning_agents/hooks/hooks.json @@ -0,0 +1,26 @@ +{ + "hooks": { + "PostToolUse": [ + { + "matcher": "Task", + "hooks": [ + { + "type": "command", + "command": "${CLAUDE_PLUGIN_ROOT}/hooks/post_task.sh" + } + ] + } + ], + "Stop": [ + { + "matcher": "", + "hooks": [ + { + "type": "command", + "command": "${CLAUDE_PLUGIN_ROOT}/hooks/session_stop.sh" + } + ] + } + ] + } +} diff --git 
a/learning_agents/hooks/post_task.sh b/learning_agents/hooks/post_task.sh new file mode 100755 index 00000000..8a50eebc --- /dev/null +++ b/learning_agents/hooks/post_task.sh @@ -0,0 +1,93 @@ +#!/bin/bash +# post_task.sh - PostToolUse hook for Task tool +# +# Detects when a LearningAgent is used via Task and creates session tracking +# files so the learning cycle can process the transcript later. +# +# Input (stdin): JSON with tool_input, tool_response, session_id +# Output (stdout): JSON with optional systemMessage +# Exit: Always 0 (non-blocking) + +set -euo pipefail + +# ============================================================================ +# READ STDIN +# ============================================================================ + +HOOK_INPUT="" +if [ ! -t 0 ]; then + HOOK_INPUT=$(cat) +fi + +if [ -z "$HOOK_INPUT" ]; then + echo '{}' + exit 0 +fi + +# ============================================================================ +# EXTRACT FIELDS +# ============================================================================ + +# Extract session_id +SESSION_ID=$(echo "$HOOK_INPUT" | jq -r '.session_id // empty' 2>/dev/null) +if [ -z "$SESSION_ID" ]; then + echo '{}' + exit 0 +fi + +# Extract agent name from tool_input.name (the name parameter passed to Task) +AGENT_NAME=$(echo "$HOOK_INPUT" | jq -r '.tool_input.name // empty' 2>/dev/null) +if [ -z "$AGENT_NAME" ]; then + echo '{}' + exit 0 +fi + +# Extract agent_id from tool_response +AGENT_ID=$(echo "$HOOK_INPUT" | jq -r '.tool_response.agentId // .tool_response.agent_id // empty' 2>/dev/null) +if [ -z "$AGENT_ID" ]; then + echo '{}' + exit 0 +fi + +# ============================================================================ +# CHECK IF THIS IS A LEARNING AGENT +# ============================================================================ + +AGENT_DIR=".deepwork/learning-agents/${AGENT_NAME}" +if [ ! 
-d "$AGENT_DIR" ]; then + echo '{}' + exit 0 +fi + +# ============================================================================ +# CREATE SESSION TRACKING FILES +# ============================================================================ + +SESSION_DIR=".deepwork/tmp/agent_sessions/${SESSION_ID}/${AGENT_ID}" +mkdir -p "$SESSION_DIR" + +# Write timestamp flag +date -u +"%Y-%m-%dT%H:%M:%SZ" > "${SESSION_DIR}/needs_learning_as_of_timestamp" + +# Write agent name for later lookup +echo "$AGENT_NAME" > "${SESSION_DIR}/agent_used" + +# ============================================================================ +# OUTPUT POST-TASK REMINDER +# ============================================================================ + +PLUGIN_ROOT="${CLAUDE_PLUGIN_ROOT:-$(cd "$(dirname "$0")/.." && pwd)}" +REMINDER="" +if [ -f "${PLUGIN_ROOT}/doc/learning_agent_post_task_reminder.md" ]; then + REMINDER=$(cat "${PLUGIN_ROOT}/doc/learning_agent_post_task_reminder.md" | sed 's/\\/\\\\/g; s/"/\\"/g; s/\t/\\t/g' | tr '\n' ' ') +fi + +if [ -n "$REMINDER" ]; then + cat << EOF +{"systemMessage":"${REMINDER}"} +EOF +else + echo '{}' +fi + +exit 0 diff --git a/learning_agents/hooks/session_stop.sh b/learning_agents/hooks/session_stop.sh new file mode 100755 index 00000000..3c16f7b1 --- /dev/null +++ b/learning_agents/hooks/session_stop.sh @@ -0,0 +1,60 @@ +#!/bin/bash +# session_stop.sh - Stop hook for session end +# +# Checks if any LearningAgents were used during the session and suggests +# running a learning cycle if there are unprocessed transcripts. +# +# Input (stdin): JSON with session info +# Output (stdout): JSON with optional systemMessage +# Exit: Always 0 (non-blocking) + +set -euo pipefail + +# ============================================================================ +# CHECK FOR PENDING LEARNING +# ============================================================================ + +if [ ! 
-d ".deepwork/tmp/agent_sessions" ]; then + echo '{}' + exit 0 +fi + +PENDING_FILES=$(find .deepwork/tmp/agent_sessions -name "needs_learning_as_of_timestamp") + +if [ -z "$PENDING_FILES" ]; then + echo '{}' + exit 0 +fi + +# Count unique agents with pending learning +AGENT_COUNT=0 +AGENT_NAMES="" +for f in $PENDING_FILES; do + DIR=$(dirname "$f") + if [ -f "${DIR}/agent_used" ]; then + AGENT_NAME=$(cat "${DIR}/agent_used") + AGENT_NAMES="${AGENT_NAMES} ${AGENT_NAME}" + AGENT_COUNT=$((AGENT_COUNT + 1)) + fi +done + +# Deduplicate agent names +UNIQUE_AGENTS=$(echo "$AGENT_NAMES" | tr ' ' '\n' | sort -u | tr '\n' ' ' | xargs) + +if [ "$AGENT_COUNT" -eq 0 ]; then + echo '{}' + exit 0 +fi + +# ============================================================================ +# OUTPUT LEARNING SUGGESTION +# ============================================================================ + +MESSAGE="LearningAgents used this session (${UNIQUE_AGENTS}) have unprocessed transcripts. Consider running '/learning-agents learn' to identify and incorporate learnings." 
+ESCAPED_MESSAGE=$(echo "$MESSAGE" | sed 's/\\/\\\\/g; s/"/\\"/g' | tr '\n\t' '  ') + +cat << EOF +{"systemMessage":"${ESCAPED_MESSAGE}"} +EOF + +exit 0 diff --git a/learning_agents/scripts/create_agent.sh b/learning_agents/scripts/create_agent.sh new file mode 100755 index 00000000..aefc4712 --- /dev/null +++ b/learning_agents/scripts/create_agent.sh @@ -0,0 +1,111 @@ +#!/bin/bash +# create_agent.sh - Create a new LearningAgent scaffold +# +# Usage: create_agent.sh +# +# Creates: +# .deepwork/learning-agents//core-knowledge.md +# .deepwork/learning-agents//topics/.gitkeep +# .deepwork/learning-agents//learnings/.gitkeep +# .deepwork/learning-agents//additional_learning_guidelines/ +# .claude/agents/.md + +set -euo pipefail + +AGENT_NAME="${1:-}" + +if [ -z "$AGENT_NAME" ]; then + echo "Usage: create_agent.sh " >&2 + exit 1 +fi + +AGENT_DIR=".deepwork/learning-agents/${AGENT_NAME}" +CLAUDE_AGENT_FILE=".claude/agents/${AGENT_NAME}.md" + +# ============================================================================ +# CREATE LEARNING AGENT DIRECTORY +# ============================================================================ + +if [ -d "$AGENT_DIR" ]; then + echo "Agent directory already exists: ${AGENT_DIR}" >&2 +else + mkdir -p "${AGENT_DIR}/topics" "${AGENT_DIR}/learnings" "${AGENT_DIR}/additional_learning_guidelines" + + # Create .gitkeep files for empty directories + touch "${AGENT_DIR}/topics/.gitkeep" + touch "${AGENT_DIR}/learnings/.gitkeep" + + # Create empty additional learning guideline files + touch "${AGENT_DIR}/additional_learning_guidelines/issue_identification.md" + touch "${AGENT_DIR}/additional_learning_guidelines/issue_investigation.md" + touch "${AGENT_DIR}/additional_learning_guidelines/learning_from_issues.md" + + # Create README for additional learning guidelines + cat > "${AGENT_DIR}/additional_learning_guidelines/README.md" << 'ALG_README' +# Additional Learning Guidelines + +These files let you customize how the learning cycle
works for this agent. Each file is automatically included in the corresponding learning skill. Leave empty to use default behavior, or add markdown instructions to guide the process. + +## Files + +- **issue_identification.md** — Included during the `identify` step. Use this to tell the reviewer what kinds of issues matter most for this agent, what to ignore, or domain-specific signals of mistakes. + +- **issue_investigation.md** — Included during the `investigate-issues` step. Use this to guide root cause analysis — e.g., common root causes in this domain, which parts of the agent's knowledge to check first, or investigation heuristics. + +- **learning_from_issues.md** — Included during the `incorporate-learnings` step. Use this to guide how learnings are integrated — e.g., preferences for topics vs learnings, naming conventions, or areas of core-knowledge that should stay concise. +ALG_README + + # Create core-knowledge.md with TODO placeholder + cat > "${AGENT_DIR}/core-knowledge.md" << 'CORE_KNOWLEDGE' +TODO: Complete current knowledge of this domain. +Written in second person ("You should...") because this text +becomes the agent's system instructions. Structure it as: +1. Identity statement ("You are an expert on...") +2. Core concepts and terminology +3. Common patterns and best practices +4. Pitfalls to avoid +5. 
Decision frameworks +CORE_KNOWLEDGE + + echo "Created agent directory: ${AGENT_DIR}" +fi + +# ============================================================================ +# CREATE CLAUDE CODE AGENT FILE +# ============================================================================ + +if [ -f "$CLAUDE_AGENT_FILE" ]; then + echo "Claude agent file already exists: ${CLAUDE_AGENT_FILE}" >&2 +else + mkdir -p "$(dirname "$CLAUDE_AGENT_FILE")" + + # Use quoted heredoc to keep backticks/dollars literal, then sed in agent name + cat > "$CLAUDE_AGENT_FILE" << 'AGENT_MD' +--- +name: TODO +description: "TODO" +--- + +# Core Knowledge + +!`cat .deepwork/learning-agents/__AGENT__/core-knowledge.md` + +# Topics + +Located in `.deepwork/learning-agents/__AGENT__/topics/` + +!`for f in .deepwork/learning-agents/__AGENT__/topics/*.md; do [ -f "$f" ] || continue; desc=$(awk '/^---/{c++; next} c==1 && /^name:/{sub(/^name: *"?/,""); sub(/"$/,""); print; exit}' "$f"); echo "- $(basename "$f"): $desc"; done` + +# Learnings + +Learnings are incident post-mortems from past agent sessions capturing mistakes, root causes, and generalizable insights. Review them before starting work to avoid repeating past mistakes. Located in `.deepwork/learning-agents/__AGENT__/learnings/`. +AGENT_MD + + # Replace placeholder with actual agent name (use .bak for GNU/BSD sed portability) + sed -i.bak "s/__AGENT__/${AGENT_NAME}/g" "$CLAUDE_AGENT_FILE" + rm -f "${CLAUDE_AGENT_FILE}.bak" + + echo "Created Claude agent file: ${CLAUDE_AGENT_FILE}" +fi + +echo "Agent scaffold created for: ${AGENT_NAME}" diff --git a/learning_agents/skills/create-agent/SKILL.md b/learning_agents/skills/create-agent/SKILL.md new file mode 100644 index 00000000..aa5189c1 --- /dev/null +++ b/learning_agents/skills/create-agent/SKILL.md @@ -0,0 +1,113 @@ +--- +name: create-agent +description: Creates a new LearningAgent with directory structure, core-knowledge.md, and Claude Code agent file. 
Guides the user through initial configuration. +disable-model-invocation: true +allowed-tools: Read, Edit, Write, Bash, Glob +--- + +# Create LearningAgent + +Create a new LearningAgent and guide the user through initial configuration. + +## Arguments + +`$ARGUMENTS` is the agent name. Use dashes for multi-word names (e.g., `rails-activejob`). If not provided, ask the user what to name the agent. + +## Procedure + +### Step 1: Validate and Run Scaffold Script + +If the name contains spaces or uppercase letters, normalize to lowercase dashes (e.g., "Rails ActiveJob" → `rails-activejob`). + +Check `.claude/agents/` for an existing file matching `.md`. If found, inform the user of the conflict and ask how to proceed. + +Run the scaffold script: + +```bash +${CLAUDE_PLUGIN_ROOT}/scripts/create_agent.sh $ARGUMENTS +``` + +If the script reports that directories already exist, inform the user and ask whether to proceed with updating the configuration or stop. + +### Step 2: Configure the Agent + +Ask the user about the agent's domain: + +- What domain or area of expertise does this agent cover? +- What kinds of tasks will it be delegated to handle? + +Based on their answers, update: + +1. **`.deepwork/learning-agents//core-knowledge.md`**: Replace the TODO content with the agent's core expertise in second person ("You should...", "You are an expert on..."). + + Example: + ``` + You are an expert on Rails ActiveJob. You understand the full job lifecycle, + supported adapters (Sidekiq, Resque, DelayedJob), retry configuration with + exponential backoff, and callback patterns. You should always check the adapter + documentation before recommending queue-specific features. + ``` + +2. **`.claude/agents/.md`** frontmatter: Replace the TODO placeholders: + - `name`: The agent's display name (human-readable, e.g., "Rails ActiveJob Expert") + - `description`: A concise description for deciding when to invoke this agent (2-3 sentences max). 
Example: "Expert on Rails ActiveJob patterns including job creation, queue configuration, and retry logic. Invoke when delegating background job tasks or debugging queue issues." + +### Step 3: Seed Initial Knowledge (Optional) + +Ask the user if they want to seed any initial topics or learnings. If yes, create files using these formats: + +**Topic file** (`.deepwork/learning-agents//topics/.md`): +```yaml +--- +name: "Topic Name" +keywords: + - keyword1 + - keyword2 +last_updated: "YYYY-MM-DD" +--- + +Detailed documentation about this topic area... +``` + +**Learning file** (`.deepwork/learning-agents//learnings/.md`): +```yaml +--- +name: "Learning Name" +last_updated: "YYYY-MM-DD" +summarized_result: | + One sentence summary of the key takeaway. +--- + +## Context +... +## Key Takeaway +... +``` + +### Step 4: Summary + +Output in this format: + +``` +## Agent Created: + +**Files created/modified:** +- `.deepwork/learning-agents//core-knowledge.md` — core expertise +- `.deepwork/learning-agents//topics/` — topic documentation +- `.deepwork/learning-agents//learnings/` — experience-based insights +- `.claude/agents/.md` — Claude Code agent file + +**Usage:** + Use the Task tool with `name: ""` to invoke this agent. + +**Learning cycle:** + The post-Task hook will automatically track sessions. Run `/learning-agents learn` + after sessions to identify and incorporate learnings. 
+``` + +## Guardrails + +- Do NOT overwrite existing files without user confirmation +- Do NOT create agents with names that conflict with existing Claude Code agents +- Use dashes consistently in folder names and `.claude/agents/` filenames +- Keep the `.claude/agents/` `description` field concise (2-3 sentences max) diff --git a/learning_agents/skills/identify/SKILL.md b/learning_agents/skills/identify/SKILL.md new file mode 100644 index 00000000..881ee616 --- /dev/null +++ b/learning_agents/skills/identify/SKILL.md @@ -0,0 +1,95 @@ +--- +name: identify +description: Reads a session transcript and identifies issues where a LearningAgent made mistakes, had knowledge gaps, or underperformed. Creates issue files for each problem found. +user-invocable: false +disable-model-invocation: true +allowed-tools: Read, Grep, Glob, Skill +--- + +# Identify Issues in Session Transcript + +You are an expert AI quality reviewer analyzing session transcripts to surface actionable issues in a LearningAgent's behavior. + +## Arguments + +`$ARGUMENTS` is the path to the session/agent_id folder (e.g., `.deepwork/tmp/agent_sessions///`). + +## Context + +**Agent used**: !`cat $ARGUMENTS/agent_used 2>/dev/null || echo "unknown"` + +**Last learning timestamp** (empty if never learned): !`cat $ARGUMENTS/learning_last_performed_timestamp 2>/dev/null` + +**Additional identification guidelines**: +!`cat .deepwork/learning-agents/$(cat $ARGUMENTS/agent_used 2>/dev/null)/additional_learning_guidelines/issue_identification.md 2>/dev/null` + +## Procedure + +### Step 1: Locate the Transcript + +Extract the session_id from `$ARGUMENTS` by taking the second-to-last path component. For example, from `.deepwork/tmp/agent_sessions/abc123/agent456/`, the session_id is `abc123`. 
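For illustration only (this skill's allowed tools do not include Bash), the "second-to-last path component" rule can be sketched in shell. The helper name `session_id_from_path` is hypothetical and not part of the plugin:

```shell
# Illustrative helper; the name is hypothetical and not part of the plugin.
session_id_from_path() {
    # dirname drops the final component (the agent_id folder) and ignores a
    # trailing slash; basename then keeps the last remaining component.
    basename "$(dirname "$1")"
}

session_id_from_path ".deepwork/tmp/agent_sessions/abc123/agent456/"   # prints abc123
```

The same call works with or without the trailing slash, which matters because session folder paths are written both ways in these skills.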
+ +Use Glob to find the transcript file by substituting the actual session_id: +``` +~/.claude/projects/**/sessions/abc123/*.jsonl +``` + +If no transcript is found, report the error (include the session_id and Glob pattern used) and stop. + +### Step 2: Read the Transcript + +Read the transcript file. The transcript is a JSONL file (one JSON object per line). Each line has a `type` field — agent turns appear as `type: "assistant"` messages and tool results appear as `type: "tool_result"`. Focus on assistant message content and tool call outcomes to evaluate agent behavior. + +If `learning_last_performed_timestamp` exists (shown in Context above), skip lines that occurred before that timestamp — only analyze new interactions since the last learning cycle. + +Focus on interactions involving the agent identified in `agent_used`. + +### Step 3: Identify Issues + +Look for these categories of problems: + +1. **Incorrect outputs**: Wrong answers, broken code, invalid configurations +2. **Knowledge gaps**: The agent didn't know something it should have +3. **Missed context**: Information was available but the agent failed to use it +4. **Poor judgment**: The agent made a questionable decision or took a suboptimal approach +5. 
**Pattern failures**: Repeated errors suggesting a systemic issue + +Skip trivial issues like: +- Minor formatting differences +- Environmental issues (network timeouts, tool failures) +- Issues already covered by existing learnings + +### Step 4: Report Each Issue + +For each issue identified, invoke the `report-issue` skill once per issue: + +``` +Skill learning-agents:report-issue $ARGUMENTS "" " +``` + +Example: `Skill learning-agents:report-issue .deepwork/tmp/agent_sessions/abc123/agent456/ "Knowledge gap: Agent did not know that date -v-30d is macOS-only syntax"` + +### Step 5: Summary + +Output in this format: + +``` +## Session Issue Summary + +**Session**: +**Agent**: +**Issues found**: + +| # | Category | Brief description | +|---|----------|-------------------| +| 1 | | | + +(or: "No actionable issues found. Agent performed well in this session.") +``` + +## Guardrails + +- Do NOT investigate root causes — that is the next step's job +- Do NOT modify the agent's knowledge base +- Do NOT create duplicate issues for the same problem +- Focus on actionable issues that can lead to concrete improvements diff --git a/learning_agents/skills/incorporate-learnings/SKILL.md b/learning_agents/skills/incorporate-learnings/SKILL.md new file mode 100644 index 00000000..e99a51ec --- /dev/null +++ b/learning_agents/skills/incorporate-learnings/SKILL.md @@ -0,0 +1,150 @@ +--- +name: incorporate-learnings +description: Takes investigated issues and incorporates the learnings into the LearningAgent's knowledge base by updating core-knowledge.md, topics, or learnings files. +user-invocable: false +disable-model-invocation: true +allowed-tools: Read, Grep, Glob, Edit, Write +--- + +# Incorporate Learnings + +Take investigated issues and integrate the lessons learned into the LearningAgent's knowledge base. + +## Arguments + +`$ARGUMENTS` is the path to the session log folder (e.g., `.deepwork/tmp/agent_sessions///`). 
+ +## Context + +**Agent used**: !`cat $ARGUMENTS/agent_used 2>/dev/null || echo "unknown"` + +If `agent_used` is "unknown", stop and report an error — the session folder is missing required metadata. + +**Additional incorporation guidelines**: +!`cat .deepwork/learning-agents/$(cat $ARGUMENTS/agent_used 2>/dev/null)/additional_learning_guidelines/learning_from_issues.md 2>/dev/null` + +## Procedure + +### Step 1: Find Investigated Issues + +List all issue files with status `investigated`: + +```bash +grep -l 'status: investigated' $ARGUMENTS/*.issue.yml +``` + +If no investigated issues are found, report that and skip to Step 5 (still update tracking files). + +### Step 2: Read Agent Knowledge Base + +Read the current state of the agent's knowledge: + +- `.deepwork/learning-agents//core-knowledge.md` +- `.deepwork/learning-agents//topics/*.md` +- `.deepwork/learning-agents//learnings/*.md` + +Where `` is from `$ARGUMENTS/agent_used`. + +### Step 3: Incorporate Each Issue + +For each investigated issue, read the issue file (both `issue_description` and `investigation_report`) and determine the best way to incorporate the learning. Apply options in this priority order: + +#### Option D (first priority): Amend existing content + +Check first. If a closely related file already exists in the agent's knowledge base that covers the same area, edit that file rather than creating a new one. + +Example: Issue "Agent used wrong retry count" when `topics/retry-handling.md` already exists → update the existing topic with the correct information. + +#### Option A: Update `core-knowledge.md` + +Use when the issue is a **universal one-liner** — something fundamental the agent should always know that can be expressed in 1-2 sentences. + +Example: Issue "Agent called a python program directly that only works with `uv run`" → add a bullet to `core-knowledge.md`: "Always use `uv run` when invoking `util.py`." 
+ +#### Option B: Add a new topic in `topics/` + +Use when the issue reveals a new or existing **conceptual area** needing 1+ paragraphs of reference material that is not always needed, but often enough to track. + +```markdown +--- +name: +keywords: + - +last_updated: +--- + + +``` + +Example: Issue "Agent didn't understand retry backoff patterns" → create `topics/retry-backoff.md` with documentation on exponential backoff, jitter, and dead letter queues. + +#### Option C: Add a new learning in `learnings/` + +Use when the **narrative context of how the issue unfolded** is needed to understand the resolution — multi-step debugging sessions, surprising interactions, or subtle misunderstandings. + +```markdown +--- +name: +last_updated: +summarized_result: | + <1-3 sentence summary of the key takeaway> +--- + +## Context + + +## Investigation + + +## Resolution + + +## Key Takeaway + +``` + +Example: Issue "Agent spent 20 minutes debugging a permissions error that was actually caused by a stale Docker volume" → create a learning capturing the full debugging narrative and the insight about checking Docker volumes early. + +**IMPORTANT**: If you add a `learnings` entry, consider also adding a brief note to a related topic that references the learning. + +#### Option E: Do nothing +If you decide that the issue would have been hard to prevent, or that it is extremely unlikely to be encountered again, make no changes and move on to Step 4. + +### Step 4: Update Issue Status + +For each incorporated issue, use Edit to change `status: investigated` to `status: learned` in the issue file. + +### Step 5: Update Session Tracking + +Always run this step, even if no issues were incorporated. + +1. Delete `needs_learning_as_of_timestamp` if it exists: + ```bash + rm -f "$ARGUMENTS/needs_learning_as_of_timestamp" + ``` + +2.
Write the current timestamp to `learning_last_performed_timestamp`: + ```bash + date -u +"%Y-%m-%dT%H:%M:%SZ" > $ARGUMENTS/learning_last_performed_timestamp + ``` + +### Step 6: Summary + +Output in this format: + +``` +## Incorporation Summary + +- **Issues processed**: +- | created learning | amended > +- → could not incorporate: +``` + +## Guardrails + +- Do NOT create overly broad or vague learnings — be specific and actionable +- Do NOT duplicate existing knowledge — check before adding +- Do NOT remove existing content unless it is directly contradicted by new evidence +- Keep `core-knowledge.md` concise — move detailed content to topics or learnings +- Use today's date for `last_updated` fields +- Always run Step 5 (update tracking files) even if no issues were incorporated diff --git a/learning_agents/skills/investigate-issues/SKILL.md b/learning_agents/skills/investigate-issues/SKILL.md new file mode 100644 index 00000000..e33bdc92 --- /dev/null +++ b/learning_agents/skills/investigate-issues/SKILL.md @@ -0,0 +1,102 @@ +--- +name: investigate-issues +description: Investigates identified issues in a LearningAgent session by reading the transcript, determining root causes, and updating issue files with investigation reports. +user-invocable: false +disable-model-invocation: true +allowed-tools: Read, Grep, Glob, Edit +--- + +# Investigate Issues + +Research identified issues from a LearningAgent session to determine their root causes. + +## Arguments + +`$ARGUMENTS` is the path to the session log folder (e.g., `.deepwork/tmp/agent_sessions///`). 
+ +## Context + +**Agent used**: !`cat $ARGUMENTS/agent_used 2>/dev/null || echo "unknown"` + +**Agent core knowledge**: +!`cat .deepwork/learning-agents/$(cat $ARGUMENTS/agent_used 2>/dev/null)/core-knowledge.md 2>/dev/null` + +**Additional investigation guidelines**: +!`cat .deepwork/learning-agents/$(cat $ARGUMENTS/agent_used 2>/dev/null)/additional_learning_guidelines/issue_investigation.md 2>/dev/null` + +## Procedure + +### Step 1: Find Identified Issues + +List all issue files with status `identified`: + +```bash +grep -l 'status: identified' $ARGUMENTS/*.issue.yml +``` + +If no identified issues are found, report that and stop. + +### Step 2: Locate the Transcript + +Extract the session_id from `$ARGUMENTS` by taking the second-to-last path component (e.g., from `.deepwork/tmp/agent_sessions/abc123/agent456/`, the session_id is `abc123`). + +Use Glob to find the transcript file by substituting the actual extracted session_id: +``` +~/.claude/projects/**/sessions//*.jsonl +``` + +If no transcript file is found, report the missing path and stop. Do not proceed to investigate without transcript evidence. + +### Step 3: Investigate Each Issue + +For each issue file with status `identified`: + +1. **Read the issue file** to understand what went wrong +2. **Search the transcript** for relevant sections — grep for keywords from `issue_description` or locate lines near timestamps in `seen_at_timestamps` +3. **Determine root cause** using this taxonomy: + - **Knowledge gap**: Missing or incomplete content in `core-knowledge.md` + - **Missing documentation**: A topic file does not exist or lacks needed detail + - **Incorrect instruction**: An existing instruction leads the agent to wrong behavior + - **Missing runtime context**: Information that should have been injected at runtime was absent +4. 
**Write the investigation report** explaining: + - Specific evidence from the transcript (reference line numbers) + - The root cause analysis + - What knowledge gap or instruction deficiency led to the issue + +### Step 4: Update Issue Files + +For each investigated issue, use Edit to update the issue file: + +1. Change `status: identified` to `status: investigated` +2. Add the `investigation_report` field with your findings: + +```yaml +status: investigated +seen_at_timestamps: + - "2025-01-15T14:32:00Z" +issue_description: | + +investigation_report: | + +``` + +### Step 5: Summary + +Output in this format for each issue: + +``` +**Issue**: +**Root cause**: +**Recommended update type**: +``` + +## Guardrails + +- Do NOT modify the agent's knowledge base — that is the incorporate step's job +- Do NOT change the `issue_description` — only add the `investigation_report` +- Do NOT skip issues — investigate every `identified` issue in the folder +- Be specific about evidence — reference transcript line numbers +- Focus on actionable root causes, not blame diff --git a/learning_agents/skills/learn/SKILL.md b/learning_agents/skills/learn/SKILL.md new file mode 100644 index 00000000..87663e18 --- /dev/null +++ b/learning_agents/skills/learn/SKILL.md @@ -0,0 +1,83 @@ +--- +name: learn +description: Runs the learning cycle on all LearningAgent sessions with pending transcripts. Identifies issues, investigates root causes, and incorporates learnings into agent definitions. +disable-model-invocation: true +allowed-tools: Read, Glob, Grep, Bash, Task, Skill +--- + +# Learning Cycle + +Process unreviewed LearningAgent session transcripts to identify issues, investigate root causes, and incorporate learnings into agent definitions. + +## Arguments + +This skill takes no arguments. It automatically discovers all pending sessions. 
+ +## Pending Sessions + +!`find .deepwork/tmp/agent_sessions -name needs_learning_as_of_timestamp 2>/dev/null` + +## Procedure + +### Step 1: Find Pending Sessions + +Check for pending learning sessions. The dynamic include above lists all `needs_learning_as_of_timestamp` files. If the list is empty (or the `.deepwork/tmp/agent_sessions` directory does not exist), inform the user that there are no pending sessions to learn from and stop. + +For each pending file, extract: +- The session folder path (parent directory of `needs_learning_as_of_timestamp`, e.g., `.deepwork/tmp/agent_sessions/sess-abc/agent-123/`) +- The agent name (read the `agent_used` file in that folder) + +### Step 2: Process Each Session + +For each pending session folder, run the learning cycle in sequence. The Task pseudo-code below shows the parameters to pass to the Task tool: + +#### 2a: Identify Issues + +``` +Task tool call: + name: "identify-issues" + subagent_type: general-purpose + model: sonnet + prompt: "Run the identify skill on the session folder: .deepwork/tmp/agent_sessions/// + Use: Skill learning-agents:identify .deepwork/tmp/agent_sessions///" +``` + +#### 2b: Investigate and Incorporate + +After identification completes, spawn another Task to run investigation and incorporation in sequence: + +``` +Task tool call: + name: "investigate-and-incorporate" + subagent_type: general-purpose + model: sonnet + prompt: "Run these two skills in sequence on the session folder: .deepwork/tmp/agent_sessions/// + 1. First: Skill learning-agents:investigate-issues .deepwork/tmp/agent_sessions/// + 2. Then: Skill learning-agents:incorporate-learnings .deepwork/tmp/agent_sessions///" +``` + +#### Handling failures + +If a sub-skill Task fails for a session, log the failure, skip that session, and continue processing remaining sessions. Do not mark `needs_learning_as_of_timestamp` as resolved for failed sessions. 
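The failure handling described above can be sketched as a shell loop. This is a minimal sketch under stated assumptions, not the plugin's implementation: `process_session` is a hypothetical stand-in for the two Task spawns in Step 2, it is expected to remove the flag file itself on success (as `incorporate-learnings` does in its tracking step), and session paths are assumed to contain no whitespace.

```shell
# Hypothetical sketch: process_session must be defined by the caller and
# return non-zero when any sub-skill Task fails for that session folder.
run_pending_sessions() {
    root="${1:-.deepwork/tmp/agent_sessions}"
    # Assumes session and agent IDs contain no whitespace.
    for flag in $(find "$root" -name needs_learning_as_of_timestamp 2>/dev/null | sort); do
        session_dir=$(dirname "$flag")
        if ! process_session "$session_dir"; then
            # Leave the flag file untouched so the session stays pending.
            echo "learning cycle failed, skipping: $session_dir" >&2
            echo "$session_dir"   # stdout collects the skipped sessions
            continue
        fi
    done
}
```

Printing skipped folders on stdout keeps them available for the "Skipped sessions" entry in the summary, while the flag file left in place means the next `/learning-agents learn` run picks the session up again.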
+ +### Step 3: Summary + +Output in this format: + +``` +## Learning Cycle Summary + +- **Sessions processed**: +- **Total issues identified**: +- **Agents updated**: +- **Key learnings**: + - : +- **Skipped sessions** (if any): +``` + +## Guardrails + +- Process sessions one at a time to avoid conflicts when multiple sessions involve the same agent +- If a session's transcript cannot be found, skip it and report the issue +- Do NOT modify agent files directly — always delegate to the learning cycle skills +- Use Sonnet model for Task spawns to balance cost and quality diff --git a/learning_agents/skills/learning-agents/SKILL.md b/learning_agents/skills/learning-agents/SKILL.md new file mode 100644 index 00000000..95bd3c9e --- /dev/null +++ b/learning_agents/skills/learning-agents/SKILL.md @@ -0,0 +1,64 @@ +--- +name: learning-agents +description: Dispatch entry point for the LearningAgents plugin. Routes to sub-commands for creating agents, running learning cycles, and reporting issues. +--- + +# LearningAgents + +Manage auto-improving AI sub-agents that learn from their mistakes across sessions. + +## Arguments + +`$ARGUMENTS` is the text after `/learning-agents` (e.g., for `/learning-agents create foo`, `$ARGUMENTS` is `create foo`). + +## Routing + +Split `$ARGUMENTS` on the first whitespace. The first token is the sub-command (case-insensitive); the remainder is passed to the sub-skill. Accept both underscores and dashes in sub-command names (e.g., `report_issue` and `report-issue` are equivalent). + +### `create ` + +Create a new LearningAgent scaffold. + +Invoke: `Skill learning-agents:create-agent ` + +Example: `$ARGUMENTS = "create rails-activejob"` → `Skill learning-agents:create-agent rails-activejob` + +### `learn` + +Run the learning cycle on all pending session transcripts. Any arguments after `learn` are ignored. + +Invoke: `Skill learning-agents:learn` + +### `report_issue
` + +Report an issue with a LearningAgent from the current session. + +Invoke: `Skill learning-agents:report-issue
` + +To construct the session folder path: search `.deepwork/tmp/agent_sessions/` for a subdirectory whose name contains the provided `agentId`. The path structure is `.deepwork/tmp/agent_sessions///`. If no match is found, inform the user. If multiple matches exist, use the most recently modified one. + +Example: `$ARGUMENTS = "report_issue abc123 Used wrong retry strategy"` → find folder matching `abc123` under `.deepwork/tmp/agent_sessions/`, then `Skill learning-agents:report-issue .deepwork/tmp/agent_sessions/sess-xyz/abc123/ Used wrong retry strategy` + +### No arguments or ambiguous input + +Display available sub-commands: + +``` +LearningAgents - Auto-improving AI sub-agents + +Available commands: + /learning-agents create Create a new LearningAgent + /learning-agents learn Run learning cycle on pending sessions + /learning-agents report_issue
Report an issue with an agent + +Examples: + /learning-agents create rails-activejob + /learning-agents learn + /learning-agents report_issue abc123 "Used wrong retry strategy for background jobs" +``` + +## Guardrails + +- Always route to the appropriate skill — do NOT implement sub-command logic inline +- If `$ARGUMENTS` doesn't match any known sub-command, show the help text above +- Pass arguments through to sub-skills exactly as provided diff --git a/learning_agents/skills/prompt-review/SKILL.md b/learning_agents/skills/prompt-review/SKILL.md new file mode 100644 index 00000000..3222c139 --- /dev/null +++ b/learning_agents/skills/prompt-review/SKILL.md @@ -0,0 +1,151 @@ +--- +name: prompt-review +description: Reviews a prompt/instruction file against Anthropic prompt engineering best practices. Use when evaluating skill files, agent definitions, or instruction chunks for quality. +allowed-tools: Read, Glob, Grep, WebFetch +--- + +# Prompt Engineering Review + +Review a prompt or instruction file against Anthropic's prompt engineering best practices and provide structured, actionable feedback. + +## Arguments + +`$ARGUMENTS` is the path to the file to review. If not provided, ask the user which file to review. + +## Procedure + +### Step 1: Fetch Current Best Practices + +Fetch the latest Anthropic prompt engineering guidance to ground your review in current recommendations: + +``` +WebFetch https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview +``` + +Use the fetched content as your primary reference for evaluation criteria. If the fetch fails, proceed using your built-in knowledge of Anthropic prompt engineering best practices. + +### Step 2: Read the Target File + +Read the file at the path specified in `$ARGUMENTS`. If the path is relative, resolve it from the current working directory. + +If the file does not exist, inform the user and stop. 
+ +### Step 3: Determine the Prompt Context + +Before evaluating, identify how this prompt will be used: + +- **Standalone system prompt**: The file is the complete prompt given to Claude +- **Instruction chunk / skill file**: The file is injected into a larger prompt context (e.g., via `!command` dynamic context injection, Claude Code skill files, or agent core-knowledge files) +- **Template with variables**: The file contains placeholders or Jinja2 templates that are filled at runtime + +This context affects how you evaluate the prompt. Instruction chunks, for example, must work well when composed with other instructions and should avoid conflicting with likely surrounding context. + +### Step 4: Evaluate Against Best Practices + +Evaluate the file against each of the following criteria. For each criterion, assess whether the prompt follows it well, partially, or poorly. + + + +1. **Clarity and Specificity** + - Are instructions unambiguous and precise? + - Does the prompt say exactly what it wants rather than what it does not want? + - Are success criteria clearly defined? + - Would a reader understand exactly what to do without needing to guess intent? + +2. **Structure and Formatting** + - Does the prompt use headers, lists, and sections to organize content? + - Are XML tags used appropriately to delineate sections (e.g., ``, ``, ``)? + - Is information ordered logically (context before task, general before specific)? + - Is the structure scannable without being overly verbose? + +3. **Role and Identity Prompting** + - If applicable, does the prompt establish a clear role or expertise for Claude? + - Is the role specific enough to guide behavior without being artificially constraining? + - Does the identity framing match the actual task requirements? + +4. **Examples and Demonstrations** + - Are examples provided for complex or ambiguous tasks? + - Do examples cover both typical cases and edge cases? 
+ - Are examples formatted consistently with the expected output format? + - Do examples use realistic content rather than trivial placeholders? + +5. **Handling Ambiguity and Edge Cases** + - Does the prompt address what to do when inputs are incomplete or unexpected? + - Are fallback behaviors specified? + - Is there guidance for boundary conditions? + +6. **Output Format Specification** + - Is the expected output format clearly defined? + - Are there constraints on length, structure, or style? + - If structured output is needed, is the schema provided or described? + +7. **Composability (for instruction chunks)** + - Will this content work well when injected into a larger prompt? + - Does it avoid assumptions about surrounding context that may not hold? + - Does it avoid conflicting directives that might clash with other injected content? + - Is it self-contained enough to be understood without the surrounding prompt? + - Does it avoid redefining global behaviors (like persona) that the outer prompt may set? + +8. **Conciseness and Signal-to-Noise Ratio** + - Is every sentence earning its place in the prompt? + - Is there redundancy or filler that could be removed? + - Are instructions appropriately dense without sacrificing clarity? + - Does the prompt avoid over-explaining obvious points? + +9. **Variable and Placeholder Usage** + - Are dynamic inputs clearly marked and documented? + - Is it obvious what data will be substituted at runtime? + - Are variable names descriptive and consistent? + +10. **Task Decomposition** + - For complex tasks, does the prompt break work into clear steps? + - Is the sequence of operations logical? + - Are dependencies between steps explicit? + + + +### Step 5: Produce the Review + +Output the review in the following format. Be direct and specific. Every recommendation must point to a concrete line or section in the file and explain exactly what to change. 
+ + + +## Prompt Review: `{filename}` + +**Prompt type**: {standalone system prompt | instruction chunk | template} +**Overall grade**: {A | B | C | D | F} + +> One-sentence summary of the prompt's overall quality. + +### Strengths + +- {Specific strength with brief explanation} +- {Specific strength with brief explanation} + +### Issues Found + +For each issue: + +#### {Issue title} + +- **Severity**: {Critical | High | Medium | Low} +- **Criterion**: {Which of the 10 criteria this relates to} +- **Location**: {Line number, section, or quote from the file} +- **Problem**: {What is wrong and why it matters} +- **Recommendation**: {Exact change to make, with a before/after example if helpful} + +### Summary of Recommendations + +| Priority | Recommendation | Effort | +|----------|---------------|--------| +| {1, 2, ...} | {Brief description} | {Small / Medium / Large} | + + + +## Grading Rubric + +- **A**: Follows nearly all best practices. Minor improvements only. Production-ready. +- **B**: Follows most best practices. A few notable gaps. Effective but could be stronger. +- **C**: Misses several important practices. Will work but likely produces inconsistent results. +- **D**: Significant issues. Missing structure, clarity, or key prompt engineering patterns. +- **F**: Fundamentally flawed. Needs a rewrite to be effective. diff --git a/learning_agents/skills/report-issue/SKILL.md b/learning_agents/skills/report-issue/SKILL.md new file mode 100644 index 00000000..6ca643c0 --- /dev/null +++ b/learning_agents/skills/report-issue/SKILL.md @@ -0,0 +1,75 @@ +--- +name: report-issue +description: Creates an issue file tracking a problem observed in a LearningAgent session. Used by the identify skill and can be invoked directly to report issues in real-time. +user-invocable: false +disable-model-invocation: true +--- + +# Report Issue + +Create an issue file documenting a problem observed in a LearningAgent session. 
+ +## Arguments + +- `$0`: Path to the session log folder (e.g., `.deepwork/tmp/agent_sessions///`) +- `$1`: Brief description of the issue observed + +If `$0` is not provided or does not point to an existing directory, stop and output: "Error: session log folder path is required and must be an existing directory." + +If `$1` is not provided or is empty, stop and output: "Error: issue description is required." + +## Procedure + +### Step 1: Determine Issue Name + +From the issue description in `$1`, derive a short kebab-case name of 3 to 6 words. Focus on the most distinctive noun and verb from the failure. Avoid filler words like "the", "a", "in", "with". + +Examples: +- `wrong-retry-strategy` +- `missed-validation-edge-case` +- `hallucinated-api-endpoint` + +### Step 2: Create Issue File + +Create the file at `$0/.issue.yml` with the following content: + +```yaml +status: identified +seen_at_timestamps: + - "" +issue_description: | + +``` + +Example of a completed issue file: + +```yaml +status: identified +seen_at_timestamps: + - "2026-02-17T14:32:00Z" +issue_description: | + The agent retried the tool call 5 times after receiving a 429 response, + but each retry was issued immediately with no backoff delay. All 5 calls + occurred within the same second. +``` + +The YAML block above is the authoritative template. See [issue_yml_format.md](../../doc/issue_yml_format.md) for additional schema details. + +### Step 3: Confirm + +Output a two-line confirmation: + +``` +Created: +Recorded: +``` + +## Guardrails + +- Do NOT add an `investigation_report` field; that is added during the investigate step +- Do NOT set status to anything other than `identified` +- Do NOT modify any other files in the session folder +- Keep the `issue_description` factual and observable: describe symptoms, not root causes
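
As a rough illustration of Steps 1 and 2, the sketch below derives a kebab-case name and renders the issue template. Everything named here is an assumption for illustration: `derive_issue_name`, `render_issue_yaml`, and `FILLER_WORDS` are hypothetical, and the word-dropping heuristic is only a crude stand-in for the judgment the skill asks for when picking the most distinctive noun and verb.

```python
import re
from datetime import datetime, timezone

# Hypothetical filler list; the skill names only these four words as examples.
FILLER_WORDS = {"the", "a", "in", "with"}


def derive_issue_name(description: str, max_words: int = 6) -> str:
    """Naive stand-in for Step 1: lowercase, drop filler words, join with hyphens."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    kept = [w for w in words if w not in FILLER_WORDS]
    return "-".join(kept[:max_words])


def render_issue_yaml(description: str, seen_at: datetime) -> str:
    """Render the Step 2 template; status is always 'identified'."""
    timestamp = seen_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Indent every description line so it sits inside the YAML block scalar.
    body = "\n".join(f"  {line}" for line in description.strip().splitlines())
    return (
        "status: identified\n"
        "seen_at_timestamps:\n"
        f'  - "{timestamp}"\n'
        "issue_description: |\n"
        f"{body}\n"
    )
```

The renderer emits only the three fields from the template, which keeps it in line with the guardrails: no `investigation_report` field and no status other than `identified`.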