From e495ad9d23748b7ffd55e69ab1698a39fdaffbb9 Mon Sep 17 00:00:00 2001 From: activadee Date: Sat, 3 Jan 2026 21:01:41 +0100 Subject: [PATCH] All changes implemented successfully: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary of Changes ### src/skills/pr-review.ts Completely refactored the skill with: 1. **New Review Philosophy** - Framed as "Would a senior engineer in a time-limited review raise this?" 2. **Refined Review Priorities** with concrete IS/ISN'T tables: - Correctness: Off-by-one bugs ✓ | Type confusion without runtime impact ✗ - Security: SQL injection ✓ | Hypothetical attack scenarios ✗ - Stability: Race conditions ✓ | Extra defensive checks ✗ - Maintainability: Misleading names ✓ | Naming preferences ✗ 3. **Context from Prior Reviews** section: - Fetches existing PR comments via gh pr view --json comments,reviews - Creates validation todo items for previously-flagged issues - Acknowledges resolved feedback, only flags NEW issues 4. **Common Nitpicks to Avoid** section covering 6 items: - Coding style preferences - Defensive programming suggestions - DRY violations without real cost - Missing code comments - Subjective complexity concerns - Minor refactoring suggestions 5. **Refined Verdict Rules**: - request_changes only for merge-blocking issues - Default severity to low unless clearly higher - Clear severity definitions (medium = "probably causes bug", h Attempt: att-17b28415-95f4-4a5f-b14e-8ca48c414f71 Profile: apg-70541e2b-d01c-4d50-a814-7025ac222ebe --- docs/workflows/review.md | 79 ++++++++++++++++++--- src/skills/pr-review.ts | 150 ++++++++++++++++++++++++++++++++++----- 2 files changed, 205 insertions(+), 24 deletions(-) diff --git a/docs/workflows/review.md b/docs/workflows/review.md index 016d5a6..636d5d0 100644 --- a/docs/workflows/review.md +++ b/docs/workflows/review.md @@ -1,14 +1,15 @@ # PR Review Workflow -The `pr-review` skill provides AI-powered code review for pull requests. +The `pr-review` skill provides AI-powered code review for pull requests, focusing on **real issues that matter** rather than style nitpicks. ## Overview When a pull request is opened or updated: 1. GitHub Actions triggers the workflow 2. OpenCode loads the `pr-review` skill -3. AI analyzes each changed file -4. Posts a sticky comment with findings using `submit_review` tool +3. AI checks for prior review comments on the PR +4. AI analyzes each changed file for substantive issues +5. Posts a sticky comment with findings using `submit_review` tool ## Installation @@ -55,14 +56,75 @@ jobs: | `GITHUB_TOKEN` | Yes | GitHub authentication (automatic) | | `ANTHROPIC_API_KEY` | Yes | Anthropic API key | +## Review Philosophy + +The skill is designed to act like a **senior engineer in a time-limited code review**: + +- Focus on issues that represent real risks or actual bugs +- Skip style preferences and subjective opinions +- Prioritize signal over noise with fewer, higher-quality findings +- Check prior review comments to avoid re-flagging resolved issues + ## Review Focus Areas -The skill focuses on: +The skill focuses on substantive issues in these areas: + +### 1. Correctness +Logic errors, broken control flow, off-by-one bugs, incorrect conditions. + +**Flags**: Missing await, wrong operators, broken loops +**Skips**: Unused variables, type confusion without runtime impact + +### 2. Security +Exploitable vulnerabilities that could be attacked. + +**Flags**: SQL injection, XSS, auth bypass, secrets in code +**Skips**: Hypothetical scenarios, generic security advice + +### 3. Stability +Issues that cause crashes, data loss, or resource problems. + +**Flags**: Unhandled rejections, race conditions, resource leaks +**Skips**: Extra defensive checks, optional timeout additions + +### 4. Maintainability +Clarity issues that cause actual confusion. + +**Flags**: Misleading names, logic contradicting docs +**Skips**: Naming preferences, "could be cleaner" suggestions + +## What the Skill Avoids + +The skill explicitly skips these common nitpicks: + +1. **Coding style preferences** - Spacing, quotes, line length (linters handle this) +2. **Defensive programming suggestions** - Extra null checks beyond type requirements +3. **DRY violations without real cost** - Small duplication that doesn't hurt maintenance +4. **Missing code comments** - Unless security-critical or intentionally obscure +5. **Subjective complexity concerns** - "Function too long" without concrete problems +6. **Minor refactoring suggestions** - Aesthetic improvements without objective benefit + +## Context Awareness + +The skill reads prior PR comments before reviewing to: + +- Understand what feedback was already given +- Verify if previous issues were addressed +- Avoid re-flagging the same issues +- Acknowledge resolved feedback in the review summary + +## Verdict Behavior + +The skill uses verdicts conservatively: + +- **`request_changes`**: Only for issues that should **block the merge** (bugs, security holes, stability risks) +- **`approve`**: Used generously for clean PRs or when issues are low-severity -1. **Correctness** - Logic errors, bugs, edge cases -2. **Security** - Vulnerabilities, injection risks, auth issues -3. **Stability** - Error handling, race conditions, resource leaks -4. **Maintainability** - Clarity, naming, convention violations +Severity levels: +- `low`: Worth noting but doesn't block merge +- `medium`: Will probably cause a bug in production +- `high`: Will definitely cause problems for users +- `critical`: Security vulnerability or data loss risk ## Customizing @@ -70,3 +132,4 @@ Edit `.opencode/skill/pr-review/SKILL.md` to: - Change review priorities - Add project-specific guidelines - Modify the verdict rules +- Adjust what counts as a nitpick diff --git a/src/skills/pr-review.ts b/src/skills/pr-review.ts index 3e9ecb9..97b77a4 100644 --- a/src/skills/pr-review.ts +++ b/src/skills/pr-review.ts @@ -9,37 +9,122 @@ metadata: ## What I Do -Review pull request changes systematically and post findings as a sticky PR comment using the \`submit_review\` tool. +Review pull request changes systematically and post findings as a sticky PR comment using the \`submit_review\` tool. Focus on **real issues that matter** - bugs, security risks, and stability problems that would block a merge in a typical code review. + +## Review Philosophy + +**Ask yourself before flagging any issue**: "Would a senior engineer in a time-limited code review raise this issue?" + +- Flag issues that represent **real risks** or **actual bugs** +- Skip style preferences and subjective opinions +- Prioritize signal over noise - fewer, higher-quality findings ## Workflow -1. **Gather context**: Use GitHub CLI to get PR metadata +1. **Check prior feedback**: Fetch existing PR comments to understand context + \`\`\`bash + gh pr view --json comments --jq '.comments[].body' + \`\`\` + +2. **Gather PR context**: Get PR metadata and changed files \`\`\`bash gh pr view --json files,title,body,headRefOid \`\`\` -2. **Create todo list**: One item per changed file using \`todowrite\` +3. **Create validation todo list**: Track previously-flagged issues (if any) using \`todowrite\` + - One item per prior issue: "Validate if [issue] was addressed" + - One item per changed file for new review -3. **Analyze each file**: +4. **Analyze each file**: - Mark todo as \`in_progress\` - Read the file diff and surrounding context - - Note issues (correctness, security, stability, maintainability) + - Note ONLY issues that pass the "real issue" test (see Review Priorities) - Mark todo as \`completed\` -4. **Synthesize review**: After ALL files are analyzed, determine verdict and summary +5. **Synthesize review**: After ALL files are analyzed, determine verdict and summary + - Acknowledge addressed feedback from prior reviews + - Only include NEW issues not previously identified -5. **Submit**: Call \`submit_review\` exactly once with all findings +6. **Submit**: Call \`submit_review\` exactly once with all findings ## Review Priorities -Focus on these areas in order of importance: +Focus on these areas. Each includes concrete examples of what IS and ISN'T worth flagging. + +### 1. Correctness (Logic errors that cause wrong behavior) + +| Flag This | Skip This | +|-----------|-----------| +| Off-by-one errors in loops/bounds | Type confusion that doesn't break runtime | +| Broken control flow (early returns, missing breaks) | Unused variables (linters catch this) | +| Incorrect boolean logic | Missing optional chaining on guaranteed-present fields | +| Wrong comparison operators | Stylistic null checks beyond language guarantees | +| Missing await on async calls | | + +### 2. Security (Exploitable vulnerabilities) + +| Flag This | Skip This | +|-----------|-----------| +| SQL/NoSQL injection | Hypothetical attack scenarios requiring admin access | +| XSS vulnerabilities | Missing rate limiting (unless auth-related) | +| Auth bypass possibilities | Theoretical timing attacks | +| Secrets/credentials in code | Generic "consider security implications" | +| Unsafe deserialization | | +| Path traversal | | + +### 3. Stability (Issues causing crashes or data loss) + +| Flag This | Skip This | +|-----------|-----------| +| Unhandled promise rejections in critical paths | Adding extra try-catch "just in case" | +| Race conditions with visible effects | Defensive null checks on typed fields | +| Resource leaks (unclosed handles, listeners) | Missing error logging (non-critical) | +| Infinite loops / recursion without base case | Optional timeout additions | +| Division by zero possibilities | | + +### 4. Maintainability (Clarity issues causing confusion) + +| Flag This | Skip This | +|-----------|-----------| +| Misleading function/variable names | Naming preferences (camelCase vs snake_case) | +| Logic that contradicts its documentation | Minor comment improvements | +| Dead code that appears intentional | DRY violations without maintenance risk | +| Inconsistency with adjacent code patterns | Subjective "this could be cleaner" | + +## Common Nitpicks to Avoid + +**DO NOT flag these issues** - they create noise without value: + +1. **Coding style preferences** - Spaces vs tabs, quote styles, trailing commas, line length. Let linters handle this. + +2. **Defensive programming suggestions** - Adding null checks, optional chaining, or try-catch blocks beyond what the type system or runtime requires. + +3. **DRY violations without real cost** - Small code duplication that doesn't create maintenance burden. Not everything needs abstraction. -1. **Correctness** - Logic errors, broken control flow, off-by-one errors, incorrect conditions -2. **Security** - Injection vulnerabilities, auth issues, secrets in code, unsafe deserialization -3. **Stability** - Error handling, race conditions, resource leaks, null/undefined cases -4. **Maintainability** - Clarity, naming, violations of local conventions +4. **Missing code comments** - Unless the code is security-critical or intentionally non-obvious, don't require comments. -Only flag style issues when they hide bugs or cause real confusion. +5. **Subjective complexity concerns** - "This function is too long" or "Consider breaking this up" without identifying a concrete problem it causes. + +6. **Minor refactoring suggestions** - Aesthetic improvements, variable renaming preferences, or "cleaner" alternatives that aren't objectively better. + +## Context from Prior Reviews + +Before starting your review: + +1. **Fetch existing PR comments** to see prior feedback: + \`\`\`bash + gh pr view --json comments,reviews --jq '.comments[].body, .reviews[].body' + \`\`\` + +2. **If previous issues were flagged**: + - Create a todo item for each: "Validate: [previous issue title]" + - Check if the latest commits address each issue + - Mark as completed with status (resolved/still-present) + +3. **In your final review**: + - Acknowledge issues that were fixed: "Previously flagged X - now resolved" + - Only flag issues that are NEW or STILL UNRESOLVED + - Don't re-flag the same issue in slightly different words ## Using submit_review @@ -67,9 +152,24 @@ Each issue in the array needs: ## Verdict Rules -- Use \`request_changes\` if there are ANY medium, high, or critical issues -- Use \`approve\` if there are only low-severity issues or no issues at all -- The verdict must be consistent with the worst severity in the issues array +**Only use \`request_changes\` for issues that should BLOCK the merge:** + +- \`critical\`: Security vulnerabilities, data loss risks, complete feature breakage +- \`high\`: Bugs that will affect users, significant logic errors +- \`medium\`: Edge cases likely to cause issues, unclear but problematic patterns + +**Use \`approve\` generously:** + +- If issues are \`low\` severity only - approve with notes +- If issues are style/preference-based - approve (and don't flag them) +- If you're unsure whether something is a real issue - lean toward approve + +**Severity guidance:** + +- Default to \`low\` unless the issue clearly meets \`medium\` or higher criteria +- \`medium\` = "This will probably cause a bug in production" +- \`high\` = "This will definitely cause problems for users" +- \`critical\` = "This is a security hole or will cause data loss" ## Common Mistakes to Avoid @@ -79,6 +179,8 @@ Each issue in the array needs: - Do NOT guess repository, PR number, or commit SHA - derive from git/gh commands - Do NOT include JSON fragments in summary or explanation fields - Do NOT put prose like "change to:" in the suggestion field +- Do NOT re-flag issues from prior reviews that were already addressed +- Do NOT flag style issues that linters or formatters should handle ## Example Issue @@ -92,4 +194,20 @@ Each issue in the array needs: "suggestion": "const user = await db.query('SELECT * FROM users WHERE email = $1', [email])" } \`\`\` + +## Example of What NOT to Flag + +\`\`\`typescript +// DON'T flag: "Consider using optional chaining" +const name = user.profile.name; // If types guarantee profile exists, this is fine + +// DON'T flag: "Variable could be const" +let count = 0; // Let linters handle this + +// DON'T flag: "Function is too long" +function processOrder() { /* 80 lines */ } // Unless there's a concrete bug + +// DON'T flag: "Consider adding error handling" +await saveUser(user); // Unless errors here would cause data loss +\`\`\` `;