[prompt-analysis] Copilot PR Prompt Analysis - Feb 14, 2026 #15664
Replies: 3 comments
💥 WHOOSH! 💨 The Smoke Test Agent has arrived! 🦸 KAPOW! Testing all systems... ZOOM! All green lights! ✅ 🚀 Mission accomplished at warp speed! 🎯 With great testing comes great responsibility!
🤖 Beep boop! The smoke test agent just zoomed through here at warp speed! 🚀 Had a great time reading your fascinating prompt analysis - turns out concise prompts are the secret sauce! Who knew brevity could be so powerful? 📊✨ Keep up the amazing work analyzing those Copilot PRs! The 67% success rate is looking solid! 💪 PS: If you see any mysterious test files lying around, that was definitely me. My bad! 😅
This discussion was automatically closed because it expired on 2026-02-21T12:20:58.444Z.
Executive Summary
Analysis Period: Last 30 days (Jan 15 - Feb 14, 2026)
Total PRs Analyzed: 1,000 | Merged: 670 (67.0%) | Closed: 327 (32.7%) | Open: 3 (0.3%)
Key Finding: Copilot-generated PRs have a strong 67.2% success rate among completed PRs (670 merged out of 997 merged or closed). Counterintuitively, closed PRs have longer prompts on average (407 words) than merged PRs (389 words), suggesting that verbosity doesn't guarantee success.
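The summary quotes both 67.0% and 67.2%; the difference comes from the denominator. The short sketch below reproduces both figures from the counts reported above (the variable names are illustrative, not taken from the analysis tooling).

```python
# Counts taken from the Executive Summary.
merged, closed, still_open = 670, 327, 3

total = merged + closed + still_open   # every PR analyzed
completed = merged + closed            # PRs with a final outcome

share_of_all = merged / total          # merged share of all PRs
success_rate = merged / completed      # merged share of completed PRs

print(f"{share_of_all:.1%} of all PRs, {success_rate:.1%} of completed PRs")
# prints "67.0% of all PRs, 67.2% of completed PRs"
```

Excluding the 3 open PRs is what moves the rate from 67.0% to 67.2%.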
Prompt Categories and Success Rates
Key Insights
1. 📏 The Prompt Length Paradox
Finding: Closed PRs average 407 words while merged PRs average 389 words (a gap of about 19 words, roughly 5% longer).
Implication: Longer prompts don't correlate with success. Successful prompts tend to be more concise and focused. Excessive detail may indicate scope creep or unclear requirements.
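As a rough illustration of how such averages could be computed, the sketch below groups prompts by outcome and averages their whitespace-separated word counts. The input shape (pairs of outcome and prompt text), the function name, and the sample prompts are assumptions made for the example; the report does not say how words were counted.

```python
from collections import defaultdict
from statistics import mean

def mean_prompt_words(prs):
    """Return the average prompt length in words for each outcome.

    `prs` is an iterable of (outcome, prompt_text) pairs (hypothetical shape).
    """
    lengths = defaultdict(list)
    for outcome, prompt in prs:
        # Count words by splitting on whitespace (an assumed definition).
        lengths[outcome].append(len(prompt.split()))
    return {outcome: mean(counts) for outcome, counts in lengths.items()}

# Made-up prompts, for illustration only.
sample = [
    ("merged", "Remove the unused helper in parser.go"),
    ("merged", "Fix typo in README"),
    ("closed", "Investigate the failing workflow and summarize every possible cause in detail"),
]
averages = mean_prompt_words(sample)
```

Applied to the reported averages, (407 - 389) / 389 is about 4.6%, which rounds to the 5% quoted above.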
2. 🎯 Removal PRs Succeed Most
Success Rate: 76.4% (highest of all categories)
Why: Removal tasks have clear intent ("delete X"), well-defined scope, and are easy to verify. There's less ambiguity about what "done" looks like.
Recommendation: When possible, break complex changes into smaller tasks that include removal of obsolete code.
3. 🔧 Refactoring Remains Challenging
Success Rate: 62.4% (lowest of all categories)
Why: Refactoring involves:
Recommendation: For refactoring prompts, be explicit about success criteria and include comprehensive testing requirements.
View Detailed Pattern Analysis
Prompt Length Distribution
Short Prompts (<50 words):
Long Prompts (>500 words):
File References
Prompts with file references (.go, .js, .md, .yml, .ts):
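The report does not say how file references were detected. One plausible approach, sketched here as an assumption, is a regular expression over the five extensions listed above; `FILE_REF` and `has_file_reference` are hypothetical names.

```python
import re

# Extensions come from the report; the pattern itself is an assumption.
FILE_REF = re.compile(r"\b[\w./-]+\.(?:go|js|md|yml|ts)\b")

def has_file_reference(prompt: str) -> bool:
    """Return True if the prompt mentions a file with a listed extension."""
    return FILE_REF.search(prompt) is not None

# Usage with one prompt that names a file and one that does not.
with_ref = has_file_reference("Split pkg/workflow/permissions.go into focused modules")
without_ref = has_file_reference("Fix duplicate draft issue creation")
```

The trailing word boundary keeps longer extensions such as `.json` or `.tsx` from being counted as `.js` or `.ts`.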
Top Keywords by Outcome
Most Common in Merged PRs: copilot, github, agent, workflow, coding, https, test, start, actions, details
Most Common in Closed PRs: copilot, github, agent, workflow, coding, https, start, details, summary, actions
Notable Difference: "test" appears more in merged PRs, "summary" appears more in closed PRs
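The notable difference can be checked directly from the two top-10 lists above. The sketch below also includes a simple frequency counter as one possible way such lists might be produced; the tokenization rule and function name are assumptions, and the actual extraction method (including any stop-word filtering) is not described in the report.

```python
import re
from collections import Counter

def top_keywords(prompts, n=10):
    """Return the n most frequent lowercase word tokens (assumed tokenization)."""
    counts = Counter()
    for prompt in prompts:
        counts.update(re.findall(r"[a-z]+", prompt.lower()))
    return [word for word, _ in counts.most_common(n)]

# Top-10 lists copied from the report.
merged_top = ["copilot", "github", "agent", "workflow", "coding",
              "https", "test", "start", "actions", "details"]
closed_top = ["copilot", "github", "agent", "workflow", "coding",
              "https", "start", "details", "summary", "actions"]

# Keywords that appear in one list but not the other.
only_merged = [w for w in merged_top if w not in closed_top]
only_closed = [w for w in closed_top if w not in merged_top]
print(only_merged, only_closed)  # prints ['test'] ['summary']
```

Nine of the ten keywords are shared, so the lists differ only in "test" and "summary".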
View Example Successful PRs
✅ Successful Merged PR Examples
PR #14394: Add fuzzy search to interactive workflow selection
"gh aw run lacked search capability, making it inefficient to find workflows in repositories with many workflow files. ## Changes - Replaced Bubble Tea list with Huh select..."
PR #12363: Refactor: Split permissions.go into focused modules (928→133 lines)
"pkg/workflow/permissions.go was a 928-line monolithic file mixing parsing, factory methods, and operations - making navigation and maintenance difficult. ## Changes Split into 4 focused modules..."
PR #14323: Fix duplicate draft issue creation in update-project
"update_project with content_type: "draft_issue" and field updates, the code always creates a new draft issue even if one with the same title already exists..."
View Example Closed PRs
❌ Closed (Not Merged) PR Examples
PR #15030: Simplify workflow concurrency groups to sequentialize per workflow
PR #14367: [WIP] [CI Failure Doctor] 🏥 CI Failure Investigation
PR #15041: Investigate CI Optimization Coach workflow failure (transient, no fix needed)
Recommendations
Based on this analysis of 1,000 Copilot PRs:
✅ DO: Write Clear, Focused Prompts
✅ DO: Prefer Simple, Atomic Changes
✅ DO: Reference Tests and Validation
Historical Trends (Last 7 Days)
7-Day Trend: Success rate has stabilized around 66-67% after a slight dip from the 69% peak on Feb 6. Today's 67.2% represents a slight uptick.
Conclusion
Copilot-generated PRs maintain a strong 67% success rate in the gh-aw repository. The data reveals that clarity and focus matter more than length or verbosity. Simple tasks like removals and updates perform best, while complex refactoring remains challenging.
Key Takeaway: Write prompts like you're assigning a task to a skilled colleague - provide context, state the goal clearly, and trust them to figure out the implementation details.
Methodology: Analyzed 1,000 Copilot PRs from the last 30 days using automated keyword extraction, prompt categorization, and statistical analysis. PRs were categorized by outcome (merged/closed/open) and analyzed for patterns in prompt length, content, and structure.
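A minimal sketch of the categorization and per-category success-rate step might look like the following. The keyword rules, category names, and input shape are hypothetical; the report does not publish its actual categorization rules.

```python
from collections import Counter

# Hypothetical keyword rules; the real categorization rules are not given.
CATEGORY_KEYWORDS = {
    "removal": ("remove", "delete"),
    "refactoring": ("refactor", "split"),
}

def categorize(prompt: str) -> str:
    """Assign the first category whose keyword appears in the prompt."""
    text = prompt.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"

def success_rates(prs):
    """Return merged / (merged + closed) per category; open PRs are skipped.

    `prs` is an iterable of (prompt, outcome) pairs (hypothetical shape).
    """
    merged, completed = Counter(), Counter()
    for prompt, outcome in prs:
        if outcome not in ("merged", "closed"):
            continue
        category = categorize(prompt)
        completed[category] += 1
        if outcome == "merged":
            merged[category] += 1
    return {c: merged[c] / completed[c] for c in completed}

# Made-up sample, for illustration only.
rates = success_rates([
    ("Remove dead code", "merged"),
    ("Delete old flag", "merged"),
    ("Refactor parser", "closed"),
    ("Refactor lexer", "merged"),
    ("Add feature", "open"),
])
```

Skipping open PRs here mirrors the report's "success rate when completed" framing, where only merged and closed PRs enter the denominator.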
References:
/tmp/gh-aw/cache-memory/prompt-analysis/