Analysis of 1,000 Copilot-generated PRs from the last 30 days in github/gh-aw. The dataset shows a 69.6% overall merge rate, with clear patterns distinguishing successful from unsuccessful prompts.
Key Metrics
| Metric | Value |
| --- | --- |
| Total PRs | 1,000 |
| Merged | 693 (69.6%) |
| Closed | 303 (30.4%) |
| Open | 4 |
| Non-WIP success rate | 77.1% |
| WIP success rate | 25.5% |
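The percentages above appear to exclude the 4 open PRs from the denominator: 693 / (693 + 303) is about 69.6%, whereas 693 / 1,000 would be 69.3%. The sketch below reproduces that calculation, assuming this is how the report derives its rates. The function name `merge_rate` is illustrative and not part of the analysis tooling.

```python
def merge_rate(merged: int, closed: int) -> float:
    """Percentage of resolved PRs (merged + closed) that were merged.

    Assumption: open PRs are left out of the denominator.
    """
    resolved = merged + closed
    if resolved == 0:
        raise ValueError("no resolved PRs to compute a rate from")
    return 100.0 * merged / resolved


# Counts taken from the Key Metrics table.
overall = merge_rate(merged=693, closed=303)
print(f"{overall:.1f}%")
```

The same function applies to any row that reports merged and closed counts, such as the per-category and per-day tables.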
Prompt Categories and Success Rates
| Category | Total | Merged | Success Rate |
| --- | --- | --- | --- |
| Refactor | 21 | 18 | 85.7% ✅ |
| Documentation | 95 | 68 | 71.6% ✅ |
| Bug Fix | 696 | 495 | 71.1% ✅ |
| Testing | 41 | 26 | 63.4% |
| Feature Addition | 105 | 64 | 61.0% |
| Other | 38 | 22 | 57.9% |
✅ Successful Prompt Patterns
Common characteristics in merged PRs:
Average body length: 534 words
Structured body with ## Changes or ## Summary sections: 84% of merged vs 57% of closed
File references (e.g. `pkg/foo.go`): 74% of merged vs 63% of closed
Title prefix (`fix:`, `feat:`, `docs:`): the `fix:` prefix achieves 83% success
Example successful prompts:
PR #17087: feat: block macOS runners in agentic workflows with FAQ entry and remove smoke-macos workflow — explains the problem (container jobs not supported), concrete detection approach, and links changes together → Merged
PR #17086: Recursively clean git credentials from all checkouts in workspace and /tmp/ — clear problem statement (only root .git/config cleaned), specific scope ($GITHUB_WORKSPACE), concrete fix → Merged
PR #17065: Enforce no-secrets check on engine.env in addition to top-level env — identifies a gap in existing validation, minimal targeted fix → Merged
❌ Unsuccessful Prompt Patterns
Common characteristics in closed PRs:
Average body length: 474 words (shorter, less context)
[WIP] tag: 145 WIP PRs total — 107 closed (74%) vs 37 merged (25.5%)
Unstructured body (no ## sections): 43% of closed PRs lack sections
update verb in title: disproportionately common in closed PRs (11.2% of closed vs 6.9% of merged)
failure in title: only 59% success, below the 69.6% baseline
Example unsuccessful prompts:
PR #16535: [WIP] Convert generate summary workflow to agentic format — WIP tag, generic placeholder body → Closed
PR #16850: Recompile workflows to sync lock files — maintenance-only task, minimal context, no clear problem → Closed
PR #16956: [WIP] Add github.blog to the default list of trusted domains — WIP with vague placeholder body ("I will get started...") → Closed
WIP Analysis Detail
Of the 145 PRs with [WIP] in the title:
37 merged (25.5% success rate)
107 closed (73.8% close rate)
In contrast, non-WIP PRs have a 77.1% merge rate — 3× higher.
The WIP pattern is the single strongest predictor of PR closure. WIP PRs often start with a Copilot-generated placeholder body ("Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date...") that provides no actionable context to reviewers.
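The two signals described here, a `[WIP]` tag in the title and the placeholder body, can be detected with simple string checks. This is a minimal sketch, not the analysis's actual classifier: the function names are hypothetical, and the case-insensitive match is an assumption.

```python
import re

# Assumption: a PR counts as WIP when "[WIP]" appears anywhere in its title,
# matched case-insensitively.
WIP_TAG = re.compile(r"\[wip\]", re.IGNORECASE)

# Opening words of the placeholder body quoted in the report.
PLACEHOLDER_OPENING = "Thanks for asking me to work on this"


def is_wip(title: str) -> bool:
    """Return True if the PR title carries a [WIP] tag."""
    return WIP_TAG.search(title) is not None


def has_placeholder_body(body: str) -> bool:
    """Return True if the body still starts with the generated placeholder."""
    return body.strip().startswith(PLACEHOLDER_OPENING)


print(is_wip("[WIP] Convert generate summary workflow to agentic format"))
print(is_wip("Recompile workflows to sync lock files"))
```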
Key Insights
Pattern 1: The [WIP] tag is the #1 predictor of closure: 74% of WIP PRs are closed. Non-WIP PRs succeed at 3× the rate (77.1% vs 25.5%).
Pattern 2: 84% of merged PRs have a structured body (using ## Changes, ## Summary, etc.), compared with 57% of closed PRs. The remaining 43% of closed PRs have no sections at all, which leaves reviewers with little context to approve.
Pattern 3: Longer, more detailed prompts perform better: 1,000+ word bodies achieve 78% success vs 60% for 51–150 word bodies. More context = more confidence for merging.
Pattern 4: Refactoring is the highest-success category at 85.7%, likely because refactor prompts tend to be precise about scope and measurable outcomes.
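The structure figures are reported as shares of merged and closed PRs (84% vs 57%), which is not the same as the merge rate of structured PRs. The sketch below converts one into the other, assuming those shares apply to the 693 merged and 303 closed PRs from Key Metrics. The variable names are illustrative, and the results are estimates because the published shares are rounded.

```python
# Counts from Key Metrics.
merged_total, closed_total = 693, 303

# Shares of each group that have a structured body (from the report).
structured_share_merged = 0.84
structured_share_closed = 0.57

# Estimated PR counts in each cell (fractional because the shares are rounded).
structured_merged = structured_share_merged * merged_total
structured_closed = structured_share_closed * closed_total
unstructured_merged = merged_total - structured_merged
unstructured_closed = closed_total - structured_closed

structured_rate = 100 * structured_merged / (structured_merged + structured_closed)
unstructured_rate = 100 * unstructured_merged / (unstructured_merged + unstructured_closed)

print(f"structured bodies:   {structured_rate:.1f}% merged")
print(f"unstructured bodies: {unstructured_rate:.1f}% merged")
```

Under these assumptions the estimate comes out near 77% for structured bodies and near 46% for unstructured ones, so the gap is large even though it is smaller than the WIP gap.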
Recommendations
DO: Write prompts that clearly state the problem, the root cause, and the specific fix. Reference exact file paths and function names.
DO: Structure the PR body with ## Changes and explain why each change was made.
DO: Use action verbs fix or add in the title — they strongly correlate with merges.
AVOID: Using [WIP] tags. They signal unfinished work, and 74% of WIP PRs end up closed.
AVOID: Vague verbs like update or improve in titles without specifics.
AVOID: Generic placeholder bodies — every PR should explain the problem it solves from the first draft.
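The recommendations above could be turned into a rough pre-submission check. The sketch below is a heuristic, not an existing tool in github/gh-aw: `lint_pr`, the vague-verb list, and the file-path pattern are all assumptions chosen for illustration.

```python
import re

# Verbs the report flags as vague when used in titles without specifics.
VAGUE_VERBS = {"update", "improve"}

# Heuristic: a file reference looks like "dir/name.ext" (e.g. pkg/foo.go).
FILE_REF = re.compile(r"[\w.-]+/[\w./-]*\.\w+")

# A structured body has at least one level-2 markdown heading.
SECTION_HEADING = re.compile(r"^## \S", re.MULTILINE)


def lint_pr(title: str, body: str) -> list[str]:
    """Return warnings for patterns the report associates with closed PRs."""
    warnings = []
    if re.search(r"\[wip\]", title, re.IGNORECASE):
        warnings.append("title contains [WIP]")
    title_words = set(re.findall(r"[a-z]+", title.lower()))
    if title_words & VAGUE_VERBS:
        warnings.append("title uses a vague verb (update/improve)")
    if not SECTION_HEADING.search(body):
        warnings.append("body has no ## sections")
    if not FILE_REF.search(body):
        warnings.append("body has no file references")
    return warnings


for warning in lint_pr("[WIP] Update workflows", "I will get started on it"):
    print("warning:", warning)
```

An empty list means none of the flagged patterns were found. It does not mean the PR will be merged, since the report describes correlations only.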
Historical Trends (Last 7 Days)
| Date | Merged | Closed | Success Rate | Notable |
| --- | --- | --- | --- | --- |
| 2026-02-20 | 693 | 303 | 69.6% | Today |
| 2026-02-19 | 689 | 306 | 69.2% | Similar |
| 2026-02-18 | 697 | 301 | 69.8% | Similar |
| 2026-02-17 | 698 | 300 | 69.9% | Similar |
| 2026-02-16 | 687 | 312 | 68.8% | Lower |
| 2026-02-14 | 670 | 327 | 67.2% | Dip |
| 2026-02-13 | 659 | 338 | 66.1% | Lower |
Trend: The 30-day rolling success rate has improved from ~66% in early February to ~70% this week. Recent daily rates have mostly been strong (84% on Feb 19; 83% on Feb 20, though that figure rests on only 6 resolved PRs), which may indicate that prompt quality is improving. Feb 18 was an exception at 58%.
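Daily Breakdown (Last 14 Days)
| Date | Merged | Closed | Daily Rate |
| --- | --- | --- | --- |
| 2026-02-07 | 47 | 41 | 53% |
| 2026-02-08 | 31 | 25 | 55% |
| 2026-02-09 | 22 | 25 | 47% |
| 2026-02-10 | 27 | 15 | 64% |
| 2026-02-11 | 51 | 15 | 77% |
| 2026-02-12 | 54 | 11 | 83% |
| 2026-02-13 | 51 | 18 | 74% |
| 2026-02-14 | 59 | 16 | 79% |
| 2026-02-15 | 20 | 6 | 77% |
| 2026-02-16 | 36 | 16 | 69% |
| 2026-02-17 | 33 | 12 | 73% |
| 2026-02-18 | 32 | 23 | 58% |
| 2026-02-19 | 48 | 9 | 84% |
| 2026-02-20 | 5 | 1 | 83% |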
Note: Feb 9 was the lowest day at 47% — likely a batch of WIP/experimental PRs.
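Daily rates are easier to compare when the sample size is shown next to them, because a day with very few PRs can swing widely. The sketch below recomputes a few rows from the table, assuming the daily rate is merged / (merged + closed) rounded to a whole percent. `daily_rate` is an illustrative helper.

```python
def daily_rate(merged: int, closed: int) -> int:
    """Merged share of the day's resolved PRs, rounded to a whole percent."""
    return round(100 * merged / (merged + closed))


# (merged, closed) pairs copied from the Daily Breakdown table.
daily_counts = {
    "2026-02-09": (22, 25),
    "2026-02-18": (32, 23),
    "2026-02-19": (48, 9),
    "2026-02-20": (5, 1),
}

for date, (merged, closed) in daily_counts.items():
    # n is the number of resolved PRs the rate is based on.
    print(f"{date}: {daily_rate(merged, closed)}% (n={merged + closed})")
```

Printing `n` alongside the rate shows that the Feb 20 figure is based on 6 PRs, compared with 57 on Feb 19.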
References: §22211982021