Review validation of generated mcp server payload and fix the JSON schema or generated code. You need to take into account uninterpolated expressions...
Characteristic: Prompts structured with "---- This section details on the original issue you should resolve" wrapper
Sample Task Prompt
---- This section details on the original issue you should resolve P1 Fix Research Workflow - 90% Failure Rate, MCP Gateway Issue 15 Days Offline. Problem: Research workflow is non-operational, 90% failure rate...
Cluster 3: Agentic Workflow Maintenance
Tasks focused on agentic workflow system maintenance: debugging, recompilation, and agent configuration.
Create a smoke test workflow to validate common development tool availability. Objective: Create a minimal smoke test workflow that validates key tools are present in the sandbox environment...
Cluster 4: Safe-Outputs & API Tasks
Tasks centered on the safe-outputs subsystem — GitHub Projects API, PR creation, and related integrations.
Investigate issue in resolving project url in safe outputs. Warning: Direct projectV2 number query failed, falling back to projectsV2 list search. Request failed due to following response errors...
Cluster 5: CI Failure Investigation
CI Failure Doctor tasks — investigating specific failed workflow runs and implementing fixes.
Generate a W3C style specification of the GitHub MCP server and MCP gateway specification — allows defining allowed-repos to scope repos accessible by the GitHub MCP, supports wildcards...
Cluster 8: Targeted CI Fixes (with Job ID)
Structured CI fix tasks that include explicit job IDs and job URLs — high information density leads to higher success.
Fix the failing GitHub Actions workflow lint-go. Analyze the workflow logs, identify the root cause of the failure, and implement a fix. Job ID: 61501187403. Job URL: [link]. Custom agent used: ci-cleaner...
Trend Analysis: Historical Comparison
| Date | Total Tasks | Merged | Success Rate | Clusters | Silhouette |
|---|---|---|---|---|---|
| 2026-02-09 | 990 | 685 | 69.2% | 8 | 0.086 |
| 2026-02-14 | 980 | 678 | 69.2% | 3 | — |
| 2026-02-19 | 1,000 | 690 | 69.0% | 8 | 0.229 |
| 2026-02-20 | 1,000 | 692 | 69.2% | 8 | 0.090 |
The overall merge success rate has stabilized at ~69.2%. The 745 new PRs in this run show similar patterns to previously analyzed tasks.
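As a quick sanity check, the success rates in the trend table can be recomputed from the Total Tasks and Merged columns. The sketch below assumes that "Success Rate" means merged tasks divided by total tasks, which is consistent with the published figures but is not stated explicitly in the report; the function and variable names are illustrative.

```python
# Rows copied from the trend table: (date, total tasks, merged).
RUNS = [
    ("2026-02-09", 990, 685),
    ("2026-02-14", 980, 678),
    ("2026-02-19", 1000, 690),
    ("2026-02-20", 1000, 692),
]


def success_rate(merged: int, total: int) -> float:
    """Merged tasks as a percentage of all tasks, rounded to one decimal.

    Assumption: this is how the report derives its "Success Rate" column.
    """
    return round(100 * merged / total, 1)


# Map each run date to its recomputed success rate.
rates = {date: success_rate(merged, total) for date, total, merged in RUNS}

for date, rate in rates.items():
    print(f"{date}: {rate}%")
```

Under that assumption, every recomputed value matches the table to one decimal place, so the ~69.2% figure reflects the raw counts rather than a separate weighting.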
Notable shifts vs Feb 19:
Dependency Updates cluster grew to 402 tasks (up from roughly 360 previously)
Documentation cluster maintains its outsized success rate (93%)
Issue-Based Task Resolution remains the weakest cluster at 54.8%
| Fix gh-aw binary availability for user-defined steps | Issue-Based | ✅ Merged | 24 |
Recommendations
Based on the clustering analysis:
Improve Issue-Based Task prompts — The "Issue-Based Task Resolution" cluster (22% of all tasks, 54.8% success) is the biggest opportunity. Prompts wrapped in "---- This section details on the original issue you should resolve" often lack specific acceptance criteria. Adding explicit success conditions and expected outputs could significantly boost the merge rate.
Replicate Documentation task patterns — Documentation tasks achieve 93% success with small footprints (avg 4 files). These prompts tend to be clear, scoped, and verifiable. Apply the same specificity to other task types.
Always include Job ID/URL in CI fix tasks — The "Targeted CI Fixes (with Job ID)" cluster achieves 80.6% success despite high file complexity (avg 29 files). Providing the exact Job ID and Job URL in the prompt enables the agent to fetch precise context, reducing ambiguity.
Break down high-complexity Dependency Update tasks — With avg 28 files changed, complex dependency PRs have more failure modes. Where possible, split large package tree updates into smaller batches.
Add review requests for Issue-Based tasks — This cluster has the lowest avg reviews (0.37) and also the lowest success rate. Systematic human review checkpoints may help catch problems early.
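The recommendations above could be enforced mechanically when task prompts are generated. The following is a minimal sketch, not the project's actual tooling: `CIFixTask`, `build_prompt`, and `split_into_batches` are hypothetical names, and the job URL is a placeholder because the sample prompt in this report elides it.

```python
from dataclasses import dataclass, field


@dataclass
class CIFixTask:
    """Hypothetical container for the fields a CI-fix prompt should carry."""
    workflow: str
    job_id: str
    job_url: str
    acceptance_criteria: list[str] = field(default_factory=list)


def build_prompt(task: CIFixTask) -> str:
    """Render a prompt, refusing to emit one that lacks key context."""
    if not task.job_id or not task.job_url:
        raise ValueError("CI fix prompts should include a Job ID and a Job URL")
    if not task.acceptance_criteria:
        raise ValueError("add at least one explicit acceptance criterion")
    lines = [
        f"Fix the failing GitHub Actions workflow {task.workflow}.",
        f"Job ID: {task.job_id}",
        f"Job URL: {task.job_url}",
        "Acceptance criteria:",
    ]
    lines += [f"- {criterion}" for criterion in task.acceptance_criteria]
    return "\n".join(lines)


def split_into_batches(packages: list[str], size: int) -> list[list[str]]:
    """Split a large dependency update into smaller, separately reviewable batches."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    return [packages[i:i + size] for i in range(0, len(packages), size)]


task = CIFixTask(
    workflow="lint-go",
    job_id="61501187403",                   # taken from the sample prompt
    job_url="https://example.invalid/job",  # placeholder, the real URL is elided
    acceptance_criteria=["lint-go passes on the pull request branch"],
)
prompt = build_prompt(task)
batches = split_into_batches(["a", "b", "c", "d", "e"], size=2)
```

Raising an error on a missing Job ID or an empty criteria list is a deliberate choice here: it makes an underspecified prompt fail at creation time rather than at merge time.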
Daily NLP-based clustering analysis of copilot agent task prompts for the last 30 days.
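The Silhouette column in the trend table summarizes how well separated the clusters are. The report does not say how the score is computed, so the sketch below assumes the standard silhouette coefficient with Euclidean distance, implemented in plain Python on toy 2-D points rather than on real prompt embeddings. Under that definition, scores near 1 indicate tight, well-separated clusters, while scores near 0 (such as the 0.086 to 0.229 range reported here) suggest clusters that overlap substantially.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    For each point, a is the mean distance to the other members of its own
    cluster and b is the smallest mean distance to any other cluster; the
    point's score is (b - a) / max(a, b).
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes 0
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        b = min(
            sum(dist(point, other) for other in members) / len(members)
            for key, members in clusters.items()
            if key != label
        )
        if max(a, b) > 0:  # guard against coincident points
            total += (b - a) / max(a, b)
    return total / len(points)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
well_separated = silhouette_score(points, [0, 0, 1, 1])  # close to 1
badly_mixed = silhouette_score(points, [0, 1, 0, 1])     # negative
```

Grouping the two left-hand points together scores close to 1, while pairing each left-hand point with a right-hand point drives the score below zero, which is the kind of contrast the Silhouette column is meant to capture.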
Summary
Key Insights
Cluster Overview
Detailed Cluster Analysis
Cluster 1: General Maintenance & Updates
The largest cluster — broad maintenance tasks spanning MCP updates, workflow upkeep, and agentic configuration.
Sample Task Prompt
Cluster 2: Issue-Based Task Resolution ⚠️ Lowest Success
Tasks driven by GitHub issue descriptions — often complex, multi-part feature requests and workflow fixes from issue bodies.
Characteristic: Prompts structured with "---- This section details on the original issue you should resolve" wrapper
Sample Task Prompt
Cluster 3: Agentic Workflow Maintenance
Tasks focused on agentic workflow system maintenance: debugging, recompilation, and agent configuration.
Sample Task Prompt
Cluster 4: Safe-Outputs & API Tasks
Tasks centered on the safe-outputs subsystem: GitHub Projects API, PR creation, and related integrations.
Sample Task Prompt
Cluster 5: CI Failure Investigation
CI Failure Doctor tasks — investigating specific failed workflow runs and implementing fixes.
Sample Task Prompt
Cluster 6: Security & Campaign Tasks
Security alert burndown campaigns, code security analysis, and project-wide security improvements.
Sample Task Prompt
Cluster 7: Documentation Writing ✅ Highest Success
Technical documentation, W3C-style specs, API guides, and README updates — consistently the most successful task type.
Sample Task Prompt
Cluster 8: Targeted CI Fixes (with Job ID)
Structured CI fix tasks that include explicit job IDs and job URLs — high information density leads to higher success.
Sample Task Prompt
Recent 50 PRs: Full Data Table