[copilot-cli-research] Copilot CLI Deep Research - February 2026 #14680
This discussion was automatically closed because it expired on 2026-02-16T15:39:54.044Z.
🔍 Copilot CLI Deep Research Report
Analysis Date: February 9, 2026
Repository: github/gh-aw
Scope: 208 total workflows, 71 using Copilot engine (34%)
Workflow Run: §21831314011
📊 Executive Summary
Research Topic: Copilot CLI Optimization Opportunities
Key Findings:
Primary Recommendation: Add documentation examples and templates demonstrating custom models, engine.agent configuration, and sandbox security patterns to improve feature adoption.
This research analyzed the gap between available Copilot CLI features (documented in code and docs) and actual usage patterns across 71 Copilot workflows. The findings reveal significant underutilization of advanced features, particularly around performance optimization (custom models), security hardening (sandbox/network restrictions), and specialized behavior (custom agents).
Critical Findings
🔴 High Priority Issues
1. Zero Custom Model Usage
- No workflow sets engine.model, despite it being fully supported
- auto-triage-issues.md, ai-moderator.md could use gpt-5.1-codex-mini for simple classification tasks
2. Minimal Sandbox Adoption (Security Risk)
3. Low Network Restriction Usage
- Only 13 of 71 workflows set network.allowed, despite firewall capabilities
- Workflows could be restricted to [defaults, github] or specific domains
4. Zero engine.agent Usage
- The --agent flag feature is documented but completely unused across all 71 workflows
- archie.md, brave.md have distinct personas that could use custom agents
🟡 Medium Priority Opportunities
5. No Custom CLI Arguments (engine.args)
- No workflow uses engine.args
- Candidate flags: --verbose, --debug, additional --add-dir paths
- For example, --verbose for detailed logging
6. No Custom Environment Variables (engine.env)
- No workflow uses engine.env
7. Low safe-inputs Adoption
1️⃣ Current State Analysis
Copilot CLI Capabilities Inventory
Version Information: Using latest Copilot CLI (version managed via engine.version)
Available Features (from pkg/workflow/copilot_engine_execution.go):
Core CLI Flags
- --add-dir - Add directories to Copilot's context (automatically added: /tmp/gh-aw/, workspace)
- --log-level - Set logging verbosity (always set to "all")
- --log-dir - Set log output directory (always set to /tmp/gh-aw/sandbox/agent/logs/)
- --disable-builtin-mcps - Disable built-in MCP servers (always enabled)
- --model - Override AI model (supports custom models)
- --agent - Specify custom agent file (via engine.agent)
- --allow-tool - Grant tool permissions (auto-generated from tools config)
- --allow-all-tools - Grant all tool permissions (when bash: ["*"] or bash: [":*"])
- --allow-all-paths - Allow write access to all paths (auto-enabled with edit tool)
- --share - Generate conversation markdown (ALWAYS auto-enabled by compiler)
Engine Configuration Options
- engine.id: copilot or engine: copilot - Select Copilot engine
- engine.version - Pin Copilot CLI version (default: latest)
- engine.model - Override default model (e.g., gpt-5.1-codex-mini, claude-sonnet-4)
- engine.args - Custom CLI arguments (injected before --prompt)
- engine.env - Custom environment variables
- engine.agent - Custom agent identifier (references .github/agents/*.agent.md)
- engine.command - Override copilot command (for testing/custom builds)
Tool Integration
- tools.github
- tools.playwright
- tools.serena
- tools.agentic-workflows
- tools.web-fetch
- tools.cache-memory
- tools.repo-memory
Network & Security Features
- network.allowed - Firewall rules restricting network access
- sandbox.agent: awf - Application Whitelisting Framework (AWF) for process isolation
- sandbox.agent: srt - Sandbox Runtime (SRT) for container-based isolation
Model Selection
- GH_AW_MODEL_AGENT_COPILOT
- GH_AW_MODEL_DETECTION_COPILOT
- gpt-5.1-codex-mini
Usage Statistics
Repository Overview:
- 208 total workflows in .github/workflows/, 71 using the Copilot engine
Engine Configuration Patterns:
- engine: copilot: 71 workflows (100% of Copilot workflows)
- engine.id: copilot: 0 workflows (deprecated syntax not used)
- engine.model: 0 workflows ❌
- engine.agent: 0 workflows ❌
- engine.args: 0 workflows ❌
- engine.env: 0 workflows ❌
Tool Usage (among 71 Copilot workflows):
- tools.github: 71/71 (100%) - Universal GitHub API access
- tools.bash: 53/71 (75%) - Shell command execution
- tools.edit: 52/71 (73%) - File editing capabilities
- tools.playwright: 2/71 (3%) - Browser automation
- tools.serena: ~10/71 (14%) - Code analysis
- tools.web-fetch: ~15/71 (21%) - Web content fetching
- tools.cache-memory: 15/71 (21%) - Persistent caching
- tools.repo-memory: 12/71 (17%) - Repository-scoped memory
Security & Network:
- safe-outputs: 56/71 (79%) - High adoption for controlled side effects
- safe-inputs: 3/71 (4%) - Very low adoption
- network.allowed: 13/71 (18%) - Low network restriction usage
- sandbox.agent: 1/71 (1.4%) - Minimal sandbox usage
Timeout Distribution (119 workflows analyzed):
2️⃣ Feature Usage Matrix
Key Insight: High adoption of core tools (github, bash, edit) and safe-outputs, but very low adoption of advanced features (custom models, agents, sandbox, network restrictions).
3️⃣ Missed Opportunities
🔴 High Priority
Opportunity 1: Custom Models for Cost Optimization
What: Use cheaper/faster models for simple tasks via engine.model
Why It Matters:
- gpt-5.1-codex-mini is significantly cheaper than default Claude Sonnet 4
Where:
- auto-triage-issues.md, ai-moderator.md
- artifacts-summary.md, cli-consistency-checker.md
- Workflows using safe-outputs.add-labels
How to Implement:
Expected Benefits:
Opportunity 2: Sandbox Security for Untrusted Input
What: Enable sandbox.agent: awf for workflows processing external/untrusted content
Why It Matters:
Where:
- ai-moderator.md (external PR comments)
- brave.md, any workflow with web-fetch
How to Implement:
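One way this could look in a workflow's frontmatter. The sketch assumes the sandbox.agent key nests as shown and leaves out all unrelated fields.

```yaml
# Hypothetical fragment for a workflow that reads untrusted content,
# e.g. ai-moderator.md. Triggers, tools and safe-outputs are omitted.
engine: copilot
sandbox:
  agent: awf   # key and value as named in this report
```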
Expected Benefits:
Opportunity 3: Network Allowlisting by Default
What: Add network.allowed to all workflows to restrict outbound connections
Why It Matters:
Where: All workflows except those explicitly needing broad network access
Recommended Pattern:
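A sketch of the allowlist in block form, assuming that defaults and github are the preset names the report refers to as [defaults, github]:

```yaml
# Hypothetical fragment: restrict outbound access to the two presets
# named in this report. Equivalent to the flow form
# `allowed: [defaults, github]`.
network:
  allowed:
    - defaults
    - github
```

Workflows that call other services would need those domains appended to the list.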
Expected Benefits:
- [defaults, github] allowlist
Opportunity 4: Custom Agents for Specialized Workflows
What: Use engine.agent to reference custom agent files for workflows with distinct personas
Why It Matters:
Where:
- archie.md (diagram generator), brave.md (search agent)
- agent-performance-analyzer.md, breaking-change-checker.md
How to Implement:
- Create .github/agents/diagram-specialist.agent.md:
Expected Benefits:
🟡 Medium Priority
Opportunity 5: Custom CLI Arguments for Debugging
What: Use engine.args to pass custom flags to Copilot CLI
Why It Matters:
- --verbose or --debug for troubleshooting
- Additional --add-dir paths for workflow-specific context
Example:
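A minimal sketch, assuming engine.args takes a list of strings that are passed through to the CLI (the inventory above says they are injected before --prompt). Whether the installed Copilot CLI version accepts --verbose is not verified here.

```yaml
# Hypothetical fragment: pass one extra flag through to the Copilot CLI.
engine:
  id: copilot
  args: ["--verbose"]   # flag named in this report; confirm it against the CLI's help output
```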
Use Cases:
Opportunity 6: Engine Environment Variables
What: Use engine.env for engine-specific configuration
Why It Matters:
Example:
Use Cases:
Opportunity 7: Expand safe-inputs Adoption
What: More workflows should use safe-inputs for input sanitization
Current: Only 3/71 workflows use safe-inputs
Opportunity: Any workflow processing user input should sanitize it
Where:
Example:
Opportunity 8: More Aggressive Timeout Tuning
Current: Most workflows use 10-30 minute timeouts
Opportunity: Right-size timeouts based on actual workflow complexity
Recommendations:
Pattern:
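As a sketch, the timeout is a single top-level frontmatter key. The value below is illustrative and reuses one of the figures quoted in the workflow-specific recommendations.

```yaml
# Hypothetical fragment: cap a lightweight workflow at 10 minutes.
# Choose the value from observed run durations, not from this example.
timeout-minutes: 10
```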
Opportunity 9: Repo-Memory for Trend Analysis
Current: 12/71 workflows use repo-memory
Opportunity: More workflows could track trends over time
Where:
Example:
🟢 Low Priority
Opportunity 10: Playwright for Browser Testing
Current: Only 2/71 workflows use Playwright
Opportunity: Workflows testing web UIs or scraping complex sites
Use Case: Testing documentation sites, validating links, screenshot generation
Opportunity 11: Version Pinning for Stability
Current: Implicit use of latest version
Opportunity: Pin specific Copilot CLI versions for reproducibility
Example:
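A sketch of a pinned version, assuming engine.version accepts a version string. The value shown is a placeholder, not a real release number.

```yaml
# Hypothetical fragment: pin the Copilot CLI to a fixed release.
engine:
  id: copilot
  version: "X.Y.Z"   # placeholder; substitute a release you have tested
```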
Trade-off: Stability vs. missing new features
Opportunity 12: Custom Commands for Testing
Current: No workflows use engine.command
Opportunity: Test custom Copilot builds or forks
Example:
Use Case: Internal testing only
4️⃣ Specific Workflow Recommendations
High-Impact Workflows
Workflow: ai-moderator.md
Current State: Simple spam detection with default model, no sandbox
Recommended Changes:
- engine.model: gpt-5.1-codex-mini (cheaper for classification)
- sandbox.agent: awf (security isolation for untrusted content)
- network.allowed: [defaults, github] (restrict network access)
Expected Benefits: 60% cost reduction, improved security posture
Workflow: archie.md
Current State: Diagram generation with default config
Recommended Changes:
- .github/agents/diagram-specialist.agent.md
- engine.agent: diagram-specialist
- timeout-minutes: 10 (currently 10, keep as-is)
Expected Benefits: More consistent diagram quality, reusable agent definition
Workflow: auto-triage-issues.md
Current State: Issue classification with default model
Recommended Changes:
- engine.model: gpt-5.1-codex-mini (cheaper for labeling)
- network.allowed: [defaults, github]
Expected Benefits: 60% cost reduction, faster labeling
Workflow: breaking-change-checker.md
Current State: Comprehensive API analysis
Recommended Changes:
- network.allowed: [defaults, github]
- timeout-minutes: 15 (currently 10, may need more time)
Expected Benefits: Better security isolation
Workflow: brave.md
Current State: Web search with external API
Recommended Changes:
- sandbox.agent: awf (isolate web content)
- network.allowed: [defaults, github, "api.search.brave.com"]
Expected Benefits: Security isolation for web content
Template Recommendations
Simple Classification Template
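A possible shape for this template, combining the changes recommended for auto-triage-issues.md. The nesting of safe-outputs.add-labels and its max limit are assumptions based on the key names used in this report. The trigger and permissions are left out.

```yaml
# Hypothetical template for label-only classification workflows.
engine:
  id: copilot
  model: gpt-5.1-codex-mini    # cheaper model suggested by this report
network:
  allowed: [defaults, github]  # no external access needed for labeling
safe-outputs:
  add-labels:
    max: 3                     # illustrative limit, adjust per workflow
```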
Secure External Content Template
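A sketch for workflows that ingest web or user-supplied content, following the brave.md recommendation. The extra domain is the one listed there; other workflows would substitute their own.

```yaml
# Hypothetical template for workflows that process untrusted content.
engine: copilot
sandbox:
  agent: awf                   # isolate the agent process
network:
  allowed:
    - defaults
    - github
    - "api.search.brave.com"   # taken from the brave.md recommendation; replace as needed
```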
Specialized Agent Template
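A sketch based on the archie.md recommendation. It assumes that engine.agent: diagram-specialist resolves to .github/agents/diagram-specialist.agent.md, as the inventory describes. The contents of the agent file itself are not shown in this report and are not reproduced here.

```yaml
# Hypothetical template for a workflow with a dedicated persona.
engine:
  id: copilot
  agent: diagram-specialist   # expects .github/agents/diagram-specialist.agent.md to exist
timeout-minutes: 10           # value quoted for archie.md in this report
```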
5️⃣ Trends & Insights
First Comprehensive Analysis
This is the first comprehensive Copilot CLI deep research for this repository. Future analyses will track:
Baseline Metrics (February 2026):
Next Analysis: Recommend quarterly deep research to track improvements
6️⃣ Best Practice Guidelines
Based on this research, here are recommended best practices for Copilot workflows:
1. Right-Size Your Model
- engine.model: gpt-5.1-codex-mini for simple tasks
2. Secure by Default
- network.allowed: [defaults, github] unless you need external access
- sandbox.agent: awf for workflows processing untrusted content
- safe-inputs when handling secrets or user input
3. Optimize Timeouts
4. Use Custom Agents for Personas
- .github/agents/*.agent.md for specialized behavior
- engine.agent: agent-name
5. Leverage Safe-Outputs
- max limits
- expires for temporary outputs
- group for related issues
6. Memory for Continuity
- repo-memory for trend tracking
- cache-memory for temporary data
7️⃣ Action Items
Immediate Actions (this week)
- docs/src/content/docs/reference/engines.md
Short-term (this month)
- Move simple workflows to gpt-5.1-codex-mini and measure cost savings
- Create .github/agents/ files for common personas (diagram-specialist, security-analyzer, etc.)
- Add network.allowed to 20+ workflows that don't need external access
Long-term (this quarter)
📚 References
Code Analysis
- pkg/workflow/copilot_engine.go - Core engine interface and capabilities
- pkg/workflow/copilot_engine_execution.go - CLI argument construction and execution
- pkg/workflow/copilot_engine_tools.go - Tool permissions and configuration
- pkg/workflow/copilot_mcp.go - MCP server integration
- pkg/workflow/copilot_srt.go - Sandbox Runtime integration
Documentation
Workflows Analyzed
- .github/workflows/
- engine: copilot
- agent-performance-analyzer.md, ai-moderator.md, archie.md, auto-triage-issues.md, brave.md, breaking-change-checker.md, and 65 others
Research Methodology
Phase 1: Capability Inventory
- pkg/workflow/ and pkg/cli/
- docs/src/content/docs/reference/engines.md and related guides
Tools Used:
- find pkg -name 'copilot*.go' - Located all Copilot source files
Phase 2: Usage Pattern Analysis
- grep to extract frontmatter patterns across all workflows
Tools Used:
- grep -l "engine: copilot" .github/workflows/*.md - Found Copilot workflows
Phase 3: Gap Analysis
Analysis Framework:
Phase 4: Recommendation Generation
Criteria:
Limitations
References: