[copilot-cli-research] Copilot CLI Deep Research - 2026-02-20 #17287

2026-02-20T21:20:21Z

github-actions[bot]
bot Feb 20, 2026

Analysis Date: 2026-02-20
Repository: github/gh-aw
Scope: 153 total workflow files, 73 using Copilot engine (47.7%)
Triggered by: @pelikhan
Workflow Run: §22241412824

📊 Executive Summary

This analysis compared all available Copilot CLI features (CLI flags, engine configuration options, MCP capabilities, sandboxing, agent files) against actual usage across 73 Copilot-engine workflows in this repository.

Key Findings:

🚨 engine.env is completely unused (0 workflows) despite being a documented, powerful feature
🚨 7 out of 10 agent files (70%) in .github/agents/ are never referenced by any workflow
⚠️ safe-inputs is nearly unused (only 1 workflow) despite being a key security feature
⚠️ grumpy-reviewer and contribution-checker agent files exist but are never used via engine.agent: - they could unlock more consistent, specialized behavior across review workflows
✅ toolsets are well used: most workflows that use the GitHub tool specify explicit toolsets, which is excellent practice
✅ --share flag, --disable-builtin-mcps, cache-memory, and safe-outputs are used appropriately

Primary Recommendation: Enable engine.agent for review workflows (grumpy-reviewer, contribution-checker) and start using engine.env for secrets/config that are currently hardcoded or entirely missing.

🔴 Critical Findings

High Priority Issues

1. engine.env — Completely Unused (0 workflows)
The engine.env field allows passing custom environment variables to the Copilot CLI engine. It is documented and fully supported, but zero workflows use it. Any workflow that needs custom environment configuration is therefore either going without it or resorting to args:.

2. 7/10 Agent Files (70%) Are Orphaned
The following agent files exist in .github/agents/ but are never referenced by any workflow via engine.agent:

agentic-workflows.agent.md
contribution-checker.agent.md
create-safe-output-type.agent.md
custom-engine-implementation.agent.md
grumpy-reviewer.agent.md
interactive-agent-designer.agent.md
w3c-specification-writer.agent.md

Only technical-doc-writer (2 workflows) and ci-cleaner (1 workflow) are actively used. The grumpy-reviewer agent file in particular is well-crafted but the grumpy-reviewer.md workflow does not reference it via engine.agent: — it includes the persona inline in the prompt instead.

3. safe-inputs Near-Zero Adoption
safe-inputs is only used by 1 workflow (security-review.md), despite being designed to allow structured, secure user input to agent workflows. Any workflow triggered by a slash command with user input could benefit from safe-inputs for input sanitization.

Medium Priority Opportunities

4. engine.args Barely Used (1 workflow)
Only unbloat-docs.md uses engine.args. This feature allows injecting custom CLI arguments before the --prompt flag, which is useful for advanced use cases. Notably, --add-dir for custom directories, --verbose, or special flags for debugging are all accessible but unused.

5. Model Selection Gap for Complex Workflows
Only 8 workflows specify an explicit model, and most of those choose gpt-5.1-codex-mini for cost optimization. High-complexity workflows like daily-repo-chronicle.md (45-min timeout), org-health-report.md (60-min timeout), and pr-triage-agent.md (30-min timeout) do not specify a model. These could benefit from explicitly selecting a more capable model.

6. Network Config Without AWF Sandbox
66 workflows have a network: section, but only 13 use the AWF sandbox (sandbox.agent: awf). The network: config is enforced only when the sandbox is enabled; otherwise it has no effect. Authors and reviewers of these workflows may believe they have network restrictions that aren't actually enforced.

1️⃣ Copilot CLI Capabilities Inventory

View Full Capabilities Inventory

CLI Flags (auto-injected by gh-aw compiler)

Flag	Status	Notes
`--add-dir`	✅ Auto-added	`/tmp/gh-aw/`, `$GITHUB_WORKSPACE`, cache dirs
`--disable-builtin-mcps`	✅ Always enabled	Prevents interference from user MCP config
`--log-level all --log-dir`	✅ Auto-added	Logs collection
`--share (path)`	✅ Auto-added	Conversation markdown for step summary
`--allow-tool`	✅ Auto-generated	From `tools:` config
`--model`	⚠️ Optional	From `engine.model:` or `GH_AW_MODEL_AGENT_COPILOT` var
`--agent`	⚠️ Optional	From `engine.agent:` — mostly unused for agent files
`--prompt`	✅ Always set	From markdown content
`--allow-all-tools`	✅ Auto-generated	When `bash: ["*"]`
`--allow-all-paths`	✅ Auto-generated	When `edit:` tool enabled

Engine Configuration Options

Field	Usage	Notes
`engine.model`	8 workflows	Cost optimization (gpt-5.1-codex-mini)
`engine.version`	0 production	Only smoke tests
`engine.args`	1 workflow	Only `unbloat-docs.md`
`engine.agent`	2 workflows (real files)	`technical-doc-writer`, `ci-cleaner`
`engine.command`	0 workflows	No custom executable overrides
`engine.env`	0 workflows	🚨 Completely unused
`max-turns`	5 workflows	Claude/Codex only (not supported by Copilot yet)

Sandbox & Security Features

Feature	Usage	Notes
`sandbox.agent: awf`	13 workflows	Network firewalling
`network.allowed`	66 workflows	Often without AWF (no enforcement)
`safe-outputs`	~70 workflows	Widely used ✅
`safe-inputs`	1 workflow	`security-review.md` only
`lockdown` mode	Several workflows	GitHub lockdown mode

Tool Usage

Tool	Count	Notes
`github:`	~60 workflows	With toolsets ✅
`cache-memory:`	48 occurrences	Widely used ✅
`playwright:`	9 workflows	Good adoption
`web-fetch:`	11 workflows	Reasonable adoption
`web-search:`	2 workflows	Not supported by Copilot (Claude/Codex only)
`agentic-workflows:`	24 workflows	Good adoption

Available Copilot Engine Capabilities

Default version: 0.0.412
Default detection model: gpt-5.1-codex-mini
LLM gateway: Port 10002
Supports: web-fetch, plugins, firewall, LLM gateway
Does NOT support: max-turns, web-search (built-in)

2️⃣ Feature Usage Matrix

Feature Category	Total Available	Used	Not Used	Usage Rate
Core CLI flags	10	10	0	100% (auto)
Engine config fields	6	3	3	50%
Agent files	10	2	7	20%
GitHub toolsets	Many	Most	Few	~85% ✅
Sandbox (AWF)	Available	13	60	18%
safe-inputs	Available	1	72	1.4%
engine.env	Available	0	73	0%
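
The usage rates in the matrix are plain used/total ratios. A minimal sketch of that arithmetic, using only counts quoted in this report (the helper name `usage_rate` is illustrative, not part of gh-aw):

```python
def usage_rate(used: int, total: int) -> float:
    """Percentage of `total` that is `used`, rounded to one decimal place."""
    return round(100 * used / total, 1)


# Counts taken from this report.
copilot_share = usage_rate(73, 153)    # Copilot-engine workflows among all workflows
awf_share = usage_rate(13, 73)         # AWF sandbox among Copilot workflows
safe_inputs_share = usage_rate(1, 73)  # safe-inputs among Copilot workflows
agent_file_share = usage_rate(2, 10)   # agent files referenced by a workflow
```

These evaluate to 47.7, 17.8 (reported as 18%), 1.4, and 20.0. Note that the agent files row lists 2 used and 7 unused out of 10, which leaves one file unaccounted for in the table.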

3️⃣ Missed Opportunities

View High Priority Opportunities

🔴 Opportunity 1: Use `engine.env` for Environment Configuration

What: The engine.env field allows passing custom environment variables to the Copilot CLI at execution time. It is fully supported but never used.

Why It Matters: Some workflows may need to configure behavior via environment variables (custom API endpoints, feature flags, debug settings). Because engine.env is unused, no workflow currently does this through the supported mechanism.

Where: Any workflow needing environment-specific configuration.

How to Implement:

engine:
  id: copilot
  env:
    MY_API_ENDPOINT: (api.example.com/redacted)
    DEBUG_MODE: "false"
    CUSTOM_CONFIG: ${{ vars.MY_CUSTOM_CONFIG }}

Expected Benefits: Cleaner configuration management without needing to embed values in args or prompts.

🔴 Opportunity 2: Use Agent Files for Specialized Roles

What: 7 of 10 agent files in .github/agents/ are never used by any workflow. Specifically, grumpy-reviewer.agent.md, contribution-checker.agent.md, and interactive-agent-designer.agent.md define rich personas that could be leveraged.

Why It Matters: Agent files provide consistent, reusable system prompts and tool configurations. The grumpy-reviewer.md workflow defines its persona inline in the prompt rather than using the pre-built grumpy-reviewer.agent.md file, leading to duplication.

Where:

grumpy-reviewer.md → should reference agent: grumpy-reviewer
pr-nitpick-reviewer.md → could use a reviewer agent file
contribution-check.md → should reference agent: contribution-checker

How to Implement:

engine:
  id: copilot
  agent: grumpy-reviewer    # References .github/agents/grumpy-reviewer.agent.md

Expected Benefits: DRY principle for agent personas, consistent behavior, easier updates to reviewer personality without modifying each workflow.
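
One way to keep track of orphaned agent files is a small audit script. The following is a rough sketch, not the tooling used for this report. It assumes agent files are referenced by bare name on an `agent:` line in the workflow frontmatter, as in the example above, and that the frontmatter sits between a leading pair of `---` lines. The function names are hypothetical.

```python
import re
from pathlib import Path

# Matches `agent: <name>` lines. This also matches `sandbox.agent: awf`,
# which is harmless here because "awf" is not an agent file name.
AGENT_LINE = re.compile(r"^[ \t]*agent:[ \t]*([A-Za-z0-9_.-]+)", re.MULTILINE)


def frontmatter(text: str) -> str:
    """Return the YAML between the leading pair of `---` lines, or '' if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i]) + "\n"
    return ""


def orphaned_agents(agent_names, workflow_texts):
    """Agent names that no workflow frontmatter references, sorted."""
    used = set()
    for text in workflow_texts:
        used.update(AGENT_LINE.findall(frontmatter(text)))
    return sorted(set(agent_names) - used)


def scan_repo(root="."):
    """Apply the check to the directory layout described in this report."""
    base = Path(root)
    names = [p.name[: -len(".agent.md")]
             for p in (base / ".github" / "agents").glob("*.agent.md")]
    texts = [p.read_text(encoding="utf-8")
             for p in (base / ".github" / "workflows").glob("*.md")]
    return orphaned_agents(names, texts)
```

Only the frontmatter is scanned, so a persona that is merely mentioned in the prompt body does not count as a reference.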

🔴 Opportunity 3: Network Config Enforcement Gap

What: 66 workflows have network: configuration but only 13 have sandbox.agent: awf enabled. The network config (allowed domains, etc.) is only enforced when the AWF sandbox is active.

Why It Matters: Security teams reviewing these workflows may believe network restrictions are enforced when they're not. This is a security posture clarity issue.

Where: Most workflows with network: section that lack sandbox: agent: awf

How to Implement:

Option A: Add sandbox: agent: awf to enforce restrictions
Option B: Remove network: sections that serve no purpose without AWF
Option C: Document this behavior clearly in workflow comments

network:
  allowed: [defaults]  # Note: only enforced when sandbox.agent: awf is set
sandbox:
  agent: awf           # Required to enforce network restrictions
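
Taking the report's claim at face value (network: is only enforced when sandbox.agent: awf is set), affected workflows could be listed with a check like the one below. This is a sketch under stated assumptions: it uses regular expressions rather than a YAML parser, expects `network:` and `sandbox:` as top-level frontmatter keys in block style, and the helper names are hypothetical.

```python
import re

# A top-level `network:` key.
NETWORK_KEY = re.compile(r"^network:", re.MULTILINE)
# A top-level `sandbox:` block whose indented body contains `agent: awf`.
AWF_SANDBOX = re.compile(
    r"^sandbox:[ \t]*\n(?:[ \t]+.*\n)*?[ \t]+agent:[ \t]*awf\b", re.MULTILINE
)


def frontmatter(text: str) -> str:
    """Return the YAML between the leading pair of `---` lines, or '' if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i]) + "\n"
    return ""


def network_unenforced(workflow_text: str) -> bool:
    """True when a workflow declares `network:` but does not enable the AWF sandbox."""
    fm = frontmatter(workflow_text)
    return bool(NETWORK_KEY.search(fm)) and not AWF_SANDBOX.search(fm)
```

Inline forms such as `sandbox: {agent: awf}` are not recognized by this sketch and would be reported as unenforced.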

View Medium Priority Opportunities

🟡 Opportunity 4: Model Selection for High-Complexity Workflows

What: Workflows with long timeouts (30-60 minutes) don't specify a model. They rely on the default or the GH_AW_MODEL_AGENT_COPILOT variable.

Where: daily-repo-chronicle.md (45min), org-health-report.md (60min), ci-coach.md (30min), pr-triage-agent.md (30min), delight.md (30min), workflow-health-manager.md (30min)

How to Implement:

engine:
  id: copilot
  model: claude-sonnet-4  # Or appropriate model for complex reasoning

Expected Benefits: Better results for complex tasks, predictable behavior regardless of env variable settings.
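
Candidates for an explicit model could be found mechanically. The sketch below assumes the timeout is declared as a top-level `timeout-minutes:` key in the frontmatter, which this report does not confirm, and that a model shows up as an indented `model:` key as in the example above. The function name and the default threshold are illustrative.

```python
import re

# Assumption: the timeout is a top-level `timeout-minutes:` key.
TIMEOUT = re.compile(r"^timeout-minutes:[ \t]*(\d+)", re.MULTILINE)
# An indented `model:` key with a value, as in the `engine:` block above.
MODEL = re.compile(r"^[ \t]+model:[ \t]*\S", re.MULTILINE)


def long_running_without_model(frontmatter: str, threshold_minutes: int = 30) -> bool:
    """True when the timeout meets the threshold and no model is specified."""
    match = TIMEOUT.search(frontmatter)
    if match is None:
        return False
    too_long = int(match.group(1)) >= threshold_minutes
    return too_long and MODEL.search(frontmatter) is None
```

The function takes the frontmatter text only, so the prompt body cannot produce false matches.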

🟡 Opportunity 5: Expand `safe-inputs` Adoption

What: safe-inputs is a security feature for sanitizing user input in slash-command workflows. Only security-review.md uses it despite many workflows accepting user input via slash commands.

Where: All slash-command workflows that incorporate user input into agent prompts:

grumpy-reviewer.md
pr-nitpick-reviewer.md
dev.md
q.md
refiner.md

How to Implement:

safe-inputs:
  - id: safeinputs-context
    command: cat
    args: "${{ steps.sanitized.outputs.text }}"

Expected Benefits: Protection against prompt injection via user input in slash commands.

🟡 Opportunity 6: `engine.args` for Debugging & Performance

What: Custom CLI arguments via engine.args can unlock debugging capabilities or optimize performance in specific scenarios.

Where: Workflows experiencing token limit issues or needing special behavior.

How to Implement:

engine:
  id: copilot
  args: ["--verbose"]  # For debugging
  # Or for specific performance tuning

Expected Benefits: Easier debugging, more control over Copilot CLI behavior.

🟡 Opportunity 7: Explicit `toolsets` for GitHub Tool Scoping

What: Several workflows use github: without specifying toolsets:. This gives the agent access to all GitHub MCP tools rather than just the ones it needs.

Where: smoke-copilot.md uses github: with no toolsets. copilot-pr-merged-report.md is configured with github: false (correct!) but some others may over-provision.

How to Implement:

tools:
  github:
    toolsets: [pull_requests, repos]  # Minimum required toolsets

Expected Benefits: Principle of least privilege, faster tool discovery, reduced risk of unintended mutations.
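
Over-provisioned workflows could be flagged with an indentation-based check such as the following. It is a sketch that assumes block-style YAML and takes the frontmatter text as input. It treats `github: false` as explicitly disabled, matching the copilot-pr-merged-report.md case above. The function name is hypothetical.

```python
def github_without_toolsets(frontmatter: str) -> bool:
    """True when a `github:` tool entry has no `toolsets:` restriction."""
    lines = frontmatter.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped.startswith("github:"):
            continue
        # Value on the same line, with any trailing comment removed.
        value = stripped[len("github:"):].split("#", 1)[0].strip()
        if value == "false" or "toolsets" in value:
            return False  # tool disabled, or inline mapping with toolsets
        indent = len(line) - len(line.lstrip())
        for child in lines[i + 1:]:
            if not child.strip():
                continue
            if len(child) - len(child.lstrip()) <= indent:
                break  # left the `github:` block
            if child.strip().startswith("toolsets:"):
                return False
        return True
    return False  # no `github:` entry at all
```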

View Low Priority Opportunities

🟢 Opportunity 8: Version Pinning for Stability

What: No production workflows pin the Copilot CLI version. All use the default (0.0.412 currently). Pinning would provide reproducibility.

Where: Critical production workflows.

How to Implement:

engine:
  id: copilot
  version: "0.0.412"  # Pin to tested version

Tradeoff: Version pinning requires maintenance but provides stability. The current approach auto-updates with each gh-aw release, which may be preferable.

4️⃣ Specific Workflow Recommendations

View Workflow-Specific Recommendations

`grumpy-reviewer.md`

Current State: Has a comprehensive persona defined inline in the prompt
Recommendation: Add engine.agent: grumpy-reviewer to use the pre-built agent file
Benefit: Persona becomes reusable, single source of truth

`pr-nitpick-reviewer.md`

Current State: Defines reviewer behavior inline
Recommendation: Create a nitpick-reviewer.agent.md or use existing reviewer agents
Benefit: Consistent review style, easier maintenance

`contribution-check.md`

Current State: Does not use the contribution-checker.agent.md file
Recommendation: Add engine.agent: contribution-checker
Benefit: Uses specialized agent designed for contribution checking

`smoke-copilot.md`

Current State: github: tool with no toolsets restriction
Recommendation: Add toolsets: [default] or specific toolsets needed
Benefit: Principle of least privilege, also good as a smoke test example

`daily-repo-chronicle.md` / `org-health-report.md`

Current State: 45-60 minute timeout with no model specified
Recommendation: Consider specifying a capable model explicitly
Benefit: Predictable quality for these long-running analytical tasks

Workflows with `network:` but no AWF (60+ workflows)

Current State: Network config specified but not enforced
Recommendation: Either add AWF or add a comment clarifying the config is informational only
Benefit: Prevents security posture confusion

5️⃣ Historical Trends

View Historical Analysis

This is the first comprehensive Copilot CLI usage analysis for this repository. No previous analysis exists in repo-memory to compare against.

Observations from code history (CHANGELOG):

--share flag was recently added and is now automatically injected ✅
Session state files (.jsonl) are now copied to logs before redaction ✅
Retry logic was added to the Copilot CLI installer ✅
Domain blocklist support (--block-domains) was recently added
The --disable-builtin-mcps flag is always added to prevent user MCP config interference ✅

Evolution trend: The tooling is mature and the compiler handles most Copilot CLI complexity automatically. The remaining gaps are primarily in user-facing configuration (agent files, engine.env, safe-inputs).

Future runs will track:

Whether unused agent files get adopted
Whether engine.env starts seeing usage
Whether safe-inputs adoption increases across slash-command workflows
Whether network config / AWF alignment improves

6️⃣ Best Practice Guidelines

Based on this research, here are recommended best practices for Copilot workflows:

Specify GitHub toolsets explicitly: Always use toolsets: [...] with the minimum required set rather than leaving it open. Most workflows do this well already.
Use agent files for personas: If a workflow has a specialized persona (reviewer, doc writer, analyzer), define it as an agent file and reference it via engine.agent: rather than embedding it inline.
Add safe-inputs to slash-command workflows: Any workflow that accepts user input via slash commands should use safe-inputs to sanitize input before it reaches the prompt.
Network config requires AWF to be enforced: Don't add network: sections unless also enabling sandbox.agent: awf. The network configuration has no effect without the sandbox.
Use engine.env for configuration: Instead of embedding environment-specific values in args or prompts, use engine.env for clean, auditable configuration.
Set explicit models for complex tasks: Long-running workflows (>20 minutes) should specify a model explicitly rather than relying on the default or environment variables.

7️⃣ Action Items

Immediate Actions (this week):

Update grumpy-reviewer.md to use engine.agent: grumpy-reviewer instead of inline persona
Update contribution-check.md to use engine.agent: contribution-checker
Audit workflows with network: but no AWF — add comments or remove dead config

Short-term (this month):

Add safe-inputs to top 5 most-used slash-command workflows (q.md, refiner.md, grumpy-reviewer.md, dev.md, pr-nitpick-reviewer.md)
Specify explicit models for the top 5 longest-running workflows
Create a first workflow that uses engine.env as a reference example
Add toolsets: to smoke-copilot.md

Long-term (this quarter):

Review all 7 unused agent files — either adopt or archive
Evaluate whether max-turns support for Copilot should be added (currently supportsMaxTurns: false)
Consider adding a shared workflow configuration template that demonstrates all Copilot best practices
Track safe-inputs adoption as the feature matures

View Supporting Evidence & Methodology

Research Methodology

Data Collection:

Analyzed all 153 workflow markdown files in .github/workflows/
Reviewed 6 Copilot engine source files: copilot_engine.go, copilot_engine_execution.go, copilot_engine_tools.go, copilot_mcp.go, copilot_installer.go, copilot_participant_steps.go
Examined .github/agents/ directory (10 agent files)
Reviewed docs/src/content/docs/reference/engines.md
Checked pkg/constants/constants.go for version and model constants
Reviewed CHANGELOG.md for recent Copilot feature additions

Key Statistics:

153 total workflow files
73 (47.7%) use engine: copilot
37 (24.2%) use Claude engine
9 (5.9%) use Codex engine
1 uses Gemini engine
19 use extended engine config format (engine:\n id: ...)
13 workflows use AWF sandbox
48 cache-memory configurations
66 network configurations
8 explicit model selections
0 engine.env usages
1 safe-inputs usage
10 agent files, 2 used (20%)

Tools Used: grep, find, file inspection of Go source code
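
The per-engine counts above could be reproduced along these lines. This is a hedged sketch rather than the commands actually run: it handles the short form (`engine: copilot`) and the extended form (`engine:` followed by an indented `id:`) mentioned in the statistics, takes frontmatter text as input, and returns None when no engine is declared. The function names are hypothetical.

```python
import re
from collections import Counter

# Short form: `engine: copilot`, optionally followed by a comment.
SHORT_FORM = re.compile(
    r"^engine:[ \t]*([A-Za-z0-9_-]+)[ \t]*(?:#.*)?$", re.MULTILINE
)
# Extended form: `engine:` followed by an indented block containing `id: <name>`.
LONG_FORM = re.compile(
    r"^engine:[ \t]*\n(?:[ \t]+.*\n)*?[ \t]+id:[ \t]*([A-Za-z0-9_-]+)", re.MULTILINE
)


def engine_id(frontmatter: str):
    """Return the declared engine id, or None if no engine is declared."""
    for pattern in (SHORT_FORM, LONG_FORM):
        match = pattern.search(frontmatter)
        if match:
            return match.group(1)
    return None


def engine_counts(frontmatters):
    """Count workflows per engine id across an iterable of frontmatter strings."""
    return Counter(engine_id(fm) for fm in frontmatters)
```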

References

Engines documentation: docs/src/content/docs/reference/engines.md
Copilot engine implementation: pkg/workflow/copilot_engine*.go
Agent files: .github/agents/*.agent.md
Copilot version: 0.0.412 (as of this analysis)

References:

§22241412824

AI generated by Copilot CLI Deep Research Agent

expires on Feb 27, 2026, 9:20 PM UTC

2026-02-20T21:42:27Z

github-actions[bot]
bot Feb 20, 2026
Author

🤖 Beep boop! The smoke test agent was here!

I just rolled through to make sure everything's working. Tests are running, engines are revving, and the CI pipeline is humming along nicely. 🚀

Stay fresh, discussion #17287! ✨

📰 BREAKING: Report filed by Smoke Copilot for issue #17245
