📊 Agentic Workflow Lock File Statistics - October 19, 2025 #1965

dsyme (Maintainer) · 2025-10-19T03:36:21Z

📊 Agentic Workflow Lock File Statistics - 2025-10-19

Executive Summary

Total Lock Files: 37
Total Size: 7.21 MB
Average File Size: 199.54 KB
Analysis Date: 2025-10-19
Repository: githubnext/gh-aw

File Size Distribution

Size Range	Count	Percentage
< 10 KB	0	0%
10-50 KB	0	0%
50-100 KB	2	5.4%
100-200 KB	19	51.4%
> 200 KB	16	43.2%

Statistics:

Smallest: test-post-steps.lock.yml (90.75 KB)
Largest: poem-bot.lock.yml (346.21 KB)
Median Range: 100-200 KB (also the most common range)
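
The bucketing behind the distribution table can be sketched in a few lines of Python. This is a minimal illustration, not the script the analysis actually ran (the report says it used Bash text processing); the function names and the rule that lower bounds are inclusive and upper bounds exclusive are assumptions.

```python
from collections import Counter

# Bucket edges in KB, mirroring the table above. Lower bounds are inclusive and
# upper bounds exclusive; that boundary rule is an assumption, not from the report
# (so a file of exactly 200 KB lands in the "> 200 KB" row here).
BUCKETS = [
    ("< 10 KB", 0.0, 10.0),
    ("10-50 KB", 10.0, 50.0),
    ("50-100 KB", 50.0, 100.0),
    ("100-200 KB", 100.0, 200.0),
    ("> 200 KB", 200.0, float("inf")),
]


def bucket_label(size_kb):
    """Return the label of the size range that contains size_kb."""
    for label, low, high in BUCKETS:
        if low <= size_kb < high:
            return label
    raise ValueError(f"size must be non-negative, got {size_kb}")


def size_distribution(sizes_kb):
    """Return (label, count, percentage) rows in table order."""
    total = len(sizes_kb)
    if total == 0:
        raise ValueError("no file sizes given")
    counts = Counter(bucket_label(size) for size in sizes_kb)
    return [
        (label, counts[label], round(100 * counts[label] / total, 1))
        for label, _, _ in BUCKETS
    ]
```

Feeding it 37 sizes split 2 / 19 / 16 across the top three ranges yields the 5.4%, 51.4% and 43.2% shown in the table.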

Interpretation

Most lock files (94.6%) are at least 100 KB, and the largest reaches 346.21 KB. This suggests that most agentic workflows carry comprehensive agent instructions, multiple safe output configurations, and extensive MCP integrations. No lock file is smaller than 50 KB (the smallest is 90.75 KB), which suggests that even the simplest agentic workflows require significant configuration.

Trigger Analysis

Most Popular Triggers

Trigger Type	Count	Percentage	Example Workflows
workflow_dispatch	29	78.4%	Most workflows support manual triggering
schedule	12	32.4%	daily-news, daily-doc-updater, audit-workflows
issue_comment	7	18.9%	brave, plan, tidy
issues	3	8.1%	issue-classifier, poem-bot
push	2	5.4%	dev, tidy
pull_request	1	2.7%	video-analyzer

Note: Percentages sum to more than 100% because workflows can have multiple triggers.
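
To make the note concrete, here is a small Python sketch of per-trigger counting. The function name and the three-workflow input are hypothetical and only illustrate why the shares can exceed 100% in total: each workflow is counted once for every trigger it declares.

```python
from collections import Counter


def trigger_shares(workflow_triggers):
    """Map each trigger to (count, percentage of workflows using it).

    workflow_triggers maps a workflow name to the collection of triggers it
    declares. A workflow with several triggers is counted once per trigger.
    """
    total = len(workflow_triggers)
    if total == 0:
        return {}
    counts = Counter(
        trigger
        for triggers in workflow_triggers.values()
        for trigger in set(triggers)  # ignore accidental duplicates
    )
    return {
        trigger: (count, round(100 * count / total, 1))
        for trigger, count in counts.most_common()
    }


# Hypothetical three-workflow input, for illustration only.
example = {
    "workflow-a": ["workflow_dispatch", "schedule"],
    "workflow-b": ["workflow_dispatch"],
    "workflow-c": ["issue_comment"],
}
shares = trigger_shares(example)
```

In this toy input `workflow_dispatch` is used by two of three workflows and the other two triggers by one each, so the percentages add up to well over 100.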

Common Trigger Combinations

Combination	Count	Use Case
workflow_dispatch only	14	Manual-only workflows for on-demand agent tasks
schedule + workflow_dispatch	11	Scheduled automation with manual override capability
issue_comment	2	Comment-reactive agents (e.g., `/brave`, `/plan`)
Multi-modal (5+ triggers)	2	Highly reactive workflows (discussion + comments + issues + PRs)

Schedule Patterns

Schedule (Cron)	Count	Description
`0 10 * * *`	3	Daily at 10:00 AM UTC
`0 9 * * 0`	2	Weekly on Sunday at 9:00 AM UTC
`0 9 * * 1-5`	1	Weekdays at 9:00 AM UTC (business days)
`0 9 * * *`	1	Daily at 9:00 AM UTC
`0 3 * * *`	1	Daily at 3:00 AM UTC (off-peak)
Others	3	Various schedules

Insight: Most scheduled workflows run during morning hours (9-10 AM UTC), likely designed to provide daily summaries or updates at the start of the workday.
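
The morning-hours observation can be checked against the listed schedules with a short Python sketch. GitHub Actions cron expressions have five fields (minute, hour, day of month, month, day of week) and are evaluated in UTC; the helper name `cron_hour` is an assumption, and it deliberately handles only a plain numeric hour field.

```python
def cron_hour(expression):
    """Return the hour field of a five-field cron expression as an int.

    Only a plain numeric hour is handled; lists, ranges, steps and "*" in the
    hour field raise ValueError.
    """
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")
    hour = int(fields[1])
    if not 0 <= hour <= 23:
        raise ValueError(f"hour out of range: {hour}")
    return hour


# Cron expressions and workflow counts copied from the table above.
# The "Others" row is left out because its expressions are not listed.
LISTED_SCHEDULES = {
    "0 10 * * *": 3,
    "0 9 * * 0": 2,
    "0 9 * * 1-5": 1,
    "0 9 * * *": 1,
    "0 3 * * *": 1,
}

morning = sum(
    count
    for expression, count in LISTED_SCHEDULES.items()
    if cron_hour(expression) in (9, 10)
)
listed = sum(LISTED_SCHEDULES.values())
```

Among the eight workflows whose schedules are listed explicitly, seven run at 9:00 or 10:00 UTC; the three workflows in the "Others" row cannot be classified from the table alone.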

Safe Outputs Analysis

Safe Output Types Distribution

Type	Count	Percentage	Workflows Using
missing_tool	36	97.3%	Nearly all workflows
create_issue	12	32.4%	issue-classifier, ci-doctor
create_discussion	12	32.4%	daily-news, research, audits
create_pull_request	9	24.3%	security-fix-pr, tidy
add_comment	8	21.6%	brave, plan, dev
notion-add-comment	4	10.8%	notion-issue-summary
upload_asset	4	10.8%	video-analyzer, pdf-summary
push_to_pull_request_branch	3	8.1%	Workflows that modify PR branches
post-to-slack-channel	3	8.1%	Slack notification workflows
create_pull_request_review_comment	1	2.7%	Code review workflows
update_issue	1	2.7%	Issue management workflows

Key Findings

Universal Tool Reporting: 97.3% of workflows enable the missing_tool safe output, allowing agents to report when they encounter missing capabilities.
Discussion vs Issue Creation: Create discussion (32.4%) and create issue (32.4%) are equally popular, suggesting workflows are split between using GitHub Discussions for broader conversations and Issues for actionable tracking.
PR Automation: 24.3% of workflows can create pull requests autonomously, demonstrating significant code modification capabilities.
Multi-modal Outputs: Many workflows combine multiple safe output types, enabling agents to interact through various GitHub interfaces (comments + issues + discussions).

Structural Characteristics

Job Complexity

Average Jobs per Workflow: 6.46 jobs
Average Steps per Workflow: 58.16 steps
Average Steps per Job: 9.00 steps
Job Range: 3-14 jobs
Step Range: 29-98 steps
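
As a quick consistency check, the steps-per-job figure follows from the two per-workflow averages. The Python sketch below is illustrative; the totals it derives are approximations recovered from the rounded averages, not numbers stated in the report.

```python
WORKFLOWS = 37
AVG_JOBS_PER_WORKFLOW = 6.46
AVG_STEPS_PER_WORKFLOW = 58.16

# Approximate totals implied by the rounded averages (not stated in the report).
total_jobs = round(AVG_JOBS_PER_WORKFLOW * WORKFLOWS)
total_steps = round(AVG_STEPS_PER_WORKFLOW * WORKFLOWS)

# Ratio of the two averages, equivalent to total steps divided by total jobs.
steps_per_job = AVG_STEPS_PER_WORKFLOW / AVG_JOBS_PER_WORKFLOW
```

Because it is a ratio of totals, this figure weights every job equally; averaging each workflow's own steps-per-job ratio could give a slightly different number.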

Distribution Analysis

Jobs per Workflow	Count
3 jobs	2 (test workflows)
4 jobs	1
6 jobs	20 (most common)
7 jobs	7
8 jobs	2
9 jobs	2
14 jobs	1 (poem-bot - outlier)

Average Lock File Structure

Based on statistical analysis, a typical .lock.yml file has:

Size: ~200 KB
Jobs: ~6-7 jobs (pre_activation, activation, agent, detection, safe outputs, missing_tool)
Steps per Job: ~9 steps
Total Steps: ~58 steps
Permissions: Empty top-level permissions ({}), read-all for agent job
Triggers: workflow_dispatch + optional schedule
Timeout: ~10-15 minutes for agent execution
MCP Servers: github + safe-outputs (mandatory), optional: tavily, notion, slack

Structural Insights

Standard Pattern: The 6-job pattern appears in 54% of workflows, representing a canonical agentic workflow structure:
- pre_activation → activation → agent → detection → safe outputs → missing_tool
Complexity Outliers:
- poem-bot (14 jobs, 98 steps): Most complex workflow with multiple parallel output handlers
- test-post-steps (3 jobs, 30 steps): Minimal test workflow
Step Efficiency: Average 9 steps per job indicates well-organized job definitions with focused responsibilities.

Permission Patterns

Permission Configuration

Permission Type	Count	Percentage
Empty top-level permissions (`permissions: {}`)	37	100%
Agent job with `read-all`	35	94.6%
Agent job with no explicit permissions	2	5.4%

Security Model

Principle of Least Privilege: 100% of workflows use empty top-level permissions, restricting access by default. The agent job then explicitly grants read-all permissions (94.6% of cases), enabling agents to:

Read repository contents
Access GitHub API for reading issues, PRs, discussions
Read workflow artifacts and logs

Write Operations: All write operations (creating issues, comments, PRs) are handled through safe-output MCP tools, which use separate authentication mechanisms, maintaining a security boundary between agent read access and write capabilities.

Notable Exception: Two workflows (test-jqschema, test-post-steps) don't specify read-all for the agent job, likely because they're minimal test workflows with limited requirements.
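
The three rows of the permission table correspond to a simple check on a parsed lock file. The Python sketch below assumes the YAML has already been loaded into a dictionary and that the agent job is keyed as `agent`; both the dictionary shape and the function name are assumptions made for illustration.

```python
def classify_permissions(workflow):
    """Classify a parsed workflow dict along the lines of the table above.

    Assumes the agent job lives under jobs["agent"], which is an
    illustrative guess at the lock file layout.
    """
    agent_job = workflow.get("jobs", {}).get("agent", {})
    return {
        # `permissions: {}` at the top level restricts access by default.
        "empty_top_level": workflow.get("permissions") == {},
        # The agent job opts back in to read-only access.
        "agent_read_all": agent_job.get("permissions") == "read-all",
        # Agent job with no explicit permissions key at all.
        "agent_unspecified": "permissions" not in agent_job,
    }


# Hypothetical parsed workflow, for illustration only.
typical = {"permissions": {}, "jobs": {"agent": {"permissions": "read-all"}}}
result = classify_permissions(typical)
```

A workflow matching the common pattern reports `empty_top_level` and `agent_read_all` as true, while the two test workflows would report `agent_unspecified` instead.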

Tool & MCP Patterns

Most Used MCP Servers

MCP Server	Count	Percentage	Purpose
safe-outputs	35	94.6%	Core GitHub write operations (issues, PRs, discussions)
github	35	94.6%	GitHub API read access
tavily	5	13.5%	Web search capabilities (daily-news, research, scout)
notion	2	5.4%	Notion integration (notion-issue-summary)
brave	2	5.4%	Brave search API
slack	1	2.7%	Slack notifications

Engine Distribution

Engine Type	Count	Percentage	Workflows
GitHub Copilot	25	67.6%	Most workflows use Copilot as the agent engine
Claude	9	24.3%	smoke-claude, and Claude-specific workflows
OpenAI/Codex	1	2.7%	smoke-codex
Unknown/Other	2	5.4%	Engine not explicitly identified

Insight: GitHub Copilot dominates as the primary agent engine (67.6%), with Claude as the secondary option (24.3%). This suggests the repository is optimized for Copilot-based agentic workflows.

Common Tool Configurations

Core Toolset (present in ~95% of workflows):
- Bash execution
- GitHub API access (read/write)
- File operations (Read, Write, Glob, Grep)
- Safe outputs MCP server
Extended Toolset (present in research/web-enabled workflows):
- Web search (Tavily/Brave)
- Web fetch capabilities
- External integrations (Notion, Slack)
Specialized Tools (workflow-specific):
- PDF processing (pdf-summary)
- Video analysis (video-analyzer)
- Custom MCP servers for domain-specific tasks

Timeout Configuration

Timeout (minutes)	Count	Percentage
5 minutes	3	8.1%
10 minutes	21	56.8%
15 minutes	7	18.9%
20 minutes	6	16.2%

Average Timeout: 12.16 minutes

Analysis:

Most workflows (56.8%) use a 10-minute timeout, balancing between execution time and resource conservation.
Longer timeouts (15-20 minutes) are used for complex analysis workflows (e.g., scout, research).
Shorter timeouts (5 minutes) are used for simple smoke tests or quick operations.
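
The average timeout is a count-weighted mean of the four timeout values in the table. A short Python sketch of that arithmetic, using only the table's numbers:

```python
# Timeout in minutes -> number of workflows, copied from the table above.
TIMEOUT_COUNTS = {5: 3, 10: 21, 15: 7, 20: 6}

total_workflows = sum(TIMEOUT_COUNTS.values())
weighted_minutes = sum(minutes * count for minutes, count in TIMEOUT_COUNTS.items())

average_timeout = weighted_minutes / total_workflows
most_common_timeout = max(TIMEOUT_COUNTS, key=TIMEOUT_COUNTS.get)
```

The weighted sum is 450 minutes over 37 workflows, which rounds to the reported 12.16 minutes, and the most common value is 10 minutes.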

Interesting Findings

Standardization Achievement: Despite 37 diverse workflows with different purposes, there's remarkable consistency in structure:
- 100% use empty top-level permissions
- 54% use the same 6-job pattern (20 of 37 workflows)
- 94.6% include both github and safe-outputs MCP servers
- 78.4% support manual workflow_dispatch triggering
The "Missing Tool" Pattern: 97.3% of workflows enable the missing_tool safe output, indicating a culture of continuous improvement where agents can report gaps in their capabilities back to developers.
Size-Complexity Correlation: File size rises with structural complexity across the three size tiers:
- Smallest files (90-95 KB) → 3 jobs, ~30 steps (test workflows)
- Medium files (150-200 KB) → 6 jobs, ~50-60 steps (standard workflows)
- Largest files (250-350 KB) → 7-14 jobs, 70-98 steps (complex multi-output workflows)
The Poem Bot Anomaly: The poem-bot.lock.yml workflow is an outlier in multiple dimensions:
- Largest file size (346 KB)
- Most jobs (14)
- Most steps (98)
- This suggests poem-bot is either a comprehensive demonstration workflow or a highly sophisticated multi-modal agent.
Morning Automation Pattern: 75% of scheduled workflows run between 9-11 AM UTC, suggesting these agentic workflows are designed to provide morning briefings, daily summaries, or workday preparation tasks.
Workflow Dispatch Dominance: 78.4% of workflows support manual triggering, indicating that while automation is valuable, human-in-the-loop control remains a priority.

Historical Trends

Note: This is the first comprehensive analysis. Future analyses will compare trends over time to track:

Growth in workflow count
Evolution of average file size
Changes in popular triggers and safe outputs
Adoption of new MCP servers
Shifts in engine preferences

This baseline data has been stored in /tmp/gh-aw/cache-memory/history/2025-10-19.json for future comparison.

Recommendations

For Workflow Authors

Adopt the Standard Pattern: The 6-job pattern (pre_activation → activation → agent → detection → outputs → missing_tool) has emerged as a best practice used by 54% of workflows. New workflows should consider adopting this pattern for consistency.
Enable Missing Tool Reporting: If not already present, add missing_tool to your safe outputs configuration to enable agents to report capability gaps.
Consider Manual Triggers: 78.4% of workflows include workflow_dispatch. If your workflow might benefit from manual execution, add this trigger.
Optimize Timeouts: The average timeout is 12 minutes. Review your workflow's timeout settings:
- Simple operations: 5-10 minutes
- Standard workflows: 10-15 minutes
- Complex analysis: 15-20 minutes

For Repository Maintainers

Template Workflow: Create a reference template based on the most common patterns:
- 6 jobs (standard pattern)
- workflow_dispatch + optional schedule
- github + safe-outputs MCP servers
- 10-minute timeout
- Empty top-level permissions with read-all for agent
Size Guidelines: Most lock files fall between 100 and 350 KB. Files outside this range may indicate:
- < 100 KB: Potentially under-specified workflows (check if instructions are sufficient)
- > 350 KB: Potentially over-complex workflows (consider splitting into multiple workflows)
Documentation: Document the canonical 6-job pattern and when to deviate from it (e.g., poem-bot's 14-job structure for specialized use cases).
MCP Server Discovery: Only 13.5% of workflows use Tavily for web search, and 5.4% use Brave. Consider promoting web-search capabilities for research-oriented workflows.
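
The size guidelines in this list could be turned into a simple review aid. The Python sketch below is one possible shape for it; the function name, the wording of the flags, and treating 100 KB and 350 KB as hard cut-offs are assumptions for illustration.

```python
def size_flag(size_kb, low_kb=100.0, high_kb=350.0):
    """Return a review hint for a lock file of the given size in KB.

    The default thresholds follow the suggested guidelines; they are
    heuristics for prompting a review, not hard limits.
    """
    if size_kb < 0:
        raise ValueError(f"size must be non-negative, got {size_kb}")
    if size_kb < low_kb:
        return "possibly under-specified"
    if size_kb > high_kb:
        return "possibly over-complex"
    return "within typical range"
```

With these defaults, test-post-steps.lock.yml (90.75 KB) would be flagged as possibly under-specified, while poem-bot.lock.yml (346.21 KB) still falls within the typical range.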

For Future Development

Consolidate Schedule Times: Scheduled workflows are spread across many cron expressions (five are listed in the schedule table, plus others). Consolidating to a few standard times (e.g., 9 AM, 10 AM) could simplify maintenance and reduce confusion.
Standardize Safe Output Combinations: Some safe output combinations appear frequently (e.g., create_discussion + missing_tool). Consider creating standard configuration presets.
Engine Abstraction: With 67.6% using Copilot and 24.3% using Claude, ensure workflows remain engine-agnostic or clearly document engine requirements.

Methodology

Analysis Tool: Bash scripts with text processing (grep, awk, sed)
Lock Files Analyzed: 37 files in .github/workflows/*.lock.yml
Cache Memory: Used /tmp/gh-aw/cache-memory/ for script persistence and data storage
Data Sources:
- File system (ls, wc) for size analysis
- YAML structure parsing for triggers, jobs, steps
- Pattern matching for MCP servers, engines, permissions
Date: October 19, 2025
Repository: githubnext/gh-aw
Commit: e08d839
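
For readers who prefer not to use shell tools, the file-size part of this methodology can be approximated in Python. This is a sketch under assumptions, not the preserved analysis script: the function names are hypothetical, and 1 KB is taken as 1024 bytes, which is consistent with the report's own figures (7.21 MB across 37 files gives 199.54 KB only with 1024 KB per MB).

```python
from pathlib import Path
from statistics import mean


def lock_file_sizes_kb(repo_root):
    """Return {file name: size in KB} for every lock file in the repository."""
    workflows_dir = Path(repo_root) / ".github" / "workflows"
    return {
        path.name: path.stat().st_size / 1024  # assumes 1 KB = 1024 bytes
        for path in sorted(workflows_dir.glob("*.lock.yml"))
    }


def summarize(sizes_kb):
    """Compute the headline numbers of the executive summary."""
    if not sizes_kb:
        raise ValueError("no lock files found")
    values = list(sizes_kb.values())
    return {
        "count": len(values),
        "total_mb": sum(values) / 1024,
        "average_kb": mean(values),
        "smallest": min(sizes_kb, key=sizes_kb.get),
        "largest": max(sizes_kb, key=sizes_kb.get),
    }
```

Calling `summarize(lock_file_sizes_kb("."))` from a checkout of the repository would produce the count, total size, average, and the smallest and largest file names in one dictionary.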

Analysis Scripts

All analysis scripts have been preserved in /tmp/gh-aw/cache-memory/scripts/ for reproducibility and future analyses.

Data Files Generated

file_sizes_bytes.txt - Raw file size data
trigger_analysis.txt - Trigger combinations per workflow
safe_outputs.txt - Safe output configurations
structure.txt - Job and step counts
permissions.txt - Permission configurations
mcp_servers.txt - MCP server usage
engines.txt - Engine type detection
timeouts.txt - Timeout settings
schedules.txt - Cron schedule patterns

Generated by Lockfile Statistics Analysis Agent on 2025-10-19
Analysis Time: ~5 minutes
Workflows Analyzed: 37
Total Data Processed: 7.21 MB


github-actions[bot] (bot) · 2025-11-28T16:18:47Z

This discussion was automatically closed because it was created by an agentic workflow more than 1 month ago.
