📊 Agentic Workflow Lock File Statistics - November 25, 2025 #4720
Closed
Replies: 0 comments
This comprehensive analysis examines 86 lock files across the gh-aw repository, revealing key patterns in workflow structure, trigger usage, safe outputs, and architectural decisions.
Executive Summary
The gh-aw repository contains a mature collection of agentic workflows with strong patterns around automation, safe outputs, and structured permissions. Key findings include widespread adoption of workflow_dispatch for manual control, comprehensive use of GitHub MCP servers, and consistent safe output patterns for creating discussions and issues.
Key Highlights:
Full Statistical Analysis
File Size Distribution
Lock files in this repository are substantial, reflecting the comprehensive automation and safety features built into each workflow.
File Size Statistics:
Smallest: .github/workflows/shared/mcp/arxiv.lock.yml (81 KB)
Largest: .github/workflows/poem-bot.lock.yml (433 KB)
Analysis: The uniformly large file sizes (94% over 100 KB) indicate these are production-ready workflows with extensive safety checks, comprehensive permissions handling, and detailed safe output processing. The smallest files are MCP configuration templates, while the largest contains complex multi-step automations.
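A size summary like the one above could be produced along the following lines. This is a minimal sketch, not the agent's actual script: the helper names `collect_sizes` and `summarize_sizes` are hypothetical, and it assumes "KB" means 1024 bytes and that lock files are matched by the `*.lock.yml` suffix.

```python
from pathlib import Path
from statistics import mean, median

KB = 1024  # assumption: the report's "KB" is 1024 bytes


def collect_sizes(root: str = ".github/workflows") -> dict[str, int]:
    """Map every *.lock.yml file under root (recursively) to its size in bytes."""
    return {str(p): p.stat().st_size for p in Path(root).rglob("*.lock.yml")}


def summarize_sizes(sizes: dict[str, int], threshold_kb: int = 100) -> dict:
    """Summarize a non-empty path -> size-in-bytes mapping."""
    smallest = min(sizes, key=sizes.get)
    largest = max(sizes, key=sizes.get)
    over = sum(1 for size in sizes.values() if size > threshold_kb * KB)
    return {
        "count": len(sizes),
        "smallest": (smallest, sizes[smallest] // KB),
        "largest": (largest, sizes[largest] // KB),
        "mean_kb": round(mean(sizes.values()) / KB, 1),
        "median_kb": round(median(sizes.values()) / KB, 1),
        "pct_over_threshold": round(100 * over / len(sizes), 1),
    }
```

Calling `summarize_sizes(collect_sizes())` from the repository root would yield the smallest and largest files together with the share of files above the 100 KB threshold.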
Trigger Analysis
Most Popular Triggers
Workflows in this repository favor flexibility with manual triggering while maintaining automated schedules for regular operations.
workflow_dispatch
schedule
issue_comment
issues
pull_request
pull_request_review_comment
discussion_comment
discussion
workflow_run
push
release
workflow_call
Common Trigger Combinations
The most common pattern combines scheduled automation with manual override capability:
schedule + workflow_dispatch: 42 workflows (48.8%)
pull_request + schedule + workflow_dispatch: 3 workflows
discussion + discussion_comment + issue_comment + issues + pull_request + pull_request_review_comment: 3 workflows
Schedule Patterns
Workflows predominantly run during business hours with a preference for morning execution:
0 9 * * *
0 13 * * 1-5
0 0,6,12,18 * * *
0 9 * * 1-5
0 8 * * *
0 10 * * *
0 0 * * *
Analysis: Strong preference for morning execution (8-10 AM UTC) and weekday-only runs suggests these workflows generate reports and insights meant for human review during business hours.
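The trigger and schedule tallies in this section could be reproduced roughly as follows. This is a hedged sketch: the function names are hypothetical, the input is assumed to be a workflow already parsed into a Python mapping, and the cron helper only understands plain numbers and comma lists in the hour field, which is enough for the schedules listed above.

```python
from collections import Counter


def trigger_names(workflow: dict) -> tuple[str, ...]:
    """Return the sorted trigger names of a parsed workflow mapping.

    YAML 1.1 loaders such as PyYAML read a bare `on` key as the boolean
    True, so both spellings are checked.
    """
    on = workflow.get("on", workflow.get(True, {}))
    if isinstance(on, str):      # short form, e.g. `on: push`
        return (on,)
    return tuple(sorted(on))     # list form or mapping form


def combo_counts(workflows: list[dict]) -> Counter:
    """Count how many workflows share each trigger combination."""
    return Counter("+".join(trigger_names(w)) for w in workflows)


def cron_profile(expr: str) -> dict:
    """Extract the hours and a weekday-only flag from a 5-field cron string."""
    fields = expr.split()
    hour, weekday = fields[1], fields[4]
    hours = [] if hour == "*" else [int(h) for h in hour.split(",")]
    return {"hours_utc": hours, "weekdays_only": weekday == "1-5"}
```

GitHub Actions evaluates `schedule` cron expressions in UTC, which is why the hours are labelled `hours_utc`.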
Safe Outputs Analysis
Safe outputs are the primary mechanism for workflows to communicate results, with a clear preference for creating discussions and adding comments.
Safe Output Types Distribution
create-discussion
add-comment
create-issue
create-pull-request
create-pull-request-review-comment
update-issue
Key Insights:
Discussion Categories
Discussion categories are programmatically determined in most workflows using expressions like ${discussionCategories[0].name}, indicating dynamic category selection based on repository configuration. This pattern appears in 120+ instances across workflows.
Observed Pattern: Workflows query available discussion categories at runtime and select appropriate categories (typically "audits", "reports", or "insights") based on the workflow's purpose.
Structural Characteristics
Job Complexity
Lock files contain sophisticated multi-job workflows with substantial step counts:
.github/workflows/poem-bot.lock.yml
Analysis: The high step count reflects the comprehensive nature of agentic workflows, which include:
Average Lock File Structure
Based on statistical analysis, a typical gh-aw lock file has:
Triggers: workflow_dispatch + schedule
Permissions: contents: read, pull-requests: read, issues: read
Permission Patterns
Workflows follow a least-privilege security model with careful permission scoping:
Most Common Permissions
group
contents
pull-requests
issues
actions
discussions
cancel-in-progress
security-events
repository-projects
Security Analysis:
write-all or overly broad permissions
Permission Distribution Categories
Tool & MCP Patterns
Most Used MCP Servers
The GitHub MCP server dominates, with emerging adoption of specialized servers:
github
playwright
deepwiki
arxiv
context
tavily
microsoftdocs
markitdown
ast-grep
Insights:
Common Tool Configurations
Based on observed patterns:
WebFetch: ~60% of workflows
WebSearch: ~40% of workflows
Timeout Configuration
Workflows are configured with conservative timeouts to prevent runaway executions:
Analysis: The 13-minute average timeout balances execution time with cost control. Shorter timeouts (5 min) are used for quick checks and test jobs, while longer timeouts (30-45 min) accommodate complex analyses like code metrics or security scans.
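A timeout average of this kind could be derived by scanning lock file text for `timeout-minutes` entries. The sketch below is an assumption about how such a tally might be made, not the agent's actual method; `timeout_stats` is a hypothetical helper, and values written as expressions rather than integers are deliberately skipped.

```python
import re
from statistics import mean

# Matches `timeout-minutes: <integer>` at job or step level. Values given
# as ${{ ... }} expressions do not match this simple pattern.
TIMEOUT_RE = re.compile(r"^\s*timeout-minutes:\s*(\d+)\s*$", re.MULTILINE)


def timeout_stats(lock_file_texts: list[str]) -> dict:
    """Summarize all integer timeout-minutes values found in the given texts."""
    values = [
        int(match)
        for text in lock_file_texts
        for match in TIMEOUT_RE.findall(text)
    ]
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 1),
    }
```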
Concurrency Patterns
Workflows use sophisticated concurrency control to prevent race conditions:
Most Common Concurrency Groups
gh-aw-${{ github.workflow }}
gh-aw-${{ github.workflow }}-${{ github.event.issue.number || github.event.pull_request.number }}
gh-aw-${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
gh-aw-${{ github.workflow }}-${{ github.event.issue.number }}
Pattern Analysis:
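To make the grouping behaviour concrete, the following Python sketch mimics how the `||` fallbacks in the group expressions above resolve to a final group string. It is an illustration only: the function names are hypothetical, and it models just two rules of the Actions expression syntax, namely that `||` yields the first truthy operand and that a null value interpolates as an empty string.

```python
def first_truthy(*operands):
    """Return the first truthy operand, like a chain of `||` operators."""
    for value in operands:
        if value:
            return value
    return operands[-1]  # all operands falsy: the last one is the result


def render(value) -> str:
    """Interpolate a value into a string; null becomes the empty string."""
    return "" if value is None else str(value)


def group_for_issue_or_pr(workflow: str, issue_number=None, pr_number=None) -> str:
    """Model of gh-aw-<workflow>-<issue number || pull request number>."""
    return f"gh-aw-{workflow}-{render(first_truthy(issue_number, pr_number))}"


def group_for_pr_or_ref(workflow: str, pr_number=None, ref=None) -> str:
    """Model of gh-aw-<workflow>-<pull request number || ref>."""
    return f"gh-aw-{workflow}-{render(first_truthy(pr_number, ref))}"
```

Runs that resolve to the same string share one concurrency group, so the per-issue and per-PR suffixes keep unrelated issues or pull requests from blocking each other while still serializing runs on the same item.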
Interesting Findings
No Minimal Workflows: Every lock file is at least 81 KB, which suggests the gh-aw framework imposes a substantial baseline of safety and automation scaffolding on each agent workflow.
Poem Bot Complexity: The largest workflow (poem-bot.lock.yml at 433 KB, 100 steps) demonstrates the upper bounds of workflow complexity supported by the system.
High Manual Trigger Adoption: 79.8% of workflows support workflow_dispatch, indicating strong emphasis on manual control and testing capability alongside automation.
Multi-Output Strategy: Workflows average 2.2 safe output types, showing sophisticated communication patterns (e.g., create discussion + add comments + create issues).
Test Infrastructure: Dedicated test workflows in the .github/workflows/tests/ directory maintain smaller sizes (81-98 KB) and serve as minimal viable examples.
Playwright Integration: With 210 references, the Playwright MCP server shows strong adoption for web automation, suggesting many workflows perform browser-based analysis or testing.
Business Hours Scheduling: Strong clustering of schedules around 8-10 AM UTC on weekdays indicates workflows generate human-actionable insights rather than pure automation.
GitHub-Centric: With 3,245 GitHub MCP server references across 86 workflows, the average workflow contains ~38 references to GitHub MCP tools, showing deep repository integration.
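Per-server reference counts and the per-workflow average can be sketched as below. This is an illustrative reconstruction, not the agent's script: it assumes tool references follow an `mcp__<server>__<tool>` naming shape with no double underscore inside the server name, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Assumption: references look like mcp__<server>__<tool>; the server name
# may contain letters, digits, hyphens, and single underscores.
MCP_REF_RE = re.compile(r"mcp__([A-Za-z0-9-]+(?:_[A-Za-z0-9-]+)*)__")


def count_mcp_references(texts: list[str]) -> Counter:
    """Count references per MCP server name across lock file texts."""
    counts = Counter()
    for text in texts:
        counts.update(MCP_REF_RE.findall(text))
    return counts


def average_per_workflow(total_refs: int, workflow_count: int) -> float:
    """Average number of references per workflow."""
    return total_refs / workflow_count
```

With the report's figures, `average_per_workflow(3245, 86)` is about 37.7, which matches the rounded ~38 quoted in the GitHub-Centric finding.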
Historical Trends
This is the baseline analysis for the lockfile statistics agent. Future runs will track changes over time including:
Recommendations
Based on this analysis, here are recommendations for the gh-aw project:
For Workflow Authors
schedule + workflow_dispatch trigger combination (used by 48.8% of workflows) for flexibility
For the Platform
For Performance
For Security
write-all or overly broad permissions
Methodology
Data Collection
.lock.yml files in .github/workflows/ directory and subdirectories
Analysis Tools
grep, awk, sed, PyYAML, glob
Cache Memory
Analysis scripts and data stored in /tmp/gh-aw/cache-memory/:
scripts/analyze_lockfiles.sh: Comprehensive bash analysis script
scripts/extract_triggers.py: Python trigger analysis
scripts/detailed_analysis.py: Full workflow parser
scripts/safe_outputs_detail.py: Safe output extraction
Data Accuracy
on: sections
permissions: sections
mcp__ prefixes
Generated by Lockfile Statistics Analysis Agent on 2025-11-25T03:28:00Z