Executive Summary
gh-aw-firewall is already one of the most agentically mature repositories I've analyzed, with 27 agentic workflows spanning security, testing, documentation, CI/CD diagnostics, and release automation. However, three high-impact gaps stand out: a missing issue triage agent (the most foundational automation), an uncompiled build-test-node workflow (the repo's primary language!), and a "Firewall Escape Test Agent" that is referenced in security-review.md but doesn't exist yet, which is a uniquely powerful opportunity for this security-critical tool.
Patterns Learned from Pelis Agent Factory
After crawling the Pelis Agent Factory documentation and the githubnext/agentics reference repository, these patterns stand out:
| Pattern | Description | This Repo |
|---|---|---|
| Issue Triage | "Hello world" agent: labels + comments on new issues | Missing |
| Continuous Simplicity | Daily agent simplifying recently-modified code | Missing |
| Breaking Change Checker | Catches backward-incompatible changes in PRs | Missing |
| CI Doctor | Investigates failed workflows, proposes fixes | Present |
| Workflow Health Manager | Meta-agent monitoring all other agents | Missing |
| Security Compliance | CVE deadline tracking, secrets scanning | Present |
| Changeset/Version Automation | Auto version bumps + changelog generation | |
| Schema Consistency | Detects drift between types, docs, and code | Missing |
| Mergefest | Keeps long-lived PRs in sync with main | Missing |
| Slash Command ChatOps | /plan, /ask etc. interactive workflows | /plan only |
The Pelis Factory's highest-ROI workflows (by merge rates) are: CLI Consistency Checker (78%), Code Simplifier (83%), CI Doctor (69%), and Issue Triage. This repo already has the consistency checker and CI Doctor; adding triage and a simplifier would complete the top-tier set.
Current Agentic Workflow Inventory
| Workflow | Purpose | Trigger | Assessment |
|---|---|---|---|
| build-test-bun | Test Bun runtime compatibility | PR | Good: compiled |
| build-test-cpp | Test C++ runtime compatibility | PR | Good: compiled |
| build-test-deno | Test Deno runtime compatibility | PR | Good: compiled |
| build-test-dotnet | Test .NET runtime compatibility | PR | Good: compiled |
| build-test-go | Test Go runtime compatibility | PR | Good: compiled |
| build-test-java | Test Java runtime compatibility | PR | Good: compiled |
| build-test-node | Test Node.js runtime compatibility | PR | |
| build-test-rust | Test Rust runtime compatibility | PR | Good: compiled |
| ci-cd-gaps-assessment | Daily CI/CD gap analysis → discussion | Daily | Good: creating useful discussions |
| ci-doctor | Investigates workflow failures | workflow_run | Excellent: high-impact pattern |
| cli-flag-consistency-checker | Weekly CLI flag/doc sync check | Weekly | Good: relevant to CLI tool |
| dependency-security-monitor | Daily CVE monitoring + dep update PRs | Daily | Excellent: security-critical |
| doc-maintainer | Daily doc sync with code changes | Daily | Good: skip-if-match is well-configured |
| issue-duplication-detector | Detects duplicate issues | On issue open | Good: helpful for maintenance |
| issue-monster | Assigns issues to Copilot SWE agent | Hourly + On issue | Good: task dispatcher pattern |
| pelis-agent-factory-advisor | This workflow! | Daily | Meta-learning pattern |
| plan | /plan slash command | Slash command | Good ChatOps foundation |
| secret-digger-claude | Red-team: find secrets in container | Hourly | Excellent: domain-specific security test |
| secret-digger-codex | Red-team: find secrets in container | Hourly | Excellent: multi-engine comparison |
| secret-digger-copilot | Red-team: find secrets in container | Hourly | Excellent: multi-engine comparison |
| security-guard | PR security review (Claude) | PR | Excellent: Claude for security review |
| security-review | Daily threat modeling + analysis | Daily | Good, but references missing workflow |
| smoke-chroot | Smoke test chroot mode | PR + Reaction | Good |
| smoke-claude | Smoke test Claude engine | PR + Schedule | Good |
| smoke-codex | Smoke test Codex engine | PR + Schedule | Good |
| smoke-copilot | Smoke test Copilot engine | PR + Schedule | Good |
| test-coverage-improver | Weekly test coverage improvement PRs | Weekly | Good: security-focused |
| update-release-notes | Enhances release notes on publish | On release | Good |
Actionable Recommendations
P0: Implement Immediately
🔴 P0.1: Issue Triage Agent
What: Auto-label and comment on new issues when they're opened.
Why: You have 20+ open issues with no labels and inconsistent categorization. This is the "hello world" of agentic workflows (per Pelis Factory docs) and is the highest-ROI addition. Currently, every issue requires manual triage, and that friction slows maintainers and first-time contributors.
How: Add a triage agent triggered on issues: [opened, reopened]. Use add-labels safe output with your label taxonomy: bug, enhancement, security, documentation, question, integration, performance, ci-cd.
Effort: Low. A template exists and just needs customization for AWF's domain.
---
on:
  issues:
    types: [opened, reopened]
lockdown: false
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, enhancement, security, documentation, question, integration, performance, ci-cd, good-first-issue]
  add-comment: {}
timeout-minutes: 5
---
# Issue Triage Agent
Analyze new issues in ${{ github.repository }} and apply the most appropriate label.
Research the codebase context (iptables, Squid config, Docker, MCP, domains, CI).
Comment to explain the label and suggest next steps.
Add with: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/issue-triage-agent.md
🔴 P0.2: Compile the build-test-node Workflow
What: build-test-node.md is the only uncompiled workflow. Node.js is the primary language of this project, yet the test workflow for it doesn't actually run.
Why: This is a critical gap: the main language's build test is silently broken/uncompiled.
How: Run gh aw compile .github/workflows/build-test-node.md and apply post-processing per AGENTS.md instructions.
Effort: Very Low (run one command).
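A gap like this could also be caught mechanically. The sketch below is only an illustration: it assumes that compiling `<name>.md` produces `<name>.lock.yml` in the same directory, and the helper name `findUncompiled` and the sample listing are hypothetical, not taken from the repository.

```typescript
// Hypothetical helper: given the file names found in .github/workflows,
// return the workflow sources (.md) that have no compiled counterpart.
// Assumption: `gh aw compile` writes `<name>.lock.yml` next to `<name>.md`.
function findUncompiled(fileNames: string[]): string[] {
  return fileNames
    .filter((file) => /\.md$/.test(file))
    .filter((file) => fileNames.indexOf(file.replace(/\.md$/, ".lock.yml")) === -1)
    .sort();
}

// Illustrative listing only; the real directory contents may differ.
const sampleListing = [
  "build-test-bun.md",
  "build-test-bun.lock.yml",
  "build-test-node.md",
  "ci-doctor.md",
  "ci-doctor.lock.yml",
];

console.log(findUncompiled(sampleListing));
```

A check of this kind could run in CI or inside a health-monitoring agent, so that a missing compiled file fails loudly instead of silently skipping the test.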
P1: Plan for Near-Term
🟠 P1.1: Firewall Escape Test Agent
What: A dedicated red-team agent that systematically attempts to bypass the AWF firewall (trying DNS tunneling, direct IP connections, HTTPS to blocked domains, etc.) and reports results.
Why: security-review.md already references this agent ("use agentic-workflows tool to check recent runs of the 'Firewall Escape Test Agent'"), but it doesn't exist. This creates a gap in the security review. More importantly, for a repository whose entire purpose is a firewall, having an agent continuously red-teaming it is uniquely valuable and domain-specific. The secret-diggers serve a related purpose but focus on credential leakage, not network escape.
How: Create a scheduled workflow that spins up the AWF container with a test command and verifies that blocked domains are unreachable, DNS is restricted, and iptables rules are enforced. Post results as a discussion.
Effort: Medium. Requires Docker test infrastructure, but the smoke tests show the pattern exists.
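One way to picture the reporting half of such an agent is as a pure evaluation step over probe outcomes. The sketch below is hypothetical: the `ProbeResult` shape, the probe names, and `evaluateProbes` are invented for illustration, and it deliberately leaves out how probes are actually executed inside the container.

```typescript
// Hypothetical outcome of one connection attempt made from inside the firewall.
interface ProbeResult {
  probe: string;            // illustrative label, e.g. "direct-ip"
  shouldBeBlocked: boolean; // what the firewall policy says
  succeeded: boolean;       // true if the attempt actually got through
}

interface EscapeReport {
  escapes: string[];     // traffic that got through but should not have
  falseBlocks: string[]; // allowed traffic that was blocked anyway
  passed: boolean;
}

// Compare observed behaviour with the expected policy for every probe.
function evaluateProbes(results: ProbeResult[]): EscapeReport {
  const escapes = results
    .filter((r) => r.shouldBeBlocked && r.succeeded)
    .map((r) => r.probe);
  const falseBlocks = results
    .filter((r) => !r.shouldBeBlocked && !r.succeeded)
    .map((r) => r.probe);
  return { escapes, falseBlocks, passed: escapes.length === 0 && falseBlocks.length === 0 };
}

// Illustrative data, not real test output.
const sampleReport = evaluateProbes([
  { probe: "https-allowed-domain", shouldBeBlocked: false, succeeded: true },
  { probe: "https-blocked-domain", shouldBeBlocked: true, succeeded: false },
  { probe: "direct-ip", shouldBeBlocked: true, succeeded: true },
]);
console.log(sampleReport);
```

Keeping the evaluation separate from probe execution would let the same report format be reused across whichever escape techniques the agent ends up trying.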
🟠 P1.2: Breaking Change Checker
What: A PR-triggered agent that detects backward-incompatible changes: CLI flag removals, API changes, config format changes, container interface changes.
Why: AWF is a CLI tool used by external agents and CI systems. Breaking changes to --allow-domains, --dns-servers, container behavior, or the action.yml interface can silently break downstream users. The CI Doctor catches CI failures but not semantic breaking changes. Given the active PR velocity (PR#1079 changes docker cp behavior, PR#1066 adds rate limiting), this is timely.
How: PR-triggered agent reviewing changes to src/cli.ts, action.yml, src/types.ts, container entrypoints. Creates an issue/comment if breaking changes are detected.
Effort: Low-Medium.
Add with: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/breaking-change-checker.md
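The simplest signal such a checker could look for is a long option that exists on the base branch but not on the PR branch. The following is a minimal sketch under that assumption; `extractFlags` and `removedFlags` are hypothetical helpers, and the regular expression is a rough heuristic rather than a parser for src/cli.ts.

```typescript
// Collect long option names such as "--allow-domains" from help text or source.
// Heuristic only: it matches any "--word-like" token it sees.
function extractFlags(text: string): string[] {
  const found: string[] = text.match(/--[a-z][a-z0-9-]*/g) || [];
  // De-duplicate while keeping first-seen order.
  return found.filter((flag, i) => found.indexOf(flag) === i);
}

// Flags present in the base text but missing from the head text.
function removedFlags(baseText: string, headText: string): string[] {
  const head = extractFlags(headText);
  return extractFlags(baseText)
    .filter((flag) => head.indexOf(flag) === -1)
    .sort();
}

// Illustrative inputs; the flag descriptions are made up.
const baseHelp = "--allow-domains <list>  --dns-servers <list>";
const headHelp = "--allow-domains <list>";
console.log(removedFlags(baseHelp, headHelp));
```

A removed flag is only one kind of breaking change; renamed defaults or changed container behavior would still need the agent's judgment.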
🟠 P1.3: Workflow Health Manager
What: A meta-agent that periodically reviews the health of all 27+ agentic workflows, checking for failed runs, stale outputs, uncompiled workflows, misconfigured triggers, and patterns that indicate an agent is not producing value.
Why: With 27 workflows, manual monitoring is impractical. The CI Doctor handles CI workflow failures, but not agentic workflow quality. Looking at open issues, there are already failures like [agentics] Issue Duplication Detector failed and [agentics] Secret Digger (Claude) failed; a health manager would detect these patterns and create consolidated health reports.
How: Weekly schedule, uses agentic-workflows tool + GitHub API to audit all workflows and post a health summary discussion.
Effort: Low-Medium.
Add with: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/workflow-health-manager.md
P2: Consider for Roadmap
🟡 P2.1: Code Simplifier
What: Daily agent analyzing recently-modified code for simplification opportunities: extracting helpers, flattening nested ifs, removing redundancy.
Why: The Pelis Factory's Code Simplifier has an 83% merge rate. This repo has active AI-assisted development (multiple Claude/Copilot PRs per week), which benefits from a cleanup sweep after each sprint. TypeScript code like docker-manager.ts (1400+ lines) could particularly benefit.
Effort: Low (template available).
Add with: gh aw add-wizard https://github.com/githubnext/agentics/workflows/code-simplifier.md
🟡 P2.2: Changeset / Version Bump Automation
What: An agent that, after merging PRs, proposes version bump and CHANGELOG entries based on conventional commit types.
Why: update-release-notes exists but only runs on published releases. There's a gap between "PRs merge" and "release is created" where versioning decisions are manual. Given the 28 release workflow (conventional commits are enforced!), this is well-suited for automation.
Effort: Medium.
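Because conventional commits are enforced, the version-bump decision can in principle be derived from commit messages alone. The sketch below follows the usual Conventional Commits mapping (fix to patch, feat to minor, `!` or a BREAKING CHANGE footer to major); the function names are hypothetical, and this is not a description of how the repository's release workflow works today.

```typescript
type Bump = "none" | "patch" | "minor" | "major";

// Ordered from weakest to strongest so the index doubles as a rank.
const BUMP_ORDER: Bump[] = ["none", "patch", "minor", "major"];

// Classify a single commit message by its header and footers.
function classifyCommit(message: string): Bump {
  const header = message.split("\n")[0];
  const match = /^(\w+)(\([^)]*\))?(!)?:/.exec(header);
  if (!match) return "none";
  if (match[3] === "!" || /^BREAKING[ -]CHANGE:/m.test(message)) return "major";
  if (match[1] === "feat") return "minor";
  if (match[1] === "fix") return "patch";
  return "none";
}

// The proposed bump for a release is the strongest bump among its commits.
function proposeBump(messages: string[]): Bump {
  let best = 0;
  for (let i = 0; i < messages.length; i++) {
    best = Math.max(best, BUMP_ORDER.indexOf(classifyCommit(messages[i])));
  }
  return BUMP_ORDER[best];
}

// Illustrative commit subjects, not taken from the repository history.
console.log(proposeBump(["fix: handle empty domain list", "feat(cli): add option"]));
```

An agent built on this idea would still need to draft the CHANGELOG text itself; the mapping only settles which version component to increment.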
🟡 P2.3: Mergefest (PR Branch Sync Agent)
What: Automatically merges main into long-lived open PRs to prevent drift and surface conflicts early.
Why: Looking at open PRs, several are days/weeks old (PR#1003, #1059, etc.) and likely have merge conflicts or drift. This is pure ceremony that could be automated.
Effort: Low.
Add with: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/mergefest.md
🟡 P2.4: Schema/Type Consistency Checker
What: Detects when TypeScript interfaces, CLI flags, documentation, and action.yml inputs drift out of sync.
Why: WrapperConfig in src/types.ts must stay in sync with src/cli.ts options, action.yml inputs, and docs/ reference. The cli-flag-consistency-checker covers CLI↔docs drift, but not types↔CLI or action.yml↔docs drift.
Effort: Low.
Add with: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/schema-consistency-checker.md
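The core comparison behind such a checker can be sketched as a set difference across sources. In the example below, `findDrift`, the source labels, and the option names are all hypothetical, and it assumes option names have already been extracted and normalized to one spelling (for example, camelCase) before comparison.

```typescript
// Map from a source label (e.g. "types", "cli") to the option names found there.
type OptionSources = { [source: string]: string[] };

// For every option seen anywhere, list the sources that do not mention it.
// Options present in all sources are omitted from the result.
function findDrift(sources: OptionSources): { [option: string]: string[] } {
  const labels = Object.keys(sources);
  const drift: { [option: string]: string[] } = {};
  labels.forEach((label) => {
    sources[label].forEach((option) => {
      const missingFrom = labels.filter((other) => sources[other].indexOf(option) === -1);
      if (missingFrom.length > 0) {
        drift[option] = missingFrom;
      }
    });
  });
  return drift;
}

// Illustrative input; these option names are placeholders.
console.log(
  findDrift({
    types: ["allowDomains", "dnsServers"],
    cli: ["allowDomains"],
    action: ["allowDomains", "dnsServers"],
  })
);
```

Extracting the names from src/types.ts, src/cli.ts, action.yml, and docs/ is the harder part and is left out here on purpose.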
🟡 P2.5: Weekly Issue/PR Digest
What: A weekly summary discussion of repository activity (merged PRs, new issues, agent outputs, security findings) for maintainers who can't monitor every notification.
Why: With 27 workflows generating issues and discussions daily, important signals can get lost. A weekly digest aggregates the most important outputs.
Effort: Low.
Add with: gh aw add-wizard githubnext/agentics/weekly-issue-summary
P3: Future Ideas
⚪ P3.1: Contributor Onboarding Agent
What: Welcomes first-time contributors with context about the firewall architecture, where to start, and how to run tests.
Why: CONTRIBUTING.md exists but a personalized comment on a first PR improves contributor retention.
⚪ P3.2: Multi-Device Docs Site Tester
What: Uses Playwright to test docs-site/ on mobile, tablet, and desktop viewports.
Why: The docs site (docs-site/) has Astro/Starlight; a visual regression agent would catch layout issues before deploys. Pelis Factory's version had 100% merge rate.
⚪ P3.3: Performance Regression Monitor
What: Benchmarks container startup time and iptables rule application time on PRs, alerts if regressions >10%.
Why: AWF's value proposition includes fast startup. A perf regression caught before merge saves user pain.
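The 10% threshold mentioned above reduces to a small calculation. This is a minimal sketch with hypothetical function names and made-up timings; a real monitor would likely compare medians over several runs to reduce noise.

```typescript
// Relative change of the candidate measurement versus the baseline, in percent.
// Positive values mean the candidate is slower.
function regressionPercent(baselineMs: number, candidateMs: number): number {
  if (baselineMs <= 0) {
    throw new Error("baseline must be positive");
  }
  return ((candidateMs - baselineMs) / baselineMs) * 100;
}

// True when the slowdown exceeds the threshold (10% by default, per the text).
function isRegression(baselineMs: number, candidateMs: number, thresholdPercent = 10): boolean {
  return regressionPercent(baselineMs, candidateMs) > thresholdPercent;
}

// Illustrative timings only, not measured values.
console.log(isRegression(2000, 2100)); // 5% slower
console.log(isRegression(2000, 2500)); // 25% slower
```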
⚪ P3.4: Dependabot PR Bundler
What: Bundles multiple Dependabot PRs into one, reducing review fatigue from the current pattern of 5-10 Dependabot PRs at a time.
Why: Looking at open PRs, there are already 4 Dependabot PRs open simultaneously. Bundling reduces cognitive load.
Add with: gh aw add-wizard githubnext/agentics/dependabot-pr-bundler
Maturity Assessment
| Dimension | Score | Notes |
|---|---|---|
| Security Coverage | 5/5 | Secret diggers, security guard, security review, dependency monitor: excellent |
| Testing Automation | 4/5 | Build matrix + smoke tests + coverage improver; missing perf tests |
| Documentation Automation | 4/5 | Doc maintainer + CLI consistency; missing schema drift detection |
| Issue Management | 3/5 | Issue monster + deduplication; missing triage labeling |
| CI/CD Intelligence | 4/5 | CI doctor + gaps assessment; missing breaking change detection |
| Code Quality | 2/5 | No simplifier, no duplicate detector |
| Release Automation | 3/5 | Release notes update; missing changeset automation |
| Meta/Observability | 4/5 | Pelis advisor + plan command; missing health manager |
Current Level: 4/5 (Advanced)
This repository is in the top tier of agentic workflow adoption. Most critical infrastructure categories are covered with thoughtful, domain-specific workflows.
Target Level: 5/5 (Factory-Grade)
The gap to close: issue triage (the foundational missing piece), the firewall escape agent (unique domain opportunity), and workflow health management (to keep 30+ workflows sustainable).
Gap Analysis: The primary gaps are in inbound workflow (issue triage), domain-specific red-teaming (escape testing), and code quality maintenance (simplifier). The most impactful single addition is issue triage: it's foundational, low-effort, and affects every issue filed.
Comparison with Pelis Agent Factory Best Practices
What this repo does exceptionally well:
- Security-first agent design: Three secret-digger engines + dedicated security guard is beyond what most repos do
- Domain-specific workflows: Smoke tests across 4 engines (Claude, Codex, Copilot, chroot) is uniquely relevant
- Multi-engine red-teaming: Using Claude, Codex, AND Copilot for secret digging enables comparison and cross-validation
- Network isolation: Using network.allowed in workflow configs aligns with best practices
- skip-if-match: Well-configured to prevent duplicate agent work (doc-maintainer, test-coverage-improver)
What could improve:
- Issue triage: The most foundational Pelis Factory workflow, absent here
- Code quality loops: No simplifier or duplicate detector despite active AI-assisted development
- Causal chain completeness: Issue Monster creates work for SWE agents, but without triage, issues may be poorly categorized when assigned
- Compiled workflow gap: build-test-node is uncompiled; a workflow health manager would catch this automatically
Unique opportunities given the firewall/security domain:
- The Firewall Escape Test Agent has no equivalent in Pelis Factory; it would be a novel, domain-defining workflow
- Policy drift detection (changes that weaken security guarantees) is a uniquely valuable pattern for this repo
- Container image CVE scanning could feed directly into dependency-security-monitor
Notes for Future Runs
Stored in /tmp/gh-aw/cache-memory/pelis-advisor-notes.json
- First analysis run: 2026-02-27
- 27 workflows found (up from baseline)
- Critical unresolved: build-test-node uncompiled
- Critical reference: security-review.md references non-existent "Firewall Escape Test Agent"
- Top open issues include CI failures in Integration Tests and Dependency Vulnerability Audit
- Active PR development: Docker cp migration, API proxy enterprise support, rate limiting
- Track over time: Did issue triage get added? Did escape test agent get created?
Note: This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.
Tip: Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.
Generated by Pelis Agent Factory Advisor
- expires on Mar 6, 2026, 3:25 AM UTC