[Pelis Agent Factory Advisor] Pelis Agent Factory Advisor - Agentic Workflow Analysis (2026-02-21) #993

@github-actions

📊 Executive Summary

This repository is a highly mature agentic automation platform with 28 compiled agentic workflows covering CI investigation, security scanning, smoke testing, issue management, documentation, and dependency monitoring. It significantly exceeds most repositories in agentic workflow adoption. The primary gaps are meta-monitoring (no agent watching the agents), continuous code quality (no simplifier/refactorer), and automated release management (no changeset generator).


🎓 Patterns Learned from Pelis Agent Factory

Key Patterns Discovered

From crawling the Pelis Agent Factory documentation and the githubnext/agentics repository:

  1. Specialization over Monoliths — Small, focused agents outperform one giant agent. Each agent does one thing well.
  2. Continuous Code Quality — Daily agents that propose simplification/refactoring PRs compound into significant long-term improvements (96% merge rate on daily doc updater, 78% on code simplifier).
  3. Meta-Agents are Critical — The Workflow Health Manager and Metrics Collector/Audit Workflows provide observability across all agents. Running 28+ agents without a meta-monitor is like operating a factory without a foreman.
  4. Issue Triage as the "Hello World" — Auto-labeling issues is the simplest, highest-ROI workflow for any repo.
  5. Skip-If-Match Prevents Duplication — Using skip-if-match in workflow frontmatter prevents redundant PRs from accumulating.
  6. Cache Memory for Continuity — Enabling the cache-memory tool lets workflows build knowledge across runs (already used in issue-duplication-detector and security-review).
  7. Causal Chains — One agent creates issues or discussions, another agent (like Issue Monster) picks them up, and Copilot PRs follow. This multi-agent chain multiplies impact.
  8. Secret/Malicious Code Scanning — Daily automated red-team scanning is a security best practice already well-implemented here.

Comparison to gh-aw-firewall

| Pattern | gh-aw-firewall | Gap? |
|---|---|---|
| Issue triage/labeling | ❌ | Yes |
| CI failure investigation | ✅ (ci-doctor) | None |
| Secret scanning | ✅ (3 engines) | None |
| Security review on PRs | ✅ (security-guard) | None |
| Daily doc updates | ✅ (doc-maintainer) | None |
| Code simplifier | ❌ | Yes |
| Duplicate code detector | ❌ | Yes |
| Meta-agent / workflow health | ❌ | Yes |
| Metrics collector | ❌ | Yes |
| Changeset/changelog generator | ❌ | Yes |
| Breaking change checker | ❌ | Yes |
| Smoke testing | ✅ (5 engines) | None |
| Multi-engine build tests | ✅ (8 languages) | None |
| Issue Monster / auto-assign | ✅ | None |
| Dependency monitoring | ✅ | None |

📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
|---|---|---|---|
| build-test-{8 langs} | Verify firewall works for 8 language ecosystems | PR + manual | ✅ Excellent - unique domain value |
| ci-cd-gaps-assessment | Identify CI/CD gaps | Daily | ✅ Good - produces useful discussions |
| ci-doctor | Investigate CI failures | On workflow failure | ✅ Strong - covers 24 workflows |
| cli-flag-consistency-checker | Detect CLI doc drift | Weekly | ✅ Good - catches doc drift |
| dependency-security-monitor | Audit npm vulnerabilities + propose updates | Daily | ✅ Comprehensive |
| doc-maintainer | Sync docs with code changes | Daily | ✅ Good - uses skip-if-match |
| issue-duplication-detector | Detect duplicate issues on open | Issue opened | ✅ Good - uses cache-memory |
| issue-monster | Auto-assign issues to Copilot | Hourly + issue opened | ✅ Very sophisticated scoring |
| pelis-agent-factory-advisor | This advisor (meta-advisor) | Daily | ✅ Meta pattern |
| plan | Break issues into tasks via /plan | Slash command | ⚠️ NOT COMPILED |
| secret-digger-claude | Red-team secret search (Claude) | Hourly | ✅ Comprehensive |
| secret-digger-codex | Red-team secret search (Codex) | Hourly | ✅ Multi-engine approach |
| secret-digger-copilot | Red-team secret search (Copilot) | Hourly | ⚠️ Consider consolidating |
| security-guard | Security review on PRs (Claude) | PR opened/updated | ✅ Deep security checks |
| security-review | Daily threat modeling | Daily | ✅ Very thorough |
| smoke-chroot | Smoke test chroot mode | PR + reaction | ✅ Domain-specific |
| smoke-claude | Smoke test Claude engine | PR + every 12h | ✅ Good frequency |
| smoke-codex | Smoke test Codex engine | PR + every 12h | ✅ Good frequency |
| smoke-copilot | Smoke test Copilot engine | PR + every 12h | ✅ Good frequency |
| smoke-gemini | Smoke test Gemini engine | PR + every 12h | ✅ Newest addition |
| test-coverage-improver | Improve security-critical test coverage | Weekly | ✅ Good - uses skip-if-match |
| update-release-notes | Enhance release notes on publish | Release published | ✅ Automated release enhancement |

Compiled status issues:

  • plan.md — ❌ NOT compiled
  • build-test-java.md — ❌ NOT compiled

🚀 Actionable Recommendations

P0 - Implement Immediately

P0.1: Fix Uncompiled Workflows

What: plan.md and build-test-java.md are not compiled to .lock.yml, meaning they don't actually run.

Why: These workflows are defined but silent — any issues or PRs they would handle are silently ignored. The plan workflow in particular responds to user /plan commands.

How: Run `gh aw compile .github/workflows/plan.md .github/workflows/build-test-java.md && npx tsx scripts/ci/postprocess-smoke-workflows.ts`

Effort: Low — single command fix


P0.2: Issue Triage Agent (Auto-Labeling)

What: Add a lightweight issue triage workflow that auto-labels new issues as bug, enhancement, documentation, question, security, or help-wanted.

Why: The Issue Monster assigns issues to Copilot, but it relies on labels being present for scoring. Without triage, issues land unlabeled, reducing Issue Monster's effectiveness. This is the "hello world" of agentic workflows with immediate ROI.

How: Create .github/workflows/issue-triage.md triggered on issues: [opened, reopened]:

```markdown
---
timeout-minutes: 5
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
    lockdown: false
safe-outputs:
  add-labels:
    allowed: [bug, enhancement, documentation, question, security, help-wanted, good-first-issue]
  add-comment: {}
---
# Issue Triage Agent
Analyze new issues in ${{ github.repository }} and apply the most appropriate label.
Focus on the issue title and body. For security-related issues, apply the `security` label.
After labeling, comment explaining the label choice and a brief response to the reporter.
```

Effort: Low


P1 - Plan for Near-Term

P1.1: Workflow Health Manager (Meta-Monitoring)

What: A daily meta-agent that monitors all other workflows — checking for failures, degraded performance, high costs, and unresponsive agents.

Why: With 28 workflows, you have a significant automated system running. Without observability, you won't know when workflows silently fail, cost more than expected, or produce low-quality outputs. The Pelis Factory's Workflow Health Manager created 40 issues and had a 100% causal chain to fixes.

How: Create .github/workflows/workflow-health-manager.md using agentic-workflows tool to analyze recent runs:

```markdown
---
on:
  schedule: daily
tools:
  agentic-workflows:
  github:
    toolsets: [default, actions]
  cache-memory: true
safe-outputs:
  create-issue:
    title-prefix: "[Workflow Health] "
    labels: [workflow-health]
    max: 5
  add-comment: {}
---
# Workflow Health Manager
Analyze the health of all agentic workflows in ${{ github.repository }}.
Check for: repeated failures, high token costs, low-quality outputs, stalled workflows.
Create issues for workflows that need attention.
```

Effort: Medium


P1.2: Automatic Code Simplifier

What: Daily workflow that analyzes recently-modified TypeScript/shell files and proposes simplification PRs.

Why: As the codebase evolves (especially with AI-assisted development), code naturally becomes more complex than necessary. A daily simplifier catches this automatically, with ~83% merge rates in the Pelis Factory. For a security-critical firewall, simpler code = fewer bugs = better security.

How: Create .github/workflows/code-simplifier.md triggered daily, looking at commits from the last 3 days:

```markdown
---
on:
  schedule: daily
  skip-if-match:
    query: 'is:pr is:open in:title "[Simplify]"'
    max: 2
permissions:
  contents: read
tools:
  github:
    toolsets: [default]
  bash:
    - "git log:*"
    - "git diff:*"
safe-outputs:
  create-pull-request:
    title-prefix: "[Simplify] "
    draft: true
    labels: [refactoring, ai-generated]
---
# Automatic Code Simplifier
Analyze TypeScript source files modified in the last 3 days in ${{ github.repository }}.
Propose minimal simplifications: extract repeated patterns, use idiomatic constructs,
simplify boolean logic, reduce nesting. Focus on src/*.ts files.
Create ONE focused PR. Do NOT change behavior - only clarity.
```
Effort: Medium


P1.3: Changeset / Changelog Generator

What: A workflow that triggers on merged PRs to propose changelog entries and version bumps based on conventional commits.

Why: This repo uses conventional commits (enforced by commitlint) but changelog generation is manual. The Pelis Factory's Changeset workflow had a 78% merge rate, significantly reducing release friction. The existing update-release-notes workflow only runs on published releases — this would generate changelogs incrementally before release.

How: Create .github/workflows/changeset.md triggered on push to main or PR merge:

```markdown
---
on:
  push:
    branches: [main]
  schedule: weekly
  skip-if-match:
    query: 'is:pr is:open in:title "[Changeset]"'
    max: 1
tools:
  github:
    toolsets: [default]
  bash:
    - "git log:*"
    - "git tag:*"
safe-outputs:
  create-pull-request:
    title-prefix: "[Changeset] "
    labels: [release, ai-generated]
---
# Changeset Generator
Analyze commits since the last tag in ${{ github.repository }}.
Using conventional commits (feat, fix, docs, chore, etc.), determine the appropriate
semver bump (major/minor/patch), update CHANGELOG.md, and bump package.json version.
Follow the repo's conventional commit conventions.
```

Effort: Medium


P1.4: Breaking Change Checker

What: A PR-triggered workflow that analyzes whether the changes introduce backward-incompatible changes to the CLI, API, or container interface.

Why: AWF is used by external tools and in CI pipelines. Breaking changes (CLI flag renames, behavior changes) could silently break user workflows. This is particularly important for a firewall tool where misconfigurations have security implications.

How: Create .github/workflows/breaking-change-checker.md triggered on PRs:

```markdown
---
on:
  pull_request:
    paths: ["src/cli.ts", "src/types.ts", "containers/**", "action.yml"]
    types: [opened, synchronize]
permissions:
  contents: read
  pull-requests: read
tools:
  github:
    toolsets: [default]
  bash: true
safe-outputs:
  add-comment:
    max: 1
  create-issue:
    title-prefix: "[Breaking Change] "
    labels: [breaking-change]
    max: 1
---
# Breaking Change Checker
Analyze PR #${{ github.event.pull_request.number }} for backward-incompatible changes.
Check: CLI flag removals/renames, changed default behaviors, removed env vars,
changed Docker image interfaces, action.yml signature changes.
If breaking changes found, comment on the PR with details and create a tracking issue.
```

Effort: Low-Medium


P2 - Consider for Roadmap

P2.1: Metrics Collector / Workflow Audit

What: A weekly workflow that collects metrics across all agentic workflow runs: token costs, success rates, PR merge rates, and outputs per workflow.

Why: With 28+ agents, understanding which ones provide ROI vs. which are noisy is critical for tuning the portfolio. The Pelis Factory's Metrics Collector created 41 daily discussions helping identify underperforming agents.
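How (sketch): a weekly meta-workflow in the style of the Workflow Health Manager recommended in P1.1. The create-discussion safe output and the exact frontmatter keys are assumptions to verify against the gh-aw documentation before adopting:

```markdown
---
on:
  schedule: weekly
tools:
  agentic-workflows:
  github:
    toolsets: [default, actions]
  cache-memory: true
safe-outputs:
  create-discussion:
    title-prefix: "[Metrics] "
    max: 1
---
# Metrics Collector
Collect metrics for every agentic workflow run in ${{ github.repository }} over the past week:
token costs, success rates, PR merge rates, and outputs produced per workflow.
Compare against last week's figures stored in cache memory, then post a discussion
summarizing trends and flagging workflows whose cost-to-output ratio is degrading.
```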

Effort: Medium


P2.2: Container Image Freshness Monitor

What: A weekly workflow that checks if the base Docker images (ubuntu/squid:latest, ubuntu:22.04) have new security-relevant updates available.

Why: AWF's security depends partly on the security of its base images. If a critical CVE is fixed in ubuntu:22.04, the AWF containers need to be rebuilt. This is uniquely important for a security tool.
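How (sketch): a weekly checker in the same style as the other scheduled agents. The bash tool allowlist and the image name are illustrative assumptions, not the repo's actual configuration:

```markdown
---
on:
  schedule: weekly
permissions:
  contents: read
tools:
  github:
    toolsets: [default]
  bash:
    - "docker pull:*"
    - "docker inspect:*"
safe-outputs:
  create-issue:
    title-prefix: "[Image Freshness] "
    labels: [security, dependencies]
    max: 1
---
# Container Image Freshness Monitor
Check whether the base images used by the AWF containers (e.g. ubuntu:22.04)
have newer digests upstream, and whether the delta includes security fixes.
If a rebuild looks warranted, open an issue summarizing the relevant updates.
```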

Effort: Medium


P2.3: Consolidate Secret Diggers

What: Currently running 3 hourly secret-digger agents (Claude, Codex, Copilot). Consider consolidating to one agent per 6-hour rotation, or at minimum document the rationale for triple-engine coverage.

Why: Triple-engine coverage at 1-hour intervals means 3 secret scans every hour, which may be excessive and costly. The multi-engine approach is valuable for finding different types of secrets, but the frequency could be reduced.

How: Change schedules to stagger at 6h intervals rather than 1h intervals.
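For instance, assuming gh-aw's schedule trigger accepts standard GitHub Actions cron syntax, the three diggers could be staggered so that only one runs at any given time (the cron values below are illustrative, not the current configuration):

```yaml
# secret-digger-claude.md frontmatter (hypothetical)
on:
  schedule:
    - cron: "0 0,6,12,18 * *"   # Claude at 00:00, 06:00, 12:00, 18:00 UTC

# secret-digger-codex.md frontmatter (hypothetical)
on:
  schedule:
    - cron: "0 2,8,14,20 * *"   # Codex offset by two hours

# secret-digger-copilot.md frontmatter (hypothetical)
on:
  schedule:
    - cron: "0 4,10,16,22 * *"  # Copilot offset by four hours
```

This keeps the multi-engine diversity while cutting total scans from 72 per day to 12 per day.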

Effort: Low (config change only)


P2.4: Documentation Noob Tester

What: A bi-weekly workflow that reads the documentation as a "first-time user" and tests that the quick-start steps actually work in the AWF container environment.

Why: The doc-maintainer keeps docs in sync with code, but doesn't test if the documented commands actually work. Given the complexity of Docker + iptables + Squid, documentation accuracy is critical for user success.

Effort: Medium-High


P2.5: Add Missing Workflows to CI Doctor Watch List

What: The ci-doctor workflow monitors 24 workflows but is missing: smoke-gemini, test-coverage-improver, dependency-security-monitor, cli-flag-consistency-checker, and the build-test-bun/cpp/deno/dotnet/go/java/node/rust workflows.

Why: If these workflows fail, the CI Doctor won't investigate them automatically.
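Assuming ci-doctor's watch list is a standard workflow_run trigger (which matches workflows by display name, so the entries below should be checked against the actual workflow names), the fix is a frontmatter addition along these lines:

```yaml
on:
  workflow_run:
    workflows:
      # ...existing 24 entries...
      - "smoke-gemini"
      - "test-coverage-improver"
      - "dependency-security-monitor"
      - "cli-flag-consistency-checker"
      - "build-test-go"   # and the remaining build-test-* workflows
    types: [completed]
```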

Effort: Low (add to the workflows list in ci-doctor.md frontmatter)


P3 - Future Ideas

P3.1: Duplicate Code Detector

What: Use semantic analysis to identify duplicate patterns in TypeScript source. The AWF codebase has grown organically and likely has repeated patterns in docker-manager.ts (which is the largest file).

Effort: High (requires semantic tooling setup)


P3.2: Daily Malicious Code Scan

What: Daily review of recent commits for suspicious patterns (obfuscation, unexpected network calls, unusual file operations). Especially relevant for supply chain security of a firewall tool.

Effort: Medium — but note the security-review workflow already covers much of this ground.


P3.3: Firewall Config Regression Tester

What: A specialized workflow that validates iptables rule changes by checking that the expected allow/deny behavior is preserved. Could run the existing smoke tests with specific traffic patterns.

Why: Domain-specific to AWF — iptables rule changes can have subtle security implications that require integration testing beyond unit tests.
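If pursued, it could reuse the existing smoke-test harness. A rough sketch, with the trigger paths and tooling as assumptions:

```markdown
---
on:
  pull_request:
    paths: ["src/**", "containers/**"]
permissions:
  contents: read
tools:
  bash: true
safe-outputs:
  add-comment:
    max: 1
---
# Firewall Config Regression Tester
For PRs that change iptables rule generation, replay representative allow/deny
traffic patterns through the smoke-test harness and verify each expected verdict
is unchanged. Comment on the PR with any behavioral differences found.
```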

Effort: High


📈 Maturity Assessment

| Dimension | Score | Notes |
|---|---|---|
| Security Automation | 5/5 | Exceptional: 3x secret diggers, daily threat modeling, PR security review, dependency monitoring |
| CI/CD Quality | 4/5 | Strong: ci-doctor, smoke tests (5 engines), 8 build-test workflows. Missing: breaking change checker |
| Code Quality | 2/5 | Weak: no simplifier, no refactorer, no duplicate detector |
| Documentation | 4/5 | Good: daily doc-maintainer, CLI flag checker. Missing: noob tester |
| Issue Management | 5/5 | Excellent: issue-monster, duplication detector, plan command |
| Release Management | 3/5 | Partial: update-release-notes exists. Missing: changeset generator |
| Meta-Monitoring | 1/5 | Critical gap: no workflow health manager, no metrics collector |

Current Level: 4/5 — Advanced Factory

This repository is far ahead of typical repos in agentic adoption, with a particular strength in security and multi-engine testing. The main gaps are in code quality automation and meta-monitoring.

Target Level: 5/5 — Elite Factory

Gap Analysis: To reach elite level:

  1. Add meta-monitoring (Workflow Health Manager) — critical for operating 28+ agents sustainably
  2. Add issue triage for better Issue Monster effectiveness
  3. Add code simplifier for continuous quality improvement
  4. Fix uncompiled workflows (plan.md, build-test-java.md)

🔄 Comparison with Best Practices

What gh-aw-firewall Does Exceptionally Well

  1. Multi-Engine Testing: 5 smoke test engines + 8 language build-test agents is industry-leading
  2. Defense-in-Depth Security: 3 hourly secret diggers + daily security review + PR security guard is exceptionally thorough for a security tool
  3. Domain-Specific Intelligence: Security review, secret diggers, and smoke tests are highly customized to the firewall domain
  4. Cache Memory Usage: Issue duplication detector and security review use cache-memory for persistent state across runs
  5. Import System: Using shared/ imports for reusable fragments (mcp-pagination, secret-audit, version-reporting) is a best practice
  6. skip-if-match: Properly used in doc-maintainer, test-coverage-improver to prevent duplicate PRs

Areas for Improvement vs. Best Practices

  1. No Meta-Monitoring: Pelis recommends always monitoring your monitors. With 28 agents, a Workflow Health Manager is overdue.
  2. No Issue Triage: The foundational "hello world" workflow is missing, reducing Issue Monster's scoring accuracy.
  3. No Continuous Code Quality: Simplifier + refactorer workflows are table stakes in a mature factory.
  4. Uncompiled Workflows: plan.md and build-test-java.md are dead code — they should either be compiled or removed.
  5. Secret Digger Frequency: 3 engines × 1 hour = potentially excessive. Consider staggering to 6-hour intervals per engine.

📝 Notes saved to /tmp/gh-aw/cache-memory/pelis-advisor-notes.json for continuity across runs.

🤖 Generated by Pelis Agent Factory Advisor — tracking 28 agentic workflows as of 2026-02-21


Note: This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.

Tip: Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.

Generated by Pelis Agent Factory Advisor (expires on Feb 28, 2026, 3:20 AM UTC)
