feat(sdk): argus Python SDK with CLI, Docker execution, and CI dogfooding#88
eFAILution wants to merge 65 commits into main
…cture Research comprehensive analysis of current GitHub Actions coupling points and evaluate 5 architectural approaches for making Argus portable across GitHub, GitLab, Jenkins, local environments, and any CI/CD system. Recommends Python SDK + CLI + thin CI adapters architecture with incremental 4-phase migration strategy that preserves backward compatibility for existing GitHub Actions users.
Implement Phase 1 foundation for cross-platform portability (ADR-013):
- Severity enum with comparison operators and multi-format parsing
- Finding, ScanResult, ScanSummary dataclasses with to_dict()
- Scanner Protocol defining the contract for all scanner modules
- ArgusConfig loading from argus.yml via pyyaml
- ArgusEngine orchestrating scanner registration and execution
Pure Python + pyyaml, zero CI dependencies.
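As a rough illustration of the Severity and Finding models described above, here is a minimal sketch. The field names on Finding and the parse() helper are assumptions made for this example, not the SDK's actual definitions.

```python
from dataclasses import asdict, dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Ordered severity levels; IntEnum supplies the comparison operators."""

    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

    @classmethod
    def parse(cls, value):
        """Accept a Severity, an int, or a case-insensitive name (hypothetical helper)."""
        if isinstance(value, cls):
            return value
        if isinstance(value, int):
            return cls(value)
        return cls[str(value).strip().upper()]


@dataclass
class Finding:
    # Field names are illustrative assumptions.
    scanner: str
    rule_id: str
    severity: Severity
    location: str
    message: str

    def to_dict(self):
        """Serialize to plain types, with severity as a lowercase name."""
        data = asdict(self)
        data["severity"] = self.severity.name.lower()
        return data
```

Using IntEnum keeps threshold checks such as `finding.severity >= Severity.HIGH` trivial, at the cost of members also comparing equal to plain integers.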
Reporter modules consuming ScanSummary for platform-agnostic output:
- TerminalReporter: ASCII tables with severity counts to stdout
- MarkdownReporter: collapsible sections with emoji severity indicators
- SarifReporter: SARIF 2.1.0 with per-scanner runs and severity mapping
- JsonReporter: full summary serialization via to_dict()
All reporters use stdlib only (no rich, no colorama).
Scanner modules implementing the Scanner protocol, each with:
- scan() for tool invocation via subprocess
- parse_results() for standalone parsing with test fixtures
- is_available() and install_command() for toolchain detection
Ported scanners: bandit, clamav, trivy-iac, gitleaks, osv, checkov, opengrep, supply-chain (zizmor+actionlint), zap, container (trivy+grype+syft with CVE deduplication).
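The scanner contract above could look roughly like the following sketch. The method signatures, the ExampleScanner class, the `example-tool` binary, and its JSON output shape are all hypothetical and only illustrate how the four methods fit together.

```python
import json
import shutil
import subprocess
from typing import Protocol, runtime_checkable


@runtime_checkable
class Scanner(Protocol):
    """Assumed shape of the scanner contract (signatures are guesses)."""

    name: str

    def is_available(self) -> bool: ...
    def install_command(self) -> str: ...
    def scan(self, path: str, config: dict) -> list: ...
    def parse_results(self, raw: str) -> list: ...


class ExampleScanner:
    """Hypothetical scanner wrapping a made-up `example-tool` binary."""

    name = "example"
    binary = "example-tool"  # hypothetical tool name

    def is_available(self):
        return shutil.which(self.binary) is not None

    def install_command(self):
        return "pip install example-tool"  # hypothetical package

    def scan(self, path, config):
        # Invoke the tool, then hand its stdout to the standalone parser.
        proc = subprocess.run(
            [self.binary, "--json", path],
            capture_output=True, text=True, timeout=600,
        )
        return self.parse_results(proc.stdout)

    def parse_results(self, raw):
        # Parsing is separate from scan() so fixtures can be tested offline.
        return [
            {"rule_id": item["rule"], "location": f"{item['file']}:{item['line']}"}
            for item in json.loads(raw or "[]")
        ]
```

Keeping parse_results() independent of scan() is what lets the tests feed recorded fixture output through the parser without the tool installed.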
180 pytest tests covering the full argus SDK at 83%+ coverage:
- Core: models, config, engine, CLI argument parsing
- Scanners: all 10 scanners tested with existing fixtures
- Reporters: terminal, markdown, SARIF, JSON output validation
Also updates:
- pytest.ini: add argus/ to test paths, python path, coverage source
- .ai/architecture.yaml: document argus SDK structure per ADR-013
- argus.example.yml: example configuration file
Transparent container fallback when scanner tools aren't installed locally:
- ExecutionConfig with backend (auto/local/docker), registry override, and pull_policy (always/if-not-present/never)
- Engine resolves local vs container execution per scanner
- Central containers.py manifest mapping scanners to official author images
- All 10 scanners declare container_image and container_args()
- 8 use official images (aquasec/trivy, anchore/grype, etc.)
- 3 need custom builds (bandit, opengrep, supply-chain)
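The local-versus-container decision can be sketched as below. The default values, the function names, and the "prefix the image with the registry" interpretation of the registry override are assumptions for illustration; the real engine may resolve these differently.

```python
from dataclasses import dataclass


@dataclass
class ExecutionConfig:
    backend: str = "auto"               # auto | local | docker (default assumed)
    registry: str = ""                  # optional registry override
    pull_policy: str = "if-not-present"  # always | if-not-present | never


def resolve_backend(cfg, tool_installed):
    """Pick the backend for one scanner: explicit setting wins, auto prefers local."""
    if cfg.backend in ("local", "docker"):
        return cfg.backend
    return "local" if tool_installed else "docker"


def resolve_image(image, cfg):
    """Apply the registry override by prefixing the image (assumed semantics)."""
    if not cfg.registry:
        return image
    return f"{cfg.registry.rstrip('/')}/{image}"
```

With `backend: auto`, a developer machine that has bandit installed runs it natively, while a bare CI runner falls back to the container image for the same scanner.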
Custom Dockerfiles for scanners without official images:
- docker/Dockerfile.bandit (python:3.12-slim + bandit)
- docker/Dockerfile.opengrep (binary from GitHub releases)
- docker/Dockerfile.supply-chain (zizmor + actionlint combined)
- docker/Dockerfile.cli (all-in-one with every scanner tool)
Dependabot config updated with pip and docker ecosystems for automated dependency maintenance on container base images.
22 new tests covering:
- ExecutionConfig defaults and YAML parsing
- Engine Docker fallback logic (local preferred, container fallback)
- Image resolution with registry override
- Container manifest completeness (all scanners have images and args)
- portability-research.md: Phase 3 detailed design with official container table, execution flow, config options, dependency maintenance strategy, and air-gapped environment support
- ADR-014: Docker execution backend decision with alternatives
- architecture.yaml: updated with argus SDK structure
Remove job-level continue-on-error: true from all 23 scanner E2E test jobs. Scanner failures must propagate to fail the workflow. Also fixes ZAP and ClamAV job conditions to run on PRs, and adds a test-results gate job for branch protection.
- test-unit.yml: matrix strategy for Python 3.11, 3.12, 3.13
- test-unit.yml: version reference coverage check (runs once on 3.13)
- test-examples-functional.yml: minimum example count gate (>= 3)
package.json version field was not bumped by release-it, causing drift. Also adds .claude and .agents to SKIP_DIRS in check-version-refs.py to exclude worktree checkouts and template files from false positives.
… summary
- test_check_version_refs.py: 25 tests covering brace expansion, version ref detection, coverage checking, and release-it-ignore marker
- test_validate_action_schemas.py: parametrized pytest version of the standalone action schema validator (259 tests across all actions)
- test_security_summary.py: 10 integration tests for security-summary aggregation of scanner markdown files
- Delete actions-composite-example.yml (duplicate of composite-actions-example.yml)
- Fix README.md: remove "coming soon" for existing scanners, add missing scanner entries, remove non-existent scanner-trivy-container
- Fix granular-scanner-usage.yml: correct workflow path for codeql, complete opengrep job with inputs and permissions
_list_scanners() accessed engine.scanners (public) but ArgusEngine stores scanners as engine._scanners (private). Fixed to use the correct attribute name.
Renovate regex managers for dependencies Dependabot can't track:
- Container image tags in argus/containers.py (Docker datasource)
- Tool versions in action.yml scripts (GitHub releases datasource)
- Tool versions in custom Dockerfiles (GitHub releases datasource)
Grouped into container-images and tool-versions PRs. Complements existing Dependabot config for actions, npm, pip, Docker.
Argus scans itself: PRs test the commit, main tests the release.
- argus.yml: dogfood config enabling bandit, gitleaks, opengrep, clamav, osv, and supply-chain scanners against this repo
- security-scan.yml: installs argus SDK from current commit, runs argus scan, uploads SARIF to GitHub Security tab
Uses execution.backend: auto so locally-installed tools run natively and missing tools fall back to Docker containers.
YAML supports comments, making the config self-documenting. Same Renovate regex managers, same grouping rules — just readable.
Security-scan.yml now installs every tool enabled in argus.yml: bandit, gitleaks, opengrep, osv-scanner, zizmor, actionlint, clamav. Adds tool verification step that fails loudly if any scanner binary is missing — no silent skips in CI. Adds markdown output format for job summary.
…kflows BREAKING CHANGE: Remove 22 workflows replaced by argus Python SDK. Removed 15 scanner-*.yml thin wrappers, 6 compound orchestrators (reusable-security-hardening, container-scan, dependency-scan, infrastructure-scan, linting, container-scan-from-config), and security-reusable-demo. Refactored test-reusable-workflows.yml into argus SDK integration test. Removed 5 example workflows that referenced deleted workflows. Updated all documentation (README, QUICK-START, AGENTS, docs/, .ai/ files) to position the SDK as primary interface with composite actions as secondary for GitHub Actions users. Updated docsite.yml, zizmor.yml, bug report template, and agent skills to reflect the new architecture.
The argus engine handles tool execution: local if available, Docker container fallback otherwise. Workflows just run:
pip install pyyaml && python -m argus scan --config argus.yml
No more manual installs of bandit, gitleaks, opengrep, etc.
E2E Test Coverage Report
Summary
✅ All Actions Have E2E Coverage
🚀 Release Preview
📦 Version Update
Current:
📋 Changelog
1.0.0 (2026-04-12)
⚠ BREAKING CHANGES
Features
Bug Fixes
Documentation
Code Refactoring
Tests
🔍 Version Reference Coverage
✅ Version refs found: 312 across 108 files. All covered by release-it config.
✅ Actions that would be performed
This preview is generated by running
Last updated: 4/12/2026, 3:26:51 PM | Commit: ed67874
Codecov Report❌ Patch coverage is
You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool. What Enabling Code Scanning Means:
For more information about GitHub Code Scanning, check out the documentation. |
_build_workflows_nav() had hardcoded entries for deleted workflows (reusable-security-hardening, scanner-*.yml). Replaced with dynamic discovery that only includes workflows present on disk. Fixes mkdocs --strict failure from missing nav reference.
Both files were entirely oriented around the composite-action-first architecture. Updated to position the argus Python SDK as primary:
CLAUDE.md:
- Architecture section: SDK + composite actions (dual path)
- Scanner flow: SDK flow + composite action flow
- Adding a scanner: SDK module (preferred) + composite action
- Usage examples: SDK CLI first, actions second
- Test structure: added argus/tests/ (202 tests)
- Important files: added SDK files
CONTRIBUTING.md:
- New "Adding a Scanner via the SDK" section with full template
- Scanner protocol reference and registration guide
- Container image manifest instructions
- SDK test structure and commands
- Composite action guide preserved as secondary path
- Updated naming conventions for SDK modules
Scanner container_args() methods now read exclude paths, config files, framework options, etc. from the config dict, the same way _build_command() does for local execution. No hardcoded scanner-specific args.
Also fixes:
- bandit: remove hardcoded exclude, read from config.exclude
- clamav: add container_entrypoint for official image override
- osv: fix container image tag and CLI subcommand (scan source -r)
- engine: fallback to linux/amd64 on ARM pull failures
- engine: capture container stdout when no output files produced
- argus.yml: add exclude config for bandit dogfood
Removed separate argus container subcommand. Container scanning is now accessed via argus scan with lifecycle flags:
argus scan container --image nginx:latest
argus scan container --discover ./
argus scan container --discover docker/
argus scan container --config argus.yml
When scanner is 'container' and --discover or --image flags are present, the engine routes to the container lifecycle (discover, build, scan, deduplicate, report). Without those flags, it runs the container scanner module as a normal scanner.
CI workflow updated to use ContainerMarkdownReporter for the rich collapsible vulnerability report format.
Prevents resource exhaustion when scanning large or many containers:
- Pre-scan disk space check (2 GB minimum, 5 GB warning threshold)
- Aborts remaining scans if disk becomes critically low mid-run
- Removes built images after each scan via docker rmi (cleanup=true)
- Prunes dangling images after all scans complete
- Prunes Docker build cache if disk is low after final scan
- Only removes images that argus built (not pre-existing user images)
Sequential processing with per-container cleanup keeps disk usage bounded. For CI parallelism, use matrix strategy in the workflow.
Removed the 2GB minimum disk space gate that blocked scans preemptively. We can't predict what a scan needs: a 50MB Alpine image succeeds in 100MB free while a 2GB ML image needs more.
New approach: try every scan, catch failures individually.
- Disk space check is informational only (warns at < 2GB)
- Build failures check post-failure disk to hint at cause
- OSError caught separately for disk/permission failures
- Each container's failure is isolated and doesn't abort the rest
- Cleanup still runs between scans to keep usage bounded
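The "warn, then try every scan" approach might be sketched like this. The function names, the status strings, and the `scan_one` callback are hypothetical; only the 2 GB warning threshold and the separate OSError handling come from the commit message.

```python
import shutil

WARN_BYTES = 2 * 1024**3  # warn below 2 GB free, never block


def disk_warning(path=".", threshold=WARN_BYTES):
    """Return a warning string when free space is low, else None (informational only)."""
    free = shutil.disk_usage(path).free
    if free < threshold:
        return f"Low disk space: {free / 1024**3:.1f} GB free at {path}"
    return None


def scan_all(targets, scan_one):
    """Run scan_one on every target; one failure never aborts the rest."""
    results = {}
    for target in targets:
        try:
            results[target] = ("ok", scan_one(target))
        except OSError as exc:
            # Disk-full and permission problems surface as OSError.
            results[target] = ("os-error", str(exc))
        except Exception as exc:  # any other scanner or build failure
            results[target] = ("failed", str(exc))
    return results
```

Because each target is wrapped individually, a large image that runs out of disk is reported as its own failure while smaller images later in the list still get scanned.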
Trivy now scans remote images directly from the registry using --image-src remote when the image wasn't built locally. Grype does this automatically. No docker pull needed for registry images.
Resource impact:
- Remote scan: ~200MB (vuln DB + scan output)
- Local scan: full image size + ~200MB
Also adds:
- get_remote_image_size() via docker manifest inspect (no pull)
- is_image_local() to check Docker daemon
- 10-minute timeout on all scanner subprocess calls
- Stderr truncated to 500 chars in error logs
New argus scan zap command with full container lifecycle management:
argus scan zap --target http://localhost:3000 # scan running target
argus scan zap --image myapp:latest # auto-discover, start, scan, stop
argus scan zap --image myapp:latest --port 8080 # override port
argus scan zap --image myapp:latest --env DB=... # pass env vars
Auto-discovery workflow for --image:
1. docker inspect for ExposedPorts (no manual port config needed)
2. Create isolated argus-dast-{name} Docker network
3. Start container with port mapping (free port auto-selection)
4. Health probe with TCP + HTTP and exponential backoff
5. Run ZAP on shared network (reaches target by container name)
6. Stop and clean up container + network (always, even on failure)
Crash detection: if container exits within 3 seconds, reports as
startup failure rather than waiting for the full health timeout.
Architecture: argus/dast/ with inspect.py, runner.py, engine.py
matching the argus/container/ pattern.
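Steps 3 and 4 of the auto-discovery workflow (free port selection and a health probe with exponential backoff) could be sketched as follows. This covers only the TCP half of the probe, and the function names, delays, and timeouts are assumptions rather than the values used in argus/dast/.

```python
import socket
import time


def find_free_port():
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def wait_for_tcp(host, port, timeout=60.0, initial_delay=0.5, max_delay=8.0):
    """Poll until a TCP connect succeeds, doubling the wait between attempts."""
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        try:
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            pass  # not listening yet; retry after the backoff delay
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

Note that the port returned by find_free_port() can in principle be taken by another process before the container binds it, so a caller would want to retry the container start on a bind failure.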
…tles Each matrix job now writes just its per-container detail section via report_single() (named <details><summary>📦 scanner-bandit — N vulns), plus a JSON summary with severity counts. The combine step reads all JSON summaries to build: 1. Combined findings summary table (aggregated counts) 2. Container breakdown table (one row per image with severity columns) 3. Per-container detail sections (each named, distinguishable collapsed) Replaces the old approach where each matrix job produced a full report wrapped in an identical 🐳 Container Security Scan header.
Brings SDK coverage from 74.9% to 83%+ (338 total tests). New test files:
- test_container_discovery.py (21): Dockerfile discovery, config parsing
- test_container_scanner.py (18): severity counts, deduplication
- test_container_engine.py (14): orchestration with mocked Docker
- test_container_resources.py (12): disk space, image cleanup
- test_container_markdown.py (15): report generation, combined reports
- test_cli_container.py (18): routing, terminal/json/markdown output
- test_dast.py (12): dataclass properties, summary aggregation
- test_scanner_scan_methods.py (24): scan() and _build_command() for bandit, clamav, gitleaks, osv, checkov, opengrep
Every argus scan now produces a forensic evidence package:
argus-results/
├── argus.log              # Structured JSONL log (every phase, timestamped)
├── argus-audit.json       # Evidence manifest (provenance, config hash,
│                          # findings summary, artifact SHA-256 hashes)
├── argus-results.json     # Scan results
└── argus-results.sarif    # SARIF output
Architecture (inspired by user's log_to_text_file.py and log-collector):
- argus/audit/logger.py: colored console (ANSI) + JSONL file handler
- argus/audit/secrets.py: masks tokens, passwords, API keys in all output
- argus/audit/platform.py: auto-detects GitHub Actions, GitLab CI, Jenkins
- argus/audit/manifest.py: generates argus-audit.json with provenance, timing, config hash, findings summary, artifact integrity hashes
65 new tests (403 total SDK tests). argus-results/ added to .gitignore.
Every decision, subprocess call, and fallback is now logged:
- DEBUG: backend selection, Docker commands, pull policy, container exit codes, output file names, parse results, config values
- INFO: scanner start/complete with duration, image pulls, fallbacks
- WARNING: no output produced, image not found with pull_policy=never
- ERROR: scanner failures with elapsed time and stderr excerpts
Console shows INFO+ by default, DEBUG+ with --verbose. JSONL log file captures everything for forensic analysis.
Every container operation now captures and logs the immutable digest:
- After pull: "Pulled gitleaks:v8.30.1 in 4027ms (digest=sha256:c00b...)"
- Before scan: "Running 'gitleaks' in container (digest=sha256:c00b...)"
- In results: ScanResult.metadata["digest"] = "sha256:c00b..."
- In manifest: container_images.gitleaks.digest = "sha256:c00b..."
Tags can be re-pushed in a supply chain attack. Digests cannot. The audit manifest proves exactly which image binary executed, enabling post-incident verification against known-good digests.
CLI help improvements:
- argus scan container (no flags) → shows container-specific usage with examples instead of silently failing
- argus scan zap (no flags) → shows DAST-specific usage with examples
- Each scanner lifecycle mode has clear required/optional flags
CLI doc generator (scripts/gen_cli_docs.py):
- Introspects argparse parser tree to generate markdown docs
- Walks subcommands, argument groups, options, choices, defaults
- Produces docs/cli-reference.md for the docsite
- Auto-generated: the CLI is the documentation source
Run: python scripts/gen_cli_docs.py --output docs/cli-reference.md
argus scan gitleaks --help now shows gitleaks-specific information:
- Scanner description (from class docstring)
- Tool installation status (installed locally or not)
- Install command (if not installed)
- Container image reference
- Usage examples specific to that scanner
- Common options
Unknown scanner names show the list of available scanners. Implemented by intercepting --help in main() before argparse exits, then introspecting the scanner module from SCANNER_REGISTRY.
argus scan now validates the scanner name against the registry before running. Prevents silent no-op scans from typos. Uses stdlib difflib.get_close_matches for suggestions:
argus scan gitleak → "Did you mean 'gitleaks'?"
argus scan bandi → "Did you mean 'bandit'?"
argus scan trivyiac → "Did you mean 'trivy-iac'?"
argus scan foobar → lists all available scanners
Only triggers when a specific scanner is requested (not --list, not when running all enabled scanners).
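A minimal sketch of the typo suggestion using difflib.get_close_matches, assuming the registry holds the ten scanner names listed earlier in this PR. The function names, message wording, and the 0.6 cutoff (difflib's default) are illustrative choices, not the CLI's actual code.

```python
import difflib

# Scanner names taken from the "Ported scanners" list; the real registry may differ.
SCANNERS = [
    "bandit", "clamav", "trivy-iac", "gitleaks", "osv",
    "checkov", "opengrep", "supply-chain", "zap", "container",
]


def suggest_scanner(name, known=SCANNERS):
    """Return the closest known scanner name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(name, known, n=1, cutoff=0.6)
    return matches[0] if matches else None


def check_scanner_name(name, known=SCANNERS):
    """Return None for a valid name, otherwise an error message with a hint."""
    if name in known:
        return None
    guess = suggest_scanner(name, known)
    if guess:
        return f"Unknown scanner '{name}'. Did you mean '{guess}'?"
    return f"Unknown scanner '{name}'. Available: {', '.join(sorted(known))}"
```

A name with no close match, such as foobar, falls through to the full list of available scanners instead of a guess.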
argus/__init__.py was stuck at 0.1.0 while version.yaml is 0.7.0. Fixed to 0.7.0 and added release-it regex-bumper rule so it stays in sync on every release. Regenerated CLI docs with correct version.
The CLI test job now validates the full structured log directory:
1. Existence: argus.log, argus-audit.json, argus-summary.md
2. Audit manifest structure: scan_id, timestamps, duration, platform detection, config hash, artifact inventory
3. Artifact integrity: SHA-256 hashes in manifest match actual files
4. Structured log format: every line is valid JSONL with timestamp, level, and message fields
Also adds markdown format to the CLI test scan so argus-summary.md is produced alongside JSON, SARIF, and the log. This ensures the audit trail is reliably produced in CI; the evidence package is not optional.
Restores all 22 deleted workflows with identical input interfaces so existing user pipelines don't break on upgrade. Internal implementation now uses argus CLI instead of direct composite action calls.
12 scanner workflows migrated to argus CLI: bandit, checkov, clamav, gitleaks, grype, opengrep, osv, supply-chain, trivy-container, trivy-iac, zap, zap-from-config
3 scanner workflows kept as composite actions (GitHub-native): codeql, dependency-review, syft
6 compound workflows restored:
- reusable-security-hardening (dispatcher that calls scanner workflows)
- container-scan, container-scan-from-config (argus scan container)
- dependency-scan (argus scan osv + composite dependency-review)
- infrastructure-scan (argus scan trivy-iac + argus scan checkov)
- linting (composite actions; linters not yet in argus registry)
Demo workflow restored: security-reusable-demo
Users change nothing except the version tag.
New command aggregates per-scanner results from parallel CI jobs:
argus collect ./downloaded-artifacts/ -o ./argus-audit-package/
Produces a unified audit package:
argus-audit-package/
├── argus-combined.log    # All JSONL logs merged, sorted by timestamp
├── argus-audit.json      # Combined manifest (provenance, all findings)
├── summary.md            # Combined markdown
└── scanners/
    ├── bandit/           # Individual scanner results preserved
    ├── gitleaks/
    └── osv/
Platform-agnostic: doesn't care how files got there. CI platforms
handle artifact transport, argus handles the merge.
Merger features:
- JSONL logs sorted by timestamp across all scanners
- Manifests combined: earliest start, latest end, aggregated findings
- Artifact inventory with SHA-256 integrity hashes
- Scanner source tagged on each log entry for traceability
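The log-merging part of the collector can be illustrated with a small sketch. It assumes each log line is a JSON object with a `timestamp` field in a uniform ISO-8601 UTC format (so string sort equals chronological sort) and that the scanner tag is stored under a `scanner` key; both are assumptions, as is the function name.

```python
import json


def merge_jsonl_logs(logs_by_scanner):
    """Merge per-scanner JSONL lines into one list sorted by timestamp.

    logs_by_scanner maps a scanner name to an iterable of JSONL lines.
    """
    merged = []
    for scanner, lines in logs_by_scanner.items():
        for line in lines:
            if not line.strip():
                continue  # skip blank lines
            entry = json.loads(line)
            entry.setdefault("scanner", scanner)  # tag the source for traceability
            merged.append(entry)
    # Lexicographic sort is chronological only for uniform ISO-8601 UTC stamps.
    merged.sort(key=lambda entry: entry["timestamp"])
    return merged
```

Reading the lines from disk is left to the caller here, which keeps the merge itself independent of how the CI platform laid out the downloaded artifacts.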
argus.log is still being written to when finalize_manifest() runs — the "Audit manifest written" log line happens after the hash is computed, causing a mismatch in CI validation. Fix: exclude argus.log and argus-audit.json from the artifact hash inventory. Static artifacts (JSON results, SARIF, markdown) are still hashed for integrity verification. Also flushes all log handlers before computing hashes to ensure other artifacts are fully written.
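The fix described above could be sketched as follows: flush the log handlers, then hash only the static artifacts. The function names and the flat directory layout are assumptions; the two excluded file names come from the commit message.

```python
import hashlib
import logging
from pathlib import Path

# Files still changing while the manifest is finalized are not hashed.
LIVE_FILES = {"argus.log", "argus-audit.json"}


def flush_log_handlers(logger=None):
    """Flush every handler so buffered output reaches disk before hashing."""
    logger = logger or logging.getLogger()
    for handler in logger.handlers:
        handler.flush()


def hash_static_artifacts(results_dir):
    """Return {file name: SHA-256 hex digest} for static artifacts only."""
    flush_log_handlers()
    inventory = {}
    for path in sorted(Path(results_dir).iterdir()):
        if path.is_file() and path.name not in LIVE_FILES:
            inventory[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return inventory
```

Hashing a file that is appended to afterwards can never verify later, which is why the log and the manifest itself are left out of the inventory.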
Documents the three-stage ref lifecycle (PR → main → release) and the division between in-repo testing (PR gating) and external validation (argus-test for release confidence). In-repo: 403+ pytest tests, E2E actions, CLI validation, container builds, audit trail — all on every PR. No external deps. argus-test: scheduled weekly against latest release, opens issues on argus when failures detected, validates consumer perspective. Referenced from issue #29 comment.
…ment PR comment now only shows container image vulnerabilities (real security posture). CLI test results (which scan test fixtures with intentional vulnerabilities) are kept as CI artifacts but not included in the PR comment.
Also excludes tests/fixtures/ from the dogfood argus.yml config:
- bandit: exclude tests/fixtures from SAST scan
- opengrep: exclude tests/fixtures from pattern scan
- osv: exclude tests/fixtures from dependency scan
Test fixtures contain intentionally vulnerable code for scanner validation; reporting them as our security posture is misleading.
Only bandit read the exclude config — OSV, OpenGrep, and other scanners ignored it, still reporting test fixture findings. New approach: the engine applies exclude filtering AFTER parsing, regardless of whether the tool supports it natively. Any finding whose location contains an excluded path is dropped. This is scanner-agnostic — works for all scanners universally. The tool still scans everything (catches real issues in all paths), but excluded findings are removed before reporting.
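A minimal sketch of post-parse exclude filtering, assuming findings are dicts with a `location` string; the function name and the finding shape are hypothetical. It follows the substring rule stated above: any finding whose location contains an excluded path is dropped.

```python
def apply_excludes(findings, exclude_paths):
    """Drop findings whose location contains any excluded path (substring match)."""
    excludes = [p.strip("/") for p in exclude_paths if p.strip("/")]
    return [
        finding for finding in findings
        if not any(excluded in finding["location"] for excluded in excludes)
    ]
```

For example, with `tests/fixtures` excluded, a finding at `tests/fixtures/vuln.py:3` is removed while one at `argus/engine.py:10` is kept. Plain substring matching is simple but broad: an exclude of `fixtures` would also drop `src/fixtures_loader.py`.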
New argus validate command checks argus.yml before scanning:
argus validate                    # auto-detect config
argus validate --config my.yml    # specific file
Catches:
- Unknown top-level keys (scaners → did you mean scanners?)
- Unknown section keys (formts → formats, pull_polcy → pull_policy)
- Invalid enum values (severity: mega → must be critical/high/medium/low/none)
- Type mismatches (enabled: "yes" → must be boolean)
Validation also runs automatically when loading config via ArgusConfig.load(). Errors abort the scan with clear messages. Warnings log but allow the scan to proceed. Pure Python, no jsonschema dependency.
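The kinds of checks listed above can be sketched in pure Python as follows. The SCHEMA layout here (a `reporting` section holding `formats` and `severity`, an `execution` section holding `pull_policy`) is a hypothetical stand-in, since the real argus.yml structure is not shown in this PR; only the four categories of error are taken from the commit message.

```python
import difflib

# Hypothetical schema: top-level section -> allowed keys (None = free-form).
SCHEMA = {
    "scanners": None,
    "execution": {"backend", "registry", "pull_policy"},
    "reporting": {"formats", "severity"},
}
SEVERITIES = {"critical", "high", "medium", "low", "none"}


def _unknown_key(key, allowed, where):
    """Build an 'unknown key' error, with a suggestion when one is close."""
    hint = difflib.get_close_matches(key, sorted(allowed), n=1)
    suffix = f" (did you mean '{hint[0]}'?)" if hint else ""
    return f"{where}: unknown key '{key}'{suffix}"


def validate(config):
    """Return a list of error strings; an empty list means the config is valid."""
    errors = []
    for section, value in config.items():
        if section not in SCHEMA:
            errors.append(_unknown_key(section, SCHEMA, "top level"))
            continue
        if section == "scanners":
            for name, opts in (value or {}).items():
                enabled = (opts or {}).get("enabled", True)
                if not isinstance(enabled, bool):
                    errors.append(f"scanners.{name}.enabled: must be boolean")
            continue
        for key in value or {}:
            if key not in SCHEMA[section]:
                errors.append(_unknown_key(key, SCHEMA[section], section))
    severity = (config.get("reporting") or {}).get("severity")
    if severity is not None and severity not in SEVERITIES:
        errors.append("reporting.severity: must be critical/high/medium/low/none")
    return errors
```

Returning a list rather than raising on the first problem lets a validate command report every mistake in one pass.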
argus init (project bootstrap): Detects languages, lockfiles, Dockerfiles, IaC, GitHub workflows. Generates tailored argus.yml with detected scanners enabled. Optional --platform github generates CI workflow file. Never overwrites without --force. 45 tests.
argus-config.schema.json (formal schema): JSON Schema draft-07 for IDE autocomplete and validation. Generated configs reference it via yaml-language-server comment. Version in $id URL managed by release-it.
Skill file sync: Updated SKILL.md and REFERENCE.md for SDK-first architecture. argus init as starting point, full CLI command surface, container/DAST scanning flags, argus validate/collect commands.
Displays the Argus eye+shield logo and text banner on init. Generated from the official brandmark, embedded as static string. Only shows on interactive terminals. Prints to stderr.
Replace the basic ANSI banner with full truecolor RGB art generated from the official Argus brandmark. 200+ unique colors matching the original logo. ASCII characters for detail, per-character truecolor for the green gradient eye with shield iris. 20 lines at 80 columns. Only displays on interactive terminals.
Removed iteration/test files from img/. Kept argus_eye_final.txt. Banner now scrolls in line-by-line (30ms/line) on init.
Each argus scan creates a new timestamped subdirectory:
argus-results/2026-04-12T07-24-50Z/
argus-results/2026-04-12T11-30-22Z/
argus-results/latest -> 2026-04-12T11-30-22Z/
Previous scan results are never overwritten. The 'latest' symlink always points to the most recent run for scripting convenience. Applied to all scan modes: source, container, and DAST.
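One way to produce this layout is sketched below. The function name and the `now` parameter are hypothetical; the timestamp format is inferred from the directory names shown above. The sketch targets POSIX systems, since creating symlinks on Windows may require extra privileges.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def new_run_dir(root="argus-results", now=None):
    """Create root/<UTC timestamp>/ and repoint root/latest at it."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H-%M-%SZ")
    root = Path(root)
    run_dir = root / stamp
    # exist_ok=False: an existing run directory is never reused or overwritten.
    run_dir.mkdir(parents=True, exist_ok=False)
    latest = root / "latest"
    if latest.is_symlink():
        latest.unlink()
    # Relative target, so the results tree can be moved or archived as a whole.
    latest.symlink_to(stamp, target_is_directory=True)
    return run_dir
```

Hyphens replace the colons of a normal ISO-8601 time because colons are not valid in file names on some platforms.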
Description
Major architectural refactor: introduces the argus Python SDK as the primary interface for running security scanners. Replaces 22 GitHub Actions wrapper workflows with a single python -m argus scan CLI that handles tool execution locally or via Docker containers.
Changes Made
Details
Argus Python SDK (argus/ package)
- Severity enum, Finding/ScanResult/ScanSummary dataclasses, Scanner protocol, ArgusConfig (loads argus.yml), ArgusEngine orchestration
- argus scan [scanner] --config --path --format --severity-threshold, argus report
- … argus/containers.py) for version tracking
CI/CD Dogfooding
- security-scan.yml: runs python -m argus scan --config argus.yml (no manual tool installs)
- test-reusable-workflows.yml: validates argus CLI integration (listing, scanning, output formats)
- argus.yml: dogfood config enabling bandit, gitleaks, opengrep, clamav, osv, supply-chain
Deprecated Workflows (BREAKING)
- … scanner-*.yml thin wrapper workflows
- … security-reusable-demo.yml
- … .github/actions/ remain for external GitHub Actions users
CI/CD Hardening
- test-actions.yml: removed job-level continue-on-error: true from 23 jobs, added test-results gate job
- test-unit.yml: Python version matrix (3.11, 3.12, 3.13), version ref coverage check
- test-examples-functional.yml: minimum example count gate
- .release-it.json: added package.json to version bumper
Dependency Maintenance
- renovate.yaml: Renovate regex managers for container image tags and tool versions
- .github/dependabot.yml: added pip and docker ecosystems
Documentation
- … .ai/ files (architecture, context, workflows, errors, decisions)
Testing
Test Results
Security Considerations
Security Details
- … :ro)
- … argus/containers.py
- test-actions.yml no longer silently passes broken scanners (continue-on-error removed)
AI Context Updates (.ai/)
- .ai/architecture.yaml updated (if components/structure changed)
- .ai/workflows.yaml updated (if commands/tasks changed)
- .ai/decisions.yaml updated (if design decision made)
- .ai/errors.yaml updated (if common error addressed)
Checklist
Related Issues
Closes #58
Screenshots/Logs (if applicable)