Skip to content

[cli-tools-test] audit: "No specific errors identified" for failed runs when agent never executed #17621

@github-actions

Description

@github-actions

Problem Description

When a workflow run fails in a pre-agent step (before the AI agent executes), the audit tool reports "No specific errors identified" in the failure_analysis.error_summary field, despite the specific error being present in the downloaded step log files.

Steps to Reproduce

  1. Trigger a workflow run that fails before the agent step (e.g., lockdown mode validation failure)
  2. Audit the failed run: gh aw audit (run_id) (or via MCP audit tool)
  3. Observe the failure_analysis.error_summary in the report

Example run ID: 22266577612 (Issue Monster, failed 2026-02-21T23:40:03Z)

Expected Behavior

The audit report should extract and surface the specific error message from the step log files. For example, from workflow-logs/agent/12_Validate lockdown mode requirements.txt:

Lockdown mode is enabled (lockdown: true) but no custom GitHub token is configured.

Please configure one of the following as a repository secret:
  - GH_AW_GITHUB_TOKEN (recommended)
  - GH_AW_GITHUB_MCP_SERVER_TOKEN (alternative)
  - Custom github-token in your workflow frontmatter

Actual Behavior

The audit report shows:

{
  "failure_analysis": {
    "primary_failure": "failure",
    "failed_jobs": ["agent"],
    "error_summary": "No specific errors identified"
  }
}
  • No agent-stdio.log is downloaded (agent never ran)
  • run_summary.json shows "errors": null
  • The actual error exists in downloaded step log files but is not surfaced

Root Cause

The audit tool appears to extract errors primarily from agent-stdio.log (the AI agent's output). When the agent job fails at an early step (e.g., step 12 out of 67+), there's no agent output. The tool doesn't fall back to scanning the step-level log files in workflow-logs/agent/ to find the failure message.

Environment

  • Repository: github/gh-aw
  • Run ID (testing session): 22266663143
  • Failed run audited: 22266577612
  • Date: 2026-02-21
  • gh-aw version: 0.0.414 (agent_version from aw_info.json)

Impact

  • Severity: High
  • Frequency: Always (any workflow that fails before agent execution)
  • Workaround: Manually inspect the step logs in workflow-logs/agent/ directory
  • Affected users: Anyone using audit to debug failed workflows, where the failure is a configuration issue (missing secrets, lockdown validation, etc.)

Additional Context

Common scenarios where the agent never executes (and thus this bug manifests):

  • Missing required secrets in lockdown mode
  • Failed MCP gateway setup
  • Failed binary installation (awf, copilot CLI)
  • Failed repository checkout

The audit is most valuable precisely in these failure scenarios, making this a significant gap in debugging capability.

Related: The audit command also returns a cryptic exit status 1 error when given an invalid run ID (e.g., 99999999999), instead of a human-readable "Run not found" message. Both issues reduce the audit tool's usefulness for diagnosing problems.

Generated by Daily CLI Tools Exploratory Tester

  • expires on Feb 28, 2026, 11:55 PM UTC

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions