-
Notifications
You must be signed in to change notification settings - Fork 238
Description
Problem Description
When a workflow run fails in a pre-agent step (before the AI agent executes), the audit tool reports "No specific errors identified" in the failure_analysis.error_summary field, despite the specific error being present in the downloaded step log files.
Steps to Reproduce
- Trigger a workflow run that fails before the agent step (e.g., lockdown mode validation failure)
- Audit the failed run:
gh aw audit (run_id)(or via MCPaudittool) - Observe the
failure_analysis.error_summaryin the report
Example run ID: 22266577612 (Issue Monster, failed 2026-02-21T23:40:03Z)
Expected Behavior
The audit report should extract and surface the specific error message from the step log files. For example, from workflow-logs/agent/12_Validate lockdown mode requirements.txt:
Lockdown mode is enabled (lockdown: true) but no custom GitHub token is configured.
Please configure one of the following as a repository secret:
- GH_AW_GITHUB_TOKEN (recommended)
- GH_AW_GITHUB_MCP_SERVER_TOKEN (alternative)
- Custom github-token in your workflow frontmatter
Actual Behavior
The audit report shows:
{
"failure_analysis": {
"primary_failure": "failure",
"failed_jobs": ["agent"],
"error_summary": "No specific errors identified"
}
}- No
agent-stdio.logis downloaded (agent never ran) run_summary.jsonshows"errors": null- The actual error exists in downloaded step log files but is not surfaced
Root Cause
The audit tool appears to extract errors primarily from agent-stdio.log (the AI agent's output). When the agent job fails at an early step (e.g., step 12 out of 67+), there's no agent output. The tool doesn't fall back to scanning the step-level log files in workflow-logs/agent/ to find the failure message.
Environment
- Repository: github/gh-aw
- Run ID (testing session): 22266663143
- Failed run audited: 22266577612
- Date: 2026-02-21
- gh-aw version: 0.0.414 (agent_version from aw_info.json)
Impact
- Severity: High
- Frequency: Always (any workflow that fails before agent execution)
- Workaround: Manually inspect the step logs in
workflow-logs/agent/directory - Affected users: Anyone using
auditto debug failed workflows, where the failure is a configuration issue (missing secrets, lockdown validation, etc.)
Additional Context
Common scenarios where the agent never executes (and thus this bug manifests):
- Missing required secrets in lockdown mode
- Failed MCP gateway setup
- Failed binary installation (awf, copilot CLI)
- Failed repository checkout
The audit is most valuable precisely in these failure scenarios, making this a significant gap in debugging capability.
Related: The audit command also returns a cryptic exit status 1 error when given an invalid run ID (e.g., 99999999999), instead of a human-readable "Run not found" message. Both issues reduce the audit tool's usefulness for diagnosing problems.
Generated by Daily CLI Tools Exploratory Tester
- expires on Feb 28, 2026, 11:55 PM UTC