Skip to content

Auto-resolve stale bot review threads on re-review#63

Merged
derekmisler merged 1 commit intodocker:mainfrom
derekmisler:worktree-resolve-stale-threads
Mar 4, 2026
Merged

Auto-resolve stale bot review threads on re-review#63
derekmisler merged 1 commit intodocker:mainfrom
derekmisler:worktree-resolve-stale-threads

Conversation

@derekmisler
Copy link
Contributor

Summary

  • Adds a new step to review-pr/action.yml that runs before the review agent to auto-resolve stale bot review threads
  • Parses pr.diff to build a set of file:line pairs, fetches unresolved review threads via GraphQL, and resolves threads where the commented line is no longer in the diff (issue was addressed)
  • Threads where the line is still in the diff are kept open for re-evaluation by the next review

Details

Uses GitHub's GraphQL resolveReviewThread mutation (REST API has no equivalent). The step is entirely bash/jq — no LLM involvement. Key design decisions:

  • Null safety: Threads with null path or line (outdated diff positions, file-level comments) are kept open rather than incorrectly auto-resolved
  • Diff parsing: Uses +++ b/ lines instead of diff --git headers for unambiguous filename extraction
  • GraphQL pagination: Handles PRs with 100+ review threads
  • Graceful degradation: Wrapped in continue-on-error: true — failures never block the actual review
Edge case Behavior
File deleted No + lines → old threads resolved
Line moved (refactor) Old file:line gone → resolved; new review catches at new line
Same line still changed file:line in diff → kept open
No previous reviews 0 bot threads → no-op
GraphQL fails continue-on-error: true → review still runs

Test plan

  • Create a test PR, trigger /review, get inline comments
  • Push a fix addressing one comment, trigger /review again
  • Verify: fixed comment's thread → resolved, unfixed → still open
  • Check workflow logs for 📋 Summary: Resolved X threads, kept Y threads open
  • Edge case: push a fix that deletes the entire file — all threads should resolve

@derekmisler derekmisler marked this pull request as ready for review March 2, 2026 18:28
@derekmisler derekmisler requested a review from a team as a code owner March 2, 2026 18:28
Copy link

@docker-agent docker-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

⚠️ Found 2 critical issues that need to be addressed:

  1. GraphQL error handling incomplete: Both the query and mutation operations only check shell exit codes, not GraphQL errors in the response JSON
  2. Security: Using world-writable /tmp: Could enable TOCTOU attacks in multi-user CI environments

The core logic for parsing diffs and matching threads looks solid, but the error handling needs strengthening before this can safely run in production.

@derekmisler
Copy link
Contributor Author

/review

Copy link

@docker-agent docker-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

This PR adds a useful feature to auto-resolve stale bot review threads. The implementation is generally sound, but there's a moderate concern about error handling in the GraphQL pagination logic that could lead to incomplete thread resolution.

Key Finding

GraphQL Pagination Error Handling: When the pagination loop breaks due to jq parse errors, the script continues with partial data without clearly indicating to users that thread resolution may be incomplete. While the error is logged, the script proceeds to resolve threads based on potentially incomplete data from earlier pages.

Recommendation

Consider adding explicit validation after the pagination loop to check if ALL_THREADS is valid JSON and potentially exit early (or add a prominent warning) if the data is incomplete. This would make failures more obvious to users.

@derekmisler
Copy link
Contributor Author

/review

docker-agent[bot]
docker-agent bot previously approved these changes Mar 2, 2026
Copy link

@docker-agent docker-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Review Complete

This PR implements a robust solution for auto-resolving stale bot review threads. The code has been thoroughly reviewed and all potential concerns have been verified.

Key strengths:

  • AWK script correctly handles unified diff format (deleted lines should not increment new file line counter)
  • Proper incremental line tracking through diff hunks
  • Comprehensive error handling with continue-on-error: true for graceful degradation
  • Sound null-safety checks in jq operations
  • Defensive programming with FETCH_OK flag to prevent acting on partial data
  • Safe temp file handling with immediate trap setup

The implementation correctly parses pr.diff, fetches paginated review threads via GraphQL, and resolves threads where the commented line is no longer in the diff. No issues found.

@derekmisler derekmisler enabled auto-merge (squash) March 3, 2026 14:35
@derekmisler derekmisler force-pushed the worktree-resolve-stale-threads branch 5 times, most recently from 2e3fbe3 to 08e70e8 Compare March 3, 2026 18:15
@derekmisler derekmisler self-assigned this Mar 3, 2026
@derekmisler
Copy link
Contributor Author

/review

Copy link

@docker-agent docker-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

⚠️ Note: Verification step was inconclusive. The findings below are from initial analysis and should be evaluated by the author.

Assessment: 🟡 NEEDS ATTENTION

Summary

Found 2 potential issues in the thread resolution logic:

  1. HIGH: Substring matching in grep could cause false positives, preventing threads from being resolved
  2. MEDIUM: Missing validation that BOT_COUNT is a valid integer before arithmetic operations

Both issues are in new code added by this PR.

Findings

🔴 HIGH: Potential false positive in thread resolution check

File: review-pr/action.yml:338

The grep -qF command allows substring matching, which could incorrectly identify threads as "still in diff" when they're not. For example:

  • If FILE_LINE is bar.txt:123 and DIFF_LINES_FILE contains foo/bar.txt:123, grep would match (substring)
  • This would keep the thread open when it should be resolved

Recommendation: Use grep -qFx for exact whole-line matching instead of grep -qF.

🟡 MEDIUM: Missing validation for BOT_COUNT before arithmetic

File: review-pr/action.yml:323

The script doesn't validate that BOT_COUNT is a valid integer before using it in:

  • Line 323: seq 0 $((BOT_COUNT - 1))
  • Line 317: if [ "$BOT_COUNT" -eq 0 ]

If jq 'length' (line 311) fails or returns unexpected output, this could cause bash errors.

Recommendation: Add validation after line 311:

BOT_COUNT=$(echo "$BOT_THREADS" | jq 'length')
if [ -z "$BOT_COUNT" ] || ! [[ "$BOT_COUNT" =~ ^[0-9]+$ ]]; then
  echo "::warning::Failed to count bot threads"
  exit 0
fi

@derekmisler derekmisler force-pushed the worktree-resolve-stale-threads branch 2 times, most recently from 66b293e to 874947c Compare March 4, 2026 03:18
Signed-off-by: Derek Misler <derek.misler@docker.com>
@derekmisler derekmisler force-pushed the worktree-resolve-stale-threads branch from 874947c to 0a00577 Compare March 4, 2026 03:42
@derekmisler derekmisler merged commit 3350c25 into docker:main Mar 4, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants