feat: add test evidence checker for PR submissions (#61) #89
base: main
Implements automated test evidence checking for PRs in desktop and ComfyUI repos.
- Creates gh-test-evidence task to scan open PRs
- Uses GPT-4o-mini to analyze PR bodies for test evidence
- Posts warning comments when test explanations or visual proof are missing
- Auto-updates or deletes comments based on PR changes
- Follows the same comment pattern as ComfyUI_frontend workflow

Resolves #61

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Pull Request Overview
This PR implements automated test evidence checking for pull requests in the Comfy-Org/desktop and comfyanonymous/ComfyUI repositories. The solution uses GPT-4o-mini to analyze PR descriptions for test explanations, screenshots, and videos, then posts/updates/deletes warning comments based on what evidence is present.
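The post/update/delete behavior described above can be sketched as a pure decision function. This is a minimal illustration, not the code from this PR: the `TestEvidence` field names, the warning wording, and the helper names `buildWarning` and `decideCommentAction` are all assumptions made for the example.

```typescript
// Hypothetical shape of the AI analysis result; field names are assumptions.
type TestEvidence = {
  hasTestExplanation: boolean;
  hasScreenshot: boolean;
  hasVideo: boolean;
};

// What the task should do with its warning comment on a given PR.
type CommentAction =
  | { kind: "create"; body: string }
  | { kind: "update"; body: string }
  | { kind: "delete" }
  | { kind: "none" };

// Build the warning text, or return null when nothing is missing.
function buildWarning(e: TestEvidence): string | null {
  const missing: string[] = [];
  if (!e.hasTestExplanation) missing.push("a test explanation");
  if (!e.hasScreenshot && !e.hasVideo) {
    missing.push("visual proof (screenshot or video)");
  }
  if (missing.length === 0) return null;
  return `Test evidence check: this PR is missing ${missing.join(" and ")}.`;
}

// Compare the desired warning with the existing bot comment (null if none)
// so that a comment is never duplicated and is removed once evidence exists.
function decideCommentAction(
  e: TestEvidence,
  existingBody: string | null,
): CommentAction {
  const warning = buildWarning(e);
  if (warning === null) {
    return existingBody === null ? { kind: "none" } : { kind: "delete" };
  }
  if (existingBody === null) return { kind: "create", body: warning };
  if (existingBody === warning) return { kind: "none" };
  return { kind: "update", body: warning };
}
```

Keeping the decision separate from the GitHub API calls means it can be unit-tested without any HTTP mocking; the caller would map each action onto the corresponding create, update, or delete request.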
Key changes:
- New automated task that scans open PRs and validates test evidence using AI
- Smart comment management system that updates existing comments instead of creating duplicates
- Database-backed state tracking to avoid redundant analysis
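One way to picture the state tracking in the last bullet is a small cache keyed by PR URL. This is only a sketch under stated assumptions: the PR persists state in a database, whereas the hypothetical `AnalysisCache` class below uses an in-memory `Map` as a stand-in, and comparing the stored PR body is an assumed criterion for "redundant".

```typescript
// Hypothetical record of the last analysis for one PR.
type AnalysisRecord = {
  body: string; // PR body that was analyzed
  analyzedAt: Date; // when the analysis ran
};

// In-memory stand-in for the database collection (an assumption for this sketch).
class AnalysisCache {
  private records = new Map<string, AnalysisRecord>();

  // A PR needs analysis if it was never analyzed or its body has changed.
  needsAnalysis(prUrl: string, body: string): boolean {
    const prev = this.records.get(prUrl);
    return prev === undefined || prev.body !== body;
  }

  // Remember the body that was just analyzed.
  markAnalyzed(prUrl: string, body: string, now: Date = new Date()): void {
    this.records.set(prUrl, { body, analyzedAt: now });
  }
}
```

With a check like this, the model is called only when a PR body is new or edited, which keeps repeated scheduled runs cheap.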
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| app/tasks/run-gh-tasks.ts | Registers the new test evidence task and reformats existing task entries for consistency |
| app/tasks/gh-test-evidence/gh-test-evidence.ts | Core implementation of the test evidence checker with OpenAI integration, comment management, and database persistence |
| app/tasks/gh-test-evidence/gh-test-evidence.spec.ts | Test suite structure with mocked dependencies for validating the task behavior |
…dd CI cleanup
- Extract main logic into runCorePingTask() function for better testability
- Add isCI check to properly close DB and exit in CI environments
- Add todo comment about deprecating custom webhook types in favor of @octokit/webhooks-types
- Add llm-api, @keyv/mongo, and @octokit/webhooks-types dependencies
- Remove trailing whitespace

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Resolve conflicts in coreping.ts by keeping new refactored structure
- Resolve conflicts in run-gh-tasks.ts by including all task imports
- Resolve conflicts in package.json by keeping both new dependencies
- Accept incoming bun.lock changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Switch from gpt-4o-mini to gpt-5-mini for analyzing PR test evidence. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Fixed typo 'Explaination' -> 'Explanation' throughout the codebase:
- Updated schema field name in TestEvidenceSchema
- Updated all references in code and tests
- Updated OpenAI prompt and JSON schema
- Updated warning message generation

Addresses review comments from Copilot.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Pull Request Overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
snomiao
left a comment
Corrects zod version that was accidentally changed during merge from ^4.0.5 to ^4.0.0. This should resolve the Vercel build failure. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…ubRepoUrl The function parseUrlRepoOwner does not exist in @/src/parseOwnerRepo. The correct function name is parseGithubRepoUrl. This fixes the TypeScript compilation error that was causing the Vercel build to fail. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…or code formatting
All alerts resolved. This PR previously contained dependency changes with security issues that have been resolved, removed, or ignored.
Replace bun:mock with MSW (Mock Service Worker) for more realistic HTTP mocking:
- Mock GitHub API endpoints (pulls, comments) and OpenAI API
- Add proper MSW server lifecycle (beforeAll, afterEach, afterAll)
- Mock database module to avoid MongoDB connection in tests
- All tests passing (4/4)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…nation' Address Copilot review feedback to use singular form 'explanation' for consistency. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Use the centralized MSW setup from @/src/test/msw-setup instead of duplicating server configuration. This addresses the review comment to use the unified MSW setup. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Fixed spelling from 'explanations' to 'explanation' in b963f43
Refactored to use unified MSW setup from @/src/test/msw-setup in 0800956
type S = GithubApiComponents["schemas"];
// todo(sno): deprecate this and use @octokit/webhooks-types
export type WEBHOOK_EVENTS = {
  branch_protection_configuration: S[`webhook-branch-protection-configuration${string}` & keyof S];
Let's remove this file.
Updated OpenAI model from invalid 'gpt-5-mini' to correct 'gpt-4o-mini' for test evidence analysis. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Implements automated test evidence checking for PRs in desktop and ComfyUI repos, solving issue #61 in a smarter way.
Changes
- app/tasks/gh-test-evidence/gh-test-evidence.ts task
- Comfy-Org/desktop and comfyanonymous/ComfyUI
Smart Improvements
Testing
- gh-test-evidence.spec.ts with test structure
Workflow Integration
Added to app/tasks/run-gh-tasks.ts to run on schedule with other GitHub tasks.
Closes #61
🤖 Generated with Claude Code