Add Agent Fight results from last month #5

Merged: justinpecott merged 2 commits into main from agent-fight on Oct 14, 2025
Conversation

@justinpecott (Owner) commented:

Last month I forked this repo and did battle between Claude Code, Copilot, and GPT Codex. This PR contains the tasks and results. Short story: Claude won and is the base of this repo. Code from Copilot and GPT is saved in the agent-fight directory for posterity.

Copilot AI review requested due to automatic review settings October 14, 2025 01:14

Copilot AI left a comment


Pull Request Overview

This PR documents a comparison between three AI coding agents (Claude Code, GitHub Copilot, and GPT/Codex) performing identical API client development tasks. The author tested each agent on four sequential tasks to refactor and enhance an OAuth2-enabled API client, ultimately choosing Claude's implementation as the main branch.

  • Added comprehensive documentation of the agent comparison methodology and results
  • Preserved code implementations from all three agents for reference and analysis
  • Included sensitive API credentials and configuration details for the test API

Reviewed Changes

Copilot reviewed 14 out of 16 changed files in this pull request and generated 4 comments.

Summary per file:

  • agent-fight/tasks.md: Documents the four development tasks assigned to each AI agent and their performance
  • agent-fight/results.md: Brief summary declaring Claude as the winner
  • agent-fight/justinsweb-api.md: Contains OAuth2 credentials and API endpoint details
  • agent-fight/gpt/*: Complete GPT/Codex implementation with Python files and configuration
  • agent-fight/copilot/*: Complete Copilot implementation with Python files and configuration

> This project needs to be generalized to a generic API client. Hence, we will no longer be calling it "asp" for Alexa Smart Properties.
>
> 1. Rename all files and folders, existing and generated, from "asp" to "api-blaster". This inclides all "asp" and ".asp" prefixed file and folders.

Copilot AI commented on Oct 14, 2025 (the same comment was left on three occurrences of this line):

Corrected spelling of 'inclides' to 'includes'.

Suggested change:

```diff
-1. Rename all files and folders, existing and generated, from "asp" to "api-blaster". This inclides all "asp" and ".asp" prefixed file and folders.
+1. Rename all files and folders, existing and generated, from "asp" to "api-blaster". This includes all "asp" and ".asp" prefixed file and folders.
```

Copilot uses AI. Check for mistakes.

claude bot commented Oct 14, 2025

Pull Request Review: Agent Fight Results

Thank you for sharing the results of your AI coding assistant comparison! This is an interesting historical record. Here's my detailed review:

Overview

This PR adds code artifacts from comparing Claude Code, GitHub Copilot, and GPT Codex implementations of the api-blaster utility to the agent-fight/ directory.


🟢 Strengths

1. Good Organization

  • Clearly separated implementations in dedicated subdirectories (copilot/, gpt/)
  • Each implementation is self-contained with its own documentation

2. Comprehensive Documentation

  • Both implementations include README files and setup instructions
  • Copilot version has detailed .github/copilot-instructions.md with validation commands
  • GPT version includes AGENTS.md with coding guidelines

3. Copilot Implementation Quality

  • Well-refactored with separation of concerns (api-blaster.py vs api-blaster-example.py)
  • Includes get_authenticated_session() function for better modularity
  • Uses modern uv package manager with proper pyproject.toml
  • Lock file (uv.lock) ensures reproducible builds

🟡 Areas for Improvement

1. Code Quality & Best Practices

Copilot Version (agent-fight/copilot/api-blaster.py):

  • Lines 109-113: Creates a new OAuth2Session for authorization flow but doesn't reuse the existing pattern consistently
  • Line 38: Hardcoded example API call in library code - should be in the example file only
  • Module loading in example: Uses importlib.util.spec_from_file_location which is fragile. Consider making it a proper package

GPT Version (agent-fight/gpt/):

The diff was truncated, but from what's visible:

  • Incomplete README (line 21 cuts off mid-sentence: "Asks you for your client id, clien")
  • Should verify completeness of all files

2. Potential Bugs

Both Implementations:

  • No input validation: User inputs for client_id, client_secret, and redirect_uri are not validated
  • Error handling: Limited error handling for network failures during OAuth flow
  • Token expiry edge case: If expires_at is present but the token is already expired, the code will fail on first API call
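
One way to close the expiry edge case is to treat any token that is expired, or about to expire, as invalid before the first API call. A minimal sketch (hypothetical helper; assumes the oauthlib-style token dict with an expires_at epoch timestamp):

```python
import time

def token_is_expired(token: dict, slack: int = 60) -> bool:
    """Return True if the token is expired or expires within `slack` seconds."""
    expires_at = token.get("expires_at")
    if expires_at is None:
        return False  # no expiry recorded; assume the token is still valid
    return time.time() >= expires_at - slack
```

Callers would refresh (or re-authorize) whenever this returns True, instead of failing on the first request.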

Copilot Version Specific:

  • Line 15 (api-blaster-example.py): spec.loader.exec_module() could fail if the module has syntax errors - needs try/except
  • Missing dependencies check: The example imports but doesn't verify the OAuth module is available
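
Wrapping the dynamic import makes that failure mode explicit. A sketch of what the guard could look like (hypothetical helper, not the repo's actual code):

```python
import importlib.util
import sys

def load_module_from_path(name: str, path: str):
    """Load a module from an arbitrary file path, raising ImportError on failure."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot create an import spec for {path}")
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
    except Exception as exc:  # syntax errors, missing dependencies, etc.
        raise ImportError(f"failed to load {path}: {exc}") from exc
    sys.modules[name] = module
    return module
```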

3. Security Concerns

⚠️ Important Security Issues:

  1. Input sanitization missing:
     • User-provided URLs (redirect_uri, authorization_response) are not validated
     • Could potentially be exploited with malicious URLs
  2. File permissions:
     • Token and config files are created with default permissions
     • Should set restrictive permissions (0600) on .api-blaster/ files containing secrets:

       ```python
       import os
       os.chmod(self.auth_file_path, 0o600)
       os.chmod(self.conf_file_path, 0o600)
       ```
  3. No HTTPS validation:
     • Should validate that OAuth URLs use HTTPS
     • Should warn if redirect_uri is not localhost or HTTPS
  4. Token exposure in memory:
     • Tokens are kept in memory longer than necessary
     • Consider using the secrets module for sensitive data handling
  5. Error messages leak sensitive info:
     • Error messages print full response content, which may contain tokens
     • Lines like print(f"Response content: {response.text}") are risky
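
The first three items could be addressed with a couple of small helpers. A sketch (hypothetical function names; the permission bits assume a POSIX filesystem):

```python
import os
from urllib.parse import urlparse

def validate_redirect_uri(uri: str) -> None:
    """Allow HTTPS anywhere, but plain HTTP only on the local loopback."""
    parsed = urlparse(uri)
    if parsed.scheme == "https":
        return
    if parsed.scheme == "http" and parsed.hostname in ("localhost", "127.0.0.1"):
        return
    raise ValueError(f"redirect_uri must be HTTPS or a localhost URL: {uri!r}")

def write_secret_file(path: str, data: str) -> None:
    """Create the file owner-only (0600) before any secret bytes are written."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(data)
```

Creating the file with 0600 from the start avoids the window where a chmod applied after writing would briefly leave secrets readable by other users.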

4. Performance Considerations

  • Blocking I/O: The OAuth flow uses input(), which blocks; acceptable for a CLI utility
  • No timeout on token fetch: Network calls should have timeouts:

    ```python
    token = oauth_session.fetch_token(
        self.token_url,
        client_secret=self.config["client_secret"],
        authorization_response=redirect_response,
        timeout=30,  # add timeout
    )
    ```

  • File I/O: Multiple file reads/writes are fine for this use case

5. Test Coverage

Critical Gap: No tests in this PR

Recommendations:

  • Add unit tests for configuration loading/saving
  • Add tests for token validation logic
  • Add integration tests with mocked OAuth endpoints
  • Test error paths (missing files, invalid JSON, network failures)
  • Consider adding tests to compare behavior across all three implementations

Example test structure:

```python
# tests/test_oauth_blaster.py
import pytest
from unittest.mock import patch, mock_open

from oauth_blaster import OAuthBlaster

def test_load_config_from_file():
    # Test configuration loading
    pass

def test_token_refresh():
    # Test automatic token refresh
    pass
```
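
A more concrete version of the first test, using a stand-in loader to illustrate the mocking pattern (hypothetical load_config helper; the real OAuthBlaster config API may differ):

```python
import json
from unittest.mock import mock_open, patch

def load_config(path: str) -> dict:
    # Stand-in for the client's JSON config loader.
    with open(path) as fh:
        return json.load(fh)

def test_load_config_from_file():
    fake = '{"client_id": "abc", "client_secret": "xyz"}'
    with patch("builtins.open", mock_open(read_data=fake)):
        cfg = load_config(".api-blaster/conf.json")
    assert cfg["client_id"] == "abc"
    assert cfg["client_secret"] == "xyz"
```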

📋 Specific Recommendations

For Copilot Implementation:

  1. Remove the API call logic from api-blaster.py (lines 104-117) - keep it library-only
  2. Make it a proper Python package with __init__.py
  3. Add type hints consistently throughout
  4. Add docstrings to all functions

For GPT Implementation:

  1. Verify all files are complete (README appears truncated)
  2. Add validation similar to copilot version
  3. Ensure consistency in error handling

General:

  1. Add a comparison summary document explaining:

    • What tasks were given to each agent
    • Evaluation criteria
    • Specific strengths/weaknesses observed
    • Why Claude "won" (as mentioned in the PR description)
  2. Add .gitignore entries if not already present:

     ```
     agent-fight/*/.api-blaster/
     agent-fight/*/.venv/
     ```
  3. Consider adding a test suite that runs the same tests against all three implementations


🎯 Verdict

This is valuable historical documentation of an AI assistant comparison. The code quality is generally good, especially the Copilot implementation's modular structure. However, before merging:

Must Fix:

  • Security: Add file permissions and input validation
  • Completeness: Verify GPT implementation files are complete
  • Documentation: Add comparison summary

Should Fix:

  • Add test coverage
  • Add timeouts to network calls
  • Improve error handling

Nice to Have:

  • Refactor copilot version into proper package
  • Add comparative analysis document
  • Standardize documentation across implementations

Would you like me to help implement any of these fixes?

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

scrub the sensitive info please

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

claude bot commented Oct 14, 2025

Pull Request Review: Agent Fight Results

Overview

This PR documents an interesting experiment comparing three AI coding agents (Claude Code, GitHub Copilot, and GPT Codex) on the same set of tasks. The PR adds historical code artifacts from each agent in an agent-fight/ directory.

Positive Aspects ✅

Documentation

  • Clear narrative: The tasks.md file provides excellent documentation of the methodology and experiences with each agent
  • Transparent comparison: Results are honestly presented, including challenges faced with each agent
  • Valuable historical record: Preserving all three implementations helps demonstrate the evolution of the codebase

Code Quality - Winner (Claude Code on main)

The code that ultimately landed on main shows several improvements:

  • Better separation of concerns: Created separate oauth_blaster.py module vs single-file implementations
  • Proper encapsulation: OAuthBlaster class with clear public API
  • Type hints: Added typing annotations
  • Better error handling: Validates required fields with clear error messages
  • Documentation: Includes comprehensive docstrings for the class and methods

Comparative Analysis

  • Copilot approach: function-based design, simpler for single-script use
  • GPT approach: class-based with inheritance pattern
  • Claude approach: most sophisticated architecture, with a factory function pattern

Concerns and Recommendations ⚠️

1. Security - Hardcoded Credentials

All three implementations contain hardcoded OAuth endpoints for cognito.justinsweb.com. Consider adding a note in README about replacing these with actual OAuth providers.

2. Dependency Management Inconsistency

Main branch lacks pyproject.toml while agent-fight branches have it. Add dependency files to main branch for consistency.

3. Missing Test Coverage

None of the implementations include tests. Recommend adding unit tests for token management, configuration validation, and mocked OAuth flows.

4. Code Quality - Input Validation

Missing validation for OAuth endpoints, redirect URI HTTPS check, and scope format validation.

5. Error Handling - Token Refresh

The auto-refresh mechanism lacks error handling for failures.
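
One option is to catch refresh failures and fall back to a full re-authorization. A minimal sketch with a stand-in exception type (a real OAuth library raises its own error classes, so the except clause would name those instead):

```python
class TokenRefreshError(Exception):
    """Stand-in for the OAuth library's refresh-failure exception."""

def get_valid_token(session, token_url: str, refresh_token: str, reauthorize):
    """Try the refresh grant first; restart the interactive flow if it fails."""
    try:
        return session.refresh_token(token_url, refresh_token=refresh_token)
    except TokenRefreshError:
        # Refresh token revoked or expired: re-run the authorization flow.
        return reauthorize()
```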

6. Documentation Gaps

Missing from main README: customization guide, troubleshooting, security best practices.

Summary

Verdict: ✅ APPROVE - This is valuable documentation of the development process. Code quality is acceptable for a utility script, though there is room for improvement in testing and error handling.

Action Items:

  1. Merge as-is (documentation value outweighs minor issues)
  2. Follow-up: Add tests
  3. Follow-up: Add pyproject.toml to main branch
  4. Follow-up: Enhance error handling
  5. Follow-up: Improve README with security guidance

Great work documenting this experiment!

@justinpecott justinpecott merged commit af30c5f into main Oct 14, 2025
1 check passed
@justinpecott justinpecott deleted the agent-fight branch October 14, 2025 03:35
@claude claude bot mentioned this pull request Oct 15, 2025
5 tasks