@alzeck commented Nov 27, 2025

This pull request adds support for the OpenAI Evals API introduced earlier this year.

The test for retrieving an output item is a bit flaky, because the eval run must be completed before the fetch works. I'm happy to take any suggestions on how to implement this differently if needed.
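One possible way to reduce the flakiness is to poll the run until it reaches a terminal status before fetching the output item. The sketch below is only an illustration: `wait_for_run` is a hypothetical helper that is not part of this pull request, the block stands in for whatever call retrieves the run, and the status values (`completed`, `failed`, `canceled`) and the `"status"` key are assumptions about the response shape.

```ruby
# Hypothetical helper, not part of this PR: poll until a run is finished.
# Assumes the block returns a Hash with a "status" key.
TERMINAL_STATUSES = %w[completed failed canceled].freeze

def wait_for_run(interval: 1, timeout: 60)
  # Use the monotonic clock so wall-clock adjustments do not affect the deadline.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    run = yield # the caller supplies the "retrieve run" call as a block
    return run if TERMINAL_STATUSES.include?(run["status"])

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "run did not finish within #{timeout}s"
    end

    sleep(interval)
  end
end

# Usage in a spec (the retrieve call is a placeholder for the real client method):
#   run = wait_for_run(interval: 2, timeout: 120) { retrieve_run_somehow }
#   expect(run["status"]).to eq("completed")
```

When the spec replays a recorded VCR cassette, `interval` can be set to `0` so the loop adds no delay.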

All Submissions:

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?
  • Have you added an explanation of what your changes do and why you'd like us to include them?

Copilot AI review requested due to automatic review settings November 27, 2025 19:13
Copilot finished reviewing on behalf of alzeck November 27, 2025 19:14
Copilot AI left a comment

Pull request overview

This pull request adds comprehensive support for the OpenAI Evals API, enabling users to systematically evaluate AI model performance through the ruby-openai gem. The implementation follows the existing codebase patterns with proper module structure, test coverage using VCR cassettes, and detailed documentation.

Key Changes:

  • Added complete Evals API client implementation with support for evaluations, runs, and output items
  • Comprehensive RSpec test suite with 13 test cases covering all API endpoints
  • Extensive README documentation with usage examples for all supported operations

Reviewed changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lib/openai/evals.rb | New Evals module with nested Runs and OutputItems classes implementing all API endpoints |
| lib/openai/client.rb | Integration of evals accessor method into main client |
| lib/openai.rb | Added require statement for the new evals module |
| spec/openai/client/evals_spec.rb | Complete test suite covering all evals, runs, and output_items operations |
| README.md | Added comprehensive documentation section with 11 usage examples |
| spec/fixtures/cassettes/*.yml | 38 VCR cassette files for test fixtures |
