piyushkr0509 commented Feb 10, 2026

@virattt

- Extract eval parsing/reliability logic into a shared core module
- Add deterministic reliability gates before LLM judging in the eval run
- Add dataset selection support for eval runs (default, regression, custom CSV)
- Add a focused regression eval dataset
- Add regression tests for env provider key mapping and progress channel behavior
- Add eval core unit tests for CSV parsing and reliability checks
- Include the xAI provider API-key mapping fix in env config
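
For reviewers who want a concrete picture of "deterministic reliability gates before LLM judging", here is a minimal sketch of the idea. It is not the code in this PR: the names `EvalCase`, `GateResult`, `checkReliability`, `scoreCase`, and `Judge`, as well as the specific checks and the 8000-character limit, are all assumptions made for illustration. The point is only that cheap, repeatable checks run first, and the judge model is called only when they pass.

```typescript
// Hypothetical shapes; the PR does not show its actual types.
interface EvalCase {
  question: string;
  expected: string;
}

interface GateResult {
  passed: boolean;
  reasons: string[];
}

// Deterministic checks that need no model call. The specific rules and the
// default length limit are illustrative assumptions, not the PR's rules.
function checkReliability(answer: string, maxChars: number = 8000): GateResult {
  const reasons: string[] = [];
  const trimmed = answer.trim();
  if (trimmed.length === 0) {
    reasons.push("empty answer");
  }
  if (trimmed.length > maxChars) {
    reasons.push("answer exceeds length limit");
  }
  if (/^(error|exception):/i.test(trimmed)) {
    reasons.push("answer looks like an error message");
  }
  return { passed: reasons.length === 0, reasons };
}

// Stand-in for whatever LLM judge the eval run uses; returns a numeric score.
type Judge = (evalCase: EvalCase, answer: string) => Promise<number>;

interface CaseScore {
  score: number;
  gate: GateResult;
  judged: boolean; // false when the gate short-circuited the judge call
}

// Run the gate first and only call the judge when the gate passes.
async function scoreCase(
  evalCase: EvalCase,
  answer: string,
  judge: Judge
): Promise<CaseScore> {
  const gate = checkReliability(answer);
  if (!gate.passed) {
    return { score: 0, gate, judged: false };
  }
  const score = await judge(evalCase, answer);
  return { score, gate, judged: true };
}
```

Under these assumptions, a failed gate yields a score of 0 with the failure reasons attached, so the same bad output always fails the same way and no judge tokens are spent on it.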
