Make it easy to run evaluation directly from this repo #2233

Merged: 16 commits merged into Azure-Samples:main on Feb 10, 2025

Conversation

pamelafox (Collaborator) commented on Dec 13, 2024

Purpose

This PR makes it easier to run evaluations by bringing the evaluation SDK and tools directly into the repo. The scripts still use ai-rag-chat-evaluator for its custom evaluation metrics and its evaluation review CLI tools, but I've moved ground truth generation directly into the repo, since I've found that it is often very specific to the needs of each repo.
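
For context, here is a minimal, hypothetical sketch of what an evaluation run with the `azure-ai-evaluation` SDK can look like. This is not the code added by this PR; the file paths, deployment name, and choice of evaluators are assumptions for illustration only.

```python
# Hypothetical sketch: running built-in LLM-judged evaluators over a JSONL dataset.
# Paths and deployment names are placeholders, not the scripts added by this PR.
from azure.ai.evaluation import GroundednessEvaluator, RelevanceEvaluator, evaluate

# Judge model configuration (an Azure OpenAI deployment).
model_config = {
    "azure_endpoint": "https://<your-openai-resource>.openai.azure.com",
    "api_key": "<your-api-key>",
    "azure_deployment": "gpt-4o",
}

result = evaluate(
    # Each JSONL row holds query, response, context, and ground_truth fields.
    data="evals/ground_truth.jsonl",
    evaluators={
        "groundedness": GroundednessEvaluator(model_config),
        "relevance": RelevanceEvaluator(model_config),
    },
    output_path="evals/results.json",
)
print(result["metrics"])
```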

This PR uses RAGAS for ground truth data generation, which works by constructing a knowledge graph over the documents and deriving question scenarios from it. That's a different approach from azure-ai-generative, which we used previously, but that SDK is now deprecated, and the RAGAS approach seems to produce good questions.
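
As a rough illustration of RAGAS-based ground truth generation, here is a hedged sketch (again, not the PR's actual script; the document loader, model deployments, and output path are assumptions):

```python
# Hypothetical sketch: generating a synthetic test set with RAGAS (0.2-style API).
# RAGAS builds a knowledge graph over the documents and derives question scenarios from it.
from langchain_community.document_loaders import DirectoryLoader
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.llms import LangchainLLMWrapper
from ragas.testset import TestsetGenerator

# Load the source documents that the RAG app indexes (path is a placeholder).
docs = DirectoryLoader("data/", glob="**/*.md").load()

# Wrap Azure OpenAI models for RAGAS; endpoint and key are read from environment variables.
generator_llm = LangchainLLMWrapper(AzureChatOpenAI(azure_deployment="gpt-4o"))
generator_embeddings = LangchainEmbeddingsWrapper(
    AzureOpenAIEmbeddings(azure_deployment="text-embedding-3-large")
)

generator = TestsetGenerator(llm=generator_llm, embedding_model=generator_embeddings)
testset = generator.generate_with_langchain_docs(docs, testset_size=10)

# Persist the generated question/ground-truth pairs for later evaluation runs.
testset.to_pandas().to_json("evals/ground_truth.jsonl", orient="records", lines=True)
```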

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[ ] Yes
[X] No

Does this require changes to learn.microsoft.com docs?

This repository is referenced by this tutorial, which includes deployment, settings, and usage instructions. If text or screenshots need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.

[X] Yes - I need to update the evaluation tutorial!
[ ] No

Type of change

[ ] Bugfix
[X] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works
  • I ran python -m pytest --cov to verify 100% coverage of added lines
  • I ran python -m mypy to check for type errors
  • I either used the pre-commit hooks or ran ruff and black manually on my code.

Review threads on docs/evaluation.md and evals/requirements.txt were marked resolved.
@pamelafox changed the title from "WIP: Bring evaluation more tightly into the repo" to "Make it easy to run evaluation directly from this repo" on Feb 8, 2025
@pamelafox merged commit a7dfc64 into Azure-Samples:main on Feb 10, 2025
18 checks passed