FMBench support for RAG + Model evaluations
madhurprash released this 01 Aug 14:45
This version contains work-in-progress code for two evaluation methods, both of which use a panel of LLM evaluators:
- Majority Vote: for RAG evaluation, the panel votes on whether a given candidate model output is correct or incorrect.
- Average Pooling: the panel evaluates candidate model responses against user-defined subjective evaluation criteria.
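
The two aggregation ideas named above can be sketched in a few lines of Python. This is only an illustration and not FMBench's actual code: the function names `majority_vote` and `average_pooling` are hypothetical, and the sketch assumes each evaluator returns either a verdict string or a numeric score, and that a tied vote yields no verdict.

```python
from collections import Counter
from statistics import mean


def majority_vote(verdicts):
    """Return the verdict chosen by most evaluators, or None on a tie.

    Hypothetical helper: `verdicts` is a list such as
    ["correct", "incorrect", "correct"], one entry per LLM evaluator.
    """
    if not verdicts:
        raise ValueError("at least one evaluator verdict is required")
    counts = Counter(verdicts).most_common()
    # Assumption: a tie between the top two verdicts means no decision.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def average_pooling(scores):
    """Return the mean of the evaluators' numeric scores.

    Hypothetical helper: `scores` holds one number per LLM evaluator,
    each rating the response against a user-defined criterion.
    """
    if not scores:
        raise ValueError("at least one evaluator score is required")
    return mean(scores)
```

For example, `majority_vote(["correct", "incorrect", "correct"])` returns `"correct"`, and `average_pooling([4, 5, 3])` returns `4`. An odd number of evaluators avoids ties in the two-class case.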