FMBench support for RAG + Model evaluations

madhurprash released this 01 Aug 14:45


Support-Open-Ended-Majority-Vote-Model-Evaluations

This version contains work-in-progress code for two evaluation methods:

Majority Vote: A panel of LLM evaluators votes on whether a given candidate model output is correct or incorrect. This method is used for RAG evaluation.
Average Pooling: A panel of LLM evaluators rates candidate model responses against user-defined subjective evaluation criteria, and the panel's ratings are averaged.
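
To make the two aggregation ideas concrete, here is a minimal Python sketch. It is not FMBench's implementation: the function names (`majority_vote`, `average_pooling`), the `"correct"`/`"incorrect"` verdict labels, the tie-breaking rule, and the criterion names in the usage example are all assumptions chosen for illustration.

```python
from collections import Counter
from statistics import mean


def majority_vote(verdicts):
    """Aggregate per-evaluator verdicts into one panel verdict.

    `verdicts` is a list of strings, one per LLM evaluator, each either
    "correct" or "incorrect" (hypothetical labels). Ties resolve to
    "incorrect", which is an assumption made for this sketch.
    """
    counts = Counter(verdicts)
    if counts["correct"] > counts["incorrect"]:
        return "correct"
    return "incorrect"


def average_pooling(ratings):
    """Average per-evaluator ratings for each user-defined criterion.

    `ratings` is a non-empty list of dicts, one per LLM evaluator, each
    mapping the same criterion names to a numeric score.
    """
    criteria = ratings[0].keys()
    return {c: mean(r[c] for r in ratings) for c in criteria}


# Hypothetical panel of three evaluators judging one candidate answer.
panel_verdict = majority_vote(["correct", "correct", "incorrect"])

# Hypothetical panel of two evaluators rating one response on two criteria.
panel_scores = average_pooling([
    {"clarity": 4, "helpfulness": 3},
    {"clarity": 5, "helpfulness": 3},
])
```

In this sketch `panel_verdict` is `"correct"` and `panel_scores` is `{"clarity": 4.5, "helpfulness": 3}`. An odd panel size avoids ties in the majority vote.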
