FMBench support for RAG + Model evaluations

@madhurprash madhurprash released this 01 Aug 14:45
· 215 commits to 144-add-evaluation-support-to-fmbench since this release

This version contains work-in-progress code for two evaluation methods:

  1. Majority Vote: a panel of LLM evaluators judges, for RAG evaluation, whether a given candidate model output is correct or incorrect; the verdict held by most evaluators is taken as the result.
  2. Average Pooling: a panel of LLM evaluators scores candidate model responses against user-defined subjective evaluation criteria; the scores are averaged across the panel.
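
The two aggregation ideas above can be sketched in a few lines of Python. This is only an illustrative sketch, not FMBench's actual implementation: the function names `majority_vote` and `average_pool`, the verdict labels, and the numeric score scale are all assumptions made for the example, and the evaluator outputs are assumed to have been collected already.

```python
from collections import Counter
from statistics import mean


def majority_vote(verdicts):
    """Return the verdict given by most evaluators, or None if there is no clear winner.

    `verdicts` is a list of labels such as "correct" / "incorrect"
    (hypothetical labels, one per LLM evaluator in the panel).
    """
    counts = Counter(verdicts).most_common()
    if not counts:
        return None
    # A tie between the two most frequent verdicts means no majority.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def average_pool(scores):
    """Return the mean of the evaluators' numeric scores for one response.

    `scores` is a non-empty list of numbers (hypothetical scale, one per evaluator).
    """
    return mean(scores)


# Example panel of three evaluators judging one candidate response.
panel_verdicts = ["correct", "incorrect", "correct"]
panel_scores = [4, 5, 3]
final_verdict = majority_vote(panel_verdicts)
final_score = average_pool(panel_scores)
```

Using an odd number of evaluators avoids ties in the majority vote; in this sketch a tie is reported as `None` rather than resolved arbitrarily.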