Hallucinations (Confabulations) Document-Based Benchmark for RAG
benchmark leaderboard gemini llama language-model claude rag hallucinations ai-evaluation llm llm-benchmarking gpt-4o o1-mini o1-preview confabulations
Updated Jan 6, 2025 - HTML