🐢 Open-Source Evaluation & Testing for AI & LLM systems
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM Observability all in one place.
Framework for testing vulnerabilities of large language models (LLMs).
This project aims to compare different Retrieval-Augmented Generation (RAG) frameworks in terms of speed and performance.
A framework for systematic evaluation of retrieval strategies and prompt engineering in RAG systems, featuring an interactive chat interface for document analysis.
RAG Chatbot for Financial Analysis
Learn Retrieval-Augmented Generation (RAG) from scratch using LLMs from Hugging Face, with LangChain or plain Python.
BetterRAG: Powerful RAG evaluation toolkit for LLMs. Measure, analyze, and optimize how your AI processes text chunks with precision metrics. Perfect for RAG systems, document processing, and embedding quality assessment.
Deploy your RAG pipeline with MLflow, using LlamaIndex, LangChain, and Ollama/Hugging Face LLMs/Groq.
Different approaches to evaluating RAG.
Proposal for industry-wide RAG evaluation: Generative Universal Evaluation of LLMs and Information Retrieval.
A web sandbox for hands-on learning of LLM and RAG evaluation.
PandaChat-RAG: a benchmark for evaluating RAG systems on a non-synthetic Slovenian test dataset.
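Several of the projects above measure retrieval quality as part of RAG evaluation. As a rough, generic illustration only (not the API of any toolkit listed here), the minimal Python sketch below computes precision@k and recall@k over retrieved chunk IDs; the function and variable names are hypothetical.

    def precision_recall_at_k(retrieved_ids, relevant_ids, k):
        # retrieved_ids: ranked list of chunk IDs returned by the retriever
        # relevant_ids:  set of chunk IDs labeled relevant (ground truth)
        top_k = retrieved_ids[:k]
        hits = sum(1 for chunk_id in top_k if chunk_id in relevant_ids)
        precision = hits / k if k else 0.0
        recall = hits / len(relevant_ids) if relevant_ids else 0.0
        return precision, recall

    # Example: retriever returned chunks [3, 7, 1, 9]; chunks {1, 2, 3} are relevant.
    p, r = precision_recall_at_k([3, 7, 1, 9], {1, 2, 3}, k=3)
    print(f"precision@3={p:.2f} recall@3={r:.2f}")  # precision@3=0.67 recall@3=0.67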