LocalRAG Forge is a local-first RAG framework for maximizing the value of private, self-hosted, and on-device LLM systems. It helps developers build stronger document intelligence workflows around retrieval, reranking, knowledge ingestion, grounded generation, and evaluation without depending on a cloud-only stack.
The project is designed for teams that want to get more capability out of local models through better retrieval pipelines, higher-quality context construction, reusable dataset workflows, and repeatable evaluation. With LocalRAG Forge, you can improve retrieval quality, structure knowledge ingestion, benchmark responses, and validate local RAG behavior before shipping changes into production.
- Local-first RAG pipeline orchestration
- Retrieval and reranking optimization
- Knowledge ingestion and document processing
- Dataset generation
- LLM evaluation
- Automated AI testing
- Dataset quality analysis
- Retrieval grounding checks
- Workflow-level benchmarking
- Extensible evaluation modules
```
Document / Knowledge Base
           |
           v
      RAG Pipeline
(retrieve, rank, prompt)
           |
           v
      LLM Response
           |
           v
    Evaluation Module
  (relevance, grounding,
   retrieval coverage)
           |
           v
     Quality Metrics
    and Test Reports
```
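The retrieve, rank, and prompt stages in the diagram can be sketched as plain Python. The keyword-overlap scoring and prompt template below are illustrative assumptions for demonstration, not LocalRAG Forge's actual implementation:

```python
# Illustrative sketch of the retrieve -> rank -> prompt flow shown above.
# The overlap scoring and prompt format are assumptions, not framework APIs.

def retrieve(query: str, documents: dict[str, str], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank documents by keyword overlap with the query and keep the top_k."""
    query_terms = set(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        overlap = len(query_terms & set(text.lower().split()))
        scored.append((doc_id, overlap / max(len(query_terms), 1)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

def build_prompt(query: str, documents: dict[str, str], ranked: list[tuple[str, float]]) -> str:
    """Assemble the retrieved passages into a grounded prompt for the LLM."""
    context = "\n".join(documents[doc_id] for doc_id, _ in ranked)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = {
    "doc-1": "LocalRAG Forge generates evaluation datasets from source documents.",
    "doc-2": "LocalRAG Forge evaluates grounding, relevance, and retrieval quality.",
}
ranked = retrieve("How are evaluation datasets generated?", docs, top_k=1)
print(ranked[0][0])  # best-matching document id: doc-1
print(build_prompt("How are evaluation datasets generated?", docs, ranked))
```

A real pipeline would swap the keyword overlap for embedding similarity and a reranker, but the data flow between the stages stays the same.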
Install Python dependencies:
```
pip install -r requirements.txt
```

Or install the local framework package in editable mode:

```
pip install -e .
```

Optional local services:

```
docker compose up -d
```

Run the end-to-end demo:
```
python examples/demo_rag_test.py
```

```python
from core.engine import build_default_llm
from dataset.dataset_builder import DatasetBuilder
from evaluation.evaluator import RAGEvaluator
from pipeline.workflow import TestingWorkflow
from rag.rag_pipeline import SimpleRAGPipeline, SourceDocument

documents = [
    SourceDocument(
        document_id="doc-1",
        text="LocalRAG Forge can generate evaluation datasets from source documents and golden answers.",
    ),
    SourceDocument(
        document_id="doc-2",
        text="LocalRAG Forge evaluates RAG systems using grounding, relevance, and retrieval quality metrics.",
    ),
]

pipeline = SimpleRAGPipeline(
    documents=documents,
    llm_client=build_default_llm(),
)

dataset = DatasetBuilder.from_records(
    name="demo-dataset",
    records=[
        {
            "sample_id": "sample-1",
            "question": "What can LocalRAG Forge generate for evaluation?",
            "expected_answer": "LocalRAG Forge can generate evaluation datasets from source documents and golden answers.",
            "relevant_document_ids": ["doc-1"],
        }
    ],
)

workflow = TestingWorkflow(
    pipeline=pipeline,
    evaluator=RAGEvaluator(),
)

report = workflow.run(dataset)
print(report.average_score)
```

```
LocalRAG Forge
├── core
│   └── engine.py
├── rag
│   └── rag_pipeline.py
├── dataset
│   └── dataset_builder.py
├── evaluation
│   └── evaluator.py
├── pipeline
│   └── workflow.py
├── examples
│   └── demo_rag_test.py
├── docs
├── tests
└── cli
```
- `core`: shared runtime components such as LLM adapters, execution primitives, and engine abstractions.
- `rag`: retrieval and generation pipeline implementations used to test AI application behavior.
- `dataset`: tools for building evaluation datasets, golden sets, and sample records.
- `evaluation`: scoring logic for response quality, grounding, retrieval coverage, and benchmark metrics.
- `pipeline`: orchestration workflows that connect datasets, pipelines, and evaluators into repeatable test runs.
- `examples`: runnable demos that show how to test a RAG workflow with minimal setup.
- `docs`: architecture notes, guides, and future public documentation.
- `tests`: automated tests for framework behavior and regression protection.
- `cli`: command-line interfaces for running evaluations and local automation.
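To make the evaluation module's role concrete, here is a minimal grounding metric: the fraction of answer tokens that also appear in the retrieved context. This is an illustrative stand-in for the kind of scoring `evaluation/evaluator.py` performs, not its actual algorithm:

```python
# Toy grounding metric: what fraction of the answer's tokens are supported
# by the retrieved context? An illustrative sketch, not RAGEvaluator's logic.

def grounding_score(answer: str, context: str) -> float:
    answer_tokens = answer.lower().split()
    if not answer_tokens:
        return 0.0
    context_tokens = set(context.lower().split())
    supported = sum(1 for tok in answer_tokens if tok in context_tokens)
    return supported / len(answer_tokens)

context = "localrag forge evaluates grounding relevance and retrieval quality"
print(grounding_score("localrag forge evaluates grounding", context))  # 1.0
print(grounding_score("it hallucinates facts", context))  # 0.0
```

Production evaluators typically use semantic matching (embeddings or an LLM judge) rather than exact token overlap, but the metric shape, a 0-to-1 support score per answer, is the same.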
- Local model orchestration
- Private knowledge workflows
- Agent testing
- Multi-model evaluation
- Dataset synthesis
- AI pipeline benchmarking
- Retrieval regression suites
- CI-integrated evaluation reports
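For retrieval regression suites and CI-integrated reports, a common pattern is to fail the build when the average evaluation score drops below a threshold. A framework-agnostic sketch (the per-sample scores and the 0.8 threshold here are made-up values for illustration):

```python
# Gate a CI run on an aggregate evaluation score. In practice the per-sample
# scores would come from an evaluation run (e.g. a workflow report); they are
# hard-coded here to keep the sketch self-contained.

def check_regression(scores: list[float], threshold: float = 0.8) -> bool:
    """Return True when the average score meets or exceeds the threshold."""
    if not scores:
        return False
    return sum(scores) / len(scores) >= threshold

sample_scores = [0.92, 0.85, 0.78]  # illustrative per-sample scores
average = sum(sample_scores) / len(sample_scores)
print(f"average={average:.2f} pass={check_regression(sample_scores)}")
```

Wrapping a check like this in a test runner lets a pull request fail automatically when a retrieval or prompt change regresses evaluation quality.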
Contributions are welcome. Please open an issue to discuss major changes, submit pull requests for improvements, and include tests or examples when adding new features.
MIT License