rag-notebook

A Python-based RAG (Retrieval-Augmented Generation) system for document processing and vector search.

Features

PDF and text document loading with LangChain
Document chunking and embedding generation
Vector storage using ChromaDB and FAISS
Support for multiple document formats (PDF, TXT)
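
The chunking step listed above can be illustrated with a minimal sketch. It uses plain fixed-size character windows with overlap and is not taken from the notebooks, which may use a LangChain text splitter instead. The function name `chunk_text` and the default `chunk_size` and `overlap` values are assumptions chosen for illustration.

```python
from __future__ import annotations


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character chunks of at most chunk_size,
    where consecutive chunks share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of overlap is emitted.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

Overlap keeps a sentence that straddles a chunk boundary partly visible in both neighbouring chunks, which tends to help retrieval at the cost of some duplicated text.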

Dependencies

langchain & langchain-community
chromadb
faiss-cpu
sentence-transformers
pymupdf & pypdf

Installation

pip install -r requirements.txt

Usage

Documents are stored in the data/ directory:

data/pdf/ - PDF files
data/text_files/ - Text files
data/vector_store/ - ChromaDB vector storage
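
As a rough sketch of how the layout above might be consumed, the snippet below collects input files from the two document folders using only the standard library. The helper name `list_documents` and the `*.pdf` / `*.txt` patterns are assumptions; the notebooks themselves may load files through LangChain loaders.

```python
from __future__ import annotations

from pathlib import Path


def _files(folder: Path, pattern: str) -> list[Path]:
    """Return sorted matches for pattern, or an empty list if folder is missing."""
    return sorted(folder.glob(pattern)) if folder.is_dir() else []


def list_documents(data_dir: str = "data") -> dict[str, list[Path]]:
    """Collect PDF and text files from the data/ layout described above."""
    root = Path(data_dir)
    return {
        "pdf": _files(root / "pdf", "*.pdf"),
        "text": _files(root / "text_files", "*.txt"),
    }


if __name__ == "__main__":
    docs = list_documents("data")
    print(len(docs["pdf"]), "PDF files,", len(docs["text"]), "text files")
```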

Notebooks

Example notebooks demonstrating document loading, processing, and the RAG pipeline:

document.ipynb - Basic document loading with text and PDF files
pdf_loader.ipynb - Complete RAG pipeline with PDF processing, chunking, embeddings, and vector storage
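
The retrieval step at the end of such a pipeline can be sketched without any vector database. The example below is a brute-force, pure-Python stand-in for what ChromaDB or FAISS do at scale: it ranks stored vectors by cosine similarity to a query vector. The function names and the hand-written two-dimensional vectors are assumptions for illustration; in the notebooks the vectors would come from an embedding model such as one from sentence-transformers.

```python
from __future__ import annotations

import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], vectors: Sequence[Sequence[float]], k: int = 3) -> list[int]:
    """Return the indices of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), idx) for idx, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest similarity first
    return [idx for _, idx in scored[:k]]


if __name__ == "__main__":
    # Toy "embeddings" for three chunks; real ones have hundreds of dimensions.
    chunk_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(top_k([1.0, 0.1], chunk_vectors, k=2))
```

A linear scan like this is fine for a handful of chunks; an index such as FAISS becomes worthwhile once the collection grows large enough that comparing against every vector is too slow.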
