Local RAG in one command. Index your files, ask questions, get answers — no cloud, no config.
```bash
pip install distill-rag
```

```bash
# Index a codebase or docs folder
distill index .

# Ask questions
distill ask "how does authentication work?"
distill ask "what are the main API endpoints?"
distill ask "summarize the database schema"

# Interactive chat mode
distill chat
```

That's it. No vector databases to set up. No cloud accounts. Everything runs locally.
RAG tools are either too complex (LangChain, LlamaIndex — pages of boilerplate) or too limited (can't handle code). distill is the middle ground:
- **Fully local** — SQLite + Ollama. Your data never leaves your machine.
- **Zero config** — `distill index . && distill ask "..."`
- **Code-aware** — Understands Python, JS/TS, Rust, Go, Java, C/C++, and more
- **Doc-aware** — Markdown, RST, TXT, PDF, HTML
- **Incremental** — Re-index only changed files
- **Multi-LLM** — Works with Ollama (default), OpenAI, Anthropic, or any OpenAI-compatible API
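The incremental behavior can be illustrated with a content-hash check: hash every file, skip those whose hash matches what was stored at the last index run. This is a minimal sketch under assumptions — `file_digest`, `changed_files`, and the `stored` mapping are illustrative names, not distill's actual internals (which live in its SQLite database).

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """Hash file contents so unchanged files can be skipped on re-index."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(root: Path, stored: dict[str, str]) -> list[Path]:
    """Return files under root whose content differs from the stored digest.

    `stored` maps path -> digest from the previous index run (illustrative;
    a real tool would persist this alongside the embeddings).
    """
    out = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and stored.get(str(path)) != file_digest(path):
            out.append(path)
    return out
```

New and modified files show up in the result; files whose bytes are unchanged are skipped, which is what makes re-indexing a large tree cheap.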
```
┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
│  Files   │────▶│  Chunk   │────▶│  Embed   │────▶│  SQLite  │
│ on disk  │     │ & parse  │     │ vectors  │     │  + vec   │
└──────────┘     └──────────┘     └──────────┘     └──────────┘
                                                        │
┌──────────┐     ┌──────────┐     ┌──────────┐          │
│  Answer  │◀────│   LLM    │◀────│ Retrieve │◀─────────┘
│          │     │ generate │     │  top-k   │
└──────────┘     └──────────┘     └──────────┘
```
- **Index** — Files are chunked (respecting code boundaries), embedded, and stored in SQLite with `sqlite-vec`
- **Query** — Your question is embedded, and the top-k most similar chunks are retrieved
- **Answer** — Retrieved chunks + your question go to the LLM for a grounded answer
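The query step boils down to ranking stored chunk vectors by cosine similarity to the embedded question. A minimal sketch, assuming vectors are already computed (distill itself delegates embedding to a model and similarity search to sqlite-vec):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], chunks: list[tuple[list[float], str]], k: int = 5):
    """Return the k (score, text) pairs most similar to the query vector.

    `chunks` is a list of (embedding, chunk_text) pairs — a stand-in for
    rows in the vector index.
    """
    scored = [(cosine(query_vec, vec), text) for vec, text in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The returned texts are then pasted into the LLM prompt alongside the question, which is what keeps the answer grounded in your files.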
| Command | Description |
|---|---|
| `distill index <path>` | Index files in a directory |
| `distill ask "<question>"` | Ask a question about indexed files |
| `distill chat` | Interactive chat mode with context |
| `distill search "<query>"` | Raw similarity search (no LLM) |
| `distill status` | Show index stats |
| `distill forget <path>` | Remove a path from the index |
```
--model TEXT         LLM model (default: ollama/llama3)
--embed-model TEXT   Embedding model (default: ollama/nomic-embed-text)
--top-k INT          Number of chunks to retrieve (default: 5)
--chunk-size INT     Target chunk size in tokens (default: 512)
--include TEXT       File patterns to include (e.g., "*.py,*.md")
--exclude TEXT       File patterns to exclude
--db PATH            Database path (default: .distill/index.db)
--verbose            Show retrieved chunks and scores
```
```bash
cd my-project
distill index .
distill ask "how is error handling done?"
```

```bash
export OPENAI_API_KEY=sk-...
distill ask "explain the main architecture" --model gpt-4o-mini --embed-model text-embedding-3-small
```

```bash
distill search "database migration" --top-k 10
```

Output:
```
[0.92] src/db/migrations.py:45-78
       def run_migrations(engine):
           """Run all pending migrations in order..."""

[0.87] docs/database.md:12-34
       ## Migrations
       We use alembic for database migrations...

[0.81] src/db/models.py:1-23
       """SQLAlchemy models for the application."""
```
```bash
distill chat
```

```
distill> what does the auth middleware do?

Based on src/auth/middleware.py, the auth middleware:
1. Extracts the JWT token from the Authorization header
2. Validates it against the secret key
3. Attaches the user object to the request context
...

distill> what about rate limiting?

The rate limiter in src/middleware/ratelimit.py uses a sliding window...
```
| Category | Extensions |
|---|---|
| Code | .py, .js, .ts, .jsx, .tsx, .rs, .go, .java, .c, .cpp, .h, .rb, .php, .swift, .kt |
| Docs | .md, .rst, .txt, .html, .pdf |
| Config | .yaml, .yml, .toml, .json, .ini |
| Data | .csv, .sql |
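Code-aware chunking means splitting along structural boundaries (functions, classes) rather than fixed character windows, so a retrieved chunk is a coherent unit. A minimal sketch for Python using the standard `ast` module — illustrative only, and deliberately simplified (it drops top-level statements between definitions and ignores the token budget a real chunker would enforce):

```python
import ast

def chunk_python(source: str) -> list[str]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno/end_lineno are 1-based and inclusive
            chunks.append("\n".join(lines[node.lineno - 1 : node.end_lineno]))
    return chunks
```

The other languages in the table would need their own parsers, but the principle is the same: never cut a chunk in the middle of a definition.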
Everything stays on your machine:
- Embeddings stored in local SQLite
- Default LLM is Ollama (fully offline)
- No telemetry, no cloud calls (unless you choose a cloud model)
- Python 3.10+
- Ollama (for default local mode) — or an OpenAI/Anthropic API key
MIT