10 changes: 5 additions & 5 deletions CLAUDE.md
@@ -9,10 +9,13 @@ qmd collection add . --name <n> # Create/index collection
qmd collection list # List all collections with details
qmd collection remove <name> # Remove a collection by name
qmd collection rename <old> <new> # Rename a collection
qmd collection show <name> # Show collection details
qmd collection update-cmd <n> cmd # Set pre-update command (e.g. 'git pull')
qmd collection include <name> # Include in default queries
qmd collection exclude <name> # Exclude from default queries
qmd ls [collection[/path]] # List collections or files in a collection
qmd context add [path] "text" # Add context for path (defaults to current dir)
qmd context list # List all contexts
qmd context check # Check for collections/paths missing context
qmd context rm <path> # Remove context
qmd get <file> # Get document by path or docid (#abc123)
qmd multi-get <pattern> # Get multiple docs by glob or comma-separated list
@@ -70,9 +73,6 @@ qmd context add qmd://journals/2024 "Journal entries from 2024"
# List all contexts
qmd context list

# Check for collections or paths without context
qmd context check

# Remove context
qmd context rm qmd://journals/2024
qmd context rm / # Remove global context
@@ -135,7 +135,7 @@ bun test --preload ./src/test-preload.ts test/

- SQLite FTS5 for full-text search (BM25)
- sqlite-vec for vector similarity search
- node-llama-cpp for embeddings (embeddinggemma), reranking (qwen3-reranker), and query expansion (Qwen3)
- node-llama-cpp for embeddings (embeddinggemma), reranking (qwen3-reranker), and query expansion (qmd-query-expansion-1.7B)
- Reciprocal Rank Fusion (RRF) for combining results
- Smart chunking: 900 tokens/chunk with 15% overlap, prefers markdown headings as boundaries

38 changes: 20 additions & 18 deletions README.md
@@ -74,9 +74,7 @@ qmd get "docs/api-reference.md" --full
Although the tool works perfectly fine when you just tell your agent to use it on the command line, it also exposes an MCP (Model Context Protocol) server for tighter integration.

**Tools exposed:**
- `qmd_search` - Fast BM25 keyword search (supports collection filter)
- `qmd_vector_search` - Semantic vector search (supports collection filter)
- `qmd_deep_search` - Deep search with query expansion and reranking (supports collection filter)
- `qmd_query` - Search with typed sub-queries (lex/vec/hyde/expand)
- `qmd_get` - Retrieve document by path or docid (with fuzzy matching suggestions)
- `qmd_multi_get` - Retrieve multiple documents by glob pattern, list, or docids
- `qmd_status` - Index health and collection info
@@ -129,8 +127,9 @@ qmd mcp stop # stop via PID file
qmd status # shows "MCP: running (PID ...)" when active
```

The HTTP server exposes two endpoints:
Endpoints:
- `POST /mcp` — MCP Streamable HTTP (JSON responses, stateless)
- `POST /query` — REST structured search (alias: `/search`)
- `GET /health` — liveness check with uptime

LLM models stay loaded in VRAM across requests. Embedding/reranking contexts are disposed after 5 min idle and transparently recreated on the next request (~1s penalty, models remain loaded).
@@ -180,14 +179,14 @@ Point any MCP client at `http://localhost:8181/mcp` to connect.
│ RRF Fusion + Bonus │
│ Original query: ×2 │
│ Top-rank bonus: +0.05│
│ Top 30 Kept │
│ Top 40 Kept │
└───────────┬───────────┘
┌───────────────────────┐
│ LLM Re-ranking │
│ (qwen3-reranker) │
Yes/No + logprobs
rankAll() scores
└───────────┬───────────┘
@@ -207,7 +206,7 @@ Point any MCP client at `http://localhost:8181/mcp` to connect.
|---------|-----------|------------|-------|
| **FTS (BM25)** | SQLite FTS5 BM25 | `Math.abs(score)` | 0 to ~25+ |
| **Vector** | Cosine distance | `1 / (1 + distance)` | 0.0 to 1.0 |
| **Reranker** | LLM 0-10 rating | `score / 10` | 0.0 to 1.0 |
| **Reranker** | `rankAll()` score | Used directly | 0.0 to 1.0 |
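
The normalization rules in the table can be sketched as one small function. This is a minimal illustration, not the project's code: the `Backend` type and `normalizeScore` name are hypothetical, and only the three formulas come from the table.

```typescript
// Hypothetical helper: names are illustrative, formulas are from the table.
type Backend = "fts" | "vector" | "reranker";

function normalizeScore(backend: Backend, raw: number): number {
  switch (backend) {
    case "fts":
      // SQLite FTS5 bm25() yields negative values (more negative = better),
      // so the absolute value gives a "higher is better" score.
      return Math.abs(raw);
    case "vector":
      // Cosine distance 0 maps to 1.0; larger distances approach 0.
      return 1 / (1 + raw);
    case "reranker":
      // The rankAll() score is already in the 0.0 to 1.0 range.
      return raw;
  }
}
```

Note that the FTS score is unbounded above (roughly 0 to 25+), so it is not directly comparable to the other two without rank-based fusion.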

### Fusion Strategy

@@ -217,8 +216,8 @@ The `query` command uses **Reciprocal Rank Fusion (RRF)** with position-aware bl
2. **Parallel Retrieval**: Each query searches both FTS and vector indexes
3. **RRF Fusion**: Combine all result lists using `score = Σ(1/(k+rank+1))` where k=60
4. **Top-Rank Bonus**: Documents ranking #1 in any list get +0.05, #2-3 get +0.02
5. **Top-K Selection**: Take top 30 candidates for reranking
6. **Re-ranking**: LLM scores each document (yes/no with logprobs confidence)
5. **Top-K Selection**: Take top 40 candidates for reranking
6. **Re-ranking**: Cross-encoder scores each document via `rankAll()`
7. **Position-Aware Blending**:
- RRF rank 1-3: 75% retrieval, 25% reranker (preserves exact matches)
- RRF rank 4-10: 60% retrieval, 40% reranker
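
Steps 3 to 5 above can be sketched as follows. This is a rough illustration under stated assumptions, not the actual implementation: `RankedList` and `rrfFuse` are hypothetical names, `rank` is taken to be 0-indexed (which matches the `rank+1` in the formula), the ×2 weight for the original query is applied as a multiplier on each term, and the top-rank bonus is applied once per document based on its best rank in any list.

```typescript
// Hypothetical types and names; only k=60, the bonuses, and top-40 come from the text.
interface RankedList {
  docIds: string[]; // best match first
  weight: number;   // assumed: 2 for the original query, 1 for expansion variants
}

function rrfFuse(
  lists: RankedList[],
  k = 60,
  topK = 40,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  const bestRank = new Map<string, number>();

  for (const list of lists) {
    list.docIds.forEach((id, rank) => {
      // score = sum of weight / (k + rank + 1), with rank starting at 0
      scores.set(id, (scores.get(id) ?? 0) + list.weight / (k + rank + 1));
      bestRank.set(id, Math.min(bestRank.get(id) ?? Infinity, rank));
    });
  }

  const fused = [...scores.entries()].map(([id, score]) => {
    const best = bestRank.get(id) ?? Infinity;
    // +0.05 for a #1 placement in any list, +0.02 for #2 or #3
    const bonus = best === 0 ? 0.05 : best <= 2 ? 0.02 : 0;
    return { id, score: score + bonus };
  });

  fused.sort((a, b) => b.score - a.score);
  return fused.slice(0, topK); // candidates handed to the reranker
}
```

Because RRF uses only ranks, the unbounded BM25 scores and the 0 to 1 vector scores never need to share a scale before fusion.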
@@ -476,20 +475,23 @@ Index stored in: `~/.cache/qmd/index.sqlite`
### Schema

```sql
collections -- Indexed directories with name and glob patterns
path_contexts -- Context descriptions by virtual path (qmd://...)
documents -- Markdown content with metadata and docid (6-char hash)
content -- Content-addressable storage (hash PK, doc, created_at)
documents -- File metadata (collection, path, title, hash FK to content)
documents_fts -- FTS5 full-text index
content_vectors -- Embedding chunks (hash, seq, pos, 900 tokens each)
vectors_vec -- sqlite-vec vector index (hash_seq key)
llm_cache -- Cached LLM responses (query expansion, rerank scores)
```

Collections and contexts are managed in `~/.config/qmd/index.yml`, not SQLite.

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `XDG_CACHE_HOME` | `~/.cache` | Cache directory location |
| `XDG_CACHE_HOME` | `~/.cache` | Cache directory (SQLite index) |
| `XDG_CONFIG_HOME` | `~/.config` | Config directory (index.yml) |
| `NO_COLOR` | *(unset)* | Disable terminal colors when set |

## How It Works

@@ -575,11 +577,11 @@ Query ──► LLM Expansion ──► [Original, Variant 1, Variant 2]
Top-rank bonus: +0.05/#1, +0.02/#2-3
Top 30 candidates
Top 40 candidates
LLM Re-ranking
(yes/no + logprob confidence)
(cross-encoder via rankAll())
Position-Aware Blend
@@ -613,11 +615,11 @@ const DEFAULT_GENERATE_MODEL = "hf:tobil/qmd-query-expansion-1.7B-gguf/qmd-query

### Qwen3-Reranker

Uses node-llama-cpp's `createRankingContext()` and `rankAndSort()` API for cross-encoder reranking. Returns documents sorted by relevance score (0.0 - 1.0).
Uses node-llama-cpp's `createRankingContext()` and `rankAll()` API for cross-encoder reranking. Returns documents sorted by relevance score (0.0 - 1.0).

### Qwen3 (Query Expansion)
### Query Expansion (Fine-tuned)

Used for generating query variations via `LlamaChatSession`.
`qmd-query-expansion-1.7B` generates query variations via `LlamaChatSession`. Fine-tuned from Qwen3-1.7B.

## License

16 changes: 4 additions & 12 deletions docs/SYNTAX.md
@@ -106,24 +106,16 @@ This generates lex, vec, and hyde variations automatically. Useful when you don'

## MCP/HTTP API

The `query` tool accepts a query document:

```json
{
"q": "lex: CAP theorem\nvec: consistency vs availability",
"collections": ["docs"],
"limit": 10
}
```

Or structured format:
The `query` tool accepts a structured format:

```json
{
"searches": [
{ "type": "lex", "query": "CAP theorem" },
{ "type": "vec", "query": "consistency vs availability" }
]
],
"collections": ["docs"],
"limit": 10
}
```
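
A client could assemble this structured payload with a small typed helper. The sketch below is illustrative only: `SubQuery`, `QueryRequest`, and `buildQueryRequest` are hypothetical names, the field names mirror the JSON above, the limit of one `expand` sub-query follows the type table in `mcp-setup.md`, and the empty-list check is an assumption added here.

```typescript
// Hypothetical client-side types mirroring the structured format above.
type SearchType = "lex" | "vec" | "hyde" | "expand";

interface SubQuery {
  type: SearchType;
  query: string;
}

interface QueryRequest {
  searches: SubQuery[];
  collections?: string[];
  limit?: number;
}

function buildQueryRequest(
  searches: SubQuery[],
  collections?: string[],
  limit = 10,
): QueryRequest {
  // Assumed validation: the server's own checks may differ.
  if (searches.length === 0) {
    throw new Error("at least one sub-query is required");
  }
  const expandCount = searches.filter((s) => s.type === "expand").length;
  if (expandCount > 1) {
    throw new Error("at most one expand sub-query is allowed");
  }
  const request: QueryRequest = { searches, limit };
  if (collections && collections.length > 0) {
    request.collections = collections;
  }
  return request;
}
```

The returned object can be serialized with `JSON.stringify` and passed as the tool arguments. Whether `POST /query` accepts exactly the same body is not confirmed by this diff.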

12 changes: 8 additions & 4 deletions skills/qmd/references/mcp-setup.md
@@ -49,7 +49,7 @@ qmd mcp stop # Stop daemon

## Tools

### structured_search
### query

Search with pre-expanded queries.

@@ -61,7 +61,7 @@ Search with pre-expanded queries.
{ "type": "hyde", "query": "hypothetical answer passage..." }
],
"limit": 10,
"collection": "optional",
"collections": ["optional"],
"minScore": 0.0
}
```
@@ -71,15 +71,17 @@ Search with pre-expanded queries.
| `lex` | BM25 | Keywords (2-5 terms) |
| `vec` | Vector | Question |
| `hyde` | Vector | Answer passage (50-100 words) |
| `expand` | LLM | Auto-expand (max 1 per query) |

### get

Retrieve document by path or `#docid`.

| Param | Type | Description |
|-------|------|-------------|
| `path` | string | File path or `#docid` |
| `full` | bool? | Return full content |
| `file` | string | File path or `#docid` |
| `fromLine` | number? | Start from line (1-indexed) |
| `maxLines` | number? | Max lines to return |
| `lineNumbers` | bool? | Add line numbers |

### multi_get
@@ -89,7 +91,9 @@ Retrieve multiple documents.
| Param | Type | Description |
|-------|------|-------------|
| `pattern` | string | Glob or comma-separated list |
| `maxLines` | number? | Max lines per file |
| `maxBytes` | number? | Skip large files (default 10KB) |
| `lineNumbers` | bool? | Add line numbers |

### status
