PairOfCleats

Local-first hybrid code search for humans and coding agents.
Build an offline index of a repo, then retrieve the most relevant chunks using BM25 + fuzzy matching + embeddings + metadata filters.

The name "PairOfCleats" is a phonetic nod to "Paraclete", a word meaning helper or advocate.

The idea: give your agent (or you) a helper that can sprint through a large codebase with better traction than plain grep.

What this is

PairOfCleats builds a hybrid semantic index for a repository (code + configs + docs, and optionally triage records) and exposes:

a CLI (pairofcleats search, pairofcleats index build)
an HTTP API server (pairofcleats service api)

Why it exists

Large repos make "just read the whole tree" impractical.
Grep is fast but literal.
Pure embeddings can be fuzzy and harder to constrain.
Agents need structured context (functions/classes/sections), not giant file dumps.

PairOfCleats combines the strengths:

Chunk-aware indexing → results are immediately usable snippets
Lexical + fuzzy + semantic retrieval → better recall without losing precision
Rich metadata → filters like type/signature/reads-writes/calls/churn/risk tags
Scale options → memory artifacts for small repos; SQLite + ANN for large ones; auto picks the best available backend based on index size + installed deps
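
One common way to combine lexical, fuzzy, and semantic result lists is reciprocal rank fusion (RRF). The sketch below is illustrative only: the source does not say which fusion method PairOfCleats uses, and `fuseRankings` and the chunk ids are hypothetical names invented for this example.

```javascript
// Reciprocal rank fusion (RRF): merge several ranked lists of chunk ids.
// Each list contributes 1 / (k + rank) per id; k = 60 is a common default.
// Assumption: this is a generic technique, not PairOfCleats' actual scorer.
function fuseRankings(rankings, k = 60) {
  const scores = new Map();
  for (const ranked of rankings) {
    ranked.forEach((chunkId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(chunkId, (scores.get(chunkId) ?? 0) + 1 / (k + rank));
    });
  }
  // Sort by fused score, highest first; break ties by id for determinism.
  return [...scores.entries()]
    .sort((x, y) => y[1] - x[1] || x[0].localeCompare(y[0]))
    .map(([chunkId, score]) => ({ chunkId, score }));
}

// Hypothetical per-retriever results for one query.
const lexical = ['auth.js#verifyJwt', 'auth.js#signJwt', 'README.md#auth'];
const fuzzy = ['auth.js#verifyJwt', 'jwt-utils.js#decode'];
const semantic = ['README.md#auth', 'auth.js#verifyJwt', 'session.js#refresh'];

const fused = fuseRankings([lexical, fuzzy, semantic]);
console.log(fused.slice(0, 3));
```

A chunk that several retrievers agree on rises to the top even if no single retriever ranked it first, which is the intuition behind "better recall without losing precision".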

Requirements

>Node.js 24.13.0 LTS (see .nvmrc)
Optional (recommended for best Python chunk metadata): Python 3 (indexing.pythonAst.*)
Optional (recommended for large repos): SQLite backend (via better-sqlite3)
Optional (recommended for fastest semantic search): sqlite-vec extension for ANN
Optional (document extraction): PDF/DOCX support (planned) via pdfjs-dist + mammoth (indexing.documentExtraction.enabled)
Optional performance backends (auto-selected when available): LMDB, LanceDB, SQLite ANN extension. Set explicit config to force a backend.

Quick start

pairofcleats setup (guided prompts for install, dictionaries, models, extensions, tooling, and indexes)
pairofcleats index watch
pairofcleats service api (local HTTP JSON API for status/search)
CLI: node bin/pairofcleats.js <command>

Install

npm install

CI/PR test suite

node tests/run.js --lane ci-lite

Guided setup (recommended)

pairofcleats setup

Bootstrap (no prompts)

pairofcleats bootstrap

Build index

pairofcleats index build
# Add --mode code|prose|extracted-prose|records|all|both to scope the index
# Add --quality auto|fast|balanced|max to tune AutoPolicy

Search

pairofcleats search -- "how do we validate JWT tokens?"
pairofcleats search -- "UserRepository findByEmail" --mode code
pairofcleats search -- "rate limit exceeded" --mode prose

Query syntax (core)

"exact phrase" boosts phrase matches
-term excludes a token
-"phrase" excludes a phrase

Modes:

--mode code (code-focused)
--mode prose (docs/readmes/comments)
--mode extracted-prose (comment-prose only; requires extracted-prose index)
--mode records (triage records)
--mode both (alias for all when indexing)
--mode all (code + prose + extracted-prose + records)

Use --explain (or --why) to see score breakdowns.

Backends (memory, SQLite, LMDB)

PairOfCleats can query indexes through different backends:

memory: file-backed JSON artifacts loaded into memory
sqlite: SQLite tables used as the backend (same general scoring model)
lmdb: LMDB tables used as the backend (build separately)

For large repos, SQLite is usually the best experience.

Build LMDB indexes:

pairofcleats lmdb build

Search with SQLite:

pairofcleats search -- "query" --backend sqlite
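
When no backend is forced, selection depends on index size and installed dependencies. The sketch below shows one way such a decision could look. The function name, the input fields, and the 50,000-chunk threshold are all assumptions for illustration, not PairOfCleats defaults.

```javascript
// Pick a backend from index size and which optional deps are installed.
// Assumption: the threshold is illustrative, not a PairOfCleats default.
const LARGE_INDEX_CHUNKS = 50_000;

function chooseBackend({ chunkCount, hasSqlite, hasSqliteVec }) {
  const isLarge = chunkCount >= LARGE_INDEX_CHUNKS;
  if (isLarge && hasSqlite) {
    // Large index: prefer SQLite, with ANN only if the extension is present.
    return { backend: 'sqlite', ann: hasSqliteVec };
  }
  // Small index, or SQLite unavailable: fall back to in-memory artifacts.
  return { backend: 'memory', ann: false };
}

console.log(chooseBackend({ chunkCount: 120_000, hasSqlite: true, hasSqliteVec: false }));
console.log(chooseBackend({ chunkCount: 2_000, hasSqlite: true, hasSqliteVec: true }));
```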

Where artifacts live (cache)

By default, caches and indexes live outside the repo:

cache root: OS-specific (override with cache.root in .pairofcleats.json)
per-repo artifacts: <cache>/repos/<repoId>/builds/<buildId>/index-code, index-prose, etc.
current pointer: <cache>/repos/<repoId>/builds/current.json (active build root)

Override cache location via .pairofcleats.json:

{ "cache": { "root": "/absolute/path/to/cache" } }
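
The layout above can be expressed as two small path helpers. This is a sketch: the helper names and the example `repoId` and `buildId` values are hypothetical, and extending the `index-<mode>` pattern beyond `index-code` and `index-prose` is an assumption based on the "etc." in the layout description.

```javascript
// Compose the artifact directory for one build and mode, following the
// documented layout: <cache>/repos/<repoId>/builds/<buildId>/index-<mode>.
function artifactDir(cacheRoot, repoId, buildId, mode) {
  return [cacheRoot, 'repos', repoId, 'builds', buildId, `index-${mode}`].join('/');
}

// Location of the pointer file that names the active build root.
function currentPointerPath(cacheRoot, repoId) {
  return [cacheRoot, 'repos', repoId, 'builds', 'current.json'].join('/');
}

// Hypothetical ids, for illustration only.
console.log(artifactDir('/absolute/path/to/cache', 'my-repo', 'build-001', 'code'));
console.log(currentPointerPath('/absolute/path/to/cache', 'my-repo'));
```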

Mental model

PairOfCleats has two steps: build an index, then search it.

Index: repo files -> index build -> artifacts/sqlite

Search: query -> filters + rank -> top chunks

ASCII draft:

[Repo] -> [Index build] -> [Artifacts / SQLite]
[Query] -> [Search pipeline] -> [Ranked chunks]

Detailed diagrams: docs/guides/architecture.md
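
The two-step model can be illustrated with a toy in-memory version: build an inverted index from chunks, then rank chunks for a query with BM25. This is a sketch of the general technique only. The function names, the tokenizer, and the sample chunks are hypothetical, and the real pipeline adds fuzzy matching, embeddings, and metadata filters.

```javascript
// Lowercase and split into alphanumeric tokens (a deliberately simple tokenizer).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

// Step 1: build an in-memory inverted index from chunks.
function buildIndex(chunks) {
  const postings = new Map(); // term -> Map(chunkId -> term frequency)
  const lengths = new Map();  // chunkId -> token count
  for (const { id, text } of chunks) {
    const tokens = tokenize(text);
    lengths.set(id, tokens.length);
    for (const token of tokens) {
      if (!postings.has(token)) postings.set(token, new Map());
      const entry = postings.get(token);
      entry.set(id, (entry.get(id) ?? 0) + 1);
    }
  }
  const total = [...lengths.values()].reduce((sum, n) => sum + n, 0);
  return { postings, lengths, avgLength: total / Math.max(lengths.size, 1) };
}

// Step 2: rank chunks for a query with BM25 (k1 = 1.2, b = 0.75).
function search(index, query, { k1 = 1.2, b = 0.75, limit = 5 } = {}) {
  const { postings, lengths, avgLength } = index;
  const scores = new Map();
  for (const term of new Set(tokenize(query))) {
    const entry = postings.get(term);
    if (!entry) continue;
    // Inverse document frequency: rarer terms weigh more.
    const idf = Math.log(1 + (lengths.size - entry.size + 0.5) / (entry.size + 0.5));
    for (const [id, tf] of entry) {
      // Length normalization: long chunks need more matches to score as high.
      const norm = tf + k1 * (1 - b + b * (lengths.get(id) / avgLength));
      scores.set(id, (scores.get(id) ?? 0) + (idf * tf * (k1 + 1)) / norm);
    }
  }
  return [...scores.entries()]
    .sort((x, y) => y[1] - x[1])
    .slice(0, limit)
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical chunks standing in for indexed functions and doc sections.
const chunks = [
  { id: 'auth.js#verifyJwt', text: 'function verifyJwt(token) { return jwt.verify(token, secret); }' },
  { id: 'auth.js#signJwt', text: 'function signJwt(payload) { return jwt.sign(payload, secret); }' },
  { id: 'README.md#limits', text: 'Requests over the rate limit return HTTP 429.' },
];

const index = buildIndex(chunks);
console.log(search(index, 'verify token'));
```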

Learn more

Search pipeline: docs/guides/search.md
Architecture diagrams: docs/guides/architecture.md
Setup & bootstrap: docs/guides/setup.md
Config schema: docs/config/schema.json
SQLite schema: docs/sqlite/index-schema.md
SQLite ANN extension: docs/sqlite/ann-extension.md
API server: docs/api/server.md
Triage records: docs/guides/triage-records.md
Structural search: docs/guides/structural-search.md

Status

Active development. See GIGAROADMAP_2.md for current execution status.

License

License not yet specified in this repo.
