MyKB is an on-prem, zero-trust knowledge engine: crawl → chunk → embed → store → retrieve → compress → answer, wrapped with a hardened auth plane and an audit-friendly API surface.
It runs fully offline or with pluggable cloud LLMs, and ships in two modes:
- Solo Mode (No Auth): Local, private, developer-first. Runs on `localhost` with no authentication. Perfect for OSS hackers and indie devs.
- Team Mode (Auth Enabled): JWT-based authentication with short-lived tokens and refresh support. Every query is policy-filtered before retrieval. Perfect for small teams that need zero-trust enforcement.
+-------------------------+ +--------------------------+
| Identity & Policy | JWT/SSO | Gateway (Planned) |
| (Auth Core) +---------->+ Policy Filters + Tools |
| - EdDSA JWT | | kb.search / kb.ingest |
| - Refresh tokens | +-----------+--------------+
| - Device fingerprint | |
| - Admin sessions (IP) | | policy-enforced queries
+-----------+-------------+ v
| +-----------------------------+
| | KB Core (RAG Engine) |
| | - Crawlers (site profiles) |
| | - Markdown AST chunker |
| | - Dense + sparse embeds |
| | - Qdrant hybrid index |
| | - Rerank + MMR + compress |
| | - Incremental ledger |
| +---------------+-------------+
| |
| +-----------v------------+
| | Vector DB (Qdrant) |
| | - named vectors |
| | - payload/metadata |
| +------------------------+
|
+-----> Observability & Admin:
- JSON logs, rate limits, health endpoints
- Future: Feedback → Patch Registry → Eval sets
- Auth Core = production-grade identity, tokens, sessions, rate limits, audits.
- KB Core = RAG pipeline (hybrid retrieval, reranking, context compression), ingestion, and incremental updates.
- Gateway (planned) = single entry point that enforces policy pre-retrieval (zero-trust) and exposes standardized tools.
- Crawling & Site Profiles: Headless crawler with per-domain configs (CSS scoping, link pruning).
- Markdown AST Chunking: Token-aware, structure-preserving chunker (headings, code, tables). Stable chunk IDs + section anchors.
- Embeddings: Dense (FastEmbed/ONNX) + BM25 sparse vectors for hybrid recall.
- Hybrid Retrieval: Dense + sparse, RRF fusion, optional cross-encoder reranker, MMR for diversity.
- Context Compressor: Semantic sentence extraction to keep answers concise but grounded.
- Incremental Ledger (SQLite): Detects chunk-level changes via content hashes → upsert only deltas.
- API (FastAPI): `/seed/preview`, `/ingest/urls`, `/search`, `/search/code_examples`, `/health`.
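
The RRF fusion step named in the Hybrid Retrieval bullet can be sketched in a few lines. This is a minimal illustration, not MyKB's actual code: the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the sample chunk IDs are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per chunk, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from the dense and sparse (BM25) retrievers.
dense_hits = ["c3", "c1", "c7"]
sparse_hits = ["c1", "c9", "c3"]
fused = rrf_fuse([dense_hits, sparse_hits])
```

Chunks that appear in both lists (`c1`, `c3`) rise above chunks found by only one retriever, which is the point of fusing dense and sparse recall.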
- Collections for docs + code examples.
- Named vectors for dense + sparse.
- Payload holds URL, domain, chunk_id, section path, etc.
- Bulk or incremental index modes; configurable shards.
- EdDSA (Ed25519) JWT for access.
- Refresh tokens are DB-backed and revocable.
- Device fingerprint binding.
- Admin plane: session + IP allowlist.
- Rate limiting: per-route; CSP & CORS hardened.
- Audit logs: structlog JSON.
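
As a rough illustration of revocable refresh tokens with rotation, the sketch below uses an in-memory dict as a stand-in for the database table. The class `RefreshStore` and its methods are hypothetical names, and storing only a SHA-256 digest of each token is an assumption about one reasonable design, not a description of Auth Core.

```python
import hashlib
import secrets


class RefreshStore:
    """In-memory stand-in for a DB-backed refresh token table (assumption)."""

    def __init__(self):
        self._active = {}  # sha256(token) -> user_id

    @staticmethod
    def _digest(token):
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id):
        """Create a new opaque token; only its digest is stored."""
        token = secrets.token_urlsafe(32)
        self._active[self._digest(token)] = user_id
        return token

    def rotate(self, token):
        """Consume a token once and hand back a fresh one for the same user."""
        user_id = self._active.pop(self._digest(token), None)
        if user_id is None:
            raise PermissionError("unknown, used, or revoked refresh token")
        return user_id, self.issue(user_id)

    def revoke(self, token):
        """Invalidate a token; revoking an unknown token is a no-op."""
        self._active.pop(self._digest(token), None)
```

Because `rotate` removes the old digest before issuing a new token, a replayed refresh token is rejected on its second use.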
- `chunk_id` (stable, deterministic)
- `url`, `domain`, `section_path`, `heading`
- `type` = `text|code|table|heading`
- `text` or content block
- embeddings: dense[], sparse{indices,values}
- metadata: neighbor windowing, compressed/original lengths
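
One way to get a stable, deterministic `chunk_id` is to hash the chunk's position rather than its content, so the ID survives edits to the text. The sketch below assumes that approach; the source does not say how IDs are derived, and `make_chunk_id`, the `ordinal` field, and the 16-character truncation are illustrative choices only.

```python
import hashlib
from dataclasses import dataclass, field


def make_chunk_id(url: str, section_path: str, ordinal: int) -> str:
    """Derive an ID from where the chunk lives, not from what it says."""
    key = f"{url}\n{section_path}\n{ordinal}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()[:16]


@dataclass
class Chunk:
    url: str
    domain: str
    section_path: str
    heading: str
    type: str  # one of: text, code, table, heading
    text: str
    ordinal: int = 0  # hypothetical: position of the chunk within its section
    chunk_id: str = field(init=False)

    def __post_init__(self):
        self.chunk_id = make_chunk_id(self.url, self.section_path, self.ordinal)


chunk = Chunk(
    url="https://example.com/docs/install",
    domain="example.com",
    section_path="Install > Linux",
    heading="Linux",
    type="text",
    text="Run the installer.",
)
```

Keeping the ID independent of the text lets a content hash signal "this chunk changed" while the ID still identifies which chunk it was.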
- `documents`: (url, doc_hash, chunk_ids, etag, last_modified)
- `chunk_hashes`: (url, chunk_id, chunk_hash)

This drives delta updates: only chunks that were added, changed, or removed are written to the index.
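
The delta computation can be sketched against a `chunk_hashes` table using only the standard library. The column names follow the ledger layout above; the function `plan_delta`, the SHA-256 choice, and the primary key are assumptions made for the example.

```python
import hashlib
import sqlite3


def chunk_hash(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_delta(conn, url, chunks):
    """Compare freshly chunked content (chunk_id -> text) with the ledger.

    Returns (added, changed, deleted) chunk IDs and updates the ledger to match.
    """
    rows = conn.execute(
        "SELECT chunk_id, chunk_hash FROM chunk_hashes WHERE url = ?", (url,)
    )
    old = dict(rows)
    new = {cid: chunk_hash(text) for cid, text in chunks.items()}

    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    changed = sorted(c for c in new.keys() & old.keys() if new[c] != old[c])

    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany(
            "DELETE FROM chunk_hashes WHERE url = ? AND chunk_id = ?",
            [(url, c) for c in deleted],
        )
        conn.executemany(
            "INSERT OR REPLACE INTO chunk_hashes (url, chunk_id, chunk_hash) "
            "VALUES (?, ?, ?)",
            [(url, c, new[c]) for c in added + changed],
        )
    return added, changed, deleted


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chunk_hashes ("
    "url TEXT, chunk_id TEXT, chunk_hash TEXT, PRIMARY KEY (url, chunk_id))"
)
doc = "https://example.com/doc"
first = plan_delta(conn, doc, {"a": "alpha", "b": "beta"})
second = plan_delta(conn, doc, {"a": "alpha", "b": "beta v2", "c": "gamma"})
third = plan_delta(conn, doc, {"b": "beta v2", "c": "gamma"})
```

In a real pipeline the three returned lists would decide which vectors to upsert and which to delete; unchanged chunks are never re-embedded.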
- Client → Gateway/Auth with JWT → derive policy filter.
- KB Core applies hybrid retrieval under filter constraints.
- Optional rerank (cross-encoder) + MMR for diversity.
- Neighbor expansion + semantic compression → concise context.
- Answering: optional LLM (local or cloud) uses retrieved context + citations.
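
The MMR step in the flow above trades relevance against redundancy. Below is a minimal pure-Python sketch; `mmr_select`, the `lambda_` weight, and the toy 2-D vectors are assumptions for illustration, and a real deployment would operate on the dense embeddings returned by retrieval.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, candidates, top_k=3, lambda_=0.7):
    """Pick chunk IDs by Maximal Marginal Relevance.

    candidates maps chunk_id -> vector. lambda_=1.0 means pure relevance;
    lower values penalise similarity to chunks that were already selected.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < top_k:

        def score(cid):
            relevance = cosine(query_vec, remaining[cid])
            redundancy = max(
                (cosine(remaining[cid], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_ * relevance - (1.0 - lambda_) * redundancy

        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected


query = [1.0, 0.0]
candidates = {"a": [0.9, 0.1], "b": [0.9, 0.12], "c": [0.7, -0.7]}
diverse = mmr_select(query, candidates, top_k=2, lambda_=0.5)
```

Here `b` is nearly a duplicate of `a`, so with `lambda_=0.5` the second slot goes to the less relevant but more distinct `c`.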
- Auth plane: asymmetric JWTs, refresh rotation, device binding, admin sessions with IP allowlist.
- Zero-trust enforcement (planned Gateway): every query filtered pre-retrieval.
- Auditability: structured logs, health checks, (planned) per-query decision trails.
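
Since the Gateway is still planned, the following is only a sketch of what pre-retrieval filtering could look like: verified JWT claims are mapped to a filter that the retriever must apply. The claim names (`tenant`, `roles`), the payload keys (`tenant_id`, `allowed_roles`), and the function `derive_policy_filter` are all hypothetical; the returned dict is merely shaped like a vector-store payload filter.

```python
def derive_policy_filter(claims: dict) -> dict:
    """Turn already-verified JWT claims into a retrieval filter.

    Fails closed: a token without a tenant or roles gets no access at all,
    rather than an unfiltered query.
    """
    tenant = claims.get("tenant")
    roles = claims.get("roles") or []
    if not tenant or not roles:
        raise PermissionError("token carries no tenant or roles; refusing to query")
    return {
        "must": [
            {"key": "tenant_id", "match": {"value": tenant}},
            {"key": "allowed_roles", "match": {"any": sorted(roles)}},
        ]
    }


policy = derive_policy_filter({"sub": "u1", "tenant": "acme", "roles": ["eng", "docs"]})
```

Deriving the filter before the search runs means restricted chunks never reach reranking, compression, or the LLM.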
- Small: single host (Auth + KB + Qdrant) with optional GPU.
- Medium: Auth isolated; KB + Qdrant on separate node.
- Large: multi-node Qdrant (shards/replicas), autoscaled KB workers.
All containerized via Docker; config managed with .env.
- Crawl/ingest with site profiles
- Markdown AST chunker
- Hybrid search (dense+BM25), RRF, MMR
- Optional reranker, semantic compression
- Incremental ledger & precise chunk deletes
- Clean REST API
- Auth Core reference (JWT, refresh, device bind, rate limits, CSP/CORS)
- Gateway with policy-aware pre-retrieval filters
- Multi-tenancy isolation (collections or payload encryption)
- Policy-aware chunking (per-section ACLs)
- Feedback → Patch Registry → Eval Sets (self-healing loop)
- Solution Packs (domain-specific tuning)
- Lineage graph across sources (code/docs/tickets)
- Compliance Co-Pilot (policy enforcement on code/docs)
- On-prem first: fully offline path.
- Performance: small-model embeddings, GPU optional.
- Reliability: idempotent ingestion, ledgered changes.
- Portability: Python, FastAPI, ONNX/FastEmbed, Qdrant, Docker.
Services:
- KB Core API (FastAPI)
- Auth Core API (FastAPI)
- Qdrant vector store
Data:
- Vector DB: embeddings + metadata
- Ledger (SQLite): ingestion state
- Auth DB (Postgres): users, tokens, devices
Ops:
- Health endpoints, JSON logs, rate limits, CSP/CORS, Docker.
- M1 Foundations: Gateway MVP (JWT → filter map → pre-retrieval); OSS "Team" RBAC.
- M2 Safety + Packs: Feedback UI → Patch Registry + Eval harness; release Solution Packs.
- M3 Enterprise: Policy-aware chunking; multi-tenant isolation; lineage prototype; SSO/SAML/OIDC.
- No repo URLs, tokens, or secrets.
- No IPs, keys, or deployment envs.
- No customer data models.
This README is an overview & architecture document for MyKB OSS. Code will follow; for now, this sets the stage for collaborators, testers, and enterprises to see what we're building.