
DeepFetch Architecture

DeepFetch is a local-first MCP server that turns public-web search into evidence-rich snippets for agent runtimes.

Design Goals

  • Work cleanly with local MCP clients such as Claude Desktop, Gemini CLI, and Codex CLI.
  • Require only user-supplied KAGI_API_KEY and SCRAPFLY_API_KEY.
  • Prefer deterministic retrieval and local semantic reranking over an extra paid LLM pass.
  • Keep the default deployment model simple enough for local Docker, while still supporting a managed HTTP transport.
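
The second goal implies a startup check that the two user-supplied keys are present. The helper below is a hypothetical sketch, not code from the repository; the environment variable names come from this document, everything else is illustrative.

```python
import os

# The two secrets the server requires, per the design goals above.
REQUIRED_KEYS = ("KAGI_API_KEY", "SCRAPFLY_API_KEY")

def missing_keys(env=os.environ):
    """Return the names of required API keys that are unset or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

A launcher could call `missing_keys()` once at startup and fail fast with a clear message instead of surfacing a cryptic upstream API error later.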

Runtime Topology

flowchart LR
    Client[Client]
    Client --> Transport[FastMCP transport]
    Transport --> Server[src/deepfetch/server.py]
    Server --> Search[internet_search]
    Server --> PDF[pdf_extract_text]

    Search --> Kagi[Kagi discovery]
    Search --> Fetch[Scrapfly extraction fan-out]
    Search --> Rank[ONNX snippet ranking]
    Search --> Keyword[Keyword fallback]
    Search --> PDF

    PDF --> URL[Public HTTPS validation and download]
    PDF --> Pages[pypdf page extraction]
    PDF --> PDFRank[Semantic or keyword page matching]

src/deepfetch/server.py is intentionally thin. It registers the two tools and hands transport startup to FastMCP. The real retrieval logic lives under src/deepfetch/search/.

Request Flow

internet_search

  1. Query Kagi for candidate URLs.
  2. Normalize hosts and keep the first candidate per host.
  3. Fetch candidate content in parallel with a bounded ThreadPoolExecutor.
  4. Prefer Scrapfly AI extraction when a supported extraction_model is supplied.
  5. Detect PDFs by URL or content type and route those candidates through pdf_extract_text.
  6. Rank snippets semantically with the shared ONNX embedder when assets are present.
  7. Fall back to keyword-centered snippets when semantic assets or semantic matches are unavailable.
  8. If the first pass does not yield enough unique hosts, issue a second Kagi query that excludes already-attempted hosts.
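
Step 2 above (normalize hosts, keep the first candidate per host) can be sketched as follows. This is an illustrative reimplementation, not the code in src/deepfetch/search/internet_search.py; the normalization rules (lowercasing, stripping a leading `www.`) are assumptions.

```python
from urllib.parse import urlparse

def dedupe_by_host(urls):
    """Keep only the first candidate URL per normalized host."""
    seen = set()
    kept = []
    for url in urls:
        # Normalize: lowercase the hostname and drop a leading "www."
        host = (urlparse(url).hostname or "").lower().removeprefix("www.")
        if host and host not in seen:
            seen.add(host)
            kept.append(url)
    return kept
```

Deduplicating before the fetch fan-out keeps the parallel extraction budget spent on distinct sources, which also makes the step-8 retry (excluding already-attempted hosts) cheap to compute.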

pdf_extract_text

  1. Accept exactly one source: url or pdf_base64.
  2. Validate public HTTPS URLs before downloading.
  3. Read the PDF with pypdf.
  4. Extract the requested page range.
  5. Run semantic page matching when the embedder is available, otherwise use keyword matching.
  6. Return page-numbered snippets plus search-mode metadata.
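
Step 2's public-HTTPS validation might look like the sketch below. It is an assumption-laden illustration, not the helper in src/deepfetch/search/http_utils.py: it only rejects literal non-global IP addresses, and DNS resolution checks are out of scope here.

```python
import ipaddress
from urllib.parse import urlparse

def is_public_https(url):
    """Accept only https URLs whose host is not a private/loopback IP literal."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        return False
    try:
        addr = ipaddress.ip_address(parsed.hostname)
    except ValueError:
        # A DNS name rather than an IP literal; resolution-time checks
        # would be needed to fully rule out private targets.
        return True
    return addr.is_global
```

Validating before download matters because the tool also accepts `pdf_base64`; the URL path is the only one that can be steered at fetch time.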

Module Layout

  • src/deepfetch/server.py: FastMCP server creation, tool registration, transport startup.
  • src/deepfetch/search/internet_search.py: Kagi discovery, Scrapfly extraction, host dedupe, reranking, PDF routing, and response shaping.
  • src/deepfetch/search/pdf_utils.py: PDF download/decoding, page extraction, semantic and keyword matching.
  • src/deepfetch/search/http_utils.py: Safe HTTP helpers and public-URL validation.
  • src/deepfetch/search/text_utils.py: Snippet anchoring and text slicing helpers.
  • src/deepfetch/search/embedder.py: ONNX embedder loading and vector generation.

Caching and Concurrency

  • Kagi responses are cached in-process for a short TTL.
  • Scrapfly text and AI extraction responses are cached in-process for a short TTL.
  • Search extraction uses bounded parallelism so one request does not fan out without limit.
  • No external cache or database sits on the core local stdio path.
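
A short-TTL in-process cache of the kind described above can be as small as this sketch (illustrative only; the real cache's keying and TTL values are not specified in this document):

```python
import time

class TTLCache:
    """Minimal in-process cache with per-entry expiry, keyed by request parameters."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Lazily evict on read; no background sweeper needed in-process.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
```

Because the cache lives in the server process, it disappears with the process, which is exactly the behavior the local stdio deployment wants: no external cache or database on the core path.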

Deployment Modes

Local-first

  • Transport: stdio
  • Packaging: Docker image
  • Secret model: the user injects KAGI_API_KEY and SCRAPFLY_API_KEY

This is the default because it matches how most local MCP clients launch servers today.
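
A local-first launch might look like the following. The image tag `deepfetch` is hypothetical (this document does not name a published image); the environment variable names come from the secret model above.

```shell
# Build the image locally, then run the server over stdio,
# forwarding the two required API keys from the host environment.
docker build -t deepfetch .
docker run -i --rm \
  -e KAGI_API_KEY \
  -e SCRAPFLY_API_KEY \
  deepfetch
```

The `-i` flag keeps stdin open, which the stdio transport requires when an MCP client launches the container as a subprocess.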

Managed

  • Transport: streamable-http
  • Packaging: long-running container service
  • Typical target: ECS Fargate or a similar always-on container platform

This path exists for remote-capable clients and hosted agent runtimes, but it is secondary to the local Docker workflow.

Non-Goals in the Current Codebase

  • No database-backed core retrieval path.
  • No multi-tenant credential brokering layer.
  • No Lambda-first deployment strategy.

The architectural rationale for those choices lives in ADR 0001.