DeepFetch is a local-first MCP server that turns public-web search into evidence-rich snippets for agent runtimes.
- Work cleanly with local MCP clients such as Claude Desktop, Gemini CLI, and Codex CLI.
- Require only user-supplied `KAGI_API_KEY` and `SCRAPFLY_API_KEY`.
- Prefer deterministic retrieval and local semantic reranking over an extra paid LLM pass.
- Keep the default deployment model simple enough for local Docker, while still supporting a managed HTTP transport.
```mermaid
flowchart LR
    Client[Client]
    Client --> Transport[FastMCP transport]
    Transport --> Server[src/deepfetch/server.py]
    Server --> Search[internet_search]
    Server --> PDF[pdf_extract_text]
    Search --> Kagi[Kagi discovery]
    Search --> Fetch[Scrapfly extraction fan-out]
    Search --> Rank[ONNX snippet ranking]
    Search --> Keyword[Keyword fallback]
    Search --> PDF
    PDF --> URL[Public HTTPS validation and download]
    PDF --> Pages[pypdf page extraction]
    PDF --> PDFRank[Semantic or keyword page matching]
```
`src/deepfetch/server.py` is intentionally thin. It registers the two tools and hands transport startup to FastMCP. The real retrieval logic lives under `src/deepfetch/search/`.
- Query Kagi for candidate URLs.
- Normalize hosts and keep the first candidate per host.
- Fetch candidate content in parallel with a bounded `ThreadPoolExecutor`.
- Prefer Scrapfly AI extraction when a supported `extraction_model` is supplied.
- Detect PDFs by URL or content type and route those candidates through `pdf_extract_text`.
- Rank snippets semantically with the shared ONNX embedder when assets are present.
- Fall back to keyword-centered snippets when semantic assets or semantic matches are unavailable.
- If the first pass does not yield enough unique hosts, issue a second Kagi query that excludes already-attempted hosts.
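The host-dedupe and bounded fan-out steps above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function names `dedupe_by_host` and `fetch_candidates`, the `www.` normalization rule, and the worker count are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse

def dedupe_by_host(candidate_urls):
    """Normalize hosts and keep only the first candidate URL per host."""
    seen_hosts = set()
    kept = []
    for url in candidate_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):  # assumed normalization rule
            host = host[4:]
        if host and host not in seen_hosts:
            seen_hosts.add(host)
            kept.append(url)
    return kept

def fetch_candidates(urls, fetch_one, max_workers=8):
    """Fan out extraction with bounded parallelism so one request
    cannot spawn an unlimited number of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_one, urls))
```

The surviving URLs from `dedupe_by_host` also make a natural exclusion list for the second Kagi pass, since their hosts have already been attempted.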
- Accept exactly one source: `url` or `pdf_base64`.
- Validate public HTTPS URLs before downloading.
- Read the PDF with `pypdf`.
- Extract the requested page range.
- Run semantic page matching when the embedder is available, otherwise use keyword matching.
- Return page-numbered snippets plus search-mode metadata.
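The keyword-matching fallback from the steps above might look roughly like this. It is a hedged sketch: the function name `keyword_match_pages`, the term-frequency scoring, and the `top_k` parameter are illustrative assumptions, not the real scoring logic.

```python
def keyword_match_pages(pages, query, top_k=3):
    """Score (page_number, text) pairs by query-term frequency.
    Stands in for the semantic path when the ONNX embedder is
    unavailable; real matching is presumably more nuanced."""
    terms = {t.lower() for t in query.split()}
    scored = []
    for page_number, text in pages:
        words = text.lower().split()
        score = sum(words.count(t) for t in terms)
        if score:
            scored.append((score, page_number, text))
    # Highest score first; ties broken by page order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(n, t) for _, n, t in scored[:top_k]]
```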
| Path | Responsibility |
|---|---|
| `src/deepfetch/server.py` | FastMCP server creation, tool registration, transport startup. |
| `src/deepfetch/search/internet_search.py` | Kagi discovery, Scrapfly extraction, host dedupe, reranking, PDF routing, and response shaping. |
| `src/deepfetch/search/pdf_utils.py` | PDF download/decoding, page extraction, semantic and keyword matching. |
| `src/deepfetch/search/http_utils.py` | Safe HTTP helpers and public-URL validation. |
| `src/deepfetch/search/text_utils.py` | Snippet anchoring and text slicing helpers. |
| `src/deepfetch/search/embedder.py` | ONNX embedder loading and vector generation. |
- Kagi responses are cached in-process for a short TTL.
- Scrapfly text and AI extraction responses are cached in-process for a short TTL.
- Search extraction uses bounded parallelism so one request does not fan out without limit.
- No external cache or database sits on the core local `stdio` path.
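A short-TTL in-process cache like the one described above can be sketched in a few lines. The class name `TTLCache` and the default TTL are assumptions for illustration; the real cache may differ in eviction policy and thread-safety guarantees.

```python
import time

class TTLCache:
    """Minimal in-process cache with a short TTL, mirroring how Kagi
    and Scrapfly responses could be memoized for the process lifetime."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)
```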
- Transport: `stdio`
- Packaging: Docker image
- Secret model: the user injects `KAGI_API_KEY` and `SCRAPFLY_API_KEY`
This is the default because it matches how most local MCP clients launch servers today.
- Transport: `streamable-http`
- Packaging: long-running container service
- Typical target: ECS Fargate or a similar always-on container platform
This path exists for remote-capable clients and hosted agent runtimes, but it is secondary to the local Docker workflow.
- No database-backed core retrieval path.
- No multi-tenant credential brokering layer.
- No Lambda-first deployment strategy.
The architectural rationale for those choices lives in ADR 0001.