DeepFetch Tool Examples

This page shows concrete MCP tool-call examples for the two tools DeepFetch exposes.

Each example below is the arguments payload you would pass in an MCP call_tool request.
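As a rough illustration of where an arguments payload sits, the sketch below wraps one in the JSON-RPC envelope that MCP uses for tool calls (method tools/call, with name and arguments under params). The helper name build_call_tool_request is hypothetical and not part of DeepFetch. Most MCP client libraries build this envelope for you.

```python
import json


def build_call_tool_request(request_id, tool_name, arguments):
    """Wrap a tool arguments payload in an MCP JSON-RPC tools/call message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# The arguments dict is the part shown in the examples on this page.
request = build_call_tool_request(
    1,
    "internet_search",
    {"query": "OpenAI API pricing March 2026", "extraction_model": "article"},
)
print(json.dumps(request, indent=2))
```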

Quick Test Without an LLM

Use examples/direct_mcp_client.py when you want to test the Dockerized MCP server directly.

Build the image:

docker build -t deepfetch:test .

Install the local client dependencies:

python -m pip install -e '.[dev]'

Export keys for internet_search:

export KAGI_API_KEY=your_kagi_key
export SCRAPFLY_API_KEY=your_scrapfly_key

List tools:

python examples/direct_mcp_client.py list-tools --image deepfetch:test

Run a search:

python examples/direct_mcp_client.py search \
  --image deepfetch:test \
  --query "Model Context Protocol official specification"

Run a PDF extraction:

python examples/direct_mcp_client.py pdf \
  --image deepfetch:test \
  --url "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf" \
  --query "dummy"

Run the full smoke sequence:

python examples/direct_mcp_client.py smoke --image deepfetch:test

`internet_search`

Use internet_search when you need current public-web information and want DeepFetch to discover, fetch, and rerank results for you.

Example 1: Current factual lookup

Use explicit time context when it matters.

{
  "query": "OpenAI API pricing March 2026",
  "extraction_model": "article"
}

Why this works:

It names the entity directly
It includes the specific subject
It resolves the time context instead of using a vague word like latest
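One way to resolve the time context programmatically is to append an explicit month and year to the query before calling the tool. This is a minimal sketch. The helper add_time_context is hypothetical and only formats a string.

```python
from datetime import date

# Spelled out explicitly so the result does not depend on the system locale.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def add_time_context(query: str, when: date) -> str:
    """Append an explicit month and year instead of a vague word like 'latest'."""
    return f"{query.strip()} {MONTHS[when.month - 1]} {when.year}"


args = {
    "query": add_time_context("OpenAI API pricing", date(2026, 3, 1)),
    "extraction_model": "article",
}
print(args["query"])
```

In practice you would likely pass date.today() rather than a fixed date.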

Example 2: Source-constrained search

Use a domain filter when the source itself matters.

{
  "query": "site:fda.gov semaglutide shortage status",
  "extraction_model": "article"
}

Why this works:

It narrows the search to an authoritative source
It uses domain language the source is likely to use
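If the source arrives as a full URL rather than a bare domain, a small helper can reduce it to a site: filter. The sketch below is illustrative only. site_query is a hypothetical name, and it assumes the search backend accepts the site: operator as shown in the example above.

```python
from urllib.parse import urlparse


def site_query(source: str, terms: str) -> str:
    """Build a 'site:<host> <terms>' query from a domain or a full URL."""
    if "://" in source:
        host = urlparse(source).hostname
    else:
        host = source.split("/")[0]
    if not host or " " in host:
        raise ValueError(f"cannot derive a host from {source!r}")
    return f"site:{host.lower()} {terms.strip()}"


print(site_query("fda.gov", "semaglutide shortage status"))
print(site_query("https://fda.gov/drugs", "semaglutide shortage status"))
```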

Example 3: PDF-oriented discovery

DeepFetch will automatically handle PDFs found during internet_search.

{
  "query": "cybersecurity report filetype:pdf after:2024",
  "extraction_model": "article"
}

Why this works:

filetype:pdf biases discovery toward documents
after:2024 helps reduce stale sources

Example 4: Product page extraction

Choose a non-default extraction model when the page type is obvious.

{
  "query": "Sony WH-1000XM6 specifications site:sony.com",
  "extraction_model": "product"
}

When to use a different extraction_model:

article for news, blog posts, docs, and general pages
product for product pages
stock for market pages
organization for company profile pages
event for event pages
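A client can check the extraction_model value before sending the call. The sketch below uses only the five values listed above. It assumes article is the default (suggested by the "non-default" wording in Example 4, not confirmed here), and the helper name search_args is hypothetical.

```python
# Values taken from the list above; the server may accept others.
EXTRACTION_MODELS = {"article", "product", "stock", "organization", "event"}


def search_args(query: str, extraction_model: str = "article") -> dict:
    """Build an internet_search arguments payload, rejecting unknown models."""
    if extraction_model not in EXTRACTION_MODELS:
        allowed = ", ".join(sorted(EXTRACTION_MODELS))
        raise ValueError(f"unknown extraction_model {extraction_model!r}; expected one of: {allowed}")
    return {"query": query, "extraction_model": extraction_model}


args = search_args("Sony WH-1000XM6 specifications site:sony.com", "product")
print(args)
```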

Representative response

[
  {
    "url": "https://modelcontextprotocol.io/specification/2025-11-25",
    "title": "Specification - Model Context Protocol",
    "target_status_code": 200,
    "snippet": "Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools.",
    "score": 0.7008,
    "source": "semantic_ai"
  }
]
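A caller will often want to drop weak or failed results before handing them to a model. The following sketch filters on the target_status_code and score fields shown above. The 0.5 threshold and the name usable_results are arbitrary illustrative choices, not DeepFetch defaults.

```python
def usable_results(results, min_score=0.5):
    """Keep results fetched with HTTP 200 and scoring at least min_score, best first."""
    kept = [
        r for r in results
        if r.get("target_status_code") == 200 and r.get("score", 0.0) >= min_score
    ]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


# Trimmed copy of the representative response above.
response = [
    {
        "url": "https://modelcontextprotocol.io/specification/2025-11-25",
        "title": "Specification - Model Context Protocol",
        "target_status_code": 200,
        "score": 0.7008,
        "source": "semantic_ai",
    }
]

for result in usable_results(response):
    print(f"{result['score']:.2f}  {result['title']}  {result['url']}")
```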

`pdf_extract_text`

Use pdf_extract_text when you already know the PDF you want to inspect.

Example 1: Exact keyword search in a PDF

Use keyword mode for exact terms, names, and numbers.

{
  "url": "https://example.com/financial-report.pdf",
  "query": "revenue guidance",
  "search_mode": "keyword",
  "max_matches": 5,
  "context_chars": 600
}

Example 2: Semantic concept search

Use semantic mode when you care about meaning more than exact phrasing.

{
  "url": "https://example.com/research-paper.pdf",
  "query": "retrieval-augmented generation",
  "search_mode": "semantic",
  "max_matches": 5,
  "context_chars": 800,
  "min_similarity": 0.25
}

Example 3: Scan only part of a large PDF

Use page limits when the relevant section is near the front or in a known range.

{
  "url": "https://example.com/10-k.pdf",
  "query": "risk factors",
  "search_mode": "auto",
  "start_page": 1,
  "max_pages": 25,
  "max_matches": 8
}
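For a very large PDF, the same two parameters can be used to scan the file in consecutive windows. The sketch below is built on assumptions: that start_page is 1-indexed (as the example suggests) and that max_pages counts pages starting at start_page. The helper page_windows is hypothetical.

```python
def page_windows(total_pages: int, window: int):
    """Yield start_page/max_pages pairs that cover pages 1..total_pages in order."""
    if total_pages < 1 or window < 1:
        raise ValueError("total_pages and window must be positive")
    start = 1
    while start <= total_pages:
        remaining = total_pages - start + 1
        yield {"start_page": start, "max_pages": min(window, remaining)}
        start += window


# Merge each window into an otherwise fixed arguments payload.
base = {"url": "https://example.com/10-k.pdf", "query": "risk factors", "search_mode": "auto"}
calls = [{**base, **w} for w in page_windows(60, 25)]
for call in calls:
    print(call["start_page"], call["max_pages"])
```

The total page count is available from the total_pages field of an earlier response, so a first call with a small max_pages can be used to size the loop.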

Example 4: Extract without a focused query

Omit query when you just want text extraction metadata for a subset of pages.

{
  "url": "https://example.com/brief.pdf",
  "start_page": 1,
  "max_pages": 3
}

Example 5: Base64 PDF input

Use pdf_base64 when the client already has the file bytes.

{
  "pdf_base64": "<base64-pdf-bytes>",
  "query": "incident response playbook",
  "search_mode": "auto"
}
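Producing the pdf_base64 value from raw bytes might look like the sketch below. It assumes the tool expects standard (not URL-safe) base64 with no data: prefix, which this page does not state. The helper name pdf_args_from_bytes is hypothetical.

```python
import base64


def pdf_args_from_bytes(pdf_bytes: bytes, query: str, search_mode: str = "auto") -> dict:
    """Build a pdf_extract_text arguments payload from in-memory PDF bytes."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {"pdf_base64": encoded, "query": query, "search_mode": search_mode}


# Placeholder bytes for illustration only; this is not a complete PDF.
sample = b"%PDF-1.4\n% placeholder\n"
args = pdf_args_from_bytes(sample, "incident response playbook")
print(args["pdf_base64"][:16], "...")
```

For a file on disk, pathlib.Path("report.pdf").read_bytes() would supply pdf_bytes.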

Representative response

{
  "source_type": "url",
  "source_url": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf",
  "final_url": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf",
  "content_type": "application/pdf; qs=0.001",
  "total_pages": 1,
  "pages_processed": 1,
  "search_mode_requested": "auto",
  "search_mode_used": "semantic",
  "matches": [
    {
      "page": 1,
      "query": "dummy",
      "snippet": "Dummy PDF file",
      "score": 0.3462
    }
  ],
  "notes": ""
}
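Reading this response in client code could look like the sketch below, which picks the highest-scoring match and reports whether the requested mode differed from the one actually used. The field names come from the response above. The helper names are hypothetical.

```python
def best_match(response: dict):
    """Return the highest-scoring match, or None when nothing matched."""
    matches = response.get("matches", [])
    return max(matches, key=lambda m: m["score"]) if matches else None


def mode_was_resolved(response: dict) -> bool:
    """True when the server used a different mode than the one requested (e.g. auto)."""
    return response["search_mode_requested"] != response["search_mode_used"]


# Trimmed copy of the representative response above.
response = {
    "total_pages": 1,
    "pages_processed": 1,
    "search_mode_requested": "auto",
    "search_mode_used": "semantic",
    "matches": [
        {"page": 1, "query": "dummy", "snippet": "Dummy PDF file", "score": 0.3462}
    ],
}

top = best_match(response)
if top is not None:
    print(f"page {top['page']}: {top['snippet']} (score {top['score']})")
print("mode resolved by server:", mode_was_resolved(response))
```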

Query Writing Tips

For internet_search, good queries are usually:

Specific
Short
Rich in proper nouns and source-like terminology
Explicit about time, location, product, or organization when those change the answer

Good:

OpenAI API pricing March 2026
site:fda.gov semaglutide shortage status
cybersecurity report filetype:pdf after:2024

Bad:

latest pricing current openai api all models now
drug shortage government website maybe semaglutide current
cybersecurity lots of reports pdf security recent
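A simple pre-flight check can catch some of the vague wording seen in the bad examples. The sketch below is only a heuristic. The word list is an assumption drawn from those examples, and vague_terms is a hypothetical helper rather than anything DeepFetch provides.

```python
# Filler words that appear in the bad examples above; extend as needed.
VAGUE_WORDS = {"latest", "current", "now", "recent", "maybe", "lots"}


def vague_terms(query: str) -> list:
    """Return the vague words found in a query, in the order they appear."""
    return [word for word in query.lower().split() if word in VAGUE_WORDS]


for q in (
    "OpenAI API pricing March 2026",
    "latest pricing current openai api all models now",
):
    flagged = vague_terms(q)
    print(q, "->", flagged if flagged else "ok")
```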
