DeepFetch Getting Started

This guide covers both end-user setup and repo-local verification.

Prerequisites

  • Docker for the default containerized MCP workflow.
  • Python 3.10 to 3.12 if you want to run the repo locally or use the direct MCP smoke client.
  • KAGI_API_KEY and SCRAPFLY_API_KEY for internet_search.
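Before continuing, a quick preflight check can confirm these prerequisites are in place. This is a sketch, not part of the repo; it only reports what is missing:

```shell
# Preflight sketch: report whether the prerequisites are present.
docker --version || echo "Docker not found"
python3 -c 'import sys; print("Python OK" if (3,10) <= sys.version_info[:2] <= (3,12) else "Unsupported Python")'
for key in KAGI_API_KEY SCRAPFLY_API_KEY; do
  if [ -z "$(printenv "$key")" ]; then echo "$key is not set"; fi
done
```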

Option 1: Run the Published Image

Once the image is published, the end-user path is:

docker run --rm -i \
  -e KAGI_API_KEY=your_kagi_key \
  -e SCRAPFLY_API_KEY=your_scrapfly_key \
  ghcr.io/vinay9986/deepfetch:latest

Use one of the sample client configs in ../examples/clients to wire the server into your MCP client.
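For clients that use the common mcpServers JSON shape (as Claude Desktop does), a config along these lines should work; the exact file contents live in ../examples/clients, so treat this as an illustrative sketch rather than the canonical config:

```json
{
  "mcpServers": {
    "deepfetch": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "KAGI_API_KEY",
        "-e", "SCRAPFLY_API_KEY",
        "ghcr.io/vinay9986/deepfetch:latest"
      ],
      "env": {
        "KAGI_API_KEY": "your_kagi_key",
        "SCRAPFLY_API_KEY": "your_scrapfly_key"
      }
    }
  }
}
```

Passing `-e KAGI_API_KEY` without a value forwards the variable from the client's environment into the container.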

Maintainer Publish Path

DeepFetch includes a GitHub Actions publisher in ../.github/workflows/publish-image.yml. It builds a multi-arch image and pushes:

  • ghcr.io/vinay9986/deepfetch:latest on the default branch
  • ghcr.io/vinay9986/deepfetch:v* on matching release tags
  • ghcr.io/vinay9986/deepfetch:sha-* for traceable CI images

By default the workflow authenticates with GITHUB_TOKEN and publishes to the vinay9986 personal GHCR namespace. If you later move the image to a separate publisher identity, you can override the workflow login with GHCR_TOKEN and optional GHCR_USERNAME.

If you want anonymous docker pull support for end users, change the GHCR package visibility to Public after the first successful publish.
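Putting the publish path together, a release that triggers the v* tag rule looks roughly like this (the tag name here is hypothetical):

```shell
# Sketch: cut a release tag matching the workflow's v* pattern.
git tag v0.1.0
git push origin v0.1.0

# Once the GHCR package is Public, end users can pull anonymously:
docker pull ghcr.io/vinay9986/deepfetch:latest
```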

Option 2: Build and Smoke-Test the Repo Locally

Build the Docker image:

docker build -t deepfetch:test .

Install repo dependencies so the direct client can talk MCP:

python -m pip install -e '.[dev]'

Export the required keys:

export KAGI_API_KEY=your_kagi_key
export SCRAPFLY_API_KEY=your_scrapfly_key

List tools:

python examples/direct_mcp_client.py list-tools --image deepfetch:test

Run a search:

python examples/direct_mcp_client.py search \
  --image deepfetch:test \
  --query "Model Context Protocol official specification" \
  --extraction-model article

Run a PDF lookup:

python examples/direct_mcp_client.py pdf \
  --image deepfetch:test \
  --url "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf" \
  --query "dummy"

Run the full smoke sequence:

python examples/direct_mcp_client.py smoke --image deepfetch:test

Run from Source

Create a virtualenv and install the project:

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e '.[dev]'

If you want local semantic ranking assets outside the Docker image, build them first:

python -m pip install '.[onnx-build]'
python onnx_assets/build.py

Run the server directly:

export KAGI_API_KEY=your_kagi_key
export SCRAPFLY_API_KEY=your_scrapfly_key
python -m deepfetch

Test Commands

Run the unit test suite:

pytest -v

Run the Dockerized MCP integration test:

DEEPFETCH_RUN_INTEGRATION=1 \
DEEPFETCH_TEST_IMAGE=deepfetch:test \
pytest -m integration -v tests/test_mcp_smoke.py