This repository provides a Model Context Protocol (MCP) server that retrieves the raw HTML content from a given URL to provide context to Large Language Models (LLMs). It acts as a tool for LLMs to fetch real-time web page content beyond their training data.
The URL Content MCP server serves as a bridge between LLMs and the web. Through a standardized MCP interface, an LLM can request the content of a specific URL and receive the HTML content of that page. This allows AI assistants to access up-to-date web content on demand.
- Fetch Web Page Content: Retrieve the HTML content of a web page given its URL.
- Real-Time Data: Access current information directly from web pages in real time.
- Optional Caching: Optionally cache fetched content in memory to avoid repeated network calls for the same URL during the server's runtime.
- STDIO and SSE Support: Run the server in `stdio` mode for integration as a subprocess, or in `sse` (HTTP Server-Sent Events) mode to serve requests over HTTP.
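The optional cache can be pictured as a small dictionary keyed by URL. A minimal illustrative sketch (not the server's actual code) of how per-run caching avoids repeated network calls:

```python
class CachingFetcher:
    """Illustrative sketch of per-run, in-memory caching (not the server's actual code)."""

    def __init__(self, fetch):
        self._fetch = fetch   # function that performs the real network call
        self._cache = {}      # url -> content; lives only as long as the process

    def get(self, url: str) -> str:
        if url not in self._cache:           # first request for this URL: hit the network
            self._cache[url] = self._fetch(url)
        return self._cache[url]              # later requests: served from memory
```

With `--enable-cache` unset, every request goes to the network; with it set, only the first request for each URL does.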
- Python 3.8+ – The server is written in Python and requires version 3.8 or higher.
- Internet Access – The server needs network access to fetch web pages from the internet.
Note: This server fetches raw HTML content. Ensure the target URL is accessible and returns text/HTML content. Some websites may block automated requests or require specific user-agent headers.
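If a site rejects the default Python client, one common workaround is to send a browser-like User-Agent header. A minimal sketch using only the standard library (the header value and function name here are illustrative, not part of this server):

```python
import urllib.request

def make_request(url: str,
                 user_agent: str = "Mozilla/5.0 (compatible; url-content-mcp)") -> urllib.request.Request:
    """Build a request with an explicit User-Agent header (value is only an example)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

if __name__ == "__main__":
    # Fetch raw HTML with the custom header; requires network access.
    with urllib.request.urlopen(make_request("http://example.com"), timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
        print(html[:80])
```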
Clone this repository and install the package along with its dependencies:
```bash
git clone https://github.com/artryazanov/url-content-mcp.git
cd url-content-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

This will install the necessary Python packages listed in requirements.txt. You can also install the package in editable mode (e.g., `pip install -e .`) if you plan to modify the code.
After installation, you can run the MCP server using the provided console script url-content-mcp or by executing the module. The server supports two modes of operation: STDIO (for direct integration with an MCP-compatible client) and SSE (for running as an HTTP server).
By default, the server runs in stdio mode. In this mode, the server reads MCP requests from standard input and writes responses to standard output. This mode is suitable for applications that manage the server as a subprocess and communicate via the MCP protocol (such as certain AI assistant platforms).
Example (running in stdio mode):
```bash
url-content-mcp
```

When running in stdio mode, the server will start and wait for incoming MCP requests via stdin (typically from an AI client). Each request (formatted according to the MCP protocol) will be processed, and the server will output the result to stdout as JSON.
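For illustration, a tool call in MCP's JSON-RPC framing might look like the following (the exact framing depends on the client and protocol version; treat this as a sketch, not a verified trace of this server):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "fetch_url",
    "arguments": { "url": "http://example.com" }
  }
}
```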
To run the server as an HTTP service, use the --transport sse option. In SSE mode, the server will start an HTTP server and provide a RESTful endpoint for fetching URL content.
Example (running in SSE mode on port 8080):
```bash
url-content-mcp --transport sse --host 0.0.0.0 --port 8080 --enable-cache
```

This starts the server in SSE mode, listening on all interfaces (0.0.0.0) at port 8080, with caching enabled. In this mode, you can send HTTP GET requests to the server's /fetch/{url} endpoint to retrieve content. Note: The {url} in the path should be URL-encoded.
For example, to fetch the content of http://example.com, encode the URL and request:
http://localhost:8080/fetch/http%3A%2F%2Fexample.com
This will return a JSON response containing the URL and the HTML content of the page. The response structure looks like:
```json
{
  "url": "http://example.com",
  "content": "<!DOCTYPE html>...</html>"
}
```

If an error occurs during fetching (for example, a network error or a non-200 HTTP status), the response will include an "error" field with a message, and "content" may be an empty string.
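A client for the SSE-mode endpoint only needs to percent-encode the target URL and check the response for an `error` field. A sketch using the standard library (the function names here are illustrative, not part of this project):

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

def build_fetch_url(base: str, target: str) -> str:
    """Percent-encode the target URL and append it to the /fetch endpoint."""
    return f"{base}/fetch/{quote(target, safe='')}"

def fetch_content(base: str, target: str) -> dict:
    """Call the SSE-mode server and return the parsed JSON response."""
    with urlopen(build_fetch_url(base, target)) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    if "error" in data:
        raise RuntimeError(f"fetch failed for {data['url']}: {data['error']}")
    return data

if __name__ == "__main__":
    # Requires a server running locally in SSE mode on port 8080.
    result = fetch_content("http://localhost:8080", "http://example.com")
    print(result["content"][:80])
```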
Note: When running in Docker or other container environments, use --host 0.0.0.0 to bind to all interfaces, and ensure the container's port is published (e.g., -p 8080:8080).
- `--transport`, `-t` (string): Transport protocol for the server. Either `stdio` (default) or `sse` (to run an HTTP server for SSE).
- `--host` (string): Host address to bind the HTTP server in SSE mode (default: `127.0.0.1`).
- `--port` (int): Port number for SSE mode (default: `8080`).
- `--enable-cache` (flag): Enable in-memory caching of fetched content. If set, the server caches the content of each URL after the first fetch for the duration of its runtime.
Run url-content-mcp --help to see the usage information.
This server provides one MCP tool that the LLM can use:
- Tool: `fetch_url`
- Description: Fetches the content of the web page at the given URL and returns its HTML content.
- Parameters:
  - `url` (string, required) – The web page URL to fetch.
- Returns: A JSON object with the following structure:
  - `url`: The URL that was fetched.
  - `content`: The HTML content of the page as a string. (This will contain the raw HTML, including tags.)
  - `error` (optional): An error message string, if an error occurred during fetching. This field is only present if there was an error; on success it is omitted.
The fetch_url tool is registered with the MCP server, so an LLM client can call this function to retrieve web page content. In stdio mode, the function is invoked via MCP tool calls in the protocol. In sse (HTTP) mode, the server exposes a GET endpoint /fetch/{url} (with the URL percent-encoded) that returns the same data.
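The tool's contract can be sketched as a plain function with the same input and output shape. This is an illustrative stand-in, not the server's actual implementation:

```python
import urllib.error
import urllib.request

def fetch_url(url: str) -> dict:
    """Return {"url", "content"} on success; add an "error" field only on failure."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return {"url": url,
                    "content": resp.read().decode("utf-8", errors="replace")}
    except (urllib.error.URLError, ValueError, TimeoutError) as exc:
        # Network failures and malformed URLs produce the error shape
        # described above: empty content plus an "error" message.
        return {"url": url, "content": "", "error": str(exc)}
```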
This project includes a test suite to ensure the server works correctly.
- Install test dependencies (includes pytest):

  ```bash
  pip install -r requirements-dev.txt
  ```

- Run all tests:

  ```bash
  pytest
  ```
A Dockerfile is provided to containerize the MCP server. To build the Docker image:
```bash
docker build -t url-content-mcp .
```

To run the server via Docker (exposing port 8080 for SSE mode):
```bash
docker run --rm -it -p 8080:8080 url-content-mcp --transport sse --host 0.0.0.0
```

This will start the MCP server inside a container. You can then interact with it via HTTP requests to http://localhost:8080 (in SSE mode) or attach it to an MCP-compatible client in stdio mode.
This project is licensed under the Unlicense. See the LICENSE file for details.