bselleslagh/PyPaperFlow

PyPaperFlow Service

A FastAPI-based service that converts URLs, HTML and Word content to PDF documents.

Features

  • Convert HTML content to PDF
  • Convert URLs to PDF
  • Convert Word documents to PDF
  • A4 format with customizable margins
  • Background graphics and colors support
  • Asynchronous processing
  • Base64 encoded output
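The rendering features above map naturally onto Playwright's PDF options. The following is a minimal sketch of how such a conversion could look, not the service's actual implementation: the helper names pdf_options and render_pdf and the default margin of "1cm" are assumptions made for illustration. Only page.pdf() and its format, print_background and margin parameters come from Playwright itself.

```python
import base64


def pdf_options(margin: str = "1cm") -> dict:
    """Build keyword arguments for Playwright's page.pdf().

    Hypothetical helper: the same margin is applied to all four sides.
    """
    return {
        "format": "A4",            # A4 paper size
        "print_background": True,  # keep background graphics and colors
        "margin": {
            "top": margin,
            "right": margin,
            "bottom": margin,
            "left": margin,
        },
    }


async def render_pdf(content: str, content_type: str = "html", margin: str = "1cm") -> str:
    """Render HTML or a URL to PDF and return it as a base64 string.

    Hypothetical helper. Playwright is imported lazily so that pdf_options()
    stays usable without a browser installed.
    """
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        try:
            page = await browser.new_page()
            if content_type == "url":
                await page.goto(content)         # load a web page
            else:
                await page.set_content(content)  # load a raw HTML string
            pdf_bytes = await page.pdf(**pdf_options(margin))
        finally:
            await browser.close()

    return base64.b64encode(pdf_bytes).decode("ascii")
```

With Playwright and its browsers installed, the coroutine can be run with asyncio.run(render_pdf("<h1>Hello World</h1>")).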

Installation

  1. Clone the repository:
git clone https://github.com/bselleslagh/PyPaperFlow.git
cd PyPaperFlow
  2. Install dependencies using uv:
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e .
  3. Install Playwright browsers:
playwright install
  4. Set up pre-commit hooks:
# Install the pre-commit hooks
uv run pre-commit install

# Run pre-commit on all files (optional)
uv run pre-commit run --all-files

Docker Installation

docker build -t pypaperflow .
docker run -p 8000:8000 pypaperflow

Usage

The service exposes three endpoints:

Root Endpoint

GET /

Returns a welcome message confirming the service is running.

Convert HTML/URL to PDF Endpoint

POST /convert-html

Request body:

{
    "content": "string",
    "type": "html" | "url"
}
  • content: HTML string or URL to convert
  • type: Either "html" for HTML content or "url" for web pages (default: "html")
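As an illustration of the request body described above, here is a small client-side sketch that builds the payload and rejects unsupported type values before anything is sent. The helper name build_convert_request and the empty-content check are assumptions for this example and are not part of the service.

```python
ALLOWED_TYPES = ("html", "url")


def build_convert_request(content: str, content_type: str = "html") -> dict:
    """Build the JSON body for POST /convert-html (hypothetical helper)."""
    if content_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {ALLOWED_TYPES}, got {content_type!r}")
    if not content:
        raise ValueError("content must not be empty")
    return {"content": content, "type": content_type}


# "type" falls back to "html", matching the documented default
html_request = build_convert_request("<h1>Hello World</h1>")

# For a web page, pass the URL as content and set type to "url"
url_request = build_convert_request("https://example.com", "url")
```

Either dictionary can be passed as the json argument of requests.post.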

Response:

{
    "pdf": "base64_encoded_string",
    "message": "PDF generated successfully"
}
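Because the PDF arrives as a base64 string, the client has to decode it before writing it to disk. The sketch below shows one way to do that. The helper name save_pdf is an assumption, and the "%PDF" header check is an extra sanity check that presumes the service returns a standard PDF file.

```python
import base64
from pathlib import Path


def save_pdf(response_body: dict, destination: str) -> int:
    """Decode the "pdf" field of a response and write it to destination.

    Hypothetical helper. Returns the number of bytes written.
    """
    # validate=True rejects characters outside the base64 alphabet
    pdf_bytes = base64.b64decode(response_body["pdf"], validate=True)

    # Standard PDF files start with the "%PDF" marker
    if not pdf_bytes.startswith(b"%PDF"):
        raise ValueError("decoded payload does not look like a PDF")

    return Path(destination).write_bytes(pdf_bytes)
```

Given a response from the endpoint, save_pdf(response.json(), "output.pdf") writes the file and returns its size in bytes.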

Convert Word to PDF Endpoint

POST /convert-word

Request body:

{
    "content": "string"  // base64 encoded Word document
}

Response:

{
    "pdf": "base64_encoded_string",
    "message": "Word document converted to PDF successfully"
}

Example

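import base64

import requests

BASE_URL = "http://localhost:8000"

# Convert HTML to PDF
payload = {
    "content": "<h1>Hello World</h1>",
    "type": "html",
}
response = requests.post(f"{BASE_URL}/convert-html", json=payload)
response.raise_for_status()  # Fail early on an HTTP error status
pdf_data = response.json()["pdf"]  # Base64-encoded PDF

# Decode the base64 string and save the PDF to disk
with open("hello.pdf", "wb") as pdf_file:
    pdf_file.write(base64.b64decode(pdf_data))

# Convert Word to PDF: the document is sent as a base64 string
with open("document.docx", "rb") as word_file:
    word_base64 = base64.b64encode(word_file.read()).decode("utf-8")

payload = {"content": word_base64}
response = requests.post(f"{BASE_URL}/convert-word", json=payload)
response.raise_for_status()
pdf_data = response.json()["pdf"]

with open("document.pdf", "wb") as pdf_file:
    pdf_file.write(base64.b64decode(pdf_data))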

Requirements

  • Python ≥ 3.13
  • FastAPI
  • Playwright
  • See pyproject.toml for full dependencies

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Ben Selleslagh (@BenSelleslagh)

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
