
arXiv Paper Outline and Summarization API

A TensorLake application that processes research papers (PDFs) using Google's Gemini AI to create structured outlines and detailed section expansions, storing the metadata and outline in Postgres tables.

Features

  • PDF ingestion: Fetches and processes PDFs from URLs
  • Outline generation: Extracts title, authors, abstract, keywords, and full section hierarchy
  • Section expansion: Produces detailed per-section summaries, key findings, methods, results, and notable references
  • Database storage: Saves all structured output into PostgreSQL (tested with Neon; works with Supabase or any Postgres)

Why TensorLake?

  • Write code like a monolith, get a distributed system for free
    Your app is just Python functions calling each other. TensorLake runs each function in its own container, scales them independently, and parallelizes requests without any orchestration code.
  • Automatic queueing, scaling, and backpressure
    You don’t need Celery, Kafka, Kubernetes, autoscalers, or job runners. The runtime queues requests, spins up more containers for bottleneck functions, and processes workloads at whatever concurrency the code can handle.
  • Durable, restartable execution
    If a long-running request crashes halfway (PDF too large, LLM timeout, network blip), it resumes from the last function boundary instead of restarting from scratch.

Architecture

The application consists of five main functions; a minimal sketch of how they chain together follows the list:

  1. create_outline(pdf_url): Downloads PDF and creates structured outline using Gemini
  2. expand_section(pdf_url, section_title, section_description): Expands a single section with detailed structured data
  3. expand_all_sections(outline): Orchestrates parallel expansion of all sections
  4. write_to_postgres(outline, expanded_sections): Stores all data in PostgreSQL
  5. process_paper(pdf_url): Main orchestration function that chains all steps
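
Below is a minimal sketch of that call chain in plain Python. The TensorLake decorators and the actual Gemini and Postgres calls are omitted (shown as placeholders), and the dictionary keys (sections, title, description, pdf_url) are illustrative assumptions rather than the app's actual schema.

# Sketch only: plain Python standing in for the TensorLake-decorated functions.

def create_outline(pdf_url: str) -> dict:
    # Download the PDF and ask Gemini for title, authors, abstract,
    # keywords, and the section hierarchy. (Placeholder.)
    ...

def expand_section(pdf_url: str, section_title: str, section_description: str) -> dict:
    # Ask Gemini for a detailed summary, key findings, methods, results,
    # and notable references for a single section. (Placeholder.)
    ...

def expand_all_sections(outline: dict) -> list[dict]:
    # Fan out one expand_section call per section; TensorLake can run
    # these in parallel containers. Assumes the outline carries the
    # source pdf_url.
    return [
        expand_section(outline["pdf_url"], s["title"], s["description"])
        for s in outline["sections"]
    ]

def write_to_postgres(outline: dict, expanded_sections: list[dict]) -> None:
    # Insert the outline and per-section rows into Postgres. (Placeholder.)
    ...

def process_paper(pdf_url: str) -> None:
    # Main entry point: chain the steps end to end.
    outline = create_outline(pdf_url)
    expanded = expand_all_sections(outline)
    write_to_postgres(outline, expanded)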

Setup

Prerequisites

  • TensorLake CLI installed (pip install tensorlake)
  • Gemini API key
  • PostgreSQL database

Configuration

  1. Authenticate with TensorLake:

    tensorlake login
    tensorlake whoami
  2. Set up secrets:

    # Gemini API key
    tensorlake secrets set GEMINI_API_KEY=your_gemini_api_key
    
    # PostgreSQL connection string
    tensorlake secrets set POSTGRES_CONNECTION_STRING="postgresql://user:password@host:port/database"

Deployment

Deploy the application to TensorLake:

tensorlake deploy paper_outline_app.py

Once deployed, the application is available as an HTTP API at:

https://api.tensorlake.ai/applications/process_paper

Usage

Via HTTP

curl https://api.tensorlake.ai/applications/process_paper \
-H "Authorization: Bearer $TENSORLAKE_API_KEY" \
--json '"https://www.arxiv.org/pdf/2510.18234"'
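
The same call from Python, as a minimal sketch (assumes the requests library; the endpoint, header, and JSON-string payload mirror the curl example above, and the shape of the returned JSON is an assumption):

import os
import requests

# Submit a paper for processing. The request body is the PDF URL as a bare
# JSON string, matching the curl --json example above.
resp = requests.post(
    "https://api.tensorlake.ai/applications/process_paper",
    headers={"Authorization": f"Bearer {os.environ['TENSORLAKE_API_KEY']}"},
    json="https://www.arxiv.org/pdf/2510.18234",
)
resp.raise_for_status()
# The response is assumed to include the request id used for status polling below.
print(resp.json())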

Status

The application doesn't return any data when the request finishes; it writes the processed data to the database. You can poll the request ID to check the status of the request.

curl https://api.tensorlake.ai/applications/process_paper/requests/h-0XJD_eE1JTH90ylW4f- \
-H "Authorization: Bearer $TENSORLAKE_API_KEY"
#{"id":"h-0XJD_eE1JTH90ylW4f-","outcome":"success", ... }

Output

The outputs from the application are written to Postgres. We used Neon for testing; you can choose any other Postgres database (for example, Supabase).

(Screenshot: structured output rows in the Postgres database.)
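
One quick way to inspect the results from Python, as a sketch (assumes psycopg2 is installed; the table and column names below are purely hypothetical, so substitute whatever the app actually creates):

import os
import psycopg2

# Connect with the same connection string the app uses.
conn = psycopg2.connect(os.environ["POSTGRES_CONNECTION_STRING"])
with conn, conn.cursor() as cur:
    # "paper_outlines" and its columns are hypothetical names for illustration.
    cur.execute("SELECT title, authors FROM paper_outlines ORDER BY created_at DESC LIMIT 5")
    for row in cur.fetchall():
        print(row)
conn.close()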

Dashboard

You can also observe the state of the request in TensorLake's dashboard UI.

(Screenshot: request state in the TensorLake dashboard.)

Local Development

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export GEMINI_API_KEY=your_key
export POSTGRES_CONNECTION_STRING=your_connection_string

# Test locally
python paper_outline_app.py
