End-to-end PDF intelligence app: upload and view PDFs, extract AI insights, and generate a podcast—all behind a single container on port 8080.
- Frontend: Next.js (App Router) + Tailwind
- Backend: Express (file upload, related content, Python orchestration)
- Reverse proxy: Nginx (serves everything on port 8080)
- Python: local models + PDF/search pipeline (`round_1b`)
Open http://localhost:8080 after running the container.
Add your screenshots to `docs/screenshots/` with these filenames to render below. Each image has a short caption, in the order listed.
- Landing Page
- PDF Viewer with highlighted sections and sub-section analysis (click a highlight to jump to its page)
- AI Insights page
- Podcast page
- Upload PDF page
- Related Findings for selected text (click a finding to jump to its page)
Build (linux/amd64):

```bash
docker build --platform linux/amd64 -t yourimageidentifier .
```

Run (env vars inline):

```bash
docker run --rm --name axon-docs \
  -e ADOBE_EMBED_API_KEY=<ADOBE_EMBED_API_KEY> \
  -e LLM_PROVIDER=gemini \
  -e GEMINI_MODEL=gemini-2.5-flash \
  -e GEMINI_API_KEY=<GEMINI_API_KEY> \
  -e TTS_PROVIDER=azure \
  -e AZURE_TTS_KEY=<AZURE_TTS_KEY> \
  -e AZURE_TTS_ENDPOINT=<AZURE_TTS_ENDPOINT> \
  -p 8080:8080 \
  yourimageidentifier
```

Then open http://localhost:8080.
- Upload PDFs: Drag-and-drop or pick files; they are stored under `frontend/public/pdfs` for serving.
- PDF Viewer: In-browser view with highlighted sections and sub-section analysis.
- AI Insights: Gemini-based structured insights across single or multiple PDFs.
- Podcast: Gemini-created two-speaker script synthesized via Azure TTS to MP3.
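The podcast step can be sketched as a pure SSML builder plus a POST to the Azure Speech REST endpoint. This is a minimal sketch, not the app's actual code: `buildPodcastSsml`, `synthesizeMp3`, and the voice names are illustrative, while the header names and MP3 output format are the standard ones for the Azure Speech REST API.

```typescript
// Minimal sketch: turn a two-speaker script into SSML, then synthesize it
// via the Azure Speech REST API. Function and voice names are illustrative.
type ScriptLine = { speaker: "host" | "guest"; text: string };

// Hypothetical voice mapping; any Azure neural voices would work.
const VOICES: Record<ScriptLine["speaker"], string> = {
  host: "en-US-JennyNeural",
  guest: "en-US-GuyNeural",
};

function buildPodcastSsml(script: ScriptLine[]): string {
  const body = script
    .map((l) => `<voice name="${VOICES[l.speaker]}">${l.text}</voice>`)
    .join("\n");
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" ` +
    `xml:lang="en-US">\n${body}\n</speak>`
  );
}

// Standard Azure Speech REST call; endpoint and key come from the
// AZURE_TTS_* env vars described below.
async function synthesizeMp3(
  ssml: string,
  endpoint: string,
  key: string
): Promise<ArrayBuffer> {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Ocp-Apim-Subscription-Key": key,
      "Content-Type": "application/ssml+xml",
      "X-Microsoft-OutputFormat": "audio-16khz-128kbitrate-mono-mp3",
    },
    body: ssml,
  });
  if (!res.ok) throw new Error(`TTS request failed: ${res.status}`);
  return res.arrayBuffer();
}
```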
- Nginx on 8080
  - Proxies Next.js on 3000 and Express on 5001
  - Serves static assets and `/runtime-env.js` (browser runtime configuration)
- Next.js (App Router)
  - Pages under `frontend/app`
  - API routes for AI features:
    - `POST /api/generate-insights`
    - `POST /api/generate-overview-insights`
    - `POST /api/generate-podcast`
    - `POST /api/generate-overview-podcast`
- Express backend (5001)
  - Uploads, related content, Python process orchestration
- Python pipeline (`round_1b/`)
  - Local models + PDF/text embedding, ranking, and inference
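The Next.js AI routes listed above share a common shape: build a prompt from extracted PDF text, call Gemini, and return the first candidate's text. A minimal sketch under those assumptions (the prompt wording and helper names are illustrative; the URL is the public Generative Language REST API):

```typescript
// Illustrative prompt builder for an insights route.
function buildInsightsPrompt(sections: string[]): string {
  return [
    "You are an analyst. Return structured insights (key points,",
    "contradictions, open questions) for the following PDF sections:",
    ...sections.map((s, i) => `--- Section ${i + 1} ---\n${s}`),
  ].join("\n");
}

// Calls the Gemini REST API; model and key come from the env vars below.
async function generateInsights(
  sections: string[],
  apiKey: string,
  model = "gemini-2.5-flash"
): Promise<string> {
  const url =
    `https://generativelanguage.googleapis.com/v1beta/models/` +
    `${model}:generateContent?key=${apiKey}`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      contents: [{ parts: [{ text: buildInsightsPrompt(sections) }] }],
    }),
  });
  if (!res.ok) throw new Error(`Gemini request failed: ${res.status}`);
  const data = await res.json();
  return data.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```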
Required for core features:
- `ADOBE_EMBED_API_KEY`: Adobe PDF Embed (client-side)
- `LLM_PROVIDER`: must be `gemini` for the current implementation
- `GEMINI_MODEL`: e.g., `gemini-2.5-flash`
- `GEMINI_API_KEY`: Google Generative AI key
- `TTS_PROVIDER`: must be `azure`
- `AZURE_TTS_KEY`: Azure Cognitive Services Speech key
- `AZURE_TTS_ENDPOINT`: e.g., `https://eastus.tts.speech.microsoft.com/cognitiveservices/v1`
  - Alternatively, set `AZURE_TTS_REGION` (e.g., `eastus`) and omit the endpoint
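The endpoint/region alternative can be resolved with a small helper at startup. A sketch (the function name is illustrative; the regional hostname pattern is the standard Azure Speech one):

```typescript
// Resolve the Azure TTS endpoint from either AZURE_TTS_ENDPOINT or
// AZURE_TTS_REGION, failing fast if neither is configured.
function resolveAzureTtsEndpoint(
  env: Record<string, string | undefined>
): string {
  if (env.AZURE_TTS_ENDPOINT) return env.AZURE_TTS_ENDPOINT;
  if (env.AZURE_TTS_REGION) {
    return `https://${env.AZURE_TTS_REGION}.tts.speech.microsoft.com/cognitiveservices/v1`;
  }
  throw new Error("Set AZURE_TTS_ENDPOINT or AZURE_TTS_REGION");
}
```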
Runtime config is exposed to the browser via `/runtime-env.js` (generated at container start). Following the Adobe-style pattern, `window.__ENV` includes keys like `ADOBE_EMBED_API_KEY`, `GEMINI_API_KEY`, and others.
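Generating `/runtime-env.js` at container start can be sketched as serializing a whitelist of env vars into a `window.__ENV` assignment. A minimal sketch (the helper name and whitelist contents are illustrative, not the app's actual script):

```typescript
// Build the contents of /runtime-env.js from a whitelist of env vars,
// so the browser only ever sees keys we explicitly expose.
const EXPOSED_KEYS = ["ADOBE_EMBED_API_KEY", "GEMINI_API_KEY", "LLM_PROVIDER"];

function buildRuntimeEnvJs(env: Record<string, string | undefined>): string {
  const picked: Record<string, string> = {};
  for (const key of EXPOSED_KEYS) {
    const value = env[key];
    if (value !== undefined) picked[key] = value;
  }
  return `window.__ENV = ${JSON.stringify(picked, null, 2)};\n`;
}
```

The whitelist is the important design choice: dumping all of `process.env` into the page would leak server-side secrets to every visitor.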
```
backend/                # Express server (5001)
frontend/               # Next.js app (3000)
  app/
    api/                # Next.js route handlers (AI endpoints)
    pdfviewer/, upload/ # Main pages
  components/           # UI components (Header, Sidebar, PDFViewer, etc.)
  public/pdfs/          # PDFs served to the browser
nginx/                  # Nginx config (reverse proxy)
round_1b/               # Python models, requirements, and scripts
Dockerfile              # Multi-stage build and runtime setup
```
- Browser sends API calls to `/api/*` on port 8080.
- Nginx routes `/api/generate-*` to Next.js; all other `/api/*` routes go to Express.
- Next.js APIs call Gemini to generate insights/scripts; podcast endpoints call Azure TTS.
- Express endpoints handle uploads/related content and optionally call Python.
- Static PDFs are read from `frontend/public/pdfs` when needed.
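The routing rule above can be expressed as a small predicate, which is handy for tests or for mirroring the proxy behavior in local dev. This is a sketch of the described behavior, not the actual Nginx config:

```typescript
// Mirror of the reverse-proxy rule: /api/generate-* goes to Next.js (3000),
// all other /api/* paths go to Express (5001), everything else (pages,
// static assets) is served by Next.js.
type Upstream = "next:3000" | "express:5001";

function routeFor(path: string): Upstream {
  if (path.startsWith("/api/")) {
    return path.startsWith("/api/generate-") ? "next:3000" : "express:5001";
  }
  return "next:3000";
}
```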