ia2audio — Internet Archive to Audiobook

Internet Archive → OCR (OpenAI Vision) → Markdown → TXT/EPUB → Edge TTS → M4B.

Features:

Automates IA BookReader capture with Playwright (login/borrow flow, 2-up theater view).
OCR via OpenAI Vision (gpt-4o) with concurrency/backoff to content.json.
Exports clean Markdown and unwrapped TXT (no mid-sentence pauses in TTS).
Synthesizes audiobook with Microsoft Edge TTS (via aedocw/epub2tts-edge), cover embed.
Resume-safe: SKIP_IA=1 to reuse captured pages/OCR and just regenerate text/audio.

Important: Respect the Internet Archive’s Terms of Service. Borrow books legitimately and do not bypass protections.

Prerequisites

Node.js >= 18 and pnpm
Python 3.12+ with venv
ffmpeg, pandoc, jq
Playwright Chromium (pnpm playwright:install)
OpenAI API key (for OCR)

macOS (Homebrew):

brew install ffmpeg pandoc jq python

Linux (apt-based):

sudo apt-get update && sudo apt-get install -y ffmpeg pandoc jq python3-venv

Quickstart

Install dependencies:

pnpm install
pnpm playwright:install

Copy .env.example to .env and set variables:

IA_EMAIL, IA_PASSWORD (IA account credentials)
OPENAI_API_KEY
IA_ID or IA_URL

Run the full pipeline (capture → OCR → export → TTS):

IA_ID=<internet_archive_id> bin/ia2audio.sh

To resume from existing outputs and only rebuild TXT/EPUB/audio:

SKIP_IA=1 IA_ID=<internet_archive_id> bin/ia2audio.sh

Outputs are written under out/<IA_ID>/, including book.md, book.txt, book.epub, and audio/<IA_ID>.m4b (or .m4a).

Environment Variables

IA: IA_ID, IA_URL, IA_EMAIL, IA_PASSWORD
Capture: IA_START_PAGE (default 1), IA_RESUME (0/1), IA_MAX_PAGES, IA_PAGE_DELAY_MS, IA_TILE_STABLE_MS, IA_SPINNER_TIMEOUT_MS, IA_MAX_RETRIES
OCR: OPENAI_API_KEY, OCR_CONCURRENCY, OCR_MAX_RETRIES
TTS: VOICE (default en-US-AndrewNeural), EDGE_TTS_CONCURRENCY, EDGE_TTS_RETRIES, EDGE_TTS_BACKOFF
Orchestration: SKIP_IA=1 to reuse captured pages/OCR

See .env.example for a starter template.

How it Works

Capture (src/extract-ia-book.ts)
- Logs in and tries multiple ‘Borrow’ variants.
- Forces 2-up + theater view to capture full spreads (inner element, no chrome).
- Waits for spinners to vanish, images to load, and a tile-stability window.
- Single-spread advance with verification to avoid double-turns.
- Change detection (tile signature + SHA1) prevents duplicate screenshots.
OCR (src/transcribe-book-content.ts)
- Sends each page image to OpenAI Vision (gpt-4o) with concurrency/backoff.
- Creates content.json sorted by capture index then page.
Export (src/export-book-markdown.ts)
- Builds metadata.json if missing; exports clean Markdown.
- Folds single newlines, preserves paragraph breaks across spreads.
TTS (bin/ia2audio.sh)
- Converts Markdown → TXT with --wrap=none (prevents mid-sentence pauses).
- Runs Edge TTS via epub2tts-edge in a local virtualenv.
- Embeds a cover image derived from the first page.

Install Edge TTS locally

The wrapper will bootstrap epub2tts-edge automatically:

scripts/install_edge_tts.sh

It clones https://github.com/aedocw/epub2tts-edge and installs it into .venv.

Troubleshooting

Stuck spinner or missing BookReader: increase IA_SPINNER_TIMEOUT_MS; ensure view=theater&mode=2up.
Double page turns: the capture loop verifies a single advance and retries.
Mid-sentence pauses: ensured TXT is unwrapped (pandoc --wrap=none).
Resume at wrong page: set IA_RESUME=0 and IA_START_PAGE=1 (default) to override IA resume.

License

Apache-2.0. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.github/workflows		.github/workflows
bin		bin
docs		docs
scripts		scripts
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ia2audio — Internet Archive to Audiobook

Prerequisites

Quickstart

Environment Variables

How it Works

Install Edge TTS locally

Troubleshooting

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ia2audio — Internet Archive to Audiobook

Prerequisites

Quickstart

Environment Variables

How it Works

Install Edge TTS locally

Troubleshooting

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages