RevEngine3r/http-range-reader

HTTP Range Reader

Minimal, production-ready HTTP byte-range reader that behaves like a read-only file object. It provides 2-chunk LRU caching, background prefetch, and random access into large remote files (for example ZIP archives, tarballs, Parquet splits, ISO images) without downloading the whole object.

Python 3.9+. Transport: requests (HTTP/1.1).

Features

  • Single-file; requests is the only runtime dependency
  • 2-chunk LRU (current + previous) to reduce re-fetches on back-seeks
  • Background prefetch of the next chunk for smooth sequential reads
  • If-Range with ETag/Last-Modified to prevent mixing chunks after remote updates
  • Graceful fallback when servers ignore Range (200 OK)
  • Works anywhere a file-like object works (zipfile, tarfile, PIL.Image.open, etc.)
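
The Range, If-Range, and fallback behaviour listed above can be sketched with two small helpers. This is a minimal illustration under stated assumptions, not the library's actual code: the names build_range_headers, extract_chunk, and RemoteChangedError are hypothetical. It relies only on standard HTTP semantics: with If-Range, a server answers 206 when the validator still matches and 200 with the full current body otherwise.

```python
from typing import Dict, Optional


class RemoteChangedError(IOError):
    """Hypothetical error: the remote object changed between chunk requests."""


def build_range_headers(start: int, length: int,
                        validator: Optional[str] = None) -> Dict[str, str]:
    """Headers for one chunk request; `validator` is an ETag or Last-Modified value."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # The end offset of a byte range is inclusive.
    headers = {"Range": f"bytes={start}-{start + length - 1}"}
    if validator is not None:
        # Ask the server to honour the range only if the object is unchanged.
        headers["If-Range"] = validator
    return headers


def extract_chunk(status: int, body: bytes, start: int, length: int,
                  expected: Optional[str] = None,
                  received: Optional[str] = None) -> bytes:
    """Return the requested bytes from a 206 (partial) or 200 (full body) response."""
    if expected is not None and received is not None and expected != received:
        # Mixing bytes from two versions of the object would corrupt reads.
        raise RemoteChangedError("validator changed between requests")
    if status == 206:
        return body[:length]
    if status == 200:
        # The server ignored Range and sent everything: slice locally.
        return body[start:start + length]
    raise IOError(f"unexpected HTTP status {status}")
```

In this sketch a 200 response is still usable, at the cost of transferring the whole body once; a real reader would likely keep that body rather than request it again.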

Install

pip install http-range-reader

(Or copy src/http_range_reader/reader.py into your project and use it directly.)

Quickstart

from zipfile import ZipFile
from http_range_reader import HTTPRangeReader

url = "https://github.com/psf/requests/archive/refs/heads/main.zip"
rdr = HTTPRangeReader(url, chunk_size=1024*1024, prefetch=True)

with rdr:
    with ZipFile(rdr) as zf:
        print(len(rdr), "bytes over HTTP")
        print("first 5 entries:")
        for info in zf.infolist()[:5]:
            print("-", info.filename, info.file_size)
        data = zf.read(zf.infolist()[0].filename)
        print("read", len(data), "bytes from first member")

When to use this

  • You need random access into large objects over HTTP
  • You want to avoid full downloads and keep RAM small
  • You can rely on standard HTTP servers/CDNs that support Range requests
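
One way to check the last point is to send a one-byte probe (Range: bytes=0-0) and inspect the reply. The helper below is a hedged sketch of how such a reply could be interpreted; total_size_from_probe is a hypothetical name, not part of this package. It takes the status code and headers as plain values so it can be tried without a network connection.

```python
import re
from typing import Mapping, Optional

# Content-Range for a satisfied byte range looks like "bytes 0-0/12345".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def total_size_from_probe(status: int, headers: Mapping[str, str]) -> Optional[int]:
    """Return the object's total size if the probe reply proves Range support, else None.

    Assumes `headers` uses the canonical "Content-Range" spelling; a
    requests response's `headers` mapping is case-insensitive, a plain dict is not.
    """
    if status != 206:
        # 200 means the server ignored the Range header.
        return None
    match = _CONTENT_RANGE.match(headers.get("Content-Range", "").strip())
    if match is None or match.group(3) == "*":
        # Malformed header, or the server did not disclose the total length.
        return None
    return int(match.group(3))
```

A None result does not make the URL unusable, but every read would then fall back to a full download.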

FAQ

Does it cache the whole file? No. It caches at most two chunks at a time.
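
To make the two-chunk policy concrete, here is a minimal sketch of a least-recently-used cache keyed by chunk index. It is an illustration only: TwoChunkCache and the fetch callable are hypothetical names, and the real reader may organise its cache differently.

```python
from collections import OrderedDict
from typing import Callable


class TwoChunkCache:
    """Keep at most `capacity` chunks, evicting the least recently used one."""

    def __init__(self, fetch: Callable[[int], bytes], capacity: int = 2):
        self._fetch = fetch          # called with a chunk index on a cache miss
        self._capacity = capacity
        self._chunks: "OrderedDict[int, bytes]" = OrderedDict()

    def get(self, index: int) -> bytes:
        if index in self._chunks:
            # Hit: mark this chunk as the most recently used.
            self._chunks.move_to_end(index)
            return self._chunks[index]
        data = self._fetch(index)
        self._chunks[index] = data
        if len(self._chunks) > self._capacity:
            # Drop the oldest entry so memory stays bounded.
            self._chunks.popitem(last=False)
        return data


def chunk_index(offset: int, chunk_size: int) -> int:
    """Map an absolute byte offset to the index of the chunk that holds it."""
    return offset // chunk_size
```

With a capacity of two, a short back-seek into the previous chunk is served from memory, while a jump to a third chunk evicts whichever of the two was used least recently.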

HTTP/2 or HTTP/3? The default transport is requests (HTTP/1.1). You can swap in your own transport if needed.

Thread safety? Intended for single-reader usage. The internal executor is only for prefetching.
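
The single-reader model with a background prefetch can be pictured roughly as follows. This is a sketch under assumptions, not the package's implementation: PrefetchingFetcher is a hypothetical class, and fetch stands for any function that returns the bytes of a given chunk index. Only one consumer calls get(); the one-worker executor exists solely to load the next chunk early.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Optional, Tuple


class PrefetchingFetcher:
    """Single-consumer chunk fetcher that loads chunk i+1 in the background."""

    def __init__(self, fetch: Callable[[int], bytes], num_chunks: int):
        self._fetch = fetch
        self._num_chunks = num_chunks
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending: Optional[Tuple[int, Future]] = None

    def get(self, index: int) -> bytes:
        pending, self._pending = self._pending, None
        if pending is not None and pending[0] == index:
            # Sequential read: wait for the prefetch (re-raises fetch errors).
            data = pending[1].result()
        else:
            if pending is not None:
                # Best effort only: a fetch that is already running cannot be cancelled.
                pending[1].cancel()
            data = self._fetch(index)
        nxt = index + 1
        if nxt < self._num_chunks:
            # Start loading the next chunk while the caller consumes this one.
            self._pending = (nxt, self._pool.submit(self._fetch, nxt))
        return data

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

Because get() mutates shared state without a lock, two threads calling it at once could race; sharing one reader across threads would need external locking or one reader per thread.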

CLI demo

python -m examples.http_zip_demo --url https://github.com/psf/requests/archive/refs/heads/main.zip --list 10

Roadmap

  • Optional httpx transport (HTTP/2)
  • Adaptive prefetch sizing
  • Multi-range coalescing (multipart/byteranges) when beneficial

License

MIT
