Chunked download in "Monitoring download progress" corrupts large (~100MB+) files #1979
Replies: 2 comments 5 replies
-
@mjuopperi Hi, are you able to reproduce this with other endpoints? Could it be that … I was not able to reproduce it with a local setup, as follows:
import httpx
from tqdm import tqdm

with open("download.bin", "wb") as download_file:
    url = "http://localhost:8000/100MB.bin"
    with httpx.stream("GET", url) as response:
        total = int(response.headers["Content-Length"])

        with tqdm(total=total, unit_scale=True, unit_divisor=1024, unit="B") as progress:
            num_bytes_downloaded = response.num_bytes_downloaded
            for chunk in response.iter_bytes():
                download_file.write(chunk)
                progress.update(response.num_bytes_downloaded - num_bytes_downloaded)
                num_bytes_downloaded = response.num_bytes_downloaded

    print(download_file.name)
$ md5 download.bin
36afeebff1091bae0c95e103789effd6
$ md5 100MB.bin.1
36afeebff1091bae0c95e103789effd6
$ ls -la
-rw-r--r-- 1 florimond staff 104857600 25 Dec 19:54 100MB.bin.1
-rw-r--r-- 1 florimond staff 104857600 25 Dec 19:54 download.bin
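For anyone who wants to repeat the md5 and size comparison above without shell tools, here is a minimal Python sketch using only the standard library. The helper names `file_md5` and `same_content` are hypothetical, made up for this illustration; they are not part of httpx.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Compare two files by size first (cheap), then by MD5 digest."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return file_md5(path_a) == file_md5(path_b)
```

With the files from the listing above, `same_content("download.bin", "100MB.bin.1")` would be expected to return `True`, since both sizes and both digests match.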
-
@florimondmanca I started investigating this because downloading this file started failing: https://www.fuzzwork.co.uk/dump/postgres-schema-latest.dmp.bz2 It's a Postgres dump, and I've verified that it works if I download it with wget and uncompress with … When trying to uncompress this after downloading with …
-
Using the example from here:
https://www.python-httpx.org/advanced/#monitoring-download-progress
If I download the file with wget, the md5 is 2f282b84e7e608d5852449ed940bfc51; with this code it is cf3e683301e09fb372dd5e3191a2dede. File sizes also differ: 230784 for httpx and 230752 for wget.
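Since the reported sizes differ by only 32 bytes, it may help to find the first byte offset at which the two downloads diverge. The following is a rough sketch, not a diagnosis: `first_difference` is a hypothetical helper written for this thread, and the file names in the usage note are placeholders.

```python
def first_difference(path_a, path_b, chunk_size=64 * 1024):
    """Return the byte offset of the first difference, or None if identical.

    If one file is a strict prefix of the other, the returned offset is the
    length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Scan the mismatching chunk pair byte by byte.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the shared range: one chunk is shorter.
                return offset + min(len(a), len(b))
            if not a:
                # Both files reached EOF at the same point with equal content.
                return None
            offset += len(a)
```

For example, `first_difference("wget.bin", "httpx.bin")` (placeholder names) returns an offset that can then be inspected with a hex viewer to see whether the extra bytes sit at the end of the file or somewhere in the middle.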