Download of big files times out and doesn't resume #169
Comments
Those are all very good suggestions. Would you be up for creating a PR that implements the most general ones?
Yes, I can do that.
Great! Thank you 🙏🏽
Any update on this, or thoughts on an alternative solution in the meantime? Downloading from the ENA FTP servers currently appears to be slow at our institute for some reason, so the curl command is failing on the runtime limit. Thanks!
I have already implemented a patch and I'm testing it locally, but the tests take longer than usual because ENA/SRA is slow at the moment. I will submit a pull request with the changes I have so far, so you can test it in parallel and benefit from the enhancements right away.
Here is the PR: #229. Feel free to test and give feedback. Thanks. Given the slowness of ENA/SRA at the moment, you probably want to bump up the maximum runtime of the download process from the default of 4h to 1d:
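Something like the following could work as a sketch (the process selector is assumed from the process name mentioned below and may differ between pipeline versions; replace the trailing `...` with your usual options):

```bash
# Write a custom config that raises the walltime of the FTP download
# process from 4h to 1d, then pass it to the pipeline with -c.
cat > custom.config <<'EOF'
process {
    withName: 'SRA_FASTQ_FTP' {
        time = '1d'
    }
}
EOF

nextflow run nf-core/fetchngs -c custom.config ...
```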
Currently running into similar trouble with FTP downloads that fail to finish, even after a couple of retries. @suhrig thanks for submitting your PR, but it doesn't look like it has been merged yet due to some conflicts. Could you have another look?
I hope to get around to doing this next week. Sorry for the delay.
Should be fixed in #241. Please feel free to re-open if any problems persist. |
Description of the bug
The process NFCORE_FETCHNGS:SRA:SRA_FASTQ_FTP is configured to cancel a download if it takes longer than 1200 seconds (see here). Long-running downloads are thus interrupted and an error is raised. Unfortunately, curl does not resume a download when it retries, even when the parameter --continue-at - is used. As stated in the error message, it "throws away" the downloaded data and starts over. This effectively means that the download never completes.
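For illustration, the failure mode described above looks roughly like this (a simplified sketch, not the pipeline's literal command; $URL and $OUT are placeholders):

```bash
# --max-time caps the whole transfer at 20 minutes, so a healthy but slow
# download is killed; on retry the partial data is discarded and the
# transfer restarts from byte 0, even with --continue-at -.
curl --retry 5 --continue-at - --max-time 1200 --output "$OUT" "$URL"
```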
What is the purpose of --max-time? Is it to catch stalling downloads? Wouldn't it be better to detect stalling downloads as such, instead of assuming that any download that takes longer than 1200s must have stalled? Namely, --speed-limit 1 --speed-time 60 instructs curl to assume a download has stalled if the speed was less than 1 byte/sec for 60 seconds in a row.
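As a sketch of that alternative (again with placeholders):

```bash
# No cap on total duration; abort only if the average speed drops below
# 1 byte/sec for 60 consecutive seconds, i.e. the transfer has stalled.
curl --speed-limit 1 --speed-time 60 --output "$OUT" "$URL"
```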
Another option would be to wrap the curl command in a bash for-loop in conjunction with --continue-at -, so that each retry resumes the partial download instead of starting over.
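A minimal sketch of such a loop (the retry count is arbitrary):

```bash
# Retry up to 5 times; --continue-at - makes each attempt resume from the
# size of the partially downloaded file instead of starting from scratch.
for attempt in {1..5}; do
    curl --continue-at - --output "$OUT" "$URL" && break
done
```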
Alternatively: why not use wget, which does not throw away downloaded data upon retry and also has stall detection (--read-timeout=60)?
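For comparison, a sketch with wget (same placeholders):

```bash
# wget keeps the partial file and resumes it on retry (--continue); a read
# timeout of 60s treats a stalled connection as a failure worth retrying.
wget --continue --tries=5 --read-timeout=60 -O "$OUT" "$URL"
```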