download of big files time out and don't resume #169

Closed
suhrig opened this issue Jun 11, 2023 · 10 comments

@suhrig

suhrig commented Jun 11, 2023

Description of the bug

The process NFCORE_FETCHNGS:SRA:SRA_FASTQ_FTP is configured to cancel a download if it takes longer than 1200 seconds (see here). Long-running downloads are thus interrupted and the following error is raised:

curl: (28) Operation timed out after 1200000 milliseconds with 23601274614 out of 49920124494 bytes received
Warning: Problem : timeout. Will retry in 1 seconds. 5 retries left.
Throwing away 23601274614 bytes

Unfortunately, curl does not resume a download when it retries, even when the parameter --continue-at - is used. As stated in the error message, it "throws away" the downloaded data and starts over. This effectively means that a download which cannot finish within the time limit never completes.

  1. What's the point of the parameter --max-time? Is it to catch stalling downloads? Wouldn't it be better to detect stalled downloads as such, instead of assuming that any download that takes longer than 1200s must have stalled? For example, --speed-limit 1 --speed-time 60 instructs curl to treat a download as stalled if the speed stays below 1 byte/sec for 60 seconds in a row.
  2. Stall detection alone does not resolve the main issue that downloads are not resumed upon retries. To resume, one has to wrap the curl command in a bash for-loop in conjunction with --continue-at -. Alternatively, why not use wget, which does not throw away downloaded data upon retry and also has stall detection (--read-timeout=60)?
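
For illustration, the for-loop approach from point 2 might look roughly like the sketch below. This is not the code from any pull request: the function name download_with_resume, the RETRY_DELAY variable, and the default of 5 attempts are hypothetical choices made for this example. It relies only on the curl options named above plus --fail, --location, and --output.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the pipeline):
#   download_with_resume URL OUTPUT [MAX_ATTEMPTS]
# Re-invokes curl until it succeeds, resuming the partial file each time.
download_with_resume() {
    local url="$1" output="$2" max_attempts="${3:-5}"
    local attempt
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        # --continue-at -   resume from the current size of the output file
        # --speed-limit 1 --speed-time 60
        #                   abort if slower than 1 byte/s for 60 s in a row
        # No --max-time: a slow but steadily progressing download is allowed.
        if curl --fail --location --continue-at - \
                --speed-limit 1 --speed-time 60 \
                --output "$output" "$url"; then
            return 0
        fi
        echo "Attempt ${attempt}/${max_attempts} failed; resuming" >&2
        # RETRY_DELAY is an assumed knob for this sketch (seconds between tries)
        sleep "${RETRY_DELAY:-1}"
    done
    return 1
}
```

Because each iteration is a fresh curl invocation, the partial file on disk is kept and extended rather than thrown away, which is the behaviour that curl's built-in retry does not provide here. One caveat of this sketch: if the file is already complete, some servers reject the resume request and curl reports an error, so a real implementation would probably want to compare the file size against the expected size first.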
@suhrig suhrig added the bug Something isn't working label Jun 11, 2023
@Midnighter
Contributor

Those are all very good suggestions. Would you be up for creating a PR that implements the most general ones?

@suhrig
Author

suhrig commented Sep 4, 2023

Yes, I can do that.

@drpatelh
Member

drpatelh commented Sep 4, 2023

Great! Thank you 🙏🏽

@samleenz

Any update on this, or thoughts on an alternative solution in the meantime?

Downloading from the ENA FTP servers appears to be slow at our institute currently for some reason, so the curl command keeps hitting the time limit and failing.

Thanks!

@suhrig
Author

suhrig commented Oct 18, 2023

I have already implemented a patch and am testing it locally, but the tests are taking longer than usual because ENA/SRA is slow at the moment. I will submit a pull request with the changes I have so far, so that you can test it in parallel and benefit from the enhancements right away.

@suhrig suhrig mentioned this issue Oct 18, 2023
5 tasks
@suhrig
Author

suhrig commented Oct 18, 2023

Here is the PR: #229. Feel free to test and give feedback. Thanks.

Given the slowness of ENA/SRA at the moment, you probably want to bump up the maximum runtime of the download process from the default of 4h to 1d:

process {
   withName:SRA_FASTQ_FTP {
      time = '1d'
   }
}

@amizeranschi

Currently running into similar trouble with failed FTP downloads that don't get to finish, even after a couple of retries.

@suhrig thanks for submitting your PR, but it doesn't look like it's been merged yet due to some conflicts. Could you have another look?

@suhrig
Author

suhrig commented Nov 24, 2023

I hope to get around to doing this next week. Sorry for the delay.

@drpatelh drpatelh added this to the 1.12.0 milestone Jan 3, 2024
@drpatelh
Member

drpatelh commented Jan 3, 2024

x-ref #241 replacement for PR #229

@drpatelh
Member

drpatelh commented Jan 4, 2024

Should be fixed in #241. Please feel free to re-open if any problems persist.

@drpatelh drpatelh closed this as completed Jan 4, 2024
5 participants