import fsspec

fs = fsspec.filesystem("http")
# cat() on a list of URLs fetches them concurrently and returns a
# dict mapping each URL to its content as bytes
out = fs.cat([url1, url2, url3])
Our current option is to use multithreading through the parallel option of the data fetcher, i.e. in httpstore.open_mfdataset. With this design, pre/post-processing of Argo data is applied to chunks in parallel, but that is different from downloading everything in parallel first and then processing in parallel (possibly with another mechanism).
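To sketch that distinction, a "download first, then process" pipeline can be written with a standard-library thread pool. Here fetch and process are hypothetical stand-ins for the actual HTTP download and the Argo pre/post-processing steps, not argopy code:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: a real version would do an HTTP GET and
# Argo pre/post-processing; here we just tag the data so the flow is visible.
def fetch(url):
    return "data from {}".format(url)

def process(payload):
    return payload.upper()

def fetch_then_process(urls, max_workers=8):
    # Stage 1: download everything in parallel (I/O-bound, so threads work well)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        raw = list(pool.map(fetch, urls))
    # Stage 2: process the results afterwards (could use another pool, or
    # multiprocessing if the processing step is CPU-bound)
    return [process(r) for r in raw]

print(fetch_then_process(["u1", "u2"]))  # ['DATA FROM U1', 'DATA FROM U2']
```

The two stages are decoupled, so each can use the mechanism best suited to its workload.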
import asyncio
import time

import aiohttp

async def get(url, session):
    try:
        async with session.get(url=url) as response:
            resp = await response.read()
            print("Successfully got url {} with resp of length {}.".format(url, len(resp)))
    except Exception as e:
        print("Unable to get url {} due to {}.".format(url, e.__class__))

async def main(urls):
    async with aiohttp.ClientSession() as session:
        ret = await asyncio.gather(*(get(url, session) for url in urls))
    print("Finalized all. Return is a list of len {} outputs.".format(len(ret)))

# 'websites' is assumed to be a newline-separated string of URLs
urls = websites.split("\n")
start = time.time()
asyncio.run(main(urls))
end = time.time()
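Stripped of the network layer, the same gather pattern can be exercised with asyncio.sleep standing in for the HTTP round-trip, and with bodies returned rather than printed; this is a self-contained sketch, not aiohttp code:

```python
import asyncio

async def get(url):
    # Stand-in for an HTTP request: sleep briefly, then "return the body".
    await asyncio.sleep(0.01)
    return "resp for {}".format(url)

async def main(urls):
    # gather() schedules all coroutines concurrently and returns their
    # results in the same order as the input
    return await asyncio.gather(*(get(u) for u in urls))

results = asyncio.run(main(["u1", "u2", "u3"]))
print(len(results))  # 3
```

Because gather preserves input order, the returned list lines up with the URL list, which makes it easy to pair each response with its source file afterwards.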
This may be a design already implemented in the test_data CLI used to populate CI test data in mocked http servers. However, I wonder if we should do this when fetching a large number of files from one of the GDAC servers (https and s3)?
The fsspec http store is already asynchronous, but I don't quite understand how parallelisation is implemented for multi-file downloads:
(see eg this discussion of parallel requests in Python: https://stackoverflow.com/questions/57126286/fastest-parallel-requests-in-python)
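One point that discussion raises is bounding concurrency so a server is not hit with too many simultaneous requests. A standard-library sketch of that pattern, with fake_download as a hypothetical stand-in for the real transfer and MAX_CONCURRENT an assumed limit:

```python
import asyncio

MAX_CONCURRENT = 4  # assumed limit; would be tuned to what a GDAC server tolerates

async def fake_download(url, sem):
    # The semaphore caps how many downloads run at once; the sleep
    # stands in for the actual HTTP transfer.
    async with sem:
        await asyncio.sleep(0.01)
        return url

async def download_all(urls):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    return await asyncio.gather(*(fake_download(u, sem) for u in urls))

out = asyncio.run(download_all(["u{}".format(i) for i in range(10)]))
print(len(out))  # 10
```

All tasks are still created up front, but at most MAX_CONCURRENT are inside the download section at any moment, which keeps the request rate polite without serialising the whole batch.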