Is it possible to just download CDF files from CDAWeb without loading them?
@Beforerr not easily. The main issue is that the Speasy/CDAWEB module doesn't directly access CDF files from the archive; instead it uses the CDAWEB web service to generate CDF files on the fly, containing only the requested variable over the requested time range. Also, those CDF files are never saved to disk: they are read and converted to Speasy variables immediately. Depending on what you want to achieve, there could be other solutions.
Something like this:

    from speasy.core.any_files import list_files
    import requests
    import os
    import tqdm

    # Download every CDF file listed in a remote CDAWeb directory
    remote_dir = "https://cdaweb.gsfc.nasa.gov/pub/data/mms/mms1/fgm/srvy/l2/2016/06/"
    remote_files = list_files(remote_dir, file_regex=r".*\.cdf")
    destdir = "data"
    os.makedirs(destdir, exist_ok=True)
    for file_name in tqdm.tqdm(remote_files):
        with open(f"{destdir}/{file_name}", 'wb') as f:
            f.write(requests.get(f"{remote_dir}{file_name}").content)

    import speasy as spz
    import pickle

    # Fetch a variable through Speasy and save it to disk with pickle
    mms1_fgm_b_bcs_srvy_l2 = spz.get_data(spz.inventories.data_tree.cda.MMS.MMS1.FGM.MMS1_FGM_SRVY_L2.mms1_fgm_b_bcs_srvy_l2, "2018-01-01", "2018-01-02")
    fname = f"{mms1_fgm_b_bcs_srvy_l2.name}-{mms1_fgm_b_bcs_srvy_l2.time[0]}-{mms1_fgm_b_bcs_srvy_l2.time[-1]}.pkl"
    with open(fname, "wb") as f:
        f.write(pickle.dumps(mms1_fgm_b_bcs_srvy_l2))
    # Reload it and compare with the original variable
    with open(fname, "rb") as f:
        mms1_fgm_b_bcs_srvy_l2_loaded = pickle.load(f)
    mms1_fgm_b_bcs_srvy_l2 == mms1_fgm_b_bcs_srvy_l2_loaded

Do not hesitate to ask if you have more questions or if your use case is different.
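If you plan to run the download loop more than once, a possible variant is to skip files that are already on disk and to stream each file instead of holding it in memory. This is only a minimal sketch: `pending_downloads` and `download_all` are hypothetical helper names (they are not part of Speasy), and it uses `urllib` from the standard library rather than `requests`.

```python
import os
import shutil
import urllib.request


def pending_downloads(remote_dir, file_names, destdir):
    """Return (url, local_path) pairs for files not already present in destdir.

    Hypothetical helper, not part of Speasy.
    """
    base = remote_dir if remote_dir.endswith("/") else remote_dir + "/"
    pairs = []
    for name in file_names:
        local_path = os.path.join(destdir, name)
        if not os.path.exists(local_path):
            pairs.append((base + name, local_path))
    return pairs


def download_all(remote_dir, file_names, destdir, timeout=60):
    """Stream each missing file to disk (hypothetical helper).

    Data is written to a temporary '.part' file first, so an interrupted
    download never leaves a truncated file under its final name.
    """
    os.makedirs(destdir, exist_ok=True)
    for url, local_path in pending_downloads(remote_dir, file_names, destdir):
        tmp_path = local_path + ".part"
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses
        with urllib.request.urlopen(url, timeout=timeout) as response:
            with open(tmp_path, "wb") as f:
                shutil.copyfileobj(response, f)
        os.replace(tmp_path, local_path)
```

With the names from the snippet above, the call would be `download_all(remote_dir, remote_files, destdir)`; running it a second time should only fetch files that are still missing.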
@Beforerr, another thing you might consider is using the archive module; it might be faster than your current pipeline. The main downside is that you have to write some YAML files to describe the archive you want to access; an example is shipped with Speasy. It caches both CDF files and SpeasyVariables. We use it for MMS data, and it is much faster than the regular access methods. It can even be faster than using PyCDFPP if your CDF files are compressed (to be verified).
Do not hesitate to ask if you need help writing one of those YAML files.