-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validphys script to download hepdata tables #1717
Conversation
FWIW a minimum non-scripting usage of requests involves
Otherwise the request may fail with a status code and write the wrong thing to the file. In practice we'd have some reasonable error handling for that as well. |
""" | ||
self.folder.mkdir(exist_ok=True) | ||
ins_id = self.url.split("/")[-1] | ||
filename = f"HEPData-{ins_id}-v{self.version}-Table_{table_id}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like table_id=None works particularly well here. Why is it in the function signature?
validphys2/setup.py
Outdated
@@ -33,6 +33,7 @@ | |||
'vp-nextfitruncard = validphys.scripts.vp_nextfitruncard:main', | |||
'vp-hyperoptplot = validphys.scripts.vp_hyperoptplot:main', | |||
'vp-deltachi2 = validphys.scripts.vp_deltachi2:main', | |||
'get_hepdata = validphys.scripts.get_hepdata:main', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I love this name. Certainly not the underscore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right! For sure people can propose a better name, but it is correct that at least in order be consistent with the previous ones the underscore should be -
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vp-data
or vp-hepdata
seem quite natural
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like a reasonable suggestion. But perhaps for more clarity we should go for vp-getdata
(see 1ded03f)?
if "YAML" in url_tab["description"]: | ||
get_reads_yaml = requests.get(url_tab["contentUrl"]) | ||
self.write_tables(get_reads_yaml.content, table_id) | ||
log.info(f"Table {table_id} downloaded and stored properly.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this: Surely the table is only written when self.write_tables is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! The indentation of the log was misplaced, this should be fixed now.
Agree! The error handling is the main part that is still missing here. I was hoping that people will try, document the errors, and we can address them all. |
Closing as no longer required. |
As far as I understood from the report of the last CM, the script to download the HepData tables should be implemented in a separate PR and not part of #1693 as mentioned in #1699.
The following is a slight improvement on the script initially introduced in #1500. It is working fine for some of the datasets I tried in #1684 and #1699 but people should try it.