-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[New commondata format] Jets #1699
Conversation
@author: Mark N. Costantini | ||
""" | ||
|
||
import yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general note, I'd prefer using the ruamel.yaml library (or from reportengine.compat import yaml
).
It doesn't matter very much if you want parse into a simple dict, but it is better to use only one yaml library in the whole codebase, and also the capability to know about things like line numbers and comments is useful for things like this https://validobj.readthedocs.io/en/latest/examples.html#yaml-line-numbers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relatedly, I wonder if the processing of the hepdata raw files could be done with validobj (like we already do for the commondata new format). May or may not be simpler, but very likely more reusable.
@enocera @Radonirinaunimi what was the status of the thing that auto-downloaded the tables? Shouldn't it be used in this kind of scripts? |
dat_file = '/Users/markcostantini/codes/nnpdfgit/nnpdf/buildmaster/results/DATA_ATLAS_2JET_7TEV_R06.dat' | ||
sys_file = '/Users/markcostantini/codes/nnpdfgit/nnpdf/buildmaster/results/systypes/SYSTYPE_ATLAS_2JET_7TEV_R06_DEFAULT.dat' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be generalized.
Clum = np.einsum('ij,kj->ik',Alum,Alum) | ||
|
||
# construct Block diagonal statistical Covariance matrix | ||
ndata = [21, 21, 19, 17, 8, 4] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a bunch of magic numbers here that feels like they should be read from somewhere.
# construct Block diagonal statistical Covariance matrix | ||
ndata = [21, 21, 19, 17, 8, 4] | ||
stat = pd.read_csv(f"rawdata/dijet_statcov/hepcov_R06_Eta0.txt", sep = " ", header = None) | ||
BD_stat = stat.loc[13:,1:ndata[0]].to_numpy().astype(float) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these casts needed?
with open("uncertainties_weaker.yaml",'w') as file: | ||
yaml.dump(uncertainties_yaml,file, sort_keys=False) | ||
|
||
return covmat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like we are using the return value.
It must be indeed part of a PR in which the changes in #1693 are introduced. AFAIC, adding this feature will just amount to moving the implementation in #1500 and ask people to test it. However, I am not sure if the utils in #1693 are already pointing to the correct branch (based on the exchanges that seems to be not yet the case). |
…r the three correlation scenarios
b2b0b14
to
b0ce842
Compare
PR for the implementation of existing Jet datasets into new CommonData format.