Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polish analysis of a multiclosure test #1982

Open
wants to merge 61 commits into
base: master
Choose a base branch
from
Open

Conversation

comane
Copy link
Member

@comane comane commented Mar 5, 2024

Here we collect new functions and template that allow the analysis of multiclosure tests.
@comane @giovannidecrescenzo @mariaubiali @andreab1997

@comane comane force-pushed the 240305_multict_analysis branch 4 times, most recently from 06db65c to b2b8557 Compare March 12, 2024 10:10
@comane comane force-pushed the 240305_multict_analysis branch 2 times, most recently from bf843e1 to 46bba4f Compare March 17, 2024 16:57
Copy link
Member Author

@comane comane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the PR is ready for review.
The only things I would change as commented above is that the single data point ratio bias variance analysis does not reside in kinematics.py but rather in multiclosure_inconsistent.py / multiclosure_inconsistent_output.py

@andreab1997
Copy link
Contributor

@comane thanks for this. If you want, ask myself for the review (but also add one between @scarlehoff and @RoyStegeman given that part of this PR is based on my work and it does not make sense for me to review my own functions )

@comane comane force-pushed the 240305_multict_analysis branch 2 times, most recently from 509a521 to a09d394 Compare April 24, 2024 16:31
Copy link
Member Author

@comane comane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ciao @andreab1997 , I added some minor comments on some of the parts of the code that you and Giovanni wrote.

When you have the time, feel free to add a review to the inconsistent_closuretest folder

validphys2/src/validphys/scripts/vp_multiclosure.py Outdated Show resolved Hide resolved
Comment on lines +23 to +40
compare_settings:
current:
fit: {id: "240210_mnc_dis_ict_lam02"}
pdf: {id: "240210_mnc_dis_ict_lam02", label: "Current Fit"}
theory:
from_: fit
theoryid:
from_: theory
speclabel: "Current Fit"

reference:
fit: {id: "240210_mnc_dis_ict_lam04"}
pdf: {id: "240210_mnc_dis_ict_lam04", label: "Reference Fit" }
theory:
from_: fit
theoryid:
from_: theory
speclabel: "Reference Fit"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andreab1997 , a weird thing here is that we can basically compare as many fits (see lambdavalues below) as we want in terms of pca_template, single_point_template, etc..

However, correct me if I am wrong, the part of this script that does vp-comparefits only compares two.

So if the above is right, I think that it would be nicer to have a multi vp-comparefits (meaning that more than 2 fits can be compared in principle) feature for this script.

validphys2/src/validphys/kinematics.py Outdated Show resolved Hide resolved
validphys2/src/validphys/closuretest/multiclosure.py Outdated Show resolved Hide resolved
validphys2/src/validphys/closuretest/multiclosure.py Outdated Show resolved Hide resolved


dataset_inputs:
- {dataset: ATLAS_SINGLETOP_7TEV_TCHANNEL-XSEC, variant: legacy}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you using the legacy versions? Is that necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fits that I use in that runcard were done using th 200. I think this is the reason, why I added the legacy version



@check_multifit_replicas
def internal_multiclosure_dataset_loader_pca(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this function is too big for my taste, maybe it could be splitted in some more unit functions

Plot the L2 condition number of the covariance matrix as a function of the explained variance ratio.
The plot gives an idea of the stability of the covariance matrix as a function of the
exaplained variance ratio and hence the number of principal components used to reduce the dimensionality.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

validphys2/src/validphys/kinematics.py Outdated Show resolved Hide resolved
@comane comane force-pushed the 240305_multict_analysis branch 2 times, most recently from b0ec210 to 59e3eea Compare June 11, 2024 16:00
@scarlehoff
Copy link
Member

This is then ready to be merged? (@andreab1997 @comane )

@andreab1997
Copy link
Contributor

This is then ready to be merged? (@andreab1997 @comane )

I don't think so, surely I need to review this again but in any case I would wait for the paper to be published.

@scarlehoff
Copy link
Member

This piece is complete and already rebased on top of master isn't it? If so it should be merged, worst case scenario you can note down the checksum of the commit but leaving it as a branch risks nobody taking care of it after the paper is out (which is what happened with the previous closure test branches and basically meant redoing a lot of stuff that mwilson already did)

@andreab1997
Copy link
Contributor

This piece is complete and already rebased on top of master isn't it? If so it should be merged, worst case scenario you can note down the checksum of the commit but leaving it as a branch risks nobody taking care of it after the paper is out (which is what happened with the previous closure test branches and basically meant redoing a lot of stuff that mwilson already did)

Ok I agree but still I would like to have another look. I will do before the end of this week in such a way we can merge this before saturday. Is that ok?

Copy link
Contributor

@andreab1997 andreab1997 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I had a look and more or less nothing relevant changed since the last time I reviewed. @comane if you can answer the comments I left last time we can probably merge this soon.

@comane
Copy link
Member Author

comane commented Jul 24, 2024

Hi @andreab1997, I addressed most of the comments that you left, as well as those that I wrote myself.

There is one main issue with this PR at present, namely, the way in which the compare inconsistent closure test script works.

This script is problematic, because it's not possible to have a dataset_inputs which is different from from_: fit, note that this is a problem since we want to have out of sample datasets.

The problem, I think, can be solved by removing the vp-compare fits from compareinconsistentclosuretemplates.
We don't need to have another vp-comparefits anyways.

If you could take care of this that would be great.

@andreab1997
Copy link
Contributor

andreab1997 commented Jul 24, 2024

Hi @andreab1997, I addressed most of the comments that you left, as well as those that I wrote myself.

There is one main issue with this PR at present, namely, the way in which the compare inconsistent closure test script works.

This script is problematic, because it's not possible to have a dataset_inputs which is different from from_: fit, note that this is a problem since we want to have out of sample datasets.

The problem, I think, can be solved by removing the vp-compare fits from compareinconsistentclosuretemplates. We don't need to have another vp-comparefits anyways.

If you could take care of this that would be great.

Just to understand, this issue is only there if you use the CLI or even if you write your own runcard and template?

@comane
Copy link
Member Author

comane commented Jul 24, 2024

Just to understand, this issue is only there if you use the CLI or even if you write your own runcard and template?

I think it's there if I run validphys multiclosure_analysis.yaml (with dataset inputs that is not from the fit)

comane and others added 29 commits July 24, 2024 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants