Polish analysis of a multiclosure test #1982

comane · 2024-03-05T16:17:13Z

Here we collect new functions and templates for the analysis of multiclosure tests.
@comane @giovannidecrescenzo @mariaubiali @andreab1997

validphys2/src/validphys/kinematics.py

comane

I think that the PR is ready for review.
The only thing I would change, as commented above, is to move the single data point ratio bias-variance analysis out of kinematics.py and into multiclosure_inconsistent.py / multiclosure_inconsistent_output.py.

andreab1997 · 2024-03-18T10:10:03Z

@comane thanks for this. If you want, request a review from me, but also add either @scarlehoff or @RoyStegeman: part of this PR is based on my work, and it does not make sense for me to review my own functions.

comane

Ciao @andreab1997 , I added some minor comments on some of the parts of the code that you and Giovanni wrote.

When you have the time, feel free to add a review to the inconsistent_closuretest folder

validphys2/src/validphys/scripts/vp_multiclosure.py

comane · 2024-04-24T16:39:40Z

validphys2/src/validphys/compareinconsistentclosuretemplates/multiclosure_analysis.yaml

+compare_settings:
+  current:
+    fit: {id: "240210_mnc_dis_ict_lam02"}
+    pdf: {id: "240210_mnc_dis_ict_lam02", label: "Current Fit"}
+    theory:
+      from_: fit
+    theoryid:
+      from_: theory
+    speclabel: "Current Fit"
+
+  reference:
+    fit: {id: "240210_mnc_dis_ict_lam04"}
+    pdf: {id: "240210_mnc_dis_ict_lam04", label: "Reference Fit" }
+    theory:
+      from_: fit
+    theoryid:
+      from_: theory
+    speclabel: "Reference Fit"


Hi @andreab1997, one odd thing here is that pca_template, single_point_template, etc. can compare as many fits as we want (see lambdavalues below).

However, correct me if I am wrong, the part of this script that runs vp-comparefits only compares two.

So, if the above is right, I think it would be nicer for this script to have a multi-fit vp-comparefits feature, i.e. one that can in principle compare more than two fits.
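One way to picture the request is to reduce an N-fit comparison to a list of two-fit comparisons, each expressed in the same shape as the `compare_settings` block above. The sketch below is only an illustration: `comparison_pairs`, `compare_settings` and the third fit id are hypothetical names introduced here, not part of validphys or of this PR.

```python
from itertools import combinations


def comparison_pairs(fit_ids, reference=None):
    """Return the (current, reference) pairs a two-fit comparison would need.

    If `reference` is given, every other fit is compared against it;
    otherwise all unordered pairs of fits are returned.
    """
    fit_ids = list(dict.fromkeys(fit_ids))  # drop duplicates, keep order
    if reference is not None:
        if reference not in fit_ids:
            raise ValueError(f"reference fit {reference!r} is not in the fit list")
        return [(fit, reference) for fit in fit_ids if fit != reference]
    return list(combinations(fit_ids, 2))


def compare_settings(current, reference):
    """Build one mapping shaped like the compare_settings block in the template."""

    def block(fit_id, label):
        return {
            "fit": {"id": fit_id},
            "pdf": {"id": fit_id, "label": label},
            "theory": {"from_": "fit"},
            "theoryid": {"from_": "theory"},
            "speclabel": label,
        }

    return {
        "current": block(current, "Current Fit"),
        "reference": block(reference, "Reference Fit"),
    }


# The first two ids come from the template; the third is a made-up placeholder.
fits = ["240210_mnc_dis_ict_lam02", "240210_mnc_dis_ict_lam04", "hypothetical_third_fit"]
pairs = comparison_pairs(fits, reference="240210_mnc_dis_ict_lam04")
all_settings = [compare_settings(current, ref) for current, ref in pairs]
```

Each entry of `all_settings` could then feed one two-fit comparison, so the existing two-fit logic would not need to change.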

validphys2/src/validphys/kinematics.py

validphys2/src/validphys/closuretest/multiclosure.py

validphys2/src/validphys/closuretest/multiclosure_output.py

andreab1997 · 2024-05-07T09:26:30Z

validphys2/examples/pca_bias_variance_report.yaml

+
+
+dataset_inputs:
+  - {dataset: ATLAS_SINGLETOP_7TEV_TCHANNEL-XSEC, variant: legacy}


Why are you using the legacy versions? Is that necessary?

comane · 2024-07-24

The fits used in that runcard were done with theory 200; I think that is why I added the legacy variant.

validphys2/src/validphys/closuretest/inconsistent_closuretest/multiclosure_inconsistent.py

andreab1997 · 2024-05-07T09:30:46Z

validphys2/src/validphys/closuretest/inconsistent_closuretest/multiclosure_inconsistent.py

+
+
+@check_multifit_replicas
+def internal_multiclosure_dataset_loader_pca(


In general this function is too big for my taste; maybe it could be split into smaller unit functions.
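As an illustration of the kind of split meant here, a PCA-based loader could be separated into small steps: build the covariance of the predictions over replicas, diagonalise it, choose the number of components, and project. This is a minimal sketch under stated assumptions, not the PR's code: all function names, the `(n_data, n_replicas)` layout and the default ratio are hypothetical.

```python
import numpy as np


def replica_covariance(predictions):
    """Covariance over replicas; `predictions` has shape (n_data, n_replicas)."""
    return np.cov(predictions)


def eigendecomposition(covmat):
    """Eigenvalues in descending order with the matching eigenvectors."""
    eigvals, eigvecs = np.linalg.eigh(covmat)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


def n_components_for(eigvals, explained_variance_ratio):
    """Smallest number of leading components reaching the requested ratio."""
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    n_comp = int(np.searchsorted(cumulative, explained_variance_ratio)) + 1
    return min(n_comp, len(eigvals))  # guard against rounding at ratio = 1


def pca_reduce(predictions, explained_variance_ratio=0.99):
    """Compose the steps: return the PCA basis and the reduced covariance."""
    covmat = replica_covariance(predictions)
    eigvals, eigvecs = eigendecomposition(covmat)
    n_comp = n_components_for(eigvals, explained_variance_ratio)
    basis = eigvecs[:, :n_comp]
    return basis, basis.T @ covmat @ basis


# Toy input: 5 data points, 200 replicas.
rng = np.random.default_rng(0)
toy_predictions = rng.normal(size=(5, 200))
basis, reduced_covmat = pca_reduce(toy_predictions, explained_variance_ratio=0.9)
```

Each helper can then be tested on its own, and the top-level function only wires them together.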

andreab1997 · 2024-05-07T09:33:13Z

...phys2/src/validphys/closuretest/inconsistent_closuretest/multiclosure_inconsistent_output.py

+    Plot the L2 condition number of the covariance matrix as a function of the explained variance ratio.
+    The plot gives an idea of the stability of the covariance matrix as a function of the
+    exaplained variance ratio and hence the number of principal components used to reduce the dimensionality.
+
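For readers unfamiliar with the quantity in this docstring (this block is an illustration, not the reviewer's suggested change): in the PCA basis the reduced covariance matrix is diagonal, so its L2 condition number is the ratio of the largest to the smallest retained eigenvalue. The sketch below assumes that reading of the docstring; `condition_number_scan` and the toy data are hypothetical.

```python
import numpy as np


def condition_number_scan(covmat, ratios):
    """Return (ratio, n_components, condition_number) for each requested ratio."""
    eigvals = np.sort(np.linalg.eigvalsh(covmat))[::-1]  # descending order
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    results = []
    for ratio in ratios:
        n_comp = min(int(np.searchsorted(cumulative, ratio)) + 1, len(eigvals))
        kept = eigvals[:n_comp]
        # For a symmetric positive-definite matrix the L2 condition number
        # is the largest eigenvalue divided by the smallest one.
        results.append((ratio, n_comp, float(kept[0] / kept[-1])))
    return results


# Toy data: 6 directions with very different spreads, 500 samples each.
rng = np.random.default_rng(0)
scales = np.array([10.0, 5.0, 1.0, 0.5, 0.1, 0.01])
samples = rng.normal(size=(6, 500)) * scales[:, None]
scan = condition_number_scan(np.cov(samples), ratios=[0.5, 0.9, 0.99, 0.9999])
```

Keeping more components raises the explained variance ratio but also pulls in smaller eigenvalues, so the condition number can only grow or stay the same along the scan.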


Suggested change

validphys2/src/validphys/closuretest/multiclosure_output.py

validphys2/src/validphys/kinematics.py

scarlehoff · 2024-07-22T07:59:06Z

This is then ready to be merged? (@andreab1997 @comane )

andreab1997 · 2024-07-24T08:08:46Z

> This is then ready to be merged? (@andreab1997 @comane )

I don't think so. I certainly need to review this again, and in any case I would wait for the paper to be published.

scarlehoff · 2024-07-24T08:18:15Z

This piece is complete and already rebased on top of master, isn't it? If so, it should be merged. Worst case, you can note down the commit hash. Leaving it as a branch risks nobody taking care of it after the paper is out, which is what happened with the previous closure test branches and basically meant redoing a lot of work that mwilson had already done.

andreab1997 · 2024-07-24T08:21:23Z

> This piece is complete and already rebased on top of master isn't it? If so it should be merged, worst case scenario you can note down the checksum of the commit but leaving it as a branch risks nobody taking care of it after the paper is out (which is what happened with the previous closure test branches and basically meant redoing a lot of stuff that mwilson already did)

Ok, I agree, but I would still like to have another look. I will do it before the end of this week so that we can merge this before Saturday. Is that ok?

andreab1997

Ok I had a look and more or less nothing relevant changed since the last time I reviewed. @comane if you can answer the comments I left last time we can probably merge this soon.

comane · 2024-07-24T10:36:34Z

Hi @andreab1997, I addressed most of the comments that you left, as well as those that I wrote myself.

There is one main issue with this PR at present, namely, the way in which the compare inconsistent closure test script works.

This script is problematic because dataset_inputs cannot be set to anything other than from_: fit, which is a problem since we want to use out-of-sample datasets.

I think the problem can be solved by removing vp-comparefits from compareinconsistentclosuretemplates.
We don't need another vp-comparefits anyway.

If you could take care of this that would be great.

andreab1997 · 2024-07-24T10:42:47Z

> Hi @andreab1997, I addressed most of the comments that you left, as well as those that I wrote myself.
>
> There is one main issue with this PR at present, namely, the way in which the compare inconsistent closure test script works.
>
> This script is problematic, because it's not possible to have a dataset_inputs which is different from from_: fit, note that this is a problem since we want to have out of sample datasets.
>
> The problem, I think, can be solved by removing the vp-compare fits from compareinconsistentclosuretemplates. We don't need to have another vp-comparefits anyways.
>
> If you could take care of this that would be great.

Just to understand: does this issue only appear when you use the CLI, or also when you write your own runcard and template?

comane · 2024-07-24T10:50:18Z

> Just to understand, this issue is only there if you use the CLI or even if you write your own runcard and template?

I think it's there if I run validphys multiclosure_analysis.yaml with dataset_inputs that are not taken from the fit.


comane force-pushed the 240305_multict_analysis branch 4 times, most recently from 06db65c to b2b8557 Compare March 12, 2024 10:10

comane force-pushed the 240305_multict_analysis branch 2 times, most recently from bf843e1 to 46bba4f Compare March 17, 2024 16:57

comane commented Mar 17, 2024

validphys2/src/validphys/kinematics.py (outdated)

comane commented Mar 17, 2024

comane requested review from scarlehoff, RoyStegeman and andreab1997 March 19, 2024 20:22

comane force-pushed the 240305_multict_analysis branch from 5fca583 to 91d493d Compare April 11, 2024 21:35

comane force-pushed the 240305_multict_analysis branch 2 times, most recently from 509a521 to a09d394 Compare April 24, 2024 16:31

comane commented Apr 24, 2024

andreab1997 requested changes May 7, 2024

comane force-pushed the 240305_multict_analysis branch 2 times, most recently from b0ec210 to 59e3eea Compare June 11, 2024 16:00

comane force-pushed the 240305_multict_analysis branch from d6c0929 to 76001f4 Compare June 27, 2024 13:42

scarlehoff added the closure tests label Jul 17, 2024

comane force-pushed the 240305_multict_analysis branch from 76001f4 to cc5f8ac Compare July 17, 2024 14:24

andreab1997 reviewed Jul 24, 2024

comane and others added 29 commits July 24, 2024 14:22

removed sklearn dep from conda recipe meta file

5606f3e

removed variancepdf as unused

ee8bf09

added check_multifit_replicas check

7144ac9

use plotting dataset labels for rbv vs lambda titles

ce5ff2d

added hlines for rbv = 1

328a892

added bootstrapped_internal_multiclosure_dataset_loader for tuple of bootstrapped multiclosuretests

1f4eacc

bootstrap of PCA regularised multiclosure tests

8689a18

bootstrap for internal_data_loadeer objecgs

2690bc9

added bootstrapped_principal_components_bias_variance_dataset

cc77ed9

added bootstrap table to report

493e769

changed defaults of bootstrap

fb4fb4e

added title for single data point in latex mode

b154273

Add PCA on corr matrix on full dataset

a1d6a08

fixed inconsistency with single data point and suggest different way to compute error bands after bootstrap

d1808ec

added delta plots

7c20ec8

use consistent bootstrap def and separate table datasets from table data

6f2a8ef

fixed PCA of correlation matrix for full dataset

75773d4

compute rbv scan using bootstrap uncertainty quantification

cb61d22

added full data bootstrapped table

07b16a6

slight change in def of delta hist

099af95

added definition of delta in line with eq. 2.22

89398ef

added definition of delta in line with eq. 2.22

8d71a2e

added rbv scan for full dataset

8df13b2

added bootstrapped xi indicator function for full dataset

defcd96

removed unused import from vp_multiclosure.py script

e03c29e

removed new lines

5b03cf4

removed unused variables

ddc3c0d

_covmats as array instead of list of arrays

e4fe438

added eigendecomposition function

7cfdd9a

comane force-pushed the 240305_multict_analysis branch from 0740706 to 7cfdd9a Compare July 24, 2024 13:22
