Introduction of an inconsistency within a Closure Test #1682

Open: wants to merge 108 commits into base: master

Commits (108)
ce32d07
fixed bug affecting, for instance, vp-comparefits introduced from com…
Dec 23, 2022
4456035
Merge branch 'master' of github.com:NNPDF/nnpdf
comane Feb 27, 2023
a72185f
added MULT and ADD rescale methods to CommonData class
comane Feb 27, 2023
275e4ac
added logic to introduce inconsistency within a closure test
comane Feb 27, 2023
e8d802e
generate MULT errors in L1 covmat using the L0 central values
comane Feb 27, 2023
a6b2e5b
commented functions
comane Feb 28, 2023
e226442
add forgotten filter_closure_data_by_experiment
comane Feb 28, 2023
08948ab
generate L1 MULT unc of cov mat using exp central values
comane Feb 28, 2023
2e39d06
simplify logic of process_commondata
comane Mar 2, 2023
e56bba0
reduce error rescaling methods to one
comane Mar 2, 2023
ef5b042
reduce error rescaling methods to one
comane Mar 2, 2023
a2e58cd
removed unnecessary rescaling methods
comane Mar 2, 2023
25afee5
modified pseudodata and added Test dir
Mar 9, 2023
f009a10
revert last commit
Mar 9, 2023
9b2ac59
Added possible way to perform a level 1 inconsistent closure test
Mar 14, 2023
1edff9b
Added comment in validphys.pseudodata.make_level1_data
Mar 14, 2023
854b4ba
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
Mar 14, 2023
6d19e0a
add InconsistentCommonData class inheriting from CommonData
comane Mar 20, 2023
68c8698
added tests for methods of InconsistenCommonData class
comane Mar 20, 2023
b79245f
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
comane Mar 20, 2023
15cf785
added test for process_commondata
comane Mar 21, 2023
e98306a
Changed inconsistent fits labels
Mar 21, 2023
8e07ffd
methods for inconsistent ct moved to inconsistent_ct.py
comane Mar 21, 2023
f6830cc
use InconsistentCommonData instance
comane Mar 21, 2023
fe80a54
removed superfluos code
comane Mar 21, 2023
fc83349
use InconsistentCommonData instance
comane Mar 21, 2023
2772e61
use new inconsis labels
comane Mar 21, 2023
bbad659
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
Mar 27, 2023
ac900d9
changed last label lvl1->type1 inconsistent
Mar 27, 2023
02ca89a
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
Mar 28, 2023
a85da6e
minor mod
Apr 13, 2023
3e1e58e
comment function filter_closure_data_by_experiment
comane Apr 17, 2023
52b3f2f
added doc to make_level1
comane Apr 17, 2023
7f0f1b5
set rtol to default
comane Apr 17, 2023
32cb98a
added plot for sqrt_bias_variance_ratio
comane Apr 18, 2023
80994cd
corrected type
comane Apr 18, 2023
0c46e4d
changed label
comane Apr 18, 2023
6727ca2
added function for b/v plots
Apr 18, 2023
f869e80
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
Apr 18, 2023
90a5e02
minor label
Apr 18, 2023
0d4bab8
labels
Apr 18, 2023
918c42d
added funciton for difference between modified/original covmat. Modif…
Apr 18, 2023
709121e
added logging
comane Apr 18, 2023
abd9949
added function to compute b/v ratio for each dataset
comane Apr 18, 2023
f05a03e
added table for b/v ratio for each dataset
comane Apr 18, 2023
628e416
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
comane Apr 18, 2023
b3fbd76
added docs
comane Apr 18, 2023
2ee2e1e
added progressive b/v plot for all data
comane Apr 18, 2023
b740b85
modified formatting
comane Apr 19, 2023
39c6f37
internal_multiclosure_dataset_loader takes t0_covmat_from_systematics
comane Apr 19, 2023
8dfe470
removed unnecessary funcs
comane Apr 19, 2023
6fb1db7
added sqrt column to table
comane Apr 19, 2023
f407196
.
comane Apr 19, 2023
fe146ea
changed doc
comane Apr 19, 2023
9667cfb
changed doc
comane Apr 19, 2023
53bdaa4
use correct mean value for sqrt b/v plot label
comane Apr 19, 2023
45c95e6
deleted bias_variance_ratio_for_each_dataset function that was introd…
comane Apr 25, 2023
0f7e15e
added table for Rbv by process
comane Apr 25, 2023
71fef24
deleted sqrt_experiments_bias_variance_ratio function as the same as …
comane Apr 25, 2023
eb1902b
deleted sqrt_datasets_bias_variance_ratio since the same as datasets_…
comane Apr 25, 2023
b354a7f
added function for inconsistency impact
Apr 25, 2023
7a6e575
.
Apr 25, 2023
4207044
added functions for bias and variance table
Apr 26, 2023
82ab29f
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
Apr 26, 2023
a839c26
added plot xq2 with rbv
comane Apr 26, 2023
e17907c
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
comane Apr 26, 2023
4c75ea5
.
comane Apr 26, 2023
0c19a36
modified func
Apr 26, 2023
7f37d54
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
Apr 26, 2023
a5b615d
.
Apr 26, 2023
66473d8
.
Apr 26, 2023
967395c
..
Apr 26, 2023
0c33c16
added plot function for Rbv sensitivity to missing test dataset
comane Apr 26, 2023
98d9708
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
comane Apr 26, 2023
7fc132e
mod doc
comane Apr 27, 2023
40a6e3f
minor mod
Apr 28, 2023
55d3a44
.
Apr 28, 2023
2581a26
show correct label in sqrt rbv plot
comane Apr 28, 2023
9ac8b96
Merge branch 'inconSYStencies' of http://github.com/NNPDF/nnpdf into …
comane Apr 28, 2023
d683b5a
added weighted average for progressive R_bv
May 4, 2023
d8a71ab
added function for bias/var plotting
May 9, 2023
cccdf44
formatted + add datasets bias variance collect
comane May 10, 2023
3d995a7
reverted previous weight commit
May 11, 2023
b3e49c2
modified covmat_diff in closure_results.py
May 13, 2023
ffee474
minor mod
May 13, 2023
4c2634d
changed formatting
comane May 16, 2023
b37823a
modified doc to func
comane May 16, 2023
eea237e
removed plot_Rbv_sensitivity_to_test_datasets introduced by me before…
comane May 16, 2023
90bb97f
removed plot_xq2_with_Rbv introduced by me, but useless
comane May 16, 2023
43f047d
removed seaborn import
comane May 16, 2023
533c4b1
Merge branch 'master' into inconSYStencies
comane May 22, 2023
4d5d473
added sep_mult for consistent L1 data generation]
comane May 22, 2023
ed266ae
removed matplotlib.pyplot import
comane May 26, 2023
2f58693
subplots from plotutils rather than matplotlib.pyplot
comane May 26, 2023
8ba15bb
corrected type1 inconsistency methodology
comane May 26, 2023
50550bf
removed old comment
comane May 26, 2023
e1404e1
added pdf_err + exp covmat
comane May 29, 2023
24c2985
added pdferr only without regularization
comane May 30, 2023
26530f4
added possibility to introduce inconsistency for inter dataset correl…
comane May 30, 2023
bef1cb5
CORR excludes SPECIAL sys
comane May 30, 2023
f1aaed2
pdf_err only covmat regularized with the calcutils.regularize_covmat
comane May 30, 2023
1371de2
pdf_err only covmat regularized with the calcutils.regularize_covmat
comane May 30, 2023
d90201a
added plotting function for impact of inconsistency on covmat trace
May 31, 2023
d9a421b
Merge branch 'inconSYStencies' of https://github.com/NNPDF/nnpdf into…
May 31, 2023
655320a
added to init.py in validphys closure test incnsistent_plots.py and a…
May 31, 2023
1064d35
modified labels
May 31, 2023
a6f0b15
Modified plotting function for inconsistent trace
Jun 8, 2023
e309d4a
removed pdf covmat changes
comane Jun 9, 2023
1 change: 1 addition & 0 deletions validphys2/src/validphys/closuretest/__init__.py
@@ -11,3 +11,4 @@
from validphys.closuretest.multiclosure_pdf_output import *
from validphys.closuretest.multiclosure_preprocessing import *
from validphys.closuretest.multiclosure_pseudodata import *
from validphys.closuretest.inconsistent_plots import *
66 changes: 65 additions & 1 deletion validphys2/src/validphys/closuretest/closure_results.py
@@ -10,6 +10,7 @@

from reportengine import collect
from reportengine.table import table
from reportengine.figure import figure, figuregen
[Review comment, Contributor] Can we move these things to closure_plots.py?


from validphys.calcutils import calc_chi2, bootstrap_values
from validphys.checks import check_pdf_is_montecarlo
@@ -20,7 +21,9 @@
    check_fits_same_filterseed,
    check_fits_underlying_law_match,
)

from validphys import plotutils
from validphys.inconsistent_ct import InconsistentCommonData
from validphys.covmats import dataset_inputs_covmat_from_systematics

BiasData = namedtuple("BiasData", ("bias", "ndata"))

@@ -394,3 +397,64 @@ def fit_underlying_pdfs_summary(fit, fitunderlyinglaw):
def summarise_closure_underlying_pdfs(fits_underlying_pdfs_summary):
    """Collects the underlying pdfs for all fits and concatenates them into a single table"""
    return pd.concat(fits_underlying_pdfs_summary, axis=1)


@table
def covmat_diffs(data, inconsistent_datasets, sys_rescaling_factor):
    """Calculate the percentage trace difference between the consistent and the
    inconsistent covmat. Put the results in a table labelled by the type of
    uncertainty that was rescaled and the dataset in which the inconsistency
    was introduced.

    """

    dataset_input_list = list(data.dsinputs)
    commondata_wc = data.load_commondata_instance()
[Review comment, Contributor] I don't understand this. You overwrite the content of this variable on the line below, so what is the reason for it?

    commondata_wc = [
[Review comment, Contributor] If these are inconsistent, please give the variable a telling name, like inconsistent_commondata_wc.

        InconsistentCommonData(setname=cd.setname, ndata=cd.ndata,
                               commondataproc=cd.commondataproc,
                               nkin=cd.nkin, nsys=cd.nsys,
                               commondata_table = cd.commondata_table,
                               systype_table = cd.systype_table)
        for cd in commondata_wc
    ]
    consistent_covmat = dataset_inputs_covmat_from_systematics(
        commondata_wc,
        dataset_input_list,
        use_weights_in_covmat=False,
        norm_threshold=None,
        _list_of_central_values=None,
        _only_additive=False,
    )

    trace = np.trace(consistent_covmat)

    # Study the impact on the trace of the covariance matrix if uncertainties
    # are rescaled by sys_rescaling_factor. Label by the type of error rescaled:
    # ADD/CORR
    # ADD/UNCORR
    # MULT/CORR
    # MULT/UNCORR
    # use the following entries_dict as input for process_commondata
    entries_dict = {"A/C":[True,False,True,False],"A/U":[True,False,False,True],
                    "M/C":[False,True,True,False],"M/U":[False,True,False,True]}
    impact_dict = {}
    for inconsist_ds in inconsistent_datasets:
        cov_dict = {}
        for entry in entries_dict:

            inp = entries_dict[entry]
            commondata_wc_temp = [cd.process_commondata(inp[0],inp[1],inp[2],inp[3],
                                                        inconsist_ds,sys_rescaling_factor)
                                  for cd in commondata_wc]
            modified_covmat = dataset_inputs_covmat_from_systematics(
                commondata_wc_temp,
                dataset_input_list,
                use_weights_in_covmat=False,
                norm_threshold=None,
                _list_of_central_values=None,
                _only_additive=False,
            )
            cov_dict[entry] = (trace-np.trace(modified_covmat))/trace*100
        impact_dict[inconsist_ds] = cov_dict
    df = pd.DataFrame.from_records(impact_dict)
    return df
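
As a rough illustration of the quantity that covmat_diffs tabulates, the sketch below computes the percentage change in the trace of a toy covariance matrix when one class of systematics is rescaled. It does not use validphys: toy_covmat and trace_change_percent are hypothetical helpers, and the covmat construction (statistical and uncorrelated errors on the diagonal, correlated systematics as outer products) is an assumption made only for this example.

```python
import numpy as np


def toy_covmat(stat, corr_sys, uncorr_sys):
    """Toy covmat (assumption, not the validphys construction).

    stat:       shape (ndata,), statistical errors
    corr_sys:   shape (ndata, ncorr), correlated systematics
    uncorr_sys: shape (ndata, nuncorr), uncorrelated systematics
    """
    stat = np.asarray(stat, dtype=float)
    corr_sys = np.asarray(corr_sys, dtype=float)
    uncorr_sys = np.asarray(uncorr_sys, dtype=float)
    # Uncorrelated pieces only populate the diagonal.
    cov = np.diag(stat**2 + np.sum(uncorr_sys**2, axis=1))
    # Each correlated systematic contributes an outer product.
    cov += corr_sys @ corr_sys.T
    return cov


def trace_change_percent(stat, corr_sys, uncorr_sys, factor, which):
    """Percentage trace difference, (trace - modified_trace) / trace * 100,
    after rescaling the "CORR" or "UNCORR" systematics by `factor`."""
    corr_sys = np.asarray(corr_sys, dtype=float)
    uncorr_sys = np.asarray(uncorr_sys, dtype=float)
    reference = np.trace(toy_covmat(stat, corr_sys, uncorr_sys))
    if which == "CORR":
        modified = toy_covmat(stat, factor * corr_sys, uncorr_sys)
    elif which == "UNCORR":
        modified = toy_covmat(stat, corr_sys, factor * uncorr_sys)
    else:
        raise ValueError(f"unknown systematic class: {which}")
    return (reference - np.trace(modified)) / reference * 100


# Two data points, one correlated and one uncorrelated systematic each.
stat = [1.0, 1.0]
corr_sys = [[1.0], [1.0]]
uncorr_sys = [[1.0], [1.0]]
print(trace_change_percent(stat, corr_sys, uncorr_sys, 0.5, "CORR"))
print(trace_change_percent(stat, corr_sys, uncorr_sys, 0.5, "UNCORR"))
```

Because the trace only sees the diagonal, a correlated and an uncorrelated systematic of the same size shift it by the same amount in this toy setup; the trace alone cannot distinguish the two.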
107 changes: 107 additions & 0 deletions validphys2/src/validphys/closuretest/inconsistent_plots.py
@@ -0,0 +1,107 @@
"""
closuretest/inconsistent_plots.py

Useful plots for analysis of inconsistent closure tests
"""
from reportengine.figure import figure
from reportengine.table import table
from validphys import plotutils
from validphys.inconsistent_ct import InconsistentCommonData
from validphys.covmats import dataset_inputs_covmat_from_systematics
import numpy as np
import pandas as pd

def covmat_trace(dataset_input_list,commondata_wc):
    """Return trace of experimental matrix
    """
    normal_covmat = dataset_inputs_covmat_from_systematics(
        commondata_wc,
        dataset_input_list,
        use_weights_in_covmat=False,
        norm_threshold=None,
        _list_of_central_values=None,
        _only_additive=False,
    )
    return np.trace(normal_covmat)

def mod_covmat_trace(dataset_input_list,commondata_wc, inconsistent_datasets, ADD, MULT, CORR, UNCORR, SPECIAL, sys_rescaling_factor):
    """ Calculate trace of inconsistent covmat rescaled by
    sys_rescaling_factor affecting ADD/MULT & CORR/UNCORR/SPECIAL.
    """
    commondata_wc_temp = [cd.process_commondata(ADD,MULT,CORR,UNCORR,SPECIAL,
                                                inconsistent_datasets,sys_rescaling_factor)
                          for cd in commondata_wc]
    modified_covmat = dataset_inputs_covmat_from_systematics(
        commondata_wc_temp,
        dataset_input_list,
        use_weights_in_covmat=False,
        norm_threshold=None,
        _list_of_central_values=None,
        _only_additive=False,
    )
    # Calculate trace of modified covmat (either for type 1 or 2)
    modified_trace = np.trace(modified_covmat)
    return modified_trace

@figure
def plot_trace_impact(data, inconsistent_datasets, ADD,MULT,CORR,UNCORR,SPECIAL):
    """
    Plot trace ratio for different sys_rescaling_factors. Specify what kind
    of error has been modified and for which datasets.
    The marked points are those at which a type 1 and a type 2 inconsistent fit
    give the same trace ratio.
    """

    # Load all the data here; it makes no sense to reload them each time the function is called
    dataset_input_list = list(data.dsinputs)
    commondata_wc = data.load_commondata_instance()
    commondata_wc_ic = [
        InconsistentCommonData(setname=cd.setname, ndata=cd.ndata,
                               commondataproc=cd.commondataproc,
                               nkin=cd.nkin, nsys=cd.nsys,
                               commondata_table = cd.commondata_table,
                               systype_table = cd.systype_table)
        for cd in commondata_wc
    ]
    normal_trace = covmat_trace(dataset_input_list,commondata_wc)
    lam_factors = np.arange(0,3,0.02)
    ratios = []
    fig, ax = plotutils.subplots()
    points = []
    i = 0
    for lam in lam_factors:
        mod_trace = mod_covmat_trace(dataset_input_list,commondata_wc_ic, inconsistent_datasets,
                                     ADD,MULT,CORR,UNCORR,SPECIAL,
                                     lam)
        if lam < 1: ratios.append(mod_trace/normal_trace*100)
        if lam >= 1: ratios.append(normal_trace/mod_trace*100)
        if i%10 == 0 and lam < 1:
            ax.plot(lam_factors[i],ratios[-1],marker = ".",
                    markersize = 10,
                    label = "lambda type 2: " + str(round(lam_factors[i],3)) + "; ratio: " + str(round(ratios[-1],3)))
            points.append(ratios[-1])
        i += 1
    for point in points:
        a,b = find_intersections(np.asarray(lam_factors), np.asarray(ratios), point)
        ax.plot(a,b,marker = ".",markersize = 10,label = "lambda type1: " + str(round(a[0],3))+"; ratio: " + str(round(b[0],3)))
    type_a_m = ""
    type_c_u_s = ""
    if ADD: type_a_m = " ADD "
    if MULT: type_a_m = type_a_m + " MULT "
    if CORR: type_c_u_s = " CORR "
    if UNCORR: type_c_u_s = type_c_u_s + " UNCORR "
    if SPECIAL: type_c_u_s = type_c_u_s + " SPECIAL"
    ax.plot(lam_factors, ratios, label = "percentage ratios")
    title = "Impact of inconsistency of type " + str(type_a_m) + " and " + str(type_c_u_s) + " in \n" + str(inconsistent_datasets) + " wrt all ds. \n"
    ax.legend()
    ax.set_title(title)
    ax.set_xlabel("rescaling factor")
    ax.set_ylabel("percentage ratio")
    return fig
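
The percentage ratio plotted above switches definition at a rescaling factor of 1: modified over normal trace below 1, normal over modified trace from 1 upwards. A minimal sketch of that rule on a toy trace follows. The helper percentage_ratio and the split of the trace into a fixed part plus a part scaling as lam**2 are assumptions for illustration only, not something taken from process_commondata.

```python
def percentage_ratio(lam, fixed_part, rescaled_part):
    """Toy version of the ratio used in plot_trace_impact.

    Assumes (hypothetically) that the trace splits into a part untouched by
    the rescaling and a part that scales as lam**2.
    """
    normal_trace = fixed_part + rescaled_part
    mod_trace = fixed_part + lam**2 * rescaled_part
    if lam < 1:
        # Shrunk uncertainties: modified trace is the smaller one.
        return mod_trace / normal_trace * 100
    # Inflated uncertainties: normal trace is the smaller one.
    return normal_trace / mod_trace * 100


for lam in (0.0, 0.5, 1.0, 2.0):
    print(lam, percentage_ratio(lam, fixed_part=4.0, rescaled_part=2.0))
```

Under these assumptions the ratio never exceeds 100 and reaches 100 only at lam = 1, so a horizontal line drawn below the peak can cross the curve once on each side; those crossings are what the plotting function pairs up.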

def find_intersections(x, y, C):
    # Contains numpy indexing tricks that can be hard to reproduce
    # in the case where both functions are non-constants
    ii, = np.nonzero((y[1:]-C)*(y[:-1]-C) < 0.) # intersection indices
    x_intersections = x[ii] + (C - y[ii])/(y[1+ii] - y[ii])*(x[1+ii] - x[ii])
    y_intersections = C * np.ones(len(ii))
    return x_intersections, y_intersections
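
To see how find_intersections behaves, the sketch below copies the function from the diff and runs it on a small hand-made curve. The sample arrays are invented for the example. It also shows one edge case worth keeping in mind: because the sign test is a strict inequality, a level that coincides exactly with grid values yields empty arrays.

```python
import numpy as np


def find_intersections(x, y, C):
    # Copied from the diff above: locate sign changes of y - C between
    # neighbouring samples and interpolate linearly within each interval.
    ii, = np.nonzero((y[1:] - C) * (y[:-1] - C) < 0.)
    x_intersections = x[ii] + (C - y[ii]) / (y[1 + ii] - y[ii]) * (x[1 + ii] - x[ii])
    y_intersections = C * np.ones(len(ii))
    return x_intersections, y_intersections


# A curve that rises to 100 and falls again (invented sample values).
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.array([50.0, 75.0, 100.0, 75.0, 50.0])

# The level 60 is crossed once on the way up and once on the way down.
xs, ys = find_intersections(x, y, 60.0)
print(xs, ys)

# The level 75 is hit exactly at two grid points, so every product in the
# sign test is zero and no crossing is reported.
xs_exact, ys_exact = find_intersections(x, y, 75.0)
print(len(xs_exact))
```

Any caller that indexes the result, for example with a[0], may therefore want to check that the returned arrays are non-empty first.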