
Unweighting implementation #2083

Draft · wants to merge 8 commits into base: master

Conversation

toonhasenack
Collaborator

Calculation of the unweighting weights in the vp framework. The module can be called with
vp-unweight "path to chi2".csv "number of datapoints"
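For reference, a minimal sketch of the unweighting-weight computation this module implements. The function names here are illustrative, not the PR's actual API; the formula is the standard NNPDF reweighting one, $w_k \propto (\chi^2_k)^{(N-1)/2} e^{-\chi^2_k/2}$, with the effective number of replicas given by the Shannon entropy of the weights:

```python
import numpy as np
from scipy.special import xlogy

def reweight(chi2, n_data):
    # Standard NNPDF reweighting weights, w_k ∝ chi2_k**((N-1)/2) * exp(-chi2_k/2),
    # evaluated in log space for numerical stability.
    log_w = 0.5 * (n_data - 1) * np.log(chi2) - 0.5 * chi2
    log_w -= log_w.max()
    w = np.exp(log_w)
    return w * len(w) / w.sum()  # normalize so that sum(w) == N_rep

def effective_replicas(weights):
    # Shannon-entropy effective number of replicas, N_eff = exp(entropy);
    # equals N_rep for uniform weights and decreases as weights concentrate.
    n_rep = len(weights)
    return float(np.exp(-np.sum(xlogy(weights, weights / n_rep)) / n_rep))
```

Unweighting then amounts to resampling replicas with probabilities proportional to these weights; the sketch above only covers the weight and entropy computation.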

@RoyStegeman
Member

Why is this a CLI command and not just a vp action?

@toonhasenack
Collaborator Author

@RoyStegeman, I followed the suggestion of @scarlehoff to do it this way.

pyproject.toml (Outdated)
@@ -106,7 +107,7 @@ qed = ["fiatlux"]
 nolha = ["pdfflow", "lhapdf-management"]

 [tool.poetry-dynamic-versioning]
-enable = true
+enable = false
@scarlehoff (Member), May 18, 2024

Why did you disable versioning?

(It needs to be re-enabled, but I'd like to know the exact problem that led to disabling it.)

Collaborator Author

I think this happened because I built NNPDF locally. I reverted it.

Member

But in principle you should be able to install with pip install -e . (or without -e) and it should not modify this file.

Collaborator Author

Strange. It did happen in my case...

Member

Is your environment managed by poetry?

from typing import Tuple

import pandas as pd
from scipy.special import xlogy
from tqdm import tqdm
Member

I'm not too fond of tqdm because it usually makes logs and debugging harder to read.
But if you are set on using it, you must add it to the dependencies as well.
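For concreteness, adding it would look something like the following in pyproject.toml; the version constraint here is an assumption for illustration, not taken from the repository:

```toml
[tool.poetry.dependencies]
# illustrative constraint only; the actual pin is up to the maintainers
tqdm = "^4.66"
```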

parser = argparse.ArgumentParser()
parser.add_argument("chi2_name", help="Filename of the chi2 dataset (.csv)")
parser.add_argument("N", type=int, help="Number of experimental datapoints that the chi2 is based on")
args = parser.parse_args()
chi2 = pd.read_csv(args.chi2_name).values
Member

How is the csv generated? This should be part of the script.

Ideally the script should have two steps

  1. Check whether the csv exists and, if it doesn't, generate it

  2. Do the actual computation

In the validphys language these would be two actions, where the second depends on the first. We can make them into proper validphys actions at the end; it can remain a script for the time being.

For the first action you can use a lot of vp stuff (e.g., functions to compute the chi2, or load pdfs).
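The two-step structure suggested above can be sketched as follows; `generate_chi2_csv` is a hypothetical callback standing in for the vp-based chi2 computation, not an existing function:

```python
from pathlib import Path

import pandas as pd

def load_or_compute_chi2(csv_path, generate_chi2_csv):
    """Step 1: create the per-replica chi2 csv if it is missing;
    step 2: load it for the actual unweighting computation."""
    path = Path(csv_path)
    if not path.exists():
        # e.g. a validphys-based action that writes one chi2 value per replica
        generate_chi2_csv(path)
    return pd.read_csv(path).values
```

This keeps the generation and the computation decoupled, so the first step can later be replaced by a proper validphys action without touching the second.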

Collaborator Author

I could not find a better solution for this. The per-replica chi2 is computed in the nnpdf/external/ poljet code, and if I moved the chi2 calculation into validphys, I'm afraid I would have to migrate more functionality as well. I don't think the result would be more elegant, however.

Member

Well, you are already using validphys in that code.

So a first stage can be a simple port. Later on, instead of doing self.l.check_commondata(data_name) or API.dataset_inputs_covmat_from_systematics(**inp), you would have a function that takes results as input; according to the definition of dataset_inputs in the runcard / input, you would get a results object already populated with the dataset, covariance matrix, statistical / systematic errors, etc.


return Nps, entropies, Nopt

def main(chi2, N, store=True):
Member

Please also add some docstrings here explaining what it does.

"""
Initialize the Unweight class.

Args:
Member

We are trying to follow the numpy conventions https://numpydoc.readthedocs.io/en/latest/format.html#parameters at least for new code (and especially for parameters, returns, etc.). It helps consistency.
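As a concrete illustration, the Args: block above would become something like this in numpydoc style; the function name and parameter descriptions are assumptions based on the snippet under review, not the PR's actual code:

```python
def unweight(chi2, N):
    """Compute unweighting weights from per-replica chi2 values.

    Parameters
    ----------
    chi2 : numpy.ndarray
        The chi2 of each Monte Carlo replica.
    N : int
        Number of experimental datapoints the chi2 is based on.

    Returns
    -------
    numpy.ndarray
        The normalized weight of each replica.
    """
```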

@Radonirinaunimi
Member

@toonhasenack Just to echo what has been said above about the re/un-weighting being self-consistent: everything except the generation of the theory predictions should be here, i.e., from the computation of the $\chi^2$ to the final dumping of the PDFs. Ideally, the theory predictions should also be stored in ../NNPDF/theories.

Also agreed on this being vp actions instead of scripts.

If you want, I could take over this PR from this point.

@Radonirinaunimi marked this pull request as draft on May 24, 2024 at 08:26