Commit f63627c

Update discrete systematic README (Ultrasurface use) (#846)
* Update discrete systematic README (Ultrasurface use)

  The previous version described the use of scripts that aren't part of pisa:
  * remove the pisa unrelated descriptions
  * update the use instructions with specific config and pipeline example

* Implement PR review comments
1 parent 1969a31 commit f63627c

1 file changed

+53
-26
lines changed

pisa/stages/discr_sys/README.md

@@ -42,39 +42,66 @@ This will generate N different `.json` for N systematics.
4242
All the info from the fit, including the fit function itself is stored in that file.
4343
Plotting is also available via `-p/--plot` and is HIGHLY recommended to inspect the fit results.
4444

45-
### Ultrasurfaces
46-
47-
This is the novel treatment of detector systematics via likelihood-free inference. It assigns gradients to _every event_ to allow event-by-event re-weighting as a function of detector uncertainties in a way that is fully decoupled from flux and oscillation effects.
4845

49-
Once ready, the results are stored in a `.feather` file containing all events of the nominal MC set and their associated gradients.
46+
### Ultrasurfaces
5047

51-
### Preparation
48+
Treatment of detector systematics via likelihood-free inference. Polynomial coefficients, assigned to every event, allow continuous re-weighting as a function of detector uncertainties in a way that is fully decoupled from flux and oscillation effects. The results are stored in a feather file containing all events of the nominal MC set and their associated polynomial coefficients.
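
To make the re-weighting idea concrete, here is a minimal sketch of how per-event polynomial coefficients could be turned into per-event scale factors. It is not the PISA implementation: the exponential form, the degree-2 feature ordering, and the function names `polynomial_features` and `reweight_factors` are all assumptions made for illustration.

```python
import numpy as np


def polynomial_features(deltas):
    """Linear terms followed by all degree-2 products (i <= j) of the shifts.

    The ordering of the features is hypothetical; a real coefficient file
    defines its own ordering.
    """
    deltas = np.atleast_2d(np.asarray(deltas, dtype=float))
    i, j = np.triu_indices(deltas.shape[1])
    return np.hstack([deltas, deltas[:, i] * deltas[:, j]])


def reweight_factors(coeffs, params, nominal):
    """Per-event scale factors, ASSUMING log(factor) is polynomial in the shifts.

    coeffs  : array of shape (n_events, n_features), one row per event
    params  : current values of the detector systematic parameters
    nominal : values at which the coefficients were fit
    """
    delta = np.asarray(params, dtype=float) - np.asarray(nominal, dtype=float)
    features = polynomial_features(delta)[0]
    return np.exp(np.asarray(coeffs, dtype=float) @ features)


# Three events, two parameters -> 2 linear + 3 quadratic features per event.
coeffs = np.zeros((3, 5))
coeffs[0, 0] = 1.0  # only the first event responds to the first parameter
print(reweight_factors(coeffs, params=[1.1, 1.0], nominal=[1.0, 1.0]))
```

Under this assumed form, every event keeps a factor of exactly 1 when the parameters sit at the point where the coefficients were fit, and the factors change smoothly as the parameters move away from it.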
5249

53-
The scripts producing the gradients are located in `$FRIDGE_DIR/analysis/oscnext_ultrasurfaces`. To produce the gradient feather file, we first need to convert PISA HDF5 files to `.feather` using the `pisa_to_feather.py` script. We need to pass the input file, the output file, and a flag setting the sample (variable names) to be used (either `--verification-sample`, `--flercnn-sample`, `--flercnn-hnl-sample`, `--upgrade-sample`, or no additional flag for the Retro sample).
50+
To use this in a PISA analysis pipeline, you will need to set up an ultrasurface config file looking like this:
5451

52+
```ini
53+
[discr_sys.ultrasurfaces]
54+
55+
calc_mode = events
56+
apply_mode = events
57+
58+
# DOM efficiency
59+
param.dom_eff = 1.0 +/- 0.1
60+
param.dom_eff.fixed = False
61+
param.dom_eff.range = [0.8, 1.2] * units.dimensionless
62+
param.dom_eff.tex = \epsilon_{\rm{DOM}}
63+
64+
# hole ice scattering
65+
param.hole_ice_p0 = +0.101569
66+
param.hole_ice_p0.fixed = False
67+
param.hole_ice_p0.range = [-0.6, 0.5] * units.dimensionless
68+
param.hole_ice_p0.prior = uniform
69+
param.hole_ice_p0.tex = \rm{hole \, ice}, \: p_0
70+
71+
# hole ice forward
72+
param.hole_ice_p1 = -0.049344
73+
param.hole_ice_p1.fixed = False
74+
param.hole_ice_p1.range = [-0.2, 0.2] * units.dimensionless
75+
param.hole_ice_p1.prior = uniform
76+
param.hole_ice_p1.tex = \rm{hole \, ice}, \: p_1
77+
78+
# bulk ice absorption
79+
param.bulk_ice_abs = 1.0
80+
param.bulk_ice_abs.fixed = False
81+
param.bulk_ice_abs.range = [0.85, 1.15] * units.dimensionless
82+
param.bulk_ice_abs.prior = uniform
83+
param.bulk_ice_abs.tex = \rm{ice \, absorption}
84+
85+
# bulk ice scattering
86+
param.bulk_ice_scatter = 1.05
87+
param.bulk_ice_scatter.fixed = False
88+
param.bulk_ice_scatter.range = [0.90, 1.20] * units.dimensionless
89+
param.bulk_ice_scatter.prior = uniform
90+
param.bulk_ice_scatter.tex = \rm{ice \, scattering}
91+
92+
# These nominal points are the nominal points that were used to fit the gradients
93+
# and might not agree with the nominal points of the parameter prior.
94+
nominal_points = {"dom_eff": 1.0, "hole_ice_p0": 0.101569, "hole_ice_p1": -0.049344, "bulk_ice_abs": 1.0, "bulk_ice_scatter": 1.0}
95+
96+
fit_results_file = /path/to/ultrasurface_fits/genie_all_knn_200pc_weight_weighted_aeff_poly_2.feather
5597
```
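
As the comment in the config notes, the `nominal_points` need not agree with the nominal values of the parameters themselves (compare `bulk_ice_scatter`: 1.05 versus 1.0). A typo in one of the keys is an easy mistake, so a quick standalone check can help. The sketch below uses only the Python standard library, not PISA's own config parser; the helper name `compare_nominals` and the requirement that both sets of names match exactly are assumptions for illustration.

```python
import configparser
import json

# Abbreviated copy of the config above (two parameters, illustration only).
EXAMPLE_CONFIG = """
[discr_sys.ultrasurfaces]
param.dom_eff = 1.0 +/- 0.1
param.dom_eff.range = [0.8, 1.2] * units.dimensionless
param.bulk_ice_scatter = 1.05
param.bulk_ice_scatter.range = [0.90, 1.20] * units.dimensionless
nominal_points = {"dom_eff": 1.0, "bulk_ice_scatter": 1.0}
"""


def compare_nominals(config_text, section="discr_sys.ultrasurfaces"):
    """Return {name: (analysis_nominal, fit_nominal)} for values that differ.

    Raises ValueError if the `param.<name>` entries and the keys of
    `nominal_points` do not name the same parameters (an assumed requirement).
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    stage = parser[section]

    fit_nominal = json.loads(stage["nominal_points"])

    analysis_nominal = {}
    for key, value in stage.items():
        # `param.<name>` holds the value; `param.<name>.<attr>` holds attributes.
        if key.startswith("param.") and key.count(".") == 1:
            # Drop an optional "+/- sigma" suffix before converting.
            analysis_nominal[key.split(".", 1)[1]] = float(value.split("+/-")[0])

    if set(analysis_nominal) != set(fit_nominal):
        mismatched = sorted(set(analysis_nominal) ^ set(fit_nominal))
        raise ValueError(f"parameters without a counterpart: {mismatched}")

    return {
        name: (analysis_nominal[name], fit_nominal[name])
        for name in sorted(fit_nominal)
        if analysis_nominal[name] != fit_nominal[name]
    }


print(compare_nominals(EXAMPLE_CONFIG))
```

A non-empty result is not an error; it only lists the parameters whose analysis nominal differs from the point at which the coefficients were fit.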
56-
python pisa_to_feather.py -i /path/to/pisa_hdf5/oscnext_genie_0151.hdf5 -o /path/to/pisa_hdf5/oscnext_genie_0151.feather {"--verification-sample", "--flercnn-sample", ""}
57-
```
58-
After converting all files and setting the appropriate paths in `$FRIDGE_DIR/analysis/oscnext_ultrasurfaces/datasets/data_loading.py`, we produce gradients in two steps.
59-
60-
**First**: Calculate event-wise probabilities with (assuming we `cd`'d into `$FRIDGE_DIR/analysis/oscnext_ultrasurfaces/knn`)
6198

62-
(Note here that this needs to be run with an earlier version of sklearn, due to deprecation of some used functions, e.g. use: `scikit-learn = 1.1.2`)
99+
Here you specify the detector systematic parameters to be varied in the fit, with their nominal values and allowed ranges. Additionally, you have to specify the nominal point at which the ultrasurfaces were fit (`nominal_points`), since this might be different from the nominal point used in your analysis. Finally, you have to point to the file where the polynomial coefficients are stored (`fit_results_file`).
63100

64-
```
65-
python calculate_knn_probs.py --data-sample {"verification", "flercnn", "flercnn_hnl", "retro"} --root-dir /path/to/pisa_feather/ --outfile /path/to/ultrasurface_fits/genie_all_bulkice_pm10pc_knn_200pc.feather --neighbors-per-class 200 --datasets 0000 0001 0002 0003 0004 0100 0101 0102 0103 0104 0105 0106 0107 0109 0151 0500 0501 0502 0503 0504 0505 0506 0507 --jobs 24
66-
```
101+
Your pipeline's order could then look like this:
67102

68-
**Second**: Calculate the gradients that best fit the probabilities with:
69-
70-
```
71-
python calculate_grads.py --input /path/to/ultrasurface_fits/genie_all_bulkice_pm10pc_knn_200pc.feather --output /path/to/ultrasurface_grads_vs/genie_all_bulkice_pm10pc_knn_200pc_poly2.feather --include-systematics dom_eff hole_ice_p0 hole_ice_p1 bulk_ice_abs bulk_ice_scatter --poly-features 2 --jobs 24
103+
```ini
104+
order = data.simple_data_loader, flux.honda_ip, flux.mceq_barr, osc.prob3, xsec.genie_sys, xsec.dis_sys, aeff.aeff, discr_sys.ultrasurfaces, utils.hist
72105
```
73106

74-
### Usage
75-
76-
The gradients are stored in a `.feather` file containing all events of the nominal MC set and their associated gradients. The Ultrasurface PISA stage needs to be pointed to the location of this file. In the unblinding version of this analysis, the file is
77-
78-
```
79-
/path/to/ultrasurface_grads_vs/genie_all_bulkice_pm10pc_knn_200pc_poly2.feather
80-
```
107+
It is important to place the ultrasurface stage **before** the histogramming stage, which differs from how the hypersurfaces stage is placed. With that in place, the pipeline is ready to use.
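
Since the stage order is easy to get wrong, a small check on the `order` string can catch it early. This is a minimal sketch in plain Python, independent of PISA; the helper name `ultrasurfaces_before_hist` is hypothetical.

```python
def ultrasurfaces_before_hist(order):
    """True if `discr_sys.ultrasurfaces` appears before `utils.hist` in `order`.

    `order` is the comma-separated stage list from the pipeline config.
    Raises ValueError if either stage is missing.
    """
    stages = [name.strip() for name in order.split(",")]
    return stages.index("discr_sys.ultrasurfaces") < stages.index("utils.hist")


ORDER = (
    "data.simple_data_loader, flux.honda_ip, flux.mceq_barr, osc.prob3, "
    "xsec.genie_sys, xsec.dis_sys, aeff.aeff, discr_sys.ultrasurfaces, utils.hist"
)
print(ultrasurfaces_before_hist(ORDER))
```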
