
Validation of 8GeV Signal #1215

Merged: 10 commits from signal-valid into trunk on Oct 11, 2023
Conversation

tomeichlersmith
Member

I am updating ldmx-sw; here are the details.

What are the issues that this addresses?

  • This applies some tweaks to the dark brem configuration in Biasing/python/target.py so that the generation is efficient and appropriate for 8GeV.
  • Introduces a signal validation sample which uses dark-brem-lib-gen to generate a reference library during the validation sample init procedure.
  • Update the dark brem DQM analyzer to add the extracted information to the event bus and fill histograms for the Validation module.
  • Add a dark_brem module to Validation so that the plots below can be remade in the future with ease.
    • @EinarElen I moved the tick_labels to be the bin centers rather than the bin edges. I applied this update to the hcal and photonuclear plots by removing the trailing empty tick label '' from all the tick label lists in those modules. Please confirm this is correct, or let me know how to fix it if not. (A minimal sketch of the bin-center convention follows this list.)
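For reference, here is a minimal matplotlib sketch of the bin-center convention (not the actual Validation module code; the category names and counts below are made up):

import numpy as np
import matplotlib.pyplot as plt

tick_labels = ['W', 'C', 'H', 'other']   # hypothetical category names
counts = [120, 15, 5, 2]                 # hypothetical bin contents
edges = np.arange(len(tick_labels) + 1)
centers = 0.5 * (edges[:-1] + edges[1:])

fig, ax = plt.subplots()
ax.hist(centers, bins=edges, weights=counts)
# one label per bin: len(centers) == len(tick_labels), so no trailing '' entry is needed
ax.set_xticks(centers)
ax.set_xticklabels(tick_labels)
fig.savefig('tick_label_example.pdf')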

Check List

  • I successfully compiled ldmx-sw with my developments
  • I ran my developments and the results below show that they are successful.
  • N/A: I attached any sub-module related changes to this PR.

Running

Rough outline; I have not confirmed this works out of the box.

# got dark brem libraries from SLAC
# extract them into small CSV libraries for quicker re-running (not necessary)
for dblib in dark-brem-libraries/*; do
  tar -C dark-brem-libraries -xzf ${dblib}
  ldmx g4db-extract-library ${dblib}
  gzip ${dblib}.csv
done
# run the simulation
for dblib in dark-brem-libraries/*.csv.gz; do
  ldmx fire config.py ${dblib} --out-dir valid
done
# make the plot
python3 -m Validation valid --systems dark_brem.kinematics --label '$m_A$ [GeV]' --param mA

The config.py I used is the same as the one in the new signal validation sample, but with some tweaks to have the dark brem library be passed on the command line.

diff config.py .github/validation_samples/signal/config.py

6,23d5
< import os
< import argparse
< 
< parser = argparse.ArgumentParser()
< parser.add_argument('dblib', help='Dark Brem event library to use as reference. The run and dark photon mass are taken from the basename of this library as well.')
< parser.add_argument('--out-dir',help='directory in which to put output file')
< args = parser.parse_args()
< 
< def basename_no_extension(path):
<     # have to apply splitext twice since we have the chance of two extensions '.csv.gz'
<     return os.path.splitext(os.path.splitext(os.path.basename(path))[0])[0]
< 
< os.makedirs(args.out_dir, exist_ok=True)
< 
< parameter_list = basename_no_extension(args.dblib).split('_')
< print(parameter_list)
< mA = float(parameter_list[parameter_list.index('mA')+1])
< 
25c7,15
< mySim = target.dark_brem(mA*1000, args.dblib, 'ldmx-det-v14-8gev')
---
> mySim = target.dark_brem(
>     #A' mass in MeV - set in init.sh to same value in GeV
>     10.0,
>     # library path is uniquely determined by arguments given to `dbgen run` in init.sh
>     #   easiest way to find this path out is by running `. init.sh` locally to see what
>     #   is produced
>     'electron_tungsten_MaxE_8.0_MinE_4.0_RelEStep_0.1_UndecayedAP_mA_0.01_run_1',
>     'ldmx-det-v14-8gev'
> )
35,36c25,26
< p.maxEvents = 10000
< p.run = int(parameter_list[parameter_list.index('run')+1])
---
> p.maxEvents = int(os.environ['LDMX_NUM_EVENTS'])
> p.run = int(os.environ['LDMX_RUN_NUMBER'])
38,39c28,29
< p.histogramFile = f'{args.out_dir}/type_histos_beam_8gev_mA_{mA}_run_{p.run}.root'
< p.outputFiles = [ f'{args.out_dir}/type_events_beam_8gev_mA_{mA}_run_{p.run}.root' ]
---
> p.histogramFile = f'hist.root'
> p.outputFiles = [f'events.root']

Plots

Generally, we see the expected behavior. The A' takes the majority of the energy, the incident particle is close to the beam energy (most of the time), and the dark brem occurs in the tungsten target (most of the time). For the heaviest mass point, the dark brem starts to happen more frequently in the trigger pads, but that is not too worrisome. This may motivate including carbon and hydrogen in the dark brem reference library, but it is not necessary since the recoil electron loses so much energy either way in this high-mass scenario.

You may notice that there is a bin for the dark photon energy above the beam energy. This bin is non-empty because we are trying to replicate the dark brem process from MadGraph without the nucleus. G4DarkBreM chooses to put all of this messiness into the invisible dark photon so that the recoil electron's distributions follow the true process as closely as possible. In the high-mass scenario, this means the dark photon's energy can easily exceed the beam energy, since it also (in some sense) includes the energy that would be given to the nucleus. The two electrons involved (incident and recoil) also have this bin, but (if their energies are calculated correctly¹) they stay within physical bounds and faithfully model the beam electron prior to the dark brem and the recoiling electron after it, respectively.

(Apologies for the legend labels being out of order; they are merely ordered by how they get listed within Python, and I didn't feel like implementing a sorting mechanism within Validation. A rough sketch of what such a mechanism could look like is below.)
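A rough sketch of one such sorting mechanism, using only plain matplotlib calls (the sorted_legend helper and the assumption that each label contains a number are mine, not part of the Validation module):

import re
import matplotlib.pyplot as plt

def sorted_legend(ax):
    """Redraw the legend with entries ordered by the first number found in each label."""
    handles, labels = ax.get_legend_handles_labels()
    if not handles:
        return
    def numeric_key(pair):
        match = re.search(r'[-+]?\d*\.?\d+', pair[1])
        return float(match.group()) if match else float('inf')
    handles, labels = zip(*sorted(zip(handles, labels), key=numeric_key))
    ax.legend(handles, labels)

# e.g. call sorted_legend(ax) right before saving the figure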

[Plots: db_kinematics_aprime_energy, db_kinematics_aprime_pt, db_kinematics_dark_brem_element, db_kinematics_dark_brem_material, db_kinematics_dark_brem_z, db_kinematics_incident_energy, db_kinematics_incident_pt, db_kinematics_recoil_energy, db_kinematics_recoil_pt]

Footnotes

  1. The calculation for the incident energy needs to use the recoil electron's and dark photon's three-momenta to calculate the incident electron's three-momentum, and then calculate its energy using the electron mass. This is necessary since G4DarkBreM chooses to conserve three-momentum when ignoring the nucleus, which therefore "shoves" a lot of "error" into the dark photon's total energy. This calculation is done correctly in the DQM processor in this PR.
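For reference, a minimal numpy sketch of that calculation (illustrative variable names and a GeV unit assumption on my part, not the actual DQM processor code):

import numpy as np

M_ELECTRON = 0.000511  # GeV

def incident_energy(recoil_p3, aprime_p3, m_e=M_ELECTRON):
    """recoil_p3 and aprime_p3 are three-momenta in GeV."""
    # conserve three-momentum between the recoil electron and the dark photon,
    # then rebuild the incident energy from the electron mass
    incident_p3 = np.asarray(recoil_p3) + np.asarray(aprime_p3)
    return float(np.sqrt(incident_p3.dot(incident_p3) + m_e**2))

# e.g. a recoil electron and A' mostly along z should give roughly the beam energy
print(incident_energy([0.01, 0.0, 1.5], [0.02, -0.01, 6.4]))  # ~7.9 GeV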

Commit messages

  • replace old ntuplizer with one that puts data in the event bus: better energy calculation in the old one, but the new one adds the info to the event bus and extracts vertex material information for us
  • use mA=10MeV and use the template simulation from the Biasing.target module
  • use the init.sh script infrastructure to generate the reference library before attempting the simulation; not updating labels for material/elements, but that's how it goes
  • move tick labels to bin centers so the number of labels is equal to the number of bins (and not the number of bin edges); update the hcal and photonuclear modules to conform to this change, removing the trailing empty label present in those tick label lists
  • this does not affect the distribution shapes greatly since the weights are extremely close to exactly 1/B within each sample (a small numerical illustration follows below)
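As a small, self-contained illustration of that last point (made-up numbers, not any of the actual samples): if every event carries a weight essentially equal to 1/B, the weighted and unweighted histograms differ only by an overall scale and not in shape.

import numpy as np

rng = np.random.default_rng(1)
B = 1e5                                        # hypothetical biasing factor
values = rng.exponential(2.0, size=10_000)     # stand-in for any kinematic variable
weights = np.full_like(values, 1.0 / B)        # all weights ~ exactly 1/B

unweighted, edges = np.histogram(values, bins=20)
weighted, _ = np.histogram(values, bins=edges, weights=weights)

# same shape, only scaled by the constant 1/B
assert np.allclose(weighted, unweighted / B)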
@tomeichlersmith
Member Author

The PR Validations are passing except for the new signal validation sample which is failing at the comparison step since there is no gold to compare against.

@tomeichlersmith
Member Author

At the last SW dev meeting, I was requested to investigate the cross section calculation at the new beam energy as well. The good news is that I've already extensively investigated the xsec calculation. The bad news is that it is really complicated!

I have a standing issue in SimCore related to this where I'm putting plots and the code/data I've used to make them. Specifically, the tarball linked there has the data files holding the total xsec as estimated by MadGraph/MadEvent, which I am using as the source of "truth".

#1314

TL;DR: The ratio of the cross section used within our simulation to the cross section estimated by MG/ME is roughly constant over the energy range we care about. The constancy of this ratio gets worse as the dark photon mass grows, but I don't think it has a large enough effect to matter at this point.

Long Answer

In the linked SimCore issue, I compare a "normalized" cross section calculated by the different cross section methods available in G4DB to a "normalized" cross section estimated by taking the mean of several MG/ME runs. The normalization is done to remove overall-scale effects that can be easily handled when calculating the total expected rate at analysis time. Currently, we use the G4DB default xsec method, which uses "Full" for dark photon masses below 25MeV and "Improved" for masses above. Looking at the plots and focusing on our energy range of interest in the insets, these xsec calculations are roughly flat when compared to MG/ME (emphasis on roughly).

While we could spend time optimizing this xsec calculation to more closely align with the MG/ME shape, I don't think that has enough benefit. If we want the xsec calculation to match MG/ME, we should simply import the MG/ME xsec and interpolate it within G4DB rather than trying to tweak/hack a WW approximation. This would require more development, and that hurdle, combined with the fact that the xsec is already fairly close, leads me to conclude we are good (for now). Perhaps in the future it will be worth updating G4DB, or G4DB will be superseded by another generator like Pythia.
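Schematically, the comparison reduces to something like the sketch below. The file names, column layout, and the choice to normalize at 8 GeV are my assumptions for illustration; they are not the actual files or tooling from the linked SimCore issue.

import numpy as np

def normalized(energy, xsec, e_ref=8.0):
    """Divide a xsec curve by its value at a reference energy (linear interpolation)."""
    return xsec / np.interp(e_ref, energy, xsec)

# assumed two-column CSVs: incident electron energy [GeV], total xsec [arbitrary units]
e_g4db, xs_g4db = np.loadtxt('g4db_xsec_mA_0.1.csv', delimiter=',', unpack=True)
e_mgme, xs_mgme = np.loadtxt('mgme_xsec_mA_0.1.csv', delimiter=',', unpack=True)

# ratio of the two normalized curves, evaluated on the G4DB energy grid
ratio = normalized(e_g4db, xs_g4db) / np.interp(
    e_g4db, e_mgme, normalized(e_mgme, xs_mgme))
# a "roughly constant" ratio over the 4-8 GeV range is the flatness discussed above
print(ratio.min(), ratio.max())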

@bryngemark
Contributor

This looks good so far, and I'm particularly happy to see a suite of validation DQM histograms + the corresponding plotting being set up.

Just waiting to see the 4 GeV comparison, and then I think this is good to go. The next step after merging would be to share your config with the LDCS repo, where I will template-ify it and start production.

@EinarElen
Contributor

Looks good to me; I just had a minor comment suggestion and (as usual) a request for a corresponding header file for the source file. I (and, I think, a lot of people) look in the header directory to find out what's available in a project, so if there is only a source file you'll miss that the translation unit exists.

Histogram bin update looks good, was a bit hacky before :)

@tomeichlersmith
Member Author

@EinarElen I moved the declarations into their own header. I cannot see another "minor comment suggestion"; did you "finish your review" so that it gets posted for me?

@tomeichlersmith
Member Author

The 4GeV versions of the same histograms (with only 1k events sampled and plotted as densities instead of overall counts) have similar shapes and a similar dependence on the A' mass. This is as expected, because a 4 -> 8GeV change is not too drastic in the context of this model.

WARNING: the colors are different in this set of plots compared to the 8GeV plots above. This is because I just used my Validation module to re-plot the new directory of files and did not bother trying to do an overlay or side-by-side comparison with matplotlib. I could, but I don't think it is worth it given how close these distributions are.
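For what it's worth, such an overlay would only take a few lines of matplotlib; the sketch below uses placeholder arrays (random numbers, not the actual samples) just to show the density-normalized comparison between a 1k-event and a 10k-event sample.

import numpy as np
import matplotlib.pyplot as plt

# placeholders for the A' energy values from each sample
aprime_energy_4gev = np.random.default_rng(0).uniform(0.0, 4.0, 1_000)
aprime_energy_8gev = np.random.default_rng(1).uniform(0.0, 8.0, 10_000)

fig, ax = plt.subplots()
bins = np.linspace(0.0, 9.0, 46)
# density=True makes the two samples comparable despite very different statistics
ax.hist(aprime_energy_4gev, bins=bins, density=True, histtype='step', label='4 GeV beam')
ax.hist(aprime_energy_8gev, bins=bins, density=True, histtype='step', label='8 GeV beam')
ax.set_xlabel("A' Energy [GeV]")
ax.set_ylabel('Event Density')
ax.legend()
fig.savefig('aprime_energy_overlay.pdf')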

[Plots (4GeV versions): db_kinematics_aprime_energy, db_kinematics_aprime_pt, db_kinematics_dark_brem_element, db_kinematics_dark_brem_material, db_kinematics_dark_brem_z, db_kinematics_incident_energy, db_kinematics_incident_pt, db_kinematics_recoil_energy, db_kinematics_recoil_pt]

explain in a doxygen comment why this function exists, basically copying the rationale written for the produce function as well.
@tomeichlersmith tomeichlersmith merged commit 7047cba into trunk Oct 11, 2023
1 check passed
@tomeichlersmith tomeichlersmith deleted the signal-valid branch October 11, 2023 13:31