@jomatthi commented Aug 6, 2025

This pull request updates the configuration, calibration, and selection for the top tagging scale factor and working point analyses. It extends support to the 2023 campaigns, fixes the jet calibration, adds the jetId selection, and fixes the file system and workflow configuration. The columnflow and cmsdb submodules are also updated.

Campaign and Configuration Expansion

  • Added full support for 2023 preBPix and postBPix campaigns in both scale factor (analysis_sf.py) and working point (analysis_wp.py) configurations, including standard, limited, and medium-limited file sets. This includes un-commenting and enabling previously disabled code, and raising the number of limited files from 1 to 2 so that merging tasks are also tested properly.
  • Updated the add_config logic to recognize 2023 as an implemented year, allowing the new campaigns to be processed without raising errors.
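As a rough illustration of the two points above (not the code in analysis_sf.py or analysis_wp.py), the year check and the file limit could look like the sketch below. All names here (`IMPLEMENTED_YEARS`, `CAMPAIGN_LABELS`, `check_campaign`, `limit_files`) and the postfix strings are assumptions made for the example.

```python
from __future__ import annotations

# Hypothetical sketch; names are illustrative and not taken from the repository.
IMPLEMENTED_YEARS = {2022, 2023}

# Assumed mapping of (year, postfix) to the era labels mentioned in this PR.
CAMPAIGN_LABELS = {
    (2022, ""): "22preEE",
    (2022, "EE"): "22postEE",
    (2023, ""): "23preBPix",
    (2023, "BPix"): "23postBPix",
}


def check_campaign(year: int, postfix: str = "") -> str:
    """Return the era label, or raise for years that are not implemented."""
    if year not in IMPLEMENTED_YEARS:
        raise NotImplementedError(f"year {year} is not implemented")
    try:
        return CAMPAIGN_LABELS[(year, postfix)]
    except KeyError:
        raise ValueError(f"unknown postfix {postfix!r} for year {year}") from None


def limit_files(files: list[str], limit: int | None = 2) -> list[str]:
    """Keep at most `limit` files per dataset; None keeps all files."""
    return list(files) if limit is None else list(files[:limit])
```

With a limit of 2, each dataset yields two outputs, so a downstream merging step has something to merge; with a limit of 1 that step would be trivial.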

Calibration and Jet Handling Improvements

  • Refactored jet calibration in topsf/calibration/default.py to use explicit JEC/JER calibrators (jec_ak4, jer_ak4, etc.) instead of the generic jets_ak4/jets_ak8. The previous implementation only ran one of the jet calibrators because the subclassing did not work as expected within columnflow. In addition, MET phi calibrations are now also included for Run 3 analyses, as the required corrections have become available.
  • Improved raw factor handling in jet cleaning: the code now explicitly computes and sets the raw pt/mass before zeroing the correction factor, with comments explaining the logic. This is not a functional change; it only clarifies the process for future reference.
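The order of operations in the raw factor handling can be sketched as follows. This is a minimal per-jet illustration, not the columnar code in the repository; the `Jet` class and `reset_to_raw` are hypothetical, and the sketch assumes the NanoAOD-style convention in which the raw factor equals 1 - raw_pt / pt.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Jet:
    """Hypothetical single-jet record; the analysis works on arrays instead."""
    pt: float
    mass: float
    raw_factor: float  # assumed convention: 1 - raw_pt / pt


def reset_to_raw(jet: Jet) -> Jet:
    """Return a jet holding raw pt/mass with the correction factor zeroed."""
    # Step 1: compute the raw values while the factor is still available.
    scale = 1.0 - jet.raw_factor
    raw_pt = jet.pt * scale
    raw_mass = jet.mass * scale
    # Step 2: only now zero the factor, so that it stays consistent
    # with the stored (raw) pt and mass.
    return replace(jet, pt=raw_pt, mass=raw_mass, raw_factor=0.0)
```

Zeroing the factor first would lose the information needed to recover the raw values, which is why the ordering matters. Applying the function a second time leaves the jet unchanged.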

Configuration and Workflow Enhancements
The following changes came about because issues with writing to dCache were present in these analyses but not in the mtt analysis. Therefore, the law.cfg files of these analyses were aligned further with that of the mtt analysis, which led to successful writing on the remote file system again (with occasional gfal2 errors).

  • Updated the default dataset and enabled software sharing for HTCondor in law.cfg, and expanded the list of file systems for output, improving compatibility and performance for distributed workflows.
  • Added new file system definitions and cache settings for DESY storage, and updated scheduler host configuration for improved reliability.
  • Added new job file directory templating and improved output directory management for local workflows.
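When aligning two law.cfg files like this, it can help to list the options in which they differ. The helper below is a generic sketch using only Python's standard `configparser`; `load_cfg` and `diff_cfg` are hypothetical names, and real law.cfg files may rely on features of law's own config parser that this plain reader does not handle.

```python
import configparser


def load_cfg(text: str) -> configparser.ConfigParser:
    """Parse INI-style text; interpolation is disabled to keep '%' and '$' literal."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def diff_cfg(a: configparser.ConfigParser, b: configparser.ConfigParser) -> dict:
    """Return {(section, key): (value_a, value_b)} for every differing option."""
    diffs = {}
    for section in sorted(set(a.sections()) | set(b.sections())):
        opts_a = dict(a[section]) if a.has_section(section) else {}
        opts_b = dict(b[section]) if b.has_section(section) else {}
        for key in sorted(set(opts_a) | set(opts_b)):
            if opts_a.get(key) != opts_b.get(key):
                # None marks an option that is missing on that side.
                diffs[(section, key)] = (opts_a.get(key), opts_b.get(key))
    return diffs
```

For files on disk, the text can be obtained with `pathlib.Path(path).read_text()` before calling `load_cfg`.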

Dataset and Subprocess List Updates

  • Enabled additional datasets and subprocesses in the 2022 configuration, removing some previous exclusions to increase coverage.

Dependency Updates

  • Updated submodule references for cmsdb and columnflow to newer commits, ensuring compatibility with the above changes.

Summary
The code to be merged in this PR now includes the setup for deriving cut-based top tagging scale factors and working points for the data-taking eras 22preEE, 22postEE, 23preBPix, and 23postBPix. As far as I'm aware, all calibration and selection steps are in place, allowing for rather quick creation of the histograms needed for fitting. The combine fitting tasks are still not fixed and need to be called with separate scripts.

Next, I'm planning to restructure the configs by splitting them into separate scripts for the different aspects: datasets, selection parameters, corrections, SF, etc. This should make the configs easier to maintain, since each setting would ideally live in one easy-to-find place. These changes are not part of this PR and will go into a separate one.

@jomatthi jomatthi requested a review from dsavoiu August 6, 2025 13:48