doc: Add page describing how to build Upper Limits table #57
Conversation
Force-pushed from 832826e to 9cafdec
I won't be able to give this a full review until later in the week, but I don't think we have `rich` in `book/requirements.txt`, and so we need to add at least that there and then relock `book/requirements.lock`.
@matthewfeickert - it seems
Why here and not
Sorry, you're right. I was going from memory, but you do want that to be in
Oh bummer. If you just run
That's because you're apparently using Python 3.8 and the lock file was built assuming Python 3.10+ (again, I should Dockerize this, sorry). I'll fix this now and review hopefully tonight.
Force-pushed from 7624808 to 26b4809
Rebased to pick up PR #58 to get
* Correct some typos and note Table 20
* Change word order to make it clear the models are from the published analysis being described.
* Remove cell output (gets generated at build)
* Add abbreviations being used, add "DOI: " to make it clear what is being referenced.
* Add `level=0.05` kwarg to make it explicit that the 95% CL upper limit is being used.
* Add "ifb" to print statement to make it more explicit (already have it in a comment which is good)
* Capitalize Equation 52, and use arXiv abstract instead of PDF, use "b" for background only as b-only does not render well and it is already assumed if b is by itself that it is background only.
* Added "[fb]" to XSec column. Split out "S95Obs" to "S 95 Obs" as "95Obs" is difficult to read without mentally flipping the "0" to a zero. Can revise this as needed. Also just reduced CLb(S 95 Obs) to CLb to match table column better.
* As not all people acknowledged for their help are on the ATLAS Stats Forum (apologies if I'm wrong!) I thought this read better.
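As context for the `level=0.05` convention mentioned above, here is a minimal self-contained sketch of what a 95% CL CLs upper-limit scan does. This uses a single-bin Gaussian toy model rather than pyhf itself, and all function names and numbers are illustrative, not the notebook's actual code:

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cls(mu, x_obs, s=1.0, b=5.0, sigma=1.0):
    """CLs = CL_{s+b} / CL_b for a Gaussian single-bin counting toy.

    p_sb: compatibility of the observation with the signal-plus-background
    hypothesis at strength mu; cl_b = 1 - p_b for the background-only case.
    """
    p_sb = norm_cdf((x_obs - (mu * s + b)) / sigma)
    cl_b = norm_cdf((x_obs - b) / sigma)
    return p_sb / cl_b


def upper_limit(x_obs, level=0.05, lo=0.0, hi=50.0):
    """Bisect for the signal strength where CLs crosses `level`.

    CLs decreases monotonically in mu for this toy, so bisection applies.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cls(mid, x_obs) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Observation exactly equal to the expected background
mu_up = upper_limit(x_obs=5.0)
```

The bisection simply finds the signal strength at which CLs drops below `level`; pyhf's upper-limit machinery performs the analogous scan over the full probability model.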
Overall I think this is great — many thanks for writing this!
I made some minor language/typo/revision edits in the commit I pushed but I tried to make it clear what they all were in the commit message. Obviously they can all be rolled back if you disagree with them.
I've tried to just break down my review comments by section as trying to make comments in the notebook on GitHub isn't great. :( Happy to chat of course and clarify my thinking (it is after 02:00 when I write this).
I'd also say that you should double check my opinion on everything with @alexander-held.
- Expected Events
References:
Is there a reason that these references are chosen? Is there some HEP reference paper on stats that cites the J. Phys. Chem. A 2001 paper? If not, let's check the PDG to see if this is included there, unless there was something particularly striking about this reference.
- Discovery $p$-values
From Equation 52 in arXiv:1503.07622:
Maybe I need to read this section more carefully, but how does Equation 52 show the following statements? Regardless of how it does, can we make this more explicit here?
$p_{\mu}$ (and $\mathrm{CL}_b$)
$\mathrm{CL}_\mathrm{b}$ provides a measure of compatibility of the observed data with the 95% CL signal strength hypothesis relative to fluctuations of the background
This is mixing multiple ideas of an upper limit signal strength and the definition of
See the paragraph above plot on page two in this document.
Is there another reference that we can use? I know that this one comes up in search results because it is the only bloody thing that isn't firewalled by ATLAS, but I've never been a huge fan of this note and it kinda bugs me that it doesn't have any stable reference. It isn't even on CDS (private to ATLAS) as far as I know, but I would love to be wrong on this!
$p_0$
$p(s=0)$ measures compatibility of the observed data with the background-only (zero signal strength) hypothesis relative to fluctuations of the background.
- Relationship to $\mathrm{CL}_s$
I'm not really crazy about these slides in general, and I feel like, while they do provide context in the lecture he gave, they don't actually help here. So I would personally suggest removing this section, and that we write up something better later.
- Making the table
- It would probably be useful to mention that `DRInt_cuts` should correspond to Signal Region "DR-Int-EWK", and comment on why the results we get here are slightly different from the published table. This is maybe beating the point over the head, but I think it would help (would help 02:00 Matthew!).
Once again though, I think this is excellent educational material and I'm super thankful that you went into a deep dive to write it. 🚀
39.2.5 has a really small blurb and isn't useful for anyone, imo. The 2001 paper is not only the top result when googling for the method, but it's also been linked by the Stats Forum in emails and by @alexander-held as well. I'm not sure there's anything better.
You'd have to read through Equation 52 to see how it sets the value of
This is the official ATLAS/CMS statement here. I didn't change anything because this is agreed upon. Take it up with those collaborations from their decision in like 2012. As in, this is the definition that has been used to make the tables in quite nearly every single SUSY paper published by both collaborations. Which is why it has to be written out more explicitly here, or nobody will figure out how to get those numbers.
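For reference, the agreed-upon quantities being discussed can be written out explicitly; this is my transcription of the standard CLs definitions, so double-check it against arXiv:1503.07622 before relying on it:

```latex
% p_\mu: p-value of the signal-plus-background hypothesis at strength \mu
% p_b:   p-value of the background-only hypothesis
\mathrm{CL}_{s+b} = p_{\mu}, \qquad
\mathrm{CL}_{b} = 1 - p_{b}, \qquad
\mathrm{CL}_{s} = \frac{\mathrm{CL}_{s+b}}{\mathrm{CL}_{b}} = \frac{p_{\mu}}{1 - p_{b}}
```

Under this convention the 95% CL observed upper limit is the signal yield at which $\mathrm{CL}_s = 0.05$.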
Nope.
The caption is the official agreed upon text from the ATLAS Stats Committee + ATLAS SUSY group. That will have to be taken up with them. I can even link you the corresponding twikis that have this phrasing: https://twiki.cern.ch/twiki/bin/view/AtlasProtected/StandardSUSYRefs#Suggested_text_agreed_with_StatF
These are added until something better comes along, which is not going to be covered so trivially in this notebook, which is already complicated enough.
I don't know for sure why they're different. Part of the problem is that the table up there has some additional rounding that comes into play in certain pieces. I already confirmed that the calculations I show match up with the calculations from regular RooFit/RooStats -- so there's other stuff going on. The differences are close enough given the larger uncertainties/variations that I'm not concerned anyone will be confused that it's not exactly the same. Just remember this notebook isn't really anything particularly from my own words, but just combining all the various bits and pieces scattered across various documents into one place, to make sure everything that goes into making that table is actually described. Whether or not the descriptions are correct/factual is up to those documents unfortunately.
I don't understand. This is not an ATLAS document and it is not a summary of ATLAS documents, it is a summary of a procedure in our own words. We're under no constraints to write things that aren't correct because they appear somewhere else, or to have to use language that was written once a decade ago by someone in ATLAS and then never updated.
A few comments on discussion items:

* I think I got the reference for error propagation from either Google or some Stack Overflow thread. A better reference if we were to write a paper is probably some textbook, but I think this one does its job well.
* I don't think the figures from Alex fit in here really well. They would at least need a detailed explanation to connect them to the text. Right now they feel disconnected.
* For the CLs reference: alternatives might be the CLs paper (probably harder to digest). The statistics info note for the Higgs discovery https://cds.cern.ch/record/1379837 also has something on this I believe.
* Add back additional information to table label for CLb
* Add back language of ATLAS Stats Forum
We can drop the images for sure. The problem is more that people in ATLAS and CMS (and outside) will use pyhf to make these tables, but also to reproduce what is in the papers. Right now, that procedure as described in the notebook is exactly what is being done. We could make this an ATLAS-internal only thing instead (but we don't have an ATLAS-internal pyhf documentation -- and this also impacts CMS as they are using the same definitions). Also this language has gotten updated somewhat frequently.
@kratsg I think we're going to have a lot of discussion on these points, but you've also done a lot of work here that would be better served getting merged in and then having us revise it as needed. I'd propose that you apply @alexander-held's comments, basically just
and then we merge this and I open up a GitHub Issue from my original comments where we can discuss this and then make any necessary revisions in a follow up PR. I also think it is fine to drop the images even though you mentioned
as I plan to do exactly this in scikit-hep/pyhf#2274. I want to "get things right" (to my mind) usually in the first pass, but I also realize that I slow down our code velocity quite a bit with this and so I'm trying (bear with me while I slowly work on growing here) to take a few more pages out of Matt Rocklin's views on the purpose and roles of review: Small Scope and Fast Review: The secret to happy devs
I agree. The procedure is fine! Things that I want to change are not the procedure, but the description of what is happening in the procedure. Some of the statements (made by others that you are just reproducing, so not an attack on you) are just not true at all. We don't have any obligation to repeat such things, and I would argue we should actively avoid doing so!
No, I would say that we should actively avoid making anything pyhf related ATLAS internal. There are obvious things that only can be done with ATLAS at the moment (i.e. full probability models) and if we inside of ATLAS want to produce internal resources for the collaboration that use pyhf that's fine (but why make it internal?), but as a pyhf team product I think it is important that everything is always public. So we're in agreement here about wanting to avoid this. 👍
Then it should hopefully be easier to correct internally at some point in the future.
Done, images dropped. I added a quick blurb for the link to scikit-hep/pyhf#2274 and that should cover that portion of the section there. Thanks for the thorough review (admittedly, I still haven't gotten any feedback from the Stats Forum on this notebook).
@kratsg Thanks for this. Can you check if the commit message summary I added to the PR body looks good? If so, go ahead and merge when ready.
This came out of scikit-hep/pyhf#2263.
This describes how to build an upper limits table exactly matching what ATLAS publishes in various papers, particularly for the SUSY standard tables. It builds on an existing published analysis (EWK 2L2J) and its corresponding HEPData entry.
The current rendering can be seen here.