Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add a distribution to draw sky locations from a HEALPix map #4848

Open
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

JulienGreuter
Copy link

@JulienGreuter JulienGreuter commented Aug 8, 2024

Add a class HealpixSky to pycbc/pycbc/distributions/sky_location.py that allow one to sample the distribution given by a HEALPix sky map by using the rejection sampling method.

Standard information about the request

This change follows style guidelines (See e.g. PEP8) and includes running black to reformat sky_location.py.

This change adds a new feature to the distributions module, mainly for use in PyGRB.

Motivation

Versions of PyGRB used for past analyses approximated the uncertainly in the GRB sky location with circular patches, or using "IPN boxes". In many cases, the uncertainties actually are not circular at all. This is typically the case for IPN localizations, which can resemble thin rings over large patches of the sky. Nowadays, most GRBs have their sky location uncertainties represented as complete probability densities over the sky, encoded via the HEALPix scheme, and stored in FITS files. This change will allow PyGRB to directly read those FITS files, without making approximations.

Contents

The class HealpixSky samples the distribution given by a HEALPix map by using the rejection sampling method.
To have an practical acceptance rate even with precise localizations, a bounding rectangle in (alpha,delta) is first determined as a preprocessing step, which encloses a given amount of probability (specified by the coverage parameter). In order to do this in a simple way, the map is temporarily rasterized to a resolution specified by rasterization_nside. The choice of this parameter is not very critical, it mainly affects the run time. It is probably possible to modify the algorithm so that this step is not necessary, and the coverage and rasterization_nside can be dropped. This is something to be looked at in a future pull request. Anyway, the present method seems to work, providing 10000 samples in order of tens of seconds max for most IPN maps.

In order to use this distribution with pycbc_create_injections, the corresponding .ini file must contain a section that looks like this:

[prior-ra+dec]
name = healpix_sky
; Path to the skymap file in FITS format. Can be gzipped (.fits.gz).
healpix_file =  /path/to/skymap.fits
; Must be a power of 2, 128 seems to works best in terms of run time.
; Optional parameter, internal default set to 64.
rasterization_nside =  128
; Must be between 0 and 1, 1 meaning that you consider all the space where the data is non zero.
; Optional parameter, internal default set to 0.9999.
coverage = 0.999

Links to any issues or associated PRs

This addresses #4267.

Testing performed

Test on the following GRBs are to be done to check accuracy and speed:

GRB211124A_IPN_map_hpx.fits.gz
GRB220203A_IPN_map_hpx.fits.gz
GRB220209A_IPN_map_hpx.fits.gz
GRB220211A_IPN_map_hpx.fits.gz
GRB220711C_IPN_map_hpx.fits.gz
GRB220905A_IPN_map_hpx.fits.gz
GRB220927A_IPN_map_hpx.fits.gz
GRB221022C_IPN_map_hpx.fits.gz
GRB221124A_IPN_map_hpx.fits.gz
GRB221226A_IPN_map_hpx.fits.gz
GRB230511D_IPN_map_hpx.fits.gz
GRB230518A_IPN_map_hpx.fits.gz
GRB230625B_IPN_map_hpx.fits.gz
GRB230712A_IPN_map_hpx.fits.gz
GRB230712C_IPN_map_hpx.fits.gz
GRB230715D_IPN_map_hpx.fits.gz
GRB231018A_IPN_map_hpx_moc.fits.gz
GRB231224A_IPN_map_hpx_moc.fits.gz
GRB240119A_IPN_map_hpx_moc.fits.gz
GRB240124B_IPN_map_hpx.fits.gz
GRB240125A_IPN_map_hpx_moc.fits.gz
GRB240125B_IPN_map_hpx_moc.fits.gz
GRB240205A_IPN_map_hpx_moc.fits.gz
GRB240209A_IPN_map_hpx_moc.fits.gz
GRB240410A_IPN_map_hpx.fits.gz
GRB240504A_IPN_map_hpx_moc.fits.gz
GRB240514B_IPN_map_hpx_moc.fits.gz
GRB240529A_IPN_map_hpx_moc.fits.gz
GRB240531A_IPN_map_hpx_moc.fits.gz
GRB240603A_IPN_map_hpx_moc.fits.gz
GRB240607A_IPN_map_hpx_moc.fits.gz
GRB240615A_IPN_map_hpx_moc.fits.gz
GRB240628B_IPN_map_hpx_moc.fits.gz
GRB240715A_IPN_map_hpx_moc.fits.gz

Additional notes

This code requires mhealpy 0.3.4 or later. Previous versions have a bug that prevents a function from working.

  • The author of this pull request confirms they will adhere to the code of conduct

@titodalcanton titodalcanton added the PyGRB PyGRB development label Aug 8, 2024
@titodalcanton
Copy link
Contributor

Reporting a summary of the tests that @JulienGreuter did on a variety of GRB IPN skymaps.

Drawing 100000 samples takes between 5 and 300 s depending on the properties of the input map (uniform resolution vs MOC, maximum resolution, morphology). The slowest cases appear to be dominated by the code that determines the bounding rectangle at initialization. We can probably optimize this later if needed.

To test the correctness of the method, we plotted the number of samples that fall in each pixel as a function of the probability contained in that pixel. This should show a linear trend with slope equal to the total number of samples, and a scatter given by the Poisson distribution. Some examples follow.

image

image

image

image

In a couple cases, the result is not that good:

image

image

This happens because the HealpixSky class uses linear interpolation to get the probability density between pixels, while the test plotting code assumes that the probability density is constant across each pixel. It turns out that a few IPN maps are sampled with a relatively coarse resolution, comparable to the angular scale of the variation of the probability density, and this makes the distinction between the two codes important. In most cases, however, the pixels are small enough that this distinction is irrelevant.

@ahnitz
Copy link
Member

ahnitz commented Sep 6, 2024

@JulienGreuter @titodalcanton

Out of curiosity, is there a reason that rejection sampling is used here? Given that healpix format is a discretization of the sky space, one should be able to do a simple inverse cdf draw by treating it as a 1-d space over the pixel id. That would be both faster, and would allow one do say use a healpix map as a prior.

@titodalcanton
Copy link
Contributor

@ahnitz good question. We started with rejection sampling for simplicity, and we found (perhaps surprisingly) that it works sufficiently well for sampling. I am not sure how available Julien will be for further contribution, but I do have an alternative implementation that does as you suggest. It is somewhat simpler (in the sense that it does not take additional parameters apart from the file path) and, for some skymaps, much faster (much slower for others, however). I would like to merge this one first to acknowledge Julien's work, and then I will probably open a new PR with the new algorithm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PyGRB PyGRB development
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

3 participants