@ochase10
Contributor

I added a cosmo_seed parameter that makes it possible to separate the density realization from the sampling realization. This lets one sample the same initial conditions repeatedly and get different discrete points each time. That way, one can build up a large catalog by adding up smaller catalogs instead of generating the large one all at once.

@rainwoodman
Member

rainwoodman commented Oct 24, 2025 via email

@sbird
Collaborator

sbird commented Oct 26, 2025

Is it just that you want to sample the HOD mocks separately from the initial density field?

@ochase10
Contributor Author

What I am updating here is the code for generating log normal mock catalogs.

These work by first using a random seed to draw a density field that follows a given power spectrum. The seed is needed because many different random fields are consistent with the same power spectrum. Once you have a density field, you go through each grid cell and Poisson sample it to decide how many sources, if any, it contains. This step is, of course, also random.

There are therefore two separate stages of randomness involved in making a mock catalog. First, the initial conditions fix the exact density field; second, that density field is randomly sampled to generate sources.
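The two stages can be illustrated with a minimal NumPy sketch. This is not the nbodykit implementation: the function names, the toy power spectrum, and the `sampling_seed` argument are all assumptions made for illustration only.

```python
import numpy as np

def gaussian_field(n, cosmo_seed, pk=lambda k: 1.0 / (1.0 + k**2)):
    """Stage 1: Gaussian random field on an n^3 grid with a toy power spectrum."""
    rng = np.random.default_rng(cosmo_seed)
    white = rng.standard_normal((n, n, n))
    kx = np.fft.fftfreq(n) * 2 * np.pi
    kz = np.fft.rfftfreq(n) * 2 * np.pi
    k = np.sqrt(kx[:, None, None]**2 + kx[None, :, None]**2 + kz[None, None, :]**2)
    amp = np.sqrt(pk(k))
    amp[0, 0, 0] = 0.0  # remove the mean mode
    # Colour the white noise in Fourier space, then transform back.
    return np.fft.irfftn(np.fft.rfftn(white) * amp, s=(n, n, n))

def lognormal_density(delta_g):
    """Lognormal transform: positive density with mean close to 1."""
    return np.exp(delta_g - delta_g.var() / 2.0)

def poisson_sample(density, nbar_cell, sampling_seed):
    """Stage 2: Poisson-sample source counts per cell from the density."""
    rng = np.random.default_rng(sampling_seed)
    return rng.poisson(nbar_cell * density)

# Same initial conditions, two independent source realizations.
density = lognormal_density(gaussian_field(16, cosmo_seed=1))
counts_a = poisson_sample(density, nbar_cell=2.0, sampling_seed=10)
counts_b = poisson_sample(density, nbar_cell=2.0, sampling_seed=11)
```

Here `cosmo_seed` alone determines the field, while the (hypothetical) `sampling_seed` alone determines which sources are drawn from it.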

The code currently uses the same seed for both of these random processes (making the density field and sampling it). Now suppose I make a mock catalog and realize it was too small. If I use a new seed for the density field (with the same power spectrum), the two mocks will be incompatible at the field level, because the specific locations of the density peaks are uncorrelated between them. Their power spectra would match, but if you combined them and computed a power spectrum, you would not get the right answer.

However, in the current implementation, the only way to reuse the same initial conditions (same density field) for a new mock is to also reuse the same Poisson sampling. In other words, only one random sample of sources is possible from each random density field. So, if I ever want to increase the size of my mock, I have to recreate all the sources I already have. This is an issue not only because of the wasted compute; it also limits the size of the mocks I can make to the density that fits in RAM. If I want anything bigger (more dense), I simply cannot do it, because I will always get the same source catalog from a given set of initial conditions.
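The reason stacking works is that the sum of independent Poisson samples drawn from the same density field is itself a Poisson sample of that field at the summed number density. A rough sketch of the idea, using a toy white-noise lognormal field rather than a real power spectrum (all names here are hypothetical, not nbodykit API):

```python
import numpy as np

def toy_density(n, cosmo_seed):
    """Toy lognormal density (unit-variance white noise), fixed by cosmo_seed."""
    rng = np.random.default_rng(cosmo_seed)
    g = rng.standard_normal((n, n, n))
    return np.exp(g - 0.5)  # mean 1 in expectation

def sample_counts(density, nbar_cell, sampling_seed):
    """Poisson-sample per-cell source counts with an independent seed."""
    rng = np.random.default_rng(sampling_seed)
    return rng.poisson(nbar_cell * density)

# One fixed density field, four small catalogs with different sampling seeds.
density = toy_density(32, cosmo_seed=42)
chunks = [sample_counts(density, nbar_cell=0.5, sampling_seed=s) for s in range(4)]

# Stacking the chunks is statistically equivalent to a single catalog
# sampled at nbar_cell = 4 * 0.5 = 2.0 from the same field.
stacked = np.sum(chunks, axis=0)
```

Each chunk can be generated and written to disk on its own, so peak memory is set by the chunk size rather than by the final catalog.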

My update separates the seeds for these two random processes, so that each set of initial conditions can produce many possible source realizations. If only the standard 'seed' is provided, or no seed is provided at all, the behavior is identical to before and should preserve the functionality of all legacy code (I think). There is simply a new argument that lets me fix the initial conditions (using the random seed for the density field) while leaving the sampling seed free to vary. Or, at least, that is its intention.
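The backward-compatible behavior described above could look something like the following. This is only a sketch of the intended logic; `resolve_seeds` is a hypothetical helper, and the actual code in the PR may be organized differently.

```python
import numpy as np

def resolve_seeds(seed=None, cosmo_seed=None):
    """Return (cosmo_seed, sampling_seed) with legacy-compatible defaults."""
    if seed is None:
        # No seed given: draw one at random, as a legacy caller would expect.
        seed = int(np.random.default_rng().integers(0, 2**31 - 1))
    if cosmo_seed is None:
        # Legacy behavior: one seed drives both the field and the sampling.
        cosmo_seed = seed
    return cosmo_seed, seed

print(resolve_seeds(seed=5))                # legacy call: both seeds are 5
print(resolve_seeds(seed=7, cosmo_seed=5))  # fixed field, new sampling
```

With this kind of default, existing scripts that pass only `seed` would keep producing the same catalogs, while new scripts can hold `cosmo_seed` fixed and vary `seed`.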

@sbird
Collaborator

sbird commented Oct 27, 2025

Thanks, it makes sense.

Can you confirm what tests you did to make sure it works?

Also, could you please add a comment to the documentation for seed and cosmo_seed, summarizing your last reply and explaining why one may want to use different values for the two seeds?

Once those are done, I am happy to merge; the changes are modest. Thanks!

@sbird sbird merged commit c29f379 into bccp:master Nov 7, 2025