DLA-Lya cross-correlation is not correct #87

Open
jfarr03 opened this issue Jul 19, 2019 · 12 comments

@jfarr03
Collaborator

jfarr03 commented Jul 19, 2019

[More details to be added imminently]

When measuring the DLA-Lya cross-correlation, there is an issue with DLA-delta pairs that lie close to the line of sight:

Screenshot 2019-07-17 at 12 17 43

Specifically, this is only present for mu positive:

Screenshot 2019-07-17 at 15 43 46

When plotting xi against rp in the lowest rt bins, there seems to be a "step" of some kind for positive rp:

Screenshot 2019-07-17 at 19 13 37

Interestingly, the "step" is also present when using random DLAs (3rd panel below):

Lya_DLA_cross_contributions_zoom

Evidently there is an issue here, either with the DLAs themselves, or the way in which I am measuring the correlation.

The DLA auto-correlation appears to look approximately correct, though the measurement is noisy:

berkeley_auto_0 2_DLA

@londumas
Contributor

@jfarr03, this is very good. Can you do a version of the last plot for mu positive in one plot and mu negative in the other? If the last plot of your ticket is for |mu| you should update the legend.

@jfarr03
Collaborator Author

jfarr03 commented Jul 19, 2019

@londumas The last plot is for the DLA auto-correlation, so I only measured for rp in (0, 200) Mpc/h and I don't have any negative mu bins.

@londumas
Contributor

Sorry, I see. It looks good and the level of noise looks normal to me. As for the high chi2, it is highly probable that the picca estimator of the variance is not perfect. But it is good enough for a plot in a paper.
Could you add the different plots for the 3D cross HCDxLya, for both mu positive and mu negative?

@jfarr03
Collaborator Author

jfarr03 commented Jul 19, 2019

@londumas no worries! The 3D plots are below. The dotted line shows a "theoretical" prediction, using b_LYA=-0.119, beta_LYA=1.53, b_DLA=2.0, beta_DLA=0.48

"Standard" randoms removal:
berkeley_cross_0 2_DLA

"New" randoms removal
berkeley_cross_0 2_DLA_subbin

And the fit for this second method is as follows:

bias_eta_LYA = -0.2418 +/- 0.0093
beta_LYA = 2.6949 +/- 0.3409
bias_eta_DLA = 1.0000 (fixed)
beta_DLA = 0.4824 (fixed)
ap = 1.0000 (fixed)
at = 1.0000 (fixed)
growth_rate = 0.9649 (fixed)
chi2: 3443.0/(2354-3)

@londumas
Contributor

I think it looks very good.
Let's do a combined fit with the auto-correlation of lya, and leave free (bias_eta_lya, beta_lya, beta_DLA, growth_rate, ap, at). That way you will have a measurement of the bias of DLA for the paper. Very good.

@londumas
Contributor

londumas commented Jul 19, 2019

@jfarr03, if you have the time and the energy, could you compute the 3D cross HCD x QSO? As I showed you, using the SaclayMocks (igmhub/SaclayMocks#14 (comment)), I found a deviation from the expectation along the line of sight. However, it is not a showstopper for the production of the 100 realizations.

@jfarr03
Collaborator Author

jfarr03 commented Aug 21, 2019

Update:
It seems that the deviation from expectation in the DLA-Lya cross-correlation is a feature of some sort rather than a bug. The cartoons below attempt to explain:

Consider a pair of QSOs that are near each other on the sky, one (QSO1) behind the other (QSO2). Some cells in QSO1's spectrum - cell1 for example - will be close to QSO2, and so will be more likely to be in a high-density environment. Thus the value of delta_F in this cell will be "biased" towards low values.

Consider correlating cells in QSO1's spectrum with DLAs in QSO2's. Assume that a rest-frame wavelength cut has been put on the DLAs, so that only DLAs with lambda_r < lambda_r,cut are included in the catalog. The separation between lambda_alpha and lambda_r,cut can be converted into a distance along the line of sight, r_cut (this is redshift dependent).
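As a rough sketch of that conversion (not code from picca or from this thread), the snippet below turns a rest-frame cut into a line-of-sight comoving distance from the quasar. The flat LCDM cosmology with omega_m=0.315 and the function names `hubble_over_h`, `comoving_distance` and `r_cut` are assumptions made for illustration only.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]
LYA = 1215.67        # Lya rest-frame wavelength [Angstrom]


def hubble_over_h(z, omega_m=0.315):
    """H(z)/h in km/s per (Mpc/h) for a flat LCDM cosmology (assumed)."""
    return 100.0 * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def comoving_distance(z, omega_m=0.315, n_steps=2000):
    """Comoving distance to redshift z in Mpc/h (trapezoidal rule on c/H)."""
    dz = z / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * C_KM_S / hubble_over_h(i * dz, omega_m)
    return total * dz


def r_cut(z_qso, lambda_rf_cut, omega_m=0.315):
    """Line-of-sight distance [Mpc/h] between the quasar and an absorber
    sitting exactly at the rest-frame cut in that quasar's spectrum."""
    # Absorber redshift for rest-frame wavelength lambda_rf_cut
    z_abs = (1.0 + z_qso) * lambda_rf_cut / LYA - 1.0
    return comoving_distance(z_qso, omega_m) - comoving_distance(z_abs, omega_m)
```

Calling `r_cut(z_qso, 1200.0)` for a few quasar redshifts shows the redshift dependence directly; the absolute numbers depend on the assumed cosmology.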

If we consider correlating cell1 with DLAs (crosses) in the spectrum of QSO2, there are 3 key regions:

  • Region 1: a correlation is measured. It is negative-biased due to cell1 being negative-biased.
  • Region 2: no correlation is measured as DLAs have been removed by the rest-frame wavelength cut
  • Region 3: no correlation is measured as this is beyond QSO2, so no DLAs are observed

As such, you would expect to measure no signal up to r_p=r_cut, and then a suppressed signal beyond this point.

image

Now imagine another cell that is not close to QSO2 - cell2 for example - and so has no bias in the value of delta_F. Consider correlating it with DLAs in the spectrum of QSO2. This time there are 5 key regions:

  • Region 1: a correlation is measured. It shows no bias as there is no bias in cell2
  • Region 2: a correlation is measured. It shows no bias as there is no bias in cell2
  • Region 3a: a correlation is measured. It shows no bias as there is no bias in cell2
  • Region 3b: no correlation is measured as DLAs have been removed by the rest-frame wavelength cut
  • Region 3c: no correlation is measured as this is beyond QSO2, so no DLAs are observed

image

Thus, imagining summing contributions from both cells in each region:

  • Region 1: correlation is negative biased (cell1 contribution has negative bias, cell2 contribution has no bias)
  • Region 2: correlation has no bias (cell1 has no contribution, cell2 contribution has no bias)
  • Region 3: correlation has no bias (cell1 has no contribution, cell2 contribution has no bias)

and so we would expect to see no bias in the correlation until r_p = r_cut, at which point there will be a (relatively) sudden shift downwards.
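The bookkeeping above can be written as a toy model. This is only a sketch of the argument, not a measurement: `frac_biased` (the fraction of pairs coming from cell1-like cells), `cell_bias` (the offset they carry) and the function name are hypothetical quantities introduced for illustration.

```python
def summed_correlation(r_p, r_cut, xi_unbiased, cell_bias, frac_biased):
    """Toy pair-weighted average of cell1-like and cell2-like contributions.

    cell1-like cells (close to QSO2) only find DLAs at r_p > r_cut and carry
    an offset `cell_bias`; cell2-like cells contribute `xi_unbiased` everywhere.
    """
    if r_p > r_cut:
        # Region 1: both kinds of cell contribute pairs
        biased_part = frac_biased * (xi_unbiased + cell_bias)
        unbiased_part = (1.0 - frac_biased) * xi_unbiased
        return biased_part + unbiased_part
    # Regions 2 and 3: only cell2-like cells contribute pairs
    return xi_unbiased
```

With a negative `cell_bias`, the returned value is flat up to `r_p = r_cut` and then drops by `frac_biased * cell_bias`, which is the "step" described above.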

This is tested in the plot below. In the top left panel, no rest-frame cut is applied to the DLA catalog, and there is a clear shift for r_p>0. In the top right, bottom left and bottom right panels, successively stronger rest-frame cuts are used, so r_cut gradually increases and the onset of the downward shift moves to higher r_p.

image

@londumas
Contributor

@jfarr03, thank you very much for this explanation. What I am confused about is that cell1 and cell2 are already quite far from their hosting quasar: the forest lies in [1040,1200] A and the quasar is at 1215.67 A. Thus the closest cell to the quasar is at 15.67 A; if I remember correctly, this translates into ~16 Mpc/h. Do you mean that the quasarxLYA correlation is still too strong there? If this is the case it means that:

  • this will have an impact on the 3D auto-correlation of Lya
  • this has an important impact on the 1D power spectrum. Has anyone looked at the power spectrum for different ranges of rest-frame wavelength? @Cyeche and @solenechabanier ?

@londumas
Contributor

Here is the stack of the transmission and of delta_transmission along the line of sight, as a function of lambda_RF. It is pretty flat at l<1200 A, but it is possible that a residual still exists and that this is what you measure. This would also explain why the effect is not seen with the cooked mocks: the fit of the continuum removes it.

T_vs_lRF
T_vs_lRF_zoom

@londumas
Contributor

So to conclude, the problem must come from the fact that delta_transmission is defined with respect to <T> as a function of observed wavelength, and not as a function of both rest-frame wavelength and observed wavelength, as in the cooked mocks.
This suggests changing the estimator of delta_transmission from:

dT = T/[ <T>(lObs) ] - 1

to something like:

dT = T/[ <T>(lObs) <T>(lRF)  ] - 1
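A minimal sketch of the two estimators, assuming per-pixel lists of transmission, observed wavelength and rest-frame wavelength. The function names and the binning are hypothetical, and defining <T>(lRF) as the mean of T/<T>(lObs) in rest-frame bins is one possible reading of "something like", not a confirmed implementation.

```python
from collections import defaultdict


def binned_mean(values, coords, bin_width):
    """Mean of `values` in bins of `coords`; returns {bin_index: mean}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, coord in zip(values, coords):
        key = int(coord // bin_width)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def delta_obs_only(trans, l_obs, bin_width=1.0):
    """dT = T / <T>(lObs) - 1."""
    mean_obs = binned_mean(trans, l_obs, bin_width)
    return [t / mean_obs[int(lo // bin_width)] - 1.0
            for t, lo in zip(trans, l_obs)]


def delta_obs_and_rf(trans, l_obs, l_rf, bin_width=1.0):
    """dT = T / [<T>(lObs) <T>(lRF)] - 1, with <T>(lRF) taken as the mean
    residual T/<T>(lObs) in rest-frame wavelength bins (an assumption)."""
    mean_obs = binned_mean(trans, l_obs, bin_width)
    ratio = [t / mean_obs[int(lo // bin_width)]
             for t, lo in zip(trans, l_obs)]
    mean_rf = binned_mean(ratio, l_rf, bin_width)
    return [r / mean_rf[int(lr // bin_width)] - 1.0
            for r, lr in zip(ratio, l_rf)]
```

By construction, the second estimator averages to zero in every rest-frame wavelength bin, so any residual trend of the mean transmission with lambda_RF is removed.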

@londumas
Contributor

This might also fix the fact that we need randoms for the cross using the raw.

@andreufont
Collaborator

> @jfarr03, thank you very much for this explanation. What I am confused about is that cell1 and cell2 are already quite far from their hosting quasar: the forest lies in [1040,1200] A and the quasar is at 1215.67 A.

It's not about the distance to the hosting quasar, it's about the distance to the other quasar, the one that has the DLAs in the spectrum.

> Thus the closest cell to the quasar is at 15.67 A; if I remember correctly, this translates into ~16 Mpc/h.

You are missing a factor (1+z) here, so more like 50-60 Mpc/h.

> Do you mean that the quasarxLYA correlation is still too strong there? If this is the case it means that:
>
> * this will have an impact on the 3D auto-correlation of Lya
>
> * this has an important impact on the 1D power spectrum. Has anyone looked at the power spectrum for different ranges of rest-frame wavelength? @Cyeche and @solenechabanier ?

No, I don't think this is the right interpretation. Happy to discuss this in Paris.
