Add cross reference: replace \ref with \autoref for the performance section

Naeemkh · Naeemkh · commit 79844ee9c8ce · 2024-03-05T09:01:58.000-05:00
diff --git a/paper/paper.md b/paper/paper.md
@@ -35,7 +35,7 @@ We present the GPCERF R package, which employs a novel Bayesian approach based o
 
 In the GPCERF R package we have introduced a novel Bayesian approach. This method utilizes Gaussian Processes (GPs) as a prior for counterfactual outcome surfaces, offering a flexible way to estimate the CERF with automatic uncertainty quantification. Additionally, it can incorporate prior information about the level of smoothness of the underlying causal ERF through specifically designed covariance functions. Popular R packages for estimating causal ERF, such as CausalGPS [@CausalGPS_R; @wu_2022], ipw [@ipw_paper], npcausal [@Kennedy2017npcausal] and CBPS [@CBPS_R; @Imai_2013; @Fong_2018], are primarily built on frequentist frameworks. To the best of the authors’ knowledge, however, Bayesian nonparametric alternatives are relatively scarce. causaldrf [@causaldrf_R] uses Bayesian Additive Regression Trees (BART) for flexible causal ERF estimation. BCEE [@bcee_R; @Talbot_2015; @Talbot_2022] applies a Bayesian model averaging approach for causal ERF estimation. bkmr [@bkmr_R; @Bobb_2014] employs a kernel-based Bayesian model, which is equivalent to a GP prior, to estimate the effect of multivariate exposure on the outcome of interest. However, since it does not explicitly address confounding in the observational data, the resulting estimate does not have causal interpretation.
 
-While various R packages, like GauPro [@GauPro_2023], mlegp [@mlegp_2022], and GPfit [@GPfit_2019; @GPfit_paper_2015], offer Gaussian process regression capabilities, we chose not to use them. The primary reason is that these packages rely on traditional techniques for hyper parameter tuning, such as sampling from the hyper-parameters’ posterior distributions or maximizing the marginal likelihood function. Our approach, in contrast, aims to achieve optimal covariate balancing. By utilizing the posterior distributions of model parameters, we can automatically assess the uncertainty in our CERF estimates [for further details, see @Ren_2021_bayesian]. Since standard GPs are infamous for their scalability issues—particularly due to operations involving the inversion of covariance matrices—we adopt a nearest-neighbor GP (nnGP) [@Datta_2016] prior to ensure computationally efficient inference of the CERF in large-scale datasets. The \ref{performance} section presents comparisons of the wall clock times between standard GP and nnGP.
+While various R packages, like GauPro [@GauPro_2023], mlegp [@mlegp_2022], and GPfit [@GPfit_2019; @GPfit_paper_2015], offer Gaussian process regression capabilities, we chose not to use them. The primary reason is that these packages rely on traditional techniques for hyper parameter tuning, such as sampling from the hyper-parameters’ posterior distributions or maximizing the marginal likelihood function. Our approach, in contrast, aims to achieve optimal covariate balancing. By utilizing the posterior distributions of model parameters, we can automatically assess the uncertainty in our CERF estimates [for further details, see @Ren_2021_bayesian]. Since standard GPs are infamous for their scalability issues—particularly due to operations involving the inversion of covariance matrices—we adopt a nearest-neighbor GP (nnGP) [@Datta_2016] prior to ensure computationally efficient inference of the CERF in large-scale datasets. The \autoref{performance} section presents comparisons of the wall clock times between standard GP and nnGP.
 
 # Overview