Collaboration
diff --git a/docs/authors.html b/docs/authors.html
index 80b1202..a3c3c16 100644
--- a/docs/authors.html
+++ b/docs/authors.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/index.html b/docs/index.html
index 3e71f2b..5223ab4 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -47,6 +47,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/news/index.html b/docs/news/index.html
index 312931f..7eecc30 100644
--- a/docs/news/index.html
+++ b/docs/news/index.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml
index 6f7483e..17ab986 100644
--- a/docs/pkgdown.yml
+++ b/docs/pkgdown.yml
@@ -1,12 +1,13 @@
-pandoc: 3.1.2
+pandoc: 2.19.2
pkgdown: 2.0.7
pkgdown_sha: ~
articles:
+ A-Note-on-Choosing-Hyperparameters: A-Note-on-Choosing-Hyperparameters.html
Developers-Guide: Developers-Guide.html
GPCERF: GPCERF.html
Nearest-neighbor-Gaussian-Processes: Nearest-neighbor-Gaussian-Processes.html
Standard-Gaussian-Processes: Standard-Gaussian-Processes.html
-last_built: 2023-08-08T21:15Z
+last_built: 2023-11-21T17:36Z
urls:
reference: https://NSAPH-Software.github.io/GPCERF/reference
article: https://NSAPH-Software.github.io/GPCERF/articles
diff --git a/docs/reference/GPCERF-package.html b/docs/reference/GPCERF-package.html
index 3a78a1a..f972d2d 100644
--- a/docs/reference/GPCERF-package.html
+++ b/docs/reference/GPCERF-package.html
@@ -34,6 +34,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/autoplot.cerf_gp.html b/docs/reference/autoplot.cerf_gp.html
index 70d963e..f227a5e 100644
--- a/docs/reference/autoplot.cerf_gp.html
+++ b/docs/reference/autoplot.cerf_gp.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/autoplot.cerf_nngp.html b/docs/reference/autoplot.cerf_nngp.html
index 875c1c9..508ffee 100644
--- a/docs/reference/autoplot.cerf_nngp.html
+++ b/docs/reference/autoplot.cerf_nngp.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_deriv_nn.html b/docs/reference/compute_deriv_nn.html
index 35ab402..eb46a26 100644
--- a/docs/reference/compute_deriv_nn.html
+++ b/docs/reference/compute_deriv_nn.html
@@ -32,6 +32,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_deriv_weights_gp.html b/docs/reference/compute_deriv_weights_gp.html
index e40f9ec..8684536 100644
--- a/docs/reference/compute_deriv_weights_gp.html
+++ b/docs/reference/compute_deriv_weights_gp.html
@@ -32,6 +32,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_inverse.html b/docs/reference/compute_inverse.html
index 0937fca..d6f4c08 100644
--- a/docs/reference/compute_inverse.html
+++ b/docs/reference/compute_inverse.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_m_sigma.html b/docs/reference/compute_m_sigma.html
index d3600d2..e946dc1 100644
--- a/docs/reference/compute_m_sigma.html
+++ b/docs/reference/compute_m_sigma.html
@@ -34,6 +34,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_posterior_m_nn.html b/docs/reference/compute_posterior_m_nn.html
index d2a0f88..565f1b0 100644
--- a/docs/reference/compute_posterior_m_nn.html
+++ b/docs/reference/compute_posterior_m_nn.html
@@ -34,6 +34,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_posterior_sd_nn.html b/docs/reference/compute_posterior_sd_nn.html
index febab4a..781afbc 100644
--- a/docs/reference/compute_posterior_sd_nn.html
+++ b/docs/reference/compute_posterior_sd_nn.html
@@ -32,6 +32,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_rl_deriv_gp.html b/docs/reference/compute_rl_deriv_gp.html
index d338a7d..1e22a84 100644
--- a/docs/reference/compute_rl_deriv_gp.html
+++ b/docs/reference/compute_rl_deriv_gp.html
@@ -32,6 +32,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_rl_deriv_nn.html b/docs/reference/compute_rl_deriv_nn.html
index 9c309f8..9685354 100644
--- a/docs/reference/compute_rl_deriv_nn.html
+++ b/docs/reference/compute_rl_deriv_nn.html
@@ -34,6 +34,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_sd_gp.html b/docs/reference/compute_sd_gp.html
index 01feef6..8655167 100644
--- a/docs/reference/compute_sd_gp.html
+++ b/docs/reference/compute_sd_gp.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_w_corr.html b/docs/reference/compute_w_corr.html
index 721285d..0e2e494 100644
--- a/docs/reference/compute_w_corr.html
+++ b/docs/reference/compute_w_corr.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/compute_weight_gp.html b/docs/reference/compute_weight_gp.html
index 8e94cc8..0e6c342 100644
--- a/docs/reference/compute_weight_gp.html
+++ b/docs/reference/compute_weight_gp.html
@@ -32,6 +32,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/estimate_cerf_gp.html b/docs/reference/estimate_cerf_gp.html
index d72f289..479e3ad 100644
--- a/docs/reference/estimate_cerf_gp.html
+++ b/docs/reference/estimate_cerf_gp.html
@@ -34,6 +34,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/estimate_cerf_nngp.html b/docs/reference/estimate_cerf_nngp.html
index 40130a7..b7cb24a 100644
--- a/docs/reference/estimate_cerf_nngp.html
+++ b/docs/reference/estimate_cerf_nngp.html
@@ -36,6 +36,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/estimate_gps.html b/docs/reference/estimate_gps.html
index 6fced35..f6fffa2 100644
--- a/docs/reference/estimate_gps.html
+++ b/docs/reference/estimate_gps.html
@@ -32,6 +32,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/estimate_mean_sd_nn.html b/docs/reference/estimate_mean_sd_nn.html
index d3ef1df..511c588 100644
--- a/docs/reference/estimate_mean_sd_nn.html
+++ b/docs/reference/estimate_mean_sd_nn.html
@@ -32,6 +32,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/estimate_noise_gp.html b/docs/reference/estimate_noise_gp.html
index 2e2965e..b016908 100644
--- a/docs/reference/estimate_noise_gp.html
+++ b/docs/reference/estimate_noise_gp.html
@@ -34,6 +34,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/estimate_noise_nn.html b/docs/reference/estimate_noise_nn.html
index 13f4e05..c2807ad 100644
--- a/docs/reference/estimate_noise_nn.html
+++ b/docs/reference/estimate_noise_nn.html
@@ -32,6 +32,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/find_optimal_nn.html b/docs/reference/find_optimal_nn.html
index e10305d..f20e6b3 100644
--- a/docs/reference/find_optimal_nn.html
+++ b/docs/reference/find_optimal_nn.html
@@ -32,6 +32,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/generate_synthetic_data.html b/docs/reference/generate_synthetic_data.html
index da353dc..5c79ba2 100644
--- a/docs/reference/generate_synthetic_data.html
+++ b/docs/reference/generate_synthetic_data.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/get_logger.html b/docs/reference/get_logger.html
index f19ba2f..9038e9a 100644
--- a/docs/reference/get_logger.html
+++ b/docs/reference/get_logger.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/index.html b/docs/reference/index.html
index afdd89d..3cc8042 100644
--- a/docs/reference/index.html
+++ b/docs/reference/index.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/log_system_info.html b/docs/reference/log_system_info.html
index 03be160..f540e94 100644
--- a/docs/reference/log_system_info.html
+++ b/docs/reference/log_system_info.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/plot.cerf_gp.html b/docs/reference/plot.cerf_gp.html
index 501ad49..76f988c 100644
--- a/docs/reference/plot.cerf_gp.html
+++ b/docs/reference/plot.cerf_gp.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/plot.cerf_nngp.html b/docs/reference/plot.cerf_nngp.html
index 217a8c5..ca867cf 100644
--- a/docs/reference/plot.cerf_nngp.html
+++ b/docs/reference/plot.cerf_nngp.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/print.cerf_gp.html b/docs/reference/print.cerf_gp.html
index a6cf7d6..85488dd 100644
--- a/docs/reference/print.cerf_gp.html
+++ b/docs/reference/print.cerf_gp.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/print.cerf_nngp.html b/docs/reference/print.cerf_nngp.html
index a0d43bd..3f93f80 100644
--- a/docs/reference/print.cerf_nngp.html
+++ b/docs/reference/print.cerf_nngp.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/set_logger.html b/docs/reference/set_logger.html
index 19997c5..0aeac62 100644
--- a/docs/reference/set_logger.html
+++ b/docs/reference/set_logger.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/summary.cerf_gp.html b/docs/reference/summary.cerf_gp.html
index df85ebe..267356a 100644
--- a/docs/reference/summary.cerf_gp.html
+++ b/docs/reference/summary.cerf_gp.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/reference/summary.cerf_nngp.html b/docs/reference/summary.cerf_nngp.html
index 8669548..3c94e9b 100644
--- a/docs/reference/summary.cerf_nngp.html
+++ b/docs/reference/summary.cerf_nngp.html
@@ -30,6 +30,7 @@
Standard Gaussian Processes
Nearest-neighbor Gaussian Processes
+
A Note on Choosing Hyperparameters
Developers Guide
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 2024b07..ee24031 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -9,6 +9,9 @@
https://NSAPH-Software.github.io/GPCERF/LICENSE.html
+
+ https://NSAPH-Software.github.io/GPCERF/articles/A-Note-on-Choosing-Hyperparameters.html
+
https://NSAPH-Software.github.io/GPCERF/articles/Developers-Guide.html
diff --git a/paper/paper.bib b/paper/paper.bib
index be6e32d..367ec25 100644
--- a/paper/paper.bib
+++ b/paper/paper.bib
@@ -7,12 +7,13 @@ @Manual{R_2022
url = {https://www.R-project.org/},
}
-@Manual{CausalGPS_R,
- title = {CausalGPS: Matching on Generalized Propensity Scores with Continuous Exposures},
- author = {Naeem Khoshnevis and Xiao Wu and Danielle Braun},
- note = {R package version 0.3.0},
- year = {2023},
- url = {https://CRAN.R-project.org/package=CausalGPS},
+@misc{CausalGPS_R,
+ title={CausalGPS: An {R} Package for Causal Inference With Continuous Exposures},
+ author={Naeem Khoshnevis and Xiao Wu and Danielle Braun},
+ year={2023},
+ eprint={2310.00561},
+ archivePrefix={arXiv},
+ primaryClass={stat.CO}
}
@Article{MatchIt_R,
@@ -127,3 +128,27 @@ @Article{wu_2020
publisher = {American Association for the Advancement of Science},
doi={10.1126/sciadv.aba5692},
}
+
+@Manual{GauPro_2023,
+ title = {GauPro: Gaussian Process Fitting},
+ author = {Collin Erickson},
+ year = {2023},
+ note = {R package version 0.2.11},
+ url = {https://CRAN.R-project.org/package=GauPro},
+}
+
+@Manual{mlegp_2022,
+ title = {mlegp: Maximum Likelihood Estimates of Gaussian Processes},
+ author = {Garrett M. Dancik},
+ year = {2022},
+ note = {R package version 3.1.9},
+ url = {https://CRAN.R-project.org/package=mlegp},
+}
+
+@Manual{GPfit_2019,
+ title = {GPfit: Gaussian Processes Modeling},
+ author = {Blake MacDonald and Hugh Chipman and Pritam Ranjan},
+ year = {2019},
+ note = {R package version 1.0-8},
+ url = {https://CRAN.R-project.org/package=GPfit},
+}
diff --git a/paper/paper.md b/paper/paper.md
index b94c912..b4c2395 100644
--- a/paper/paper.md
+++ b/paper/paper.md
@@ -33,7 +33,7 @@ We present the GPCERF R package, which employs a novel Bayesian approach based o
# Statement of need
-Existing R packages for estimating causal exposure-response functions of continuous exposures usually requires resampling approaches, such as boostrap, to obtain uncertainty of the estimates [e.g., @CausalGPS_R]. However, when the number of observations is large, resampling based algorithms can be computationally expensive. To address this gap, we have implemented a novel Bayesian approach that uses a Gaussian Processes (GPs) prior for the counterfactual outcome surfaces to enable flexible estimation of the CERF. By leveraging the posterior distributions of the model parameters, we can automatically quantify the uncertainty of the estimated CERF [for more details see @Ren_2021_bayesian]. Since standard GPs are notorious for their lack of scalability due to the operations involving inversion of the covariance matrices, we consider a nearest-neighbour GP (nnGP) prior to achieve computationally efficient inference of CERF when dealing with large-scale datasets.
+Existing R packages for estimating causal exposure-response functions with continuous exposures typically require resampling approaches, such as bootstrap, to determine the uncertainty of the estimates [e.g., @CausalGPS_R]. However, these resampling-based algorithms can become computationally burdensome when handling large datasets. To bridge this gap, we have developed a unique Bayesian methodology that employs a Gaussian Processes (GPs) prior for counterfactual outcome surfaces, thereby enabling more flexible estimation of the CERF. While various R packages, like GauPro [@GauPro_2023], mlegp [@mlegp_2022], and GPfit [@GPfit_2019], offer Gaussian process regression capabilities, we chose not to use them. The primary reason is that these packages rely on traditional techniques for hyper-parameter tuning, such as sampling from the hyper-parameters' posterior distributions or maximizing the marginal likelihood function. Our approach, in contrast, aims to achieve optimal covariate balancing. By utilizing the posterior distributions of model parameters, we can automatically assess the uncertainty in our CERF estimates [for further details, see @Ren_2021_bayesian]. Since standard GPs are infamous for their scalability issues—particularly due to operations involving the inversion of covariance matrices—we adopt a nearest-neighbor GP (nnGP) prior to ensure computationally efficient inference of the CERF in large-scale datasets.
# Overview
@@ -49,7 +49,12 @@ where $h : [0, \infty) \rightarrow [0, 1]$ is a non-increasing function; and $\a
The primary goal in GPCERF is to find appropriate values for the hyper-parameters. In the context of causal inference, ''appropriate'' values of the hyper-parameters are those that make the estimator of CERF as if it is generated from a study with randomized design. To be more concrete, note that the GP estimates $R(w)$ by creating a pseudo-population that is a weighted version of the original dataset [see more details in @Ren_2021_bayesian]. The weight for each sample in the original dataset is a function of the hyperparameters. By tuning the hyperparameters, we can minimize the sample correlations between $W$ and each component of $C$ in this pseudo-population, rendering the pseudo-population to be more balanced on these covariates $C$. In practice, we minimize the covariate balance, which is a summary of the sample correlations between $W$ and each of $C$ to tune our hyper-parameters. Covariate balance is computed by assessing the correlation between $W$ and $C$ in the pseudo-population using the _wCorr_ R package [@wCorr_R].
-Both GP and nnGP approaches involve two primary steps - tuning and estimation. GPCERF conducts a grid search on the range of provided $\alpha$, $\beta$, and $\gamma/\sigma$. The kernel function is also selected if the user provides multiple candidates. During the tuning step, covariate balance is minimized by choosing the optimal hyperparameters. In the estimation step, the optimal parameters are used to estimate the posterior mean and standard deviation of $R(w)$ at a set of exposure values of interest. The outcome data is not used during the tuning step, separating the design and analysis phases. @Ren_2021_bayesian discusses the implemented approaches in detail. In the following we provide an example for running the package for each implemented models.
+Both GP and nnGP approaches involve two primary steps - tuning and estimation. GPCERF conducts a grid search on the range of provided $\alpha$, $\beta$, and $\gamma/\sigma$. The kernel function is also selected if the user provides multiple candidates. During the tuning step, covariate balance is minimized by choosing the optimal hyperparameters.
+
+The scaling parameters $\alpha$ and $\beta$ determine how much information the estimation will draw from the two coordinates: GPS score ($s(W, X)$) and exposure level ($W$). A large scaling parameter suggests that varying the corresponding coordinate is only associated with a minor change in the outcome; that is, this coordinate does not contribute much to the variation of the outcome. The signal-to-noise ratio parameter $\gamma/\sigma$ encodes how different the observed data is from pure noise. A large $\gamma/\sigma$ indicates strong associations between the outcome and the coordinates of the GP, while a small $\gamma/\sigma$ suggests the observed outcome is likely to be drawn from a random process that is independent of the coordinates. In the setting of observational studies and under the no unobserved confounding assumption, for which GPCERF is specifically designed, both the exposure level and the GPS score encode important information for the estimation of the CERF. As a result, the ranges of their scaling parameters should be comparable, and the covariate balance will determine which coordinate is more important (smaller scaling factor). The ranges should also cover both ends of the importance, from extremely important to nearly irrelevant. We achieve this by considering the range on the $\log_{10}$ scale with equally spaced candidate values. The range of $\gamma/\sigma$ follows the same strategy when the prior belief about the strength of the causal effect of the exposure is weak.
+
+In the estimation step, the optimal parameters are used to estimate the posterior mean and standard deviation of $R(w)$ at a set of exposure values of interest. The outcome data is not used during the tuning step, separating the design and analysis phases. @Ren_2021_bayesian discusses the implemented approaches in detail. In the following, we provide an example of running the package for each implemented model.
+
## Example 1: Standard GP models
diff --git a/vignettes/A-Note-on-Choosing-Hyperparameters.Rmd b/vignettes/A-Note-on-Choosing-Hyperparameters.Rmd
new file mode 100644
index 0000000..2c4ac7f
--- /dev/null
+++ b/vignettes/A-Note-on-Choosing-Hyperparameters.Rmd
@@ -0,0 +1,31 @@
+---
+title: "A Note on Choosing Hyperparameters"
+output: rmarkdown::html_vignette
+vignette: >
+ %\VignetteIndexEntry{A-Note-on-Choosing-Hyperparameters}
+ %\VignetteEngine{knitr::rmarkdown}
+ %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+ collapse = TRUE,
+ comment = "#>"
+)
+```
+
+
+This document outlines guidelines for choosing a range of hyperparameter values. We first summarize the interpretation of the three hyper-parameters and the impact of their values on the estimation to motivate the choices of hyper-parameter grid we propose. The scaling parameters $\alpha$ and $\beta$ determine how much information the estimation will draw from the two coordinates: GPS score and exposure level. A large scaling parameter suggests that varying the corresponding coordinate is only associated with a small change in the outcome; that is, this coordinate does not contribute much to the variation of the outcome. The signal-to-noise ratio parameter $\gamma/\sigma$ (`g_sigma` in the code) encodes how different the observed data is from pure noise. A large $\gamma/\sigma$ indicates strong associations between the outcome and the coordinates of the GP, while a small $\gamma/\sigma$ suggests the observed outcome is likely to be drawn from a random process that is independent of the coordinates.
+
+In the setting of observational studies and under the no unobserved confounding assumption (note that this is the specific setting for which GPCERF is expressly designed), both the exposure level and the GPS score encode important information for the estimation of CERF; as a result, the range of their scaling parameters should be comparable, and the covariate balance will determine which coordinate is more important. The range should also cover both ends of the importance, from extremely important to nearly irrelevant. We choose to achieve this by considering the range on the log-10 scale with equally spaced candidate values. The specification of a range of $\gamma/\sigma$ also follows the same rationale when the prior belief about the strength of the causal effect of the exposure is weak. Here is an example grid that covers both ends.
+
+```r
+params_lst <- list(alpha = 10 ^ seq(-2, 2, length.out = 10),
+ beta = 10 ^ seq(-2, 2, length.out = 10),
+ g_sigma = c(0.1, 1, 10),
+ tune_app = "all")
+```
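+
+To get a feel for the size and spacing of this grid, the following base-R sketch expands it into every combination of candidate values. It assumes that `tune_app = "all"` means every combination is evaluated; the helper names `log_steps` and `grid` are illustrative and not part of GPCERF.
+
+```r
+# Candidate values copied from the example grid above.
+alpha   <- 10 ^ seq(-2, 2, length.out = 10)
+beta    <- 10 ^ seq(-2, 2, length.out = 10)
+g_sigma <- c(0.1, 1, 10)
+
+# Equal spacing on the log-10 scale: consecutive candidates differ by
+# a constant step of 4 / 9 in log10 units.
+log_steps <- diff(log10(alpha))
+
+# Full Cartesian product of the candidate values (illustration only).
+grid <- expand.grid(alpha = alpha, beta = beta, g_sigma = g_sigma)
+
+# 10 * 10 * 3 = 300 candidate combinations.
+nrow(grid)
+```
+
+Because the number of combinations is the product of the three vector lengths, widening any one range multiplies the tuning cost accordingly.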
+
+
+In some instances, the observed data could be derived from randomized trials in which the exposure is entirely independent of any existing covariates. In this case, the estimated GPS score is, in fact, a noisy version of the exposure level (since the mean of the exposure level is not a function of the covariates), and it does not provide any additional information beyond the exposure itself. In this case, we have a strong prior belief that the importance of the GPS score is low, and thus, a large value of $\alpha$ can be specified without any additional tuning. The range of $\beta$ should focus on smaller values to capture the information carried by the exposure. The specification of $\gamma/\sigma$ should rely on some qualitative characterization, such as simple visualization, of the relationship between the outcome and the exposure. The reason is that under randomized design, the CERF can be estimated by regressing the outcome on the exposure directly with a flexible model. If the scatterplot of the outcome vs. the exposure exhibits a clear trend, we should specify a large $\gamma/\sigma$ to reflect the prior belief of a strong causal effect of the exposure.
+In reality, we may not know whether the study is a randomized trial. We can then compute the absolute correlations between the exposure and each of the covariates to detect potential deviation from randomization. When all the absolute correlations are smaller than a pre-specified threshold, for instance, 0.05, we can employ the specification for a hyper-parameter grid as described above for randomized trials.
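+
+The correlation check described above can be sketched in base R as follows. This is a minimal illustration on simulated data, not a GPCERF function: the helpers `max_abs_cor()` and `looks_randomized()`, the simulated variables, and the grid values in `params_randomized` are all assumptions made for this example.
+
+```r
+set.seed(1)
+n <- 10000
+
+# Simulated covariates (hypothetical data for illustration only).
+covariates <- data.frame(c1 = rnorm(n),
+                         c2 = rnorm(n),
+                         c3 = rbinom(n, 1, 0.5))
+
+# Exposure independent of the covariates (randomized design) and
+# exposure whose mean depends on c1 (observational design).
+w_randomized <- rnorm(n)
+w_confounded <- rnorm(n, mean = 0.8 * covariates$c1)
+
+# Largest absolute sample correlation between exposure and any covariate.
+max_abs_cor <- function(w, c_df) {
+  max(abs(vapply(c_df, function(x) cor(w, x), numeric(1))))
+}
+
+# TRUE when every absolute correlation is below the threshold.
+looks_randomized <- function(w, c_df, threshold = 0.05) {
+  max_abs_cor(w, c_df) < threshold
+}
+
+looks_randomized(w_randomized, covariates)
+looks_randomized(w_confounded, covariates)
+
+# Illustrative grid for the randomized case: a fixed large alpha and
+# smaller beta values. These numbers are assumptions, not package defaults;
+# g_sigma should be chosen after inspecting the outcome vs. exposure plot.
+params_randomized <- list(alpha = 100,
+                          beta = 10 ^ seq(-2, 0, length.out = 10),
+                          g_sigma = 10,
+                          tune_app = "all")
+```
+
+Note that in small samples the absolute correlations can exceed 0.05 by chance even under randomization, so the threshold should be read together with the sample size.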