Merge pull request #84 from NSAPH-Software/JOSS
Joss
Naeemkh authored Mar 1, 2024
2 parents 7456a22 + 4de488f commit de0747c
Showing 44 changed files with 932 additions and 677 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/R-CMD-check.yaml
@@ -23,7 +23,7 @@ jobs:
matrix:
config:
- {os: windows-latest, r: 'release'}
- #- {os: macOS-latest, r: 'release'} #deactivate until snprintf issue resolves.
+ - {os: macOS-latest, r: 'release'}
- {os: ubuntu-20.04, r: 'release', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}
#- {os: ubuntu-20.04, r: 'devel', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}

5 changes: 5 additions & 0 deletions NEWS.md
@@ -5,6 +5,11 @@
- `estimate_cerf_nngp` takes `outcome_col`, `treatment_col`, and `covariates_col` names as inputs.
- `estimate_cerf_gp` takes `outcome_col`, `treatment_col`, and `covariates_col` names as inputs.

+ ## Added
+
+ - `estimate_cerf_gp` and `estimate_cerf_nngp` have notes on selecting `w`.
+
+

# GPCERF 0.2.2 (2024-02-16)

18 changes: 15 additions & 3 deletions R/estimate_cerf_gp.R
@@ -7,7 +7,8 @@
#' for the provided set of hyperparameters.
#'
#' @param data A data.frame of observation data.
- #' @param w A vector of exposure level to compute CERF.
+ #' @param w A vector of exposure level to compute CERF (please also see the
+ #' notes).
#' @param gps_m An S3 gps object including:
#' gps: A data.frame of GPS vectors.
#' - Column 1: GPS
@@ -31,6 +32,18 @@
#' used by internal packages.
#' @param kernel_fn A kernel function. A default value is a Gaussian Kernel.
#'
+ #' @note
+ #' Please note that `w` is a vector representing a grid of exposure levels at
+ #' which the CERF is to be estimated. This grid can include both observed and
+ #' hypothetical values of the exposure variable. The purpose of defining this
+ #' grid is to provide a structured set of points across the exposure spectrum
+ #' for estimating the CERF. This approach is essential in nonparametric models
+ #' like Gaussian Processes (GPs), where the CERF is evaluated at specific points
+ #' to understand the relationship between the exposure and outcome variables
+ #' across a continuum. It facilitates a comprehensive analysis by allowing
+ #' practitioners to examine the effect of varying exposure levels, including
+ #' those not directly observed in the dataset.
+ #'
#' @return
#' A cerf_gp object that includes the following values:
#' - w, the vector of exposure levels.
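
The note added above describes `w` as a grid of exposure levels that may include values never observed in the data. A minimal sketch of one way to build such a grid in base R follows. The simulated `treat` vector, the 5%/95% trimming, and the grid size of 25 are all assumptions made for illustration; none of them are taken from the package.

```r
# Hypothetical observed exposure values (simulated for illustration only).
set.seed(1)
treat <- runif(200, min = 2, max = 18)

# Trim the extremes so the grid stays where data are reasonably dense.
# The 5%/95% cutoffs are an arbitrary choice for this sketch.
q <- quantile(treat, probs = c(0.05, 0.95), names = FALSE)

# Evenly spaced grid of exposure levels; it may contain values that were
# never observed in `treat`.
w <- seq(q[1], q[2], length.out = 25)

length(w)
range(w)
```

A vector built this way would then be supplied as the `w` argument of `estimate_cerf_gp` or `estimate_cerf_nngp`. A denser grid gives a smoother picture of the curve at the cost of more evaluation points.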
@@ -221,8 +234,6 @@ estimate_cerf_gp <- function(data, w, gps_m, params,
})
}

- logger::log_debug("Number of generated tuning results: {length(tune_res)}")
-
# Tuning results include:
# cb: covariate balance for each confounder. This is the average of all
# all covariate balance for each requested exposure values.
@@ -235,6 +246,7 @@ estimate_cerf_gp <- function(data, w, gps_m, params,
# Select the combination of hyperparameters that provides the lowest
# covariate balance ----------------------------------------------------------
if (nrow(tune_params_subset) > 1) {
+ logger::log_debug("Number of generated tuning results: {length(tune_res)}")
opt_idx <- order(sapply(tune_res, function(x) {mean(x$cb)}))[1]
} else {
opt_idx <- 1
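
The selection step in this hunk picks the hyperparameter combination with the lowest mean covariate balance. The sketch below reproduces that logic on mock data. The `tune_res` values are invented, and only the `cb` element named in the comments above is assumed to exist in each result.

```r
# Mock tuning results: one list per hyperparameter combination, each holding
# `cb`, a vector of per-covariate balance values (numbers invented here).
tune_res <- list(
  list(cb = c(0.20, 0.25, 0.30)),
  list(cb = c(0.07, 0.08, 0.06)),
  list(cb = c(0.10, 0.05, 0.12))
)

# Mean covariate balance for each combination.
mean_cb <- sapply(tune_res, function(x) mean(x$cb))

# Same selection as in the diff: index of the smallest mean balance.
opt_idx <- order(mean_cb)[1]

# which.min() returns the same index (the first minimum) without a full sort.
stopifnot(opt_idx == which.min(mean_cb))
opt_idx
```

With a single combination there is nothing to compare, which is consistent with the diff moving the debug message inside the `nrow(tune_params_subset) > 1` branch.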
15 changes: 14 additions & 1 deletion R/estimate_cerf_nngp.R
@@ -8,7 +8,8 @@
#' match (the lowest covariate balance) for the provided set of hyperparameters.
#'
#' @param data A data.frame of observation data.
- #' @param w A vector of exposure level to compute CERF.
+ #' @param w A vector of exposure level to compute CERF (please also see the
+ #' notes).
#' @param gps_m An S3 gps object including:
#' gps: A data.frame of GPS vectors.
#' - Column 1: GPS
@@ -36,6 +37,18 @@
#' @param nthread An integer value that represents the number of threads to be
#' used by internal packages.
#'
+ #' @note
+ #' Please note that `w` is a vector representing a grid of exposure levels at
+ #' which the CERF is to be estimated. This grid can include both observed and
+ #' hypothetical values of the exposure variable. The purpose of defining this
+ #' grid is to provide a structured set of points across the exposure spectrum
+ #' for estimating the CERF. This approach is essential in nonparametric models
+ #' like Gaussian Processes (GPs), where the CERF is evaluated at specific points
+ #' to understand the relationship between the exposure and outcome variables
+ #' across a continuum. It facilitates a comprehensive analysis by allowing
+ #' practitioners to examine the effect of varying exposure levels, including
+ #' those not directly observed in the dataset.
+ #'
#' @return
#' A cerf_nngp object that includes the following values:
#' - w, the vector of exposure levels.
20 changes: 10 additions & 10 deletions README.md
@@ -73,12 +73,12 @@ Optimal hyper parameters(#trial: 300):
alpha = 12.9154966501488 beta = 12.9154966501488 g_sigma = 0.1
Optimal covariate balance:
- cf1 = 0.072
+ cf1 = 0.069
  cf2 = 0.082
- cf3 = 0.062
- cf4 = 0.068
+ cf3 = 0.063
+ cf4 = 0.066
  cf5 = 0.056
- cf6 = 0.082
+ cf6 = 0.081
Original covariate balance:
cf1 = 0.222
@@ -87,7 +87,7 @@ Original covariate balance:
cf4 = 0.318
cf5 = 0.198
cf6 = 0.257
- ----***----
+ ----***----
```

<p>
@@ -148,10 +148,10 @@ Optimal hyper parameters(#trial: 300):
alpha = 0.0278255940220712 beta = 0.215443469003188 g_sigma = 0.1
Optimal covariate balance:
- cf1 = 0.058
- cf2 = 0.071
- cf3 = 0.087
- cf4 = 0.066
+ cf1 = 0.062
+ cf2 = 0.070
+ cf3 = 0.091
+ cf4 = 0.062
cf5 = 0.076
cf6 = 0.088
@@ -162,7 +162,7 @@ Original covariate balance:
cf4 = 0.296
cf5 = 0.208
cf6 = 0.225
- ----***----
+ ----***----
```

<p>
4 changes: 2 additions & 2 deletions docker_singularity/Dockerfile
@@ -30,9 +30,9 @@ RUN R -e "install.packages(c( \
'rlang', \
'Rfast', \
'SuperLearner', \
'ranger', \
'wCorr'), repos='https://cloud.r-project.org')"

ENV RENV_VERSION 0.15.1
RUN R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"
RUN R -e "remotes::install_github('rstudio/renv@${RENV_VERSION}')"


4 changes: 2 additions & 2 deletions docker_singularity/README.md
@@ -37,9 +37,9 @@ Run the following code to download and spin up the image.

```s
docker run -it --rm \
- -p 8230:8787 \
+ -p 8231:8787 \
  -e USER=rstudio \
  -e PASSWORD=pass \
- -v "/path/to/your/folder/on/host:/home/rstudio/Project" nsaphsoftware/gpcerf_dev
+ -v $PWD:/home/rstudio/Project nsaphsoftware/gpcerf_dev
```
3 changes: 0 additions & 3 deletions functional_tests/README.md

This file was deleted.

14 changes: 0 additions & 14 deletions functional_tests/ft_compute_deriv_nn.R

This file was deleted.

11 changes: 0 additions & 11 deletions functional_tests/ft_compute_derive_weights_gp.R

This file was deleted.

35 changes: 0 additions & 35 deletions functional_tests/ft_compute_inverse.R

This file was deleted.

24 changes: 0 additions & 24 deletions functional_tests/ft_compute_m_sigma.R

This file was deleted.

51 changes: 0 additions & 51 deletions functional_tests/ft_compute_posterior_m_nn.R

This file was deleted.

41 changes: 0 additions & 41 deletions functional_tests/ft_compute_posterior_sd_nn.R

This file was deleted.

34 changes: 0 additions & 34 deletions functional_tests/ft_compute_sd_gp.R

This file was deleted.
