diff --git a/DESCRIPTION b/DESCRIPTION
index 3a8d4df3..68679d60 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -2,7 +2,7 @@ Package: baker
 Type: Package
 Title: Nested Partially Latent Class Models
 Version: 1.0.2
-Date: 2023-12-14
+Date: 2023-12-18
 Authors@R: c(
     person("Zhenke", "Wu", email="zhenkewu@gmail.com",role=c("cre","aut","cph"),
     comment = c(ORCID = "0000-0001-7582-669X")),
@@ -14,14 +14,14 @@ Authors@R: c(
     comment = c(ORCID = "0000-0002-9366-8506"))
     )
 Description: Provides functions to specify, fit and visualize
-    nested partially-latent class models (Wu et al., 2016,
-    'JRSS-C' ;
-    Wu et al., 2017, 'Biostatistics' ;
-    Wu and Chen, 2021, Statistics in Medicine ) for
+    nested partially-latent class models (
+    'Wu, Deloria-Knoll, Hammitt, and Zeger, 2016, JRSS-C' ;
+    'Wu, Deloria-Knoll, and Zeger, 2017, Biostatistics' ;
+    'Wu and Chen, 2021, Statistics in Medicine' ) for
     inference of population disease etiology and individual diagnosis.
     In the motivating Pneumonia Etiology Research for Child Health (PERCH) study,
     because both quantities of interest sum to one hundred percent,
     the PERCH scientists frequently refer to
-    them as "population etiology pie" and "individual etiology pie", hence the name of the package.
+    them as 'population etiology pie' and 'individual etiology pie', hence the name of the package.
 Depends: R(>= 4.3.0)
 Imports:
diff --git a/R/clean-perch-data.R b/R/clean-perch-data.R
index e33ff0cf..122b0773 100644
--- a/R/clean-perch-data.R
+++ b/R/clean-perch-data.R
@@ -7,7 +7,7 @@
 #' @param clean_options The list of options for cleaning PERCH data.
 #' Its elements are defined as follows:
 #'
-#' \itemize{
+#' \describe{
 #' \item{`raw_meas_dir`}{: The file path to the raw data;}
 #' \item{`case_def`}{: Variable name in raw data for **case** definition;}
 #' \item{`case_def_val`}{: The value for **case** definition;}
@@ -194,20 +194,20 @@ clean_perch_data <- function(clean_options) {
 #' The default is NULL, which means not reading in any covariate.
 #'
 #' @return A list of data.
-#' \itemize{
+#' \describe{
 #' \item{Mobs}{
-#' \itemize{
-#' \item{MBS} A list of Bronze-Standard (BrS) measurements.
+#' \describe{
+#' \item{MBS}{ A list of Bronze-Standard (BrS) measurements.
 #' The names of the list take the form of `specimen`_`test`.
 #' Each element of the list is a data frame. The rows of the data frame
-#' are for subjects; the columns are for measured pathogens.
-#' \item{MSS} A list of Silver-Standard (SS) measurements.
-#' The formats are the same as `MBS` above.
-#' \item{MGS} A list of Gold-Standard (GS) measurements.
-#' It equals `NULL` if no GS data exist.
+#' are for subjects; the columns are for measured pathogens.}
+#' \item{MSS}{ A list of Silver-Standard (SS) measurements.
+#' The formats are the same as `MBS` above.}
+#' \item{MGS}{ A list of Gold-Standard (GS) measurements.
+#' It equals `NULL` if no GS data exist.}
 #' }
 #' }
-#' \item{X} A data frame with columns specified by `extra_covariates`.
+#' \item{X}{ A data frame with columns specified by `extra_covariates`.}
 #' }
 #'
 #' @family raw data importing functions
diff --git a/R/nplcm.R b/R/nplcm.R
index 78367516..a0c2b05f 100644
--- a/R/nplcm.R
+++ b/R/nplcm.R
@@ -18,7 +18,8 @@ if(getRversion() >= "2.15.1") utils::globalVariables(c("set_prior_tpr","set_prio
 #' (effectively deleting `MGS` from `Mobs`).
 #' \itemize{
 #' \item `MBS` a list of data frame of bronze-standard (BrS) measurements.
-#' Rows are subjects, columns are causative agents (e.g., pathogen species).
+#' For each data frame (referred to as a 'slice'),
+#' rows are subjects, columns are causative agents (e.g., pathogen species).
 #' We use `list` here to accommodate the possibility of multiple sets of BrS data.
 #' They have imperfect sensitivity/specificity (e.g. nasopharyngeal polymerase chain
 #' reaction - NPPCR).
@@ -46,32 +47,35 @@ if(getRversion() >= "2.15.1") utils::globalVariables(c("set_prior_tpr","set_prio
 #' A vector of characters strings; can be one or more from `"BrS"`, `"SS"`, `"GS"`.
 #' }
 #' \item{`likelihood`}{
-#' \itemize{
-#' \item{cause_list} The vector of causes (NB: specify);
-#' \item{k_subclass} The number of nested subclasses in each
+#' \describe{
+#' \item{cause_list}{ The vector of causes (NB: specify);}
+#' \item{k_subclass}{ The number of nested subclasses in each
 #' disease class (one of case classes or the control class; the same `k_subclass`
 #' is assumed for each class) and each slice of BrS measurements.
 #' `1` for conditional independence; larger than `1` for conditional dependence.
 #' It is only available for BrS measurements. It is a vector of length equal to
-#' the number of slices of BrS measurements;
-#' \item{Eti_formula} Formula for etiology regressions. You can use
+#' the number of slices of BrS measurements;}
+#' \item{Eti_formula}{ Formula for etiology regressions. You can use
 #' [s_date_Eti()] to specify the design matrix for `R` format enrollment date;
-#' it will produce natural cubic spline basis. Specify `~ 1` if no regression is intended.
-#' \item{FPR_formula}formula for false positive rates (FPR) regressions; see [formula()].
+#' it will produce natural cubic spline basis. Specify `~ 1` if no regression is intended.}
+#' \item{FPR_formula}{formula for false positive rates (FPR) regressions; see [formula()].
 #' You can use [s_date_FPR()] to specify part of the design matrix for `R`
 #' format enrollment date; it will produce penalized-spline basis (based on B-splines).
 #' Specify `~ 1` if no regression is intended. (NB: If `effect="fixed"`, [dm_Rdate_FPR()]
-#' will just specify a design matrix with appropriately standardized dates.)
+#' will just specify a design matrix with appropriately standardized dates.)}
 #' }
 #' }
 #'
 #' \item{`prior`}{
-#' \itemize{
-#' \item{Eti_prior}Description of etiology prior (e.g., `overall_uniform` -
-#' all hyperparameters are `1`; or `0_1` - all hyperparameters are `0.1`);
-#' \item{TPR_prior}Description of priors for the measurements
-#' (e.g., informative vs non-informative). Its length should be the same with `M_use`.
-#' (NB: not sure what M use is...)
+#' \describe{
+#' \item{Eti_prior}{Description of etiology prior (e.g., `overall_uniform` -
+#' all hyperparameters are `1`; or `0_1` - all hyperparameters are `0.1`);}
+#' \item{TPR_prior}{Description of priors for the measurements
+#' (e.g., informative vs non-informative). Its length should be the
+#' same as `use_measurements` above. Please see examples for how to specify.
+#' The package can also handle multiple slices of BrS, SS data, so separate
+#' specification of the TPR priors are needed.
+#' }
 #' }
 #' }
 #' }
@@ -116,7 +120,7 @@ if(getRversion() >= "2.15.1") utils::globalVariables(c("set_prior_tpr","set_prio
 #' This function is called when there exists one or more than one discrete covariate among
 #' the union of the two covariate sets. The method implemented by this function
 #' directly lets FPR depend upon covariates.
-#' This is different from Wu and Chen (2020+), which let the subclass
+#' This is different from Wu and Chen (2021), which let the subclass
 #' weights depend upon covariates. We implemented this function for methods comparison.
 #' \item [nplcm_fit_Reg_discrete_predictor_NoNest] deals with the setting
 #' with all discrete covariates for FPRs and CSCFs. The strata defined by the two sets of
@@ -126,7 +130,7 @@ if(getRversion() >= "2.15.1") utils::globalVariables(c("set_prior_tpr","set_prio
 #' }
 #' \item local dependence model for BrS measures:
 #' Fitted at lower level by [nplcm_fit_Reg_Nest]: This is the method introduced in
-#' Wu and Chen (2020+): CSCF regression + case/control subclass weight regression.
+#' Wu and Chen (2021): CSCF regression + case/control subclass weight regression.
 #' It does not provide a specialized function for the setting with all discrete covariates.
 #' }
 #' }
@@ -994,30 +998,21 @@ nplcm_fit_NoReg<-
   if(file.exists(curr_data_txt_file)){file.remove(curr_data_txt_file)}
   dump(names(in_data.list), append = FALSE, envir = here, file = curr_data_txt_file)
-  ## fix dimension problem.... convert say .Dmi=7:6 to c(7,6) (an issue for templateBS_1):
-  bad_jagsdata_txt <- readLines(curr_data_txt_file)
-  #good_jagsdata_txt <- gsub( "([0-9]+):([0-9]+)", "c(\\1,\\2)", bad_jagsdata_txt,fixed = FALSE)
-  ## to add an additional complicatoin of dump creates a text file with , dim = but the JAGS only accepts .Dim=
-  good_jagsdata_txt <- gsub( ", dim =", ", .Dim=",
-      gsub( "([0-9]+):([0-9]+)", "c(\\1,\\2)", bad_jagsdata_txt,fixed = FALSE),
-      fixed = FALSE)
-
-
-  writeLines(good_jagsdata_txt, curr_data_txt_file)
-
-  # fix dimension problem.... convert say 7:6 to c(7,6) (an issue for a dumped matrix):
-  inits_fnames <- list.files(mcmc_options$result.folder,pattern = "^jagsinits[0-9]+.txt",
-      full.names = TRUE)
-  for (fiter in seq_along(inits_fnames)){
-    curr_inits_txt_file <- inits_fnames[fiter]
-    bad_jagsinits_txt <- readLines(curr_inits_txt_file)
-    good_jagsinits_txt <- gsub( "([0-9]+):([0-9]+)", "c(\\1,\\2)", bad_jagsinits_txt,fixed = FALSE)
-    writeLines(good_jagsinits_txt, curr_inits_txt_file)
-  }
+  # ## fix dimension problem.... convert say .Dmi=7:6 to c(7,6) (an issue for templateBS_1):
+  # bad_jagsdata_txt <- readLines(curr_data_txt_file)
+  # #good_jagsdata_txt <- gsub( "([0-9]+):([0-9]+)", "c(\\1,\\2)", bad_jagsdata_txt,fixed = FALSE)
+  # ## to add an additional complicatoin of dump creates a text file with , dim = but the JAGS only accepts .Dim=
+  # good_jagsdata_txt <- gsub( ", dim =", ", .Dim=",
+  #     gsub( "([0-9]+):([0-9]+)", "c(\\1,\\2)", bad_jagsdata_txt,fixed = FALSE),
+  #     fixed = FALSE)
+  #
+  #
+  # writeLines(good_jagsdata_txt, curr_data_txt_file)
+
   ## fixed some problems of JAGS 4.3.2 not having cut function; and I(a,b) functions weirdly, even though
   ## the two elements are already constants (errors says that are not constant).
-  curr_model_txt_file <- file.path(mcmc_options$result.folder,"model_NoReg.bug")
+  curr_model_txt_file <- file.path(mcmc_options$result.folder,model_bugfile_name)
   bad_model_txt <- readLines(curr_model_txt_file)
   good_model_txt <- gsub( "cut\\(", "(", bad_model_txt,fixed = FALSE)
   good_model_txt <- gsub( "I\\(0\\.000001,0\\.999999\\)", " ", good_model_txt,fixed = FALSE)
@@ -1771,14 +1766,18 @@ nplcm_fit_Reg_discrete_predictor_NoNest <-
   if(file.exists(curr_data_txt_file)){file.remove(curr_data_txt_file)}
   dump(names(in_data.list), append = FALSE, envir = here, file = curr_data_txt_file)
-  # fix dimension problem.... convert say .Dmi=7:6 to c(7,6) (an issue for templateBS_1):
-  bad_jagsdata_txt <- readLines(curr_data_txt_file)
-  good_jagsdata_txt <- gsub( ".Dim = ([0-9]+):([0-9]+)", ".Dim = c(\\1,\\2)",
-      bad_jagsdata_txt,fixed = FALSE)
-  writeLines(good_jagsdata_txt, curr_data_txt_file)
+
+  ## fixed some problems of JAGS 4.3.2 not having cut function; and I(a,b) functions weirdly, even though
+  ## the two elements are already constants (errors says that are not constant).
+  curr_model_txt_file <- file.path(mcmc_options$result.folder,model_bugfile_name)
+  bad_model_txt <- readLines(curr_model_txt_file)
+  good_model_txt <- gsub( "cut\\(", "(", bad_model_txt,fixed = FALSE)
+  good_model_txt <- gsub( "I\\(0\\.000001,0\\.999999\\)", " ", good_model_txt,fixed = FALSE)
+  writeLines(good_model_txt, curr_model_txt_file)
+
   if(is.null(mcmc_options$jags.dir)){mcmc_options$jags.dir=""}
   gs <- jags2_baker(data = curr_data_txt_file,
-      inits = in_init,
+      inits = xxx,
       parameters.to.save = out_parameter,
       model.file = filename,
       working.directory = mcmc_options$result.folder,
@@ -2456,20 +2455,15 @@ nplcm_fit_Reg_NoNest <-
   if(file.exists(curr_data_txt_file)){file.remove(curr_data_txt_file)}
   dump(names(in_data.list), append = FALSE, envir = here, file = curr_data_txt_file)
-  ## fix dimension problem.... convert say .Dmi=7:6 to c(7,6) (an issue for templateBS_1):
-  bad_jagsdata_txt <- readLines(curr_data_txt_file)
-  good_jagsdata_txt <- gsub( ".Dim = ([0-9]+):([0-9]+)", ".Dim = c(\\1,\\2)", bad_jagsdata_txt,fixed = FALSE)
-  writeLines(good_jagsdata_txt, curr_data_txt_file)
-
-  # # fix dimension problem.... convert say 7:6 to c(7,6) (an issue for a dumped matrix):
-  # inits_fnames <- list.files(mcmc_options$result.folder,pattern = "^jagsinits[0-9]+.txt",
-  #     full.names = TRUE)
-  # for (fiter in seq_along(inits_fnames)){
-  #   curr_inits_txt_file <- inits_fnames[fiter]
-  #   bad_jagsinits_txt <- readLines(curr_inits_txt_file)
-  #   good_jagsinits_txt <- gsub( "([0-9]+):([0-9]+)", "c(\\1,\\2)", bad_jagsinits_txt,fixed = FALSE)
-  #   writeLines(good_jagsinits_txt, curr_inits_txt_file)
-  # }
+
+  ## fixed some problems of JAGS 4.3.2 not having cut function; and I(a,b) functions weirdly, even though
+  ## the two elements are already constants (errors says that are not constant).
+  curr_model_txt_file <- file.path(mcmc_options$result.folder,model_bugfile_name)
+  bad_model_txt <- readLines(curr_model_txt_file)
+  good_model_txt <- gsub( "cut\\(", "(", bad_model_txt,fixed = FALSE)
+  good_model_txt <- gsub( "I\\(0\\.000001,0\\.999999\\)", " ", good_model_txt,fixed = FALSE)
+  writeLines(good_model_txt, curr_model_txt_file)
+
   if(is.null(mcmc_options$jags.dir)){mcmc_options$jags.dir=""}
   gs <- jags2_baker(data = curr_data_txt_file,
       inits = in_init,
@@ -3222,10 +3216,16 @@ nplcm_fit_Reg_Nest <- function(data_nplcm,model_options,mcmc_options){
   if(file.exists(curr_data_txt_file)){file.remove(curr_data_txt_file)}
   dump(names(in_data.list), append = FALSE, envir = here, file = curr_data_txt_file)
-  ## fix dimension problem.... convert say .Dmi=7:6 to c(7,6) (an issue for templateBS_1):
-  bad_jagsdata_txt <- readLines(curr_data_txt_file)
-  good_jagsdata_txt <- gsub( ".Dim = ([0-9]+):([0-9]+)", ".Dim = c(\\1,\\2)", bad_jagsdata_txt,fixed = FALSE)
-  writeLines(good_jagsdata_txt, curr_data_txt_file)
+
+
+  ## fixed some problems of JAGS 4.3.2 not having cut function; and I(a,b) functions weirdly, even though
+  ## the two elements are already constants (errors says that are not constant).
+  curr_model_txt_file <- file.path(mcmc_options$result.folder,model_bugfile_name)
+  bad_model_txt <- readLines(curr_model_txt_file)
+  good_model_txt <- gsub( "cut\\(", "(", bad_model_txt,fixed = FALSE)
+  good_model_txt <- gsub( "I\\(0\\.000001,0\\.999999\\)", " ", good_model_txt,fixed = FALSE)
+  writeLines(good_model_txt, curr_model_txt_file)
+
   if(is.null(mcmc_options$jags.dir)){mcmc_options$jags.dir=""}
   gs <- jags2_baker(data = curr_data_txt_file,
       inits = in_init,
diff --git a/R/simulate-nplcm.R b/R/simulate-nplcm.R
index af8948e0..fde08ce1 100644
--- a/R/simulate-nplcm.R
+++ b/R/simulate-nplcm.R
@@ -4,51 +4,51 @@
 #'
 #'
 #' @param set_parameter True model parameters in an npLCM specification:
-#' \itemize{
-#' \item{`cause_list`} a vector of disease class names among cases (since
+#' \describe{
+#' \item{`cause_list`}{ a vector of disease class names among cases (since
 #' the causes could be multi-agent (e.g., multiple pathogens may cause an individual case's
 #' pneumonia), so its length could be longer than the total number of unique
-#' causative agents)
-#' \item{`etiology`} a vector of proportions that sum to 100 percent
-#' \item{`pathogen_BrS`} a vector of putative causative agents' names measured in bronze-standard (BrS) data.
-#' This function simulates only one slice defined by {specimen}{test}{pathogen}
-#' \item{`pathogen_SS`} a vector of pathogen names measured in silver-standard (SS) data.
-#' \item{`meas_nm`} a list of {specimen}{test} names e.g., `list(MBS = c("NPPCR"),MSS="BCX")`
+#' causative agents)}
+#' \item{`etiology`}{ a vector of proportions that sum to 100 percent}
+#' \item{`pathogen_BrS`}{ a vector of putative causative agents' names measured in bronze-standard (BrS) data.
+#' This function simulates only one slice defined by `specimen``test``pathogen`}
+#' \item{`pathogen_SS`}{ a vector of pathogen names measured in silver-standard (SS) data.}
+#' \item{`meas_nm`}{ a list of `specimen``test` names e.g., `list(MBS = c("NPPCR"),MSS="BCX")`
 #' for nasopharyngeal (NP) specimen tested by polymerase chain reaction (PCR) - `NPPCR` and
-#' blood (B) tested by culture (Cx) - `BCX`
-#' \item{`Lambda`} controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K}
-#' a vector of `K` probabilities that sum to 1.
-#' \item{`Eta`} a matrix of dimension `length(cause_list)` by `K`;
+#' blood (B) tested by culture (Cx) - `BCX`}
+#' \item{`Lambda`}{ controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K}
+#' a vector of `K` probabilities that sum to 1.}
+#' \item{`Eta`}{ a matrix of dimension `length(cause_list)` by `K`;
 #' each row represents a disease class (among cases); the values in that row
 #' are subclass weights \eqn{\eta_1, \eta_2, \ldots, \eta_K} for that disease class,
 #' so needs to sum to one. In Wu et al. 2016 (JRSS-C), the subclass weights are the same across disease
 #' classes across rows. But when simulating data, one can specify rows with distinct
 #' subclass weights - it is a matter whether we can recover these parameters (possible when
-#' some cases' true disease classes are observed)
-#' \item{`PsiBS/PsiSS`} False positive rates for Bronze-Standard data and
+#' some cases' true disease classes are observed)}
+#' \item{`PsiBS/PsiSS`}{ False positive rates for Bronze-Standard data and
 #' for Silver-Standard data. For example, the rows of `PsiBS` correspond to the dimension of the particular
 #' slice of BrS measures, e.g., `10` for 10 causative agents measured by NPPCR; the
 #' columns correspond to `K` subclasses; generically, the dimension is `J` by `K`
-#' `PsiSS` is supposed to be a vector of all zeros (perfect specificity in silver-standard measures).
-#' \item{`ThetaBS/ThetaSS`} True positive rates \eqn{\Theta} for Bronze-Standard data and
+#' `PsiSS` is supposed to be a vector of all zeros (perfect specificity in silver-standard measures).}
+#' \item{`ThetaBS/ThetaSS`}{ True positive rates \eqn{\Theta} for Bronze-Standard data and
 #' for Silver-Standard data. Dimension is `J` by `K` (can contain `NA` if the total number of
 #' causative agents measured by BrS or SS exceeds the measured causative agents in SS. For example,
 #' in PERCH study, nasopharyngeal polymerase chain reaction (NPPCR; bronze-standard) may target 30 distinct pathogens, but blood culture (BCX; silver-standard) may only target a subset of the 30,
-#' so we have to specify `NA` in `ThetaSS`for those pathogens not targeted by BCX).
-#' \item{`Nu`} the number of control subjects
-#' \item{`Nd`} the number of case subjects
+#' so we have to specify `NA` in `ThetaSS`for those pathogens not targeted by BCX).}
+#' \item{`Nu`}{ the number of control subjects}
+#' \item{`Nd`}{ the number of case subjects}
 #' }
 #'
 #' @return A list of diagnostic test measurements, true latent statues:
-#' \itemize{
-#' \item{`data_nplcm`} a list of structured data (see [nplcm()] for
-#' description).
-#' \item{`template`} a matrix: rows for causes (may comprise a single or multiple causative agents),
+#' \describe{
+#' \item{`data_nplcm`}{ a list of structured data (see [nplcm()] for
+#' description). }
+#' \item{`template`}{ a matrix: rows for causes (may comprise a single or multiple causative agents),
 #' columns for measurements; generated as a lookup table to match disease-class specific
-#' parameters (true and false positive rates)
-#' \item{`latent_cat`} integer values to indicate the latent category. The integer
+#' parameters (true and false positive rates)}
+#' \item{`latent_cat`}{ integer values to indicate the latent category. The integer
 #' code corresponds to the order specified in `set_parameter$etiology`.
-#' Controls are coded as `length(set_parameter$etiology)+1`.)
+#' Controls are coded as `length(set_parameter$etiology)+1`.)}
 #' }
 #'
 #' @seealso [simulate_latent] for simulating discrete latent status, given
diff --git a/R/utils.R b/R/utils.R
index 358ceccf..350b412e 100644
--- a/R/utils.R
+++ b/R/utils.R
@@ -1696,7 +1696,7 @@ has_non_basis <- function(form){
   outlab <- attr(out,"term.labels")
   (attr(out,"intercept")>0) ||
     (length(grep("^s_",outlab))>=1 && length(outlab[-grep("^s_",outlab)])>=1 && attr(out,"intercept")==0) ||
-    (length(outlab)>=1 && length(grep("^s_",outlab))==0 && attr(out,"intercept")==0)
+    (length(outlab)>=1 && length(grep("^s_",outlab))==0 && attr(out,"intercept")==0)
 }
 #' Make Etiology design matrix for dates with R format.
@@ -1922,6 +1922,22 @@ jags2_baker <- function (data, inits, parameters.to.save, model.file = "model.bu
   }
   }
   lapply(names(data.list), dump, append = TRUE, file = "jagsdata.txt")
+
+
+  ## ZW fix:
+  ## fix a problem related to dumped matrix having a structure attribute of dim not
+  ## .Dim as desired by JAGS 4.3.2; also the dimension could be represented by 7:6
+  ## instead of c(7,6), which may cause problems - so fixing this here.
+  bad_jagsdata_txt <- readLines("jagsdata.txt")
+  good_jagsdata_txt <- gsub( ", dim =", ", .Dim=",
+      gsub( "([0-9]+):([0-9]+)", "c(\\1,\\2)", bad_jagsdata_txt,fixed = FALSE),
+      fixed = FALSE)
+  writeLines(good_jagsdata_txt, "jagsdata.txt")
+
+  #### end of data fix.
+
+
+
   data <- read.jagsdata("jagsdata.txt")
   if (is.function(model.file)) {
     temp <- tempfile("model")
@@ -1960,15 +1976,14 @@ jags2_baker <- function (data, inits, parameters.to.save, model.file = "model.bu
   with(initial.values, dump(names(initial.values),
       file = curr_init_txt_file))
-  # fix dimension problem.... convert say 7:6 to c(7,6) (an issue for a dumped matrix):
-  inits_fnames <- list.files(pattern = "^jagsinits[0-9]+.txt",
-      full.names = TRUE)
-  for (fiter in seq_along(inits_fnames)){
-    curr_inits_txt_file <- inits_fnames[fiter]
-    bad_jagsinits_txt <- readLines(curr_inits_txt_file)
-    good_jagsinits_txt <- gsub( "([0-9]+):([0-9]+)", "c(\\1L,\\2L)", bad_jagsinits_txt,fixed = FALSE)
-    writeLines(good_jagsinits_txt, curr_inits_txt_file)
-  }
+  ## ZW fix:
+  bad_jagsinits_txt <- readLines(curr_init_txt_file)
+  good_jagsinits_txt <- gsub( ", dim =", ", .Dim=",
+      gsub( "([0-9]+):([0-9]+)", "c(\\1,\\2)",
+      bad_jagsinits_txt,fixed = FALSE),
+      fixed = FALSE)
+  writeLines(good_jagsinits_txt, curr_init_txt_file)
+  ## end of inits fix.
   }
diff --git a/man/clean_perch_data.Rd b/man/clean_perch_data.Rd
index 04149b7b..8985e769 100644
--- a/man/clean_perch_data.Rd
+++ b/man/clean_perch_data.Rd
@@ -10,7 +10,7 @@ clean_perch_data(clean_options)
 \item{clean_options}{The list of options for cleaning PERCH data.
 Its elements are defined as follows:
-\itemize{
+\describe{
 \item{\code{raw_meas_dir}}{: The file path to the raw data;}
 \item{\code{case_def}}{: Variable name in raw data for \strong{case} definition;}
 \item{\code{case_def_val}}{: The value for \strong{case} definition;}
diff --git a/man/compute_logOR_single_cause.Rd b/man/compute_logOR_single_cause.Rd
index cdcb2ed3..dee901ee 100644
--- a/man/compute_logOR_single_cause.Rd
+++ b/man/compute_logOR_single_cause.Rd
@@ -8,39 +8,39 @@ compute_logOR_single_cause(set_parameter)
 }
 \arguments{
 \item{set_parameter}{True model parameters in an npLCM specification:
-\itemize{
-\item{\code{cause_list}} a vector of disease class names among cases (since
+\describe{
+\item{\code{cause_list}}{ a vector of disease class names among cases (since
 the causes could be multi-agent (e.g., multiple pathogens may cause an individual case's
 pneumonia), so its length could be longer than the total number of unique
-causative agents)
-\item{\code{etiology}} a vector of proportions that sum to 100 percent
-\item{\code{pathogen_BrS}} a vector of putative causative agents' names measured in bronze-standard (BrS) data.
-This function simulates only one slice defined by {specimen}{test}{pathogen}
-\item{\code{pathogen_SS}} a vector of pathogen names measured in silver-standard (SS) data.
-\item{\code{meas_nm}} a list of {specimen}{test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")}
+causative agents)}
+\item{\code{etiology}}{ a vector of proportions that sum to 100 percent}
+\item{\code{pathogen_BrS}}{ a vector of putative causative agents' names measured in bronze-standard (BrS) data.
+This function simulates only one slice defined by \verb{specimen``test``pathogen}}
+\item{\code{pathogen_SS}}{ a vector of pathogen names measured in silver-standard (SS) data.}
+\item{\code{meas_nm}}{ a list of \verb{specimen``test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")}
 for nasopharyngeal (NP) specimen tested by polymerase chain reaction (PCR) - \code{NPPCR} and
-blood (B) tested by culture (Cx) - \code{BCX}
-\item{\code{Lambda}} controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K}
-a vector of \code{K} probabilities that sum to 1.
-\item{\code{Eta}} a matrix of dimension \code{length(cause_list)} by \code{K};
+blood (B) tested by culture (Cx) - \code{BCX}}
+\item{\code{Lambda}}{ controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K}
+a vector of \code{K} probabilities that sum to 1.}
+\item{\code{Eta}}{ a matrix of dimension \code{length(cause_list)} by \code{K};
 each row represents a disease class (among cases); the values in that row
 are subclass weights \eqn{\eta_1, \eta_2, \ldots, \eta_K} for that disease class,
 so needs to sum to one. In Wu et al. 2016 (JRSS-C), the subclass weights are the same across disease
 classes across rows. But when simulating data, one can specify rows with distinct
 subclass weights - it is a matter whether we can recover these parameters (possible when
-some cases' true disease classes are observed)
-\item{\code{PsiBS/PsiSS}} False positive rates for Bronze-Standard data and
+some cases' true disease classes are observed)}
+\item{\code{PsiBS/PsiSS}}{ False positive rates for Bronze-Standard data and
 for Silver-Standard data. For example, the rows of \code{PsiBS} correspond to the dimension of the particular
 slice of BrS measures, e.g., \code{10} for 10 causative agents measured by NPPCR; the
 columns correspond to \code{K} subclasses; generically, the dimension is \code{J} by \code{K}
-\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures).
-\item{\code{ThetaBS/ThetaSS}} True positive rates \eqn{\Theta} for Bronze-Standard data and
+\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures).}
+\item{\code{ThetaBS/ThetaSS}}{ True positive rates \eqn{\Theta} for Bronze-Standard data and
 for Silver-Standard data. Dimension is \code{J} by \code{K} (can contain \code{NA} if the total number of
 causative agents measured by BrS or SS exceeds the measured causative agents in SS. For example,
 in PERCH study, nasopharyngeal polymerase chain reaction (NPPCR; bronze-standard) may target 30 distinct pathogens, but blood culture (BCX; silver-standard) may only target a subset of the 30,
-so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX).
-\item{\code{Nu}} the number of control subjects
-\item{\code{Nd}} the number of case subjects
+so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX).}
+\item{\code{Nu}}{ the number of control subjects}
+\item{\code{Nd}}{ the number of case subjects}
 }}
 }
 \value{
diff --git a/man/extract_data_raw.Rd b/man/extract_data_raw.Rd
index e8f456cf..c9eb8819 100644
--- a/man/extract_data_raw.Rd
+++ b/man/extract_data_raw.Rd
@@ -32,20 +32,20 @@ The default is NULL, which means not reading in any covariate.}
 }
 \value{
 A list of data.
-\itemize{
+\describe{
 \item{Mobs}{
-\itemize{
-\item{MBS} A list of Bronze-Standard (BrS) measurements.
+\describe{
+\item{MBS}{ A list of Bronze-Standard (BrS) measurements.
 The names of the list take the form of \code{specimen}_\code{test}.
 Each element of the list is a data frame. The rows of the data frame
-are for subjects; the columns are for measured pathogens.
-\item{MSS} A list of Silver-Standard (SS) measurements.
-The formats are the same as \code{MBS} above.
-\item{MGS} A list of Gold-Standard (GS) measurements.
-It equals \code{NULL} if no GS data exist.
+are for subjects; the columns are for measured pathogens.}
+\item{MSS}{ A list of Silver-Standard (SS) measurements.
+The formats are the same as \code{MBS} above.}
+\item{MGS}{ A list of Gold-Standard (GS) measurements.
+It equals \code{NULL} if no GS data exist.}
 }
 }
-\item{X} A data frame with columns specified by \code{extra_covariates}.
+\item{X}{ A data frame with columns specified by \code{extra_covariates}.}
 }
 }
 \description{
diff --git a/man/nplcm.Rd b/man/nplcm.Rd
index 7253872a..f7bb2159 100644
--- a/man/nplcm.Rd
+++ b/man/nplcm.Rd
@@ -18,7 +18,8 @@ is not available, please specify it as, e.g., \code{MGS=NULL}
 (effectively deleting \code{MGS} from \code{Mobs}).
 \itemize{
 \item \code{MBS} a list of data frame of bronze-standard (BrS) measurements.
-Rows are subjects, columns are causative agents (e.g., pathogen species).
+For each data frame (referred to as a 'slice'),
+rows are subjects, columns are causative agents (e.g., pathogen species).
 We use \code{list} here to accommodate the possibility of multiple sets of BrS data.
 They have imperfect sensitivity/specificity (e.g. nasopharyngeal polymerase chain
 reaction - NPPCR).
@@ -46,32 +47,35 @@ basis expansion is often needed for approximation.
 A vector of characters strings; can be one or more from \code{"BrS"}, \code{"SS"}, \code{"GS"}.
 }
 \item{\code{likelihood}}{
-\itemize{
-\item{cause_list} The vector of causes (NB: specify);
-\item{k_subclass} The number of nested subclasses in each
+\describe{
+\item{cause_list}{ The vector of causes (NB: specify);}
+\item{k_subclass}{ The number of nested subclasses in each
 disease class (one of case classes or the control class; the same \code{k_subclass}
 is assumed for each class) and each slice of BrS measurements.
 \code{1} for conditional independence; larger than \code{1} for conditional dependence.
 It is only available for BrS measurements. It is a vector of length equal to
-the number of slices of BrS measurements;
-\item{Eti_formula} Formula for etiology regressions. You can use
+the number of slices of BrS measurements;}
+\item{Eti_formula}{ Formula for etiology regressions. You can use
 \code{\link[=s_date_Eti]{s_date_Eti()}} to specify the design matrix for \code{R} format enrollment date;
-it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended.
-\item{FPR_formula}formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}.
+it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended.}
+\item{FPR_formula}{formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}.
 You can use \code{\link[=s_date_FPR]{s_date_FPR()}} to specify part of the design matrix for \code{R}
 format enrollment date; it will produce penalized-spline basis (based on B-splines).
 Specify \code{~ 1} if no regression is intended. (NB: If \code{effect="fixed"}, \code{\link[=dm_Rdate_FPR]{dm_Rdate_FPR()}}
-will just specify a design matrix with appropriately standardized dates.)
+will just specify a design matrix with appropriately standardized dates.)}
 }
 }
 \item{\code{prior}}{
-\itemize{
-\item{Eti_prior}Description of etiology prior (e.g., \code{overall_uniform} -
-all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1});
-\item{TPR_prior}Description of priors for the measurements
-(e.g., informative vs non-informative). Its length should be the same with \code{M_use}.
-(NB: not sure what M use is...)
+\describe{
+\item{Eti_prior}{Description of etiology prior (e.g., \code{overall_uniform} -
+all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1});}
+\item{TPR_prior}{Description of priors for the measurements
+(e.g., informative vs non-informative). Its length should be the
+same as \code{use_measurements} above. Please see examples for how to specify.
+The package can also handle multiple slices of BrS, SS data, so separate
+specification of the TPR priors are needed.
+}
 }
 }
 }}
@@ -119,7 +123,7 @@ covariates may be identical, overlapping or non-overlapping.
 This function is called when there exists one or more than one discrete covariate among
 the union of the two covariate sets. The method implemented by this function
 directly lets FPR depend upon covariates.
-This is different from Wu and Chen (2020+), which let the subclass
+This is different from Wu and Chen (2021), which let the subclass
 weights depend upon covariates. We implemented this function for methods comparison.
 \item \link{nplcm_fit_Reg_discrete_predictor_NoNest} deals with the setting
 with all discrete covariates for FPRs and CSCFs. The strata defined by the two sets of
@@ -129,7 +133,7 @@ We implemented this function for methods comparison.
 }
 \item local dependence model for BrS measures:
 Fitted at lower level by \link{nplcm_fit_Reg_Nest}: This is the method introduced in
-Wu and Chen (2020+): CSCF regression + case/control subclass weight regression.
+Wu and Chen (2021): CSCF regression + case/control subclass weight regression.
 It does not provide a specialized function for the setting with all discrete covariates.
 }
 }
diff --git a/man/nplcm_fit_NoReg.Rd b/man/nplcm_fit_NoReg.Rd
index 290236b7..04ab2783 100644
--- a/man/nplcm_fit_NoReg.Rd
+++ b/man/nplcm_fit_NoReg.Rd
@@ -18,7 +18,8 @@ is not available, please specify it as, e.g., \code{MGS=NULL}
 (effectively deleting \code{MGS} from \code{Mobs}).
 \itemize{
 \item \code{MBS} a list of data frame of bronze-standard (BrS) measurements.
-Rows are subjects, columns are causative agents (e.g., pathogen species).
+For each data frame (referred to as a 'slice'),
+rows are subjects, columns are causative agents (e.g., pathogen species).
 We use \code{list} here to accommodate the possibility of multiple sets of BrS data.
They have imperfect sensitivity/specificity (e.g. nasopharyngeal polymerase chain reaction - NPPCR). @@ -46,32 +47,35 @@ basis expansion is often needed for approximation. A vector of characters strings; can be one or more from \code{"BrS"}, \code{"SS"}, \code{"GS"}. } \item{\code{likelihood}}{ -\itemize{ -\item{cause_list} The vector of causes (NB: specify); -\item{k_subclass} The number of nested subclasses in each +\describe{ +\item{cause_list}{ The vector of causes (NB: specify);} +\item{k_subclass}{ The number of nested subclasses in each disease class (one of case classes or the control class; the same \code{k_subclass} is assumed for each class) and each slice of BrS measurements. \code{1} for conditional independence; larger than \code{1} for conditional dependence. It is only available for BrS measurements. It is a vector of length equal to -the number of slices of BrS measurements; -\item{Eti_formula} Formula for etiology regressions. You can use +the number of slices of BrS measurements;} +\item{Eti_formula}{ Formula for etiology regressions. You can use \code{\link[=s_date_Eti]{s_date_Eti()}} to specify the design matrix for \code{R} format enrollment date; -it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended. -\item{FPR_formula}formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}. +it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended.} +\item{FPR_formula}{formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}. You can use \code{\link[=s_date_FPR]{s_date_FPR()}} to specify part of the design matrix for \code{R} format enrollment date; it will produce penalized-spline basis (based on B-splines). Specify \code{~ 1} if no regression is intended. (NB: If \code{effect="fixed"}, \code{\link[=dm_Rdate_FPR]{dm_Rdate_FPR()}} -will just specify a design matrix with appropriately standardized dates.) 
+will just specify a design matrix with appropriately standardized dates.)} } } \item{\code{prior}}{ -\itemize{ -\item{Eti_prior}Description of etiology prior (e.g., \code{overall_uniform} - -all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1}); -\item{TPR_prior}Description of priors for the measurements -(e.g., informative vs non-informative). Its length should be the same with \code{M_use}. -(NB: not sure what M use is...) +\describe{ +\item{Eti_prior}{Description of etiology prior (e.g., \code{overall_uniform} - +all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1});} +\item{TPR_prior}{Description of priors for the measurements +(e.g., informative vs non-informative). Its length should be the +same as \code{use_measurements} above. Please see examples for how to specify. +The package can also handle multiple slices of BrS, SS data, so separate +specification of the TPR priors are needed. +} } } }} diff --git a/man/nplcm_fit_Reg_Nest.Rd b/man/nplcm_fit_Reg_Nest.Rd index bd74b942..d4661e4f 100644 --- a/man/nplcm_fit_Reg_Nest.Rd +++ b/man/nplcm_fit_Reg_Nest.Rd @@ -18,7 +18,8 @@ is not available, please specify it as, e.g., \code{MGS=NULL} (effectively deleting \code{MGS} from \code{Mobs}). \itemize{ \item \code{MBS} a list of data frame of bronze-standard (BrS) measurements. -Rows are subjects, columns are causative agents (e.g., pathogen species). +For each data frame (referred to as a 'slice'), +rows are subjects, columns are causative agents (e.g., pathogen species). We use \code{list} here to accommodate the possibility of multiple sets of BrS data. They have imperfect sensitivity/specificity (e.g. nasopharyngeal polymerase chain reaction - NPPCR). @@ -46,32 +47,35 @@ basis expansion is often needed for approximation. A vector of characters strings; can be one or more from \code{"BrS"}, \code{"SS"}, \code{"GS"}. 
} \item{\code{likelihood}}{ -\itemize{ -\item{cause_list} The vector of causes (NB: specify); -\item{k_subclass} The number of nested subclasses in each +\describe{ +\item{cause_list}{ The vector of causes (NB: specify);} +\item{k_subclass}{ The number of nested subclasses in each disease class (one of case classes or the control class; the same \code{k_subclass} is assumed for each class) and each slice of BrS measurements. \code{1} for conditional independence; larger than \code{1} for conditional dependence. It is only available for BrS measurements. It is a vector of length equal to -the number of slices of BrS measurements; -\item{Eti_formula} Formula for etiology regressions. You can use +the number of slices of BrS measurements;} +\item{Eti_formula}{ Formula for etiology regressions. You can use \code{\link[=s_date_Eti]{s_date_Eti()}} to specify the design matrix for \code{R} format enrollment date; -it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended. -\item{FPR_formula}formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}. +it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended.} +\item{FPR_formula}{formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}. You can use \code{\link[=s_date_FPR]{s_date_FPR()}} to specify part of the design matrix for \code{R} format enrollment date; it will produce penalized-spline basis (based on B-splines). Specify \code{~ 1} if no regression is intended. (NB: If \code{effect="fixed"}, \code{\link[=dm_Rdate_FPR]{dm_Rdate_FPR()}} -will just specify a design matrix with appropriately standardized dates.) 
+will just specify a design matrix with appropriately standardized dates.)} } } \item{\code{prior}}{ -\itemize{ -\item{Eti_prior}Description of etiology prior (e.g., \code{overall_uniform} - -all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1}); -\item{TPR_prior}Description of priors for the measurements -(e.g., informative vs non-informative). Its length should be the same with \code{M_use}. -(NB: not sure what M use is...) +\describe{ +\item{Eti_prior}{Description of etiology prior (e.g., \code{overall_uniform} - +all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1});} +\item{TPR_prior}{Description of priors for the measurements +(e.g., informative vs non-informative). Its length should be the +same as \code{use_measurements} above. Please see examples for how to specify. +The package can also handle multiple slices of BrS, SS data, so separate +specification of the TPR priors are needed. +} } } }} diff --git a/man/nplcm_fit_Reg_NoNest.Rd b/man/nplcm_fit_Reg_NoNest.Rd index f1d5517a..e4827fdb 100644 --- a/man/nplcm_fit_Reg_NoNest.Rd +++ b/man/nplcm_fit_Reg_NoNest.Rd @@ -18,7 +18,8 @@ is not available, please specify it as, e.g., \code{MGS=NULL} (effectively deleting \code{MGS} from \code{Mobs}). \itemize{ \item \code{MBS} a list of data frame of bronze-standard (BrS) measurements. -Rows are subjects, columns are causative agents (e.g., pathogen species). +For each data frame (referred to as a 'slice'), +rows are subjects, columns are causative agents (e.g., pathogen species). We use \code{list} here to accommodate the possibility of multiple sets of BrS data. They have imperfect sensitivity/specificity (e.g. nasopharyngeal polymerase chain reaction - NPPCR). @@ -46,32 +47,35 @@ basis expansion is often needed for approximation. A vector of characters strings; can be one or more from \code{"BrS"}, \code{"SS"}, \code{"GS"}. 
} \item{\code{likelihood}}{ -\itemize{ -\item{cause_list} The vector of causes (NB: specify); -\item{k_subclass} The number of nested subclasses in each +\describe{ +\item{cause_list}{ The vector of causes (NB: specify);} +\item{k_subclass}{ The number of nested subclasses in each disease class (one of case classes or the control class; the same \code{k_subclass} is assumed for each class) and each slice of BrS measurements. \code{1} for conditional independence; larger than \code{1} for conditional dependence. It is only available for BrS measurements. It is a vector of length equal to -the number of slices of BrS measurements; -\item{Eti_formula} Formula for etiology regressions. You can use +the number of slices of BrS measurements;} +\item{Eti_formula}{ Formula for etiology regressions. You can use \code{\link[=s_date_Eti]{s_date_Eti()}} to specify the design matrix for \code{R} format enrollment date; -it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended. -\item{FPR_formula}formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}. +it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended.} +\item{FPR_formula}{formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}. You can use \code{\link[=s_date_FPR]{s_date_FPR()}} to specify part of the design matrix for \code{R} format enrollment date; it will produce penalized-spline basis (based on B-splines). Specify \code{~ 1} if no regression is intended. (NB: If \code{effect="fixed"}, \code{\link[=dm_Rdate_FPR]{dm_Rdate_FPR()}} -will just specify a design matrix with appropriately standardized dates.) 
+will just specify a design matrix with appropriately standardized dates.)} } } \item{\code{prior}}{ -\itemize{ -\item{Eti_prior}Description of etiology prior (e.g., \code{overall_uniform} - -all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1}); -\item{TPR_prior}Description of priors for the measurements -(e.g., informative vs non-informative). Its length should be the same with \code{M_use}. -(NB: not sure what M use is...) +\describe{ +\item{Eti_prior}{Description of etiology prior (e.g., \code{overall_uniform} - +all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1});} +\item{TPR_prior}{Description of priors for the measurements +(e.g., informative vs non-informative). Its length should be the +same as \code{use_measurements} above. Please see examples for how to specify. +The package can also handle multiple slices of BrS, SS data, so separate +specification of the TPR priors are needed. +} } } }} diff --git a/man/nplcm_fit_Reg_discrete_predictor_NoNest.Rd b/man/nplcm_fit_Reg_discrete_predictor_NoNest.Rd index 0ba497af..cbecea11 100644 --- a/man/nplcm_fit_Reg_discrete_predictor_NoNest.Rd +++ b/man/nplcm_fit_Reg_discrete_predictor_NoNest.Rd @@ -22,7 +22,8 @@ is not available, please specify it as, e.g., \code{MGS=NULL} (effectively deleting \code{MGS} from \code{Mobs}). \itemize{ \item \code{MBS} a list of data frame of bronze-standard (BrS) measurements. -Rows are subjects, columns are causative agents (e.g., pathogen species). +For each data frame (referred to as a 'slice'), +rows are subjects, columns are causative agents (e.g., pathogen species). We use \code{list} here to accommodate the possibility of multiple sets of BrS data. They have imperfect sensitivity/specificity (e.g. nasopharyngeal polymerase chain reaction - NPPCR). @@ -50,32 +51,35 @@ basis expansion is often needed for approximation. A vector of characters strings; can be one or more from \code{"BrS"}, \code{"SS"}, \code{"GS"}. 
} \item{\code{likelihood}}{ -\itemize{ -\item{cause_list} The vector of causes (NB: specify); -\item{k_subclass} The number of nested subclasses in each +\describe{ +\item{cause_list}{ The vector of causes (NB: specify);} +\item{k_subclass}{ The number of nested subclasses in each disease class (one of case classes or the control class; the same \code{k_subclass} is assumed for each class) and each slice of BrS measurements. \code{1} for conditional independence; larger than \code{1} for conditional dependence. It is only available for BrS measurements. It is a vector of length equal to -the number of slices of BrS measurements; -\item{Eti_formula} Formula for etiology regressions. You can use +the number of slices of BrS measurements;} +\item{Eti_formula}{ Formula for etiology regressions. You can use \code{\link[=s_date_Eti]{s_date_Eti()}} to specify the design matrix for \code{R} format enrollment date; -it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended. -\item{FPR_formula}formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}. +it will produce natural cubic spline basis. Specify \code{~ 1} if no regression is intended.} +\item{FPR_formula}{formula for false positive rates (FPR) regressions; see \code{\link[=formula]{formula()}}. You can use \code{\link[=s_date_FPR]{s_date_FPR()}} to specify part of the design matrix for \code{R} format enrollment date; it will produce penalized-spline basis (based on B-splines). Specify \code{~ 1} if no regression is intended. (NB: If \code{effect="fixed"}, \code{\link[=dm_Rdate_FPR]{dm_Rdate_FPR()}} -will just specify a design matrix with appropriately standardized dates.) 
+will just specify a design matrix with appropriately standardized dates.)} } } \item{\code{prior}}{ -\itemize{ -\item{Eti_prior}Description of etiology prior (e.g., \code{overall_uniform} - -all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1}); -\item{TPR_prior}Description of priors for the measurements -(e.g., informative vs non-informative). Its length should be the same with \code{M_use}. -(NB: not sure what M use is...) +\describe{ +\item{Eti_prior}{Description of etiology prior (e.g., \code{overall_uniform} - +all hyperparameters are \code{1}; or \verb{0_1} - all hyperparameters are \code{0.1});} +\item{TPR_prior}{Description of priors for the measurements +(e.g., informative vs non-informative). Its length should be the +same as \code{use_measurements} above. Please see examples for how to specify. +The package can also handle multiple slices of BrS, SS data, so separate +specification of the TPR priors are needed. +} } } }} diff --git a/man/simulate_brs.Rd b/man/simulate_brs.Rd index 9be2fe98..9d6eb7fc 100644 --- a/man/simulate_brs.Rd +++ b/man/simulate_brs.Rd @@ -8,39 +8,39 @@ simulate_brs(set_parameter, latent_samples) } \arguments{ \item{set_parameter}{True model parameters in an npLCM specification: -\itemize{ -\item{\code{cause_list}} a vector of disease class names among cases (since +\describe{ +\item{\code{cause_list}}{ a vector of disease class names among cases (since the causes could be multi-agent (e.g., multiple pathogens may cause an individual case's pneumonia), so its length could be longer than the total number of unique -causative agents) -\item{\code{etiology}} a vector of proportions that sum to 100 percent -\item{\code{pathogen_BrS}} a vector of putative causative agents' names measured in bronze-standard (BrS) data. -This function simulates only one slice defined by {specimen}{test}{pathogen} -\item{\code{pathogen_SS}} a vector of pathogen names measured in silver-standard (SS) data. 
-\item{\code{meas_nm}} a list of {specimen}{test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")} +causative agents)} +\item{\code{etiology}}{ a vector of proportions that sum to 100 percent} +\item{\code{pathogen_BrS}}{ a vector of putative causative agents' names measured in bronze-standard (BrS) data. +This function simulates only one slice defined by \verb{specimen``test``pathogen}} +\item{\code{pathogen_SS}}{ a vector of pathogen names measured in silver-standard (SS) data.} +\item{\code{meas_nm}}{ a list of \verb{specimen``test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")} for nasopharyngeal (NP) specimen tested by polymerase chain reaction (PCR) - \code{NPPCR} and -blood (B) tested by culture (Cx) - \code{BCX} -\item{\code{Lambda}} controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K} -a vector of \code{K} probabilities that sum to 1. -\item{\code{Eta}} a matrix of dimension \code{length(cause_list)} by \code{K}; +blood (B) tested by culture (Cx) - \code{BCX}} +\item{\code{Lambda}}{ controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K} +a vector of \code{K} probabilities that sum to 1.} +\item{\code{Eta}}{ a matrix of dimension \code{length(cause_list)} by \code{K}; each row represents a disease class (among cases); the values in that row are subclass weights \eqn{\eta_1, \eta_2, \ldots, \eta_K} for that disease class, so needs to sum to one. In Wu et al. 2016 (JRSS-C), the subclass weights are the same across disease classes across rows. But when simulating data, one can specify rows with distinct subclass weights - it is a matter whether we can recover these parameters (possible when -some cases' true disease classes are observed) -\item{\code{PsiBS/PsiSS}} False positive rates for Bronze-Standard data and +some cases' true disease classes are observed)} +\item{\code{PsiBS/PsiSS}}{ False positive rates for Bronze-Standard data and for Silver-Standard data. 
For example, the rows of \code{PsiBS} correspond to the dimension of the particular slice of BrS measures, e.g., \code{10} for 10 causative agents measured by NPPCR; the columns correspond to \code{K} subclasses; generically, the dimension is \code{J} by \code{K} -\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures). -\item{\code{ThetaBS/ThetaSS}} True positive rates \eqn{\Theta} for Bronze-Standard data and +\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures).} +\item{\code{ThetaBS/ThetaSS}}{ True positive rates \eqn{\Theta} for Bronze-Standard data and for Silver-Standard data. Dimension is \code{J} by \code{K} (can contain \code{NA} if the total number of causative agents measured by BrS or SS exceeds the measured causative agents in SS. For example, in PERCH study, nasopharyngeal polymerase chain reaction (NPPCR; bronze-standard) may target 30 distinct pathogens, but blood culture (BCX; silver-standard) may only target a subset of the 30, -so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX). 
-\item{\code{Nu}} the number of control subjects -\item{\code{Nd}} the number of case subjects +so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX).} +\item{\code{Nu}}{ the number of control subjects} +\item{\code{Nd}}{ the number of case subjects} }} \item{latent_samples}{simulated latent status for all the subjects, for use in simulating diff --git a/man/simulate_latent.Rd b/man/simulate_latent.Rd index 9479f798..390f8149 100644 --- a/man/simulate_latent.Rd +++ b/man/simulate_latent.Rd @@ -8,39 +8,39 @@ simulate_latent(set_parameter) } \arguments{ \item{set_parameter}{True model parameters in an npLCM specification: -\itemize{ -\item{\code{cause_list}} a vector of disease class names among cases (since +\describe{ +\item{\code{cause_list}}{ a vector of disease class names among cases (since the causes could be multi-agent (e.g., multiple pathogens may cause an individual case's pneumonia), so its length could be longer than the total number of unique -causative agents) -\item{\code{etiology}} a vector of proportions that sum to 100 percent -\item{\code{pathogen_BrS}} a vector of putative causative agents' names measured in bronze-standard (BrS) data. -This function simulates only one slice defined by {specimen}{test}{pathogen} -\item{\code{pathogen_SS}} a vector of pathogen names measured in silver-standard (SS) data. -\item{\code{meas_nm}} a list of {specimen}{test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")} +causative agents)} +\item{\code{etiology}}{ a vector of proportions that sum to 100 percent} +\item{\code{pathogen_BrS}}{ a vector of putative causative agents' names measured in bronze-standard (BrS) data. 
+This function simulates only one slice defined by \verb{specimen``test``pathogen}} +\item{\code{pathogen_SS}}{ a vector of pathogen names measured in silver-standard (SS) data.} +\item{\code{meas_nm}}{ a list of \verb{specimen``test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")} for nasopharyngeal (NP) specimen tested by polymerase chain reaction (PCR) - \code{NPPCR} and -blood (B) tested by culture (Cx) - \code{BCX} -\item{\code{Lambda}} controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K} -a vector of \code{K} probabilities that sum to 1. -\item{\code{Eta}} a matrix of dimension \code{length(cause_list)} by \code{K}; +blood (B) tested by culture (Cx) - \code{BCX}} +\item{\code{Lambda}}{ controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K} +a vector of \code{K} probabilities that sum to 1.} +\item{\code{Eta}}{ a matrix of dimension \code{length(cause_list)} by \code{K}; each row represents a disease class (among cases); the values in that row are subclass weights \eqn{\eta_1, \eta_2, \ldots, \eta_K} for that disease class, so needs to sum to one. In Wu et al. 2016 (JRSS-C), the subclass weights are the same across disease classes across rows. But when simulating data, one can specify rows with distinct subclass weights - it is a matter whether we can recover these parameters (possible when -some cases' true disease classes are observed) -\item{\code{PsiBS/PsiSS}} False positive rates for Bronze-Standard data and +some cases' true disease classes are observed)} +\item{\code{PsiBS/PsiSS}}{ False positive rates for Bronze-Standard data and for Silver-Standard data. For example, the rows of \code{PsiBS} correspond to the dimension of the particular slice of BrS measures, e.g., \code{10} for 10 causative agents measured by NPPCR; the columns correspond to \code{K} subclasses; generically, the dimension is \code{J} by \code{K} -\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures). 
-\item{\code{ThetaBS/ThetaSS}} True positive rates \eqn{\Theta} for Bronze-Standard data and +\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures).} +\item{\code{ThetaBS/ThetaSS}}{ True positive rates \eqn{\Theta} for Bronze-Standard data and for Silver-Standard data. Dimension is \code{J} by \code{K} (can contain \code{NA} if the total number of causative agents measured by BrS or SS exceeds the measured causative agents in SS. For example, in PERCH study, nasopharyngeal polymerase chain reaction (NPPCR; bronze-standard) may target 30 distinct pathogens, but blood culture (BCX; silver-standard) may only target a subset of the 30, -so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX). -\item{\code{Nu}} the number of control subjects -\item{\code{Nd}} the number of case subjects +so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX).} +\item{\code{Nu}}{ the number of control subjects} +\item{\code{Nd}}{ the number of case subjects} }} } \value{ diff --git a/man/simulate_nplcm.Rd b/man/simulate_nplcm.Rd index 8d60bc70..81cda25b 100644 --- a/man/simulate_nplcm.Rd +++ b/man/simulate_nplcm.Rd @@ -8,52 +8,52 @@ simulate_nplcm(set_parameter) } \arguments{ \item{set_parameter}{True model parameters in an npLCM specification: -\itemize{ -\item{\code{cause_list}} a vector of disease class names among cases (since +\describe{ +\item{\code{cause_list}}{ a vector of disease class names among cases (since the causes could be multi-agent (e.g., multiple pathogens may cause an individual case's pneumonia), so its length could be longer than the total number of unique -causative agents) -\item{\code{etiology}} a vector of proportions that sum to 100 percent -\item{\code{pathogen_BrS}} a vector of putative causative agents' names measured in bronze-standard (BrS) data. 
-This function simulates only one slice defined by {specimen}{test}{pathogen} -\item{\code{pathogen_SS}} a vector of pathogen names measured in silver-standard (SS) data. -\item{\code{meas_nm}} a list of {specimen}{test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")} +causative agents)} +\item{\code{etiology}}{ a vector of proportions that sum to 100 percent} +\item{\code{pathogen_BrS}}{ a vector of putative causative agents' names measured in bronze-standard (BrS) data. +This function simulates only one slice defined by \verb{specimen``test``pathogen}} +\item{\code{pathogen_SS}}{ a vector of pathogen names measured in silver-standard (SS) data.} +\item{\code{meas_nm}}{ a list of \verb{specimen``test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")} for nasopharyngeal (NP) specimen tested by polymerase chain reaction (PCR) - \code{NPPCR} and -blood (B) tested by culture (Cx) - \code{BCX} -\item{\code{Lambda}} controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K} -a vector of \code{K} probabilities that sum to 1. -\item{\code{Eta}} a matrix of dimension \code{length(cause_list)} by \code{K}; +blood (B) tested by culture (Cx) - \code{BCX}} +\item{\code{Lambda}}{ controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K} +a vector of \code{K} probabilities that sum to 1.} +\item{\code{Eta}}{ a matrix of dimension \code{length(cause_list)} by \code{K}; each row represents a disease class (among cases); the values in that row are subclass weights \eqn{\eta_1, \eta_2, \ldots, \eta_K} for that disease class, so needs to sum to one. In Wu et al. 2016 (JRSS-C), the subclass weights are the same across disease classes across rows. 
But when simulating data, one can specify rows with distinct subclass weights - it is a matter whether we can recover these parameters (possible when -some cases' true disease classes are observed) -\item{\code{PsiBS/PsiSS}} False positive rates for Bronze-Standard data and +some cases' true disease classes are observed)} +\item{\code{PsiBS/PsiSS}}{ False positive rates for Bronze-Standard data and for Silver-Standard data. For example, the rows of \code{PsiBS} correspond to the dimension of the particular slice of BrS measures, e.g., \code{10} for 10 causative agents measured by NPPCR; the columns correspond to \code{K} subclasses; generically, the dimension is \code{J} by \code{K} -\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures). -\item{\code{ThetaBS/ThetaSS}} True positive rates \eqn{\Theta} for Bronze-Standard data and +\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures).} +\item{\code{ThetaBS/ThetaSS}}{ True positive rates \eqn{\Theta} for Bronze-Standard data and for Silver-Standard data. Dimension is \code{J} by \code{K} (can contain \code{NA} if the total number of causative agents measured by BrS or SS exceeds the measured causative agents in SS. For example, in PERCH study, nasopharyngeal polymerase chain reaction (NPPCR; bronze-standard) may target 30 distinct pathogens, but blood culture (BCX; silver-standard) may only target a subset of the 30, -so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX). 
-\item{\code{Nu}} the number of control subjects -\item{\code{Nd}} the number of case subjects +so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX).} +\item{\code{Nu}}{ the number of control subjects} +\item{\code{Nd}}{ the number of case subjects} }} } \value{ A list of diagnostic test measurements, true latent statues: -\itemize{ -\item{\code{data_nplcm}} a list of structured data (see \code{\link[=nplcm]{nplcm()}} for -description). -\item{\code{template}} a matrix: rows for causes (may comprise a single or multiple causative agents), +\describe{ +\item{\code{data_nplcm}}{ a list of structured data (see \code{\link[=nplcm]{nplcm()}} for +description). } +\item{\code{template}}{ a matrix: rows for causes (may comprise a single or multiple causative agents), columns for measurements; generated as a lookup table to match disease-class specific -parameters (true and false positive rates) -\item{\code{latent_cat}} integer values to indicate the latent category. The integer +parameters (true and false positive rates)} +\item{\code{latent_cat}}{ integer values to indicate the latent category. The integer code corresponds to the order specified in \code{set_parameter$etiology}. -Controls are coded as \code{length(set_parameter$etiology)+1}.) 
+Controls are coded as \code{length(set_parameter$etiology)+1}.)} } } \description{ diff --git a/man/simulate_ss.Rd b/man/simulate_ss.Rd index eef193bd..dda2afdb 100644 --- a/man/simulate_ss.Rd +++ b/man/simulate_ss.Rd @@ -8,39 +8,39 @@ simulate_ss(set_parameter, latent_samples) } \arguments{ \item{set_parameter}{True model parameters in an npLCM specification: -\itemize{ -\item{\code{cause_list}} a vector of disease class names among cases (since +\describe{ +\item{\code{cause_list}}{ a vector of disease class names among cases (since the causes could be multi-agent (e.g., multiple pathogens may cause an individual case's pneumonia), so its length could be longer than the total number of unique -causative agents) -\item{\code{etiology}} a vector of proportions that sum to 100 percent -\item{\code{pathogen_BrS}} a vector of putative causative agents' names measured in bronze-standard (BrS) data. -This function simulates only one slice defined by {specimen}{test}{pathogen} -\item{\code{pathogen_SS}} a vector of pathogen names measured in silver-standard (SS) data. -\item{\code{meas_nm}} a list of {specimen}{test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")} +causative agents)} +\item{\code{etiology}}{ a vector of proportions that sum to 100 percent} +\item{\code{pathogen_BrS}}{ a vector of putative causative agents' names measured in bronze-standard (BrS) data. +This function simulates only one slice defined by \verb{specimen``test``pathogen}} +\item{\code{pathogen_SS}}{ a vector of pathogen names measured in silver-standard (SS) data.} +\item{\code{meas_nm}}{ a list of \verb{specimen``test} names e.g., \code{list(MBS = c("NPPCR"),MSS="BCX")} for nasopharyngeal (NP) specimen tested by polymerase chain reaction (PCR) - \code{NPPCR} and -blood (B) tested by culture (Cx) - \code{BCX} -\item{\code{Lambda}} controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K} -a vector of \code{K} probabilities that sum to 1. 
-\item{\code{Eta}} a matrix of dimension \code{length(cause_list)} by \code{K}; +blood (B) tested by culture (Cx) - \code{BCX}} +\item{\code{Lambda}}{ controls' subclass weights \eqn{\nu_1, \nu_2, \ldots, \nu_K} +a vector of \code{K} probabilities that sum to 1.} +\item{\code{Eta}}{ a matrix of dimension \code{length(cause_list)} by \code{K}; each row represents a disease class (among cases); the values in that row are subclass weights \eqn{\eta_1, \eta_2, \ldots, \eta_K} for that disease class, so needs to sum to one. In Wu et al. 2016 (JRSS-C), the subclass weights are the same across disease classes across rows. But when simulating data, one can specify rows with distinct subclass weights - it is a matter whether we can recover these parameters (possible when -some cases' true disease classes are observed) -\item{\code{PsiBS/PsiSS}} False positive rates for Bronze-Standard data and +some cases' true disease classes are observed)} +\item{\code{PsiBS/PsiSS}}{ False positive rates for Bronze-Standard data and for Silver-Standard data. For example, the rows of \code{PsiBS} correspond to the dimension of the particular slice of BrS measures, e.g., \code{10} for 10 causative agents measured by NPPCR; the columns correspond to \code{K} subclasses; generically, the dimension is \code{J} by \code{K} -\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures). -\item{\code{ThetaBS/ThetaSS}} True positive rates \eqn{\Theta} for Bronze-Standard data and +\code{PsiSS} is supposed to be a vector of all zeros (perfect specificity in silver-standard measures).} +\item{\code{ThetaBS/ThetaSS}}{ True positive rates \eqn{\Theta} for Bronze-Standard data and for Silver-Standard data. Dimension is \code{J} by \code{K} (can contain \code{NA} if the total number of causative agents measured by BrS or SS exceeds the measured causative agents in SS. 
For example, in PERCH study, nasopharyngeal polymerase chain reaction (NPPCR; bronze-standard) may target 30 distinct pathogens, but blood culture (BCX; silver-standard) may only target a subset of the 30, -so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX). -\item{\code{Nu}} the number of control subjects -\item{\code{Nd}} the number of case subjects +so we have to specify \code{NA} in \code{ThetaSS}for those pathogens not targeted by BCX).} +\item{\code{Nu}}{ the number of control subjects} +\item{\code{Nd}}{ the number of case subjects} }} \item{latent_samples}{simulated latent status for all the subjects,