From 9d4944eaa442cc0dea01922c4c29790674746368 Mon Sep 17 00:00:00 2001
From: Ramiro Magno
Date: Wed, 29 Nov 2023 15:41:13 +0000
Subject: [PATCH] Closes #10 add function `create_iso8601()` (#21)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* clean up dummy test

* add `dtc_formats` data set

* update .Rbuildignore

* add tibble support for automatic pretty printing of tibbles

* add `create_iso8601()` (closes #10)

* clean up `lintr::lint_package()` issues

* Automatic renv profile update.

* Automatic renv profile update.

* Fix typos in R/dtc_utils.R

Co-authored-by: edgar-manukyan

* remove `dummy()` function

* remove `.onLoad()` function

This function was likely added as part of an automatic setup of the R package as a whole but I guess we should add the `.onLoad()` if really needed.

* Remove the `is_dtc_fmt()` function

Initially I thought of calling this function from within `assert_dtc_fmt()` but I think now that the current usage of `rlang::arg_match()` leads to more concise code, so this is preferred.

* Import `.data` from rlang globally

Import `.data` from rlang globally by using the R package level documentation (https://roxygen2.r-lib.org/articles/rd-other.html?q=_PACKAGE#packages).

* Update WORDLIST

* Update `assert_capture_matrix()` and `complete_capture_matrix()` docs

* Add `coalesce_capture_matrices()` doc

* Fix typo in `assert_dtc_fmt()` doc

* Add `regex_or()` doc

* Add `fmt_rg()` doc

* Add `fmt_c()` doc

* Add `parse_dttm_fmt()` doc

* Fix doc of `parse_dttm_fmt()`

* Add `dttm_fmt_to_regex()` doc

* Bump development version to 0.0.0.9001

* Style updates

Style updates on R/dtc_create_iso8601.R, R/dtc_parse_dttm.R, R/dtc_utils.R. Mostly indentation corrections, wrapping single line body if-conditions in braces, white space removal.

* Style update to tests/testthat/test-yy_to_yyyy.R

* Style update

Style updates on tests/testthat/test-iso8601.R and tests/testthat/test-reg_matches.R

* Blank lines removal

* Style update

* Update docs after style update

* Refactor code about parsing dttm formats

Made `dttm_fmt_to_regex()` interface more intuitive by accepting directly the argument `fmt` instead of `tbl_fmt_c` which was an intermediate R object returned by `parse_dttm_fmt()`. Also, introduced unit tests for `parse_dttm_fmt_()`.

* Make `parse_dttm_fmt()` handle the case of no matching format components

* Use `fmt_dttmc()` in unit tests

* Small clarification on unit test description

* Remove futile assertion from `assert_dtc_fmt()`

* Add staged_dependencies for admiraldev (#26)

* Add staged_dependencies for admiraldev

* Add new line

* Fix admiraldev links.

* Fix admiraldev articles links.

* Remove R 4.1 as it is causing dependency issues. We want to use purrr >= 1.0.0

* Test latest lintr

* Test lintr with install package locally

* Add install package variable for lintr

* Skip multi version pkgdown workflow.

* R build ignore staged_dependencies.yaml

* Automatic renv profile update.

* Automatic renv profile update.

* Cleaned up lintr issues

* Export `fmt_cmp()` and add early draft of `create_iso8601()` article

* Update `create_iso8601()` article

* Link `create_iso8601()` doc to article "iso_8601"

* Add RM as author to DESCRIPTION

* Fix author role of RM

* Fix indentation at `fmt_cmp()` source

* Remove `.check_format` from examples and add an example with `fmt_cmp()`

* Add an example to `create_iso8601()` involving alternative formats and unk values

* Add example to `create_iso8601()` about the interplay of `.format` and `.fmt_c`

* Update common.yml

* Update style

* Change "oak" to "sdtm.oak" in DESCRIPTION

* Change "oak" to "sdtm.oak" in README

---------

Co-authored-by: ramiromagno
Co-authored-by: edgar-manukyan
Co-authored-by: Adam Foryś
---
 .Rbuildignore | 3 +
 .Rprofile | 2 +-
 .devcontainer/4.1/devcontainer.json | 95 --
 .github/workflows/common.yml | 8 +-
 .github/workflows/r-renv-lock.yml | 1 -
 DESCRIPTION | 21 +-
 NAMESPACE | 5 +-
 NEWS.md | 6 +-
 R/dtc_create_iso8601.R | 430 +++++++++
 R/dtc_formats.R | 14 +
 R/dtc_parse_dttm.R | 119 +++
 R/dtc_utils.R | 203 +++++
 R/package.R | 24 -
 R/parse_dttm_fmt.R | 454 ++++++++++
 R/sdtm.oak-package.R | 8 +
 README.Rmd | 4 +-
 README.md | 8 +-
 data-raw/dtc_formats.R | 27 +
 data/dtc_formats.rda | Bin 0 -> 440 bytes
 inst/WORDLIST | 5 +
 man/assert_capture_matrix.Rd | 43 +
 man/assert_dtc_fmt.Rd | 26 +
 man/assert_dtc_format.Rd | 37 +
 man/coalesce_capture_matrices.Rd | 40 +
 man/complete_capture_matrix.Rd | 36 +
 man/create_iso8601.Rd | 101 +++
 man/dtc_formats.Rd | 26 +
 man/dttm_fmt_to_regex.Rd | 43 +
 man/find_int_gap.Rd | 31 +
 man/fmt_cmp.Rd | 43 +
 man/fmt_rg.Rd | 76 ++
 man/format_iso8601.Rd | 45 +
 man/iso8601_mon.Rd | 31 +
 man/iso8601_na.Rd | 22 +
 man/iso8601_sec.Rd | 22 +
 man/iso8601_truncate.Rd | 50 +
 man/iso8601_two_digits.Rd | 25 +
 man/iso8601_year.Rd | 35 +
 man/months_abb_regex.Rd | 24 +
 man/parse_dttm.Rd | 91 ++
 man/parse_dttm_fmt.Rd | 65 ++
 man/pseq.Rd | 22 +
 man/reg_matches.Rd | 25 +
 man/regex_or.Rd | 32 +
 man/sdtm.oak-package.Rd | 37 +
 man/sdtm.oak.Rd | 12 -
 man/str_to_anycase.Rd | 19 +
 man/yy_to_yyyy.Rd | 35 +
 man/zero_pad_whole_number.Rd | 30 +
 renv.lock | 44 +-
 renv/profiles/4.1/renv.lock | 1254 --------------------------
 renv/profiles/4.1/renv/.gitignore | 7 -
 renv/profiles/4.1/renv/settings.json | 21 -
 renv/profiles/4.2/renv.lock | 43 +-
 renv/profiles/4.3/renv.lock | 44 +-
 staged_dependencies.yaml | 11 +
 tests/testthat/test-create_iso8601.R | 75 ++
 tests/testthat/test-find_int_gap.R | 43 +
 tests/testthat/test-format_iso8601.R | 24 +
 tests/testthat/test-iso8601.R | 44 +
 tests/testthat/test-onload.R | 3 -
 tests/testthat/test-parse_dttm.R | 53 ++
 tests/testthat/test-parse_dttm_fmt.R | 131 +++
 tests/testthat/test-pseq.R | 7 +
 tests/testthat/test-reg_matches.R | 8 +
 tests/testthat/test-str_to_anycase.R | 5 +
 tests/testthat/test-yy_to_yyyy.R | 28 +
 vignettes/.gitignore | 2 +
 vignettes/articles/iso_8601.Rmd | 254 ++++++
 69 files changed, 3215 insertions(+), 1447 deletions(-)
 delete mode 100644 .devcontainer/4.1/devcontainer.json
 create mode 100644 R/dtc_create_iso8601.R
 create mode 100644 R/dtc_formats.R
 create mode 100644 R/dtc_parse_dttm.R
 create mode 100644 R/dtc_utils.R
 delete mode 100644 R/package.R
 create mode 100644 R/parse_dttm_fmt.R
 create mode 100644 R/sdtm.oak-package.R
 create mode 100644 data-raw/dtc_formats.R
 create mode 100644 data/dtc_formats.rda
 create mode 100644 man/assert_capture_matrix.Rd
 create mode 100644 man/assert_dtc_fmt.Rd
 create mode 100644 man/assert_dtc_format.Rd
 create mode 100644 man/coalesce_capture_matrices.Rd
 create mode 100644 man/complete_capture_matrix.Rd
 create mode 100644 man/create_iso8601.Rd
 create mode 100644 man/dtc_formats.Rd
 create mode 100644 man/dttm_fmt_to_regex.Rd
 create mode 100644 man/find_int_gap.Rd
 create mode 100644 man/fmt_cmp.Rd
 create mode 100644 man/fmt_rg.Rd
 create mode 100644 man/format_iso8601.Rd
 create mode 100644 man/iso8601_mon.Rd
 create mode 100644 man/iso8601_na.Rd
 create mode 100644 man/iso8601_sec.Rd
 create mode 100644
man/iso8601_truncate.Rd create mode 100644 man/iso8601_two_digits.Rd create mode 100644 man/iso8601_year.Rd create mode 100644 man/months_abb_regex.Rd create mode 100644 man/parse_dttm.Rd create mode 100644 man/parse_dttm_fmt.Rd create mode 100644 man/pseq.Rd create mode 100644 man/reg_matches.Rd create mode 100644 man/regex_or.Rd create mode 100644 man/sdtm.oak-package.Rd delete mode 100644 man/sdtm.oak.Rd create mode 100644 man/str_to_anycase.Rd create mode 100644 man/yy_to_yyyy.Rd create mode 100644 man/zero_pad_whole_number.Rd delete mode 100644 renv/profiles/4.1/renv.lock delete mode 100644 renv/profiles/4.1/renv/.gitignore delete mode 100644 renv/profiles/4.1/renv/settings.json create mode 100644 staged_dependencies.yaml create mode 100644 tests/testthat/test-create_iso8601.R create mode 100644 tests/testthat/test-find_int_gap.R create mode 100644 tests/testthat/test-format_iso8601.R create mode 100644 tests/testthat/test-iso8601.R delete mode 100644 tests/testthat/test-onload.R create mode 100644 tests/testthat/test-parse_dttm.R create mode 100644 tests/testthat/test-parse_dttm_fmt.R create mode 100644 tests/testthat/test-pseq.R create mode 100644 tests/testthat/test-reg_matches.R create mode 100644 tests/testthat/test-str_to_anycase.R create mode 100644 tests/testthat/test-yy_to_yyyy.R create mode 100644 vignettes/.gitignore create mode 100644 vignettes/articles/iso_8601.Rmd diff --git a/.Rbuildignore b/.Rbuildignore index 24e5e90..c8038ef 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -11,3 +11,6 @@ ^pkgdown$ ^LICENSE\.md$ ^\.lintr$ +^data-raw$ +^staged_dependencies.yaml$ +^vignettes/articles$ diff --git a/.Rprofile b/.Rprofile index a94c4d6..062e327 100644 --- a/.Rprofile +++ b/.Rprofile @@ -27,7 +27,7 @@ Sys.setenv("RENV_CONFIG_AUTO_SNAPSHOT" = FALSE) if (!Sys.getenv("RENV_AUTOLOADER_ENABLED") %in% c("false", "FALSE")) { .renv_profile <- paste(R.version$major, substr(R.version$minor, 1, 1), sep = ".") if (!file.exists("./renv/profile")) { - if 
(.renv_profile %in% c("4.1", "4.2", "4.3")) { + if (.renv_profile %in% c("4.2", "4.3")) { message("Set renv profile to `", .renv_profile, "`") Sys.setenv("RENV_PROFILE" = .renv_profile) } else { diff --git a/.devcontainer/4.1/devcontainer.json b/.devcontainer/4.1/devcontainer.json deleted file mode 100644 index e60b4fd..0000000 --- a/.devcontainer/4.1/devcontainer.json +++ /dev/null @@ -1,95 +0,0 @@ -{ - // https://containers.dev/implementors/json_reference/ - "name": "sdtm.oak (RStudio) container", - "image": "ghcr.io/pharmaverse/sdtm.oak-4.1:latest", - // Install Dev Container Features. More info: https://containers.dev/features - "containerEnv": { - "ROOT": "true", - "PASSWORD": "rstudio", - "DISABLE_AUTH": "true", - "RENV_AUTOLOADER_ENABLED": "false" - }, - "features": { - "./ca-cert": {}, - "ghcr.io/rocker-org/devcontainer-features/r-rig:1": { - "version": "none", - "vscodeRSupport": "full", - "installRadian": true, - "installVscDebugger": true - }, - "ghcr.io/rocker-org/devcontainer-features/renv-cache:latest": {}, - "ghcr.io/devcontainers/features/common-utils:2": { - "installZsh": true, - "configureZshAsDefaultShell": false, - "installOhMyZsh": true, - "username": "rstudio", - "upgradePackages": false - }, - "ghcr.io/mikaello/devcontainer-features/modern-shell-utils:1": {} - }, - "overrideFeatureInstallOrder": [ - "./ca-cert", - "./arm64-repos", - "ghcr.io/devcontainers/features/common-utils", - "ghcr.io/rocker-org/devcontainer-features/renv-cache", - "ghcr.io/rocker-org/devcontainer-features/r-rig", - "ghcr.io/mikaello/devcontainer-features/modern-shell-utils" - ], - "init": true, - "overrideCommand": false, - - "postCreateCommand": "bash ./.devcontainer/postCreateCommand.sh", - - "postAttachCommand": "rstudio || true", - - "customizations": { - "codespaces": { - "repositories": { - "pharmaverse/mint": { - "permissions": "write-all" - }, - "pharmaverse/raw.synthetic.data": { - "permissions": "write-all" - } - } - }, - "vscode": { - "settings": { - 
"r.rterm.linux": "/usr/local/bin/radian", - "r.bracketedPaste": true, - "editor.bracketPairColorization.enabled": true, - "editor.guides.bracketPairs": "active" - }, - "extensions": [ - "vsls-contrib.codetour", - "GitHub.copilot", - "GitHub.copilot-chat", - // R extensions - "ikuyadeu.r", - "REditorSupport.r-lsp", - // Extra extension - "streetsidesoftware.code-spell-checker", - "eamodio.gitlens", - "cweijan.vscode-office", - "donjayamanne.githistory", - "GitHub.vscode-github-actions", - "GitHub.vscode-pull-request-github", - "GitHub.remotehub", - "alefragnani.Bookmarks", - "vscode-icons-team.vscode-icons" - ] - } - }, - - // RStudio ports - "forwardPorts": [8787], - "portsAttributes": { - "8787": { - "label": "Rstudio", - "requireLocalPort": true, - "onAutoForward": "openBrowser" - } - }, - // Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root - "remoteUser": "rstudio" -} diff --git a/.github/workflows/common.yml b/.github/workflows/common.yml index a8c9a5d..de646ff 100644 --- a/.github/workflows/common.yml +++ b/.github/workflows/common.yml @@ -71,17 +71,15 @@ jobs: # Whether to skip multiversion docs # Note that if you have multiple versions of docs, # your URL links are likely to break due to path changes - skip-multiversion-docs: false - latest-tag-alt-name: cran-release - multiversion-docs-landing-page: cran-release - branches-or-tags-to-list: >- - ^cran-release$|^main$|^v([0-9]+\\.)?([0-9]+\\.)?([0-9]+)$ + skip-multiversion-docs: true linter: name: Lint uses: pharmaverse/admiralci/.github/workflows/lintr.yml@main if: github.event_name == 'pull_request' with: r-version: "4.3" + latest-lintr: "true" + install-package: "true" links: name: Links uses: pharmaverse/admiralci/.github/workflows/links.yml@main diff --git a/.github/workflows/r-renv-lock.yml b/.github/workflows/r-renv-lock.yml index fd5015d..d07747e 100644 --- a/.github/workflows/r-renv-lock.yml +++ b/.github/workflows/r-renv-lock.yml @@ -22,7 +22,6 @@ jobs: 
fail-fast: false matrix: config: - - {os: ubuntu-20.04, r: '4.1', repos: 'https://packagemanager.posit.co/cran/2022-03-10/'} - {os: ubuntu-20.04, r: '4.2', repos: 'https://packagemanager.posit.co/cran/2023-03-15/'} - {os: ubuntu-20.04, r: '4.3', repos: 'https://packagemanager.posit.co/cran/2023-04-20/'} diff --git a/DESCRIPTION b/DESCRIPTION index d253c41..7d93b1f 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,10 +1,13 @@ Package: sdtm.oak Type: Package Title: SDTM Data Transformation Engine -Version: 0.0.0.9000 +Version: 0.0.0.9001 Authors@R: c( person("Omar", "Garcia", email = "ogcalderon@cdisc.org", role = c("aut", "cre")), person("Rammprasad", "Ganapathy", role = "aut"), + person("Ramiro", "Magno", email = "rmagno@pattern.institute", + role = "aut", comment = c(ORCID = "0000-0001-5226-3441")), + person("Pattern Institute", role = c("cph", "fnd")), person("F. Hoffmann-La Roche AG", role = c("cph", "fnd")), person("Pfizer Inc", role = c("cph", "fnd")) ) @@ -13,21 +16,25 @@ Description: An EDC and Data Standard agnostic SDTM data transformation engine based on standard mapping algorithms. 
Language: en-US License: Apache License (>= 2) -BugReports: https://github.com/pharmaverse/oak/issues -URL: https://pharmaverse.github.io/oak/, https://github.com/pharmaverse/oak +BugReports: https://github.com/pharmaverse/sdtm.oak/issues +URL: https://pharmaverse.github.io/sdtm.oak/, https://github.com/pharmaverse/sdtm.oak Encoding: UTF-8 LazyData: true Roxygen: list(markdown = TRUE) RoxygenNote: 7.2.3 -Depends: R (>= 4.1) +Depends: R (>= 4.2) Imports: - rlang (>= 1.0.0) + admiraldev, + dplyr (>= 1.0.0), + purrr (>= 1.0.0), + rlang (>= 1.0.0), + stringr, + tibble Suggests: knitr, rmarkdown, spelling, - testthat (>= 3.1.7), - tibble + testthat (>= 3.1.7) VignetteBuilder: knitr Config/testthat/edition: 3 Config/testthat/parallel: true diff --git a/NAMESPACE b/NAMESPACE index b730776..455b538 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -1,3 +1,6 @@ # Generated by roxygen2: do not edit by hand -import(rlang) +export(create_iso8601) +export(fmt_cmp) +importFrom(rlang,.data) +importFrom(tibble,tibble) diff --git a/NEWS.md b/NEWS.md index 5b727d4..7777615 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,3 +1,5 @@ -# sdtm.oak (development version) +# sdtm.oak 0.0.0.9001 (development version) -* Initial CRAN submission. +## New Features + +* New function `create_iso8601()` for conversion of vectors of dates, times or date-times to ISO8601 format. diff --git a/R/dtc_create_iso8601.R b/R/dtc_create_iso8601.R new file mode 100644 index 0000000..9d2908d --- /dev/null +++ b/R/dtc_create_iso8601.R @@ -0,0 +1,430 @@ +# Month abbreviation (en) to numeric month mapping +mon_abb_to_mon_num <- stats::setNames(sprintf("%02d", seq_along(month.abb)), tolower(month.abb)) + +#' Convert NA to `"-"` +#' +#' [iso8601_na()] takes a character vector and converts `NA` values to `"-"`. +#' +#' @param x A character vector. +#' +#' @returns A character vector. 
+#' +#' @examples +#' sdtm.oak:::iso8601_na(c("10", NA_character_)) +#' +#' @keywords internal +iso8601_na <- function(x) { + admiraldev::assert_character_vector(x) + x[is.na(x)] <- "-" + x +} + +#' Convert an integer to a zero-padded character vector +#' +#' [zero_pad_whole_number()] takes non-negative integer values and converts +#' them to character with zero padding. Negative numbers and numbers greater +#' than the width specified by the number of digits `n` are converted to `NA`. +#' +#' @param x An integer vector. +#' @param n Number of digits in the output, including zero padding. +#' +#' @returns A character vector. +#' +#' @examples +#' sdtm.oak:::zero_pad_whole_number(c(-1, 0, 1)) +#' +#' sdtm.oak:::zero_pad_whole_number(c(-1, 0, 1, 10, 99, 100), n = 2) +#' +#' sdtm.oak:::zero_pad_whole_number(c(-1, 0, 1, 10, 99, 100), n = 3) +#' +#' @keywords internal +zero_pad_whole_number <- function(x, n = 2L) { + # Check `x` + if (!rlang::is_integerish(x)) rlang::abort("`x` must be integerish.") + + # Check `n` + admiraldev::assert_integer_scalar(n) + if (n < 1L) rlang::abort("`n` must be positive.") + + # Negative numbers are not allowed, and hence get converted to NA. + x[x < 0L] <- NA_integer_ + + # Numbers that do not fit within the padding width are converted to NA + x[floor(log10(x)) >= n] <- NA_integer_ + + fmt <- paste0("%0", n, "d") + y <- sprintf(fmt, x) + y[is.na(x)] <- NA_character_ + y +} + +#' Convert two-digit to four-digit years +#' +#' [yy_to_yyyy()] converts two-digit years to four-digit years. +#' +#' @param x An integer vector of years. +#' @param cutoff_2000 An integer value. Two-digit years smaller or equal to +#' `cutoff_2000` are parsed as though starting with `20`, otherwise parsed as +#' though starting with `19`. +#' +#' @returns An integer vector. 
+#' +#' @examples +#' sdtm.oak:::yy_to_yyyy(0:5) +#' sdtm.oak:::yy_to_yyyy(2000:2005) +#' +#' sdtm.oak:::yy_to_yyyy(90:99) +#' sdtm.oak:::yy_to_yyyy(1990:1999) +#' +#' # NB: change in behavior after 68 +#' sdtm.oak:::yy_to_yyyy(65:72) +#' +#' sdtm.oak:::yy_to_yyyy(1965:1972) +#' +#' @keywords internal +yy_to_yyyy <- function(x, cutoff_2000 = 68L) { + # Check `x` + if (!rlang::is_integerish(x)) rlang::abort("`x` must be integerish.") + + if (any(x < 0L, na.rm = TRUE)) { + rlang::abort("`x` cannot have negative years.") + } + + x <- dplyr::if_else(x <= cutoff_2000, x + 2000L, x) + x <- dplyr::if_else(x <= 99L, x + 1900L, x) + x +} + +#' Format as a ISO8601 two-digit number +#' +#' [iso8601_two_digits()] converts a single digit or two digit number into a +#' two digit, 0-padded, number. Failing to parse the input as a two digit number +#' results in `NA`. +#' +#' @param x A character vector. +#' +#' @returns A character vector of the same size as `x`. +#' +#' @examples +#' x <- c("0", "00", "1", "01", "42", "100", NA_character_, "1.") +#' sdtm.oak:::iso8601_two_digits(x) +#' +#' @keywords internal +iso8601_two_digits <- function(x) { + admiraldev::assert_character_vector(x) + x_int <- as.integer(stringr::str_match(x, "^\\d?\\d$")) + zero_pad_whole_number(x_int, n = 2L) +} + +iso8601_mday <- iso8601_two_digits +iso8601_hour <- iso8601_two_digits +iso8601_min <- iso8601_two_digits + +#' Format as a ISO8601 four-digit year +#' +#' [iso8601_year()] converts a character vector whose values represent years to +#' four-digit years. +#' +#' @param x A character vector. +#' @param cutoff_2000 A non-negative integer value. Two-digit years smaller or +#' equal to `cutoff_2000` are parsed as though starting with `20`, otherwise +#' parsed as though starting with `19`. +#' +#' @returns A character vector. +#' +#' @examples +#' sdtm.oak:::iso8601_year(c("0", "1", "2", "50", "68", "69", "90", "99", "00")) +#' +#' # Be default, `cutoff_2000` is at 68. 
+#' sdtm.oak:::iso8601_year(c("67", "68", "69", "70")) +#' sdtm.oak:::iso8601_year(c("1967", "1968", "1969", "1970")) +#' +#' # Change it to something else, e.g. `cutoff_2000 = 25`. +#' sdtm.oak:::iso8601_year(as.character(0:50), cutoff_2000 = 25) +#' sdtm.oak:::iso8601_year(as.character(1900:1950), cutoff_2000 = 25) +#' +#' @keywords internal +iso8601_year <- function(x, cutoff_2000 = 68L) { + admiraldev::assert_character_vector(x) + admiraldev::assert_integer_scalar(cutoff_2000, subset = "non-negative") + x_int <- as.integer(stringr::str_match(x, "^\\d{1,4}$")) + x_int <- yy_to_yyyy(x_int, cutoff_2000 = cutoff_2000) + zero_pad_whole_number(x_int, n = 4L) +} + +#' Format as a ISO8601 month +#' +#' [iso8601_mon()] converts a character vector whose values represent numeric +#' or abbreviated month names to zero-padded numeric months. +#' +#' @param x A character vector. +#' +#' @returns A character vector. +#' +#' @examples +#' sdtm.oak:::iso8601_mon(c(NA, "0", "1", "2", "10", "11", "12")) +#' +#' # No semantic validation is performed on the numeric months, so `"13"` stays +#' # `"13"` but representations that can't be represented as two-digit numbers +#' # become `NA`. +#' sdtm.oak:::iso8601_mon(c("13", "99", "100", "-1")) +#' +#' (mon <- month.abb) +#' sdtm.oak:::iso8601_mon(mon) +#' +#' @keywords internal +iso8601_mon <- function(x) { + x <- tolower(x) + num_mon <- mon_abb_to_mon_num[x] + num_mon_chr <- num_mon + num_mon_chr[is.na(num_mon)] <- iso8601_two_digits(x[is.na(num_mon)]) + mon_int <- as.integer(num_mon_chr) + zero_pad_whole_number(mon_int, n = 2L) +} + +#' Format as ISO8601 seconds +#' +#' [iso8601_sec()] converts a character vector whose values represent seconds. +#' +#' @param x A character vector. +#' +#' @returns A character vector. 
+#' +#' @examples +#' sdtm.oak:::iso8601_sec(c(NA, "0", "1", "10", "59", "99", "100")) +#' +#' @keywords internal +iso8601_sec <- function(x) { + x_iso8601 <- stringr::str_extract(x, "^\\d?\\d(\\.\\d*)?$") + x_iso8601 <- stringr::str_replace(x_iso8601, "^\\d(\\.\\d*)?$", "0\\0") + x_iso8601 <- stringr::str_replace(x_iso8601, "(\\.[^0]*)(0*)$", "\\1") + x_iso8601 <- stringr::str_remove(x_iso8601, "\\.$") + x_iso8601[is.na(x_iso8601)] <- NA_character_ + x_iso8601 +} + +#' Truncate a partial ISO8601 date-time +#' +#' [iso8601_truncate()] converts a character vector of ISO8601 dates, times or +#' date-times that might be partial and truncates the format by removing those +#' missing components. +#' +#' @param x A character vector. +#' +#' @returns A character vector. +#' +#' @examples +#' x <- +#' c( +#' "1999-01-01T15:20:01", +#' "1999-01-01T15:20:-", +#' "1999-01-01T15:-:-", +#' "1999-01-01T-:-:-", +#' "1999-01--T-:-:-", +#' "1999----T-:-:-", +#' "-----T-:-:-" +#' ) +#' +#' sdtm.oak:::iso8601_truncate(x) +#' +#' # With `empty_as_na = FALSE` empty strings are not replaced with `NA` +#' sdtm.oak:::iso8601_truncate("-----T-:-:-", empty_as_na = TRUE) +#' sdtm.oak:::iso8601_truncate("-----T-:-:-", empty_as_na = FALSE) +#' +#' # Truncation only happens if missing components are the right most end, +#' # otherwise they remain unaltered. +#' sdtm.oak:::iso8601_truncate( +#' c( +#' "1999----T15:20:01", +#' "1999-01-01T-:20:01", +#' "1999-01-01T-:-:01", +#' "1999-01-01T-:-:-" +#' ) +#' ) +#' +#' @keywords internal +iso8601_truncate <- function(x, empty_as_na = TRUE) { + x <- stringr::str_remove(x, "[^\\d]*$") + if (empty_as_na) x[x == ""] <- NA_character_ + x +} + +#' Convert date/time components into ISO8601 format +#' +#' [format_iso8601()] takes a character matrix of date/time components and +#' converts each component to ISO8601 format. In practice this entails +#' converting years to a four digit number, and month, day, hours, minutes and +#' seconds to two-digit numbers. 
Not available (`NA`) components are converted +#' to `"-"`. +#' +#' @param m A character matrix of date/time components. It must have six +#' named columns: `year`, `mon`, `mday`, `hour`, `min` and `sec`. +#' @param .cutoff_2000 An integer value. Two-digit years smaller or equal to +#' `.cutoff_2000` are parsed as though starting with `20`, otherwise parsed as +#' though starting with `19`. +#' +#' @returns A character vector with date-times following the ISO8601 format. +#' +#' @examples +#' cols <- c("year", "mon", "mday", "hour", "min", "sec") +#' m <- matrix( +#' c( +#' "99", "00", "01", +#' "Jan", "feb", "03", +#' "1", "01", "31", +#' "00", "12", "23", +#' "00", "59", "10", +#' "42", "5.15", NA +#' ), +#' ncol = 6, +#' dimnames = list(c(), cols) +#' ) +#' +#' sdtm.oak:::format_iso8601(m) +#' +#' @keywords internal +format_iso8601 <- function(m, .cutoff_2000 = 68L) { + admiraldev::assert_integer_scalar(.cutoff_2000) + + m[, "year"] <- iso8601_year(m[, "year"], cutoff_2000 = .cutoff_2000) + m[, "mon"] <- iso8601_mon(m[, "mon"]) + m[, "mday"] <- iso8601_mday(m[, "mday"]) + m[, "hour"] <- iso8601_hour(m[, "hour"]) + m[, "min"] <- iso8601_min(m[, "min"]) + m[, "sec"] <- iso8601_sec(m[, "sec"]) + + m <- iso8601_na(m) + + x <- + paste0( + m[, "year"], + "-", + m[, "mon"], + "-", + m[, "mday"], + "T", + m[, "hour"], + ":", + m[, "min"], + ":", + m[, "sec"] + ) + + iso8601_truncate(x) +} + +#' Convert date or time collected values to ISO 8601 +#' +#' [create_iso8601()] converts vectors of dates, times or date-times to [ISO +#' 8601](https://en.wikipedia.org/wiki/ISO_8601) format. Learn more in +#' `vignette("iso_8601")`. +#' +#' @param ... Character vectors of dates, times or date-times' components. +#' @param .format Parsing format(s). Either a character vector or a list of +#' character vectors. If a character vector is passed then each element is +#' taken as parsing format for each vector passed in `...`. 
If a list is +#' provided, then each element must be a character vector of formats. The +#' first vector of formats is used for parsing the first vector passed in +#' `...`, and so on. +#' @param .fmt_c A list of regexps to use when parsing `.format`. Use [fmt_cmp()] +#' to create such an object to pass as argument to this parameter. +#' @param .na A character vector of string literals to be regarded as missing +#' values during parsing. +#' @param .cutoff_2000 An integer value. Two-digit years smaller or equal to +#' `.cutoff_2000` are parsed as though starting with `20`, otherwise parsed as +#' though starting with `19`. +#' @param .check_format Whether to check the formats passed in `.format`, +#' meaning to check against a selection of validated formats in +#' [dtc_formats][sdtm.oak::dtc_formats]; or to have a more permissible +#' interpretation of the formats. +#' +#' @examples +#' # Converting dates +#' create_iso8601(c("2020-01-01", "20200102"), .format = "y-m-d") +#' create_iso8601(c("2020-01-01", "20200102"), .format = "ymd") +#' create_iso8601(c("2020-01-01", "20200102"), .format = list(c("y-m-d", "ymd"))) +#' +#' # Two-digit years are supported +#' create_iso8601(c("20-01-01", "200101"), .format = list(c("y-m-d", "ymd"))) +#' +#' # `.cutoff_2000` sets the cutoff for two-digit to four-digit year conversion +#' # Default is at 68. +#' create_iso8601(c("67-01-01", "68-01-01", "69-01-01"), .format = "y-m-d") +#' +#' # Change it to 80. 
+#' create_iso8601(c("79-01-01", "80-01-01", "81-01-01"), .format = "y-m-d", .cutoff_2000 = 80) +#' +#' # Converting times +#' create_iso8601("15:10", .format = "HH:MM") +#' create_iso8601("2:10", .format = "HH:MM") +#' create_iso8601("2:1", .format = "HH:MM") +#' create_iso8601("02:01:56", .format = "HH:MM:SS") +#' create_iso8601("020156.5", .format = "HHMMSS") +#' +#' # Converting date-times +#' create_iso8601("12 NOV 202015:15", .format = "dd mmm yyyyHH:MM") +#' +#' # Indicate allowed missing values to make the parsing pass +#' create_iso8601("U DEC 201914:00", .format = "dd mmm yyyyHH:MM") +#' create_iso8601("U DEC 201914:00", .format = "dd mmm yyyyHH:MM", .na = "U") +#' +#' create_iso8601("NOV 2020", .format = "m y") +#' create_iso8601(c("MAR 2019", "MaR 2020", "mar 2021"), .format = "m y") +#' +#' create_iso8601("2019-04-041045-", .format = "yyyy-mm-ddHHMM-") +#' +#' create_iso8601("20200507null", .format = "ymd(HH:MM:SS)") +#' create_iso8601("20200507null", .format = "ymd((HH:MM:SS)|null)") +#' +#' # Fractional seconds +#' create_iso8601("2019-120602:20:13.1230001", .format = "y-mdH:M:S") +#' +#' # Use different reserved characters in the format specification +#' # Here we change "H" to "x" and "M" to "w", for hour and minute, respectively. 
+#' create_iso8601("14H00M", .format = "HHMM") +#' create_iso8601("14H00M", .format = "xHwM", .fmt_c = fmt_cmp(hour = "x", min = "w")) +#' +#' # Alternative formats with unknown values +#' datetimes <- c("UN UNK 201914:00", "UN JAN 2021") +#' format <- list(c("dd mmm yyyy", "dd mmm yyyyHH:MM")) +#' create_iso8601(datetimes, .format = format, .na = c("UN", "UNK")) +#' +#' # Dates and times may come in many format variations +#' fmt <- "dd MMM yyyy HH nn ss" +#' fmt_cmp <- fmt_cmp(mon = "MMM", min = "nn", sec = "ss") +#' create_iso8601("05 feb 1985 12 55 02", .format = fmt, .fmt_c = fmt_cmp) +#' +#' @export +create_iso8601 <- function(..., .format, .fmt_c = fmt_cmp(), .na = NULL, .cutoff_2000 = 68L, .check_format = FALSE) { + assert_fmt_c(.fmt_c) + + dots <- rlang::dots_list(...) + + if (rlang::is_empty(dots)) { + return(character()) + } + + # Check if all vectors in `dots` are of character type. + if (!identical(unique(sapply(dots, typeof)), "character")) { + rlang::abort("All vectors in `...` must be of type character.") + } + + # Check if all vectors in `dots` are of the same length. + n <- unique(lengths(dots)) + if (!identical(length(n), 1L)) { + rlang::abort("All vectors in `...` must be of the same length.") + } + + if (!identical(length(dots), length(.format))) { + rlang::abort("Number of vectors in `...` should match length of `.format`.") + } + + # Check that the `.format` is either a character vector or a list of + # character vectors, and that each string is one of the possible formats. 
+ if (.check_format) assert_dtc_format(.format) + + cap_matrices <- purrr::map2(dots, .format, ~ parse_dttm(dttm = .x, fmt = .y, na = .na, fmt_c = .fmt_c)) + cap_matrix <- coalesce_capture_matrices(!!!cap_matrices) + + format_iso8601(cap_matrix, .cutoff_2000 = .cutoff_2000) +} diff --git a/R/dtc_formats.R b/R/dtc_formats.R new file mode 100644 index 0000000..af3b0a6 --- /dev/null +++ b/R/dtc_formats.R @@ -0,0 +1,14 @@ +#' Date/time collection formats +#' +#' @format A [tibble][tibble::tibble-package] of `r nrow(dtc_formats)` formats +#' with three variables: +#' \describe{ +#' \item{`fmt`}{Format string.} +#' \item{`type`}{Whether a date, time or date-time.} +#' \item{`description`}{Description of which date-time components are parsed.} +#' } +#' +#' @examples +#' dtc_formats +#' +"dtc_formats" diff --git a/R/dtc_parse_dttm.R b/R/dtc_parse_dttm.R new file mode 100644 index 0000000..2feed78 --- /dev/null +++ b/R/dtc_parse_dttm.R @@ -0,0 +1,119 @@ +#' @rdname parse_dttm +#' @order 2 +parse_dttm_ <- function(dttm, + fmt, + fmt_c = fmt_cmp(), + na = NULL, + sec_na = na, + min_na = na, + hour_na = na, + mday_na = na, + mon_na = na, + year_na = na) { + admiraldev::assert_character_scalar(fmt) + + regex <- + dttm_fmt_to_regex( + fmt, + fmt_regex = fmt_rg( + na = na, + sec_na = sec_na, + min_na = min_na, + hour_na = hour_na, + mday_na = mday_na, + mon_na = mon_na, + year_na = year_na + ), + fmt_c = fmt_c + ) + + m <- stringr::str_match(dttm, regex) + + # Drop matching subgroups (those are unnamed) + m <- m[, colnames(m) != "", drop = FALSE] + + complete_capture_matrix(m) +} + +#' Parse a date, time, or date-time +#' +#' [parse_dttm()] extracts date and time components. [parse_dttm()] wraps around +#' [parse_dttm_()], which is not vectorized over `fmt`. +#' +#' @param dttm A character vector of dates, times or date-times. +#' @param fmt In the case of `parse_dttm()`, a character vector of parsing +#' formats, or a single string format in the case of `parse_dttm_()`. 
When a +#' character vector of formats is passed, each format is attempted in turn +#' with the first parsing result to be successful taking precedence in the +#' final result. The formats in `fmt` can be any strings, however the +#' following characters (or successive repetitions thereof) are reserved in +#' the sense that they are treated in a special way: +#' - `"y"`: parsed as year; +#' - `"m"`: parsed as month; +#' - `"d"`: parsed as day; +#' - `"H"`: parsed as hour; +#' - `"M"`: parsed as minute; +#' - `"S"`: parsed as second. +#' +#' @param na,sec_na,min_na,hour_na,mday_na,mon_na,year_na A character vector of +#' alternative values to allow during matching. This can be used to indicate +#' different forms of missing values to be found during the parsing date-time +#' strings. +#' +#' @returns A character matrix of six columns: `"year"`, `"mon"`, `"mday"`, +#' `"hour"`, `"min"` and `"sec"`. Each row corresponds to an element in +#' `dttm`. Each element of the matrix is the parsed date/time component. 
+#' +#' @examples +#' sdtm.oak:::parse_dttm("2020", "y") +#' sdtm.oak:::parse_dttm("2020-05", "y") +#' +#' sdtm.oak:::parse_dttm("2020-05", "y-m") +#' sdtm.oak:::parse_dttm("2020-05-11", "y-m-d") +#' +#' sdtm.oak:::parse_dttm("2020 05 11", "y m d") +#' sdtm.oak:::parse_dttm("2020 05 11", "y m d") +#' sdtm.oak:::parse_dttm("2020 05 11", "y\\s+m\\s+d") +#' sdtm.oak:::parse_dttm("2020 05 11", "y\\s+m\\s+d") +#' +#' sdtm.oak:::parse_dttm("2020-05-11 11:45", "y-m-d H:M") +#' sdtm.oak:::parse_dttm("2020-05-11 11:45:15.6", "y-m-d H:M:S") +#' +#' sdtm.oak:::parse_dttm(c("2002-05-11 11:45", "-05-11 11:45"), "y-m-d H:M") +#' sdtm.oak:::parse_dttm(c("2002-05-11 11:45", "-05-11 11:45"), "-m-d H:M") +#' sdtm.oak:::parse_dttm(c("2002-05-11 11:45", "-05-11 11:45"), c("y-m-d H:M", "-m-d H:M")) +#' +#' sdtm.oak:::parse_dttm(c("2020-05-18", "2020-UN-18", "2020-UNK-UN"), "y-m-d") +#' sdtm.oak:::parse_dttm(c("2020-05-18", "2020-UN-18", "2020-UNK-UN"), "y-m-d", na = "UN") +#' sdtm.oak:::parse_dttm(c("2020-05-18", "2020-UN-18", "2020-UNK-UN"), "y-m-d", na = c("UN", "UNK")) +#' +#' @keywords internal +parse_dttm <- function(dttm, + fmt, + fmt_c = fmt_cmp(), + na = NULL, + sec_na = na, + min_na = na, + hour_na = na, + mday_na = na, + mon_na = na, + year_na = na) { + lst <- + purrr::map( + fmt, + ~ parse_dttm_( + dttm = dttm, + fmt = .x, + fmt_c = fmt_c, + na = na, + sec_na = sec_na, + min_na = min_na, + hour_na = hour_na, + mday_na = mday_na, + mon_na = mon_na, + year_na = year_na + ) + ) + + coalesce_capture_matrices(!!!lst) +} diff --git a/R/dtc_utils.R b/R/dtc_utils.R new file mode 100644 index 0000000..9336140 --- /dev/null +++ b/R/dtc_utils.R @@ -0,0 +1,203 @@ +#' Assert date time character formats +#' +#' [assert_dtc_fmt()] takes a character vector of date/time formats and checks if +#' the formats are supported, meaning it checks if they are one of the formats +#' listed in column `fmt` of [dtc_formats], failing with an error otherwise. +#' +#' @param fmt A character vector. 
+#' +#' @examples +#' sdtm.oak:::assert_dtc_fmt(c("ymd", "y m d", "dmy", "HM", "H:M:S", "y-m-d H:M:S")) +#' +#' # This example is guarded to avoid throwing errors +#' if (FALSE) { +#' sdtm.oak:::assert_dtc_fmt("y years m months d days") +#' } +#' +#' @keywords internal +assert_dtc_fmt <- function(fmt) { + rlang::arg_match(fmt, + values = sdtm.oak::dtc_formats$fmt, + multiple = TRUE + ) +} + +#' Assert dtc format +#' +#' [assert_dtc_format()] is an internal helper function aiding with the checking +#' of the `.format` parameter of [create_iso8601()]. +#' +#' @param .format The argument of [create_iso8601()]'s `.format` parameter. +#' +#' @returns This function throws an error if `.format` is not either: +#' - A character vector of formats permitted by [assert_dtc_fmt()]; +#' - A list of character vectors of formats permitted by [assert_dtc_fmt()]. +#' +#' Otherwise, it returns `.format` invisibly. +#' +#' @examples +#' sdtm.oak:::assert_dtc_format("ymd") +#' sdtm.oak:::assert_dtc_format(c("ymd", "y-m-d")) +#' sdtm.oak:::assert_dtc_format(list(c("ymd", "y-m-d"), "H:M:S")) +#' +#' # These commands should throw an error +#' if (FALSE) { +#' # Note that `"year, month, day"` is not a supported format. +#' sdtm.oak:::assert_dtc_format("year, month, day") +#' } +#' +#' @keywords internal +assert_dtc_format <- function(.format) { + abort_msg <- "`.format` must be either a character vector of formats or a list thereof." + + switch(typeof(.format), + character = assert_dtc_fmt(.format), + list = purrr::map(.format, assert_dtc_format), + rlang::abort(abort_msg) + ) + + invisible(.format) +} + +#' Assert capture matrix +#' +#' @description +#' +#' [assert_capture_matrix()] is an internal helper function aiding with the +#' checking of an internal R object that contains the parsing results as +#' returned by [parse_dttm()]: capture matrix. 
+#' +#' This function checks that the capture matrix is a matrix and that it contains +#' six columns: `year`, `mon`, `mday`, `hour`, `min` and `sec`. +#' +#' @param m A character matrix. +#' +#' @returns This function throws an error if `m` is not either: +#' - A character matrix; +#' - A matrix whose columns are (at least): `year`, `mon`, `mday`, `hour`, +#' `min` and `sec`. +#' +#' Otherwise, it returns `m` invisibly. +#' +#' @examples +#' cols <- c("year", "mon", "mday", "hour", "min", "sec") +#' m <- matrix(NA_character_, nrow = 1L, ncol = 6L, dimnames = list(NULL, cols)) +#' sdtm.oak:::assert_capture_matrix(m) +#' +#' # These commands should throw an error +#' if (FALSE) { +#' sdtm.oak:::assert_capture_matrix(character()) +#' sdtm.oak:::assert_capture_matrix(matrix(data = NA_character_, nrow = 0, ncol = 0)) +#' sdtm.oak:::assert_capture_matrix(matrix(data = NA_character_, nrow = 1)) +#' } +#' +#' @keywords internal +assert_capture_matrix <- function(m) { + # `m` must be of character type. + admiraldev::assert_character_vector(m) + + if (!is.matrix(m)) { + rlang::abort("`m` must be a matrix.") + } + + col_names <- c("year", "mon", "mday", "hour", "min", "sec") + m_col_names <- colnames(m) + if (is.null(m_col_names) || !all(m_col_names %in% col_names)) { + rlang::abort("`m` must have the following colnames: `year`, `mon`, `mday`, `hour`, `min` and `sec`.") + } + + invisible(m) +} + +#' Complete a capture matrix +#' +#' [complete_capture_matrix()] completes the missing, if any, columns of the +#' capture matrix. +#' +#' @param m A character matrix that might be missing one or more of the +#' following columns: `year`, `mon`, `mday`, `hour`, `min` or `sec`. +#' +#' @returns A character matrix that contains the columns `year`, `mon`, `mday`, +#' `hour`, `min` and `sec`. Any other existing columns are dropped. 
+#' +#' @examples +#' sdtm.oak:::complete_capture_matrix(matrix(data = NA_character_, nrow = 0, ncol = 0)) +#' sdtm.oak:::complete_capture_matrix(matrix(data = NA_character_, nrow = 1)) +#' +#' # m <- matrix(NA_character_, nrow = 1, ncol = 2, dimnames = list(NULL, c("year", "sec"))) +#' # sdtm.oak:::complete_capture_matrix(m) +#' +#' # m <- matrix(c("2020", "10"), nrow = 1, ncol = 2, dimnames = list(NULL, c("year", "sec"))) +#' # sdtm.oak:::complete_capture_matrix(m) +#' +#' # Any other existing columns are dropped. +#' # m <- matrix(c("2020", "10"), nrow = 1, ncol = 2, dimnames = list(NULL, c("semester", "quarter"))) +#' # sdtm.oak:::complete_capture_matrix(m) +#' +#' @keywords internal +complete_capture_matrix <- + function(m) { + col_names <- c("year", "mon", "mday", "hour", "min", "sec") + + if (setequal(col_names, colnames(m))) { + return(m) + } + + miss_cols <- setdiff(col_names, colnames(m)) + miss_n_cols <- length(miss_cols) + + m2 <- matrix(nrow = nrow(m), ncol = miss_n_cols) + colnames(m2) <- miss_cols + + m3 <- cbind(m, m2)[, col_names, drop = FALSE] + assert_capture_matrix(m3) + } + +#' Coalesce capture matrices +#' +#' [coalesce_capture_matrices()] combines several capture matrices into one. +#' Each argument of `...` should be a capture matrix in the sense of the output +#' by [complete_capture_matrix()], meaning a character matrix of six columns +#' whose names are: `year`, `mon`, `mday`, `hour`, `min` or `sec`. +#' +#' @param ... A sequence of capture matrices. +#' +#' @returns A single capture matrix whose values have been coalesced in the +#' sense of [coalesce()][dplyr::coalesce]. 
+#' +#' @examples +#' cols <- c("year", "mon", "mday", "hour", "min", "sec") +#' dates <- c("2020", "01", "01", "20", NA, NA) +#' times <- c(NA, NA, NA, "10", "00", "05") +#' m_dates <- matrix(dates, nrow = 1L, ncol = 6L, dimnames = list(NULL, cols)) +#' m_times <- matrix(times, nrow = 1L, ncol = 6L, dimnames = list(NULL, cols)) +#' +#' # Note how the hour "20" takes precedence over "10" +#' sdtm.oak:::coalesce_capture_matrices(m_dates, m_times) +#' +#' # Reverse the order of the inputs and now hour "10" takes precedence +#' sdtm.oak:::coalesce_capture_matrices(m_times, m_dates) +#' +#' # Single inputs should result in the same output as the input +#' sdtm.oak:::coalesce_capture_matrices(m_dates) +#' sdtm.oak:::coalesce_capture_matrices(m_times) +#' +#' @keywords internal +coalesce_capture_matrices <- function(...) { + dots <- rlang::list2(...) + + if (rlang::is_empty(dots)) { + rlang::abort("At least one input must be passed.") + } + + # Assert that every argument in `...` is a capture matrix + purrr::walk(dots, assert_capture_matrix) + + # `as.vector` needed because of: https://github.com/tidyverse/dplyr/issues/6954 + vecs <- purrr::map(dots, as.vector) + vec <- dplyr::coalesce(!!!vecs) + m <- matrix(vec, ncol = 6L) + colnames(m) <- c("year", "mon", "mday", "hour", "min", "sec") + + m +} diff --git a/R/package.R b/R/package.R deleted file mode 100644 index d34d384..0000000 --- a/R/package.R +++ /dev/null @@ -1,24 +0,0 @@ -#' An EDC and Data Standard agnostic SDTM data transformation engine that automates -#' the transformation of raw clinical data in ODM format to SDTM based on standard -#' mapping algorithms -#' -#' @name sdtm.oak -#' -#' @import rlang -NULL - -#' onLoad function -#' -#' This function is called automatically during package loading. 
-#' -#' @param libname lib name -#' @param pkgname package name -#' @noRd -.onLoad <- function(libname, pkgname) { # nolint -} - -#' Temporary dummy function -#' @noRd -dummy <- function() { - rlang::as_list -} diff --git a/R/parse_dttm_fmt.R b/R/parse_dttm_fmt.R new file mode 100644 index 0000000..eeb1e37 --- /dev/null +++ b/R/parse_dttm_fmt.R @@ -0,0 +1,454 @@ +#' Find gap intervals in integer sequences +#' +#' [find_int_gap()] determines the `start` and `end` positions for gap intervals +#' in a sequence of integers. By default, the interval range to look for gaps is +#' defined by the minimum and maximum values of `x`; specify `xmin` and `xmax` +#' to change the range explicitly. +#' +#' @param x An integer vector. +#' @param xmin Left endpoint integer value. +#' @param xmax Right endpoint integer value. +#' +#' @returns A [tibble][tibble::tibble-package] of gap intervals of two columns: +#' - `start`: left endpoint +#' - `end`: right endpoint +#' If no gap intervals are found then an empty [tibble][tibble::tibble-package] +#' is returned. +#' +#' @keywords internal +find_int_gap <- function(x, xmin = min(x), xmax = max(x)) { + if (!rlang::is_integerish(x)) { + rlang::abort("`x` must be integer-ish") + } + + if (rlang::is_empty(x)) { + return(tibble::tibble(start = integer(), end = integer())) + } + + admiraldev::assert_integer_scalar(xmin) + admiraldev::assert_integer_scalar(xmax) + + x <- sort(unique(x)) + x <- c(xmin - 1L, x, xmax + 1L) + gaps <- which(diff(x) > 1L) + start <- x[gaps] + 1L + end <- x[gaps + 1L] - 1L + tibble::tibble(start = start, end = end) +} + +#' `regmatches()` with `NA` +#' +#' [reg_matches()] is a thin wrapper around [regmatches()] that returns +#' `NA` instead of `character(0)` when matching fails. +#' +#' @param x A character vector. +#' @param m An object with match data. +#' @param invert A logical scalar. If `TRUE`, extract or replace the non-matched +#' substrings. 
+#' +#' @returns A list of character vectors with the matched substrings, or `NA` if +#' matching failed. +#' +#' @keywords internal +reg_matches <- function(x, m, invert = FALSE) { + match <- regmatches(x, m, invert = invert) + match[!lengths(match)] <- NA_character_ + match +} + +#' Parallel sequence generation +#' +#' [pseq()] is similar to [seq()] but conveniently accepts integer vectors as +#' inputs to `from` and `to`, allowing for parallel generation of sequences. +#' The result is the concatenation of the generated sequences. +#' +#' @param from An integer vector. The starting value(s) of the sequence(s). +#' @param to An integer vector. The ending value(s) of the sequence(s). +#' +#' @returns An integer vector. +#' +#' @keywords internal +pseq <- function(from, to) { + unlist(purrr::map2(.x = from, .y = to, .f = `:`)) +} + +#' Generate case insensitive regexps +#' +#' [str_to_anycase()] takes a character vector of word strings as input, and +#' generates regular expressions that match those strings in any case. +#' +#' @param x A character vector of strings consisting of word characters. +#' +#' @returns A character vector. +#' +#' @keywords internal +str_to_anycase <- function(x) { + lst <- stringr::str_split(x, stringr::boundary("character")) + purrr::map(lst, ~ stringr::str_c(stringr::str_to_upper(.x), stringr::str_to_lower(.x))) |> + purrr::map(~ sprintf("[%s]", .x)) |> + purrr::map(~ stringr::str_flatten(.x)) |> + unlist() +} + +#' Regex for months' abbreviations +#' +#' [months_abb_regex()] generates a regex that matches month abbreviations. For +#' finer control, the case can be specified with parameter `case`. +#' +#' @param x A character vector of three-letter month abbreviations. Default is +#' `month.abb`. +#' @param case A string scalar: `"any"`, if month abbreviations are to be +#' matched in any case; `"upper"`, to match uppercase abbreviations; +#' `"lower"`, to match lowercase; and, `"title"` to match title case. 
+#' +#' @returns A regex as a string. +#' +#' @keywords internal +months_abb_regex <- function(x = month.abb, case = c("any", "upper", "lower", "title")) { + admiraldev::assert_character_vector(x) + case <- match.arg(case) + + if (identical(case, "any")) x <- str_to_anycase(x) + if (identical(case, "upper")) x <- stringr::str_to_upper(x) + if (identical(case, "lower")) x <- stringr::str_to_lower(x) + if (identical(case, "title")) x <- stringr::str_to_title(x) + + stringr::str_flatten(x, collapse = "|") +} + + +# Date time components. This is a nice +# utility function that allows you to easily +# change the regexp for one specific dttm component +# while keeping the other defaults. + +#' Regexps for date/time format components +#' +#' [fmt_cmp()] creates a character vector of patterns to match individual +#' format date/time components. +#' +#' @param sec A string pattern for matching the second format component. +#' @param min A string pattern for matching the minute format component. +#' @param hour A string pattern for matching the hour format component. +#' @param mday A string pattern for matching the month day format component. +#' @param mon A string pattern for matching the month format component. +#' @param year A string pattern for matching the year format component. +#' +#' @returns A named character vector of date/time format patterns. This is a vector +#' of six elements, one for each date/time component. 
+#' +#' @examples +#' # Regexps to parse format components +#' fmt_cmp() +#' +#' # Supply a different pattern for the year component +#' fmt_cmp(year = "yyyy") +#' +#' @export +fmt_cmp <- function(sec = "S+", + min = "M+", + hour = "H+", + mday = "d+", + mon = "m+", + year = "y+") { + structure( + list( + sec = sec, + min = min, + hour = hour, + mday = mday, + mon = mon, + year = year + ), + class = "fmt_c" + ) +} + +assert_fmt_c <- function(x) { + if (!inherits(x, "fmt_c")) { + rlang::abort("`x` must be an object created with `fmt_cmp()`.") + } + + invisible(x) +} + +#' Utility function to assemble a regex of alternative patterns +#' +#' [regex_or()] takes a set of patterns and binds them with the Or (`"|"`) +#' pattern for an easy regex of alternative patterns. +#' +#' @param x A character vector of alternative patterns. +#' @param .open Whether the resulting regex should start with `"|"`. +#' @param .close Whether the resulting regex should end with `"|"`. +#' +#' @returns A character scalar of the resulting regex. +#' +#' @examples +#' # A regex for matching either "jan" or "feb" +#' sdtm.oak:::regex_or(c("jan", "feb")) +#' +#' # Setting `.open` and/or `.close` to `TRUE` can be handy if this regex +#' # is to be combined into a larger regex. +#' paste0(sdtm.oak:::regex_or(c("jan", "feb"), .close = TRUE), r"{\d{2}}") +#' +#' @keywords internal +regex_or <- function(x, .open = FALSE, .close = FALSE) { + admiraldev::assert_character_vector(x) + admiraldev::assert_logical_scalar(.open) + admiraldev::assert_logical_scalar(.close) + + if (.open) x <- c("", x) + if (.close) x <- c(x, "") + + stringr::str_flatten(x, collapse = "|") +} + +#' Regexps for date/time components +#' +#' [fmt_rg()] creates a character vector of named patterns to match individual +#' date/time components. +#' +#' @param sec Regexp for the second component. +#' @param min Regexp for the minute component. +#' @param hour Regexp for the hour component. 
+#' @param mday Regexp for the month day component. +#' @param mon Regexp for the month component. +#' @param year Regexp for the year component. +#' @param na Regexp of alternatives, useful to match special values coding for +#' missingness. +#' @param sec_na Same as `na` but specifically for the second component. +#' @param min_na Same as `na` but specifically for the minute component. +#' @param hour_na Same as `na` but specifically for the hour component. +#' @param mday_na Same as `na` but specifically for the month day component. +#' @param mon_na Same as `na` but specifically for the month component. +#' @param year_na Same as `na` but specifically for the year component. +#' +#' @returns A named character vector of named patterns (regexps) for matching +#' each date/time component. +#' +#' @examples +#' # Default regexps +#' sdtm.oak:::fmt_rg() +#' +#' # You may change the way months are matched, e.g. you might not want to match +#' # month abbreviations, i.e. only numerical months. So pass an explicit regex +#' # for numerical months: +#' sdtm.oak:::fmt_rg(mon = r"[\b\d|\d{2}]") +#' +#' # Make date/time components accept `"UNK"` as a possible pattern (useful +#' # to match funny codes for `NA`). +#' sdtm.oak:::fmt_rg(na = "UNK") +#' +#' # Or be more specific and use `"UNK"` for the year component only. 
+#' sdtm.oak:::fmt_rg(year_na = "UNK") +#' +#' @keywords internal +fmt_rg <- function( + sec = r"[(\b\d|\d{2})(\.\d*)?]", + min = r"[(\b\d|\d{2})]", + hour = r"[\d?\d]", + mday = r"[\b\d|\d{2}]", + mon = stringr::str_glue(r"[\d\d|{months_abb_regex()}]"), + year = r"[(\d{2})?\d{2}]", + na = NULL, + sec_na = na, + min_na = na, + hour_na = na, + mday_na = na, + mon_na = na, + year_na = na) { + sec_na <- + ifelse(!is.null(sec_na), regex_or(sec_na, .open = TRUE), "") + min_na <- + ifelse(!is.null(min_na), regex_or(min_na, .open = TRUE), "") + hour_na <- + ifelse(!is.null(hour_na), regex_or(hour_na, .open = TRUE), "") + mday_na <- + ifelse(!is.null(mday_na), regex_or(mday_na, .open = TRUE), "") + mon_na <- + ifelse(!is.null(mon_na), regex_or(mon_na, .open = TRUE), "") + year_na <- + ifelse(!is.null(year_na), regex_or(year_na, .open = TRUE), "") + + + c( + sec = stringr::str_glue("(?<sec>{sec}{sec_na})"), + min = stringr::str_glue("(?<min>{min}{min_na})"), + hour = stringr::str_glue("(?<hour>{hour}{hour_na})"), + mday = stringr::str_glue("(?<mday>{mday}{mday_na})"), + mon = stringr::str_glue("(?<mon>{mon}{mon_na})"), + year = stringr::str_glue("(?<year>{year}{year_na})") + ) +} + +fmt_dttmc <- + function(fmt_c = character(), + pat = character(), + cap = character(), + start = integer(), + end = integer(), + len = integer(), + ord = integer()) { + tibble::tibble( + fmt_c = fmt_c, + pat = pat, + cap = cap, + start = start, + end = end, + len = len, + ord = ord + ) + } + +#' @rdname parse_dttm_fmt +parse_dttm_fmt_ <- function(fmt, pattern) { + admiraldev::assert_character_scalar(fmt) + admiraldev::assert_character_scalar(pattern) + + if (identical(nchar(pattern), 0L)) { + rlang::abort("`pattern` must be a literal string of at least one char.") + } + + match_data <- regexpr(pattern, fmt) + match <- reg_matches(fmt, match_data) + + is_match <- (!length(match)) || (!is.na(match)) + + start <- ifelse(is_match, match_data, NA_integer_) + len <- ifelse(is_match, attr(match_data, "match.length"), NA_integer_) + 
end <- start + len - 1L + tibble::tibble(pat = pattern, cap = match, start = start, end = end, len = len) +} + +#' Parse a date/time format +#' +#' [parse_dttm_fmt()] parses a date/time format, meaning it will try to parse +#' the components of the format `fmt` that refer to date/time components. +#' [parse_dttm_fmt_()] is similar to [parse_dttm_fmt()] but is not vectorized +#' over `fmt`. +#' +#' @param fmt A format string (scalar) to be parsed by `patterns`. +#' @param pattern,patterns A string (in the case of `pattern`), or a character +#' vector (in the case of `patterns`) of regexps for each of the individual +#' date/time components. Default value is that of [fmt_cmp()]. Use this function +#' if you plan on passing a different set of patterns. +#' +#' @returns A [tibble][tibble::tibble-package] of seven columns: +#' - `fmt_c`: date/time format component. Values are either `"year"`, `"mon"`, +#' `"mday"`, `"hour"`, `"min"`, `"sec"`, or `NA`. +#' - `pat`: Regexp used to parse the date/time component. +#' - `cap`: The captured substring from the format. +#' - `start`: Start position in the format string for this capture. +#' - `end`: End position in the format string for this capture. +#' - `len`: Length of the capture (number of chars). +#' - `ord`: Ordinal of this date/time component in the format string. +#' +#' Each row is for either a date/time format component or a "delimiter" string +#' or pattern in-between format components. +#' +#' @examples +#' sdtm.oak:::parse_dttm_fmt("ymd") +#' sdtm.oak:::parse_dttm_fmt("H:M:S") +#' +#' sdtm.oak:::parse_dttm_fmt("ymd HMS") +#' +#' # Repeating the same special patterns, e.g. "yy" still counts as one pattern +#' # only. +#' sdtm.oak:::parse_dttm_fmt("yymmdd HHMMSS") +#' +#' # Note that `"y"`, `"m"`, `"d"`, `"H"`, `"M"` or `"S"` are reserved patterns +#' # that are matched first and interpreted as format components. Example: the +#' # first "y" in "year" is parsed as meaning year followed by "ear y". 
The +#' # second "y" is no longer matched because a first match already succeeded. +#' sdtm.oak:::parse_dttm_fmt("year y") +#' +#' # Specify custom patterns +#' sdtm.oak:::parse_dttm_fmt( +#' "year month day", +#' fmt_cmp(year = "year", mon = "month", mday = "day") +#' ) +#' +#' @keywords internal +parse_dttm_fmt <- function(fmt, patterns = fmt_cmp()) { + admiraldev::assert_character_scalar(fmt) + + fmt_dttmc <- + purrr::map(patterns, ~ parse_dttm_fmt_(fmt, .x)) |> + purrr::list_rbind(names_to = "fmt_c") + + # Check if patterns have matching overlap, i.e. whether they are not + # mutually exclusive (as they should be). + if (anyDuplicated(pseq(fmt_dttmc$start, fmt_dttmc$end))) { + rlang::abort("Patterns in `fmt_c` have overlapping matches.") + } + + # Get captures' ranks while leaving NA as NA (`rank()` won't do this.) + fmt_dttmc$ord <- dplyr::row_number(fmt_dttmc$start) + + if (identical(nrow(fmt_dttmc), 0L)) { + return(fmt_dttmc()) + } + + fmt_len <- nchar(fmt) + + dttmc_pos <- + pseq(from = fmt_dttmc$start[!is.na(fmt_dttmc$start)], to = fmt_dttmc$end[!is.na(fmt_dttmc$end)]) + # `delim_pos`: delimiter positions, i.e. positions in `fmt` in-between dttm components. + delim_pos <- find_int_gap(dttmc_pos, xmin = 1L, xmax = fmt_len) + + delim <- with(delim_pos, stringr::str_sub(fmt, start = start, end = end)) + fmt_delim <- + fmt_dttmc( + fmt_c = NA_character_, + pat = NA_character_, + cap = delim, + start = delim_pos$start, + end = delim_pos$end, + len = delim_pos$end - delim_pos$start + 1L, + ord = NA_integer_ + ) + + dplyr::bind_rows(fmt_dttmc, fmt_delim) |> + dplyr::arrange(.data$start) +} + +#' Convert a parsed date/time format to regex +#' +#' [dttm_fmt_to_regex()] takes a [tibble][tibble::tibble-package] of parsed +#' date/time format components (as returned by [parse_dttm_fmt()]), and a +#' mapping of date/time component formats to regexps and generates a single +#' regular expression with groups for matching each of the date/time components. 
+#' +#' @param fmt A format string (scalar) to be parsed by `patterns`. +#' @param fmt_regex A named character vector of regexps, one for each date/time +#' component. +#' @param anchored Whether the final regex should be anchored, i.e. bounded by +#' `"^"` and `"$"` for a whole match. +#' +#' @returns A string containing a regular expression for matching date/time +#' components according to a format. +#' +#' @examples +#' sdtm.oak:::dttm_fmt_to_regex("y") +#' sdtm.oak:::dttm_fmt_to_regex("y", anchored = FALSE) +#' +#' sdtm.oak:::dttm_fmt_to_regex("m") +#' sdtm.oak:::dttm_fmt_to_regex("ymd") +#' +#' sdtm.oak:::dttm_fmt_to_regex("ymd HH:MM:SS") +#' +#' @keywords internal +dttm_fmt_to_regex <- function(fmt, fmt_regex = fmt_rg(), fmt_c = fmt_cmp(), anchored = TRUE) { + tbl_fmt_c <- parse_dttm_fmt(fmt, patterns = fmt_c) + + fmt_regex <- + tbl_fmt_c |> + dplyr::mutate(regex = dplyr::if_else(is.na(.data$fmt_c), .data$cap, fmt_regex[.data$fmt_c])) |> + dplyr::mutate(regex = dplyr::if_else(is.na(.data$cap), NA_character_, .data$regex)) |> + dplyr::pull(.data$regex) + + fmt_regex <- stringr::str_flatten(fmt_regex, na.rm = TRUE) + if (anchored) fmt_regex <- stringr::str_glue("^{fmt_regex}$") + + fmt_regex +} diff --git a/R/sdtm.oak-package.R b/R/sdtm.oak-package.R new file mode 100644 index 0000000..b1a48e6 --- /dev/null +++ b/R/sdtm.oak-package.R @@ -0,0 +1,8 @@ +#' @keywords internal +"_PACKAGE" + +## usethis namespace: start +#' @importFrom tibble tibble +#' @importFrom rlang .data +## usethis namespace: end +NULL diff --git a/README.Rmd b/README.Rmd index d221cff..eea23ea 100644 --- a/README.Rmd +++ b/README.Rmd @@ -23,9 +23,9 @@ An EDC and Data Standard agnostic SDTM data transformation engine that automates ## Installation -You can install the development version of `{sdtm.oak}` from [GitHub](https://github.com/pharmaverse/oak/) with: +You can install the development version of `{sdtm.oak}` from [GitHub](https://github.com/pharmaverse/sdtm.oak/) with: ``` r # 
install.packages("remotes") -remotes::install_github("pharmaverse/oak") +remotes::install_github("pharmaverse/sdtm.oak") ``` diff --git a/README.md b/README.md index ce9a234..04f2fa5 100644 --- a/README.md +++ b/README.md @@ -1,3 +1,4 @@ + # sdtm.oak @@ -6,7 +7,6 @@ [![CRAN status](https://www.r-pkg.org/badges/version/sdtm.oak)](https://CRAN.R-project.org/package=sdtm.oak) - An EDC and Data Standard agnostic SDTM data transformation engine that @@ -16,9 +16,9 @@ based on standard mapping algorithms ## Installation You can install the development version of `{sdtm.oak}` from -[GitHub](https://github.com/pharmaverse/oak/) with: +[GitHub](https://github.com/pharmaverse/sdtm.oak/) with: -``` +``` r # install.packages("remotes") -remotes::install_github("pharmaverse/oak") +remotes::install_github("pharmaverse/sdtm.oak") ``` diff --git a/data-raw/dtc_formats.R b/data-raw/dtc_formats.R new file mode 100644 index 0000000..365f928 --- /dev/null +++ b/data-raw/dtc_formats.R @@ -0,0 +1,27 @@ +## code to prepare `dtc_formats` dataset goes here + +dtc_formats <- tibble::tribble( + ~fmt, ~type, ~description, + "ymd", "date", "Parses a date: year, month, and month day.", + "y m d", "date", "Parses a date: year, month, and month day.", + "y-m-d", "date", "Parses a date: year, month, and month day.", + "dmy", "date", "Parses a date: month day, month and year.", + "d m y", "date", "Parses a date: month day, month and year.", + "d-m-y", "date", "Parses a date: month day, month and year.", + "ym", "date", "Parses a date: year and month.", + "y m", "date", "Parses a date: year and month.", + "y-m", "date", "Parses a date: year and month.", + "my", "date", "Parses a date: month and year.", + "m y", "date", "Parses a date: month and year.", + "m-y", "date", "Parses a date: month and year.", + "HM", "time", "Parses a time: hour and minutes.", + "HMS", "time", "Parses a time: hour, minutes, and seconds.", + "H:M", "time", "Parses a time: hour and minutes.", + "H:M:S", "time", "Parses a 
time: hour, minutes and seconds.", + "ymdH:M:S", "datetime", "Parses a date-time: year, month, month day, hour, minutes, and seconds.", + "ymd H:M:S", "datetime", "Parses a date-time: year, month, month day, hour, minutes, and seconds.", + "y-m-d H:M:S", "datetime", "Parses a date-time: year, month, month day, hour, minutes, and seconds.", + "y m d H:M:S", "datetime", "Parses a date-time: year, month, month day, hour, minutes, and seconds." +) + +usethis::use_data(dtc_formats, overwrite = TRUE) diff --git a/data/dtc_formats.rda b/data/dtc_formats.rda new file mode 100644 index 0000000000000000000000000000000000000000..1b12c598623c77adbc73d278e41d65cd9338314d GIT binary patch literal 440 zcmV;p0Z0BqT4*^jL0KkKS?W>6NdN)c|H1!yOaVYQ5D)~ySU|sL-k?AL00aO5zyZe! z1SFL7lQkPrq{uT<0R|_jk)~-3XlQ5v$Y=ln00E5+4FDMp000003X(|lB{mdoPg74* z)CQVqp{64dFh(l`q&SLM2sx>>d`_nXTX$&!iwLslxMz zqDJsvgMPcYg(j+vXGJ8Q)PkgjbXcUqrKple=c>{))H$U&$nkYbl@{<#?%=XJCYq_f zXJn=eRV7PHNl?;Vp%^A(8I)vZl`t(PEG7BO7roT0RK2NDpPtN!?N2mwkXOUrxsMoA z;3ni8SJGu_qBC& z%`nNVl_!;p6iPZnVp&9BH&Iz<*P#>8C@~ek`r*&bjMZ2zSJ6Cv)g-0RcJ%G?+tG;_ z11M`v$eH#Hdhh1WQdCRjpYuT+oIQ3#+@%%jO=MqfZ}Z0jO@fXnnevq@QS3P1M&UpW iIaLhRd07;l-P8axPm|XqJMkEA@pmLsg$WL&98{23yu~8` literal 0 HcmV?d00001 diff --git a/inst/WORDLIST b/inst/WORDLIST index 84f80d4..e695dab 100644 --- a/inst/WORDLIST +++ b/inst/WORDLIST @@ -2,3 +2,8 @@ EDC ODM SDTM sdtm +Hoffmann +dtc +funder +vectorized +ORCID diff --git a/man/assert_capture_matrix.Rd b/man/assert_capture_matrix.Rd new file mode 100644 index 0000000..8d5db77 --- /dev/null +++ b/man/assert_capture_matrix.Rd @@ -0,0 +1,43 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_utils.R +\name{assert_capture_matrix} +\alias{assert_capture_matrix} +\title{Assert capture matrix} +\usage{ +assert_capture_matrix(m) +} +\arguments{ +\item{m}{A character matrix.} +} +\value{ +This function throws an error if \code{m} is not either: +\itemize{ +\item A character matrix; +\item A matrix whose 
columns are (at least): \code{year}, \code{mon}, \code{mday}, \code{hour}, +\code{min} and \code{sec}. +} + +Otherwise, it returns \code{m} invisibly. +} +\description{ +\code{\link[=assert_capture_matrix]{assert_capture_matrix()}} is an internal helper function aiding with the +checking of an internal R object that contains the parsing results as +returned by \code{\link[=parse_dttm]{parse_dttm()}}: capture matrix. + +This function checks that the capture matrix is a matrix and that it contains +six columns: \code{year}, \code{mon}, \code{mday}, \code{hour}, \code{min} and \code{sec}. +} +\examples{ +cols <- c("year", "mon", "mday", "hour", "min", "sec") +m <- matrix(NA_character_, nrow = 1L, ncol = 6L, dimnames = list(NULL, cols)) +sdtm.oak:::assert_capture_matrix(m) + +# These commands should throw an error +if (FALSE) { + sdtm.oak:::assert_capture_matrix(character()) + sdtm.oak:::assert_capture_matrix(matrix(data = NA_character_, nrow = 0, ncol = 0)) + sdtm.oak:::assert_capture_matrix(matrix(data = NA_character_, nrow = 1)) +} + +} +\keyword{internal} diff --git a/man/assert_dtc_fmt.Rd b/man/assert_dtc_fmt.Rd new file mode 100644 index 0000000..c7868e5 --- /dev/null +++ b/man/assert_dtc_fmt.Rd @@ -0,0 +1,26 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_utils.R +\name{assert_dtc_fmt} +\alias{assert_dtc_fmt} +\title{Assert date time character formats} +\usage{ +assert_dtc_fmt(fmt) +} +\arguments{ +\item{fmt}{A character vector.} +} +\description{ +\code{\link[=assert_dtc_fmt]{assert_dtc_fmt()}} takes a character vector of date/time formats and checks if +the formats are supported, meaning it checks if they are one of the formats +listed in column \code{fmt} of \link{dtc_formats}, failing with an error otherwise. 
+} +\examples{ +sdtm.oak:::assert_dtc_fmt(c("ymd", "y m d", "dmy", "HM", "H:M:S", "y-m-d H:M:S")) + +# This example is guarded to avoid throwing errors +if (FALSE) { + sdtm.oak:::assert_dtc_fmt("y years m months d days") +} + +} +\keyword{internal} diff --git a/man/assert_dtc_format.Rd b/man/assert_dtc_format.Rd new file mode 100644 index 0000000..fe19d9e --- /dev/null +++ b/man/assert_dtc_format.Rd @@ -0,0 +1,37 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_utils.R +\name{assert_dtc_format} +\alias{assert_dtc_format} +\title{Assert dtc format} +\usage{ +assert_dtc_format(.format) +} +\arguments{ +\item{.format}{The argument of \code{\link[=create_iso8601]{create_iso8601()}}'s \code{.format} parameter.} +} +\value{ +This function throws an error if \code{.format} is not either: +\itemize{ +\item A character vector of formats permitted by \code{\link[=assert_dtc_fmt]{assert_dtc_fmt()}}; +\item A list of character vectors of formats permitted by \code{\link[=assert_dtc_fmt]{assert_dtc_fmt()}}. +} + +Otherwise, it returns \code{.format} invisibly. +} +\description{ +\code{\link[=assert_dtc_format]{assert_dtc_format()}} is an internal helper function aiding with the checking +of the \code{.format} parameter of \code{\link[=create_iso8601]{create_iso8601()}}. +} +\examples{ +sdtm.oak:::assert_dtc_format("ymd") +sdtm.oak:::assert_dtc_format(c("ymd", "y-m-d")) +sdtm.oak:::assert_dtc_format(list(c("ymd", "y-m-d"), "H:M:S")) + +# These commands should throw an error +if (FALSE) { + # Note that `"year, month, day"` is not a supported format. 
+ sdtm.oak:::assert_dtc_format("year, month, day") +} + +} +\keyword{internal} diff --git a/man/coalesce_capture_matrices.Rd b/man/coalesce_capture_matrices.Rd new file mode 100644 index 0000000..5b32bbf --- /dev/null +++ b/man/coalesce_capture_matrices.Rd @@ -0,0 +1,40 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_utils.R +\name{coalesce_capture_matrices} +\alias{coalesce_capture_matrices} +\title{Coalesce capture matrices} +\usage{ +coalesce_capture_matrices(...) +} +\arguments{ +\item{...}{A sequence of capture matrices.} +} +\value{ +A single capture matrix whose values have been coalesced in the +sense of \link[dplyr:coalesce]{coalesce()}. +} +\description{ +\code{\link[=coalesce_capture_matrices]{coalesce_capture_matrices()}} combines several capture matrices into one. +Each argument of \code{...} should be a capture matrix in the sense of the output +by \code{\link[=complete_capture_matrix]{complete_capture_matrix()}}, meaning a character matrix of six columns +whose names are: \code{year}, \code{mon}, \code{mday}, \code{hour}, \code{min} or \code{sec}. 
+} +\examples{ +cols <- c("year", "mon", "mday", "hour", "min", "sec") +dates <- c("2020", "01", "01", "20", NA, NA) +times <- c(NA, NA, NA, "10", "00", "05") +m_dates <- matrix(dates, nrow = 1L, ncol = 6L, dimnames = list(NULL, cols)) +m_times <- matrix(times, nrow = 1L, ncol = 6L, dimnames = list(NULL, cols)) + +# Note how the hour "20" takes precedence over "10" +sdtm.oak:::coalesce_capture_matrices(m_dates, m_times) + +# Reverse the order of the inputs and now hour "10" takes precedence +sdtm.oak:::coalesce_capture_matrices(m_times, m_dates) + +# Single inputs should result in the same output as the input +sdtm.oak:::coalesce_capture_matrices(m_dates) +sdtm.oak:::coalesce_capture_matrices(m_times) + +} +\keyword{internal} diff --git a/man/complete_capture_matrix.Rd b/man/complete_capture_matrix.Rd new file mode 100644 index 0000000..fd3f67a --- /dev/null +++ b/man/complete_capture_matrix.Rd @@ -0,0 +1,36 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_utils.R +\name{complete_capture_matrix} +\alias{complete_capture_matrix} +\title{Complete a capture matrix} +\usage{ +complete_capture_matrix(m) +} +\arguments{ +\item{m}{A character matrix that might be missing one or more of the +following columns: \code{year}, \code{mon}, \code{mday}, \code{hour}, \code{min} or \code{sec}.} +} +\value{ +A character matrix that contains the columns \code{year}, \code{mon}, \code{mday}, +\code{hour}, \code{min} and \code{sec}. Any other existing columns are dropped. +} +\description{ +\code{\link[=complete_capture_matrix]{complete_capture_matrix()}} completes the missing, if any, columns of the +capture matrix. 
+} +\examples{ +sdtm.oak:::complete_capture_matrix(matrix(data = NA_character_, nrow = 0, ncol = 0)) +sdtm.oak:::complete_capture_matrix(matrix(data = NA_character_, nrow = 1)) + +# m <- matrix(NA_character_, nrow = 1, ncol = 2, dimnames = list(NULL, c("year", "sec"))) +# sdtm.oak:::complete_capture_matrix(m) + +# m <- matrix(c("2020", "10"), nrow = 1, ncol = 2, dimnames = list(NULL, c("year", "sec"))) +# sdtm.oak:::complete_capture_matrix(m) + +# Any other existing columns are dropped. +# m <- matrix(c("2020", "10"), nrow = 1, ncol = 2, dimnames = list(NULL, c("semester", "quarter"))) +# sdtm.oak:::complete_capture_matrix(m) + +} +\keyword{internal} diff --git a/man/create_iso8601.Rd b/man/create_iso8601.Rd new file mode 100644 index 0000000..8148197 --- /dev/null +++ b/man/create_iso8601.Rd @@ -0,0 +1,101 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{create_iso8601} +\alias{create_iso8601} +\title{Convert date or time collected values to ISO 8601} +\usage{ +create_iso8601( + ..., + .format, + .fmt_c = fmt_cmp(), + .na = NULL, + .cutoff_2000 = 68L, + .check_format = FALSE +) +} +\arguments{ +\item{...}{Character vectors of dates, times or date-times' components.} + +\item{.format}{Parsing format(s). Either a character vector or a list of +character vectors. If a character vector is passed then each element is +taken as parsing format for each vector passed in \code{...}. If a list is +provided, then each element must be a character vector of formats. The +first vector of formats is used for parsing the first vector passed in +\code{...}, and so on.} + +\item{.fmt_c}{A list of regexps to use when parsing \code{.format}. Use \code{\link[=fmt_cmp]{fmt_cmp()}} +to create such an object to pass as argument to this parameter.} + +\item{.na}{A character vector of string literals to be regarded as missing +values during parsing.} + +\item{.cutoff_2000}{An integer value. 
Two-digit years smaller or equal to +\code{.cutoff_2000} are parsed as though starting with \code{20}, otherwise parsed as +though starting with \code{19}.} + +\item{.check_format}{Whether to check the formats passed in \code{.format}, +meaning to check against a selection of validated formats in +\link[=dtc_formats]{dtc_formats}; or to have a more permissible +interpretation of the formats.} +} +\description{ +\code{\link[=create_iso8601]{create_iso8601()}} converts vectors of dates, times or date-times to \href{https://en.wikipedia.org/wiki/ISO_8601}{ISO 8601} format. Learn more in +\code{vignette("iso_8601")}. +} +\examples{ +# Converting dates +create_iso8601(c("2020-01-01", "20200102"), .format = "y-m-d") +create_iso8601(c("2020-01-01", "20200102"), .format = "ymd") +create_iso8601(c("2020-01-01", "20200102"), .format = list(c("y-m-d", "ymd"))) + +# Two-digit years are supported +create_iso8601(c("20-01-01", "200101"), .format = list(c("y-m-d", "ymd"))) + +# `.cutoff_2000` sets the cutoff for two-digit to four-digit year conversion +# Default is at 68. +create_iso8601(c("67-01-01", "68-01-01", "69-01-01"), .format = "y-m-d") + +# Change it to 80. 
+create_iso8601(c("79-01-01", "80-01-01", "81-01-01"), .format = "y-m-d", .cutoff_2000 = 80) + +# Converting times +create_iso8601("15:10", .format = "HH:MM") +create_iso8601("2:10", .format = "HH:MM") +create_iso8601("2:1", .format = "HH:MM") +create_iso8601("02:01:56", .format = "HH:MM:SS") +create_iso8601("020156.5", .format = "HHMMSS") + +# Converting date-times +create_iso8601("12 NOV 202015:15", .format = "dd mmm yyyyHH:MM") + +# Indicate allowed missing values to make the parsing pass +create_iso8601("U DEC 201914:00", .format = "dd mmm yyyyHH:MM") +create_iso8601("U DEC 201914:00", .format = "dd mmm yyyyHH:MM", .na = "U") + +create_iso8601("NOV 2020", .format = "m y") +create_iso8601(c("MAR 2019", "MaR 2020", "mar 2021"), .format = "m y") + +create_iso8601("2019-04-041045-", .format = "yyyy-mm-ddHHMM-") + +create_iso8601("20200507null", .format = "ymd(HH:MM:SS)") +create_iso8601("20200507null", .format = "ymd((HH:MM:SS)|null)") + +# Fractional seconds +create_iso8601("2019-120602:20:13.1230001", .format = "y-mdH:M:S") + +# Use different reserved characters in the format specification +# Here we change "H" to "x" and "M" to "w", for hour and minute, respectively. 
+create_iso8601("14H00M", .format = "HHMM") +create_iso8601("14H00M", .format = "xHwM", .fmt_c = fmt_cmp(hour = "x", min = "w")) + +# Alternative formats with unknown values +datetimes <- c("UN UNK 201914:00", "UN JAN 2021") +format <- list(c("dd mmm yyyy", "dd mmm yyyyHH:MM")) +create_iso8601(datetimes, .format = format, .na = c("UN", "UNK")) + +# Dates and times may come in many format variations +fmt <- "dd MMM yyyy HH nn ss" +fmt_cmp <- fmt_cmp(mon = "MMM", min = "nn", sec = "ss") +create_iso8601("05 feb 1985 12 55 02", .format = fmt, .fmt_c = fmt_cmp) + +} diff --git a/man/dtc_formats.Rd b/man/dtc_formats.Rd new file mode 100644 index 0000000..08adcdc --- /dev/null +++ b/man/dtc_formats.Rd @@ -0,0 +1,26 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_formats.R +\docType{data} +\name{dtc_formats} +\alias{dtc_formats} +\title{Date/time collection formats} +\format{ +A \link[tibble:tibble-package]{tibble} of 20 formats +with three variables: +\describe{ +\item{\code{fmt}}{Format string.} +\item{\code{type}}{Whether a date, time or date-time.} +\item{\code{description}}{Description of which date-time components are parsed.} +} +} +\usage{ +dtc_formats +} +\description{ +Date/time collection formats +} +\examples{ +dtc_formats + +} +\keyword{datasets} diff --git a/man/dttm_fmt_to_regex.Rd b/man/dttm_fmt_to_regex.Rd new file mode 100644 index 0000000..12510e9 --- /dev/null +++ b/man/dttm_fmt_to_regex.Rd @@ -0,0 +1,43 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{dttm_fmt_to_regex} +\alias{dttm_fmt_to_regex} +\title{Convert a parsed date/time format to regex} +\usage{ +dttm_fmt_to_regex( + fmt, + fmt_regex = fmt_rg(), + fmt_c = fmt_cmp(), + anchored = TRUE +) +} +\arguments{ +\item{fmt}{A format string (scalar) to be parsed by \code{patterns}.} + +\item{fmt_regex}{A named character vector of regexps, one for each date/time +component.} + +\item{anchored}{Whether the 
final regex should be anchored, i.e. bounded by +\code{"^"} and \code{"$"} for a whole match.} +} +\value{ +A string containing a regular expression for matching date/time +components according to a format. +} +\description{ +\code{\link[=dttm_fmt_to_regex]{dttm_fmt_to_regex()}} takes a \link[tibble:tibble-package]{tibble} of parsed +date/time format components (as returned by \code{\link[=parse_dttm_fmt]{parse_dttm_fmt()}}), and a +mapping of date/time component formats to regexps and generates a single +regular expression with groups for matching each of the date/time components. +} +\examples{ +sdtm.oak:::dttm_fmt_to_regex("y") +sdtm.oak:::dttm_fmt_to_regex("y", anchored = FALSE) + +sdtm.oak:::dttm_fmt_to_regex("m") +sdtm.oak:::dttm_fmt_to_regex("ymd") + +sdtm.oak:::dttm_fmt_to_regex("ymd HH:MM:SS") + +} +\keyword{internal} diff --git a/man/find_int_gap.Rd b/man/find_int_gap.Rd new file mode 100644 index 0000000..5fa5654 --- /dev/null +++ b/man/find_int_gap.Rd @@ -0,0 +1,31 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{find_int_gap} +\alias{find_int_gap} +\title{Find gap intervals in integer sequences} +\usage{ +find_int_gap(x, xmin = min(x), xmax = max(x)) +} +\arguments{ +\item{x}{An integer vector.} + +\item{xmin}{Left endpoint integer value.} + +\item{xmax}{Right endpoint integer value.} +} +\value{ +A \link[tibble:tibble-package]{tibble} of gap intervals of two columns: +\itemize{ +\item \code{start}: left endpoint +\item \code{end}: right endpoint +If no gap intervals are found then an empty \link[tibble:tibble-package]{tibble} +is returned. +} +} +\description{ +\code{\link[=find_int_gap]{find_int_gap()}} determines the \code{start} and \code{end} positions for gap intervals +in a sequence of integers. By default, the interval range to look for gaps is +defined by the minimum and maximum values of \code{x}; specify \code{xmin} and \code{xmax} +to change the range explicitly. 
+} +\keyword{internal} diff --git a/man/fmt_cmp.Rd b/man/fmt_cmp.Rd new file mode 100644 index 0000000..e278927 --- /dev/null +++ b/man/fmt_cmp.Rd @@ -0,0 +1,43 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{fmt_cmp} +\alias{fmt_cmp} +\title{Regexps for date/time format components} +\usage{ +fmt_cmp( + sec = "S+", + min = "M+", + hour = "H+", + mday = "d+", + mon = "m+", + year = "y+" +) +} +\arguments{ +\item{sec}{A string pattern for matching the second format component.} + +\item{min}{A string pattern for matching the minute format component.} + +\item{hour}{A string pattern for matching the hour format component.} + +\item{mday}{A string pattern for matching the month day format component.} + +\item{mon}{A string pattern for matching the month format component.} + +\item{year}{A string pattern for matching the year format component.} +} +\value{ +A named character vector of date/time format patterns. This is a vector +of six elements, one for each date/time component. +} +\description{ +\code{\link[=fmt_cmp]{fmt_cmp()}} creates a character vector of patterns to match individual +date/time format components. 
+} +\examples{ +# Regexps to parse format components +fmt_cmp() + +fmt_cmp(year = "yyyy") + +} diff --git a/man/fmt_rg.Rd b/man/fmt_rg.Rd new file mode 100644 index 0000000..1c59944 --- /dev/null +++ b/man/fmt_rg.Rd @@ -0,0 +1,76 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{fmt_rg} +\alias{fmt_rg} +\title{Regexps for date/time components} +\usage{ +fmt_rg( + sec = "(\\\\b\\\\d|\\\\d{2})(\\\\.\\\\d*)?", + min = "(\\\\b\\\\d|\\\\d{2})", + hour = "\\\\d?\\\\d", + mday = "\\\\b\\\\d|\\\\d{2}", + mon = stringr::str_glue("\\\\d\\\\d|{months_abb_regex()}"), + year = "(\\\\d{2})?\\\\d{2}", + na = NULL, + sec_na = na, + min_na = na, + hour_na = na, + mday_na = na, + mon_na = na, + year_na = na +) +} +\arguments{ +\item{sec}{Regexp for the second component.} + +\item{min}{Regexp for the minute component.} + +\item{hour}{Regexp for the hour component.} + +\item{mday}{Regexp for the month day component.} + +\item{mon}{Regexp for the month component.} + +\item{year}{Regexp for the year component.} + +\item{na}{Regexp of alternatives, useful to match special values coding for +missingness.} + +\item{sec_na}{Same as \code{na} but specifically for the second component.} + +\item{min_na}{Same as \code{na} but specifically for the minute component.} + +\item{hour_na}{Same as \code{na} but specifically for the hour component.} + +\item{mday_na}{Same as \code{na} but specifically for the month day component.} + +\item{mon_na}{Same as \code{na} but specifically for the month component.} + +\item{year_na}{Same as \code{na} but specifically for the year component.} +} +\value{ +A named character vector of named patterns (regexps) for matching +each date/time component. +} +\description{ +\code{\link[=fmt_rg]{fmt_rg()}} creates a character vector of named patterns to match individual +date/time components. +} +\examples{ +# Default regexps +sdtm.oak:::fmt_rg() + +# You may change the way months are matched, e.g. 
you might not want to match +# month abbreviations, i.e. only numerical months. So pass an explicit regex +# for numerical months: +sdtm.oak:::fmt_rg(mon = r"[\b\d|\d{2}]") + +# Make date/time components accept `"UNK"` as a possible pattern (useful +# to match funny codes for `NA`). +sdtm.oak:::fmt_rg(na = "UNK") + +# Or be more specific and use `"UNK"` for the year component only. +sdtm.oak:::fmt_rg(year_na = "UNK") + +} +\keyword{internal} diff --git a/man/format_iso8601.Rd b/man/format_iso8601.Rd new file mode 100644 index 0000000..ec101f2 --- /dev/null +++ b/man/format_iso8601.Rd @@ -0,0 +1,45 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{format_iso8601} +\alias{format_iso8601} +\title{Convert date/time components into ISO8601 format} +\usage{ +format_iso8601(m, .cutoff_2000 = 68L) +} +\arguments{ +\item{m}{A character matrix of date/time components. It must have six +named columns: \code{year}, \code{mon}, \code{mday}, \code{hour}, \code{min} and \code{sec}.} + +\item{.cutoff_2000}{An integer value. Two-digit years smaller or equal to +\code{.cutoff_2000} are parsed as though starting with \code{20}, otherwise parsed as +though starting with \code{19}.} +} +\value{ +A character vector with date-times following the ISO8601 format. +} +\description{ +\code{\link[=format_iso8601]{format_iso8601()}} takes a character matrix of date/time components and +converts each component to ISO8601 format. In practice this entails +converting years to a four digit number, and month, day, hours, minutes and +seconds to two-digit numbers. Not available (\code{NA}) components are converted +to \code{"-"}. 
+} +\examples{ +cols <- c("year", "mon", "mday", "hour", "min", "sec") +m <- matrix( + c( + "99", "00", "01", + "Jan", "feb", "03", + "1", "01", "31", + "00", "12", "23", + "00", "59", "10", + "42", "5.15", NA + ), + ncol = 6, + dimnames = list(c(), cols) +) + +sdtm.oak:::format_iso8601(m) + +} +\keyword{internal} diff --git a/man/iso8601_mon.Rd b/man/iso8601_mon.Rd new file mode 100644 index 0000000..e6c9b69 --- /dev/null +++ b/man/iso8601_mon.Rd @@ -0,0 +1,31 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{iso8601_mon} +\alias{iso8601_mon} +\title{Format as a ISO8601 month} +\usage{ +iso8601_mon(x) +} +\arguments{ +\item{x}{A character vector.} +} +\value{ +A character vector. +} +\description{ +\code{\link[=iso8601_mon]{iso8601_mon()}} converts a character vector whose values represent numeric +or abbreviated month names to zero-padded numeric months. +} +\examples{ +sdtm.oak:::iso8601_mon(c(NA, "0", "1", "2", "10", "11", "12")) + +# No semantic validation is performed on the numeric months, so `"13"` stays +# `"13"` but representations that can't be represented as two-digit numbers +# become `NA`. +sdtm.oak:::iso8601_mon(c("13", "99", "100", "-1")) + +(mon <- month.abb) +sdtm.oak:::iso8601_mon(mon) + +} +\keyword{internal} diff --git a/man/iso8601_na.Rd b/man/iso8601_na.Rd new file mode 100644 index 0000000..03f5a70 --- /dev/null +++ b/man/iso8601_na.Rd @@ -0,0 +1,22 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{iso8601_na} +\alias{iso8601_na} +\title{Convert NA to \code{"-"}} +\usage{ +iso8601_na(x) +} +\arguments{ +\item{x}{A character vector.} +} +\value{ +A character vector. +} +\description{ +\code{\link[=iso8601_na]{iso8601_na()}} takes a character vector and converts \code{NA} values to \code{"-"}. 
+} +\examples{ +sdtm.oak:::iso8601_na(c("10", NA_character_)) + +} +\keyword{internal} diff --git a/man/iso8601_sec.Rd b/man/iso8601_sec.Rd new file mode 100644 index 0000000..b788de7 --- /dev/null +++ b/man/iso8601_sec.Rd @@ -0,0 +1,22 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{iso8601_sec} +\alias{iso8601_sec} +\title{Format as ISO8601 seconds} +\usage{ +iso8601_sec(x) +} +\arguments{ +\item{x}{A character vector.} +} +\value{ +A character vector. +} +\description{ +\code{\link[=iso8601_sec]{iso8601_sec()}} converts a character vector whose values represent seconds. +} +\examples{ +sdtm.oak:::iso8601_sec(c(NA, "0", "1", "10", "59", "99", "100")) + +} +\keyword{internal} diff --git a/man/iso8601_truncate.Rd b/man/iso8601_truncate.Rd new file mode 100644 index 0000000..4c4a4eb --- /dev/null +++ b/man/iso8601_truncate.Rd @@ -0,0 +1,50 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{iso8601_truncate} +\alias{iso8601_truncate} +\title{Truncate a partial ISO8601 date-time} +\usage{ +iso8601_truncate(x, empty_as_na = TRUE) +} +\arguments{ +\item{x}{A character vector.} +} +\value{ +A character vector. +} +\description{ +\code{\link[=iso8601_truncate]{iso8601_truncate()}} converts a character vector of ISO8601 dates, times or +date-times that might be partial and truncates the format by removing those +missing components. +} +\examples{ +x <- + c( + "1999-01-01T15:20:01", + "1999-01-01T15:20:-", + "1999-01-01T15:-:-", + "1999-01-01T-:-:-", + "1999-01--T-:-:-", + "1999----T-:-:-", + "-----T-:-:-" + ) + +sdtm.oak:::iso8601_truncate(x) + +# With `empty_as_na = FALSE` empty strings are not replaced with `NA` +sdtm.oak:::iso8601_truncate("-----T-:-:-", empty_as_na = TRUE) +sdtm.oak:::iso8601_truncate("-----T-:-:-", empty_as_na = FALSE) + +# Truncation only happens if missing components are the right most end, +# otherwise they remain unaltered. 
+sdtm.oak:::iso8601_truncate( + c( + "1999----T15:20:01", + "1999-01-01T-:20:01", + "1999-01-01T-:-:01", + "1999-01-01T-:-:-" + ) +) + +} +\keyword{internal} diff --git a/man/iso8601_two_digits.Rd b/man/iso8601_two_digits.Rd new file mode 100644 index 0000000..337da57 --- /dev/null +++ b/man/iso8601_two_digits.Rd @@ -0,0 +1,25 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{iso8601_two_digits} +\alias{iso8601_two_digits} +\title{Format as an ISO8601 two-digit number} +\usage{ +iso8601_two_digits(x) +} +\arguments{ +\item{x}{A character vector.} +} +\value{ +A character vector of the same size as \code{x}. +} +\description{ +\code{\link[=iso8601_two_digits]{iso8601_two_digits()}} converts a single digit or two digit number into a +two digit, 0-padded, number. Failing to parse the input as a two digit number +results in \code{NA}. +} +\examples{ +x <- c("0", "00", "1", "01", "42", "100", NA_character_, "1.") +sdtm.oak:::iso8601_two_digits(x) + +} +\keyword{internal} diff --git a/man/iso8601_year.Rd b/man/iso8601_year.Rd new file mode 100644 index 0000000..190db0f --- /dev/null +++ b/man/iso8601_year.Rd @@ -0,0 +1,35 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{iso8601_year} +\alias{iso8601_year} +\title{Format as an ISO8601 four-digit year} +\usage{ +iso8601_year(x, cutoff_2000 = 68L) +} +\arguments{ +\item{x}{A character vector.} + +\item{cutoff_2000}{A non-negative integer value. Two-digit years smaller or +equal to \code{cutoff_2000} are parsed as though starting with \code{20}, otherwise +parsed as though starting with \code{19}.} +} +\value{ +A character vector. +} +\description{ +\code{\link[=iso8601_year]{iso8601_year()}} converts a character vector whose values represent years to +four-digit years. +} +\examples{ +sdtm.oak:::iso8601_year(c("0", "1", "2", "50", "68", "69", "90", "99", "00")) + +# By default, `cutoff_2000` is at 68. 
+sdtm.oak:::iso8601_year(c("67", "68", "69", "70")) +sdtm.oak:::iso8601_year(c("1967", "1968", "1969", "1970")) + +# Change it to something else, e.g. `cutoff_2000 = 25`. +sdtm.oak:::iso8601_year(as.character(0:50), cutoff_2000 = 25) +sdtm.oak:::iso8601_year(as.character(1900:1950), cutoff_2000 = 25) + +} +\keyword{internal} diff --git a/man/months_abb_regex.Rd b/man/months_abb_regex.Rd new file mode 100644 index 0000000..adcd334 --- /dev/null +++ b/man/months_abb_regex.Rd @@ -0,0 +1,24 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{months_abb_regex} +\alias{months_abb_regex} +\title{Regex for months' abbreviations} +\usage{ +months_abb_regex(x = month.abb, case = c("any", "upper", "lower", "title")) +} +\arguments{ +\item{x}{A character vector of three-letter month abbreviations. Default is +\code{month.abb}.} + +\item{case}{A string scalar: \code{"any"}, if month abbreviations are to be +matched in any case; \code{"upper"}, to match uppercase abbreviations; +\code{"lower"}, to match lowercase; and, \code{"title"} to match title case.} +} +\value{ +A regex as a string. +} +\description{ +\code{\link[=months_abb_regex]{months_abb_regex()}} generates a regex that matches month abbreviations. For +finer control, the case can be specified with parameter \code{case}. 
+} +\keyword{internal} diff --git a/man/parse_dttm.Rd b/man/parse_dttm.Rd new file mode 100644 index 0000000..016afca --- /dev/null +++ b/man/parse_dttm.Rd @@ -0,0 +1,91 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_parse_dttm.R +\name{parse_dttm_} +\alias{parse_dttm_} +\alias{parse_dttm} +\title{Parse a date, time, or date-time} +\usage{ +parse_dttm_( + dttm, + fmt, + fmt_c = fmt_cmp(), + na = NULL, + sec_na = na, + min_na = na, + hour_na = na, + mday_na = na, + mon_na = na, + year_na = na +) + +parse_dttm( + dttm, + fmt, + fmt_c = fmt_cmp(), + na = NULL, + sec_na = na, + min_na = na, + hour_na = na, + mday_na = na, + mon_na = na, + year_na = na +) +} +\arguments{ +\item{dttm}{A character vector of dates, times or date-times.} + +\item{fmt}{In the case of \code{parse_dttm()}, a character vector of parsing +formats, or a single string format in the case of \code{parse_dttm_()}. When a +character vector of formats is passed, each format is attempted in turn +with the first parsing result to be successful taking precedence in the +final result. The formats in \code{fmt} can be any strings, however the +following characters (or successive repetitions thereof) are reserved in +the sense that they are treated in a special way: +\itemize{ +\item \code{"y"}: parsed as year; +\item \code{"m"}: parsed as month; +\item \code{"d"}: parsed as day; +\item \code{"H"}: parsed as hour; +\item \code{"M"}: parsed as minute; +\item \code{"S"}: parsed as second. +}} + +\item{na, sec_na, min_na, hour_na, mday_na, mon_na, year_na}{A character vector of +alternative values to allow during matching. This can be used to indicate +different forms of missing values to be found during the parsing of date-time +strings.} +} +\value{ +A character matrix of six columns: \code{"year"}, \code{"mon"}, \code{"mday"}, +\code{"hour"}, \code{"min"} and \code{"sec"}. Each row corresponds to an element in +\code{dttm}. 
Each element of the matrix is the parsed date/time component. +} +\description{ +\code{\link[=parse_dttm]{parse_dttm()}} extracts date and time components. \code{\link[=parse_dttm]{parse_dttm()}} wraps around +\code{\link[=parse_dttm_]{parse_dttm_()}}, which is not vectorized over \code{fmt}. +} +\examples{ +sdtm.oak:::parse_dttm("2020", "y") +sdtm.oak:::parse_dttm("2020-05", "y") + +sdtm.oak:::parse_dttm("2020-05", "y-m") +sdtm.oak:::parse_dttm("2020-05-11", "y-m-d") + +sdtm.oak:::parse_dttm("2020 05 11", "y m d") +sdtm.oak:::parse_dttm("2020 05 11", "y m d") +sdtm.oak:::parse_dttm("2020 05 11", "y\\\\s+m\\\\s+d") +sdtm.oak:::parse_dttm("2020 05 11", "y\\\\s+m\\\\s+d") + +sdtm.oak:::parse_dttm("2020-05-11 11:45", "y-m-d H:M") +sdtm.oak:::parse_dttm("2020-05-11 11:45:15.6", "y-m-d H:M:S") + +sdtm.oak:::parse_dttm(c("2002-05-11 11:45", "-05-11 11:45"), "y-m-d H:M") +sdtm.oak:::parse_dttm(c("2002-05-11 11:45", "-05-11 11:45"), "-m-d H:M") +sdtm.oak:::parse_dttm(c("2002-05-11 11:45", "-05-11 11:45"), c("y-m-d H:M", "-m-d H:M")) + +sdtm.oak:::parse_dttm(c("2020-05-18", "2020-UN-18", "2020-UNK-UN"), "y-m-d") +sdtm.oak:::parse_dttm(c("2020-05-18", "2020-UN-18", "2020-UNK-UN"), "y-m-d", na = "UN") +sdtm.oak:::parse_dttm(c("2020-05-18", "2020-UN-18", "2020-UNK-UN"), "y-m-d", na = c("UN", "UNK")) + +} +\keyword{internal} diff --git a/man/parse_dttm_fmt.Rd b/man/parse_dttm_fmt.Rd new file mode 100644 index 0000000..6a74b18 --- /dev/null +++ b/man/parse_dttm_fmt.Rd @@ -0,0 +1,65 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{parse_dttm_fmt_} +\alias{parse_dttm_fmt_} +\alias{parse_dttm_fmt} +\title{Parse a date/time format} +\usage{ +parse_dttm_fmt_(fmt, pattern) + +parse_dttm_fmt(fmt, patterns = fmt_cmp()) +} +\arguments{ +\item{fmt}{A format string (scalar) to be parsed by \code{patterns}.} + +\item{pattern, patterns}{A string (in the case of \code{pattern}), or a character +vector (in the case of \code{patterns}) 
of regexps for each of the individual +date/time components. Default value is that of \code{\link[=fmt_cmp]{fmt_cmp()}}. Use this function +if you plan on passing a different set of patterns.} +} +\value{ +A \link[tibble:tibble-package]{tibble} of seven columns: +\itemize{ +\item \code{fmt_c}: date/time format component. Values are either \code{"year"}, \code{"mon"}, +\code{"mday"}, \code{"hour"}, \code{"min"}, \code{"sec"}, or \code{NA}. +\item \code{pat}: Regexp used to parse the date/time component. +\item \code{cap}: The captured substring from the format. +\item \code{start}: Start position in the format string for this capture. +\item \code{end}: End position in the format string for this capture. +\item \code{len}: Length of the capture (number of chars). +\item \code{ord}: Ordinal of this date/time component in the format string. +} + +Each row is for either a date/time format component or a "delimiter" string +or pattern in-between format components. +} +\description{ +\code{\link[=parse_dttm_fmt]{parse_dttm_fmt()}} parses a date/time format, meaning it will try to parse +the components of the format \code{fmt} that refer to date/time components. +\code{\link[=parse_dttm_fmt_]{parse_dttm_fmt_()}} is similar to \code{\link[=parse_dttm_fmt]{parse_dttm_fmt()}} but is not vectorized +over \code{fmt}. +} +\examples{ +sdtm.oak:::parse_dttm_fmt("ymd") +sdtm.oak:::parse_dttm_fmt("H:M:S") + +sdtm.oak:::parse_dttm_fmt("ymd HMS") + +# Repeating the same special patterns, e.g. "yy" still counts as one pattern +# only. +sdtm.oak:::parse_dttm_fmt("yymmdd HHMMSS") + +# Note that `"y"`, `"m"`, `"d"`, `"H"`, `"M"` or `"S"` are reserved patterns +# that are matched first and interpreted as format components. Example: the +# first "y" in "year" is parsed as meaning year followed by "ear y". The +# second "y" is no longer matched because a first match already succeeded. 
+sdtm.oak:::parse_dttm_fmt("year y") + +# Specify custom patterns +sdtm.oak:::parse_dttm_fmt( + "year month day", + fmt_cmp(year = "year", mon = "month", mday = "day") +) + +} +\keyword{internal} diff --git a/man/pseq.Rd b/man/pseq.Rd new file mode 100644 index 0000000..c1bb877 --- /dev/null +++ b/man/pseq.Rd @@ -0,0 +1,22 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{pseq} +\alias{pseq} +\title{Parallel sequence generation} +\usage{ +pseq(from, to) +} +\arguments{ +\item{from}{An integer vector. The starting value(s) of the sequence(s).} + +\item{to}{An integer vector. The ending value(s) of the sequence(s).} +} +\value{ +An integer vector. +} +\description{ +\code{\link[=pseq]{pseq()}} is similar to \code{\link[=seq]{seq()}} but conveniently accepts integer vectors as +inputs to \code{from} and \code{to}, allowing for parallel generation of sequences. +The result is the union of the generated sequences. +} +\keyword{internal} diff --git a/man/reg_matches.Rd b/man/reg_matches.Rd new file mode 100644 index 0000000..72e531c --- /dev/null +++ b/man/reg_matches.Rd @@ -0,0 +1,25 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{reg_matches} +\alias{reg_matches} +\title{\code{regmatches()} with \code{NA}} +\usage{ +reg_matches(x, m, invert = FALSE) +} +\arguments{ +\item{x}{A character vector.} + +\item{m}{An object with match data.} + +\item{invert}{A logical scalar. If \code{TRUE}, extract or replace the non-matched +substrings.} +} +\value{ +A list of character vectors with the matched substrings, or \code{NA} if +matching failed. +} +\description{ +\code{\link[=reg_matches]{reg_matches()}} is a thin wrapper around \code{\link[=regmatches]{regmatches()}} that returns +\code{NA} instead of \code{character(0)} when matching fails. 
+} +\keyword{internal} diff --git a/man/regex_or.Rd b/man/regex_or.Rd new file mode 100644 index 0000000..efb2ba4 --- /dev/null +++ b/man/regex_or.Rd @@ -0,0 +1,32 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{regex_or} +\alias{regex_or} +\title{Utility function to assemble a regex of alternative patterns} +\usage{ +regex_or(x, .open = FALSE, .close = FALSE) +} +\arguments{ +\item{x}{A character vector of alternative patterns.} + +\item{.open}{Whether the resulting regex should start with \code{"|"}.} + +\item{.close}{Whether the resulting regex should end with \code{"|"}.} +} +\value{ +A character scalar of the resulting regex. +} +\description{ +\code{\link[=regex_or]{regex_or()}} takes a set of patterns and binds them with the Or (\code{"|"}) +pattern for an easy regex of alternative patterns. +} +\examples{ +# A regex for matching either "jan" or "feb" +sdtm.oak:::regex_or(c("jan", "feb")) + +# Setting `.open` and/or `.close` to `TRUE` can be handy if this regex +# is to be combined into a larger regex. +paste0(sdtm.oak:::regex_or(c("jan", "feb"), .close = TRUE), r"{\d{2}}") + +} +\keyword{internal} diff --git a/man/sdtm.oak-package.Rd b/man/sdtm.oak-package.Rd new file mode 100644 index 0000000..fc04b35 --- /dev/null +++ b/man/sdtm.oak-package.Rd @@ -0,0 +1,37 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/sdtm.oak-package.R +\docType{package} +\name{sdtm.oak-package} +\alias{sdtm.oak} +\alias{sdtm.oak-package} +\title{sdtm.oak: SDTM Data Transformation Engine} +\description{ +An EDC and Data Standard agnostic SDTM data transformation engine that automates the transformation of raw clinical data in ODM format to SDTM based on standard mapping algorithms. 
+} +\seealso{ +Useful links: +\itemize{ + \item \url{https://pharmaverse.github.io/sdtm.oak/} + \item \url{https://github.com/pharmaverse/sdtm.oak} + \item Report bugs at \url{https://github.com/pharmaverse/sdtm.oak/issues} +} + +} +\author{ +\strong{Maintainer}: Omar Garcia \email{ogcalderon@cdisc.org} + +Authors: +\itemize{ + \item Rammprasad Ganapathy + \item Ramiro Magno \email{rmagno@pattern.institute} (\href{https://orcid.org/0000-0001-5226-3441}{ORCID}) +} + +Other contributors: +\itemize{ + \item Pattern Institute [copyright holder, funder] + \item F. Hoffmann-La Roche AG [copyright holder, funder] + \item Pfizer Inc [copyright holder, funder] +} + +} +\keyword{internal} diff --git a/man/sdtm.oak.Rd b/man/sdtm.oak.Rd deleted file mode 100644 index 3a8c460..0000000 --- a/man/sdtm.oak.Rd +++ /dev/null @@ -1,12 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/package.R -\name{sdtm.oak} -\alias{sdtm.oak} -\title{An EDC and Data Standard agnostic SDTM data transformation engine that automates -the transformation of raw clinical data in ODM format to SDTM based on standard -mapping algorithms} -\description{ -An EDC and Data Standard agnostic SDTM data transformation engine that automates -the transformation of raw clinical data in ODM format to SDTM based on standard -mapping algorithms -} diff --git a/man/str_to_anycase.Rd b/man/str_to_anycase.Rd new file mode 100644 index 0000000..6d5007f --- /dev/null +++ b/man/str_to_anycase.Rd @@ -0,0 +1,19 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/parse_dttm_fmt.R +\name{str_to_anycase} +\alias{str_to_anycase} +\title{Generate case insensitive regexps} +\usage{ +str_to_anycase(x) +} +\arguments{ +\item{x}{A character vector of strings consisting of word characters.} +} +\value{ +A character vector. 
+} +\description{ +\code{\link[=str_to_anycase]{str_to_anycase()}} takes a character vector of word strings as input, and +generates regular expressions that match those strings in any case. +} +\keyword{internal} diff --git a/man/yy_to_yyyy.Rd b/man/yy_to_yyyy.Rd new file mode 100644 index 0000000..c4895aa --- /dev/null +++ b/man/yy_to_yyyy.Rd @@ -0,0 +1,35 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{yy_to_yyyy} +\alias{yy_to_yyyy} +\title{Convert two-digit to four-digit years} +\usage{ +yy_to_yyyy(x, cutoff_2000 = 68L) +} +\arguments{ +\item{x}{An integer vector of years.} + +\item{cutoff_2000}{An integer value. Two-digit years less than or equal to +\code{cutoff_2000} are parsed as though starting with \code{20}; otherwise they are parsed as +though starting with \code{19}.} +} +\value{ +An integer vector. +} +\description{ +\code{\link[=yy_to_yyyy]{yy_to_yyyy()}} converts two-digit years to four-digit years. +} +\examples{ +sdtm.oak:::yy_to_yyyy(0:5) +sdtm.oak:::yy_to_yyyy(2000:2005) + +sdtm.oak:::yy_to_yyyy(90:99) +sdtm.oak:::yy_to_yyyy(1990:1999) + +# NB: change in behavior after 68 +sdtm.oak:::yy_to_yyyy(65:72) + +sdtm.oak:::yy_to_yyyy(1965:1972) + +} +\keyword{internal} diff --git a/man/zero_pad_whole_number.Rd b/man/zero_pad_whole_number.Rd new file mode 100644 index 0000000..d4b972f --- /dev/null +++ b/man/zero_pad_whole_number.Rd @@ -0,0 +1,30 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dtc_create_iso8601.R +\name{zero_pad_whole_number} +\alias{zero_pad_whole_number} +\title{Convert an integer to a zero-padded character vector} +\usage{ +zero_pad_whole_number(x, n = 2L) +} +\arguments{ +\item{x}{An integer vector.} + +\item{n}{Number of digits in the output, including zero padding.} +} +\value{ +A character vector. 
+} +\description{ +\code{\link[=zero_pad_whole_number]{zero_pad_whole_number()}} takes non-negative integer values and converts +them to character with zero padding. Negative numbers and numbers greater +than the width specified by the number of digits \code{n} are converted to \code{NA}. +} +\examples{ +sdtm.oak:::zero_pad_whole_number(c(-1, 0, 1)) + +sdtm.oak:::zero_pad_whole_number(c(-1, 0, 1, 10, 99, 100), n = 2) + +sdtm.oak:::zero_pad_whole_number(c(-1, 0, 1, 10, 99, 100), n = 3) + +} +\keyword{internal} diff --git a/renv.lock b/renv.lock index 7426a19..5141d10 100644 --- a/renv.lock +++ b/renv.lock @@ -1,6 +1,6 @@ { "R": { - "Version": "4.3.1", + "Version": "4.3.2", "Repositories": [ { "Name": "CRAN", @@ -54,7 +54,7 @@ }, "R.utils": { "Package": "R.utils", - "Version": "2.12.2", + "Version": "2.12.3", "Source": "Repository", "Repository": "RSPM", "Requirements": [ @@ -65,7 +65,7 @@ "tools", "utils" ], - "Hash": "325f01db13da12c04d8f6e7be36ff514" + "Hash": "3dc2829b790254bfba21e60965787651" }, "R6": { "Package": "R6", @@ -526,6 +526,20 @@ ], "Hash": "06230136b2d2b9ba5805e1963fa6e890" }, + "hms": { + "Package": "hms", + "Version": "1.1.3", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "lifecycle", + "methods", + "pkgconfig", + "rlang", + "vctrs" + ], + "Hash": "b59377caa7ed00fa41808342002138f9" + }, "htmltools": { "Package": "htmltools", "Version": "0.5.5", @@ -686,6 +700,19 @@ ], "Hash": "001cecbeac1cff9301bdc3775ee46a86" }, + "lubridate": { + "Package": "lubridate", + "Version": "1.9.2", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "generics", + "methods", + "timechange" + ], + "Hash": "e25f18436e3efd42c7c590a1c4c15390" + }, "magrittr": { "Package": "magrittr", "Version": "2.0.3", @@ -1366,6 +1393,17 @@ ], "Hash": "79540e5fcd9e0435af547d885f184fd5" }, + "timechange": { + "Package": "timechange", + "Version": "0.2.0", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + 
"cpp11" + ], + "Hash": "8548b44f79a35ba1791308b61e6012d7" + }, "tinytex": { "Package": "tinytex", "Version": "0.45", diff --git a/renv/profiles/4.1/renv.lock b/renv/profiles/4.1/renv.lock deleted file mode 100644 index 5b85869..0000000 --- a/renv/profiles/4.1/renv.lock +++ /dev/null @@ -1,1254 +0,0 @@ -{ - "R": { - "Version": "4.1.3", - "Repositories": [ - { - "Name": "CRAN", - "URL": "https://packagemanager.posit.co/cran/latest" - }, - { - "Name": "RSPM", - "URL": "https://packagemanager.posit.co/cran/2022-03-10" - } - ] - }, - "Packages": { - "R.cache": { - "Package": "R.cache", - "Version": "0.16.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R.methodsS3", - "R.oo", - "R.utils", - "digest", - "utils" - ], - "Hash": "fe539ca3f8efb7410c3ae2cf5fe6c0f8" - }, - "R.methodsS3": { - "Package": "R.methodsS3", - "Version": "1.8.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "utils" - ], - "Hash": "278c286fd6e9e75d0c2e8f731ea445c8" - }, - "R.oo": { - "Package": "R.oo", - "Version": "1.25.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R.methodsS3", - "methods", - "utils" - ], - "Hash": "a0900a114f4f0194cf4aa8cd4a700681" - }, - "R.utils": { - "Package": "R.utils", - "Version": "2.12.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R.methodsS3", - "R.oo", - "methods", - "tools", - "utils" - ], - "Hash": "325f01db13da12c04d8f6e7be36ff514" - }, - "R6": { - "Package": "R6", - "Version": "2.5.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "470851b6d5d0ac559e9d01bb352b4021" - }, - "Rcpp": { - "Package": "Rcpp", - "Version": "1.0.8", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "methods", - "utils" - ], - "Hash": "22b546dd7e337f6c0c58a39983a496bc" - }, - "askpass": { - "Package": "askpass", - "Version": "1.1", - "Source": "Repository", - "Repository": "RSPM", - 
"Requirements": [ - "sys" - ], - "Hash": "e8a22846fff485f0be3770c2da758713" - }, - "backports": { - "Package": "backports", - "Version": "1.4.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "c39fbec8a30d23e721980b8afb31984c" - }, - "base64enc": { - "Package": "base64enc", - "Version": "0.1-3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "543776ae6848fde2f48ff3816d0628bc" - }, - "brew": { - "Package": "brew", - "Version": "1.0-7", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "38875ea52350ff4b4c03849fc69736c8" - }, - "brio": { - "Package": "brio", - "Version": "1.1.3", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "976cf154dfb043c012d87cddd8bca363" - }, - "bslib": { - "Package": "bslib", - "Version": "0.3.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "grDevices", - "htmltools", - "jquerylib", - "jsonlite", - "rlang", - "sass" - ], - "Hash": "56ae7e1987b340186a8a5a157c2ec358" - }, - "cachem": { - "Package": "cachem", - "Version": "1.0.6", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "fastmap", - "rlang" - ], - "Hash": "648c5b3d71e6a37e3043617489a0a0e9" - }, - "callr": { - "Package": "callr", - "Version": "3.7.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R6", - "processx", - "utils" - ], - "Hash": "461aa75a11ce2400245190ef5d3995df" - }, - "checkmate": { - "Package": "checkmate", - "Version": "2.0.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "backports", - "utils" - ], - "Hash": "a667800d5f0350371bedeb8b8b950289" - }, - "cli": { - "Package": "cli", - "Version": "3.2.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "glue", - "utils" - ], - "Hash": "1bdb126893e9ce6aae50ad1d6fc32faf" - }, - "clipr": { - "Package": "clipr", - "Version": "0.8.0", - "Source": "Repository", - "Repository": 
"RSPM", - "Requirements": [ - "utils" - ], - "Hash": "3f038e5ac7f41d4ac41ce658c85e3042" - }, - "commonmark": { - "Package": "commonmark", - "Version": "1.8.0", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "2ba81b120c1655ab696c935ef33ea716" - }, - "cpp11": { - "Package": "cpp11", - "Version": "0.4.2", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "fa53ce256cd280f468c080a58ea5ba8c" - }, - "crayon": { - "Package": "crayon", - "Version": "1.5.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "grDevices", - "methods", - "utils" - ], - "Hash": "741c2e098e98afe3dc26a7b0e5489f4e" - }, - "credentials": { - "Package": "credentials", - "Version": "1.3.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "askpass", - "curl", - "jsonlite", - "openssl", - "sys" - ], - "Hash": "93762d0a34d78e6a025efdbfb5c6bb41" - }, - "curl": { - "Package": "curl", - "Version": "4.3.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "022c42d49c28e95d69ca60446dbabf88" - }, - "desc": { - "Package": "desc", - "Version": "1.4.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R6", - "cli", - "rprojroot", - "utils" - ], - "Hash": "eebd27ee58fcc58714eedb7aa07d8ad1" - }, - "devtools": { - "Package": "devtools", - "Version": "2.4.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "callr", - "cli", - "desc", - "ellipsis", - "fs", - "httr", - "lifecycle", - "memoise", - "pkgbuild", - "pkgload", - "rcmdcheck", - "remotes", - "rlang", - "roxygen2", - "rstudioapi", - "rversions", - "sessioninfo", - "stats", - "testthat", - "tools", - "usethis", - "utils", - "withr" - ], - "Hash": "fc35e13bb582e5fe6f63f3d647a4cbe5" - }, - "diffobj": { - "Package": "diffobj", - "Version": "0.3.5", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "crayon", - "methods", - "stats", - "tools", - "utils" - ], - "Hash": 
"bcaa8b95f8d7d01a5dedfd959ce88ab8" - }, - "digest": { - "Package": "digest", - "Version": "0.6.29", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "utils" - ], - "Hash": "cf6b206a045a684728c3267ef7596190" - }, - "dplyr": { - "Package": "dplyr", - "Version": "1.0.8", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R6", - "generics", - "glue", - "lifecycle", - "magrittr", - "methods", - "pillar", - "rlang", - "tibble", - "tidyselect", - "utils", - "vctrs" - ], - "Hash": "ef47665e64228a17609d6df877bf86f2" - }, - "ellipsis": { - "Package": "ellipsis", - "Version": "0.3.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "rlang" - ], - "Hash": "bb0eec2fe32e88d9e2836c2f73ea2077" - }, - "evaluate": { - "Package": "evaluate", - "Version": "0.15", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "methods" - ], - "Hash": "699a7a93d08c962d9f8950b2d7a227f1" - }, - "fansi": { - "Package": "fansi", - "Version": "1.0.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "grDevices", - "utils" - ], - "Hash": "f28149c2d7a1342a834b314e95e67260" - }, - "fastmap": { - "Package": "fastmap", - "Version": "1.1.0", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "77bd60a6157420d4ffa93b27cf6a58b8" - }, - "fs": { - "Package": "fs", - "Version": "1.5.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "methods" - ], - "Hash": "7c89603d81793f0d5486d91ab1fc6f1d" - }, - "generics": { - "Package": "generics", - "Version": "0.1.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "methods" - ], - "Hash": "177475892cf4a55865868527654a7741" - }, - "gert": { - "Package": "gert", - "Version": "1.5.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "askpass", - "credentials", - "openssl", - "rstudioapi", - "sys", - "zip" - ], - "Hash": 
"8fddce7cbd59467106266a6e93e253b4" - }, - "gh": { - "Package": "gh", - "Version": "1.3.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "cli", - "gitcreds", - "httr", - "ini", - "jsonlite" - ], - "Hash": "38c2580abbda249bd6afeec00d14f531" - }, - "git2r": { - "Package": "git2r", - "Version": "0.29.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "graphics", - "utils" - ], - "Hash": "b114135c4749076bd5ef74a5827b6f62" - }, - "gitcreds": { - "Package": "gitcreds", - "Version": "0.1.1", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "f3aefccc1cc50de6338146b62f115de8" - }, - "glue": { - "Package": "glue", - "Version": "1.6.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "methods" - ], - "Hash": "4f2596dfb05dac67b9dc558e5c6fba2e" - }, - "highr": { - "Package": "highr", - "Version": "0.9", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "xfun" - ], - "Hash": "8eb36c8125038e648e5d111c0d7b2ed4" - }, - "htmltools": { - "Package": "htmltools", - "Version": "0.5.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "base64enc", - "digest", - "fastmap", - "grDevices", - "rlang", - "utils" - ], - "Hash": "526c484233f42522278ab06fb185cb26" - }, - "httr": { - "Package": "httr", - "Version": "1.4.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R6", - "curl", - "jsonlite", - "mime", - "openssl" - ], - "Hash": "a525aba14184fec243f9eaec62fbed43" - }, - "hunspell": { - "Package": "hunspell", - "Version": "3.0.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "Rcpp", - "digest" - ], - "Hash": "3987784c19192ad0f2261c456d936df1" - }, - "ini": { - "Package": "ini", - "Version": "0.3.1", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "6154ec2223172bce8162d4153cda21f7" - }, - "jquerylib": { - "Package": "jquerylib", - "Version": "0.1.4", - 
"Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "htmltools" - ], - "Hash": "5aab57a3bd297eee1c1d862735972182" - }, - "jsonlite": { - "Package": "jsonlite", - "Version": "1.8.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "methods" - ], - "Hash": "d07e729b27b372429d42d24d503613a0" - }, - "knitr": { - "Package": "knitr", - "Version": "1.37", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "evaluate", - "highr", - "methods", - "stringr", - "tools", - "xfun", - "yaml" - ], - "Hash": "a4ec675eb332a33fe7b7fe26f70e1f98" - }, - "lifecycle": { - "Package": "lifecycle", - "Version": "1.0.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "glue", - "rlang" - ], - "Hash": "a6b6d352e3ed897373ab19d8395c98d0" - }, - "magrittr": { - "Package": "magrittr", - "Version": "2.0.2", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "cdc87ecd81934679d1557633d8e1fe51" - }, - "memoise": { - "Package": "memoise", - "Version": "2.0.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "cachem", - "rlang" - ], - "Hash": "e2817ccf4a065c5d9d7f2cfbe7c1d78c" - }, - "mime": { - "Package": "mime", - "Version": "0.12", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "tools" - ], - "Hash": "18e9c28c1d3ca1560ce30658b22ce104" - }, - "openssl": { - "Package": "openssl", - "Version": "2.0.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "askpass" - ], - "Hash": "cf4329aac12c2c44089974559c18e446" - }, - "pillar": { - "Package": "pillar", - "Version": "1.7.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "cli", - "crayon", - "ellipsis", - "fansi", - "glue", - "lifecycle", - "rlang", - "utf8", - "utils", - "vctrs" - ], - "Hash": "51dfc97e1b7069e9f7e6f83f3589c22e" - }, - "pkgbuild": { - "Package": "pkgbuild", - "Version": "1.3.1", - "Source": "Repository", - "Repository": "RSPM", - 
"Requirements": [ - "R", - "R6", - "callr", - "cli", - "crayon", - "desc", - "prettyunits", - "rprojroot", - "withr" - ], - "Hash": "66d2adfed274daf81ccfe77d974c3b9b" - }, - "pkgconfig": { - "Package": "pkgconfig", - "Version": "2.0.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "utils" - ], - "Hash": "01f28d4278f15c76cddbea05899c5d6f" - }, - "pkgload": { - "Package": "pkgload", - "Version": "1.2.4", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "cli", - "crayon", - "desc", - "methods", - "rlang", - "rprojroot", - "rstudioapi", - "utils", - "withr" - ], - "Hash": "7533cd805940821bf23eaf3c8d4c1735" - }, - "praise": { - "Package": "praise", - "Version": "1.0.0", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "a555924add98c99d2f411e37e7d25e9f" - }, - "prettyunits": { - "Package": "prettyunits", - "Version": "1.1.1", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "95ef9167b75dde9d2ccc3c7528393e7e" - }, - "processx": { - "Package": "processx", - "Version": "3.5.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R6", - "ps", - "utils" - ], - "Hash": "0cbca2bc4d16525d009c4dbba156b37c" - }, - "ps": { - "Package": "ps", - "Version": "1.6.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "utils" - ], - "Hash": "32620e2001c1dce1af49c49dccbb9420" - }, - "purrr": { - "Package": "purrr", - "Version": "0.3.4", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "magrittr", - "rlang" - ], - "Hash": "97def703420c8ab10d8f0e6c72101e02" - }, - "rappdirs": { - "Package": "rappdirs", - "Version": "0.3.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "5e3c5dc0b071b21fa128676560dbe94d" - }, - "rcmdcheck": { - "Package": "rcmdcheck", - "Version": "1.4.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R6", - "callr", - "cli", - "curl", - "desc", - 
"digest", - "pkgbuild", - "prettyunits", - "rprojroot", - "sessioninfo", - "utils", - "withr", - "xopen" - ], - "Hash": "8f25ebe2ec38b1f2aef3b0d2ef76f6c4" - }, - "rematch2": { - "Package": "rematch2", - "Version": "2.1.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "tibble" - ], - "Hash": "76c9e04c712a05848ae7a23d2f170a40" - }, - "remotes": { - "Package": "remotes", - "Version": "2.4.2.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "methods", - "stats", - "tools", - "utils" - ], - "Hash": "63d15047eb239f95160112bcadc4fcb9" - }, - "renv": { - "Package": "renv", - "Version": "1.0.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "utils" - ], - "Hash": "41b847654f567341725473431dd0d5ab" - }, - "rlang": { - "Package": "rlang", - "Version": "1.0.6", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "utils" - ], - "Hash": "4ed1f8336c8d52c3e750adcdc57228a7" - }, - "rmarkdown": { - "Package": "rmarkdown", - "Version": "2.12", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "bslib", - "evaluate", - "htmltools", - "jquerylib", - "jsonlite", - "knitr", - "methods", - "stringr", - "tinytex", - "tools", - "utils", - "xfun", - "yaml" - ], - "Hash": "354da5088ddfdffb73c11cc952885d88" - }, - "roxygen2": { - "Package": "roxygen2", - "Version": "7.2.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R6", - "brew", - "cli", - "commonmark", - "cpp11", - "desc", - "knitr", - "methods", - "pkgload", - "purrr", - "rlang", - "stringi", - "stringr", - "utils", - "withr", - "xml2" - ], - "Hash": "7b153c746193b143c14baa072bae4e27" - }, - "rprojroot": { - "Package": "rprojroot", - "Version": "2.0.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "249d8cd1e74a8f6a26194a91b47f21d1" - }, - "rstudioapi": { - "Package": "rstudioapi", - "Version": "0.13", - "Source": 
"Repository", - "Repository": "RSPM", - "Hash": "06c85365a03fdaf699966cc1d3cf53ea" - }, - "rversions": { - "Package": "rversions", - "Version": "2.1.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "curl", - "utils", - "xml2" - ], - "Hash": "f88fab00907b312f8b23ec13e2d437cb" - }, - "sass": { - "Package": "sass", - "Version": "0.4.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R6", - "fs", - "htmltools", - "rappdirs", - "rlang" - ], - "Hash": "50cf822feb64bb3977bda0b7091be623" - }, - "sessioninfo": { - "Package": "sessioninfo", - "Version": "1.2.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "cli", - "tools", - "utils" - ], - "Hash": "3f9796a8d0a0e8c6eb49a4b029359d1f" - }, - "spelling": { - "Package": "spelling", - "Version": "2.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "commonmark", - "hunspell", - "knitr", - "xml2" - ], - "Hash": "b8c899a5c83f0d897286550481c91798" - }, - "staged.dependencies": { - "Package": "staged.dependencies", - "Version": "0.3.1.9001", - "Source": "GitHub", - "RemoteType": "github", - "RemoteHost": "api.github.com", - "RemoteUsername": "openpharma", - "RemoteRepo": "staged.dependencies", - "RemoteRef": "main", - "RemoteSha": "fb124997306b35d44a0225bb4b400bf7258c4c75", - "Requirements": [ - "checkmate", - "desc", - "devtools", - "digest", - "dplyr", - "fs", - "git2r", - "glue", - "httr", - "jsonlite", - "methods", - "rcmdcheck", - "remotes", - "rlang", - "stats", - "tidyr", - "utils", - "withr", - "yaml" - ], - "Hash": "145e45afff215d85f808dda07557fcad" - }, - "stringi": { - "Package": "stringi", - "Version": "1.7.6", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "stats", - "tools", - "utils" - ], - "Hash": "bba431031d30789535745a9627ac9271" - }, - "stringr": { - "Package": "stringr", - "Version": "1.4.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - 
"glue", - "magrittr", - "stringi" - ], - "Hash": "0759e6b6c0957edb1311028a49a35e76" - }, - "styler": { - "Package": "styler", - "Version": "1.10.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R.cache", - "cli", - "magrittr", - "purrr", - "rlang", - "rprojroot", - "tools", - "vctrs", - "withr" - ], - "Hash": "d61238fd44fc63c8adf4565efe8eb682" - }, - "sys": { - "Package": "sys", - "Version": "3.4", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "b227d13e29222b4574486cfcbde077fa" - }, - "testthat": { - "Package": "testthat", - "Version": "3.1.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R6", - "brio", - "callr", - "cli", - "crayon", - "desc", - "digest", - "ellipsis", - "evaluate", - "jsonlite", - "lifecycle", - "magrittr", - "methods", - "pkgload", - "praise", - "processx", - "ps", - "rlang", - "utils", - "waldo", - "withr" - ], - "Hash": "32454e5780e8dbe31e4b61b13d8918fe" - }, - "tibble": { - "Package": "tibble", - "Version": "3.1.6", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "ellipsis", - "fansi", - "lifecycle", - "magrittr", - "methods", - "pillar", - "pkgconfig", - "rlang", - "utils", - "vctrs" - ], - "Hash": "8a8f02d1934dfd6431c671361510dd0b" - }, - "tidyr": { - "Package": "tidyr", - "Version": "1.2.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "cpp11", - "dplyr", - "ellipsis", - "glue", - "lifecycle", - "magrittr", - "purrr", - "rlang", - "tibble", - "tidyselect", - "utils", - "vctrs" - ], - "Hash": "d8b95b7fee945d7da6888cf7eb71a49c" - }, - "tidyselect": { - "Package": "tidyselect", - "Version": "1.1.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "ellipsis", - "glue", - "purrr", - "rlang", - "vctrs" - ], - "Hash": "17f6da8cfd7002760a859915ce7eef8f" - }, - "tinytex": { - "Package": "tinytex", - "Version": "0.37", - "Source": "Repository", - "Repository": "RSPM", - 
"Requirements": [ - "xfun" - ], - "Hash": "a80abeb527a977e4bef21873d29222dd" - }, - "usethis": { - "Package": "usethis", - "Version": "2.1.5", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "cli", - "clipr", - "crayon", - "curl", - "desc", - "fs", - "gert", - "gh", - "glue", - "jsonlite", - "lifecycle", - "purrr", - "rappdirs", - "rlang", - "rprojroot", - "rstudioapi", - "stats", - "utils", - "whisker", - "withr", - "yaml" - ], - "Hash": "c499f488e6dd7718accffaee5bc5a79b" - }, - "utf8": { - "Package": "utf8", - "Version": "1.2.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "c9c462b759a5cc844ae25b5942654d13" - }, - "vctrs": { - "Package": "vctrs", - "Version": "0.4.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "cli", - "glue", - "rlang" - ], - "Hash": "8b54f22e2a58c4f275479c92ce041a57" - }, - "waldo": { - "Package": "waldo", - "Version": "0.3.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "cli", - "diffobj", - "fansi", - "glue", - "methods", - "rematch2", - "rlang", - "tibble" - ], - "Hash": "ad8cfff5694ac5b3c354f8f2044bd976" - }, - "whisker": { - "Package": "whisker", - "Version": "0.4", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "ca970b96d894e90397ed20637a0c1bbe" - }, - "withr": { - "Package": "withr", - "Version": "2.5.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "grDevices", - "graphics", - "stats" - ], - "Hash": "c0e49a9760983e81e55cdd9be92e7182" - }, - "xfun": { - "Package": "xfun", - "Version": "0.30", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "stats", - "tools" - ], - "Hash": "e83f48136b041845e50a6658feffb197" - }, - "xml2": { - "Package": "xml2", - "Version": "1.3.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "methods" - ], - "Hash": "40682ed6a969ea5abfd351eb67833adc" - }, - "xopen": { - 
"Package": "xopen", - "Version": "1.0.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "processx" - ], - "Hash": "6c85f015dee9cc7710ddd20f86881f58" - }, - "yaml": { - "Package": "yaml", - "Version": "2.3.5", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "458bb38374d73bf83b1bb85e353da200" - }, - "zip": { - "Package": "zip", - "Version": "2.2.0", - "Source": "Repository", - "Repository": "RSPM", - "Hash": "c7eef2996ac270a18c2715c997a727c5" - } - } -} diff --git a/renv/profiles/4.1/renv/.gitignore b/renv/profiles/4.1/renv/.gitignore deleted file mode 100644 index 0ec0cbb..0000000 --- a/renv/profiles/4.1/renv/.gitignore +++ /dev/null @@ -1,7 +0,0 @@ -library/ -local/ -cellar/ -lock/ -python/ -sandbox/ -staging/ diff --git a/renv/profiles/4.1/renv/settings.json b/renv/profiles/4.1/renv/settings.json deleted file mode 100644 index 3830d97..0000000 --- a/renv/profiles/4.1/renv/settings.json +++ /dev/null @@ -1,21 +0,0 @@ -{ - "bioconductor.version": null, - "external.libraries": [], - "ignored.packages": [ - "admiraldev" - ], - "package.dependency.fields": [ - "Imports", - "Depends", - "LinkingTo" - ], - "ppm.enabled": null, - "ppm.ignored.urls": [], - "r.version": null, - "snapshot.type": "custom", - "use.cache": true, - "vcs.ignore.cellar": true, - "vcs.ignore.library": true, - "vcs.ignore.local": true, - "vcs.manage.ignores": true -} diff --git a/renv/profiles/4.2/renv.lock b/renv/profiles/4.2/renv.lock index c7dd8ca..4120eec 100644 --- a/renv/profiles/4.2/renv.lock +++ b/renv/profiles/4.2/renv.lock @@ -54,7 +54,7 @@ }, "R.utils": { "Package": "R.utils", - "Version": "2.12.2", + "Version": "2.12.3", "Source": "Repository", "Repository": "RSPM", "Requirements": [ @@ -65,7 +65,7 @@ "tools", "utils" ], - "Hash": "325f01db13da12c04d8f6e7be36ff514" + "Hash": "3dc2829b790254bfba21e60965787651" }, "R6": { "Package": "R6", @@ -526,6 +526,21 @@ ], "Hash": "06230136b2d2b9ba5805e1963fa6e890" }, + "hms": { + "Package": "hms", + 
"Version": "1.1.2", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "ellipsis", + "lifecycle", + "methods", + "pkgconfig", + "rlang", + "vctrs" + ], + "Hash": "41100392191e1244b887878b533eea91" + }, "htmltools": { "Package": "htmltools", "Version": "0.5.4", @@ -686,6 +701,19 @@ ], "Hash": "001cecbeac1cff9301bdc3775ee46a86" }, + "lubridate": { + "Package": "lubridate", + "Version": "1.9.2", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "generics", + "methods", + "timechange" + ], + "Hash": "e25f18436e3efd42c7c590a1c4c15390" + }, "magrittr": { "Package": "magrittr", "Version": "2.0.3", @@ -1365,6 +1393,17 @@ ], "Hash": "79540e5fcd9e0435af547d885f184fd5" }, + "timechange": { + "Package": "timechange", + "Version": "0.2.0", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "cpp11" + ], + "Hash": "8548b44f79a35ba1791308b61e6012d7" + }, "tinytex": { "Package": "tinytex", "Version": "0.44", diff --git a/renv/profiles/4.3/renv.lock b/renv/profiles/4.3/renv.lock index 7426a19..5141d10 100644 --- a/renv/profiles/4.3/renv.lock +++ b/renv/profiles/4.3/renv.lock @@ -1,6 +1,6 @@ { "R": { - "Version": "4.3.1", + "Version": "4.3.2", "Repositories": [ { "Name": "CRAN", @@ -54,7 +54,7 @@ }, "R.utils": { "Package": "R.utils", - "Version": "2.12.2", + "Version": "2.12.3", "Source": "Repository", "Repository": "RSPM", "Requirements": [ @@ -65,7 +65,7 @@ "tools", "utils" ], - "Hash": "325f01db13da12c04d8f6e7be36ff514" + "Hash": "3dc2829b790254bfba21e60965787651" }, "R6": { "Package": "R6", @@ -526,6 +526,20 @@ ], "Hash": "06230136b2d2b9ba5805e1963fa6e890" }, + "hms": { + "Package": "hms", + "Version": "1.1.3", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "lifecycle", + "methods", + "pkgconfig", + "rlang", + "vctrs" + ], + "Hash": "b59377caa7ed00fa41808342002138f9" + }, "htmltools": { "Package": "htmltools", "Version": "0.5.5", @@ -686,6 +700,19 @@ ], "Hash": 
"001cecbeac1cff9301bdc3775ee46a86" }, + "lubridate": { + "Package": "lubridate", + "Version": "1.9.2", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "generics", + "methods", + "timechange" + ], + "Hash": "e25f18436e3efd42c7c590a1c4c15390" + }, "magrittr": { "Package": "magrittr", "Version": "2.0.3", @@ -1366,6 +1393,17 @@ ], "Hash": "79540e5fcd9e0435af547d885f184fd5" }, + "timechange": { + "Package": "timechange", + "Version": "0.2.0", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "cpp11" + ], + "Hash": "8548b44f79a35ba1791308b61e6012d7" + }, "tinytex": { "Package": "tinytex", "Version": "0.45", diff --git a/staged_dependencies.yaml b/staged_dependencies.yaml new file mode 100644 index 0000000..72a49b0 --- /dev/null +++ b/staged_dependencies.yaml @@ -0,0 +1,11 @@ +--- +upstream_repos: +- repo: pharmaverse/admiraldev + host: https://github.com + +downstream_repos: + +current_repo: + repo: pharmaverse/sdtm.oak + host: https://github.com + \ No newline at end of file diff --git a/tests/testthat/test-create_iso8601.R b/tests/testthat/test-create_iso8601.R new file mode 100644 index 0000000..f1325cf --- /dev/null +++ b/tests/testthat/test-create_iso8601.R @@ -0,0 +1,75 @@ +test_that("`create_iso8601()`: individual date components", { + x <- c("0", "50", "1950", "80", "1980", "2000") + y0 <- create_iso8601(x, .format = "y", .check_format = FALSE) + y1 <- c(NA, "2050", "1950", "1980", "1980", "2000") + expect_identical(y0, y1) + + x <- c("0", "jan", "JAN", "JaN", "1", "01") + y0 <- create_iso8601(x, .format = "m", .check_format = FALSE) + y1 <- c(NA, "--01", "--01", "--01", NA, "--01") + expect_identical(y0, y1) + + x <- c("0", "00", "1", "01", "10", "31") + y0 <- create_iso8601(x, .format = "d", .check_format = FALSE) + y1 <- c("----00", "----00", "----01", "----01", "----10", "----31") + expect_identical(y0, y1) +}) + +test_that("`create_iso8601()`: dates", { + y1 <- c("1999-01-01", "2000-01-01", 
"1999-01-01", "1999-12-31") + + x <- c("19990101", "20000101", "990101", "991231") + y0 <- create_iso8601(x, .format = "ymd", .check_format = FALSE) + expect_identical(y0, y1) + + x <- c("1999-01-01", "2000-01-01", "99-01-01", "99-12-31") + y0 <- create_iso8601(x, .format = "y-m-d", .check_format = FALSE) + expect_identical(y0, y1) + + x <- c("1999 01 01", "2000 01 01", "99 01 01", "99 12 31") + y0 <- create_iso8601(x, .format = "y m d", .check_format = FALSE) + expect_identical(y0, y1) +}) + +test_that("`create_iso8601()`: times: hours and minutes", { + y1 <- c("-----T15:20", "-----T00:10", "-----T23:01", "-----T00:00") + + x <- c("1520", "0010", "2301", "0000") + y0 <- create_iso8601(x, .format = "HM", .check_format = FALSE) + expect_identical(y0, y1) + + x <- c("15:20", "00:10", "23:01", "00:00") + y0 <- create_iso8601(x, .format = "H:M", .check_format = FALSE) + expect_identical(y0, y1) + + x <- c("15h20", "00h10", "23h01", "00h00") + y0 <- create_iso8601(x, .format = "HhM", .check_format = FALSE) + expect_identical(y0, y1) +}) + +test_that("`create_iso8601()`: times: hours, minutes and seconds", { + x <- c("152000", "001059", "230112.123", "00002.") + y0 <- create_iso8601(x, .format = "HMS", .check_format = FALSE) + y1 <- c("-----T15:20:00", "-----T00:10:59", "-----T23:01:12.123", "-----T00:00:02") + expect_identical(y0, y1) + + x <- c("15:20:00", "00:10:59", "23:01:12.123", "00:00:2.", "5:1:4") + y0 <- create_iso8601(x, .format = "H:M:S", .check_format = FALSE) + y1 <- c(y1, "-----T05:01:04") + expect_identical(y0, y1) +}) + + +test_that("`create_iso8601()`: dates and times", { + dates <- c("1999-01-01", "2000-01-01", "99-01-01", "99-12-31") + times <- c("1520", "0010", "2301", "0000") + iso8601_dttm <- create_iso8601(dates, times, .format = c("y-m-d", "HM"), .check_format = FALSE) + expectation <- + c( + "1999-01-01T15:20", + "2000-01-01T00:10", + "1999-01-01T23:01", + "1999-12-31T00:00" + ) + expect_identical(iso8601_dttm, expectation) +}) diff --git 
a/tests/testthat/test-find_int_gap.R b/tests/testthat/test-find_int_gap.R new file mode 100644 index 0000000..27bb2a8 --- /dev/null +++ b/tests/testthat/test-find_int_gap.R @@ -0,0 +1,43 @@ +test_that("`find_int_gap()`: one interval", { + tbl <- find_int_gap(c(1L:3L, 7L:10L)) + + expect_identical(tbl$start, 4L) + expect_identical(tbl$end, 6L) +}) + +test_that("`find_int_gap()`: two intervals", { + tbl <- find_int_gap(c(1L:3L, 7L:10L, 15L:20L)) + + expect_identical(tbl$start, c(4L, 11L)) + expect_identical(tbl$end, c(6L, 14L)) +}) + +test_that("`find_int_gap()`: explicit endpoints", { + tbl <- find_int_gap(c(3L:5L, 8L), xmin = 0L, xmax = 10L) + + expect_identical(tbl$start, c(0L, 6L, 9L)) + expect_identical(tbl$end, c(2L, 7L, 10L)) +}) + +test_that("`find_int_gap()`: no intervals", { + tbl <- find_int_gap(0L:5L) + expect_identical(tbl, tibble::tibble(start = integer(), end = integer())) +}) + +test_that("`find_int_gap()`: ensure `x` is integerish", { + expect_error(find_int_gap(c(1.5, pi))) +}) + +test_that("`find_int_gap()`: ensure `xmin` and `xmax` are integer scalars", { + # Error because `xmin` and `xmax` are vectors + expect_error(find_int_gap(c(1L:3L, 7L:10L), xmin = 1L:2L)) + expect_error(find_int_gap(c(1L:3L, 7L:10L), xmax = 3L:4L)) + + # Error because `xmin` and `xmax` are double + expect_error(find_int_gap(c(1L:3L, 7L:10L), xmin = 1.5)) + expect_error(find_int_gap(c(1L:3L, 7L:10L), xmax = 1.5)) + + # Error because `xmin` and `xmax` are character + expect_error(find_int_gap(c(1L:3L, 7L:10L), xmin = "1")) + expect_error(find_int_gap(c(1L:3L, 7L:10L), xmax = "2")) +}) diff --git a/tests/testthat/test-format_iso8601.R b/tests/testthat/test-format_iso8601.R new file mode 100644 index 0000000..7753ea6 --- /dev/null +++ b/tests/testthat/test-format_iso8601.R @@ -0,0 +1,24 @@ +test_that("`format_iso8601()`: basic usage", { + cols <- c("year", "mon", "mday", "hour", "min", "sec") + m <- matrix( + c( + "99", "00", "01", + "Jan", "feb", "03", + "1", "01", "31", + 
"00", "12", "23", + "00", "59", "10", + "42", "5.15", NA + ), + ncol = 6L, + dimnames = list(c(), cols) + ) + + expect_identical( + format_iso8601(m), + c( + "1999-01-01T00:00:42", + "2000-02-01T12:59:05.15", + "2001-03-31T23:10" + ) + ) +}) diff --git a/tests/testthat/test-iso8601.R b/tests/testthat/test-iso8601.R new file mode 100644 index 0000000..b21737f --- /dev/null +++ b/tests/testthat/test-iso8601.R @@ -0,0 +1,44 @@ +test_that("`iso8601_na()`: basic usage", { + expect_identical(iso8601_na(c("10", "15")), c("10", "15")) + expect_identical(iso8601_na(c("10", NA_character_)), c("10", "-")) + expect_identical(iso8601_na(character()), character(0L)) +}) + +test_that("`iso8601_na()`: input can't be `NULL`", { + expect_error(iso8601_na(NULL)) + expect_error(iso8601_na(c())) +}) + +test_that("`zero_pad_whole_number()`: ensure `x` is integerish", { + expect_error(zero_pad_whole_number(pi)) + expect_error(zero_pad_whole_number("42")) + expect_error(zero_pad_whole_number(sqrt(2.0))) + expect_error(zero_pad_whole_number(TRUE)) + + expect_no_error(zero_pad_whole_number(1L)) + expect_no_error(zero_pad_whole_number(1.00)) + expect_no_error(zero_pad_whole_number(c(1L:3L))) +}) + +test_that("`zero_pad_whole_number()`: basic usage", { + expect_identical(zero_pad_whole_number(c(-1L, 0L, 1L)), c(NA, "00", "01")) + expect_identical( + zero_pad_whole_number(c(-1L, 0L, 1L, 10L, 99L, 100L), n = 2L), + c(NA, "00", "01", "10", "99", NA) + ) + expect_identical( + zero_pad_whole_number(c(-1L, 0L, 1L, 10L, 99L, 100L), n = 3L), + c(NA, "000", "001", "010", "099", "100") + ) +}) + +test_that("`zero_pad_whole_number()`: ensure `n` is scalar integer", { + expect_no_error(zero_pad_whole_number(1L, n = 1L)) + expect_error(zero_pad_whole_number(1L, n = 1L:2L)) +}) + +test_that("`iso8601_two_digits()`: basic usage", { + x <- c("0", "00", "1", "01", "42", "100", NA_character_, "1.") + y <- c("00", "00", "01", "01", "42", NA, NA, NA) + expect_identical(iso8601_two_digits(x), y) +}) diff --git 
a/tests/testthat/test-onload.R b/tests/testthat/test-onload.R deleted file mode 100644 index 0cc1245..0000000 --- a/tests/testthat/test-onload.R +++ /dev/null @@ -1,3 +0,0 @@ -test_that("multiplication works", { - expect_identical(2L * 2L, 4L) -}) diff --git a/tests/testthat/test-parse_dttm.R b/tests/testthat/test-parse_dttm.R new file mode 100644 index 0000000..8da7ca9 --- /dev/null +++ b/tests/testthat/test-parse_dttm.R @@ -0,0 +1,53 @@ +test_that("`months_abb_regex()`: default behavior (case insensitive)", { + x <- paste0( + "[Jj][Aa][Nn]|", + "[Ff][Ee][Bb]|", + "[Mm][Aa][Rr]|", + "[Aa][Pp][Rr]|", + "[Mm][Aa][Yy]|", + "[Jj][Uu][Nn]|", + "[Jj][Uu][Ll]|", + "[Aa][Uu][Gg]|", + "[Ss][Ee][Pp]|", + "[Oo][Cc][Tt]|", + "[Nn][Oo][Vv]|", + "[Dd][Ee][Cc]" + ) + expect_identical(months_abb_regex(), x) +}) + +test_that("`months_abb_regex()`: uppercase", { + x <- paste0( + "JAN|", + "FEB|", + "MAR|", + "APR|", + "MAY|", + "JUN|", + "JUL|", + "AUG|", + "SEP|", + "OCT|", + "NOV|", + "DEC" + ) + expect_identical(months_abb_regex(case = "upper"), x) +}) + +test_that("`months_abb_regex()`: lowercase", { + x <- paste0( + "jan|", + "feb|", + "mar|", + "apr|", + "may|", + "jun|", + "jul|", + "aug|", + "sep|", + "oct|", + "nov|", + "dec" + ) + expect_identical(months_abb_regex(case = "lower"), x) +}) diff --git a/tests/testthat/test-parse_dttm_fmt.R b/tests/testthat/test-parse_dttm_fmt.R new file mode 100644 index 0000000..1108c56 --- /dev/null +++ b/tests/testthat/test-parse_dttm_fmt.R @@ -0,0 +1,131 @@ +test_that("`parse_dttm_fmt_`: empty fmt", { + x <- + tibble::tibble( + pat = character(), + cap = character(), + start = integer(), + end = integer(), + len = integer() + ) + expect_identical(x, parse_dttm_fmt_("", pattern = "y")) + expect_error(parse_dttm_fmt_(character(), pattern = "y")) +}) + +test_that("`parse_dttm_fmt_`: empty pattern", { + expect_error(parse_dttm_fmt_("ymd", pattern = "")) + expect_error(parse_dttm_fmt_("ymd", pattern = character())) +}) + 
+test_that("`parse_dttm_fmt_`: basic usage", { + fmt1 <- "y m d" + fmt2 <- "y-m-d" + + x1 <- + tibble::tibble( + pat = "y", + cap = "y", + start = 1L, + end = 1L, + len = 1L + ) + expect_identical(x1, parse_dttm_fmt_(fmt1, pattern = "y")) + expect_identical(x1, parse_dttm_fmt_(fmt2, pattern = "y")) + + x2 <- + tibble::tibble( + pat = "m", + cap = "m", + start = 3L, + end = 3L, + len = 1L + ) + expect_identical(x2, parse_dttm_fmt_(fmt1, pattern = "m")) + expect_identical(x2, parse_dttm_fmt_(fmt2, pattern = "m")) + + x3 <- + tibble::tibble( + pat = "d", + cap = "d", + start = 5L, + end = 5L, + len = 1L + ) + + expect_identical(x3, parse_dttm_fmt_(fmt1, pattern = "d")) + expect_identical(x3, parse_dttm_fmt_(fmt2, pattern = "d")) +}) + +test_that("`parse_dttm_fmt_`: pattern variations", { + fmt <- "HH:MM:SS" + + x1 <- + tibble::tibble( + pat = "H", + cap = "H", + start = 1L, + end = 1L, + len = 1L + ) + + x2 <- + tibble::tibble( + pat = "HH", + cap = "HH", + start = 1L, + end = 2L, + len = 2L + ) + + x3 <- + tibble::tibble( + pat = "H+", + cap = "HH", + start = 1L, + end = 2L, + len = 2L + ) + + expect_identical(x1, parse_dttm_fmt_(fmt, pattern = "H")) + expect_identical(x2, parse_dttm_fmt_(fmt, pattern = "HH")) + expect_identical(x3, parse_dttm_fmt_(fmt, pattern = "H+")) +}) + +test_that("`parse_dttm_fmt_`: only the first match is returned", { + fmt <- "H M S H" + + x1 <- + tibble::tibble( + pat = "H", + cap = "H", + start = 1L, + end = 1L, + len = 1L + ) + + x2 <- + tibble::tibble( + pat = character(), + cap = character(), + start = integer(), + end = integer(), + len = integer() + ) + + x3 <- + tibble::tibble( + pat = "H+", + cap = "H", + start = 1L, + end = 1L, + len = 1L + ) + + expect_identical(x1, parse_dttm_fmt_(fmt, pattern = "H")) + expect_identical(x2, parse_dttm_fmt_(fmt, pattern = "HH")) + expect_identical(x3, parse_dttm_fmt_(fmt, pattern = "H+")) +}) + +test_that("`parse_dttm_fmt`: empty fmt", { + expect_identical(fmt_dttmc(), parse_dttm_fmt("", pattern = 
"y")) + expect_error(parse_dttm_fmt_(character(), pattern = "y")) +}) diff --git a/tests/testthat/test-pseq.R b/tests/testthat/test-pseq.R new file mode 100644 index 0000000..abdbde3 --- /dev/null +++ b/tests/testthat/test-pseq.R @@ -0,0 +1,7 @@ +test_that("`pseq()`: scalar inputs", { + expect_identical(pseq(from = 0L, to = 5L), 0L:5L) +}) + +test_that("`pseq()`: vector inputs", { + expect_identical(pseq(from = c(0L, 10L), to = c(5L, 15L)), c(0L:5L, 10L:15L)) +}) diff --git a/tests/testthat/test-reg_matches.R b/tests/testthat/test-reg_matches.R new file mode 100644 index 0000000..8f3d5cb --- /dev/null +++ b/tests/testthat/test-reg_matches.R @@ -0,0 +1,8 @@ +test_that("`reg_matches()`: basic usage", { + x <- c("sdtm.oak", "sdtm.cdisc", "adam") + m <- gregexpr("sdtm", x, fixed = TRUE) + + # `regmatches()` returns `character(0)` for `"adam"` + # But `reg_matches()` returns `NA` for `"adam"` + expect_identical(reg_matches(x, m), list("sdtm", "sdtm", NA_character_)) +}) diff --git a/tests/testthat/test-str_to_anycase.R b/tests/testthat/test-str_to_anycase.R new file mode 100644 index 0000000..b6e372e --- /dev/null +++ b/tests/testthat/test-str_to_anycase.R @@ -0,0 +1,5 @@ +test_that("`str_to_anycase()`: basic usage", { + x <- c("JAN", "feb", "mAr") + y <- c("[Jj][Aa][Nn]", "[Ff][Ee][Bb]", "[Mm][Aa][Rr]") + expect_identical(str_to_anycase(x), y) +}) diff --git a/tests/testthat/test-yy_to_yyyy.R b/tests/testthat/test-yy_to_yyyy.R new file mode 100644 index 0000000..a909b98 --- /dev/null +++ b/tests/testthat/test-yy_to_yyyy.R @@ -0,0 +1,28 @@ +test_that("`yy_to_yyyy()`: basic usage", { + # Default cutoff is at `68`. + x1 <- c(0L, 1L, 50L, 68L, 69L, 70L) + y1 <- c(2000L, 2001L, 2050L, 2068L, 1969L, 1970L) + expect_identical(yy_to_yyyy(x1), y1) + + # Different cutoff, e.g. `79`. 
+ x2 <- 75L:85L + y2 <- + c( + 2075L, + 2076L, + 2077L, + 2078L, + 2079L, + 1980L, + 1981L, + 1982L, + 1983L, + 1984L, + 1985L + ) + expect_identical(yy_to_yyyy(x2, cutoff_2000 = 79L), y2) + + # Four-digit years remain unaltered. + x3 <- 1965L:1975L + expect_identical(yy_to_yyyy(x3), x3) +}) diff --git a/vignettes/.gitignore b/vignettes/.gitignore new file mode 100644 index 0000000..097b241 --- /dev/null +++ b/vignettes/.gitignore @@ -0,0 +1,2 @@ +*.html +*.R diff --git a/vignettes/articles/iso_8601.Rmd b/vignettes/articles/iso_8601.Rmd new file mode 100644 index 0000000..222e5d0 --- /dev/null +++ b/vignettes/articles/iso_8601.Rmd @@ -0,0 +1,254 @@ +--- +title: "Converting dates, times or date-times to ISO 8601" +--- + +```{r, include = FALSE} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>" +) +library(sdtm.oak) +``` + +An SDTM DTC variable may include data that is represented in [ISO +8601](https://en.wikipedia.org/wiki/ISO_8601) format as a complete date/time, a +partial date/time, or an incomplete date/time. `{sdtm.oak}` provides the +`create_iso8601()` function that allows flexible mapping of date and time +values in various formats to a single date-time ISO 8601 format. + +## Introduction + +To perform conversion to the ISO 8601 format you need to pass two key arguments: + +- At least one vector of dates, times, or date-times of `character` type; +- A date/time format via the `.format` parameter that instructs `create_iso8601()` on which date/time components to expect. 
+ +```{r} +create_iso8601("2000 01 05", .format = "y m d") +create_iso8601("22:35:05", .format = "H:M:S") +``` + +By default the `.format` parameter understands a few reserved characters: + +- `"y"` for year +- `"m"` for month +- `"d"` for day +- `"H"` for hours +- `"M"` for minutes +- `"S"` for seconds + +Besides character vectors of dates and times, you may also pass a single vector +of date-times, provided you adjust the format: + +```{r} +create_iso8601("2000-01-05 22:35:05", .format = "y-m-d H:M:S") +``` + +## Multiple inputs + +If you have dates and times in separate vectors then you will need to pass +a format for each vector: + +```{r} +create_iso8601("2000-01-05", "22:35:05", .format = c("y-m-d", "H:M:S")) +``` + +In addition, like most R functions that take vectors as input, +`create_iso8601()` is vectorized: + +```{r} +date <- c("2000-01-05", "2001-12-25", "1980-06-18", "1979-09-07") +time <- c("00:12:21", "22:35:05", "03:00:15", "07:09:00") +create_iso8601(date, time, .format = c("y-m-d", "H:M:S")) +``` + +But the number of elements in each of the inputs has to match or you will get an +error: + +```{r} +date <- c("2000-01-05", "2001-12-25", "1980-06-18", "1979-09-07") +time <- "00:12:21" +try(create_iso8601(date, time, .format = c("y-m-d", "H:M:S"))) +``` + +You can combine individual date and time components coming +in as separate inputs; here is a contrived example of year, month and day +together, hour, and minute: + +```{r} +year <- c("99", "84", "00", "80", "79", "1944", "1953") +month_and_day <- c("jan 1", "apr 04", "mar 06", "jun 18", "sep 07", "sep 13", "sep 14") +hour <- c("12", "13", "05", "23", "16", "16", "19") +min <- c("0", "60", "59", "42", "44", "10", "13") +create_iso8601(year, month_and_day, hour, min, .format = c("y", "m d", "H", "M")) +``` + +The `.format` argument must always be named; otherwise, it will be treated as if +it were one of the inputs and interpreted as missing. 
+ +```{r} +try(create_iso8601("2000-01-05", "y-m-d")) +``` + + +## Format variations + +The `.format` parameter can easily accommodate variations in the format of the +inputs: + +```{r} +create_iso8601("2000-01-05", .format = "y-m-d") +create_iso8601("2000 01 05", .format = "y m d") +create_iso8601("2000/01/05", .format = "y/m/d") +``` + +Individual components may come in a different order, so adjust the format +accordingly: + +```{r} +create_iso8601("2000 01 05", .format = "y m d") +create_iso8601("05 01 2000", .format = "d m y") +create_iso8601("01 05, 2000", .format = "m d, y") +``` + +All other individual characters given in the format are taken strictly, e.g. +the number of spaces matters: + +```{r} +date <- c("2000 01 05", "2000 01 05", "2000 01 05", "2000 01 05") +create_iso8601(date, .format = "y m d") +create_iso8601(date, .format = "y m d") +create_iso8601(date, .format = "y m d") +create_iso8601(date, .format = "y m d") +``` + +The format can include regular expressions though: + +```{r} +create_iso8601(date, .format = "y\\s+m\\s+d") +``` + +By default, a streak of the reserved characters is treated as if only one was +provided, so these formats are equivalent: + +```{r} +date <- c("2000-01-05", "2001-12-25", "1980-06-18", "1979-09-07") +time <- c("00:12:21", "22:35:05", "03:00:15", "07:09:00") +create_iso8601(date, time, .format = c("y-m-d", "H:M:S")) +create_iso8601(date, time, .format = c("yyyy-mm-dd", "HH:MM:SS")) +create_iso8601(date, time, .format = c("yyyyyyyy-m-dddddd", "H:MMMMM:SSSS")) +``` + +## Multiple alternative formats + +When an input vector contains values with varying formats, a single format may +not be adequate to encompass all variations. In such situations, it's advisable +to list multiple alternative formats. This approach ensures that each format is +tried sequentially until one matches the data in the vector. 
+ +```{r} +date <- c("2000/01/01", "2000-01-02", "2000 01 03", "2000/01/04") +create_iso8601(date, .format = "y-m-d") +create_iso8601(date, .format = "y m d") +create_iso8601(date, .format = "y/m/d") +create_iso8601(date, .format = list(c("y-m-d", "y m d", "y/m/d"))) +``` + +Consider the order in which you supply the formats, as it can be significant. If +multiple formats could potentially match, the sequence determines which format +is applied first. + +```{r} +create_iso8601("07 04 2000", .format = list(c("d m y", "m d y"))) +create_iso8601("07 04 2000", .format = list(c("m d y", "d m y"))) +``` + +Note that if you are passing alternative formats, then the `.format` argument +must be a list whose length matches the number of inputs. + +## Parsing of date or time components + +By default, date or time components are parsed as follows: + +- year: parsed from either a two- or four-digit year; +- month: parsed either from a two-digit numeric month or from an English abbreviated month name (e.g. Jan, Jun or Dec) regardless of case; +- month day: parsed from single or two-digit numbers; +- hour and minute: parsed from single or two-digit numbers; +- second: parsed from single or two-digit numbers with an optional fractional part. + +```{r} +# Years: two-digit or four-digit numbers. +years <- c("0", "1", "00", "01", "15", "30", "50", "68", "69", "80", "99") +create_iso8601(years, .format = "y") + +# Adjust the point where two-digit years are mapped to 2000's or 1900's. +create_iso8601(years, .format = "y", .cutoff_2000 = 20L) + +# Both numeric months (two-digit only) and abbreviated months work out of the box +months <- c("0", "00", "1", "01", "Jan", "jan") +create_iso8601(months, .format = "m") + +# Month days: single or two-digit numbers, anything else results in NA. 
+create_iso8601(c("1", "01", "001", "10", "20", "31"), .format = "d") + +# Hours +create_iso8601(c("1", "01", "001", "10", "20", "31"), .format = "H") + +# Minutes +create_iso8601(c("1", "01", "001", "10", "20", "60"), .format = "M") + +# Seconds +create_iso8601(c("1", "01", "23.04", "001", "10", "20", "60"), .format = "S") +``` + +## Allowing alternative date or time values + +If date or time component values include special values, e.g. values +encoding missing values, then you can indicate those values as possible +alternatives so that the parsing tolerates them; use the `.na` argument: + +```{r} +create_iso8601("U DEC 2019 14:00", .format = "d m y H:M") +create_iso8601("U DEC 2019 14:00", .format = "d m y H:M", .na = "U") + +create_iso8601("U UNK 2019 14:00", .format = "d m y H:M") +create_iso8601("U UNK 2019 14:00", .format = "d m y H:M", .na = c("U", "UNK")) +``` + +In this case you could achieve the same result using regexps: + +```{r} +create_iso8601("U UNK 2019 14:00", .format = "(d|U) (m|UNK) y H:M") +``` + + +## Changing reserved format characters + +There may be cases where the reserved characters --- `"y"`, `"m"`, `"d"`, +`"H"`, `"M"`, `"S"` --- get in the way of specifying an adequate format. +For example, you might be tempted to use format `"HHMM"` to try to parse a time +such as `"14H00M"`. You might expect the first "H" to code for parsing the +hour, and the second "H" to be a literal "H" but, actually, `"HH"` will be taken +to mean parsing hours, and `"MM"` to mean parsing minutes. You can use the function +`fmt_cmp()` to specify alternative format regexps for the format, replacing the +default characters. 
+ +In the next example, we assign new format strings to the hour and minute +components, thus freeing the `"H"` and `"M"` patterns from being interpreted as +hours and minutes, so that they are taken literally: + +```{r} +create_iso8601("14H00M", .format = "HHMM") +create_iso8601("14H00M", .format = "xHwM", .fmt_c = fmt_cmp(hour = "x", min = "w")) +``` +Note that you need to make sure that the format component regexps are mutually +exclusive, i.e. they don't have overlapping matches; otherwise +`create_iso8601()` will fail with an error. In the next example both months and +minutes could be represented by an `"m"` in the format, resulting in an ambiguous +format specification. + +```{r} +fmt_cmp(hour = "h", min = "m") +try(create_iso8601("14H00M", .format = "hHmM", .fmt_c = fmt_cmp(hour = "h", min = "m"))) +``` +
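A minimal sketch of the two-digit year rule that the `yy_to_yyyy()` tests and the `.cutoff_2000` argument in this patch rely on: values from 0 up to the cutoff map to the 2000s, values above the cutoff up to 99 map to the 1900s, and anything else is returned unchanged. The name `yy_to_yyyy_sketch()` is hypothetical, and the body is an assumption inferred from the test expectations, not the package's implementation.

```r
# Hypothetical illustration only; inferred from the test expectations in
# tests/testthat/test-yy_to_yyyy.R, not copied from the package source.
yy_to_yyyy_sketch <- function(x, cutoff_2000 = 68L) {
  stopifnot(is.integer(x), is.integer(cutoff_2000), length(cutoff_2000) == 1L)

  # Only values in 0..99 are treated as two-digit years.
  is_yy <- !is.na(x) & x >= 0L & x <= 99L

  # Two-digit years at or below the cutoff belong to the 2000s,
  # the remaining two-digit years to the 1900s.
  century <- rep(1900L, length(x))
  century[is_yy & x <= cutoff_2000] <- 2000L

  # Four-digit years (and NA) pass through unchanged.
  out <- x
  out[is_yy] <- x[is_yy] + century[is_yy]
  out
}

yy_to_yyyy_sketch(c(0L, 68L, 69L, 1975L))
```

Keeping every intermediate value an integer vector means the result can be compared with `identical()` against integer expectations, which is how the tests in this patch check the real function.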