Merge pull request nationalparkservice#45 from RobLBaker/main
updates for taxize/bold dependency chain
RobLBaker authored Jan 15, 2025
2 parents 884e6f9 + b068929 commit 2f34055
Showing 4 changed files with 8 additions and 15 deletions.
8 changes: 0 additions & 8 deletions .github/workflows/R-CMD-check.yaml
@@ -1,12 +1,9 @@
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
-# fix linux build fails:
-# https://forum.posit.co/t/libraptor2-dev-depends-libcurl4-gnutls-dev-but-it-is-not-installable-in-r-lib-actions-setup-r-dependencies-v2/181572/4
on:
push:
branches: [main, master]
pull_request:
branches: [main, master]

name: R-CMD-check.yaml

@@ -31,7 +28,6 @@ jobs:
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
R_KEEP_PKG_SOURCE: yes
-PKG_SYSREQS: false

steps:
- uses: actions/checkout@v4
@@ -49,10 +45,6 @@ jobs:
extra-packages: any::rcmdcheck
needs: check

-- name: dependencies on Linux
-if: runner.os == 'Linux'
-run: sudo apt-get install -y make pandoc git libjq-dev libssl-dev libgdal-dev gdal-bin libgeos-dev libproj-dev libsqlite3-dev libicu-dev libudunits2-dev librdf0-dev libxml2-dev libfreetype6-dev libjpeg-dev libpng-dev libtiff-dev libfontconfig1-dev libfribidi-dev libharfbuzz-dev libcurl4-gnutls-dev

- uses: r-lib/actions/check-r-package@v2
with:
upload-snapshots: true
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -22,7 +22,7 @@ Imports:
crayon,
DPchecker (>= 0.0.0.9000),
EML,
-EMLassemblyline (>= 3.5.4),
+EMLassemblyline,
EMLeditor (>= 0.0.1.0),
NPSutils (>= 0.1.0),
QCkit (>= 0.1.0),
1 change: 1 addition & 0 deletions NPSdataverse.Rproj
@@ -1,4 +1,5 @@
Version: 1.0
+ProjectId: a49a17c2-d2d1-45c6-a1ff-3bcdbf2ac52e

RestoreWorkspace: No
SaveWorkspace: No
12 changes: 6 additions & 6 deletions paper.md
@@ -17,10 +17,10 @@ tags:
- data access
date: "21 October 2024"
output:
word_document: default
pdf_document: default
html_document:
df_print: paged
pdf_document: default
word_document: default
authors:
- name: Robert L. Baker
orcid: "0000-0001-7591-5035"
@@ -134,7 +134,7 @@ new_metadata2 <- set_publisher(eml_object = new_metadata1,
for_or_by_NPS = FALSE,
NPS = FALSE)
```
-By default, `EMLeditor` functions provide verbose user feedback and may require user input to confirm some operations. These checks are intended to help guide users, prevent inadvertent mistakes, and limit unnecessary API calls. However, requiring user input can hamper highly scripted approaches and limits reproducability. Therefore, all `EMLeditor` functions can be set to circumvent these requirements using the parameter `force = FALSE`.
+By default, `EMLeditor` functions provide verbose user feedback and may require user input to confirm some operations. These checks are intended to help guide users, prevent inadvertent mistakes, and limit unnecessary API calls. However, requiring user input can hamper highly scripted approaches and limits reproducibility. Therefore, all `EMLeditor` functions can be set to circumvent these requirements using the parameter `force = TRUE`.

```
#example setting the abstract while suppressing user feedback and input:
@@ -152,13 +152,13 @@ The [DPchecker](https://nationalparkservice.github.io/DPchecker/) ("Data Package
2. Metadata elements necessary for DataStore automated extraction are present (creators have valid surnames, publication date is present and in the correct ISO-8601 format, keywords are present, abstract and methods are present and well formatted, etc).
3. Recommended EML elements are present including ORCiDs and a notes section.
4. Metadata and data are in congruence including all files listed in metadata refer to data files, the columns in the metadata match the columns in the data files, missing fields in data files are properly documented in metadata, and dates in data files fall within the date ranges given in the metadata, etc.
-5. Data and metadata are in compliance with (a subset of) federal regulations including tests for information that should not be released to the public such as non-.gov emails and GPS coordinates for restricted data packages.
+5. Data and metadata are in compliance with (a subset of) federal regulations including tests for information that should not be released to the public such as non-.gov emails.

For each test, the data package may fail with an error, fail with a warning, or pass. When possible, warnings and error messages indicate the appropriate `EMLeditor` function to address the problem. `DPchecker` will often throw a warning even if an EML element exists and is properly formatted but could by improved to increase the FAIR characteristics of the metadata. For instance, `DPchecker` will throw a warning if an abstract is less than 20 words long as it is unlikely the creator is able to meaningfully describe the data collection and processing in less than 20 words.

# NPSutils R Package

-The `[NPSutils](https://nationalparkservice.github.io/NPSutils/)` ("NPS utilities") package serves primarily as a way to access data [@Baker_NPSutils2024]. `NPSutils` provides avenues for directly downloading data from DataStore using R. `NPSutils` can also import data downloaded from any repository into R and take advantage of rich EML metadata to call column types. `NPSutils` provides some basic meta-analysis capability, assuming certain interoperabilty standards are met (such as consistently naming columns with Darwin Core parameters or other domain-accepted parameter names). `NPSutils` can also be used to import data and metadata into common data visualization tools.
+The `[NPSutils](https://nationalparkservice.github.io/NPSutils/)` ("NPS utilities") package serves primarily as a way to access data [@Baker_NPSutils2024]. `NPSutils` provides avenues for directly downloading data from DataStore using R. `NPSutils` can also import data downloaded from any repository into R and take advantage of rich EML metadata to call column types. `NPSutils` provides some basic meta-analysis capability. `NPSutils` can also be used to import data and metadata into common data visualization tools.

Example of how to download and access data:
```
@@ -167,7 +167,7 @@ Example of how to download and access data:
NPSutils::get_data_package(2300498)
-# load the data package into R:
+# load the data package into R, and use the metadata to call column types
# returns a list of tibbles; each tibble corresponds to a single data file
mojn <- NPSutils::load_data_package(2300498, assign_attributes = TRUE)