diff --git a/.Rbuildignore b/.Rbuildignore
index 5f3d199..086ab97 100644
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -3,3 +3,4 @@
 ^LICENSE\.md$
 ^README\.Rmd$
 ^\.github$
+^vignettes/articles$
diff --git a/DESCRIPTION b/DESCRIPTION
index 53166c0..a6c8522 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,7 +1,7 @@
 Package: QCkit
 Type: Package
 Title: NPS Inventory and Monitoring Quality Control Toolkit
-Version: 0.1.4
+Version: 0.1.5
 Authors@R: c(
     person(given = "Robert",
            family = "Baker",
diff --git a/NEWS.md b/NEWS.md
index 43f027b..97d78ac 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,7 @@
+# QCkit v0.1.5
+2024-02-09
+* This version adds the DRR template, example files, and associated documentation to the QCkit package.
+
 # QCkit v0.1.4
 2024-01-23
 * Maintenance on `get_custom_flag()` to align with updated DRR requirements
diff --git a/docs/404.html b/docs/404.html
index 5be2d8c..e96494a 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -32,17 +32,32 @@
       QCkit
-      0.1.4
+      0.1.5
vignettes/articles/DRR_Purpose_and_Scope.Rmd
The Data Release Report (DRR) is aimed at fulfilling requirements and expectations of Open Science at the National Park Service. This includes:

- Broad adoption of open-data and open-by-default practices.
- A move in the scientific disciplines toward considering and publishing data sets as independently-citable scientific works.
- Routine assignment of digital object identifiers (DOIs) to datasets to facilitate location, reuse, and citation of specific data.
- Increased transparency and reproducibility in the processing and analysis of data.
- Establishment of peer-reviewed "data journals" dedicated to publishing data sets and associated documentation designed to facilitate their reuse.
- The expectation that science-based decisions are based on peer-reviewed, reproducible, and open science by default.
Data Release Reports are designed to parallel external peer-reviewed scientific journals dedicated to facilitating the reuse of reproducible scientific data, in recognition that the primary reason IMD data are collected is to support science-based decisions.

Note that publication in a Data Release Report Series (not mandated) is distinct from requirements to document data collection, processing, and quality evaluation (mandated; see below). The establishment of a Data Release Report Series is intended to facilitate and encourage this type of reporting in a standard format, and in a manner commensurate with current scientific norms.
**Reproducibility.** The degree to which scientific information, modeling, and methods of analysis could be evaluated by an independent third party to arrive at the same, or substantially similar, conclusion as the original study or information, and that the scientific assessment can be repeated to obtain similar results (Plesser 2017). A study is reproducible if you can take the original data and the computer code used to analyze the data and reproduce all of the numerical findings from the study. This may initially sound like a trivial task, but experience has shown that it is not always easy to achieve this seemingly minimal standard (ASA 2017, Plesser 2017).

**Transparency.** Full disclosure of the methods used to obtain, process, analyze, and review scientific data and other information products, the availability of the data that went into and came out of the analysis, and the computer code used to conduct the analysis. Documenting this information is crucial to ensure reproducibility and requires, at minimum, the sharing of analytical data sets, relevant metadata, analytical code, and related software.

**Fitness for Use.** The utility of scientific information (in this case a dataset) for its intended users and its intended purposes. Agencies must review and communicate the fitness of a dataset for its intended purpose, and should provide the public sufficient documentation about each dataset to allow data users to determine the fitness of the data for the purpose for which third parties may consider using it.

**Decisions.** The type of decisions that must be based on publicly-available, reproducible, and peer-reviewed science has not been defined. At a minimum it includes any influential decisions, but it may also include any decisions subject to public review and comment.

**Descriptive Reporting.** The policies listed above are consistent in the requirement to provide documentation that describes the methods used to collect, process, and evaluate science products, including data. Note that this is distinct from (and in practice may significantly differ from) prescriptive documents such as protocols, procedures, and study plans. Descriptive reporting should cite or describe relevant science planning documents, methods used, deviations, and mitigations. In total, descriptive reporting provides a clear "line of sight" on precisely how data were collected, processed, and evaluated. Although deviations may warrant revisions to prescriptive documents, changes in prescriptive documents after the fact do not meet reproducibility and transparency requirements.
DO 11B-a, DO 11B-b, OMB M-05-03 (Peer review and information quality):

- Scientific information must be appropriately reviewed prior to use in decision-making, regulatory processes, or dissemination to the public, regardless of media.
- As per OMB M-05-03, "scientific information" includes factual inputs, data, models, analyses, technical information, or scientific assessments related to such disciplines as the behavioral and social sciences, public health and medical sciences, life and earth sciences, engineering, or physical sciences.
- Methods for producing information will be made transparent, to the maximum extent practicable, through accurate documentation, use of appropriate review, and verification of information quality.
OMB M-19-15 (Updates to Implementing the Information Quality Act):

- Federal agencies must collect, use, and disseminate information that is fit for its intended purpose.
- Agencies must conduct pre-dissemination review of the quality [of scientific information] based on the likely use of that information. Quality encompasses utility, integrity, and objectivity, defined as follows: a) Utility: utility for its intended users and its intended purposes; b) Integrity: refers to security; c) Objectivity: accurate, reliable, and unbiased as a matter of presentation and substance.
- Agencies should provide the public with sufficient documentation about each dataset released to allow data users to determine the fitness of the data for the purpose for which third parties may consider using it. Potential users must be provided with sufficient information to understand… the data's strengths, weaknesses, analytical limitations, security requirements, and processing options.
- Reproducibility requirements for Influential Information (note that because this may not be determined at the time of collection, processing, or dissemination, this should be the default for NPS scientific activities):
    - Analyses must be disseminated with sufficient descriptions of data and methods to allow them to be reproduced by qualified third parties who may want to test the sensitivity of agency analyses. This is a higher standard than simply documenting the characteristics of the underlying data, which is required for all information.
    - Computer code used to process data should be made available to the public for further analysis. In the context of results generated by, for example, a statistical model or machine-augmented learning and decision support, reproducibility requires, at a minimum, transparency about the specific methods, design parameters, equations or algorithms, and assumptions used.
    - Reports, data, and computer code used, developed, or cited in the analysis and reporting of findings must be made publicly available except where prohibited by law.
Multiple policy and guidance documents require the use of best available science in decision-making at the National Park Service (NPS). Additional requirements include:
SO 3369 (Promoting Open Science):

DO 11B (Ensuring Objectivity, Utility, and Integrity of Information Used and Disseminated by the National Park Service):
NPS-75 (Inventory and Monitoring Guidelines):

- An annual summary report documenting the condition of park resources should be developed as part of the annual revision of the park's Resource Management Plan.
- An annual report provides a mechanism for reviewing and making recommendations for revisions in the [Protocol/SOPs].
- [Inventory] data obtained should be archived in park records and, when appropriate, a report should be written summarizing findings.

Reporting requirements as per IMD directive:

IMD Reporting and Analysis Guidance
Because all of the data the NPS IMD collects is intended for use in supporting science-based decisions as per our program's five goals, and is intended for use in planning (the decisions of which are subject to public comment as per NEPA requirements), by default:

- All analytical work we do should be reproducible to the extent possible. Analytical work includes both statistical analysis and reporting of data as well as quality control procedures where data are tested against quality standards and qualified or corrected as appropriate.
- Full reproducibility may not be possible in all cases, particularly where analytical methods involve subject matter expertise to make informed judgments on how to proceed with analyses. In such cases, decisions should be documented to ensure transparency.
- All IMD data should be published with supporting documentation to allow for reproduction of results.
- All IMD data should be evaluated to determine whether they are suitable for their intended use.
- All IMD data should be published with information fully describing how data were collected, processed, and evaluated.
- All data should be published in open formats that support the FAIR principles (Findable, Accessible, Interoperable, and Reusable).
(for the NPS Inventory & Monitoring Program)

Any project that involves the collection of scientific data for use in supporting decisions to be made by NPS personnel. General study data may or may not be collected based on documented or peer-reviewed study plans or defined quality standards, but are in most cases purpose-driven, and the resultant information should be evaluated for its suitability for, and prior to, use in decision support. These data may be reused for secondary purposes, including similar decisions at other locations or times, and portions of general study data may be reused in or contribute to other scientific work (observations from a deer browsing study may contribute to an inventory or may be used as ancillary data to explain monitoring observations).

Vital signs monitoring data are collected by IMD and park staff to address specific monitoring objectives following methods designed to ensure long-term comparability of data. Procedures are established to ensure that data quality standards are maintained in perpetuity. However, because monitoring data are collected over long periods of time in dynamic systems, the methods employed may differ from those prescribed in monitoring protocols, procedures, or sampling plans, and any deviations (and resultant mitigations to the data) must be documented. Data should be evaluated to ensure that they meet prescribed standards and are suitable for analyses designed to test whether monitoring objectives have been met. Monitoring data may be reused for secondary purposes, including synthesis reports and condition assessments, and portions of monitoring data may contribute to inventories.

Inventory study data are similar to general study data in that they are time- and area-specific efforts designed to answer specific management needs as well as broader inventory objectives outlined in project-specific study plans and inventory science plans. Inventory studies typically follow well-documented data collection methods or procedures, and resultant data should be evaluated for whether they are suitable for use in supporting study-specific and broader inventory-level objectives. Inventory study data are expected to be reused to meet broader inventory-level goals, but may also support other scientific work and decision support.
American Statistical Association (ASA). 2017. Recommendations to funding agencies for supporting reproducible research. https://www.amstat.org/asa/files/pdfs/POL-ReproducibleResearchRecommendations.pdf

Plesser, H. E. 2017. Reproducibility vs. Replicability: A brief history of a confused terminology. Front. Neuroinform. 11:76. https://doi.org/10.3389/fninf.2017.00076
Purpose and Scope of Data Release Reports

This template is for use when drafting Data Release Reports. DRRs are created by the National Park Service and provide detailed descriptions of valuable research datasets, including the methods used to collect the data and technical analyses supporting the quality of the measurements. Data Release Reports focus on helping others reuse data rather than presenting results, testing hypotheses, or presenting new interpretations and in-depth analyses.

This template contains an RMarkdown template file, a default folder structure for project files, and all the template files necessary to generate an unformatted .docx file. Upon submission for publication, the .docx file will be ingested by EXstyles, converted to an .xml file, and fully formatted according to NPS branding upon final publication. The goal of this process is to relieve data producers, managers, and scientists of the burden of formatting and allow them to focus primarily on content. Consequently, the .docx generated for the publication process may not be visually appealing. The content, however, should focus on the production, quality, and utility of NPS data packages.
To start your DRR you will need all of your data in flat .csv files. All quality assurance, quality control, and quality flagging should be completed. Ideally you have already created, or are in the process of creating, a data package. All of the .csv files you want to describe in the DRR should be in a single folder with no additional .csv files (other files such as .txt and .xml will be ignored). This folder can be the same folder you used or are using to create a data package.
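As a quick sanity check before you begin, you can list the .csv files in the project folder from R. This sketch assumes your working directory is already set to that folder:

```r
# List only the .csv files in the current folder; other file types
# (.txt, .xml, etc.) are not matched by this pattern.
data_files <- list.files(path = ".", pattern = "\\.csv$")
data_files
```

If any file shows up here that you do not want described in the DRR, move it out of the folder before continuing.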
Using RStudio, open an R project (Select: File > New Project…) in the same folder as your .csv files. If you already have an R project (.Rproj) initiated from creating a data package, you can use that same R project.

Install, update (if necessary), and load the QCkit R package. QCkit can be installed either as a component of the NPSdataverse or on its own. The benefit of installing the entire NPSdataverse is that upon loading it you will automatically be informed of any updates to QCkit (or any of the constituent packages). The downside to installing and loading the NPSdataverse is that the first installation can be lengthy (there are many dependencies) and you may hit the GitHub.com API rate limit. Either installation is from GitHub.com and requires the devtools package.
```r
# Install the devtools package, if you don't already have it:
install.packages("devtools")

# Install and load QCkit via NPSdataverse:
devtools::install_github("nationalparkservice/NPSdataverse")
library(NPSdataverse)

# Alternatively, install and load just QCkit:
devtools::install_github("nationalparkservice/QCkit")
library(QCkit)
```
After selecting "OK" two things will happen: First, the DRR Template file will open; it is called "Untitled.Rmd" by default. Second, a new folder will be created called "Untitled" (unless you changed the default "Name:" in the "New R Markdown" pop-up, in which case both will have whatever name you gave them).
Edit the DRR Template according to your specifications and the instructions in the "Using the DRR Template" guide.

When you are done, "knit" the .Rmd file to Word and submit the resulting .docx file for publication.
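Knitting can be done with the RStudio "Knit" button or, equivalently, from the R console; adjust the file path below to match the name you gave the template:

```r
# Equivalent to pressing "Knit" in RStudio; the output format
# (Word/.docx) is taken from the template's YAML header.
rmarkdown::render("Untitled.Rmd")
```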
The bibtex reference file is used if you want to automate your citations. Add each citation in bibtex format to this file and save it. Add in-text citations to the DRR Template and your References section will be generated automatically when you knit the .Rmd to .docx. You should still visually check the final format in the .docx file for accuracy, completeness, and formatting. If you would prefer to format your citations manually, feel free to continue doing so.
DRR_Report: an example of the .docx output file that takes into account your edits and changes to DRR_to_docx.Rmd (assuming you have saved and/or knit the .Rmd to .docx format in RStudio).

Knit your own example DRR: Assuming you left the "Name:" as the default "untitled", you should be able to knit the DRR template into an example .docx that could be submitted for publication. If you opted to change the name, you will need to update the file paths before knitting.
+vignettes/articles/Using-the-DRR-Template.Rmd
Data Release Reports (DRRs) are created by the National Park Service and provide detailed descriptions of valuable research datasets, including the methods used to collect the data and technical analyses supporting the quality of the measurements. Data Release Reports focus on helping others reuse data, rather than presenting results, testing hypotheses, or presenting new interpretations, methods, or in-depth analyses.

DRRs are intended to document the processing of fully quality-assured data to their final (quality-controlled) form in a reproducible and transparent manner. DRRs document the data collection methods and quality standards used to prepare and review data prior to release. DRRs present the quality of resultant data in the context of fitness for their intended use.

Each DRR cites source and resultant data packages that are published concurrently and cross-referenced. Associated data packages are made publicly available, with the exception of data that must be protected from release as per NPS and park-specific policies.

Data packages that are published concurrently with DRRs are intended to be independently citable scientific works that can serve as the basis for subsequent analysis and reporting by NPS or third parties.
To set up a project, follow the instructions in the Article, "Starting a DRR".

The following is for users who are using the DRR template file to generate a data release report using RMarkdown.

In addition to the report outline and a description of content for each section, the template includes four standard code chunks.

**YAML Header:**

The YAML header helps format the DRR. You should not need to edit any of the YAML header.

**R code chunks:**
- `user_edited_parameters`. A series of parameters that are used in the creation of the DRR and may be re-used in metadata and associated data package construction. You will need to edit these parameters for each DRR.
    - `title`. The title of the Data Release Report.
    - `reportNumber`. This is optional, and should only be included if publishing in the semi-official DRR series. Set to NULL if there is no report number.
    - `DRR_DSRefID`. The DataStore reference ID for the report.
    - `authorNames`. A list of the authors' names.
    - `authorAffiliations`. A list of the authors' affiliations. The order of author affiliations must match the order of the authors in the `authorNames` list. Note that the entirety of each affiliation is enclosed in a single set of quotations. Line breaks are indicated with the
    - `authorORCID`. A list of ORCID iDs for each author in the format "xxxx-xxxx-xxxx-xxxx". If an author does not have an ORCID iD, specify NA (no quotes). The order of ORCID iDs (and NAs) must correspond to the order of authors in the `authorNames` list. Future iterations of the DRR Template will pull ORCID iDs from metadata and eventually from Active Directory. See ORCID for more information about ORCID iDs or to register an ORCID iD.
    - `DRRabstract`. The abstract for the DRR (which may be distinct from the data package abstract). Pay careful attention to non-standard characters, line breaks, carriage returns, and curly quotes. You may find it useful to write the abstract in Notepad or some other text editor and NOT a word processor (such as Microsoft Word). Indicate line breaks with and a space between paragraphs - should you want them - using . The abstract should succinctly describe the study, the assay(s) performed, the resulting data, and their reuse potential, but should not make any claims regarding new scientific findings. No references are allowed in this section. A good suggested length for abstracts is less than 250 words.
    - `dataPackageRefID`. The DataStore reference ID for the data package associated with this report. You must have at least one data package. Eventually, we will automate importing much of this information from metadata and include the ability to describe multiple data packages in a single DRR.
    - `dataPackageTitle`. The title of the data package. Must match the title on DataStore (and metadata).
    - `dataPackageDescription`. A short title/subtitle or short description for the data package. Must match the data package metadata.
    - `dataPackageDOI`. Auto-generated; no need to edit or update. This is the data package DOI, based on the DataStore reference number.
    - `dataPackage_fileNames`. List the file names in your data package. Do NOT include metadata files. For example, include "my_data.csv" but do NOT include "my_metadata.xml".
    - `dataPackage_fileSizes`. List the approximate size of each data file. Make sure the order of the file sizes corresponds to the order of file names in `dataPackage_fileNames`.
    - `dataPackage_fileDescript`. A short description of the corresponding data file that helps distinguish it from other data files. A good guideline is 10 words or less. This will be used in a summary table, so brevity is a priority. If you have already created metadata for your data package in EML format, this should be the same text as found in the "entityDescription" element for each data file.
- `setup`. Most users will not need to edit this code chunk. There is one code snippet for loading packages; the `RRpackages` section is a suite of packages used to assist with reproducible reporting. You may not need these for your report, but we have included them as part of the base recommended packages. If you plan to perform your QC as part of the DRR construction process, you can add a second code snippet here to import the packages your QC process needs.
- `title_do_not_edit`. These parameters are auto-generated based on either the EML you supplied (when that becomes an option) or the information you have already supplied under `user_edited_parameters`. You should not need to edit these parameters.
- `authors_do_not_edit`. There is no need to edit this chunk. It writes the author names, ORCID iDs, and affiliations to the .docx document based on information supplied in `user_edited_parameters`.
- `LoadData`. Any datasets you need to load can go here. For most people these datasets are used to generate summary statistics on the proportions of data that were flagged as accepted (A), accepted but estimated (AE), and rejected (R) during the quality control process.
- `FileTable`. Do not edit. Generates a table of file names, sizes, and descriptions in the data package being described by the DRR.
- `dataFlaggingTable`. This sample code provides a summary table defining the suggested data flagging codes. There is no need to edit this table.
- `Listing`. Appendix A, by default, is the code listing. This will generate all code used in generating the report and data packages. In most cases, a code listing is not required. If all QA/QC processes and data manipulations were performed elsewhere, you should cite that code (in the methods and references) and leave the "Listing" code chunk with the default settings of eval=FALSE and echo=FALSE. If you have developed custom scripts, you can add those to DataStore with the reference type "Script" and cite them in the DRR.
- `session-info` is the information about the versions of R and packages used in generating the report. In most cases, you do not need to report session info (leave the session-info code chunk parameters in their default state: eval=FALSE). Session and version information is only necessary if you have set the "Listing" code chunk in Appendix A to eval=TRUE. In that case, change the "session-info" code chunk parameters to eval=TRUE.
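To make the list above concrete, a filled-in `user_edited_parameters` chunk might look roughly like this. Every value below is a placeholder, and the exact set of assignments in the template file itself takes precedence over this sketch:

```r
# Hypothetical example values only -- replace with your own report details.
title <- "Example Vegetation Monitoring Data Release Report"
reportNumber <- NULL            # NULL unless publishing in the official DRR series
DRR_DSRefID <- 1234567          # DataStore reference ID for the report
authorNames <- c("Jane Doe", "John Smith")
authorAffiliations <- c(
  "National Park Service, Example Inventory & Monitoring Network",
  "National Park Service, Example National Park"
)
authorORCID <- c("0000-0000-0000-0000", NA)  # NA for authors without an ORCID iD
DRRabstract <- "A short abstract describing the study, the assays performed,
the resulting data, and their reuse potential."
dataPackageRefID <- 7654321     # DataStore reference ID for the data package
dataPackageTitle <- "Example Vegetation Monitoring Data Package"
dataPackageDescription <- "Quality-controlled vegetation plot data"
dataPackage_fileNames <- c("my_data.csv")
dataPackage_fileSizes <- c("1.2 MB")
dataPackage_fileDescript <- c("Vegetation plot observations")
```

Note how the order of `authorAffiliations` and `authorORCID` mirrors the order of `authorNames`, and how the file-level vectors (`dataPackage_fileNames`, `dataPackage_fileSizes`, `dataPackage_fileDescript`) stay in the same order as one another.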
To automate citations, add the citation in bibtex format to the file "references.bib". You can manually copy and paste the bibtex for each reference in, or you can search for it from within RStudio. From within RStudio, make sure you are editing the DRR rmarkdown template using the "Visual" view (as opposed to "Source"). From the "Insert" drop-down menu, select "@ Citation…" (shortcut: Ctrl-Shift-F8). This will open a Graphical User Interface (GUI) tool where you can view all the citations in your references.bib file as well as search multiple databases for references, automatically insert the bibtex for a reference into your references.bib file (and customize the unique identifier if you'd like), and insert the in-text citation into the DRR template.

Once a reference is in your references.bib file, you can simply type the '@' symbol in the Visual view of the template and select which reference to insert in the text.

If you need to edit how the citation is displayed after inserting it into the text, switch back to the "Source" view. Each bibtex citation starts with a unique identifier; the example reference in the supplied references.bib file has the unique identifier "@article{Scott1994,". Using the "Source" view in RStudio, insert the reference into your text by combining the "at" symbol with the portion of the unique identifier after the curly bracket: @Scott1994.
| Syntax | Result |
|---|---|
| `@Scott1994 concludes that …` | Scott et al., 1994 concludes that … |
| `@Scott1994 [p.33] concludes that …` | Scott (1994, p.33) concludes that … |
| `… end of sentence [@Scott1994].` | … end of sentence (Scott et al., 1994). |
| `… end of sentence [see @Scott1994, p.33].` | … end of sentence (see Scott et al. 1994, p.33). |
| delineate multiple authors with a semicolon: `[@Scott1994; @aberdeen1958]` | delineate multiple authors with a semicolon: (Scott et al., 1994; Aberdeen, 1958) |
| `Scott et al. conclude that … [-@Scott1994]` | Scott et al. conclude that … (1994) |
The full citation, properly formatted, will be included in a "References" section at the end of the rendered MS Word document, though it is also worth visually inspecting the .docx for citation completeness and formatting.
The following text in the body of the DRR template will need to be edited to customize it to each data package.

This is a required section and consists of two subheadings:

- Data inputs - an optional subsection used to describe datasets that the data package is based on if it is a re-analysis, reorganization, or re-integration of previously existing data sets.
- Summary of datasets created - a required subsection used to explain each data record associated with the work (for instance, a data package), including the DOI indicating where this information is stored. It should also provide an overview of the data files and their formats. Each external data record should be cited.

Sample text is included that uses R code to incorporate previously specified parameters such as the data package title, file names, and DOI.

Code for a sample table summarizing the contents of the data package (except the metadata) is provided.
This is a required section. The text includes multiple suggested text elements and code for an example table defining data flagging codes. Near-future development will incorporate additional optional tables that summarize data quality based on the flags in the data sets.
This is a required section that should contain brief instructions to assist other researchers with reuse of the data. This may include discussion of software packages (with appropriate citations) that are suitable for analyzing the assay data files, suggested downstream processing steps (e.g., normalization), or tips for integrating or comparing the data records with other datasets. Authors are encouraged to provide code, programs, or data-processing workflows if they may help others understand or use the data.

This is a required section that cites previously used methods, but it should also describe data production (including the experimental design, data acquisition assays, and any computational processing such as normalization, QA, and QC) in enough detail that others can understand the methods without referring to associated publications.

Optional sub-sections within the methods include:

This required section includes full bibliographic references for each paper, chapter, book, data package, dataset, protocol, etc. cited within the DRR.

There are numerous examples of proper formatting for each of these. Future versions of the DRR will enable automatic reference formatting given a correctly formatted bibtex file (.bib) with the references.
Figures should be inserted using code chunks in all cases so that figure settings can be set in the chunk header. The chunk header should, at a minimum, set the fig.align parameter to "center" and specify the figure caption (the fig.cap parameter). Inserting figures this way ensures that the caption is properly formatted and copies the caption to the figure's "alt text" tag, making it 508-compliant.
For example:

```{r fig2, echo=FALSE, out.width="70%", fig.align="center", fig.cap="Example general workflow to include in the methods section."}
knitr::include_graphics("ProcessingWorkflow.png")
```
Results in:
Tables should be created using the `kable` function. Specifying the caption in the `kable` function call (as opposed to inline markdown text) will ensure that the caption is appropriately formatted.
+For example:
```{r Table2, echo=FALSE}
# kable_styling() requires the kableExtra package
c1 <- c("Protocol1", "Protocol2", "Protocol3")
c2 <- c("Park Unit 1", "Park Unit 2", "Park Unit 3")
c3 <- c("Site 1", "Site 2", "Site 3")
c4 <- c("Date 1", "Date 2", "Date 3")
c5 <- c("GEOXXXXX", "GEOXXXXX", "GEOXXXXX")
Table2 <- data.frame(c1, c2, c3, c4, c5)

kable(Table2,
      col.names = c("Subjects", "Park Units", "Locations", "Sampling Dates", "Data"),
      caption = "**Table 1.** Monitoring study example Data Records table.") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                full_width = FALSE)
```
Results in:
| Subjects | Park Units | Locations | Sampling Dates | Data |
|---|---|---|---|---|
| Protocol1 | Park Unit 1 | Site 1 | Date 1 | GEOXXXXX |
| Protocol2 | Park Unit 2 | Site 2 | Date 2 | GEOXXXXX |
| Protocol3 | Park Unit 3 | Site 3 | Date 3 | GEOXXXXX |
Because data release reports and associated data packages are cross-referential, report numbers are typically assigned early in data processing and quality evaluation.

- DataStore Reference Numbers. When developing a report and data packages, DataStore references should be created as early in the process as practicable. While the report and data packages are in development, these references should not be activated.
- Report Numbers. If you are planning to publish a Data Release Report with an official DRR number, please contact the IMD Deputy Chief with the DataStore reference number associated with the DRR.
- Persistent Identifiers. Digital object identifiers (DOIs) will be assigned to all DRRs and concurrently published data packages. DOIs resolve to a DataStore Reference; DOIs are reserved when a draft reference is initiated in DataStore. They are not activated until the publication process, including relevant review, is complete.
DRR DOIs have the format: https://doi.org/10.36967/xxxxxxx

Data package DOIs have the format: https://doi.org/10.57830/xxxxxxx

where "xxxxxxx" is the 7-digit DataStore reference number.
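For example, given a DataStore reference number, the corresponding DOI strings can be assembled by simple pasting (a sketch; the reference number below is made up):

```r
# Build DRR and data package DOIs from a hypothetical 7-digit
# DataStore reference number
ref_number <- 1234567
drr_doi <- paste0("https://doi.org/10.36967/", ref_number)
data_package_doi <- paste0("https://doi.org/10.57830/", ref_number)
drr_doi  # "https://doi.org/10.36967/1234567"
```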
Under no circumstances should reports and associated data packages or metadata published in the DRR series contain disclaimers or text that suggests that the work does not meet scientific integrity or information quality standards of the National Park Service. The following disclaimers are suitable for use, depending on whether the data are provisional or final (or approved or certified).

> For approved & published data sets: "Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality standards relative to the purpose for which the data were collected. Although these data and associated metadata have been reviewed for accuracy and completeness and approved for release by the National Park Service Inventory and Monitoring Division, no warranty expressed or implied is made regarding the display or utility of the data for other purposes, nor on all computer systems, nor shall the act of distribution constitute any such warranty."

> For provisional data: "The data you have secured from the National Park Service (NPS) database identified as [database name] have not received approval for release by the NPS Inventory and Monitoring Division, and as such are provisional and subject to revision. The data are released on the condition that neither the NPS nor the U.S. Government shall be held liable for any damages resulting from its authorized or unauthorized use."
Baker R, Patterson J, DeVivo J, Quevedo I (2024). QCkit: NPS Inventory and Monitoring Quality Control Toolkit.
-R package version 0.1.4, https://github.com/nationalparkservice/QCkit/.
+R package version 0.1.5, https://github.com/nationalparkservice/QCkit/.

@Manual{,
  title = {QCkit: NPS Inventory and Monitoring Quality Control Toolkit},
  author = {Robert Baker and Judd Patterson and Joe DeVivo and Issac Quevedo},
  year = {2024},
-  note = {R package version 0.1.4},
+  note = {R package version 0.1.5},
  url = {https://github.com/nationalparkservice/QCkit/},
}

diff --git a/docs/index.html b/docs/index.html
index 5bb5cee..7a8aff9 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -33,17 +33,32 @@
       QCkit
-      0.1.4
+      0.1.5
NEWS.md

# QCkit v0.1.5
2024-02-09
* This version adds the DRR template, example files, and associated documentation to the QCkit package.

# QCkit v0.1.4
2024-01-23
* Maintenance on `get_custom_flag()` to align with updated DRR requirements
* Added function `replace_blanks()` to ingest a directory of .csvs and write them back out to .csv (overwriting the original files) with blanks converted to NA (except if a file has NO data - then it remains blank and needs to be dealt with manually)