Heliconia Demography Project

Repository Overview

This repository is for the cleanup, organization, and archiving of demographic survey data collected as part of the Heliconia Demography Project. These procedures are carried out by executing two R scripts (see Workflow, below). An overview of the 1998-2002 surveys and the associated metadata have been submitted to Ecology for publication as a data paper; upon acceptance the demographic data will be archived in the Dryad Digital Repository. There is a separate GitHub repository for the 2023 Ecology Data Paper; that repository includes the final version of the paper (in .pdf format) and the .Rmd files containing the text and the code for analyses, data summaries, figures, and tables.

This repository includes the following:

  1. R Code used to:

  2. Data:

  3. Data validation algorithms and their output

  4. Summaries of the demographic data (e.g., total number of plants, total number of plants per plot, total number of seedlings per year).

  5. A log of updates and corrections.

  6. HDP Publications and publicly available data sets.

  7. Methodological information and records, including:

Workflow

STEP 1. Correct, organize, & review the data with 01_clean_survey_data.R

Code: The functions in 01_clean_survey_data.R will consolidate the 'raw' survey data, clean it, organize it in tidy form, and conduct a series of validation procedures.

  • ha_data <- clean_heliconia_data() calls several other functions found in the folder code/survey_cleaning. These functions include an .R script for cleaning and correcting the records for plants found in each demographic plot and for producing .csv files of 'clean' data and of any records recommended for follow-up review.

  • create_plot_info_file() will create a .csv file of plot-level descriptors.

  • create_tag_changes_file() creates a .csv of all the plants whose tags were replaced during the field survey (necessary only if one is reviewing the survey history of individual plants using the original data sheets).

  • create_plot_treefalls_file() creates a .csv with records of any new tree falls and gaps noted in the demographic plots during the survey. (NB: review of these records is currently in progress.)

  • create_plant_damage_file() creates a .csv with any observations by the survey team of plants that were damaged by fallen branches or trees. (NB: review of these records is currently in progress.)

Output: The .csv files produced by these functions are saved to the folder data/survey_clean. Executing the code also creates or edits .txt files with the relevant file's version number and date of most recent update (see 'File Versioning', below).

File Versioning: To ensure reproducibility, users must know the precise version of a data set they used in their analyses. Below each function is a snippet of code entitled create version files; uncommenting and running this code will create or update the file recording the version number of the file being created (see 'Frictionless Standards').

The first time the files are 'cleaned' or 'created', a .txt file will automatically be created assigning the version number 1.0.0 with the date of file creation. If a file already exists, the user will be asked if the file being created is an updated version. 'N' will execute the code without changing the version number or date; 'Y' will trigger a follow-up question of whether the new version is a major, minor, or patch update. The corresponding component of the version number will then be incremented by 1 (e.g., major: 1.0.0 -> 2.0.0, minor: 1.0.0 -> 1.1.0, patch: 1.0.0 -> 1.0.1).

  • [NB: this was automated but is temporarily manual to allow automated validation, see details here].
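The increment rule described above can be sketched in a few lines of R. This is only an illustration, not the repository's own code: the function name bump_version() is hypothetical, and resetting the lower components to zero after a major or minor update is an assumption based on the usual semantic versioning convention.

```r
# Hypothetical helper (not part of this repository): bump one component
# of a "major.minor.patch" version string.
bump_version <- function(version, type = c("major", "minor", "patch")) {
  type <- match.arg(type)
  # Split "1.0.0" into its three integer components
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 3, !anyNA(parts))
  if (type == "major") {
    # Assumption: lower components reset to 0 on a major update
    parts <- c(parts[1] + 1L, 0L, 0L)
  } else if (type == "minor") {
    # Assumption: the patch component resets to 0 on a minor update
    parts <- c(parts[1], parts[2] + 1L, 0L)
  } else {
    parts[3] <- parts[3] + 1L
  }
  paste(parts, collapse = ".")
}

bump_version("1.0.0", "major")  # "2.0.0"
bump_version("1.0.0", "minor")  # "1.1.0"
bump_version("1.0.0", "patch")  # "1.0.1"
```

The three calls reproduce the examples given in the text.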

Data Validation & Review: Once the file heliconia_survey_clean.csv has been saved to the data/survey_clean folder, the function review_heliconia_data() conducts a series of data validation procedures to flag any records to review before preparing the files to be archived at the Dryad Digital Repository.

  • The functions for this review are in the folder code/survey_review.

  • These and other validations are also carried out using the pointblank package. The output of the data validation process suggesting records for review is here.

  • Any individual plant records that are flagged for review by review_heliconia_data() will be saved as .csv files in the folder data/survey_review. They can also be downloaded as .csv files from the Data Validation page.
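As a rough illustration of what flagging records for review can look like, the sketch below applies two made-up rules to a toy table in base R. It does not reproduce review_heliconia_data() or the pointblank checks; the column names, values, rules, and the function flag_for_review() are all hypothetical, and the output goes to a temporary folder rather than data/survey_review.

```r
# Toy survey table; every column name and value here is hypothetical.
survey <- data.frame(
  plant_id = c(1, 2, 3, 4),
  year     = c(1999, 1999, 2000, 2000),
  shoots   = c(3, NA, 2, 5),
  height   = c(45, 60, -10, 120)
)

# Hypothetical rules: flag a record if the shoot count is missing
# or if the recorded height is zero or negative.
flag_for_review <- function(df) {
  needs_review <- is.na(df$shoots) | (!is.na(df$height) & df$height <= 0)
  df[needs_review, , drop = FALSE]
}

flagged <- flag_for_review(survey)

# Save the flagged records as a .csv (temporary folder for this demo only)
out_file <- file.path(tempdir(), "flagged_records.csv")
write.csv(flagged, out_file, row.names = FALSE)
```

Here flagged holds the two toy records that break a rule (plant_id 2 and 3), which is the kind of subset one would then check against the original data sheets.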

STEP 2. Prepare the files for archiving at Dryad with 02_create_survey_archive.R.

Code: 02_create_survey_archive.R will prepare the version of the 'clean' survey data and file of plot descriptors that are archived in Dryad.

  • Uncommenting and running the snippet of code entitled create version files will prompt the user to answer if they are creating an updated version of the data set, and if so, if the version is a major, minor, or patch update.

  • create_dryad_file() will then create .csv files of (1) plot descriptors and (2) the survey data that were archived in Dryad (NB: The demographic data file uploaded to Dryad excludes some of the redundant plot identification codes and the x-y coordinates of individual plants). The function generating and saving these files is found in the folder code/survey_archive, as is the create_version_file.R script used to update the version_info.txt file.

  • These resulting .csv files are saved to the folder data/survey_archive.
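The idea of producing an archive file that omits redundant plot codes and plant coordinates can be sketched as follows. This is not create_dryad_file() itself: the toy table, its column names, and the list of dropped columns are hypothetical, and the file is written to a temporary folder rather than data/survey_archive.

```r
# Toy 'clean' table; all column names and values are hypothetical.
clean <- data.frame(
  plot          = c("A", "A"),
  plot_code_alt = c("A-1", "A-1"),  # stands in for a redundant plot code
  tag           = c(101, 102),
  x_coord       = c(1.5, 3.2),
  y_coord       = c(0.4, 2.1),
  height        = c(45, 60)
)

# Columns to leave out of the archived version (assumed names)
drop_cols <- c("plot_code_alt", "x_coord", "y_coord")
archive <- clean[, setdiff(names(clean), drop_cols), drop = FALSE]

# Save the reduced table as a .csv (temporary folder for this demo only)
archive_file <- file.path(tempdir(), "survey_archive_demo.csv")
write.csv(archive, archive_file, row.names = FALSE)

names(archive)  # "plot" "tag" "height"
```

Selecting columns by name with setdiff() keeps the step explicit about what is excluded, so the archived file can be traced back to the full 'clean' file.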

Improvements, Suggestions, & Questions

We welcome any suggestions for package improvement or ideas for features to include in future versions. If you have Issues, Feature Requests and Pull Requests, here is how to contribute. We expect everyone contributing to the package to abide by our Code of Conduct.

Contributors

Citation

Please cite both the Data Paper and Dryad Repository when using these data for research, publications, teaching, etc.

If you wish to cite this repository, please cite as follows:

@misc{BrunaSurveys2023,
author = {Bruna, E. M. and Scott, Eric R.},
title = {Heliconia Demography Project},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
note = {data v1.0.0.},
url={https://github.com/BrunaLab/HeliconiaSurveys}
}