QuantGen/G2P-Datasets

G2P-Datasets

G2P-Datasets is a platform for accessing more than 100 public genome-to-phenome datasets for plants and animals.

How to use G2P-Datasets

The hosted datasets and code can be searched through the web application.

Accessing datasets and analyses

To browse datasets, open the "Datasets" module of the web app (the default module) and type metadata terms (species, type of study, etc.) into the search box. Additional metadata fields (n Genotypes, n Markers, etc.) can each be used as filters in the search table. Click a dataset in the table to view its summary. Below the summary, you can access code to load the dataset from its external database (G2P-Datasets links to datasets in situ and does not store them itself), convert the data to a standard format for analysis, and run genomic prediction on the data using a range of provided models.
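The web app supplies its own loading and modeling code, which is not reproduced here. As a rough illustration of what a genomic prediction step can look like once genotypes and phenotypes are in a numeric matrix, here is a minimal sketch using ridge regression on marker effects, a common baseline in the field. Everything in it is an assumption made for illustration: the data are synthetic, the 0/1/2 genotype coding is only one possible standard format, and `ridge_predict` and the penalty `lam` are hypothetical names rather than part of G2P-Datasets.

```python
import numpy as np


def ridge_predict(X_train, y_train, X_test, lam=1.0):
    """Fit ridge-regression marker effects and predict phenotypes for X_test.

    Hypothetical helper for illustration; not the models shipped with the app.
    """
    mu = y_train.mean()
    col_means = X_train.mean(axis=0)
    Xc = X_train - col_means  # center markers using training means
    p = Xc.shape[1]
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - mu) for the marker effects.
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y_train - mu))
    return mu + (X_test - col_means) @ beta


# Synthetic data (assumption): 200 genotypes, 50 markers coded 0/1/2.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.integers(0, 3, size=(n, p)).astype(float)
true_effects = rng.normal(0.0, 0.5, size=p)
y = X @ true_effects + rng.normal(0.0, 1.0, size=n)

# Hold out the last 50 genotypes to estimate predictive ability.
train, test = np.arange(150), np.arange(150, 200)
y_hat = ridge_predict(X[train], y[train], X[test], lam=10.0)
accuracy = np.corrcoef(y_hat, y[test])[0, 1]
print(f"predictive correlation on held-out genotypes: {accuracy:.2f}")
```

With a real dataset, `X` and `y` would come from the loading and formatting code shown on the dataset's page, and the penalty would normally be tuned by cross-validation rather than fixed.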

Contributing a dataset

To contribute a dataset, first check that it is not already in the repository (see Accessing datasets and analyses). If it is not, (i) go to the "Add dataset" module of the web app and (ii) fill in all required fields with the dataset's metadata and the code to load the data. The app then packages the metadata and code into a standard format that can be pushed to the repo as-is. (iii) Download the packaged dataset .zip file, unzip it, and (iv) submit it to the repository.

Please follow the instructions in this document to propose adding a dataset to the web application. A member of our team will review your request and approve or deny it after evaluating potential duplication, the relevance of the dataset to the project, the presence of a DOI, and the accuracy of the metadata.
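Before submitting, it can help to run a quick self-check against the review criteria listed above. The sketch below is only an illustration: the field names in `REQUIRED_FIELDS`, the `check_submission` helper, and the example DOIs are all hypothetical, and the actual required fields are whatever the "Add dataset" module asks for.

```python
import re

# Hypothetical field names; the web app defines the real required fields.
REQUIRED_FIELDS = ("species", "study_type", "n_genotypes", "n_markers", "doi")

# Loose syntactic check: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def check_submission(metadata, existing_dois):
    """Return a list of problems found in a candidate dataset's metadata."""
    problems = []
    for field in REQUIRED_FIELDS:
        if metadata.get(field) in (None, ""):
            problems.append(f"missing field: {field}")

    # DOIs are case-insensitive, so compare them in lowercase.
    doi = str(metadata.get("doi") or "").strip().lower()
    if doi and not DOI_PATTERN.match(doi):
        problems.append("DOI is not in the form 10.<registrant>/<suffix>")
    if doi and doi in {d.strip().lower() for d in existing_dois}:
        problems.append("a dataset with this DOI is already listed")
    return problems


# Placeholder values for illustration only.
existing = ["10.1234/example.0001"]
candidate = {
    "species": "Manihot esculenta",
    "study_type": "genomic prediction",
    "n_genotypes": 500,
    "n_markers": 10000,
    "doi": "10.1234/EXAMPLE.0001",
}
for problem in check_submission(candidate, existing):
    print(problem)
```

A check like this only catches missing fields, malformed DOIs, and exact DOI duplicates. Relevance to the project and the accuracy of the metadata still need the reviewer's judgment.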

Data and Code Availability

The web application code and repository used in this study have been permanently archived in Zenodo and are publicly available.

Repository: https://doi.org/10.5281/zenodo.17604233

Web application: https://doi.org/10.5281/zenodo.17604237

Databases specific to a species

The following are a few currently available species-specific plant genome databases. To download a dataset, go to the corresponding website and follow its download instructions.

  1. Cassava: https://cassavabase.org/
  2. Sweetpotato: https://sweetpotatobase.org/
  3. Banana: https://musabase.org/
  4. Solanaceae: https://solgenomics.net/
  5. Cotton: https://db.cngb.org/cottonGVD/
