G2P-Datasets is a platform for accessing more than 100 public genome-to-phenome datasets for plants and animals.
The hosted datasets and code can be searched through the following web application.
To browse datasets, go to the "Datasets" module of the web app (the default module) and enter metadata terms (species, type of study, etc.) in the search box. Additional metadata fields (n Genotypes, n Markers, etc.) can be applied individually as filters in the search table. Click a dataset in the table to view its summary. Below the summary, you can access code to load the dataset from its external database (G2P-Datasets links to datasets in situ and does not store them itself), convert the data to a standard format for analysis, and perform genomic prediction with a range of provided models.
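The search-and-filter behavior described above can be pictured with a small sketch. This is not the web app's code: the `DatasetRecord` fields, the `search` function, and the example records are all hypothetical and made up for illustration, assuming only that each dataset carries text metadata (species, type of study) and numeric metadata (n Genotypes, n Markers).

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    # Hypothetical metadata fields; the web app's real schema may differ.
    name: str
    species: str
    study_type: str
    n_genotypes: int
    n_markers: int


def search(records, text="", min_genotypes=0, min_markers=0):
    """Return records whose text metadata contains `text` (case-insensitive)
    and whose numeric metadata meets both minimum thresholds."""
    needle = text.lower()
    hits = []
    for record in records:
        haystack = " ".join(
            [record.name, record.species, record.study_type]
        ).lower()
        if (
            needle in haystack
            and record.n_genotypes >= min_genotypes
            and record.n_markers >= min_markers
        ):
            hits.append(record)
    return hits


# Made-up records for illustration only; they do not describe real datasets.
records = [
    DatasetRecord("toy_wheat_panel", "wheat", "genomic prediction", 500, 1200),
    DatasetRecord("toy_maize_panel", "maize", "GWAS", 250, 50000),
    DatasetRecord("toy_cattle_panel", "cattle", "genomic prediction", 3000, 45000),
]

# Free-text search combined with a numeric filter.
large_gp = search(records, "genomic prediction", min_genotypes=1000)
print([record.name for record in large_gp])
```

In this sketch an empty search string matches every record, so the numeric filters can also be used on their own.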
To contribute a dataset to the repository, first make sure it is not already in the repository (see Accessing datasets and analyses). If it is not present, (i) go to the "Add dataset" module of the web app and (ii) fill in all required fields for the dataset's metadata and the code that loads the data. The app then packages the provided metadata and code into a standard format that can be pushed to the repo as-is. (iii) Download the packaged dataset .zip file, unzip it, and (iv) submit it to the repository.
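As a rough picture of what "packaging metadata and code into a standard format" could involve, the sketch below checks for required fields and bundles the metadata and loader code into a .zip archive. The required field names, the file names `metadata.json` and `load_data.py`, and the `package_dataset` function are assumptions made for this example; the actual layout produced by the "Add dataset" module may differ.

```python
import io
import json
import zipfile

# Hypothetical required fields; the app's real form may ask for others.
REQUIRED_FIELDS = ("name", "species", "doi")


def package_dataset(metadata: dict, loader_code: str) -> bytes:
    """Validate the metadata and return the bytes of a .zip archive
    containing the metadata and the data-loading code."""
    missing = [field for field in REQUIRED_FIELDS if not metadata.get(field)]
    if missing:
        raise ValueError(f"missing required metadata fields: {missing}")

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # File names are illustrative, not the repository's actual convention.
        archive.writestr("metadata.json", json.dumps(metadata, indent=2))
        archive.writestr("load_data.py", loader_code)
    return buffer.getvalue()


# Placeholder values for illustration only.
example_zip = package_dataset(
    {"name": "toy_dataset", "species": "example species", "doi": "10.0000/example"},
    "print('load the dataset from its external database here')\n",
)
```

Validating before writing the archive mirrors the review criteria in spirit: a submission without a DOI or with incomplete metadata is caught early rather than at review time.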
Please follow the instructions in this document to propose adding a dataset to the web application. A member of our team will review your request and approve or deny it after evaluating potential duplication, the relevance of the dataset to the project, the presence of a DOI, and the accuracy of the metadata.
The web application code and repository used in this study have been permanently archived in Zenodo and are publicly available.
Repository: https://doi.org/10.5281/zenodo.17604233
Web application: https://doi.org/10.5281/zenodo.17604237
The following are a few currently available plant genome databases, each specific to a species or plant family. To download a dataset, go to the corresponding website and follow its download instructions.
- Cassava: https://cassavabase.org/
- Sweetpotato: https://sweetpotatobase.org/
- Banana: https://musabase.org/
- Solanaceae: https://solgenomics.net/
- Cotton: https://db.cngb.org/cottonGVD/