This repository offers cohorts of GA4GH Phenopackets that represent individuals with Mendelian diseases, as described in Danis et al., HGG Advances 2025.
Please see the online documentation for information about available cohorts.
Phenopacket store releases are available for download from the Releases section.
The latest release ZIP archive is available for download from a stable URL:
https://github.com/monarch-initiative/phenopacket-store/releases/latest/download/all_phenopackets.zip
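For users who prefer to work with the archive directly, the following is a minimal sketch that uses only the Python standard library. It downloads the ZIP from the stable URL above and counts the JSON files per folder. The helper names `download_release` and `count_phenopackets_per_folder` are hypothetical, and the assumption that each cohort sits in its own folder of `.json` files is ours, so check it against the archive you download.

```python
import shutil
import urllib.request
import zipfile
from collections import Counter
from pathlib import Path, PurePosixPath

# Stable URL of the latest release archive, as given above.
LATEST_ZIP_URL = (
    "https://github.com/monarch-initiative/phenopacket-store"
    "/releases/latest/download/all_phenopackets.zip"
)


def download_release(url: str, dest: Path) -> Path:
    """Download the archive at `url` to `dest` and return `dest`."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest


def count_phenopackets_per_folder(zip_path: Path) -> Counter:
    """Count JSON files in the archive, grouped by parent folder name.

    Assumption: each cohort lives in its own folder of `.json` files.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            path = PurePosixPath(name)  # ZIP entries always use "/" separators
            if path.suffix == ".json":
                counts[path.parent.name] += 1
    return counts
```

Calling `download_release(LATEST_ZIP_URL, Path("all_phenopackets.zip"))` and passing the returned path to `count_phenopackets_per_folder` should give a quick overview of what the archive contains.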
We provide special support for Python with the Phenopacket Store Toolkit to simplify accessing the Phenopacket Store data in downstream applications.
The toolkit is available from the Python Package Index (PyPI) and can be installed, e.g. with pip:
python3 -m pip install phenopacket-store-toolkit
After installation, loading phenopackets from Phenopacket Store is straightforward. First, we create a Phenopacket Store registry, an object for managing local data files of Phenopacket Store releases:
from ppktstore.registry import configure_phenopacket_registry
registry = configure_phenopacket_registry()
By default, the registry keeps the files in a data directory at $HOME/.phenopacket-store (or similar on Windows), but this can be configured if desired.
Then, we can use the registry to load phenopackets of a cohort, e.g. SUOX of release 0.1.18:
with registry.open_phenopacket_store(release="0.1.18") as ps:
    phenopackets = list(ps.iter_cohort_phenopackets("SUOX"))
assert len(phenopackets) == 35
The registry checks the data directory to see whether the 0.1.18 release ZIP file has already been downloaded.
If the file is absent, the registry downloads it from GitHub. Then, we open Phenopacket Store as ps and load the 35 phenopackets of the SUOX cohort.
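The check-then-download behavior described above can be illustrated with a small sketch. This is not the toolkit's implementation: `ensure_release_zip`, the `fetch` callback, and the `<release>.zip` file naming are all hypothetical and serve only to show the caching idea.

```python
from pathlib import Path
from typing import Callable


def ensure_release_zip(
    store_dir: Path,
    release: str,
    fetch: Callable[[str, Path], None],
) -> Path:
    """Return the path to the ZIP of `release`, fetching it only if missing.

    `fetch(release, destination)` is a hypothetical callback that writes
    the archive to `destination`, for example by downloading it from GitHub.
    """
    store_dir.mkdir(parents=True, exist_ok=True)
    zip_path = store_dir / f"{release}.zip"  # hypothetical file naming
    if not zip_path.is_file():
        # Cache miss: fetch to a temporary name first, so an interrupted
        # download never leaves a partial file under the final name.
        partial = zip_path.with_name(zip_path.name + ".part")
        fetch(release, partial)
        partial.replace(zip_path)
    return zip_path
```

With this pattern a second request for the same release returns the cached file without calling `fetch` again, which mirrors the behavior the registry is described as having.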
More info about Phenopacket Store Toolkit is available in its documentation.
The cohorts were curated from data in medical publications, mainly by parsing the tables or supplemental tables. The curation was done in Jupyter notebooks using the pyphetools library. Pull requests with additional notebooks in the same style are welcome.
If you use Phenopacket Store in a scientific publication, we would appreciate citations to the following paper:
A corpus of GA4GH phenopackets: Case-level phenotyping for genomic diagnostics and discovery, Danis et al., Human Genetics and Genomics Advances, Volume 6, Issue 1, 100371