This repository houses the LinkML representation of the data model for the Canadian Inborn Errors of Immunity National Registry (CIEINR) and all of its configurations for instant export to GA4GH Phenopackets using the RareLink v2.0.0.dev1 engine.
The Canadian Inborn Errors of Immunity National Registry (CIEINR) aims to collect and standardize data on patients with inborn errors of immunity (IEI) across Canada via REDCap. This repository provides the LinkML representation of the CIEINR data model, enabling the creation of interoperable data structures and facilitating the export of patient data into Phenopackets. This approach aligns with the goals of improving diagnosis, treatment, and research for IEI patients, as detailed in Genotype-first approach to the diagnosis of primary immunodeficiencies: a Canadian perspective.
This project integrates with RareLink to export data into Phenopackets, promoting data sharing and analysis using standardized formats. The forms used in this project are developed based on the rules found in RareLink's documentation for developing REDCap instruments.
- LinkML Data Model: Defines the structure of the CIEINR data, ensuring data consistency and interoperability.
- USIDNET Catalogue: The CIEINR data model is based upon the USIDNET data model. A detailed mapping will follow soon.
- RareLink Integration: Enables the export of CIEINR data into Phenopackets.
- Phenopacket Generation: Facilitates the creation of standardized Phenopackets for patient data.
- REDCap Instrument Alignment: The forms follow RareLink's guidelines for developing REDCap instruments.
- Data Standardization: Promotes the use of standardized terminologies and data formats.
- Python 3.10, 3.11, or 3.12 (not compatible with Python 3.13 due to LinkML dependencies)
- pip
- Git (for cloning the repository with submodules)
- LinkML Toolkit (1.8.0+)
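The Python version constraint above can be checked before installing. The following is a minimal sketch, not part of CIEINR or RareLink; the helper name `is_supported_python` is hypothetical.

```python
import sys

# Supported minor versions per the prerequisites above (3.13 is excluded).
SUPPORTED_MINORS = {10, 11, 12}


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is Python 3.10, 3.11, or 3.12."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in SUPPORTED_MINORS


if __name__ == "__main__":
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status} for CIEINR.")
```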
- Clone the repository with submodules:
    git clone https://github.com/your-org/cieinr.git
    cd cieinr
    git submodule update --init --recursive
- Create and activate a virtual environment (optional but recommended):
    python -m venv .venv
    source .venv/bin/activate  # On Windows, use: .venv\Scripts\activate
- Install the package in development mode:
    pip install -e .
This will install CIEINR and its dependencies, including RareLink from the submodule.
You can find the LinkML definition of the entire CIEINR-REDCap data model here:
And the corresponding Python schemas here:
All value sets are also defined in these locations.
CIEINR implements the complete IUIS2024 classification and encodes all disease values using MONDO.
You can find the disease definitions in this file:
Or import the enum directly via:
from src.cieinr.v1_0_0.python_schemas.form_1_basic import IUIS2024MONDOEnum
⚠️ 45 diseases are not yet represented in MONDO. Workshops with ESID, USIDNET, and others are planned to improve MONDO coverage of immunological diseases. Contact us for more info. All other diseases are MONDO-encoded, enabling harmonized Phenopackets for precise downstream analysis.
RareLink is included as a Git submodule and installed automatically with the package.
To make sure it is set up correctly, run:
rarelink framework update
rarelink framework status
First, configure the REDCap API keys for your local REDCap project:
rarelink setup keys
Check the configuration with:
rarelink setup view
Important: Ensure .env and rarelink_apiconfig.json are listed in .gitignore. These files contain sensitive credentials and must remain local and private.
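One way to confirm that both files are ignored is a small script run from the repository root. This is a sketch under stated assumptions: the helper `missing_from_gitignore` is hypothetical, and it only matches literal entries, so glob patterns such as `*.json` are not recognised.

```python
from pathlib import Path

# File names taken from the note above.
SENSITIVE_FILES = (".env", "rarelink_apiconfig.json")


def missing_from_gitignore(gitignore_path=".gitignore", required=SENSITIVE_FILES):
    """Return the required entries that are not listed verbatim in .gitignore."""
    path = Path(gitignore_path)
    if not path.exists():
        # Without a .gitignore, nothing is ignored.
        return list(required)
    entries = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines and comments; tolerate a leading slash.
        if line and not line.startswith("#"):
            entries.add(line.lstrip("/"))
    return [name for name in required if name not in entries]


if __name__ == "__main__":
    missing = missing_from_gitignore()
    if missing:
        print("Add to .gitignore:", ", ".join(missing))
    else:
        print("Sensitive files are listed in .gitignore.")
```

For an authoritative answer that also honours patterns, `git check-ignore -v .env rarelink_apiconfig.json` reports which rule, if any, matches each file.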
Once data has been captured, download it with:
rarelink redcap download-records
Move and rename the raw REDCap data file to:
res/redcap_data.json
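The move-and-rename step can also be scripted. The sketch below is illustrative only: the function `stage_redcap_export` is hypothetical, and the location of the downloaded file depends on your RareLink configuration, so it is passed in as an argument rather than assumed.

```python
import shutil
from pathlib import Path


def stage_redcap_export(downloaded_file, target="res/redcap_data.json"):
    """Move a downloaded REDCap export to the path the transform step expects."""
    source = Path(downloaded_file)
    destination = Path(target)
    if not source.is_file():
        raise FileNotFoundError(f"No REDCap export found at {source}")
    # Create res/ if it does not exist yet.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(destination))
    return destination
```

Call it with the path of the file written by the download step, for example `stage_redcap_export("path/to/your-export.json")`, where the argument is a placeholder.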
Then transform the data into the CIEINR-LinkML format:
python src/cieinr/utils/transform_redcap2linkml.py
Once transformation is complete, export the data as Phenopackets using:
rarelink phenopackets export \
--input-path cieinr_linkml.json \
--output-dir res/phenopackets \
--mappings src/cieinr/v1_0_0/mappings/phenopackets/combined.py
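After the export, a quick sanity check of the output directory can catch incomplete files. This sketch assumes the exporter writes one protobuf-style JSON file per Phenopacket into `res/phenopackets`, with the top-level `id` and `metaData` fields that the GA4GH Phenopacket Schema v2 requires. The helper `check_phenopackets` is hypothetical and is not a substitute for full schema validation.

```python
import json
from pathlib import Path

# Required top-level fields of a GA4GH Phenopacket (v2) in protobuf-style JSON.
REQUIRED_KEYS = ("id", "metaData")


def check_phenopackets(output_dir="res/phenopackets"):
    """Map each JSON file name in output_dir to the required keys it lacks."""
    problems = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            packet = json.load(handle)
        missing = [key for key in REQUIRED_KEYS if key not in packet]
        if missing:
            problems[path.name] = missing
    return problems


if __name__ == "__main__":
    issues = check_phenopackets()
    if issues:
        for name, missing in issues.items():
            print(f"{name}: missing {', '.join(missing)}")
    else:
        print("All exported files contain the required top-level fields.")
```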
🔐 Note: All data must remain within your local site and secure environment.
This repository and the data model of the Canadian Inborn Errors of Immunity National Registry (CIEINR) are licensed under the open-source Apache 2.0 license.
- This project is inspired by the research on inborn errors of immunity and the need for standardized data collection.
- We acknowledge the RareLink project for providing the tools and guidelines for Phenopacket generation.
- We acknowledge the paper, Genotype-first approach to the diagnosis of primary immunodeficiencies: a Canadian perspective, for the general information regarding CIEINR.