Towards Cardiac MRI Foundation Models: Comprehensive Visual-Tabular Representations for Whole-Heart Assessment and Beyond
Yundi Zhang, Paul Hager, Che Liu, Suprosanna Shit, Chen Chen, Daniel Rueckert, and Jiazhen Pan
This repository contains the code used in the research paper published in Medical Image Analysis: Towards Cardiac MRI Foundation Models: Comprehensive Visual-Tabular Representations for Whole-Heart Assessment and Beyond.
In this project, we introduce a multimodal framework, ViTa, for a comprehensive, patient-specific understanding of cardiac health. By integrating rich 3D+T cine CMR data (from both short-axis and long-axis views) with detailed tabular health records from 42,000 UK Biobank participants, ViTa enables context-aware representation learning that supports a wide range of cardiac and metabolic health tasks in a single unified framework.
Key features include:
- Multimodal Integration: Combines CMR imaging with patient-level health indicators (e.g., sex, BMI, smoking status) for holistic cardiac assessment.
- Rich Spatio-temporal Imaging: Utilizes 3D+T cine stacks from multiple anatomical views for complete cardiac cycle representation.
- Unified Framework: Supports multi-plane/multi-frame cardiac MRI segmentation, phenotype prediction, and disease classification within the same model.
ViTa marks a step toward foundation models for cardiology: informative, generalizable, and grounded in patient context.
To get a local copy up and running, follow these steps:
Before you begin, ensure you have met the following requirements:
- Python 3.9+ as the programming language.
- Conda installed (part of the Anaconda or Miniconda distribution).
- pip installed for package management.
- Git installed to clone the repository.
1. Clone the repository

   ```shell
   git clone https://github.com/Yundi-Zhang/ViTa.git
   cd ViTa
   ```

2. Create and activate a Conda environment

   ```shell
   # Create a new Conda environment with Python 3.9 (or your required version)
   conda create --name vita python=3.9
   # Activate the Conda environment
   conda activate vita
   ```

3. Install dependencies

   ```shell
   pip install torch==2.0.1+cu117 torchvision==0.15.2+cu117 torchsummary -f https://download.pytorch.org/whl/torch_stable.html
   pip install -r requirements.txt
   ```
4. Configure environment variables: update the necessary environment variables in `.env`. The essential data file paths must be added to the `.env` file before running any task.
- `IMAGE_SUBJ_PATH`: the path to a pickle file containing all paths to the preprocessed CMR data files. It holds a dictionary with three keys, `"train"`, `"val"`, and `"test"`:

  ```
  cmr_paths.pkl
  ├── "train": [Path(data1.npz), Path(data2.npz), Path(data3.npz)]
  ├── "val": [Path(data1.npz), Path(data2.npz), Path(data3.npz)]
  └── "test": [Path(data1.npz), Path(data2.npz), Path(data3.npz)]
  ```

  Each `.npz` file contains a dictionary like this:

  ```
  {
      "sax": np.array of shape (H, W, S, T),
      "lax": np.array of shape (H, W, S, T),
      "seg_sax": np.array of shape (H, W, S, T),
      "seg_lax": np.array of shape (H, W, S, T)
  }
  ```

- Input tabular data: `input_tab.csv`, which is normalized and imputed as described in the paper.
- Target tabular data:
  - For regression: `raw_tab.csv`, which contains the unprocessed target values.
  - For classification: `labels_CAD.csv`, which contains the disease labels.
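The expected data layout above can be sketched in Python. This is a minimal illustration only: the array shapes, the `data_demo` directory, and the dummy file names are placeholders, not the real UK Biobank dimensions or paths.

```python
import pickle
from pathlib import Path

import numpy as np

# Illustrative shapes only; (H, W, S, T) = (height, width, slices, time frames).
H, W, S, T = 128, 128, 10, 50

def write_dummy_subject(path: Path) -> Path:
    """Write one subject .npz with the four required keys."""
    arrays = {key: np.zeros((H, W, S, T), dtype=np.float32)
              for key in ("sax", "lax", "seg_sax", "seg_lax")}
    np.savez(path, **arrays)
    return path

data_dir = Path("data_demo")
data_dir.mkdir(exist_ok=True)

# Build the train/val/test split dictionary expected at IMAGE_SUBJ_PATH.
splits = {
    split: [write_dummy_subject(data_dir / f"{split}_{i}.npz") for i in range(2)]
    for split in ("train", "val", "test")
}
with open(data_dir / "cmr_paths.pkl", "wb") as f:
    pickle.dump(splits, f)

# Sanity check: reload one file and confirm keys and shapes.
loaded = np.load(splits["train"][0])
assert set(loaded.files) == {"sax", "lax", "seg_sax", "seg_lax"}
assert loaded["sax"].shape == (H, W, S, T)
```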
This project supports four tasks:
- Pretraining (stage I + II),
- Phenotype prediction,
- Disease classification,
- Segmentation.
The pretrained weights are released on Hugging Face (https://huggingface.co/UKBB-Foundational-Models/ViTa). Running `main.py` will automatically save the checkpoint weights to `./log`.
Before running the scripts, check the `.env` file and replace the following entries:

- `IMAGE_SUBJ_PATHS_FOLDER`: the folder containing your `cmr_paths.pkl` file.
- `RAW_TABULAR_DATA_PATH`: the path to the raw target tabular data.
- `PREPROCESSED_TABULAR_DATA_PATH`: the path to the processed input tabular data.
- `WANDB_ENTITY`: your WandB entity.
- `FEATURE_NAMES_IN` and `FEATURE_NAMES_OUT`: the paths to the JSON files with the names of the input and target (output) features. See `datasets/preprocessing_tabular/selected_feaure_names.py`.
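A minimal `.env` sketch with the entries listed above; every path and the WandB entity below are hypothetical placeholders that must be replaced by your own values.

```shell
# Hypothetical example values; replace every entry with your own paths.
IMAGE_SUBJ_PATHS_FOLDER=/data/vita/subject_paths
RAW_TABULAR_DATA_PATH=/data/vita/tabular/raw_tab.csv
PREPROCESSED_TABULAR_DATA_PATH=/data/vita/tabular/input_tab.csv
WANDB_ENTITY=my-wandb-team
FEATURE_NAMES_IN=/data/vita/tabular/feature_names_in.json
FEATURE_NAMES_OUT=/data/vita/tabular/feature_names_out.json
```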
**Pretraining stage I (imaging)**

Config file: `configs/imaging_model/pretraining_reconstruction_mae.yaml`

Entries to update:
- `data.cmr_path_pickle_name`: replace with your own subject file name.
- `data.tabular_data.tabular_data`: replace with the processed tabular data.
- `data.tabular_data.raw_tabular_data`: replace with the raw target tabular data.

For training:

```shell
. shell_scripts/pretraining_imaging.sh
```

For evaluation:

```shell
. shell_scripts/evaluate_pretraining_imaging.sh
```

**Pretraining stage II (imaging-tabular)**

Config file: `configs/imaging_tabular_model/cl_pretraining_imaging_vita.yaml`

Entries to update:
- `data.cmr_path_pickle_name`: replace with your own subject file name.
- `data.tabular_data.tabular_data`: replace with the processed tabular data.
- `data.tabular_data.raw_tabular_data`: replace with the raw target tabular data.

For training:

```shell
. shell_scripts/pretraining_imaging_tabular.sh
```

For evaluation:

```shell
. shell_scripts/evaluate_pretraining_imaging_tabular.sh
```

**Phenotype prediction (regression)**

Config file: `configs/imaging_tabular_model/regression_vita.yaml`

Entries to update:
- `data.cmr_path_pickle_name`: replace with your own subject file name.
- `data.tabular_data.tabular_data`: replace with the processed tabular data.
- `data.tabular_data.raw_tabular_data`: replace with the raw target tabular data.
- `module.tabular_hparams.selected_features`: replace with the names of the selected features.

For training:

```shell
. shell_scripts/downstream_regression.sh
```

For evaluation:

```shell
. shell_scripts/evaluate_downstream_regression_sax_phenotypes.sh
. shell_scripts/evaluate_downstream_regression_lax_phenotypes.sh
. shell_scripts/evaluate_downstream_regression_indicators.sh
```

**Disease classification**

Config file: `configs/imaging_tabular_model/classification_vita.yaml`

Entries to update:
- `data.cmr_path_pickle_name`: replace with your own subject file name.
- `data.tabular_data.tabular_data`: replace with the processed tabular data.
- `data.tabular_data.raw_tabular_data`: replace with the raw target tabular data.
- `module.tabular_hparams.selected_features`: replace with the names of the selected features.

For training:

```shell
. shell_scripts/downstream_classification.sh
```

For evaluation:

```shell
. shell_scripts/evaluate_downstream_classification.sh
```

**Segmentation**

Config file: `configs/imaging_model/segmentation_vita.yaml`

Entries to update:
- `data.cmr_path_pickle_name`: replace with your own subject file name.

For training:

```shell
. shell_scripts/downstream_segmentation.sh
```

For evaluation:

```shell
. shell_scripts/evaluate_downstream_segmentation.sh
```

We also provide scripts for preprocessing UK Biobank data in `datasets/preprocessing_imaging` and `datasets/preprocessing_tabular` for reference. To run the preprocessing, follow these instructions:
1. Replace the data paths in the `.env` file with your own directories.
2. Run the imaging preprocessing shell script with `. shell_scripts/preprocessing_imaging_data.sh`.
3. Run the tabular preprocessing shell script with `. shell_scripts/preprocessing_tabular_data.sh`.
This project uses NIfTI files to store imaging data. For each subject, the data is organized in a specific folder structure with various NIfTI files for different types of images and their corresponding segmentations.
For each subject, the data is contained in a single folder. The folder includes:
- Short-axis (SAX) images:
  - `sa.nii.gz`: the short-axis images.
  - `seg_sa.nii.gz`: the segmentation maps for the short-axis images.
- Long-axis (LAX) images:
  - `la_2ch.nii.gz`: the long-axis images in the 2-chamber view.
  - `la_3ch.nii.gz`: the long-axis images in the 3-chamber view.
  - `la_4ch.nii.gz`: the long-axis images in the 4-chamber view.
Here is an example of how the data should be organized for one subject:
```
data/
│
└── subject1/
    ├── sa.nii.gz
    ├── seg_sa.nii.gz
    ├── la_2ch.nii.gz
    ├── la_3ch.nii.gz
    └── la_4ch.nii.gz
```

This project uses `.npz` files to store processed image data. Each `.npz` file contains a dictionary with specific keys, where each key corresponds to a NumPy array. The arrays have the shape (H, W, S, T), where S is the number of slices in the volume. Each `.npz` file contains the following keys:

- `sax`: short-axis images.
- `lax`: long-axis images.
- `seg_sax`: segmentation maps for short-axis images.
- `seg_lax`: segmentation maps for long-axis images.
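A numpy-only sketch of a validator for this `.npz` layout; the helper name `validate_npz`, the dummy shapes, and the demo file name are illustrative, not part of the repository.

```python
import numpy as np

REQUIRED_KEYS = ("sax", "lax", "seg_sax", "seg_lax")

def validate_npz(path) -> bool:
    """Check that a processed .npz matches the documented layout."""
    with np.load(path) as data:
        if set(data.files) != set(REQUIRED_KEYS):
            return False
        shapes = {key: data[key].shape for key in REQUIRED_KEYS}
    # Every array must be 4D: (H, W, S, T).
    if any(len(shape) != 4 for shape in shapes.values()):
        return False
    # Each segmentation stack must match its image stack's shape.
    return (shapes["seg_sax"] == shapes["sax"]
            and shapes["seg_lax"] == shapes["lax"])

# Demo on a synthetic file with placeholder dimensions.
dummy = {k: np.zeros((8, 8, 3, 5), dtype=np.float32) for k in REQUIRED_KEYS}
np.savez("subject_demo.npz", **dummy)
print(validate_npz("subject_demo.npz"))  # True
```

Note that the check only compares each segmentation to its own view's images, since the SAX and LAX stacks may have different slice counts.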
This project is licensed under the MIT License - see the LICENSE file for details.
For questions or suggestions, contact yundi.zhang@tum.de or jiazhen.pan@tum.de. If you use this code in your research, please cite the above-mentioned paper.



