Peirong Liu1,2, Oula Puonti2, Xiaoling Hu2, Karthik Gopinath2, Annabel Sorby-Adams2, Daniel C. Alexander3, Juan Eugenio Iglesias2,3,4
1Johns Hopkins University
2Harvard Medical School and Massachusetts General Hospital
3University College London
4Massachusetts Institute of Technology
This is the official repository for our preprint: A Modality-agnostic Multi-task Foundation Model for Human Brain Imaging [arXiv]
More detailed and organized instructions are coming soon...
Training and evaluation environment: Python 3.11.4, PyTorch 2.0.1, CUDA 12.2. Run the following commands to create the environment and install the required packages.
conda create -n pre python=3.11
conda activate pre
git clone https://github.com/jhuldr/BrainFM
cd /path/to/brainfm
pip install -r requirements.txt
The pre-trained model weights are available on OneDrive. After downloading, please put them under ckp/.
cd scripts
python demo_test.py
cd scripts
python demo_generator.py
Setups are in cfgs/generator; the default setups are in default.yaml. A customized setup example can be found in train/brain_id.yaml, where several Brain-ID-specific setups are added. When the config is read, values in the customized YAML file override those in default.yaml for any keys the two files share.
dataset_setups: information for all datasets, in Generator/constants.py
augmentation_funcs: augmentation functions and steps, in Generator/constants.py
processing_funcs: image processing functions for each modality/task, in Generator/constants.py
dataset_names: list of dataset names; the path setups are in Generator/constants.py
mix_synth_prob: when the input mode is synthesis, the probability of blending the synthesized image with real images
dataset_option: generator type, either BaseGen or a customized generator
task: switch on/off individual training tasks
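The override rule described above can be sketched in a few lines of Python. This is only an illustration, not the repository's config reader: the helper name merge_cfg, the example values, the task names, and the choice to merge nested keys recursively are all assumptions made for the example. In practice the two dictionaries would come from parsing default.yaml and the customized YAML file.

```python
from copy import deepcopy


def merge_cfg(default, custom):
    """Return a new dict in which values from `custom` override `default`.

    Hypothetical helper: nested dicts are merged key by key (an assumption),
    and neither input dict is modified.
    """
    merged = deepcopy(default)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides hold a mapping: merge them instead of replacing.
            merged[key] = merge_cfg(merged[key], value)
        else:
            # Same key in both files: the customized value wins.
            merged[key] = deepcopy(value)
    return merged


# Illustrative stand-ins for the parsed YAML files (values are made up).
default_cfg = {
    "dataset_option": "BaseGen",
    "mix_synth_prob": 0.0,
    "task": {"segmentation": True, "super_resolution": False},
}
custom_cfg = {
    "dataset_option": "BrainIDGen",
    "task": {"super_resolution": True},
}

cfg = merge_cfg(default_cfg, custom_cfg)
print(cfg)
```

Keys that appear only in the default dictionary (here mix_synth_prob) keep their default values, so a customized file only needs to list what it changes.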
cd Generator
python datasets.py
The dataset path setups are in constants.py. In datasets.py, the datasets in use are specified as a list of dataset names.
An example of a customized data generator module, BrainIDGen, can be found in datasets.py.
Refer to the "__getitem__" function. Specifically, it includes:
(1) read original input: could be either generation labels or real images;
(2) generate augmentation setups and deformation fields;
(3) read target(s) according to the assigned tasks (the processing functions are kept separate for each item/modality, in case different processing steps are needed for them);
(4) augment input sample: either synthesized or real image input.
(Some of the functions are left blank for now.)
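The four steps above can be sketched with a toy generator. This is a minimal, self-contained illustration and not the BrainIDGen implementation: the class name ToyGen, the sample layout, the task names, and the flip/gain "augmentation" are all hypothetical, and plain Python lists stand in for image volumes.

```python
import random


class ToyGen:
    """Hypothetical generator that mirrors the four steps listed above."""

    def __init__(self, samples, tasks, seed=0):
        self.samples = samples  # list of {"input": [...], "targets": {task: [...]}}
        self.tasks = tasks      # {task_name: bool}, switches tasks on/off
        self.rng = random.Random(seed)
        # One processing function per task, so each can differ if needed.
        self.processing_funcs = {name: self._apply_flip for name in tasks}

    @staticmethod
    def _apply_flip(data, setup):
        # Apply the shared geometric change (a flip stands in for a deformation).
        return data[::-1] if setup["flip"] else list(data)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        record = self.samples[idx]
        # (1) Read the original input (generation labels or a real image).
        image = list(record["input"])
        # (2) Draw the augmentation setup and the (toy) deformation.
        setup = {"flip": self.rng.random() < 0.5,
                 "gain": self.rng.uniform(0.9, 1.1)}
        # (3) Read and process the targets of the enabled tasks only.
        targets = {name: self.processing_funcs[name](record["targets"][name], setup)
                   for name, enabled in self.tasks.items() if enabled}
        # (4) Augment the input: same geometry as the targets, plus intensity gain.
        image = [v * setup["gain"] for v in self._apply_flip(image, setup)]
        return {"input": image, **targets}


gen = ToyGen(
    samples=[{"input": [1.0, 2.0, 3.0],
              "targets": {"segmentation": [0, 1, 1], "anatomy": [0.1, 0.2, 0.3]}}],
    tasks={"segmentation": True, "anatomy": False},
)
sample = gen[0]
print(sorted(sample.keys()))
```

Sharing one setup between the input and the targets keeps them spatially aligned, while the per-task entries in processing_funcs leave room for task-specific handling.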
cd scripts
python train.py
@article{Liu_2025_BrainFM,
author = {Liu, Peirong and Puonti, Oula and Hu, Xiaoling and Gopinath, Karthik and Sorby-Adams, Annabel and Alexander, Daniel C. and Iglesias, Juan E.},
title = {A Modality-agnostic Multi-task Foundation Model for Human Brain Imaging},
journal = {arXiv preprint arXiv:2509.00549},
year = {2025},
}