This is the main repository associated with the paper *An OpenMind for 3D medical vision self-supervised learning*, intended for the review.
It holds the code for the self-supervised pre-trainings conducted in the Benchmark study.
Currently, it includes the ResEnc-L [a,b] CNN architecture and the Primus-M Transformer architecture, as well as the following pre-training methods for both architectures, where applicable:
- Volume Contrastive (VoCo)
- VolumeFusion (VF)
- Models Genesis (MG)
- Default MAE (MAE)
- Spark 3D (S3D)
- SimMIM (SimMIM)
- SwinUNETR pre-training (SwinUNETR)
- SimCLR (SimCLR)
- Segmentation Fine-tuning Framework: To be shared
- Classification Fine-tuning Framework
- Checkpoints: To be shared
1. Download/clone the repository
2. Unzip and navigate into the repository
3. Install the repository:

   ```bash
   pip install -e .
   ```

   (the `-e` flag is optional)
4. Set environment variables:

In addition to the installation, this repository requires setting up three additional paths:

- `nnssl_raw` -- The path holding the raw datasets with their `pretrain_data.json` files.
- `nnssl_preprocessed` -- The path where preprocessed data will be stored.
- `nnssl_results` -- The path where results will be stored.
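A minimal sketch of setting these on Linux/macOS (the concrete directories are placeholders; pick any locations you like):

```bash
# Placeholder paths -- adapt to your system, e.g. in your ~/.bashrc
export nnssl_raw="/data/nnssl_raw"
export nnssl_preprocessed="/data/nnssl_preprocessed"
export nnssl_results="/data/nnssl_results"
```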
Conducting pre-training with this repository involves three main steps:
First, a pre-training dataset needs to be chosen. You can use the OpenMind dataset, but any other dataset can be used as well. As opposed to nnU-Net, the data does not have to be in a specific format. Instead, a `pretrain_data.json` file needs to be created, detailing the dataset's specific information. (For simplicity, the OpenMind dataset already comes with this file.) To create this file for your own dataset, or to understand its structure, we refer to the instructions below.
Understanding and creating the `pretrain_data.json` file
Medical datasets generally center around studies of subjects. These subjects can be imaged in different sessions, with different scanners, or through different imaging protocols. This is reflected in the common BIDS data structure, which differentiates between:

- `subjects` -- The individual subjects in the dataset
- `sessions` -- The individual sessions of the subjects
- `scans` -- The individual scans of the sessions
In our case, we are also interested in aggregating multiple datasets, hence we additionally include:

- `dataset` -- The individual datasets that were included
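For illustration, a typical BIDS-style file path encodes all four levels (the concrete names below are made up):

```
<dataset>/<subject>/<session>/anat/<subject>_<session>_T1w.nii.gz
e.g.  Dataset001/sub-01/ses-02/anat/sub-01_ses-02_T1w.nii.gz
```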
All of this information may be valuable for pre-training. For example, one may want to develop a contrastive pre-training method that uses scans of the same subject acquired during one session as positive pairs and all others as negatives. Or one may want to develop a longitudinal pre-training method that, e.g., tries to predict the scan of the next session. To allow using such information, we maintain it in the `pretrain_data.json` file.
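As a minimal sketch of how such metadata could be exploited, the snippet below groups scans by (dataset, subject, session) to form intra-session positive pairs. The flat `records` list is a hypothetical intermediate representation, not the actual layout of `pretrain_data.json`:

```python
from collections import defaultdict
from itertools import combinations

def positive_pairs(records):
    """records: iterable of (dataset, subject, session, scan_path) tuples."""
    groups = defaultdict(list)
    for dataset, subject, session, scan_path in records:
        groups[(dataset, subject, session)].append(scan_path)
    for scans in groups.values():
        # all pairs of scans from the same subject and session act as positives
        yield from combinations(scans, 2)
```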
Hence, our `pretrain_data.json` file mirrors this BIDS structure.
To generate this file, we recommend writing a Python script that creates a `Collection` dataclass (located in `src/nnssl/data/raw_dataset.py`) and uses the `.to_dict()` method of the collection, which will yield a valid `pretrain_data.json` file.
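A skeleton of such a script could look as follows. `Collection` and `.to_dict()` exist in the repository, but how the collection is populated depends on the dataclass fields in `src/nnssl/data/raw_dataset.py`, so that part is left as a placeholder:

```python
import json

from nnssl.data.raw_dataset import Collection

def build_collection() -> Collection:
    # Walk your raw data and fill the dataclass hierarchy
    # (datasets -> subjects -> sessions -> scans); see raw_dataset.py
    # for the exact fields each level expects.
    raise NotImplementedError

if __name__ == "__main__":
    collection = build_collection()
    with open("pretrain_data.json", "w") as f:
        # .to_dict() yields a dictionary in the valid pretrain_data.json layout
        json.dump(collection.to_dict(), f, indent=2)
```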
To keep this file valid across differing machines, the file paths support relative paths. Relative paths are indicated through the prefix `$`. Moreover, when saving absolute paths, it is checked whether the beginnings of the image paths can be replaced by the paths in the environment variables `["nnssl_raw", "nnssl_preprocessed"]`, replacing them with `$nnssl_raw` or `$nnssl_preprocessed`, respectively.
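An illustrative sketch of this substitution (the actual implementation lives in the nnssl codebase and may differ in detail):

```python
import os

def relativize(path: str) -> str:
    """Replace a leading environment-variable root with its $-prefixed name."""
    for var in ("nnssl_raw", "nnssl_preprocessed"):
        root = os.environ.get(var)
        if root and path.startswith(root):
            return f"${var}" + path[len(root):]
    return path

# e.g. "/data/nnssl_raw/Dataset001/sub-01/..." -> "$nnssl_raw/Dataset001/sub-01/..."
```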
Currently, the framework follows the nnU-Net preprocessing pipeline. This generally includes fingerprinting, planning, and lastly preprocessing of the data:

- Fingerprinting determines the overall shapes and spacings of the data.
- Planning determines which patch size to use and which spacing to resample to.
- Preprocessing normalizes, crops, and resamples the data and saves it compressed in the bloscv2 format[^1]. Moreover, the `pretrain_data.json` file will be copied to the `nnssl_preprocessed` directory, with the image and mask paths adjusted accordingly.
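As a hedged illustration of the partial-read property mentioned in the footnote, the `python-blosc2` package allows reading a sub-volume without decompressing the full array (the filename is a placeholder, and API details may vary between blosc2 versions):

```python
import blosc2

arr = blosc2.open("case_0001.b2nd")  # opens the compressed array lazily
patch = arr[32:96, 32:96, 32:96]     # decompresses only the chunks covering this patch
```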
To conduct these three steps (fingerprinting, planning, preprocessing), run:

```bash
nnssl_plan_and_preprocess -d <Dataset ID>
```
Given the preprocessed data, we can now pre-train the models. This is done by selecting a `trainer`, a `dataset`, and a `plan`. The `trainer` determines the pre-training method and architecture, the `dataset` determines the data to use, and the `plan` determines the preprocessing of the data.
An exemplary pre-training call for a 4x GPU ResEnc-L MAE pre-training would be:

```bash
python ./nnssl/run/run_training.py -tr BaseMAETrainer_BS8 -p nnsslPlans -num_gpus 4
```

or

```bash
nnssl_train -tr BaseMAETrainer_BS8 -p nnsslPlans -num_gpus 4
```
Note: Due to the lack of, e.g., linear probing for segmentation, no metrics aside from the train and validation losses are tracked during pre-training.

After pre-training, the resulting model checkpoint (or pre-existing checkpoints) can be adapted to a specific downstream task. This can be done via the associated adaptation frameworks linked above.
Due to the lack of established frameworks in the domain of 3D SSL, we are open to code contributions and extensions of the current framework.
If you are interested in contributing, please refer to the `WIP_Contributing.md` file.
[^1]: Bloscv2 is a compressed format that allows partial decompression and reading of the data, enabling fast I/O while minimizing CPU usage.