Paper | OpenReview | ArXiv
This repository contains the implementation code for our paper:
Exploring Image Augmentations for Siamese Representation Learning with Chest X-Rays
- Authors: Rogier van der Sluijs*, Nandita Bhaskhar*, Daniel Rubin, Curtis Langlotz, Akshay Chaudhari
- (*) Co-first authors
- Published at Medical Imaging with Deep Learning (MIDL) 2023
Tailored augmentation strategies for image-only Siamese representation learning can outperform supervised baselines with zero-shot learning, linear probing and fine-tuning for chest X-ray classification. We systematically assess the effect of various augmentations on the quality and robustness of the learned representations. We train and evaluate Siamese Networks for abnormality detection on chest X-Rays across three large datasets (MIMIC-CXR, CheXpert and VinDr-CXR). We investigate the efficacy of the learned representations through experiments involving linear probing, fine-tuning, zero-shot transfer, and data efficiency. Finally, we identify a set of augmentations that yield robust representations that generalize well to both out-of-distribution data and diseases, while outperforming supervised baselines using just zero-shot transfer and linear probes by up to 20%.
To contribute to siaug, install the package in editable mode, install the development requirements, and set up the pre-commit hooks:
pip install -e .
pip install -r requirements.txt
pre-commit install
pre-commit
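After installing the hooks, you can run them once against the entire codebase (a standard pre-commit command, not siaug-specific):
# Run all configured hooks on every file, not just staged changes
pre-commit run --all-files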
Make sure to update the .env file according to the setup of your cluster and the placement of your project folder on disk. Also, run accelerate config to generate a config file, and copy it from ~/.cache/huggingface/accelerate/default_config.yaml to the project directory. Finally, create symlinks from the data/ folder to the datasets you want to train on.
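As a rough sketch of this setup (the environment variable names and dataset paths are placeholders, not the actual contents of the provided .env file):
# Example .env entries (names are placeholders; adapt to your cluster)
# PROJECT_DIR=/path/to/siaug
# DATA_DIR=/path/to/datasets

# Generate the accelerate config and copy it into the project directory
accelerate config
cp ~/.cache/huggingface/accelerate/default_config.yaml .

# Symlink the datasets you want to train on into data/
ln -s /path/to/mimic-cxr data/mimic-cxr
ln -s /path/to/chexpert data/chexpert
ln -s /path/to/vindr-cxr data/vindr-cxr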
Currently, we support two modes of training: pretraining and linear evaluation.
To learn a new representation, use the train_repr.py script.
# Train and log to WandB
accelerate launch siaug/train_repr.py experiment=experiment_name logger=wandb
# Resume from checkpoint
accelerate launch siaug/train_repr.py ... resume_from_ckpt=/path/to/accelerate/ckpt/dir
# Run a fast_dev_run
accelerate launch siaug/train_repr.py ... fast_dev_run=True max_epoch=10 log_every_n_steps=1 ckpt_every_n_epochs=1
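For reference, a multi-GPU pretraining launch might look like the sketch below; --multi_gpu and --num_processes are standard accelerate options rather than siaug-specific flags, and the experiment name is a placeholder.
# Sketch: pretraining on 4 GPUs (experiment name is a placeholder)
accelerate launch --multi_gpu --num_processes 4 siaug/train_repr.py experiment=experiment_name logger=wandb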
To train a linear classifier on top of a frozen backbone, use the train_lcls.py script.
# Train a linear classifier on top of a frozen backbone
accelerate launch siaug/train_lcls.py experiment=experiment_name model.ckpt_path=/path/to/model/weights
# Train a linear classifier on top of a randomly initialized backbone
accelerate launch siaug/train_lcls.py model.ckpt_path=None
# Use ImageNet pretrained weights
accelerate launch siaug/train_lcls.py +model.pretrained=True
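The overrides above can be combined; as a sketch (assuming the logger override behaves the same way as in pretraining):
# Sketch: linear probe on ImageNet-pretrained weights with WandB logging
accelerate launch siaug/train_lcls.py experiment=experiment_name +model.pretrained=True logger=wandb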
To evaluate a model on a downstream task without fine-tuning, use the siaug/eval.py script.
python siaug/eval.py experiment=eval_chex_resnet +checkpoint_folder=/path/to/model/checkpoints/folder +save_path=/path/to/save/resulting/pickle/files
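To sweep several runs, the same command can be wrapped in a shell loop; the runs/ directory layout below is hypothetical, and only the eval.py arguments come from the example above.
# Sketch: evaluate every checkpoint folder under a hypothetical runs/ directory
for ckpt_dir in /path/to/runs/*/checkpoints; do
    python siaug/eval.py experiment=eval_chex_resnet +checkpoint_folder="$ckpt_dir" +save_path=/path/to/save/resulting/pickle/files
done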
This repository is being developed at Stanford's MIMI Lab. Please reach out to sluijs [at] stanford [dot] edu and nanbhas [at] stanford [dot] edu if you would like to use or contribute to siaug.
If you find our paper and/or code useful, please use the following BibTeX entry for citation:
@article{sluijsnanbhas2023_siaug,
title={Exploring Image Augmentations for Siamese Representation Learning with Chest X-Rays},
author={Rogier van der Sluijs and Nandita Bhaskhar and Daniel Rubin and Curtis Langlotz and Akshay Chaudhari},
year={2023},
journal={Medical Imaging with Deep Learning (MIDL)},
}