Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise
Yeonguk Yu · Minhwan Ko · Sungho Shin · Kangmin Kim · Kyoobin Lee
Artificial Intelligence LAB, GIST, South Korea
TL;DR: We propose CUFIT, a robust fine-tuning method for vision foundation models under noisy label conditions, based on the advantages of linear probing and adapters.
Our CUrriculum FIne-Tuning of Vision Foundation Model (CUFIT) offers a robust training framework for multi-class medical image classification under noisy labels. Leveraging vision foundation models (VFMs) pretrained on large-scale datasets, CUFIT handles noisy labels effectively through linear probing, which leaves the feature extractor unmodified. It then follows a curriculum fine-tuning approach: linear probing first ensures robustness to noisy samples, after which two adapters are fine-tuned for improved classification performance. CUFIT outperforms conventional methods across various medical image benchmarks, achieving superior results at multiple noise rates on datasets such as HAM10000 and APTOS-2019, highlighting its ability to address the challenges posed by noisy labels in medical datasets.
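The curriculum can be illustrated with a minimal sketch. The code below is a conceptual simplification, assuming that "clean" samples are selected when a module's prediction agrees with the given (possibly noisy) label; the exact selection rule, adapter design, and training loop are defined by the paper and the modules in the rein/ folder.

# Conceptual sketch of the CUFIT curriculum (not the official implementation).
# Assumption: clean samples are those whose predicted class matches the given
# (possibly noisy) label; the actual rule in the paper/repository may differ.
import torch

def agreement_mask(logits, noisy_labels):
    # Keep only samples whose prediction agrees with the provided label.
    return logits.argmax(dim=1).eq(noisy_labels)

def curriculum_losses(features, noisy_labels, linear_head, adapter1, adapter2, criterion):
    """features: frozen VFM embeddings; noisy_labels: given (possibly noisy) labels."""
    # 1) Linear probing on all samples; the frozen VFM features make this
    #    stage comparatively robust to label noise.
    loss_linear = criterion(linear_head(features), noisy_labels)

    # 2) First adapter: trained only on samples the linear head agrees with.
    m1 = agreement_mask(linear_head(features).detach(), noisy_labels)
    loss_a1 = criterion(adapter1(features)[m1], noisy_labels[m1]) if m1.any() \
        else torch.zeros((), device=features.device)

    # 3) Second adapter: trained only on samples the first adapter agrees with.
    m2 = agreement_mask(adapter1(features).detach(), noisy_labels)
    loss_a2 = criterion(adapter2(features)[m2], noisy_labels[m2]) if m2.any() \
        else torch.zeros((), device=features.device)

    return loss_linear + loss_a1 + loss_a2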
git clone https://github.com/gist-ailab/CUFIT.git
cd CUFIT
This code was tested on Ubuntu 20.04 with Python 3.8.18 and requires the following main packages:
- PyTorch: tested with the GPU build, version 2.0.1.
- torchvision: installed along with PyTorch; tested with version 0.15.2.
- MedMNIST: required for the BloodMNIST and OrganCMNIST experiments; tested with version 3.0.1.
To set up the environment, you may use the following commands:
conda create -n cufit python=3.8
conda activate cufit
pip install -r requirement.txt
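After installation, a quick check such as the following confirms that the versions match those listed above:

# quick environment check (versions as reported in this README)
import torch, torchvision, medmnist
print(torch.__version__)          # tested with 2.0.1
print(torchvision.__version__)    # tested with 0.15.2
print(medmnist.__version__)       # tested with 3.0.1
print(torch.cuda.is_available())  # should be True for GPU training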
Several public datasets must be downloaded to run the experiments.
HAM10000 preparation
- Download the training data, training ground truth, test data, and test ground truth of Task 3 from this link.
- Place the zip files in the "CUFIT/data" folder and extract them.
- Run the Python script "ham10000.py" in "CUFIT/data".
- This will create a folder named "ham10000" in which the images are sorted by their corresponding disease (a rough illustration of this step is given below).
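The provided ham10000.py performs this sorting; the snippet below is only a rough, hypothetical illustration of what the step amounts to. The CSV and folder names, the one-hot label columns, and the .jpg extension are assumptions; refer to the actual script in CUFIT/data.

# Illustrative sketch only; run the provided CUFIT/data/ham10000.py instead.
import csv, shutil
from pathlib import Path

data_root = Path("./data")
gt_csv = data_root / "ISIC2018_Task3_Training_GroundTruth.csv"   # assumed file name
img_dir = data_root / "ISIC2018_Task3_Training_Input"            # assumed folder name
out_dir = data_root / "ham10000" / "train"

with open(gt_csv) as f:
    rows = list(csv.DictReader(f))

classes = [c for c in rows[0] if c != "image"]         # one column per disease
for row in rows:
    label = max(classes, key=lambda c: float(row[c]))  # pick the one-hot class
    dst = out_dir / label
    dst.mkdir(parents=True, exist_ok=True)
    shutil.copy(img_dir / f"{row['image']}.jpg", dst / f"{row['image']}.jpg")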
APTOS-2019 preparation
- Download the zip file by clicking the "Download All" button on the Kaggle site.
- Place the zip file in the "CUFIT/data" folder and extract it.
- Create a folder named "APTOS-2019" in "CUFIT/data".
- Place the extracted files in the "APTOS-2019" folder (a quick layout check is sketched below).
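Once the files are in place, a quick check like the one below (names taken from the directory structure shown further down) can confirm the layout:

# optional layout check; file and folder names follow the directory structure below
from pathlib import Path

root = Path("./data/APTOS-2019")
expected = ["train_images", "val_images", "test_images",
            "train_1.csv", "valid.csv", "test.csv"]
missing = [name for name in expected if not (root / name).exists()]
print("missing entries:", missing or "none")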
# conf/ham10000.json
{
    "epoch" : "100",
    "id_dataset" : "./data/ham10000",          # Your path to dataset
    "batch_size" : 32,
    "save_path" : "./checkpoints/ham10000",    # Your path to checkpoint
    "num_classes" : 7
}
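The training scripts read these settings from the JSON file. As a sketch, loading it might look like the following; the "#" comments above are annotations for this README and should not appear in the actual JSON file, and the exact keys used by the scripts are defined in the repository.

# sketch of reading the configuration; assumes the JSON file has no inline comments
import json

with open("conf/ham10000.json") as f:
    conf = json.load(f)

num_epochs = int(conf["epoch"])        # stored as a string in the config
data_root = conf["id_dataset"]
save_path = conf["save_path"]
batch_size = conf["batch_size"]
num_classes = conf["num_classes"]
print(num_epochs, data_root, save_path, batch_size, num_classes)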
Place the data and create the checkpoint folders following this directory structure:
CUFIT/
├── assets/
├── checkpoints/
│   ├── HAM10000/
│   └── APTOS-2019/
├── conf/
│   ├── HAM10000.json
│   └── aptos.json
├── data/
│   ├── HAM10000/
│   │   ├── test/
│   │   └── train/
│   └── APTOS-2019/
│       ├── test_images/
│       ├── train_images/
│       ├── val_images/
│       ├── test.csv
│       ├── train_1.csv
│       └── valid.csv
├── rein/
└── utils/
python train_linear.py -d 'data_name' -g 'gpu-num' -n 'noise_rate' -s 'save_name'
for example,
python train_linear.py -d ham10000 -g 0 -n 0.2 -s dinov2s_linear_0.2
python train_rein.py -d 'data_name' -g 'gpu-num' -n 'noise_rate' -s 'save_name'
for example,
python train_rein.py -d ham10000 -g 0 -n 0.2 -s dinov2s_single_rein_0.2
python train_cufit.py -d 'data_name' -g 'gpu-num' -n 'noise_rate' -s 'save_name'
for example,
python train_cufit.py -d ham10000 -g 0 -n 0.2 -s dinov2s_cufit_0.2
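To reproduce results at several noise rates, the training scripts can be invoked in a loop. The helper below is hypothetical (it is not part of the repository) and simply wraps the command shown above:

# hypothetical sweep over noise rates; not included in the repository
import subprocess

dataset, gpu = "ham10000", "0"
for noise_rate in [0.2, 0.4, 0.6]:
    run_name = f"dinov2s_cufit_{noise_rate}"
    subprocess.run(
        ["python", "train_cufit.py",
         "-d", dataset, "-g", gpu, "-n", str(noise_rate), "-s", run_name],
        check=True,
    )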
This work was partly supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2022-II0951, Development of Uncertainty-Aware Agents Learning by Asking Questions, 90%) and by the Institute of Civil Military Technology Cooperation funded by the Defense Acquisition Program Administration and the Ministry of Trade, Industry and Energy of the Korean government under grant No. 22-CM-GU-08 (10%).
The source code of this repository is released only for academic use. See the license file for details.
If you use CUFIT in your research, please consider citing us.
@inproceedings{yu2024curriculum,
  title={Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise},
  author={Yeonguk Yu and Minhwan Ko and Sungho Shin and Kangmin Kim and Kyoobin Lee},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=vYUx8j5KK2}
}