MeDiVLAD: Efficient Lung Ultrasound Severity Scoring

This repository contains the official implementation of the paper "Efficient Lung Ultrasound Severity Scoring Using Dedicated Feature Extractor".

Timeline

2025-01-02: Accepted by IEEE International Symposium on Biomedical Imaging (ISBI) 2025 🎉🎉.
2025-01-23: Our paper (Camera Ready Version) is now available on arXiv 🎉🎉.

Abstract

Lung ultrasound (LUS) has become an invaluable tool for assessing pulmonary conditions due to its non-invasive nature and portability. However, AI-driven LUS severity scoring faces challenges from limited datasets and annotation constraints. To address this, we introduce MeDiVLAD, a semi-self-supervised learning pipeline leveraging:

Self-knowledge distillation to pretrain a Vision Transformer (ViT) without labeled data.
Dual-level VLAD aggregation to capture temporal and spatial patterns for frame- and video-level scoring.

The proposed method achieves state-of-the-art performance in both frame- and video-level scoring tasks, significantly reducing the reliance on expert annotations.

Features

Self-supervised Pretraining: Employs DINO to pretrain the ViT model without labeled data.
Dual-level VLAD Aggregation: Improves the representation of frame embeddings for robust video-level classification.
Minimal Supervision: Requires only a small amount of annotated data for fine-tuning, or none at all.
Interpretability: Produces precise attention maps to enhance classification reasoning.
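
To make the aggregation idea concrete, here is a minimal sketch of single-level VLAD over a set of frame embeddings. It is a generic illustration with hard cluster assignment, not the dual-level implementation used in this repository; the function name `vlad_aggregate`, the random inputs, and the embedding and codebook sizes are all assumptions chosen for the example.

```python
import numpy as np


def vlad_aggregate(descriptors: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Hard-assignment VLAD: sum residuals to the nearest centroid, then normalize.

    descriptors: (N, D) array, e.g. one embedding per frame (assumed layout).
    centroids:   (K, D) array, a codebook (assumed to be learned elsewhere).
    Returns a flat (K * D,) descriptor with unit L2 norm.
    """
    # Squared Euclidean distance from every descriptor to every centroid: (N, K).
    dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    assignments = dists.argmin(axis=1)

    k, d = centroids.shape
    vlad = np.zeros((k, d), dtype=np.float64)
    for cluster in range(k):
        members = descriptors[assignments == cluster]
        if len(members) > 0:
            # Accumulate residuals of the members relative to their centroid.
            vlad[cluster] = (members - centroids[cluster]).sum(axis=0)

    # Intra-normalization per cluster, followed by a global L2 normalization.
    norms = np.linalg.norm(vlad, axis=1, keepdims=True)
    vlad = vlad / np.maximum(norms, 1e-12)
    flat = vlad.reshape(-1)
    return flat / max(np.linalg.norm(flat), 1e-12)


# Hypothetical example: 15 frame embeddings of size 8, codebook with 4 centroids.
rng = np.random.default_rng(0)
frame_embeddings = rng.normal(size=(15, 8))
codebook = rng.normal(size=(4, 8))
video_descriptor = vlad_aggregate(frame_embeddings, codebook)
print(video_descriptor.shape)  # (32,)
```

The same function could in principle be applied to patch-level features within a frame and then to frame-level features within a video, which is one way to read "dual-level"; the scripts in this repository remain the reference for the actual design.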

Key Results

Visualization of attention heatmap as interpretation of classification

Frame-level Accuracy: 75.05% (without labeled data), improving to 82.47% with fine-tuning.
Video-level Classification: Outperforms Bi-LSTM and NetVLAD on the LUS severity scoring task.
Applications: Supports critical pathology identification and broader medical imaging tasks.

Dataset

The experiments use a curated dataset of 283 ultrasound videos from 156 patients, including data from:

COVIDx-US
Private dataset collected from "G. Papanikolaou" General Hospital, Thessaloniki, Greece.

Frame-level annotations include 585 images categorized into three severity levels using an enhanced LUS scoring system.

Installation and Usage

Our Environment Setting (Recommended)

Python: 3.9
PyTorch: 2.0.0
TorchVision: 0.15.0
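
As a convenience, the installed versions can be compared against the recommended ones with a small script like the sketch below. It is not part of the repository; the helper names `matches` and `check_environment` are hypothetical, and the script assumes the packages are installed under the pip distribution names `torch` and `torchvision`.

```python
import sys
from importlib import metadata

# Versions taken from the recommended environment listed above.
RECOMMENDED = {"torch": "2.0.0", "torchvision": "0.15.0"}
RECOMMENDED_PYTHON = (3, 9)


def matches(installed: str, recommended: str) -> bool:
    """Compare release numbers, ignoring local build tags such as '+cu118'."""
    return installed.split("+")[0] == recommended


def check_environment(recommended=None):
    """Return {package: (installed_version_or_None, matches_recommendation)}."""
    recommended = RECOMMENDED if recommended is None else recommended
    report = {}
    for package, wanted in recommended.items():
        try:
            found = metadata.version(package)
        except metadata.PackageNotFoundError:
            found = None
        report[package] = (found, found is not None and matches(found, wanted))
    return report


python_ok = sys.version_info[:2] == RECOMMENDED_PYTHON
print(f"python {sys.version_info.major}.{sys.version_info.minor}: "
      f"{'ok' if python_ok else 'differs from recommended 3.9'}")
for package, (found, ok) in check_environment().items():
    status = "ok" if ok else f"recommended {RECOMMENDED[package]}"
    print(f"{package} {found or 'not installed'}: {status}")
```

A mismatch does not necessarily mean the code will fail; the versions above are simply the ones the authors report using.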

Installation

Clone this repository:

git clone https://github.com/GuoJiaqi-1020/MeDiVLAD.git
cd MeDiVLAD

Download the dataset (images and videos) and the pretrained models from: 😊 [link will be made public in April]
- 15_frames.zip: Original video data (frame rate of 15) in .avi format
- 15_frames_processed.zip: Preprocessed video data in .npy
- labeled_frames.zip: Frame-level labeled data
- dino_checkpoint.zip: Pretrained models

Unzip the files:

MeDiVLAD
├── dino_checkpoint
│   ├── backbone
│   ├── classifier
├── data
│   ├── 15_frames
│   ├── 15_frames_processed
│   ├── labeled_frames
│   ├── video_list.csv (already included)
└── img
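
Before running any script, it can help to confirm that the unzipped files ended up where the tree above expects them. The following sketch is not part of the repository; `EXPECTED_ENTRIES` simply mirrors the tree, and `missing_entries` is a hypothetical helper name.

```python
from pathlib import Path

# Entries copied from the directory tree above, relative to the repository root.
EXPECTED_ENTRIES = [
    "dino_checkpoint/backbone",
    "dino_checkpoint/classifier",
    "data/15_frames",
    "data/15_frames_processed",
    "data/labeled_frames",
    "data/video_list.csv",
]


def missing_entries(repo_root):
    """Return the expected entries that do not exist under repo_root."""
    root = Path(repo_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


# Run from the repository root (the MeDiVLAD directory).
missing = missing_entries(".")
if missing:
    print("Missing entries:")
    for entry in missing:
        print(f"  {entry}")
else:
    print("All expected files and folders are in place.")
```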

Running the test scripts

All runnable scripts are located in the MeDiVLAD/Medino/ directory. Below are the descriptions of the main scripts and their purposes:

| Script Name | Purpose |
| --- | --- |
| `dino_train.py` | Performs DINO self-distillation. Uses KNN as a classifier to evaluate accuracy. |
| `lus_image_cls.py` | Fine-tunes the distilled DINO teacher. |
| `lus_video_cls.py` | Performs video-level severity scoring. |

🟥 Don't forget to modify the data path in the script. Please follow the comments in the script.
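
For readers unfamiliar with KNN evaluation of a frozen backbone, the idea can be sketched as follows: embeddings of labeled frames act as a reference set, and each test embedding takes the majority label of its nearest neighbors. This is a simplified illustration using cosine similarity and an unweighted vote; it is an assumption that this resembles what `dino_train.py` does, and the function names and toy vectors are made up for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, train_feats, train_labels, k=3):
    """Majority vote among the k training embeddings most similar to query."""
    ranked = sorted(
        zip(train_feats, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(test_feats, test_labels, train_feats, train_labels, k=3):
    """Fraction of test embeddings whose predicted label matches the true label."""
    hits = sum(
        knn_predict(feat, train_feats, train_labels, k) == label
        for feat, label in zip(test_feats, test_labels)
    )
    return hits / len(test_labels)


# Toy 2-D "embeddings" with two severity classes (hypothetical values).
train_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_labels = [0, 0, 1, 1]
print(knn_accuracy([[1.0, 0.05], [0.05, 1.0]], [0, 1], train_feats, train_labels))
```

Because no classifier weights are trained, this kind of evaluation measures the quality of the self-distilled features directly.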

Checkpoint Description

The following table lists the expected accuracy of each checkpoint by classifier type (KNN or linear):

| Checkpoint | KNN | Linear | Purpose |
| --- | --- | --- | --- |
| backbone/Teacher_DINO_IMGLUS_Acc67.49.pth | ~63% | N/A | Self-distilled backbone (no labels) |
| backbone/VAL_F1_dino_acc_8013.pth | N/A | ~80% | Fine-tuned on fold 1 |
| backbone/VAL_F2_dino_acc_8481.pth | N/A | ~84% | Fine-tuned on fold 2 |
| video_cls/VAL_F1_dino_acc_8357.pth | N/A | ~83% | Video-level scoring on fold 1 |
| video_cls/VAL_F2_dino_acc_8162.pth | N/A | ~81% | Video-level scoring on fold 2 |

📖 Citing This Work

If you find this code useful in your research, please consider citing our paper:

(We will update this arXiv reference once the paper is officially published.)

@misc{guo2025efficientlungultrasoundseverity,
      title={Efficient Lung Ultrasound Severity Scoring Using Dedicated Feature Extractor}, 
      author={Jiaqi Guo and Yunnan Wu and Evangelos Kaimakamis and Georgios Petmezas and Vasileios E. Papageorgiou and Nicos Maglaveras and Aggelos K. Katsaggelos},
      year={2025},
      eprint={2501.12524},
      archivePrefix={arXiv},
      primaryClass={eess.IV},
      url={https://arxiv.org/abs/2501.12524}, 
}
