HST: Hierarchical Spectrogram Transformers

This repository contains the official implementation of Hierarchical Spectrogram Transformers (HST) described in the following paper:

Aytekin, I., Dalmaz, O., Gonc, K., Ankishan, H., Saritas, E.U., Bagci, U., Celik, H., & Çukur, T. (2022). COVID-19 Detection from Respiratory Sounds with Hierarchical Spectrogram Transformers. ArXiv, abs/2207.09529.

Dependencies

python>=3.6.9
torch>=1.7.0
torchvision>=0.8.1
librosa
cuda>=11.3

Download pre-trained HST models

The following links contain HST model weights pre-trained on ImageNet:

After downloading the weights, place them at HST/model/imagenet_weights/hst_base_imagenet.pth.

Dataset

This work uses the dataset from the paper Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data. The dataset is not publicly available, but it can be released for research purposes, as stated here.

For Task 1, the two classes are formed from the following folders:

covid: covidandroidnocough + covidandroidwithcough + covidwebnocough + covidwebwithcough
healthy: healthyandroidnosymp + healthywebnosymp

For Task 2, the two classes are formed from the following folders:

covid: covidandroidwithcough + covidwebwithcough
healthy: healthyandroidwithcough + healthywebwithcough
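
The grouping above can be written down as a small lookup table. The following is a minimal sketch only: the folder names come from the lists above, while the dictionary layout and the helper name `label_for` are illustrative assumptions and are not part of this repository.

```python
# Folder names are copied from the Task 1 / Task 2 lists above.
# TASK_FOLDERS and label_for are hypothetical names used for illustration.
TASK_FOLDERS = {
    "task1": {
        "covid": [
            "covidandroidnocough",
            "covidandroidwithcough",
            "covidwebnocough",
            "covidwebwithcough",
        ],
        "healthy": ["healthyandroidnosymp", "healthywebnosymp"],
    },
    "task2": {
        "covid": ["covidandroidwithcough", "covidwebwithcough"],
        "healthy": ["healthyandroidwithcough", "healthywebwithcough"],
    },
}


def label_for(task, folder):
    """Return "covid" or "healthy" for a source folder, or None if the task does not use it."""
    for label, folders in TASK_FOLDERS[task].items():
        if folder in folders:
            return label
    return None
```

For example, `label_for("task1", "covidwebnocough")` gives "covid", while the same folder is not used in Task 2 and yields None.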

The audio files in the folders listed above are converted to spectrograms with wave2spectogram.py. The resulting dataset should then be organized as follows:

/data/
  ├── task1_cough
  ├── task1_breath
  ├── task2_cough
  ├── task2_breath  
 
/data/task1_cough/
  ├── train_test
  ├── val  
  
/data/task1_cough/train_test
  ├── covid
  ├── healthy
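
Before training, it can help to verify that the layout is in place. The sketch below is a hypothetical helper, not part of this repository. It assumes that every task/modality folder follows the pattern shown for task1_cough, and that val contains the same covid and healthy subfolders as train_test; neither assumption is stated explicitly above.

```python
from pathlib import Path

# Names taken from the directory tree above.
TASKS = ("task1", "task2")
MODALITIES = ("cough", "breath")
SPLITS = ("train_test", "val")
CLASSES = ("covid", "healthy")


def expected_dirs(root):
    """List every class directory implied by the layout (assumed identical for all tasks and splits)."""
    root = Path(root)
    return [
        root / f"{task}_{modality}" / split / cls
        for task in TASKS
        for modality in MODALITIES
        for split in SPLITS
        for cls in CLASSES
    ]


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    return [d for d in expected_dirs(root) if not d.is_dir()]
```

Calling `missing_dirs("/data")` returns an empty list when all expected folders exist.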

Train and test

To train and test a chosen model with a given seed, run:

cd HST
python3 train.py --dataset "/data/task1_cough/train_test"  --model "hst_base"  --pretrained True  --seed 1

In our paper, HST is trained with 10 different seeds, in a manner similar to 10-fold cross-validation. The reported results are averaged over these runs.
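
The multi-seed protocol can be scripted. The following is a rough sketch under stated assumptions: the flags are the ones shown in the command above, but the seed values 1 to 10 are an assumption (the paper's exact seeds are not listed here), and the helpers `seed_commands` and `summarize` are hypothetical. How per-seed scores are read back from train.py is not specified in this README, so that step is left to the reader.

```python
import statistics


def seed_commands(dataset="/data/task1_cough/train_test", model="hst_base", seeds=range(1, 11)):
    """Build one train.py command per seed (seeds 1..10 are an assumed choice)."""
    return [
        ["python3", "train.py",
         "--dataset", dataset,
         "--model", model,
         "--pretrained", "True",
         "--seed", str(seed)]
        for seed in seeds
    ]


def summarize(scores):
    """Return the mean and sample standard deviation of per-seed scores."""
    return statistics.mean(scores), statistics.stdev(scores)
```

Each command can be executed from the HST directory, for instance with `subprocess.run(cmd, check=True, cwd="HST")`, and the collected per-seed scores passed to `summarize`.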

Demo

A single respiratory sound recording can be tested with demo.py. The HST-Base model trained on the Task 2 cough data with seed 1 can be downloaded from this link.

python3 demo.py --audio_path "sample_resp_sound"

The result is printed as "healthy" or "covid".

Citation

You are encouraged to modify/distribute this code. However, please acknowledge this code and cite the paper appropriately.

@misc{hst,
  doi = {10.48550/ARXIV.2207.09529},
  url = {https://arxiv.org/abs/2207.09529},
  author = {Aytekin, Idil and Dalmaz, Onat and Gonc, Kaan and Ankishan, Haydar and Saritas, Emine U and Bagci, Ulas and Celik, Haydar and Cukur, Tolga},
  keywords = {Sound (cs.SD), Machine Learning (cs.LG), Audio and Speech Processing (eess.AS), FOS: Computer and information sciences, FOS: Electrical engineering, electronic engineering, information engineering},
  title = {COVID-19 Detection from Respiratory Sounds with Hierarchical Spectrogram Transformers},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}

Acknowledgements

This code uses libraries from covid19-sounds-kdd20.

For questions and comments, please contact me: aytekinayceidil@gmail.com
