WavRx - a Speech Health Diagnostic Model

This repository provides scripts for running WavRx, a new state-of-the-art (SOTA) speech health diagnostic model. WavRx obtains SOTA performance on 6 datasets covering 4 different pathologies and shows good zero-shot generalizability. The health embeddings encoded by WavRx are shown to carry minimal speaker identity attributes.

This repository can be used to (1) train WavRx on the 6 datasets; (2) run inference using the pretrained WavRx backbones; (3) train and test your own custom models on the 6 datasets without having to edit the training/evaluation scripts.

For detailed information, refer to the paper:

@article{zhu2024wavrx,
  title={WavRx: a Disease-Agnostic, Generalizable, and Privacy-Preserving Speech Health Diagnostic Model},
  author={Zhu, Yi and Falk, Tiago},
  journal={arXiv preprint arXiv:2406.18731},
  year={2024}
}

🛠️ Dependencies

We use PyTorch and SpeechBrain as the main frameworks. To set up the environment for WavRx, follow these steps:

Clone the repository:

git clone https://github.com/zhu00121/WavRx
cd WavRx

Create a virtual environment for the repository (Python 3.10.13; the interpreter is typically invoked as `python3.10`):

python3.10 -m venv <NAME_YOUR_VENV>
source <NAME_YOUR_VENV>/bin/activate

Install dependencies:
```
pip install -r requirements.txt
```
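
After installation, a quick sanity check can confirm that the two main frameworks are importable from the active virtual environment. This is only a minimal sketch: the import names `torch` and `speechbrain` are assumed from the frameworks named above, and `requirements.txt` remains the authoritative list of packages and versions.

```python
import importlib.util
import sys

# Import names assumed from the frameworks named in this README;
# requirements.txt is the authoritative dependency list.
REQUIRED_MODULES = ("torch", "speechbrain")


def check_modules(names):
    """Return {module_name: bool} telling whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```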

🌟 Pretrained Model Backbones (to be released on HuggingFace)

Note that some of the datasets used are subject to confidentiality agreements, and this restriction may also apply to the pretrained model weights. We are currently working on making the pretrained backbones open-source on HuggingFace.

| Model | Dataset | Repo |
| --- | --- | --- |
| WavRx-respiratory | Cambridge COVID-19 Sound | huggingface.co/ |
| WavRx-COVID | DiCOVA2 | huggingface.co/ |
| WavRx-dysarthria | TORGO | huggingface.co/ |
| WavRx-dysarthria | Nemours | huggingface.co/ |
| WavRx-cancer | NCSC | huggingface.co/ |

👷 Data download and preparation

Most of the datasets require signing an agreement to obtain access. Please use the download links in the table below to reach the data download pages and follow their instructions to obtain the data. Once the data are downloaded, refer to the data preparation guide, which helps prepare the data in the required format.

| Dataset | Task | Download links | Data preparation guide |
| --- | --- | --- | --- |
| Cambridge-Task1 | Respiratory Symptom Detection | Contact author of paper | `exps/Cambridge_Respiratory/Guide.ipynb` |
| Cambridge-EN | Respiratory Symptom | TBD | `exps/Cambridge_Respiratory_Task1/Guide.ipynb` |
| DiCOVA2 | COVID-19 | Contact organizers of challenge | `exps/DiCOVA2/Guide.ipynb` |
| TORGO | Dysarthria | Link | `exps/TORGO/Guide.ipynb` |
| Nemours | Dysarthria | Contact author of paper | `exps/Nemours/Guide.ipynb` |
| NCSC | Cervical Cancer | Contact author of paper | `exps/NCSC/Guide.ipynb` |

▶️ Quickstart

Running a single task

Since each dataset has a different structure and its own partition, the training recipes are stored separately in different folders. The paths in the table above can be used to locate the recipe for a given dataset.

The steps for training WavRx (or your own model) are as follows:

1. Prepare the data. Medical datasets are hard to access and rarely share a common structure, so each dataset has its own guide. Use the data preparation guides in the table above to see where to download the data files and how to prepare each dataset in the required format.
2. [Optional, only if you want to train your own model] Place your model code in the `model` folder. The model needs to have 1 output neuron (without sigmoid).
3. Check the hyperparameter file at `exps/<DATASET>/hparams/wavrx_<DATASET>.yaml` and modify the variables if needed. `demos/demo_hparam.md` walks through the hyperparameter file and demonstrates how to adapt it for your own use. If you simply want to replicate our results, there is no need to change it.
4. `train.py` does NOT need to be edited unless you want to change the training strategy or the loss function (by default, supervised training with BCEWithLogits loss). All hyperparameters and the input model are controlled through the hyperparameter file, which helps ensure that models are compared fairly.
5. Start training by calling `python train.py hparams/wavrx_<DATASET>.yaml`. Test evaluation is run automatically at the end of training using the checkpoint with the highest F1 score, and the results are saved automatically.
6. Repeat step 5 for each dataset. Results are saved independently in the corresponding `exps/<DATASET>` folder.
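
The requirement of a single output neuron without sigmoid follows from the loss: a BCE-with-logits loss applies the sigmoid internally, so the model should emit a raw logit. The sketch below illustrates this in plain Python rather than with the repository's code. The function names, the decision threshold of 0.5, and the positive-class F1 definition are assumptions made for illustration; the exact metric computation in `train.py` may differ.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy computed on a raw logit.

    Equivalent to -[t*log(sigmoid(z)) + (1-t)*log(1-sigmoid(z))],
    rewritten so that exp() never overflows.
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def predict(logit):
    """Hard decision: sigmoid(logit) >= 0.5 is the same as logit >= 0."""
    return 1 if logit >= 0.0 else 0


def f1_score(logits, targets):
    """Positive-class F1 from raw logits (assumed threshold: probability 0.5)."""
    preds = [predict(z) for z in logits]
    tp = sum(1 for p, t in zip(preds, targets) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(preds, targets) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(preds, targets) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    logits = [2.0, -1.0, 0.5, -3.0]   # hypothetical single-neuron outputs
    targets = [1, 0, 0, 1]            # hypothetical binary labels
    mean_loss = sum(bce_with_logits(z, t) for z, t in zip(logits, targets)) / len(logits)
    print(f"mean loss = {mean_loss:.4f}, F1 = {f1_score(logits, targets):.2f}")
```

If a custom model applied a sigmoid before this loss, the sigmoid would effectively be applied twice, which is why the output layer should be left linear.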

😎 Voila! Enjoy your model training 😎

Running multiple tasks in one-shot

The repository is not currently built for running multiple tasks in one shot, although this can be done by wrapping the individual training calls in one shell script. This functionality will be made available with our ongoing health benchmark project, a larger benchmark for easy implementation and evaluation of SOTA diagnostic models for 10+ diseases. If you are interested, keep an eye on the SpeechBrain Benchmark, where we will be releasing our scripts.
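
Until then, the wrapping can also be done from Python instead of a shell script. The following is a minimal sketch and not part of the repository. It assumes that `train.py` lives inside each `exps/<DATASET>` folder, that the `<DATASET>` placeholder in the hyperparameter file name matches the folder name, and that the listed folder names (taken from the data preparation table) are the ones you have prepared. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys
from pathlib import Path

# Folder names taken from the data preparation table; adjust to the
# datasets you actually have access to.
DATASETS = ["DiCOVA2", "TORGO", "Nemours", "NCSC"]


def build_jobs(datasets, root="exps"):
    """Return a list of (working_directory, argv) pairs, one per dataset.

    Assumes each recipe is launched from exps/<DATASET> with
    hparams/wavrx_<DATASET>.yaml, mirroring the single-task command.
    """
    jobs = []
    for name in datasets:
        workdir = Path(root) / name
        argv = [sys.executable, "train.py", f"hparams/wavrx_{name}.yaml"]
        jobs.append((workdir, argv))
    return jobs


def run_all(datasets, root="exps", dry_run=True):
    """Run the recipes one after another; stop at the first failure."""
    for workdir, argv in build_jobs(datasets, root):
        print(f"[{workdir}] {' '.join(argv)}")
        if not dry_run:
            subprocess.run(argv, cwd=workdir, check=True)


if __name__ == "__main__":
    # Dry run by default: prints the commands without launching training.
    run_all(DATASETS, dry_run=True)
```

Running the recipes sequentially with `check=True` keeps the per-dataset results independent and makes a failing dataset easy to spot; set `dry_run=False` once the printed commands look right.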

📧 Contact

For questions or inquiries, feel free to open an issue or contact the author, Yi Zhu, at yi.zhu@inrs.ca.

📖 Citing

If you use WavRx, its backbones, or the training recipes, please cite:

@article{zhu2024wavrx,
  title={WavRx: a Disease-Agnostic, Generalizable, and Privacy-Preserving Speech Health Diagnostic Model},
  author={Zhu, Yi and Falk, Tiago},
  journal={arXiv preprint arXiv:2406.18731},
  year={2024}
}
