This repository contains the Python implementation of the paper "TSSL: Trusted Sound Source Localization".
- Source signals: LibriSpeech
- Noise signals: Noise92X
- The real-world dataset: LOCATA
The datasets mentioned above can be downloaded from this OneDrive link.
The data directory structure is shown as follows:
```
.
|---data
    |---LibriSpeech
        |---dev-clean
        |---test-clean
        |---train-clean-100
    |---NoiSig
        |---test
        |---train
        |---dev
```
Note: the `data/` directory does not have to be inside your project; you can put it wherever you want. Please remember to fill in the correct data path in `config/tcrnn.yaml`.
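For orientation, the data-path entries in `config/tcrnn.yaml` might look roughly like this (a sketch only; the key names below are illustrative, so check the shipped config for the real fields):

```yaml
# Hypothetical key names -- check config/tcrnn.yaml for the actual fields
data:
  speech_dir: /path/to/data/LibriSpeech  # clean source signals
  noise_dir: /path/to/data/NoiSig        # noise signals
```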
We strongly recommend using VS Code and Docker for this project; it can save you a lot of time 😁! Note that the related configuration is already provided in `.devcontainer`. Detailed information can be found in this Tutorial_for_Vscode&Dokcer.
The environment:
- CUDA: 11.8.0
- cuDNN: 8
- Python: 3.10
- PyTorch: 2.1.0
- PyTorch Lightning: 2.1
The related configuration files are all saved in `config/`:
- `data_simu.yaml` configures the data generation.
- `tcrnn.yaml` configures the dataloader and model training & testing.

You can change the values of these items as needed.
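Judging from the command-line overrides used in the data-generation step below, `data_simu.yaml` presumably contains a `DATA_SIMU` section along these lines (only the keys that appear in the commands below are grounded; the defaults are illustrative):

```yaml
# Sketch of the DATA_SIMU section, inferred from the overrides shown below
DATA_SIMU:
  TRAIN: False       # set True to generate the training split
  DEV: False         # set True to generate the validation split
  TEST: False        # set True to generate the test split
  TRAIN_NUM: 10000   # number of training samples to simulate
```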
Note: do not forget to install gpuRIR and webrtcvad.
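They can typically be installed as follows (gpuRIR is not on PyPI and is built from its GitHub repository; see its README for the CUDA requirements):

```bash
# webrtcvad is available on PyPI
pip install webrtcvad
# gpuRIR is installed from source and needs the CUDA toolkit (see the gpuRIR README)
pip install https://github.com/DavidDiazGuerra/gpuRIR/zipball/master
```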
- Data Generation
Generate the training data:
```bash
python data_simu.py DATA_SIMU.TRAIN=True DATA_SIMU.TRAIN_NUM=10000
```
In the same way, you can generate the validation and test datasets by changing `DATA_SIMU.TRAIN=True` to `DATA_SIMU.DEV=True` or `DATA_SIMU.TEST=True`.
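For example (the `DEV_NUM`/`TEST_NUM` options below are hypothetical names modeled on `DATA_SIMU.TRAIN_NUM`; check `config/data_simu.yaml` for the actual count options):

```bash
# Option names modeled on DATA_SIMU.TRAIN_NUM -- verify against config/data_simu.yaml
python data_simu.py DATA_SIMU.DEV=True DATA_SIMU.DEV_NUM=1000
python data_simu.py DATA_SIMU.TEST=True DATA_SIMU.TEST_NUM=1000
```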
- Model Training
```bash
python main_crnn.py fit --config /workspaces/tssl/config/tcrnn.yaml
```
The `--config` parameter should point to your config file path.
- Model Evaluation
- Change `ckpt_path` in `config/tcrnn.yaml` to the trained model weights.
- Use multiple GPUs or a single GPU to test the model performance.
```bash
python main_crnn.py test --config /workspaces/tssl/config/tcrnn.yaml
```
If you want to evaluate the model on a single GPU, change the value of `devices` from `"0,1"` to `"0,"` in `config/tcrnn.yaml`.
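The relevant entry is the PyTorch Lightning trainer's `devices` option; it might look roughly like this (a sketch; the surrounding structure of `config/tcrnn.yaml` may differ):

```yaml
# Sketch of the Lightning trainer section -- surrounding keys may differ
trainer:
  devices: "0,"   # single GPU (GPU 0); use "0,1" for two GPUs
```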
If you find our work useful in your research, please consider citing:
This repository adapts and integrates code from some wonderful works, listed as follows: