Geometry-Agnostic Acoustic Processing: A Dynamic Spatial Network for Joint Echo Cancellation and Noise Suppression
pip install -r requirements.txt
This project is built on the pytorch-lightning package.
Train
# --config                 network config
# --model.arch.dim_output  output dim per T-F point
# --model.arch.num_freqs   number of frequencies, related to model.stft.n_fft
# --data.train_dir         path of the training dataset
# --data.test_dir          path of the validation dataset
# --data.batch_size        batch sizes for training and validation
# --trainer.devices        training device
# --trainer.max_epochs     better performance may be obtained with more epochs
python Trainer.py fit \
  --config=configs/config.yaml \
  --model.arch.dim_output=2 \
  --model.arch.num_freqs=129 \
  --data.train_dir=/datasets/train \
  --data.test_dir=/datasets/val \
  --data.batch_size='[8,16]' \
  --trainer.devices=0, \
  --trainer.max_epochs=100
Test the trained model:
python Trainer.py test --config=logs/VSAECNet/version_x/config.yaml \
--ckpt_path=logs/VSAECNet/version_x/checkpoints/epochY.ckpt \
--trainer.devices=0,
The code repository includes an interactive demo with test audio samples in the wav directory, demonstrating the system's core capabilities. Using a single trained checkpoint, we evaluate performance across three key scenarios: far-end single-talk (FST), near-end single-talk (NST), and double-talk (DT). For the DT scenario specifically, we provide acoustic echo cancellation (AEC) and noise suppression results for three different microphone array configurations (3-, 4-, and 6-microphone setups).
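The training option --model.arch.num_freqs=129 is described as being related to model.stft.n_fft. The following is a minimal sketch of that relation, assuming the model uses a one-sided STFT with an even FFT size, in which case the number of frequency bins is n_fft // 2 + 1. The helper names are hypothetical and not part of this repository, and the resulting n_fft of 256 is inferred from that assumption rather than taken from configs/config.yaml.

```python
def num_freqs_from_n_fft(n_fft: int) -> int:
    """Number of frequency bins of a one-sided STFT (assumed, not read from the repo)."""
    if n_fft <= 0:
        raise ValueError("n_fft must be positive")
    return n_fft // 2 + 1


def even_n_fft_from_num_freqs(num_freqs: int) -> int:
    """Inverse mapping, assuming an even FFT size."""
    if num_freqs < 2:
        raise ValueError("num_freqs must be at least 2")
    return 2 * (num_freqs - 1)


if __name__ == "__main__":
    # Under the one-sided assumption, 129 bins would correspond to n_fft = 256.
    print(even_n_fft_from_num_freqs(129))
    print(num_freqs_from_n_fft(256))
```

If model.stft.n_fft is changed in the config, --model.arch.num_freqs would need to change accordingly so the two stay consistent.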