Neuro-Holistic Audio-eNhancement System (N-HANS)


Latest News: (Dec. 13, 2019) N-HANS now supports Python 3!

N-HANS is a Python toolkit for in-the-wild speech enhancement, including speech, music, and general audio denoising, separation, and selective noise or source suppression. These functions are implemented by two neural network models that share the same architecture but are trained separately. The models consist of stacks of residual blocks, each conditioned on additional speech or environmental noise recordings so that they can adapt to unseen speakers or environments in real life.

In addition to a Python API, a command line interface is provided for researchers and developers. Install with: pip install N-HANS

(c) 2019 Shuo Liu, Gil Keren, Björn Schuller: University of Augsburg published under GPL v3, see the LICENSE file for details.

Please direct any questions or requests to Shuo Liu (shuo.liu@informatik.uni-augsburg.de).

Citation

If you use N-HANS or any code from N-HANS in your research work, you are kindly asked to acknowledge the use of N-HANS in your publications.

https://arxiv.org/pdf/1911.07062.pdf

Prerequisites

  • Python 3 / Python 2.7

Python Dependencies

  • numpy >=1.14.5
  • scipy >=1.0.1
  • six >=1.10.0
  • tensorflow 1.14.0 or tensorflow-gpu 1.14.0

Usage

Loading Models

After running pip install N-HANS, users are expected to create a workspace for the audio denoising or separation task. Linux users can then use the commands load_denoiser or load_separator to download the trained models and audio examples into the workspace. On other operating systems, please download trained_model and audio_examples from the corresponding N_HANS subfolders and place them in the created workspace.

Applying N-HANS

N-HANS processes standard .wav audio files with a sample rate of 16 kHz, encoded as 16-bit signed integer PCM. Audio in other formats should be converted to this setting first.
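Before running N-HANS on a recording, it can help to confirm that the file already matches this format. The following is a minimal sketch using only Python's standard wave module; the function name wav_format_problems is hypothetical and not part of N-HANS.

```python
import wave


def wav_format_problems(path, expected_rate=16000, expected_sample_width=2):
    """Return a list of mismatches against 16 kHz, 16-bit PCM (empty if none).

    Hypothetical helper, not part of N-HANS. The wave module raises
    wave.Error for files it cannot parse as PCM .wav.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()  # bytes per sample; 2 bytes = 16 bit
    if rate != expected_rate:
        problems.append(
            "sample rate is %d Hz, expected %d Hz" % (rate, expected_rate)
        )
    if width != expected_sample_width:
        problems.append(
            "sample width is %d bit, expected %d bit"
            % (8 * width, 8 * expected_sample_width)
        )
    return problems
```

An empty list means the file needs no conversion; otherwise the returned messages say what has to change before the file is passed to N-HANS.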

Commands

| Task | Command | Description |
| --- | --- | --- |
| speech denoising | nhans_denoiser --input noisy.wav --neg=noise.wav | --neg specifies the environmental noise |
| selective noise suppression | nhans_denoiser --input noisy.wav --pos=preserve.wav --neg=suppress.wav | --pos specifies the noise to keep; --neg specifies the noise to suppress |
| speech separation | nhans_separator --input mixed.wav --pos=target.wav --neg=interference.wav | --pos specifies the target speaker; --neg specifies the interfering speaker |

  • All commands accept an additional --output path for saving the processed results; the default output path is audio_examples/.
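When the denoiser is called from a script, the command line can be assembled programmatically. The sketch below only uses the options listed above (--input, --pos, --neg, --output); the helper names build_denoiser_command and run_denoiser are hypothetical, and the sketch assumes nhans_denoiser is on the PATH after installation.

```python
import subprocess


def build_denoiser_command(input_wav, neg_wav, pos_wav=None, output_path=None):
    """Assemble an nhans_denoiser argument list (hypothetical helper).

    Without pos_wav this is plain speech denoising; with pos_wav it is
    selective noise suppression.
    """
    command = ["nhans_denoiser", "--input", input_wav]
    if pos_wav is not None:
        command.append("--pos=" + pos_wav)
    command.append("--neg=" + neg_wav)
    if output_path is not None:
        command.extend(["--output", output_path])
    return command


def run_denoiser(*args, **kwargs):
    """Run the assembled command; raises CalledProcessError on failure."""
    return subprocess.run(build_denoiser_command(*args, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.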

Examples

Processing single wav sample

| Task | Example |
| --- | --- |
| speech denoising | nhans_denoiser audio_examples/exp2_noisy.wav --neg=audio_examples/exp2_noise.wav |
| selective noise suppression | nhans_denoiser audio_examples/exp1_noisy.wav --pos=audio_examples/exp1_posnoise.wav --neg=audio_examples/exp2_negnoise.wav |
| speech separation | nhans_separator audio_examples/mixed.wav --pos=audio_examples/target_speaker.wav --neg=audio_examples/noise_speaker.wav |

Processing multiple wav samples in folders

Please create separate folders for the noisy recordings, the negative recordings and, where needed, the positive recordings. The recordings that belong to one sample must have identical filenames across these folders.

| Task | Example |
| --- | --- |
| speech denoising | nhans_denoiser audio_examples/noisy --neg=audio_examples/neg |
| selective noise suppression | nhans_denoiser audio_examples/noisy --pos=audio_examples/pos --neg=audio_examples/neg |
| speech separation | nhans_separator audio_examples/mixed --pos=audio_examples/target --neg=audio_examples/interference |
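
Because folder processing relies on identical filenames across the folders, it may be worth checking the folders before a long run. The following is a minimal sketch using only the standard library; unmatched_recordings and list_wavs are hypothetical helpers, not part of N-HANS.

```python
import os


def list_wavs(folder):
    """Return the set of .wav file names directly inside folder."""
    return {
        name for name in os.listdir(folder) if name.lower().endswith(".wav")
    }


def unmatched_recordings(noisy_dir, neg_dir, pos_dir=None):
    """Map each folder label to the file names it lacks (hypothetical helper).

    A name counts as missing from a folder if it appears in at least one
    of the other folders. An empty dict means all folders line up.
    """
    folders = {"noisy": noisy_dir, "neg": neg_dir}
    if pos_dir is not None:
        folders["pos"] = pos_dir
    names = {label: list_wavs(path) for label, path in folders.items()}
    all_names = set().union(*names.values())
    return {
        label: sorted(all_names - present)
        for label, present in names.items()
        if all_names - present
    }
```

For example, unmatched_recordings("audio_examples/noisy", "audio_examples/neg") returns an empty dict when every noisy recording has a negative recording with the same name, and vice versa.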

Train your own N-HANS

You can download the repository to train your own selective audio suppression system and separation system using the N-HANS architecture.

  1. To train a selective audio suppression system, go into N-HANS/N_HANS___Selective_Noise/ and create the clean speech and noise lists using create_seeds, pointing it to your folders containing the speech .wav files and the noise .wav files. This generates two .pkl files. The AudioSet seed lists that we used to generate the training, validation and test sets in our publication are provided as .pkl files and can be downloaded as AudioSet_seeds.

    To train a speech separation system, go into N-HANS/N_HANS___Speech_Separation/ and create a speech list using create_seeds, pointing it to your folder containing the speech .wav files. This produces one .pkl file.

  2. Run the main.py script with your settings, specified through the FLAGS listed in the following table (the default settings were used to obtain our trained_models). reader.py provides the training, validation and test data pipeline and feeds the data to the N-HANS neural networks constructed in main.py.

| FLAGS | Default | Functionality |
| --- | --- | --- |
| --speech_wav_dir | './speech_wav_dir/' | the directory that contains all speech .wav files |
| --noise_wav_dir | './noise_wav_dir/' | the directory that contains all noise .wav files |
| --wav_dump_folder | './wav_dump/' | the directory in which to save denoised signals |
| --eval_seeds | 'valid' | evaluation is applied to the 'valid' dataset; for testing, change it to 'test' |
| --window_frames | 35 | number of frames of the input noisy signal |
| --context_frames | 200 | number of frames of the reference context signal |
| --random_slices | 50 | number of random samples from each pair of clean speech and noise signals |
| --model_name | 'nhans' | model name |
| --restore_path | '' | the path from which to restore a trained model |
| --alg | 'sgd' | optimiser used to train N-HANS |
| --train_mb | 64 | mini-batch size for training data |
| --eval_mb | 64 | mini-batch size for validation or test data |
| --lr | 0.1 | learning rate |
| --mom | 0.0 | momentum for the optimiser |
| --bn_decay | 0.95 | batch normalisation decay |
| --eval_before_training | False | training phase: False, test phase: True |
| --eval_after_training | True | training phase: True, test phase: False |
| --train_monitor_every | 1000 | show training information every "train_monitor_every" batches |
| --eval_every | 5000 | show evaluation information every "eval_every" training batches |
| --checkpoint_dir | './checkpoints' | directory in which to save checkpoints |
| --summaries | './summaries' | directory for summaries |
| --dump_results | './dump' | directory for intermediate model output during training |

  3. To test your model, set --restore_path to the trained model; --eval_seeds=test is also required.
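
The phase-dependent flags from the table and the testing step can be collected in one place. The sketch below builds an argument list for main.py; build_main_args is a hypothetical helper, and passing every flag as --name=value (including the boolean flags) is an assumption about how main.py parses its FLAGS, so adjust the syntax if the script expects something else.

```python
def build_main_args(phase, restore_path="", extra_flags=None):
    """Return an argument list for main.py (hypothetical helper).

    phase is "train" or "test". The --name=value syntax is an assumption
    about main.py's flag parsing, not something the README states.
    """
    if phase == "train":
        flags = {
            "eval_seeds": "valid",
            "eval_before_training": False,
            "eval_after_training": True,
        }
    elif phase == "test":
        if not restore_path:
            raise ValueError("testing requires restore_path to a trained model")
        flags = {
            "eval_seeds": "test",
            "eval_before_training": True,
            "eval_after_training": False,
        }
    else:
        raise ValueError("phase must be 'train' or 'test'")
    if restore_path:
        flags["restore_path"] = restore_path
    flags.update(extra_flags or {})  # e.g. {"lr": 0.05, "train_mb": 32}
    args = ["python", "main.py"]
    args.extend("--%s=%s" % (name, flags[name]) for name in sorted(flags))
    return args
```

The returned list can be passed to subprocess.run from inside the corresponding N_HANS subfolder. Any flag not given keeps the default from the table.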

Authors and Contact Information
