Fotios Drakopoulos, Deepak Baby and Sarah Verhulst. Real-time audio processing on a Raspberry Pi using deep neural networks, 23rd International Congress on Acoustics, 9 to 13 September 2019 in Aachen, Germany.
This work received funding from the European Research Council (ERC) under the Horizon 2020 Research and Innovation Programme (grant agreement No 678120 RobSpear).
The Keras framework for the implementation of the AECNN models is adapted from here. The necessary scripts can be found in the AECNN folder.
- The dataset of Valentini et al. was used for training and testing the models. The download_dataset.sh script can be used to download the dataset (it requires sox to downsample the data to 16 kHz):
./download_dataset.sh
- The data need to be segmented into training and testing sets according to the desired window size (the input/output size of the AECNN model). For this, the variable opts['window_size'] needs to be defined in the prepare_data.py script:
python prepare_data.py
python run_aecnn.py
The opts variable needs to be edited to modify the architecture configuration of the AECNN model.
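The segmentation step described above can be illustrated with a minimal sketch. It is not the code of prepare_data.py: the function name segment_signal, the zero-padding of the last window, and the toy values are assumptions made for illustration only.

```python
def segment_signal(samples, window_size):
    """Split a 1-D sequence into consecutive, non-overlapping windows.

    The final window is zero-padded so that every window holds exactly
    `window_size` samples (an assumption; the remainder could also be dropped).
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    windows = []
    for start in range(0, len(samples), window_size):
        frame = list(samples[start:start + window_size])
        # Pad the last, possibly shorter, frame with zeros.
        frame += [0.0] * (window_size - len(frame))
        windows.append(frame)
    return windows


# Hypothetical options dictionary mirroring the opts['window_size'] setting.
opts = {"window_size": 4}
noisy = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
frames = segment_signal(noisy, opts["window_size"])
```

Each resulting window then matches the input/output size that the model expects, which is why the window size has to be fixed before training.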
- To run the real-time framework, the jackclient-python module needs to be installed first. You can find the installation instructions on its GitHub page: https://github.com/spatialaudio/jackclient-python
- Set your sound device in the start_jackd.sh script (by default, the primary sound device hw:0 is used).
- After jackclient-python is installed, the trained model can be executed with the following command:
python audio_processing.py -m model_directory -n model_input_size -f keras
The directory of the trained model needs to be defined with the -m argument, and the input/output size of the model with the -n argument. Keras or Tensorflow can be used as the frontend (-f), and 0% or 50% overlap (-o) or frame buffering (-b) can be applied.
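To make the difference between 0% and 50% overlap concrete, the sketch below processes a signal frame by frame and recombines the outputs. It is only an illustration under stated assumptions: the use of a periodic Hann window with overlap-add for the 50% case, the function names, and the identity "model" are all hypothetical and are not taken from audio_processing.py.

```python
import math


def hann_periodic(n):
    """Periodic Hann window; two copies shifted by n/2 sum to 1 (n even)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def process_with_overlap(signal, frame_size, model, overlap=0.5):
    """Run `model` on successive frames and recombine the outputs.

    overlap=0.0: frames are processed back to back.
    overlap=0.5: frames advance by half a frame; the windowed outputs
                 are overlap-added (an assumed recombination scheme).
    """
    if overlap not in (0.0, 0.5):
        raise ValueError("only 0% and 50% overlap are sketched here")
    if frame_size % 2 != 0:
        raise ValueError("frame_size must be even")
    hop = frame_size if overlap == 0.0 else frame_size // 2
    window = [1.0] * frame_size if overlap == 0.0 else hann_periodic(frame_size)
    output = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        enhanced = model(frame)  # stand-in for the trained network
        for i in range(frame_size):
            output[start + i] += window[i] * enhanced[i]
    return output


def identity(frame):
    """Hypothetical placeholder for the trained model."""
    return frame
```

With 50% overlap the model is called twice as often for the same amount of audio, so the smoother frame transitions come at the cost of a tighter per-frame time budget.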
A trained model can be benchmarked within the current framework in terms of execution time, in order to obtain the time constraints for different settings. A .wav file needs to be provided (-i) and the (parent) directory containing the model folder(s) needs to be defined (-d). This way, multiple models can be benchmarked with this script.
python model_benchmark.py -i wav_file -d parent_directory -f keras -it 1
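The time constraint mentioned above follows from the frame size: a new frame is due every hop, so the model has to finish within the duration of one hop. The sketch below shows this arithmetic and a simple timing loop. It is not the code of model_benchmark.py; the function names, the 16 kHz default (taken from the dataset sampling rate), and the dummy model are assumptions.

```python
import time


def frame_budget_seconds(frame_size, sample_rate=16000, overlap=0.0):
    """Time available to process one frame before the next one is due."""
    hop = frame_size * (1.0 - overlap)  # samples between frame starts
    return hop / sample_rate


def mean_execution_time(func, frame, iterations=10):
    """Average wall-clock time of func(frame) over several runs."""
    start = time.perf_counter()
    for _ in range(iterations):
        func(frame)
    return (time.perf_counter() - start) / iterations


def dummy_model(frame):
    """Hypothetical stand-in for a trained model."""
    return [0.5 * x for x in frame]


frame = [0.0] * 1024
budget = frame_budget_seconds(1024, overlap=0.5)  # 512 samples at 16 kHz
elapsed = mean_execution_time(dummy_model, frame)
real_time_capable = elapsed < budget
```

For a 1024-sample frame at 16 kHz this gives a budget of 64 ms without overlap and 32 ms with 50% overlap.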
A trained model can be converted to protobuf format for inference in Tensorflow:
python tensorflow_converter.py -m model_directory
The number of parameters and floating-point operations of a trained model can be computed:
python measure_complexity.py -m model_directory
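As a rough illustration of what such a complexity measure counts, the sketch below computes parameters and floating-point operations for a single 1-D convolutional layer. It assumes the model is built from 1-D convolutions and counts one multiply-accumulate as two operations; the function name and layer sizes are hypothetical, and measure_complexity.py may count differently.

```python
def conv1d_complexity(in_channels, out_channels, kernel_size,
                      output_length, use_bias=True):
    """Parameter and FLOP count of one 1-D convolutional layer.

    FLOPs are counted as 2 per multiply-accumulate; bias additions
    and activation functions are ignored (a simplifying assumption).
    """
    weights = kernel_size * in_channels * out_channels
    biases = out_channels if use_bias else 0
    macs = weights * output_length  # one MAC per weight per output sample
    return {"params": weights + biases, "flops": 2 * macs}


# Hypothetical layer: 1 input channel, 64 filters of length 31,
# producing 512 output samples per filter.
layer = conv1d_complexity(in_channels=1, out_channels=64,
                          kernel_size=31, output_length=512)
```

Summing these numbers over all layers gives an estimate of the total model size and of the work needed per processed frame.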