Align audio array shape #45

emanuele-moscato · 2024-07-11T12:59:10Z

Various fixes:

Align the audio array reshaping with what librosa needs for resampling: for mono audio, librosa requires the array to have shape (n_samples,), not (n_samples, 1) as we had.
Fix a bug in the input features passed to speech models: the feature extractor sometimes returns the input tensors under the input_values key and sometimes under input_features. Note: should this be done for all the model helpers?
In utils_removal.py, fix the case in which the offset subtracted from the word start timestamp would have caused the audio to be read from the end.
In utils_removal.py, fix the paths for the (local) mp3 files with white and pink noise.
Added tests for text and speech.
Updated the README and created Testing.md.
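
The reshaping fix in the first item can be illustrated with a minimal sketch. It assumes the audio is held in a NumPy array; `to_mono_1d` is a hypothetical helper name, not the function changed in this PR, and the librosa call appears only as a comment.

```python
import numpy as np


def to_mono_1d(audio: np.ndarray) -> np.ndarray:
    """Return mono audio with shape (n_samples,).

    Hypothetical helper: accepts (n_samples,) or (n_samples, 1) input.
    """
    if audio.ndim == 1:
        return audio
    if audio.ndim == 2 and audio.shape[1] == 1:
        # Drop the trailing channel axis: (n_samples, 1) -> (n_samples,)
        return audio[:, 0]
    raise ValueError(f"Expected mono audio, got shape {audio.shape}")


column_audio = np.zeros((16000, 1), dtype=np.float32)
flat_audio = to_mono_1d(column_audio)
print(flat_audio.shape)  # (16000,)

# The flattened array can then be passed to librosa, for example:
# librosa.resample(flat_audio, orig_sr=44100, target_sr=16000)
```

Raising on unexpected shapes, rather than silently squeezing, is a design choice of this sketch so that stereo input is not flattened by accident.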
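
For the input features item, one possible way to handle both keys is sketched below. Only the key names `input_values` and `input_features` come from the description above; `get_model_inputs` is a hypothetical helper, and the extractor output is assumed to behave like a mapping.

```python
from typing import Any, Mapping

# Key names taken from the PR description above.
FEATURE_KEYS = ("input_features", "input_values")


def get_model_inputs(extractor_output: Mapping[str, Any]) -> Any:
    """Return the input tensor under whichever known key is present.

    Hypothetical helper, not the code in this PR.
    """
    for key in FEATURE_KEYS:
        if key in extractor_output:
            return extractor_output[key]
    raise KeyError(
        f"None of {FEATURE_KEYS} found; got keys {sorted(extractor_output)}"
    )


# Plain lists stand in for the tensors a feature extractor would return.
print(get_model_inputs({"input_values": [0.1, 0.2]}))    # [0.1, 0.2]
print(get_model_inputs({"input_features": [[1.0, 2.0]]}))  # [[1.0, 2.0]]
```

If the same lookup is needed in all the model helpers, a shared function of this kind would keep the behaviour in one place.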
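
The offset fix in utils_removal.py presumably guards against a negative start position. The sketch below shows that idea under stated assumptions: timestamps are in seconds, the audio is an indexable sequence of samples, and `segment_bounds` is a hypothetical name, not the function in this PR.

```python
from typing import Tuple


def segment_bounds(
    word_start: float, word_end: float, offset: float, sample_rate: int
) -> Tuple[int, int]:
    """Return (start, end) sample indices, clamping the start at zero.

    Hypothetical helper: times are assumed to be in seconds.
    """
    start_s = max(0.0, word_start - offset)
    return int(start_s * sample_rate), int(word_end * sample_rate)


samples = list(range(8))  # toy "audio" with 8 samples
sample_rate = 8           # toy sample rate, so 1 second = 8 samples

# The word starts at 0.25 s and the offset is 0.5 s, so the unclamped
# start is negative. In Python a negative index counts from the end.
naive_start = int((0.25 - 0.5) * sample_rate)
print(naive_start, samples[naive_start])  # -2 6

start, end = segment_bounds(0.25, 0.5, 0.5, sample_rate)
print(samples[start:end])  # [0, 1, 2, 3]
```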

To do after merging the testing_environment branch:

Make sure that tests still work.
Add pytest among the requirements (pyproject.toml).
Try creating a new environment from scratch, installing all the requirements (watch out for the manual ones!), and making sure the tests still run.

…so added an audio file that I use to test the benchmark_speech functions
bug: fix in gradient_speech_explainer.py: the transcription part should have been commented out, since FerretAudio and its functions were changed
fix: ctranslate2 was updated, so I added a piece of code that takes that update into account

gaiageagea and others added 5 commits April 11, 2024 15:38

initial experiments in changing from unittest to pytest

400c9b8

changing structure to reduce redundancy and adding some other functional and structural tests

d8d3452

refactor:

0b33b4a

docs: adjusting README.md to include mention about testing and adding TESTTING.md

5c19d15

Fix issue with audio array reshaping, fix issue with input features to speech models

04affcc

emanuele-moscato requested a review from g8a9 July 11, 2024 12:59

emanuele-moscato self-assigned this Jul 11, 2024

emanuele-moscato changed the base branch from main to dev July 11, 2024 12:59

emanuele-moscato and others added 6 commits July 11, 2024 17:12

Fix words removal when offset would cause the audio segment to be read from the end

a1054a9

Fix bug in reading white and pink noise mp3 files from local path

7847354

Add comments

2165a2a

Merge pull request #49 from g8a9/testing_environment

9a6b2e9

Testing environment

docs: adding pytest dependency

169c286

emanuele-moscato commented Jul 11, 2024 • edited by gaiageagea
