A command-line tool for augmenting WAV audio files by adding uniform random noise and shifting pitch via linear resampling. Designed for audio data preparation in machine learning pipelines, it processes files recursively while preserving the original directory structure in the output.
- Recursive Processing: Scans input directories (including subfolders) for .wav files.
- Noise Augmentation: Adds configurable uniform random noise to simulate real-world audio perturbations.
- Pitch Shifting: Applies simple pitch shifts by resampling, adjusting playback speed (note: this changes file duration).
- Format Validation: Ensures input files are mono, 16-bit PCM at 16 kHz (as specified for common speech datasets).
- Efficient & Idiomatic: Built with Rust for safety, speed, and minimal dependencies.
- Clone the repository:
  git clone https://github.com/RustedBytes/wav-files-augment.git
  cd wav-files-augment
- Build and install:
  cargo install --path .
Requires Rust 1.75+ (stable channel).
wav-files-augment [OPTIONS] --input-dir <INPUT_DIR> --output-dir <OUTPUT_DIR>
- --input-dir <INPUT_DIR>: Path to the input directory containing WAV files (processed recursively).
- --output-dir <OUTPUT_DIR>: Path to the output directory for augmented files.
- --noise-level <NOISE_LEVEL>: Noise amplitude (0.0 to 1.0; default: 0.01). Higher values add more noise.
- --pitch-ratio <PITCH_RATIO>: Pitch shift factor (>1.0 raises pitch/shortens duration; <1.0 lowers pitch/lengthens duration; default: 1.0 for no shift).
Run wav-files-augment --help for full details.
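How the noise level maps onto samples is not spelled out here, so the following is only a minimal sketch of one plausible reading: the level is treated as a fraction of 16-bit full scale, each sample receives a uniform offset in that range, and the result is clamped to the i16 range. The names XorShift64 and add_uniform_noise are hypothetical, and the small generator stands in for the rand crate purely to keep the example std-only.

```rust
/// Tiny xorshift64 generator, used here only to keep the sketch std-only.
/// The tool itself lists the `rand` crate for noise generation.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // A zero state would stay at zero forever, so substitute a constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    /// Returns a uniformly distributed value in [-1.0, 1.0).
    fn next_symmetric(&mut self) -> f32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // Keep the top 24 bits so the value is exactly representable in f32.
        let unit = (self.0 >> 40) as f32 / (1u64 << 24) as f32; // [0.0, 1.0)
        unit * 2.0 - 1.0
    }
}

/// Adds uniform noise scaled by `noise_level` (ASSUMED to be a fraction of
/// i16 full scale) and clamps each result to the valid i16 range.
fn add_uniform_noise(samples: &[i16], noise_level: f32, rng: &mut XorShift64) -> Vec<i16> {
    let amplitude = noise_level * i16::MAX as f32;
    samples
        .iter()
        .map(|&s| {
            let noisy = s as f32 + rng.next_symmetric() * amplitude;
            noisy.round().clamp(i16::MIN as f32, i16::MAX as f32) as i16
        })
        .collect()
}
```

Under this assumption, the default level of 0.01 moves each sample by at most about 328 quantization steps, and samples already at full scale stay at full scale instead of wrapping around.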
Process all WAV files in data/raw/ and save to data/augmented/ with default noise (0.01) and no pitch shift:
wav-files-augment --input-dir data/raw --output-dir data/augmented
Add stronger noise (0.05) and raise pitch by 10% (ratio 1.1, shortening files):
wav-files-augment --input-dir data/raw --output-dir data/augmented --noise-level 0.05 --pitch-ratio 1.1
Output files maintain the relative path structure (e.g., data/raw/subdir/file.wav → data/augmented/subdir/file.wav).
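The path mirroring described above can be sketched with the standard library alone. The helper name mirrored_output_path is hypothetical and this is not necessarily how the tool implements it. It only shows the idea of re-rooting each file's relative path under the output directory.

```rust
use std::path::{Path, PathBuf};

/// Maps an input file to its output location by taking its path relative to
/// `input_dir` and joining that onto `output_dir`.
/// Returns `None` when `file` does not live under `input_dir`.
fn mirrored_output_path(input_dir: &Path, output_dir: &Path, file: &Path) -> Option<PathBuf> {
    let relative = file.strip_prefix(input_dir).ok()?;
    Some(output_dir.join(relative))
}
```

Before writing to the returned path, a caller would typically create the parent directory with std::fs::create_dir_all so that nested subfolders exist in the output tree.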
- Input: RIFF (little-endian) WAVE audio, Microsoft PCM, 16-bit, mono, 16000 Hz.
- Output: Identical format to input.
Unsupported files are skipped with an error message.
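As an illustration of the format check, here is a minimal std-only sketch. WavFormat and check_format are hypothetical names. In the tool these values would presumably come from the header that hound reads, and the error wording below is invented for the example.

```rust
/// Hypothetical summary of a WAV header, defined here so the sketch needs no
/// external crates.
#[derive(Debug, Clone, Copy, PartialEq)]
struct WavFormat {
    channels: u16,
    sample_rate: u32,
    bits_per_sample: u16,
    is_integer_pcm: bool,
}

/// Accepts only mono, 16-bit integer PCM at 16 kHz; otherwise returns a
/// message describing the first mismatch found.
fn check_format(fmt: &WavFormat) -> Result<(), String> {
    if !fmt.is_integer_pcm {
        return Err("expected integer PCM samples".to_string());
    }
    if fmt.channels != 1 {
        return Err(format!("expected mono, found {} channels", fmt.channels));
    }
    if fmt.bits_per_sample != 16 {
        return Err(format!("expected 16-bit samples, found {}-bit", fmt.bits_per_sample));
    }
    if fmt.sample_rate != 16_000 {
        return Err(format!("expected 16000 Hz, found {} Hz", fmt.sample_rate));
    }
    Ok(())
}
```

A caller that wants the skip-and-report behavior described above could print the Err message for a file and continue with the next one.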
- Pitch Shifting: Uses linear interpolation resampling, which is fast and dependency-free but can introduce aliasing for large shifts and always alters duration. For time-preserving pitch shifts, consider integrating crates like rubato or dasp.
- Noise Type: Uniform random noise; extend with Gaussian via rand_distr if needed.
- Performance: Single-threaded; for large datasets, parallelize with rayon.
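To make the duration trade-off of linear-interpolation resampling concrete, here is a minimal sketch. The function name resample_linear, the rounding of the output length, and the handling of the final sample are assumptions made for illustration, so the tool's own edge cases may differ.

```rust
/// Resamples `samples` by `pitch_ratio` using linear interpolation.
/// A ratio above 1.0 steps through the input faster (higher pitch, fewer
/// output samples); a ratio below 1.0 steps through it slower.
fn resample_linear(samples: &[i16], pitch_ratio: f64) -> Vec<i16> {
    if samples.is_empty() || !pitch_ratio.is_finite() || pitch_ratio <= 0.0 {
        return Vec::new();
    }
    // ASSUMPTION: the output length is the input length divided by the ratio,
    // rounded to the nearest sample.
    let out_len = (samples.len() as f64 / pitch_ratio).round() as usize;
    let last = samples.len() - 1;
    (0..out_len)
        .map(|i| {
            // Fractional read position in the input signal.
            let pos = i as f64 * pitch_ratio;
            let idx = (pos.floor() as usize).min(last);
            let next = (idx + 1).min(last);
            let frac = pos - idx as f64;
            let a = samples[idx] as f64;
            let b = samples[next] as f64;
            // Past the final sample `a == b`, so the value is simply held.
            (a + (b - a) * frac).round() as i16
        })
        .collect()
}
```

With this sketch, a one-second clip of 16000 samples resampled at ratio 1.1 yields 14545 samples, roughly 0.91 s at the unchanged 16 kHz rate, which is the shortening mentioned for pitch ratios above 1.0.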
- clap: Argument parsing.
- hound: WAV I/O.
- walkdir: Recursive directory traversal.
- rand: Noise generation.
- anyhow: Error handling.
See Cargo.toml for versions.
Run the test suite:
cargo test
The suite includes unit tests for noise addition, resampling (identity, up/downsampling), and clamping.
cargo build --release
The binary is in target/release/wav-files-augment.
- Fork the repo and create a feature branch (git checkout -b feat/amazing-feature).
- Commit changes (git commit -m 'Add some AmazingFeature').
- Push to the branch (git push origin feat/amazing-feature).
- Open a Pull Request.
Please adhere to Rust style guidelines (run cargo fmt and cargo clippy before submitting).
MIT License - see LICENSE for details.
@software{Smoliakov_Wav_Files_Toolkit,
author = {Smoliakov, Yehor},
month = oct,
title = {{WAV Files Toolkit: A suite of command-line tools for common WAV audio processing tasks, including conversion from other formats, data augmentation, loudness normalization, spectrogram generation, and validation.}},
url = {https://github.com/RustedBytes/wav-files-toolkit},
version = {0.4.0},
year = {2025}
}