wav-files-augment

A command-line tool for augmenting WAV audio files by adding uniform random noise and shifting pitch via linear resampling. Designed for audio data preparation in machine learning pipelines, it processes files recursively while preserving the original directory structure in the output.

Features

  • Recursive Processing: Scans input directories (including subfolders) for .wav files.
  • Noise Augmentation: Adds configurable uniform random noise to simulate real-world audio perturbations.
  • Pitch Shifting: Applies simple pitch shifts by resampling, adjusting playback speed (note: this changes file duration).
  • Format Validation: Accepts only mono, 16-bit PCM input at 16 kHz (a format common in speech datasets); files in any other format are skipped.
  • Efficient & Idiomatic: Built with Rust for safety, speed, and minimal dependencies.

Installation

From Source

  1. Clone the repository:

    git clone https://github.com/RustedBytes/wav-files-augment.git
    cd wav-files-augment
  2. Build and install:

    cargo install --path .

Requires Rust 1.75+ (stable channel).

Usage

wav-files-augment [OPTIONS] --input-dir <INPUT_DIR> --output-dir <OUTPUT_DIR>

Required Arguments

  • --input-dir <INPUT_DIR>: Path to the input directory containing WAV files (processed recursively).
  • --output-dir <OUTPUT_DIR>: Path to the output directory for augmented files.

Optional Arguments

  • --noise-level <NOISE_LEVEL>: Noise amplitude (0.0 to 1.0; default: 0.01). Higher values add more noise.
  • --pitch-ratio <PITCH_RATIO>: Pitch shift factor (>1.0 raises pitch/shortens duration; <1.0 lowers pitch/lengthens duration; default: 1.0 for no shift).

Run wav-files-augment --help for full details.
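
As a rough illustration of what a noise level means for 16-bit samples, here is a minimal sketch. It assumes the level is a fraction of full scale and that noise is drawn uniformly from [-level, level); the tool's actual scaling may differ. The `XorShift64` generator and the `add_uniform_noise` helper are hypothetical names used only to keep the example free of crates (the tool itself depends on `rand`).

```rust
/// Tiny xorshift64 generator, a stand-in for the `rand` crate so this sketch
/// needs no dependencies. The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Uniform value in [-1.0, 1.0).
    fn next_symmetric(&mut self) -> f32 {
        // Keep the top 24 bits so the value is exactly representable in f32.
        let unit = (self.next_u64() >> 40) as f32 / (1u64 << 24) as f32;
        unit * 2.0 - 1.0
    }
}

/// Hypothetical helper: add uniform noise scaled by `noise_level`
/// (assumed to be a fraction of full scale, 0.0 to 1.0).
fn add_uniform_noise(samples: &[i16], noise_level: f32, rng: &mut XorShift64) -> Vec<i16> {
    let full_scale = i16::MAX as f32;
    samples
        .iter()
        .map(|&s| {
            let noisy = s as f32 + rng.next_symmetric() * noise_level * full_scale;
            // Clamp so loud samples saturate instead of wrapping around.
            noisy.clamp(i16::MIN as f32, i16::MAX as f32) as i16
        })
        .collect()
}
```

Under these assumptions, the default level of 0.01 moves each sample by at most about 328 steps out of 32767, and a level of 0.0 leaves the audio unchanged.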

Examples

Basic Augmentation

Process all WAV files in data/raw/ and save to data/augmented/ with default noise (0.01) and no pitch shift:

wav-files-augment --input-dir data/raw --output-dir data/augmented

With Custom Noise and Pitch Shift

Add stronger noise (0.05) and raise pitch by 10% (ratio 1.1, shortening files):

wav-files-augment --input-dir data/raw --output-dir data/augmented --noise-level 0.05 --pitch-ratio 1.1

Output files maintain the relative path structure (e.g., data/raw/subdir/file.wav → data/augmented/subdir/file.wav).
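
The path mirroring described above can be sketched with the standard library alone. This is an illustration under assumptions, not the tool's actual code; `mirrored_output_path` is a hypothetical helper name.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: map an input file to its mirrored location under
/// `output_dir`. Returns `None` if `file` is not located under `input_dir`.
fn mirrored_output_path(input_dir: &Path, output_dir: &Path, file: &Path) -> Option<PathBuf> {
    // Drop the input root, keeping only the part relative to it.
    let relative = file.strip_prefix(input_dir).ok()?;
    // Re-attach that relative part under the output root.
    Some(output_dir.join(relative))
}
```

Before writing a file to the resulting path, its parent directory would typically need to be created, for instance with `std::fs::create_dir_all`.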

Supported Format

  • Input: RIFF (little-endian) WAVE audio, Microsoft PCM, 16-bit, mono, 16000 Hz.
  • Output: Identical format to input.

Unsupported files are skipped with an error message.
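
A minimal sketch of such a format check follows. `WavFormat` and `check_supported` are hypothetical names defined here so the example needs no dependencies; in the tool, the equivalent fields would presumably come from the header that `hound` reads.

```rust
/// Local stand-in for the WAV header fields of interest (an assumption for
/// this sketch, not a type from the tool or from `hound`).
#[derive(Debug, Clone, Copy, PartialEq)]
struct WavFormat {
    channels: u16,
    sample_rate: u32,
    bits_per_sample: u16,
    is_integer_pcm: bool,
}

/// Returns `Ok(())` for mono, 16-bit integer PCM at 16 kHz, and otherwise an
/// error message naming the first mismatch found.
fn check_supported(fmt: &WavFormat) -> Result<(), String> {
    if !fmt.is_integer_pcm {
        return Err("expected integer PCM samples".to_string());
    }
    if fmt.channels != 1 {
        return Err(format!("expected mono, found {} channels", fmt.channels));
    }
    if fmt.bits_per_sample != 16 {
        return Err(format!("expected 16-bit samples, found {}-bit", fmt.bits_per_sample));
    }
    if fmt.sample_rate != 16_000 {
        return Err(format!("expected 16000 Hz, found {} Hz", fmt.sample_rate));
    }
    Ok(())
}
```

A caller could print the returned message and move on to the next file, which matches the skip-with-error behavior described above.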

Limitations & Trade-offs

  • Pitch Shifting: Uses linear interpolation resampling, which is fast and dependency-free but can introduce aliasing for large shifts and always alters duration. For higher-quality resampling, consider integrating crates like rubato or dasp; a time-preserving pitch shift additionally requires a time-stretching step.
  • Noise Type: Uniform random noise; extend with Gaussian via rand_distr if needed.
  • Performance: Single-threaded; for large datasets, parallelize with rayon.
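
To make the pitch-shifting trade-off concrete, here is a minimal sketch of linear-interpolation resampling. It assumes the output length is the input length divided by the ratio, rounded down; the tool's exact length rule and rounding may differ, and `resample_linear` is a hypothetical name.

```rust
/// Hypothetical helper: read the input `ratio` times faster (or slower),
/// interpolating linearly between neighboring samples.
fn resample_linear(samples: &[i16], ratio: f64) -> Vec<i16> {
    if samples.is_empty() || !ratio.is_finite() || ratio <= 0.0 {
        return Vec::new();
    }
    // Assumed length rule: a ratio above 1.0 shortens the output.
    let out_len = (samples.len() as f64 / ratio).floor() as usize;
    let last = samples.len() - 1;
    (0..out_len)
        .map(|i| {
            // Fractional read position in the input signal.
            let pos = i as f64 * ratio;
            let lo = (pos.floor() as usize).min(last);
            let hi = (lo + 1).min(last); // repeat the final sample at the edge
            let frac = pos - lo as f64;
            let value = samples[lo] as f64 * (1.0 - frac) + samples[hi] as f64 * frac;
            value.round().clamp(i16::MIN as f64, i16::MAX as f64) as i16
        })
        .collect()
}
```

Under this length rule, a 3-second file at 16 kHz (48000 samples) resampled with a ratio of 1.1 comes out at 43636 samples, roughly 2.73 seconds. The sketch applies no low-pass filter before skipping samples, which is where the aliasing for large upward shifts comes from.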

Development

Dependencies

  • clap: Argument parsing.
  • hound: WAV I/O.
  • walkdir: Recursive directory traversal.
  • rand: Noise generation.
  • anyhow: Error handling.

See Cargo.toml for versions.

Testing

Run the test suite:

cargo test

Includes unit tests for noise addition, resampling (identity, up/downsampling), and clamping.

Building

cargo build --release

The binary is in target/release/wav-files-augment.

Contributing

  1. Fork the repo and create a feature branch (git checkout -b feat/amazing-feature).
  2. Commit changes (git commit -m 'Add some AmazingFeature').
  3. Push to the branch (git push origin feat/amazing-feature).
  4. Open a Pull Request.

Please adhere to Rust style guidelines (run cargo fmt and cargo clippy before submitting).

License

MIT License - see LICENSE for details.

Cite

@software{Smoliakov_Wav_Files_Toolkit,
  author = {Smoliakov, Yehor},
  month = oct,
  title = {{WAV Files Toolkit: A suite of command-line tools for common WAV audio processing tasks, including conversion from other formats, data augmentation, loudness normalization, spectrogram generation, and validation.}},
  url = {https://github.com/RustedBytes/wav-files-toolkit},
  version = {0.4.0},
  year = {2025}
}
