An implementation of Vocos in Swift using the MLX framework. Vocos allows for high-quality reconstruction of audio from Mel spectrograms.
The Vocos Swift package can be built and run from Xcode or SwiftPM.
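For SwiftPM users, a manifest along the following lines is one way to pull the package into another project. This is a minimal sketch, not taken from this repository: the package URL `https://github.com/OWNER/REPO.git` is a placeholder to replace with this repository's actual URL, the `main` branch, the `VocosDemo` target name, and the macOS 14 platform floor are assumptions, and the product name `Vocos` is inferred from the `import Vocos` statement used in this README.

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    // Hypothetical name for the consuming project.
    name: "VocosDemo",
    // Assumed minimum platform; check the package's own manifest for the real requirement.
    platforms: [.macOS(.v14)],
    dependencies: [
        // Placeholder URL and branch: substitute this repository's URL and a suitable branch or version.
        .package(url: "https://github.com/OWNER/REPO.git", branch: "main"),
    ],
    targets: [
        .executableTarget(
            name: "VocosDemo",
            dependencies: [
                // Product name assumed to match the `Vocos` module imported in the usage example.
                // The package identity ("REPO") is the last path component of the URL above.
                .product(name: "Vocos", package: "REPO"),
            ]
        ),
    ]
)
```

In Xcode, the equivalent is to add the repository URL through File > Add Package Dependencies and link the library product to the app target.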
A pretrained model is available on Hugging Face.
import Vocos
// Load audio as an MLXArray
let audio = try AudioUtilities.loadAudioFile(url: ...)
// Reconstruct the audio from a Mel spectrogram
let vocos = try await Vocos.fromPretrained(repoId: "lucasnewman/vocos-mel-24khz-mlx")
let reconstructedAudio = vocos(audio)
// Save the reconstructed audio to a file.
try AudioUtilities.saveAudioFile(url: ..., samples: reconstructedAudio)
@article{siuzdak2023vocos,
title={Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis},
author={Siuzdak, Hubert},
journal={arXiv preprint arXiv:2306.00814},
year={2023}
}
The code in this repository is released under the MIT License, as found in the LICENSE file.