This is an unofficial Python package for UTMOS (UTokyo-SaruLab MOS Prediction System). This repository is based on the original code. The paper is available here.
UTMOS is designed to predict the mean opinion score (MOS) of a given voice sample. It can be used to estimate audio quality across datasets.
The score is on a scale of 1 to 5. If you'd like a score out of 100, just multiply the score by 20 (score * 20), so that 1 maps to 20 and 5 maps to 100.
Example: new_score = round(score * 20, 2)
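As a minimal sketch of the rescaling described above, the helper below multiplies a 1 to 5 score by 20 and rounds the result. The function name `rescale_mos` and its `digits` parameter are hypothetical and not part of this package.

```python
def rescale_mos(score: float, digits: int = 2) -> float:
    """Convert a MOS on the 1-5 scale to a score out of 100.

    Hypothetical helper, not part of the utmos package.
    Multiplying by 20 maps 1 -> 20 and 5 -> 100.
    """
    return round(float(score) * 20, digits)


if __name__ == "__main__":
    # Example with a made-up score value.
    print(rescale_mos(3.5))
```

Note that the result spans 20 to 100 rather than 1 to 100, since the lowest possible MOS is 1.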
This implementation supports CPU, CUDA, and MPS, as well as ROCm if PyTorch is configured properly. It will automatically use the GPU if one is available.
pip install utmos
utmos audio.wav
import utmos
model = utmos.Score() # The model will be automatically downloaded and will automatically utilize the GPU if available.
model.calculate_wav_file('audio_file.wav') # -> Float
# or model.calculate_wav(wav, sample_rate)
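Since UTMOS can be used to score whole datasets, here is a rough sketch of looping over a folder of .wav files. The helpers `score_directory` and `mean_score` are hypothetical and not part of this package. They take the scoring function as an argument, so `model.calculate_wav_file` from the example above can be passed in directly. The sketch assumes the returned value can be converted with `float()`.

```python
from pathlib import Path
from statistics import mean
from typing import Callable, Dict


def score_directory(score_file: Callable[[str], float], directory: str) -> Dict[str, float]:
    """Score every .wav file directly inside `directory`.

    `score_file` is any callable that takes a file path and returns a score,
    for example `model.calculate_wav_file`. Hypothetical helper.
    """
    scores: Dict[str, float] = {}
    for path in sorted(Path(directory).glob("*.wav")):
        # float() is an assumption: it normalizes the return value to a plain float.
        scores[path.name] = float(score_file(str(path)))
    return scores


def mean_score(scores: Dict[str, float]) -> float:
    """Average the per-file scores; raises ValueError if there are none."""
    if not scores:
        raise ValueError("no scores to average")
    return mean(scores.values())


# Usage (requires the utmos package; "my_dataset" is a placeholder path):
#   import utmos
#   model = utmos.Score()
#   scores = score_directory(model.calculate_wav_file, "my_dataset")
#   print(mean_score(scores))
```

Passing the scorer in as a callable keeps the loop independent of the model, so it can be tried with a stub function before downloading anything.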
This software is licensed under the MIT license.