The Torch-CREPE project re-implements the CREPE pitch estimation model in PyTorch, making it easier to optimize and adapt for real-time voice pitch detection tasks. Re-implementing this deep learning-based system opens up new research possibilities for music signal processing and audio analysis applications.
The PyTorch CREPE implementation uses the Torch and Torchaudio libraries to process and analyze audio signals. The project's core functionality is based on the CREPE model, which estimates fundamental frequencies from audio data.
The model achieves this by classifying 20 ms audio chunks into 350 classes, each representing a pitch value (in cents) within the covered fundamental frequency range.
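For intuition, the mapping between a pitch value in cents and a frequency in Hz is the standard logarithmic one. The snippet below only illustrates that conversion; the 10 Hz reference frequency follows the original CREPE paper and is an assumption about this re-implementation.

import math

# Illustration of the cents <-> Hz mapping behind the pitch classes.
# The 10 Hz reference follows the original CREPE paper; whether this
# re-implementation uses the same reference is an assumption.
F_REF = 10.0  # Hz

def hz_to_cents(frequency_hz: float) -> float:
    return 1200.0 * math.log2(frequency_hz / F_REF)

def cents_to_hz(cents: float) -> float:
    return F_REF * 2.0 ** (cents / 1200.0)

print(hz_to_cents(440.0))    # A4 is ~6551 cents above the 10 Hz reference
print(cents_to_hz(6551.3))   # ~440 Hz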
- Real-time pitch detection: processing runs in real time using the provided script (see the sketch after this list).
- Optimized for instruments and voices: trained on both instrument and voice recordings to cover the widest range of use cases.
- Deep learning-based: fully implemented in PyTorch
- Fast integration with the Torchaudio library
- Trainable on a consumer GPU (full training run completed on an RTX 3080)
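As a rough illustration of the real-time pattern, the sketch below feeds fixed-size chunks of an incoming stream into a rolling buffer and runs pitch prediction on it. This is not the script shipped with the repository: the audio_stream() placeholder, the 16 kHz sample rate, the 10 ms hop, and the 1 s buffer length are all assumptions; the constructor and predict call mirror the demo shown in the next section.

import torch
from crepe.model import crepe

# Assumed model sample rate and a 10 ms hop; both are illustrative values,
# not parameters taken from this repository.
SAMPLE_RATE = 16000
HOP = SAMPLE_RATE // 100

def audio_stream(num_chunks=50):
    # Placeholder audio source: yields 10 ms of silence per step.
    # A real setup would read from a microphone or a network socket instead.
    for _ in range(num_chunks):
        yield torch.zeros(HOP)

model = crepe(model_capacity="tiny", device='cpu')
buffer = torch.zeros(0)

for chunk in audio_stream():
    buffer = torch.cat([buffer, chunk])
    if buffer.numel() >= SAMPLE_RATE:          # keep roughly 1 s of context
        time, frequency, confidence, activation = model.predict(
            audio=buffer,
            sr=SAMPLE_RATE,
        )
        print(frequency[-1], confidence[-1])   # most recent pitch estimate
        buffer = buffer[HOP:]                  # slide the analysis window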
To run the PyTorch CREPE demo locally, you can use the following Python code:
import torchaudio
from crepe.model import crepe
from crepe.utils import load_test_file

# Instantiate the model ('tiny' is currently the only trained capacity).
model = crepe(model_capacity="tiny", device='cpu')

# Load the bundled test file and run pitch prediction on it.
audio, sr = load_test_file()
time, frequency, confidence, activation = model.predict(
    audio=audio,
    sr=sr,
)
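To run the model on your own recordings instead of the bundled test file, you can load any audio file with Torchaudio and pass it to predict in the same way. The file path below is a placeholder, and the mono down-mix is an assumption about the expected input shape; check the documentation linked below if your use case differs.

import torchaudio
from crepe.model import crepe

model = crepe(model_capacity="tiny", device='cpu')

# Load your own file; torchaudio.load returns (channels, samples) and the sample rate.
audio, sr = torchaudio.load("my_recording.wav")   # placeholder path

# Down-mix to mono (assumption: the model expects a single channel).
if audio.shape[0] > 1:
    audio = audio.mean(dim=0, keepdim=True)

time, frequency, confidence, activation = model.predict(
    audio=audio,
    sr=sr,
)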
For detailed documentation of the PyTorch CREPE implementation, including the API and usage guidelines, please refer to [this link].
Larger capacities are still in my training queue, so only the 'tiny' version of CREPE has been trained so far.
This is an open-source project, and contributions are always welcome. If you would like to contribute, you can submit a pull request or create an issue on the project's GitHub page.
This project is licensed under the MIT License. See the LICENSE file for details.