This project demonstrates automatic speech recognition (ASR) using OpenAI Whisper, a state-of-the-art model for transcribing speech into text.
It works with .wav audio files and outputs accurate transcriptions.
- Name: Custom audio file (34210__acclivity__i-am-female.wav)
- Format: .wav
- Usage: Used to test Whisper’s transcription capabilities
- Model Used: Whisper (small)
- Library: openai-whisper
- Task: Speech-to-Text transcription
- Language Support: Multilingual (see the note after this list)
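As a minimal sketch of the two points above (the file name example.wav is only a placeholder), other checkpoint sizes can be swapped in and the language can be pinned instead of auto-detected:

```python
import whisper

# Whisper ships in several checkpoint sizes: "tiny", "base", "small",
# "medium", "large". Larger checkpoints are more accurate but slower.
model = whisper.load_model("small")

# The model is multilingual and auto-detects the language by default;
# passing language= skips detection. "example.wav" is a placeholder path.
result = model.transcribe("example.wav", language="en")
print(result["text"])
```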
Install required Python libraries using:
pip install openai-whisper
pip install torch
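Note: Whisper also relies on the ffmpeg command-line tool to decode audio, so make sure it is installed on your system (for example via your package manager) before running the code below.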
Upload Audio File
Example: 34210__acclivity__i-am-female.wav
Transcribe with Whisper
import whisper
model = whisper.load_model("small")
result = model.transcribe("34210__acclivity__i-am-female.wav")
print("📝 Transcription:", result["text"])
Output
The model will return the spoken text from the audio file.
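Besides the plain transcript, the dictionary returned by transcribe also carries the detected language and timestamped segments. A minimal sketch of inspecting it, reusing the file above:

```python
import whisper

model = whisper.load_model("small")
result = model.transcribe("34210__acclivity__i-am-female.wav")

# Top-level fields: the full transcript plus the language Whisper detected.
print("Detected language:", result["language"])
print("Transcript:", result["text"].strip())

# Each segment has start/end times in seconds and its own piece of text.
for segment in result["segments"]:
    print(f"[{segment['start']:.2f}s -> {segment['end']:.2f}s] {segment['text'].strip()}")
```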
📊 Results
- Sample Audio (34210__acclivity__i-am-female.wav): "I am female"
- Model: Whisper Small
- Accuracy: High for clear recordings
📈 Possible Extensions
- Batch transcription for multiple .wav files (see the sketch after this list)
- Real-time speech recognition using a microphone
- Language detection & translation with Whisper
- Integration into AI assistants or chatbots
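For the batch-transcription idea, a minimal sketch (assuming the recordings sit in a hypothetical audio/ folder):

```python
import glob
import whisper

model = whisper.load_model("small")  # load the model once and reuse it

# "audio/" is a placeholder folder; point the glob at your own .wav files.
for path in sorted(glob.glob("audio/*.wav")):
    result = model.transcribe(path)
    print(f"{path}: {result['text'].strip()}")
```

For the translation extension, the same call accepts task="translate", which makes Whisper output English text for non-English audio.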
Muhammad Rayan Shahid
AI Enthusiast | YouTuber at ByteBrilliance AI
Stay tuned for more projects on AI, ML, DL, and Computer Vision!