RayanAIX/Speech-to-Text-Translator

Speech-to-Text using OpenAI Whisper 🎤➡️📝

This project demonstrates automatic speech recognition (ASR) with OpenAI Whisper, a state-of-the-art model for transcribing speech into text.
It takes a .wav audio file as input and prints the transcribed text.


📁 Dataset / Audio File

  • Name: Custom audio file (34210__acclivity__i-am-female.wav)
  • Format: .wav
  • Usage: Used to test Whisper’s transcription capabilities

🧠 Model

  • Model Used: Whisper (small)
  • Library: openai-whisper
  • Task: Speech-to-Text transcription
  • Language Support: Multilingual

🛠️ Requirements

Install required Python libraries using:

pip install openai-whisper
pip install torch
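
Whisper also relies on the ffmpeg command-line tool to decode audio, which pip does not install. A minimal sketch for checking the environment before running the example is shown here; the helper name check_requirements is hypothetical and not part of this repository.

```python
import importlib.util
import shutil


def check_requirements():
    """Report whether the pieces needed for transcription appear to be available.

    Hypothetical helper for illustration; it only checks that the modules can be
    located and that an ffmpeg executable is on PATH, not that they work.
    """
    return {
        "whisper": importlib.util.find_spec("whisper") is not None,
        "torch": importlib.util.find_spec("torch") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


for name, ok in check_requirements().items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

If ffmpeg is reported as missing, install it with your system package manager before transcribing.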

▶️ How to Run

Upload Audio File

Example: 34210__acclivity__i-am-female.wav

Transcribe with Whisper

import whisper

# Load model
model = whisper.load_model("small")

# Transcribe audio
result = model.transcribe("34210__acclivity__i-am-female.wav")

# Print text
print("📝 Transcription:", result["text"])

Output

model.transcribe() returns a dictionary; the spoken text from the audio file is stored under result["text"].
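
Besides "text", the dictionary returned by model.transcribe() also carries a "language" code and a "segments" list whose entries have "start", "end", and "text" keys. As a rough sketch, the segments could be printed with timestamps like this; format_segments is a hypothetical helper, not something the project defines.

```python
def format_segments(result):
    """Turn a Whisper result dict into lines like '[0.00s -> 1.50s] text'.

    Hypothetical helper for illustration. It assumes each segment has
    'start', 'end' (seconds) and 'text' keys, as Whisper's output does.
    """
    lines = []
    for seg in result.get("segments", []):
        lines.append(
            f"[{seg['start']:.2f}s -> {seg['end']:.2f}s] {seg['text'].strip()}"
        )
    return lines


# Usage, assuming `result` came from model.transcribe(...):
# for line in format_segments(result):
#     print(line)
```

Keeping the formatting separate from the model call makes it easy to reuse on saved results without reloading Whisper.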

📊 Results

  • Sample Audio (34210__acclivity__i-am-female.wav): "I am female"
  • Model: Whisper Small
  • Accuracy: High for clear recordings

📈 Possible Extensions

  • Batch transcription for multiple .wav files
  • Real-time speech recognition using a microphone
  • Language detection & translation with Whisper
  • Integration into AI assistants or chatbots
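
Two of these extensions, batch transcription and language detection with translation, can be sketched with the same transcribe() call used above. The function below is a hypothetical illustration rather than code from this repository. It assumes a folder of .wav files and uses Whisper's task="translate" option, which produces English text, together with the "language" key of the result.

```python
from pathlib import Path


def transcribe_folder(model, folder, task="transcribe"):
    """Run model.transcribe() on every .wav file in `folder`.

    Hypothetical helper for illustration. `model` is expected to be a loaded
    Whisper model; `task` may be "transcribe" or "translate" (to English).
    Returns {filename: {"language": ..., "text": ...}} in sorted filename order.
    """
    results = {}
    for wav_path in sorted(Path(folder).glob("*.wav")):
        result = model.transcribe(str(wav_path), task=task)
        results[wav_path.name] = {
            "language": result.get("language"),  # language code detected by Whisper
            "text": result["text"].strip(),
        }
    return results


# Usage, assuming a folder named "audio" (hypothetical path):
# import whisper
# model = whisper.load_model("small")
# for name, info in transcribe_folder(model, "audio", task="translate").items():
#     print(name, info["language"], info["text"])
```

Loading the model once and passing it in avoids reloading the weights for every file.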


🤖 Author

Muhammad Rayan Shahid
AI Enthusiast | YouTuber at ByteBrilliance AI


⭐ GitHub Repo

Stay tuned for more projects on AI, ML, DL, and Computer Vision!