AudioVision

AudioVision is a web app designed to support individuals with hearing impairments by converting audio files into text and visual representations. Using audio processing and AI, the application transcribes audio, plots the sound wave as a graph, and generates an analysis of the transcribed content.

Access the App

You can access and test the app here.

Screenshot (1)

Features

  • Audio Transcription: Upload audio files in various formats, such as .ogg and .wav, to receive an accurate text transcription of the spoken content.
  • Waveform Visualization: View a graphical representation of sound waves, offering a visual way to understand the characteristics of the audio.
  • Content Analysis: Use AI to interpret the transcription and generate summaries and contextual insights that make the content clearer and more accessible.
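
The transcription feature could look roughly like the sketch below. It assumes the SpeechRecognition package listed under Technologies Used and its free Google Web Speech recognizer (`recognize_google`); the README does not say which recognizer the app actually calls. The names `needs_conversion` and `transcribe` are hypothetical. Note that `sr.AudioFile` reads WAV, AIFF, and FLAC directly, so an `.ogg` upload would first need converting to one of those formats (that conversion step is not shown).

```python
from pathlib import Path

# Typical extensions of the formats SpeechRecognition's AudioFile opens directly.
DIRECTLY_READABLE = {".wav", ".aiff", ".aif", ".flac"}


def needs_conversion(filename: str) -> bool:
    """Return True when the file must be converted (e.g. .ogg) before transcription."""
    return Path(filename).suffix.lower() not in DIRECTLY_READABLE


def transcribe(audio_path: str, language: str = "en-US") -> str:
    """Transcribe a WAV/AIFF/FLAC file; return "" if no speech is recognized."""
    # Deferred import so the helper above works without the package installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_path) as source:
        audio = recognizer.record(source)  # read the entire file into memory
    try:
        # Assumption: the free Google Web Speech API; needs network access.
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # the recognizer could not make out any speech
```

A caller would check `needs_conversion("memo.ogg")` first, convert if needed, and then pass the resulting file path to `transcribe`.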

Technologies Used

  • Flask for the application backend
  • Librosa and Matplotlib for waveform visualizations
  • SpeechRecognition for audio transcription
  • Google Generative AI for intelligent summaries and contextual analysis of audio content
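
As an illustration of how two of the libraries above might be called, here is a minimal sketch, not the app's actual code. It assumes `librosa.display.waveshow` for the plot and the `google-generativeai` Python package for the summary; the function names, the prompt wording, and the model name `gemini-1.5-flash` are all assumptions made for this example.

```python
def render_waveform(audio_path: str, png_path: str) -> str:
    """Plot the waveform of an audio file and save it as a PNG image."""
    # Deferred imports keep this module importable without the heavy packages.
    import matplotlib
    matplotlib.use("Agg")  # headless backend, suitable for a web server
    import matplotlib.pyplot as plt
    import librosa
    import librosa.display

    samples, sample_rate = librosa.load(audio_path, sr=None)  # keep native rate
    fig, ax = plt.subplots(figsize=(10, 3))
    librosa.display.waveshow(samples, sr=sample_rate, ax=ax)
    ax.set(title="Waveform", ylabel="Amplitude")
    fig.tight_layout()
    fig.savefig(png_path)
    plt.close(fig)  # release the figure's memory between requests
    return png_path


def build_prompt(transcription: str) -> str:
    """Build the (hypothetical) instruction sent to the language model."""
    return (
        "Summarize the following transcription in plain language for a reader "
        "who cannot hear the original audio, and note any relevant context:\n\n"
        + transcription.strip()
    )


def summarize(transcription: str, api_key: str,
              model_name: str = "gemini-1.5-flash") -> str:
    """Ask a Gemini model for a summary; the model name is an assumption."""
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(build_prompt(transcription))
    return response.text
```

Keeping the prompt in its own function makes it easy to adjust the wording without touching the API call.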

How to Use

  1. Upload an audio file on the main page.
  2. The app converts the audio to text, displays the transcription, and shows a representative waveform.
  3. Optionally, view an AI-driven analysis of the content for broader understanding of the transcribed audio.
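
On the server side, the steps above might be wired together as in the following sketch. It is an assumption about the structure, not the app's real code: the `/upload` route, the `audio` form field, and the injected `transcribe` and `render_waveform` callables are hypothetical names chosen for illustration.

```python
import os
import tempfile
from pathlib import Path

# Extensions named in this README; the real app may accept more.
SUPPORTED_EXTENSIONS = {".ogg", ".wav"}


def is_supported(filename: str) -> bool:
    """Return True when the upload has one of the supported audio extensions."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def create_app(transcribe, render_waveform):
    """Build a Flask app around two injected callables (both hypothetical).

    transcribe(audio_path) -> str
    render_waveform(audio_path, png_path) -> any
    """
    from flask import Flask, jsonify, request  # deferred: only needed to serve

    app = Flask(__name__)

    @app.post("/upload")  # hypothetical route name
    def upload():
        uploaded = request.files.get("audio")  # hypothetical form field name
        if uploaded is None or not is_supported(uploaded.filename or ""):
            return jsonify(error="Please upload a .ogg or .wav file."), 400

        # Step 1: store the upload so the audio libraries can read it from disk.
        workdir = tempfile.mkdtemp()
        suffix = Path(uploaded.filename).suffix.lower()
        audio_path = os.path.join(workdir, "input" + suffix)
        uploaded.save(audio_path)

        # Step 2: transcribe the audio and draw its waveform.
        text = transcribe(audio_path)
        png_path = os.path.join(workdir, "waveform.png")
        render_waveform(audio_path, png_path)

        # A real app would return a URL for the image rather than a disk path.
        return jsonify(transcription=text, waveform=png_path)

    return app
```

Passing the two callables into `create_app` keeps the route independent of any particular transcription or plotting library, which also makes it straightforward to test with stand-in functions. The optional AI analysis in step 3 could be added as a third injected callable in the same way.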

AudioVision was created to make auditory information more accessible through visual and textual formats, fostering inclusion and accessibility for individuals with hearing loss.