TenserGo-Assessement

Build a Speech-to-Speech LLM Bot using cv2, pyttsx3, speech_recognition, threading, and time.

Speech-to-Speech LLM Bot

This repository contains a Speech-to-Speech LLM bot that leverages computer vision, text-to-speech, and speech recognition to interact with users. The bot captures video and audio from the user, recognizes the spoken words, and responds by repeating the recognized words. The bot operates within a 3-second window to capture and process the input.

Features

Speech Recognition: The bot uses speech_recognition to capture and recognize spoken words from the user.
Text-to-Speech: The recognized words are converted into speech using pyttsx3, enabling the bot to respond verbally.
Computer Vision: The bot utilizes cv2 (OpenCV) for capturing video from the user's camera, adding a visual aspect to the interaction.
Threading: The application runs speech recognition and video capture on separate threads, so neither task blocks the other.
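
The speech recognition feature might look roughly like the sketch below. It is not the repository's `speech.py`: the Google Web Speech backend (`recognize_google`), the 0.5-second noise calibration, and the optional `sr` parameter (present only so the function can be exercised without a microphone) are assumptions made for illustration.

```python
def recognize_speech(sr=None, phrase_time_limit=3):
    """Listen on the default microphone and return the recognized text, or None.

    `sr` is a hypothetical injection point for the speech_recognition module.
    """
    if sr is None:
        # Imported lazily so this file loads even where the package is absent.
        import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against background noise.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        # Stop recording a phrase after `phrase_time_limit` seconds.
        audio = recognizer.listen(source, phrase_time_limit=phrase_time_limit)

    try:
        # Assumption: the Google Web Speech API backend (needs internet access).
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return None  # the speech was unintelligible
    except sr.RequestError:
        return None  # the recognition service could not be reached
```

Returning `None` on failure is a design choice of this sketch; it lets the caller decide what the bot should say when nothing was recognized.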

Technologies Used

Python: The core programming language used.
Streamlit: For creating the UI in app.py.
OpenCV (cv2): For video capture and processing.
pyttsx3: For text-to-speech conversion.
SpeechRecognition (speech_recognition): For capturing and recognizing speech.
Threading: To run multiple tasks concurrently.

Installation

Clone the repository:

```
git clone https://github.com/ph-22416/speech-to-speech-llm-bot.git
cd speech-to-speech-llm-bot
```

Install the required dependencies:

```
pip install speechrecognition pyttsx3 opencv-python pyaudio numpy
pip install streamlit
```
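
After installing, one quick way to confirm that the packages are importable is a short check like the one below. It is a sketch and not part of the repository; the helper name `missing_modules` is hypothetical, and the module names are the import names that correspond to the pip packages above.

```python
import importlib.util

# Import names for the pip packages installed above.
REQUIRED_MODULES = [
    "speech_recognition",
    "pyttsx3",
    "cv2",
    "pyaudio",
    "numpy",
    "streamlit",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```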

Run the application:
```
streamlit run app.py
```

How It Works

UI Design (app.py): The user interface is built with Streamlit, and it triggers the speech recognition function.
Speech Recognition (speech.py): The core logic for recognizing speech is implemented in speech.py. The function recognize_speech() captures the user's spoken words using speech_recognition.
Video Capture and Processing: Using OpenCV (cv2), the bot opens a 3-second video window to capture the user's video input.
Text-to-Speech Response: After recognizing the speech, the bot responds with a verbal confirmation of what it recognized, using pyttsx3 for text-to-speech conversion.
Threading for Concurrency: The bot uses threading to manage speech recognition and video capture concurrently, ensuring that the tasks run smoothly without blocking each other.
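
The concurrency described above can be sketched with the standard `threading` module. This is an illustration under assumptions, not the code in `app.py` or `speech.py`: `run_concurrently`, `capture_video`, the window title, and the dictionary used to collect results are hypothetical, and the two tasks are passed in as plain callables so that stand-ins can be used where no camera or microphone is available.

```python
import threading
import time


def capture_video(seconds=3.0, camera_index=0):
    """Show the webcam feed for about `seconds` seconds; return frames shown."""
    import cv2  # imported lazily so the rest of the sketch runs without OpenCV

    cap = cv2.VideoCapture(camera_index)
    frames = 0
    deadline = time.monotonic() + seconds
    try:
        while time.monotonic() < deadline:
            ok, frame = cap.read()
            if not ok:
                break  # camera unavailable or stream ended
            cv2.imshow("Speech-to-Speech Bot", frame)
            frames += 1
            # waitKey lets the window redraw; pressing 'q' ends capture early.
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
    return frames


def run_concurrently(listen, capture, seconds=3.0):
    """Run `listen(seconds)` and `capture(seconds)` on separate threads."""
    results = {}

    def worker(key, task):
        # Each thread writes to its own key, so no lock is needed here.
        results[key] = task(seconds)

    threads = [
        threading.Thread(target=worker, args=("text", listen)),
        threading.Thread(target=worker, args=("frames", capture)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for both tasks before responding
    return results
```

With real hardware this could be called as `run_concurrently(lambda s: recognize_speech(), capture_video)`, assuming `recognize_speech()` from `speech.py` takes no required arguments. On some platforms GUI calls such as `cv2.imshow` may need to run on the main thread, in which case the video loop would stay on the main thread and only the listening task would be moved to a worker.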

Example

When the user speaks into their microphone, the bot will capture the video and audio, recognize the spoken words, and respond with:

"You said: [recognized words]" and then speaks the response aloud.
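
A minimal sketch of how that reply could be built and spoken with pyttsx3 follows. The function names `build_response` and `speak` are hypothetical, and the fallback sentence for the case where nothing was recognized is an assumption, since the README does not describe that case.

```python
def build_response(recognized):
    """Format the bot's reply from the recognized text (may be None or empty)."""
    text = (recognized or "").strip()
    if not text:
        # Assumed fallback; the README does not describe this case.
        return "Sorry, I did not catch that."
    return f"You said: {text}"


def speak(message):
    """Read `message` aloud through the default text-to-speech voice."""
    import pyttsx3  # imported lazily; requires the pyttsx3 package

    engine = pyttsx3.init()
    engine.say(message)
    engine.runAndWait()  # blocks until the speech has finished


if __name__ == "__main__":
    print(build_response("hello world"))
```

Keeping the formatting separate from the audio output means the reply text can be shown in the Streamlit UI and passed to `speak` without building it twice.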

Contact

For any questions or inquiries, please reach out by email at priyanshichaudhary2015@gmail.com.
