AI Voice and Text Interaction API

This project is an AI model capable of understanding and responding to voice and text inputs related to specific business logic operations. The model is exposed over HTTP: it receives voice data as an array of bytes and returns a voice response as an array of bytes, and it handles text input and output in JSON format.

Table of Contents

  • Overview
  • Functional Requirements
  • Endpoints
  • Setup and Installation
  • Running the Application
  • Docker Setup
  • Testing the Application
  • Tasks In Progress
  • Next Steps
  • Project Structure
  • Contributing
  • License

Overview

Developed to handle business logic operations, this AI model supports voice and text interactions. It converts received voice data to text, understands the user's intent, processes the business logic, and generates an appropriate response.
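
The flow below is a minimal, illustrative sketch of that pipeline in Python. The helper names are placeholders and are not necessarily the functions defined in app/voice.py, app/nlp.py, or app/business_logic.py.

    # Illustrative pipeline sketch -- the helpers are stubs standing in for the
    # real speech, NLP, and business-logic code in the app/ package.

    def speech_to_text(audio: bytes) -> str:
        """Stub: a real implementation would run speech recognition."""
        return "How many items do I have in location X?"

    def answer_query(text: str) -> str:
        """Stub: a real implementation would detect intent and run the business logic."""
        return "Placeholder answer for: " + text

    def text_to_speech(answer: str) -> bytes:
        """Stub: a real implementation would synthesize audio."""
        return answer.encode("utf-8")

    def handle_voice_request(audio: bytes) -> bytes:
        text = speech_to_text(audio)    # voice -> text
        answer = answer_query(text)     # intent + business logic -> reply text
        return text_to_speech(answer)   # text -> voice bytes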

Functional Requirements

Voice Interaction

  • Input: Voice data received as an array of bytes.
  • Output: Voice response as an array of bytes.
  • Process: Convert the received voice data to text, understand the user's intent, and generate an appropriate voice response (see the endpoint sketch after this list).
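
The handler below is a minimal sketch of what a bytes-in / bytes-out voice endpoint can look like in FastAPI; the actual handler in app/main.py may be wired differently, and the echo step stands in for the real AI pipeline. The form-field name file_upload matches the curl example in the Testing section; note that FastAPI needs the python-multipart package for file uploads.

    # Sketch only: echoes the uploaded audio back instead of running the AI pipeline.
    from fastapi import FastAPI, File, UploadFile
    from fastapi.responses import Response

    app = FastAPI()

    @app.post("/api/voice-input")
    async def voice_input(file_upload: UploadFile = File(...)) -> Response:
        audio_in: bytes = await file_upload.read()
        audio_out: bytes = audio_in  # placeholder for speech-to-text -> logic -> text-to-speech
        return Response(content=audio_out, media_type="audio/wav")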

Text Interaction

  • Input: Text data received as JSON.
  • Output: Text response in JSON format.
  • Process: Understand the text input, process the business logic, and generate a relevant text response (see the endpoint sketch after this list).
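
A sketch of the JSON-in / JSON-out endpoint is shown below. The request model mirrors the {"input": "..."} payload used in the Testing section; the model name UserInput and the response shape are assumptions, not the project's documented schema.

    # Sketch only: returns a canned answer instead of running NLP and business logic.
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class UserInput(BaseModel):
        input: str

    @app.post("/api/text-input")
    async def text_input(payload: UserInput) -> dict:
        answer = f"Placeholder answer for: {payload.input}"
        return {"response": answer}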

Endpoints

  • GET /: Welcome message.
  • GET /health: Health check endpoint (both GET routes are sketched after this list).
  • POST /api/voice-input: Accepts voice data and returns a voice response.
  • POST /api/text-input: Accepts text data and returns a text response.
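
The POST handlers are sketched in the Functional Requirements section above; for completeness, the two GET routes can be as simple as the sketch below (the exact response bodies are placeholders, not the project's actual messages).

    # Minimal sketch of the two GET routes.
    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/")
    async def root() -> dict:
        return {"message": "Welcome to the AI Voice and Text Interaction API"}

    @app.get("/health")
    async def health() -> dict:
        return {"status": "ok"}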

Setup and Installation

Prerequisites

  • Python 3.8 or above
  • Docker (optional, for containerization)

Installation

  1. Clone the repository:

    git clone https://github.com/elcaiseri/ai-voice-text-interaction-api.git
    cd ai-voice-text-interaction-api
  2. Create and activate a virtual environment:

    python3 -m venv env
    source env/bin/activate
  3. Install the dependencies:

    pip install -r requirements.txt

Running the Application

  1. Run the FastAPI server:

    uvicorn app.main:app --reload
  2. Access the application at http://localhost:8000.

Docker Setup

  1. Build the Docker image (a sketch of a possible Dockerfile follows these steps):

    docker build -t my-fastapi-app .
  2. Run the Docker container:

    docker run -d --name fastapi-container -p 8000:8000 my-fastapi-app
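
The repository ships its own Dockerfile (see Project Structure); the sketch below shows one plausible shape for it, using the same uvicorn entry point and port as above, and is not necessarily the file in the repo.

    # Hypothetical Dockerfile sketch -- the actual Dockerfile may differ.
    FROM python:3.8-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . .
    EXPOSE 8000
    CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]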

Testing the Application

Voice Input

Use the following curl command to test the /api/voice-input endpoint (ensure you have a sample.wav file):

curl -X POST "http://localhost:8000/api/voice-input" \
    -H "Content-Type: multipart/form-data" \
    -F "file_upload=@sample.wav;type=audio/wav" --output response.wav

Text Input

Use the following curl command to test the /api/text-input endpoint:

curl -X POST "http://localhost:8000/api/text-input" \
    -H "Content-Type: application/json" \
    -d '{"input": "How many items do I have in location X?"}'
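
The same request can be sent from Python, for example with the third-party requests package (not part of the project's requirements); the exact shape of the JSON reply depends on the API and is not assumed here.

    # Optional client-side sketch using the "requests" package.
    import requests

    resp = requests.post(
        "http://localhost:8000/api/text-input",
        json={"input": "How many items do I have in location X?"},
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json())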

Tasks In Progress

  1. Dynamic Data Handling

    • Working on dynamic data responses for queries like item quantities.
  2. Advanced NLP Models

    • Exploring advanced NLP models for better query handling.

Next Steps

  • Complete dynamic data handling.
  • Refine NLP models.
  • Further testing and debugging.
  • Prepare for final deployment.

Project Structure

ai_model
│
├── app
│   ├── main.py
│   ├── nlp.py
│   ├── voice.py
│   ├── business_logic.py
│   ├── responses.py
│   └── models
│       └── user_input.py
│
├── tests
│   ├── test_main.py
│   └── test_business_logic.py
│
├── requirements.txt
├── Dockerfile
└── .dockerignore
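
The tests directory suggests the endpoints are exercised with FastAPI's TestClient; a minimal sketch of what tests/test_main.py could contain is shown below (the real tests may differ).

    # Hypothetical sketch of tests/test_main.py, runnable with pytest.
    from fastapi.testclient import TestClient

    from app.main import app

    client = TestClient(app)

    def test_root_returns_200():
        assert client.get("/").status_code == 200

    def test_health_returns_200():
        assert client.get("/health").status_code == 200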

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

License

This project is licensed under the MIT License. See the LICENSE file for details.
