This project implements a FastAPI-based web service that provides Speech-to-Text (STT) and Text-to-Speech (TTS) functionality. It includes endpoints for converting audio to text, text to audio, and a debug endpoint for generating test audio.
- Speech-to-Text (STT) conversion
- Text-to-Speech (TTS) conversion
- API key authentication
- Debug endpoint for audio generation
- Logging for better traceability and debugging
Requirements:
- Python 3.9+
- Docker (optional, for containerized deployment)
- Clone the repository:
  git clone https://github.com/yourusername/stt-tts-api.git
  cd stt-tts-api
- Install the required packages:
  pip install -r requirements.txt
To start the server, run:
python main.py
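The server will start on http://0.0.0.0:8000. Because 0.0.0.0 binds to all network interfaces, the service is reachable locally at http://localhost:8000.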
- Speech-to-Text (STT)
  - URL: /stt
  - Method: POST
  - Headers:
    - X-App-ID: Your App ID
    - X-App-Key: Your App Key
  - Body: Raw audio file
- Text-to-Speech (TTS)
  - URL: /tts
  - Method: POST
  - Headers:
    - X-App-ID: Your App ID
    - X-App-Key: Your App Key
  - Body: JSON
    { "text": "Your text to convert to speech" }
- Debug Audio
  - URL: /debug_audio
  - Method: GET
  - Headers:
    - X-App-ID: Your App ID
    - X-App-Key: Your App Key
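The three endpoints above can be called from any HTTP client. The following is a minimal sketch using only the Python standard library; it builds the requests without sending them. The base URL, the helper names, and the `application/octet-stream` content type for raw audio are assumptions for illustration, since the source does not specify them.

```python
import json
import urllib.request

# Assumption: the server runs locally on the default port.
BASE_URL = "http://localhost:8000"


def auth_headers(app_id: str, app_key: str) -> dict:
    """Return the two authentication headers every endpoint expects."""
    return {"X-App-ID": app_id, "X-App-Key": app_key}


def build_stt_request(audio: bytes, app_id: str, app_key: str) -> urllib.request.Request:
    """POST /stt with the raw audio bytes as the request body."""
    headers = auth_headers(app_id, app_key)
    # Assumption: the source does not name a required Content-Type for raw audio.
    headers["Content-Type"] = "application/octet-stream"
    return urllib.request.Request(
        f"{BASE_URL}/stt", data=audio, headers=headers, method="POST"
    )


def build_tts_request(text: str, app_id: str, app_key: str) -> urllib.request.Request:
    """POST /tts with a JSON body of the form {"text": "..."}."""
    headers = auth_headers(app_id, app_key)
    headers["Content-Type"] = "application/json"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/tts", data=body, headers=headers, method="POST"
    )


def build_debug_audio_request(app_id: str, app_key: str) -> urllib.request.Request:
    """GET /debug_audio; no request body is needed."""
    return urllib.request.Request(
        f"{BASE_URL}/debug_audio", headers=auth_headers(app_id, app_key), method="GET"
    )
```

Passing any of these objects to `urllib.request.urlopen` sends the request. The response formats are not documented here, so inspect the returned bytes and `Content-Type` header before parsing them.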
To build and run the Docker container:
docker build -t stt-tts-api .
docker run -p 8000:8000 stt-tts-api
Set the following environment variables:
- APP_ID: Your application ID
- APP_KEY: Your application key
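The source does not show how the server validates credentials. As one plausible sketch, assuming the `X-App-ID` and `X-App-Key` headers are compared against the `APP_ID` and `APP_KEY` environment variables, the check could look like the following. The function name `is_authorized` is hypothetical and is not taken from the project.

```python
import hmac
import os
from typing import Mapping


def is_authorized(headers: Mapping[str, str], env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical check: do the request headers match the configured credentials?"""
    expected_id = env.get("APP_ID")
    expected_key = env.get("APP_KEY")
    # Refuse every request if the service itself is not configured.
    if not expected_id or not expected_key:
        return False
    supplied_id = headers.get("X-App-ID", "")
    supplied_key = headers.get("X-App-Key", "")
    # compare_digest avoids leaking how many leading characters matched.
    id_ok = hmac.compare_digest(supplied_id.encode(), expected_id.encode())
    key_ok = hmac.compare_digest(supplied_key.encode(), expected_key.encode())
    return id_ok and key_ok
```

Rejecting requests when the variables are unset is a design choice in this sketch: a missing configuration then fails closed rather than accepting empty credentials.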
Logs are written to the console at INFO level. Check them when debugging or monitoring the application's behavior.
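For reference, console logging at INFO level can be set up with the standard `logging` module roughly as follows. This is a sketch, not the project's actual configuration: the logger name `stt_tts_api`, the helper `configure_logging`, and the message format are assumptions.

```python
import logging
import sys


def configure_logging(name: str = "stt_tts_api") -> logging.Logger:
    """Return a logger that writes INFO and above to the console (stderr)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach the handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = configure_logging()
logger.info("Service starting")       # emitted
logger.debug("Verbose detail here")   # suppressed at INFO level
```

Lowering the level to `logging.DEBUG` in such a setup would surface more detail while troubleshooting.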