FastAPI backend for ESP32 audio interaction. It accepts audio input from ESP32 devices, processes it via LLM, and returns synthesized audio responses.
```
git clone https://github.com/Sathya4683/esp32Chatbot-server
cd esp32Chatbot-server
```

Create and activate a virtual environment:

```
python3 -m venv venv
source venv/bin/activate
```

On Windows:

```
python -m venv venv
venv\Scripts\activate
```

Install dependencies and start the server:

```
pip install -r requirements.txt
python3 main.py
# or (on Windows)
python main.py
```

The server now runs at http://127.0.0.1:8000. POST recorded audio from the microcontroller's I2S microphone (e.g. INMP441) to the http://127.0.0.1:8000/convert route, and the server returns the LLM response as an audio file (.wav or .mp3, configurable as needed) that can be played through an I2S DAC module such as the MAX98357A. In essence, this is an HTTP server implementation for sending requests (prompts) to the LLM and receiving responses.
If you have Docker and Docker Compose installed, you can start the backend without manually installing dependencies:
```
docker-compose up --build
```
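For reference, a minimal compose file for this kind of setup might look like the sketch below. The service names, port mapping, and Redis dependency are assumptions inferred from the tech stack listed later, not the repository's actual `docker-compose.yml`:

```yaml
# Hypothetical sketch -- the repository's actual docker-compose.yml may differ.
services:
  backend:
    build: .
    ports:
      - "8000:8000"          # FastAPI served on http://127.0.0.1:8000
    depends_on:
      - redis
  redis:
    image: redis:7-alpine    # short-term conversation memory
```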
- Health check endpoint.
- Returns `{ "status": "healthy" }`.
- Accepts audio file (WAV/MP3).
- Transcribes speech to text.
- Sends to LLM for response generation.
- Converts response text to audio.
- Returns audio as a streaming WAV/MP3 response.
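The steps above can be sketched as a simple pipeline. The stage functions here are placeholders, not the repository's actual speech-recognition, LLM, or TTS calls:

```python
def transcribe(audio: bytes) -> str:
    # Placeholder: a real implementation would call a speech-to-text engine.
    return "what is the weather today"

def ask_llm(prompt: str) -> str:
    # Placeholder: a real implementation would query the LLM
    # (e.g. Gemini via LangChain, per the tech stack below).
    return f"LLM response to: {prompt}"

def synthesize(text: str) -> bytes:
    # Placeholder: a real implementation would run text-to-speech and
    # encode the result as WAV/MP3 bytes.
    return text.encode("utf-8")

def convert(audio: bytes) -> bytes:
    """Audio in -> transcript -> LLM reply -> audio out."""
    transcript = transcribe(audio)
    reply = ask_llm(transcript)
    return synthesize(reply)
```

In the real endpoint, the final bytes would be wrapped in a streaming WAV/MP3 response rather than returned directly.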
You can simulate an ESP32 audio POST request using the following script:
`request/python_simulation/post_simulation.py`
This script is useful for testing the POST request and response-retrieval flow without actual hardware.
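A similar simulation can be done with the standard library alone. In the sketch below, the multipart field name `file` and the endpoint shape are assumptions, so verify them against `post_simulation.py` and the server code:

```python
import io
import urllib.request
import uuid

def build_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body by hand -- roughly what an
    ESP32 HTTP client has to do when uploading a recording."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: audio/wav\r\n\r\n")
    body.write(payload)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"

# The field name "file" is an assumption -- check the actual endpoint.
data, content_type = build_multipart("file", "recording.wav", b"RIFF....WAVE")
req = urllib.request.Request(
    "http://127.0.0.1:8000/convert",
    data=data,
    headers={"Content-Type": content_type},
    method="POST",
)
# urllib.request.urlopen(req) would send the request once the server is running.
```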
- FastAPI
- Redis (for conversation memory/short term)
- ChromaDB (for personal information recall/long term)
- Speech Recognition & Text-to-Speech
- ESP32 HTTP client integration
- LangChain and Gemini AI for initial implementation testing
- SQLite (storage of ChromaDB personal information recall)
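To illustrate the short-term memory idea, here is a dict-backed sketch that mimics the Redis list commands (`LPUSH`/`LTRIM`/`LRANGE`) a real implementation might use. The session key scheme and the history cap are assumptions for illustration:

```python
from collections import defaultdict

class ShortTermMemory:
    """Dict-backed stand-in for Redis conversation memory.

    A real backend would use redis-py's lpush/ltrim/lrange so the
    history survives server restarts and is shared across workers.
    """

    def __init__(self, max_turns: int = 10):
        self._store: dict[str, list[str]] = defaultdict(list)
        self._max_turns = max_turns

    def remember(self, session_id: str, message: str) -> None:
        # Analogous to LPUSH + LTRIM: prepend, then keep only the newest turns.
        turns = self._store[session_id]
        turns.insert(0, message)
        del turns[self._max_turns:]

    def recall(self, session_id: str) -> list[str]:
        # Analogous to LRANGE key 0 -1: newest turn first.
        return list(self._store[session_id])
```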
