- TTS: Picovoice Orca (pvorca)
- Object Detection: YOLOv8 by Ultralytics
- Vision Annotation: Roboflow
- Speech to Text: Whisper (OpenAI)
- LLM: Google Gemini Pro
- Microcontroller: Arduino Uno
- UI: Gradio
- 2D Simulator: Matplotlib
- Code Language: Python 3.10
- Hotword Detection: Porcupine
- VAD: Cobra
If you have the budget for GPUs (not recommended), you can run the whole stack completely offline.
- You can fine-tune Mistral 7B on your own data.
- Model: Mistral-7B-v0.1
- Tutorial: Mistral-7B Tutorial
- My model (LoRA adapter, a flop): Loara Chat Arm
- You can run the same Whisper model locally with minor changes.
- For text-to-speech, you can use Bark, which I tried for this project. Bark can even clone your voice.
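
As a rough illustration of the local Whisper option above, here is a minimal sketch. It assumes the open-source `openai-whisper` package (plus `ffmpeg` on the system path), which may differ from how this project actually calls Whisper. The helper names `load_local_whisper` and `transcribe_file`, the `"base"` model size, and the file name `command.wav` are hypothetical choices made for this example.

```python
from typing import Any


def load_local_whisper(model_size: str = "base") -> Any:
    """Load an open-source Whisper checkpoint (weights download on first use)."""
    # Assumption: `pip install openai-whisper`. Imported lazily so that
    # transcribe_file below can be exercised without the package installed.
    import whisper

    return whisper.load_model(model_size)


def transcribe_file(audio_path: str, model: Any, use_fp16: bool = False) -> str:
    """Transcribe one audio file and return the text with whitespace trimmed."""
    # fp16=False keeps inference in full precision, which is what CPU-only
    # machines need; pass use_fp16=True when running on a GPU.
    result = model.transcribe(audio_path, fp16=use_fp16)
    return result["text"].strip()
```

With the package installed, `transcribe_file("command.wav", load_local_whisper())` would return the spoken command as plain text, ready to hand to the LLM step. Larger checkpoints such as `"small"` or `"medium"` generally trade speed for accuracy, so the right size depends on the hardware available.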