An AI-powered multilingual voice assistant for medical consultations. DocJarvis conducts initial patient assessments through voice interaction, supports 10 languages (English and 9 regional Indian languages), and generates consultation summaries.
Disclaimer: This is an AI-assisted tool for educational purposes only. Always consult qualified healthcare professionals for medical advice.
- Voice-based interaction: Speech-to-text and text-to-speech capabilities
- Multilingual support: English plus 9 regional Indian languages
- AI-powered diagnosis: Uses Google's Gemini Pro for intelligent questioning
- Prescription generation: Automated consultation summary documents
- User-friendly interface: Gradio-based web UI
| Language | Code | Language | Code |
|---|---|---|---|
| English | en | Malayalam | ml |
| Bengali | bn | Marathi | mr |
| Gujarati | gu | Tamil | ta |
| Hindi | hi | Telugu | te |
| Kannada | kn | Urdu | ur |
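The table above can be sketched as a lookup in Python. This is a minimal illustration, not the project's actual code: the README does reference a `Language` enum in `src.config.settings`, but its real members and values may differ, and the `language_from_code` helper is hypothetical.

```python
from enum import Enum


class Language(Enum):
    """Hypothetical enum mirroring the supported-languages table."""

    ENGLISH = "en"
    BENGALI = "bn"
    GUJARATI = "gu"
    HINDI = "hi"
    KANNADA = "kn"
    MALAYALAM = "ml"
    MARATHI = "mr"
    TAMIL = "ta"
    TELUGU = "te"
    URDU = "ur"


def language_from_code(code: str) -> Language:
    """Return the Language for a two-letter code, ignoring case and spaces."""
    try:
        return Language(code.strip().lower())
    except ValueError:
        # Re-raise with a clearer message for unsupported codes.
        raise ValueError(f"Unsupported language code: {code!r}") from None
```

Using an enum rather than bare strings means an unsupported code fails early, before any speech or translation service is called.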
Prerequisites
- Python 3.10+
- Working microphone
- Internet connection
- Google AI Studio API key
Installation
- Clone the repository
git clone https://github.com/singhdivyank/voice-assistant.git
cd voice-assistant
- Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Configure environment
cp .env.example .env
- Run the application
python -m src.app
PyAudio installation on Mac:
brew install portaudio
python3 -m pip install pyaudio
PyAudio installation on Linux:
sudo apt-get install python3-pyaudio portaudio19-dev
pip install pyaudio
For audio playback on Linux, install one of these players:
# Recommended
sudo apt-get install mpg123
# Alternatives
sudo apt-get install mpg321
sudo apt-get install ffmpeg # provides ffplay
PyAudio installation on Windows: Audio playback uses PowerShell's built-in Media.SoundPlayer (no additional installation required).
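To illustrate how an application might choose among the Linux players listed above, here is a minimal sketch that probes `PATH` in the recommended order. It is an assumption about how the selection could work, not a description of what `src/services/speech.py` does; `pick_linux_player` and `playback_command` are hypothetical names.

```python
import shutil
from typing import Callable, Optional, Sequence

# Candidate players, in the order recommended above.
LINUX_PLAYERS: Sequence[str] = ("mpg123", "mpg321", "ffplay")


def pick_linux_player(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first player found on PATH, or None if none is installed."""
    for name in LINUX_PLAYERS:
        if which(name) is not None:
            return name
    return None


def playback_command(player: str, path: str) -> list[str]:
    """Build the argument list that plays one audio file with the given player."""
    if player == "ffplay":
        # ffplay opens a window and keeps running unless told otherwise.
        return ["ffplay", "-nodisp", "-autoexit", path]
    return [player, path]
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use, `pick_linux_player()` checks the real `PATH`, and the resulting list can be passed to `subprocess.run`.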
Create a .env file with the following variables:
GOOGLE_API_KEY=your_gemini_api_key_here
# Optional (defaults shown)
GRADIO_SERVER_NAME=0.0.0.0
GRADIO_SERVER_PORT=7860
- Open the web interface (default: http://localhost:7860)
- Select your preferred language
- Enter your gender and age
- Click Submit to start the consultation
- Speak your symptoms when prompted
- Answer follow-up questions verbally
- Receive your consultation summary
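The default address in the first step follows from the `GRADIO_SERVER_NAME` and `GRADIO_SERVER_PORT` settings. As a rough sketch of how these variables might be read with their documented defaults, the snippet below uses a hypothetical `AppSettings` dataclass and `load_settings` function; the project's real `settings.py` may be organised differently. It reads from the process environment only and assumes the `.env` file is loaded separately (for example by a dotenv loader).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppSettings:
    """Hypothetical container for the variables documented in .env."""

    google_api_key: str
    server_name: str = "0.0.0.0"
    server_port: int = 7860


def load_settings(env: Mapping[str, str] = os.environ) -> AppSettings:
    """Read settings from a mapping, applying the documented defaults."""
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        # The API key is the only required variable.
        raise RuntimeError("GOOGLE_API_KEY is not set")
    return AppSettings(
        google_api_key=api_key,
        server_name=env.get("GRADIO_SERVER_NAME", "0.0.0.0"),
        server_port=int(env.get("GRADIO_SERVER_PORT", "7860")),
    )
```

Failing at startup when the key is missing gives a clearer error than a rejected request to the model partway through a consultation.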
voice-assistant/
├── src/
│ ├── config/ # Configuration and settings
│ │ └── settings.py # App config, prompts, enums
│ ├── core/ # Core business logic
│ │ ├── diagnosis.py # LLM-powered diagnosis
│ │ └── prescription.py
│ ├── services/ # External service integrations
│ │ ├── speech.py # STT/TTS services
│ │ └── translation.py
│ ├── utils/ # Utilities and helpers
│ │ ├── exceptions.py
│ │ └── file_handler.py
│ └── app.py # Main application
├── requirements.txt
└── README.md
from src.core.diagnosis import DiagnosisService, PatientInfo
from src.config.settings import Gender
service = DiagnosisService()
patient = PatientInfo(age=30, gender=Gender.MALE)
# Create session with diagnostic questions
session = service.create_session(patient, "I have a headache")
# Add patient responses
service.add_response(session, 0, "It started yesterday")
# Get recommendations
recommendations = service.complete_session(session)
from src.services.translation import TranslationService
from src.config.settings import Language
translator = TranslationService(Language.HINDI)
# Translate for LLM (to English)
english_text = translator.to_english("सिरदर्द है")
# Translate for user
hindi_text = translator.to_user_language("Take rest")
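The two translation calls above suggest a round trip: user language to English for the LLM, then English back to the user's language. The sketch below shows that flow in a self-contained form. Only the method names `to_english` and `to_user_language` come from the example above; the `Translator` protocol, the `answer_in_user_language` helper, and the stand-in `UppercaseTranslator` are hypothetical and perform no real translation.

```python
from typing import Callable, Protocol


class Translator(Protocol):
    """Structural type for any object offering the two translation methods."""

    def to_english(self, text: str) -> str: ...

    def to_user_language(self, text: str) -> str: ...


def answer_in_user_language(
    translator: Translator,
    ask_llm: Callable[[str], str],
    user_text: str,
) -> str:
    """Translate input to English, query the model, translate the reply back."""
    english_question = translator.to_english(user_text)
    english_answer = ask_llm(english_question)
    return translator.to_user_language(english_answer)


class UppercaseTranslator:
    """Stand-in for demonstration: lower case plays 'English', upper case the user's language."""

    def to_english(self, text: str) -> str:
        return text.lower()

    def to_user_language(self, text: str) -> str:
        return text.upper()
```

In the real application, `translator` would be a `TranslationService` instance and `ask_llm` a call into the diagnosis service; keeping the model as a plain callable lets the flow be exercised without network access.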