diff --git a/Docs/getting-started.md b/Docs/getting-started.md index 6483f00f..3b4903de 100644 --- a/Docs/getting-started.md +++ b/Docs/getting-started.md @@ -342,7 +342,7 @@ curl -X POST "http://localhost:8000/api/process-audio-files" \ **Implementation**: - **Memory System**: `src/advanced_omi_backend/memory/memory_service.py` + `src/advanced_omi_backend/controllers/memory_controller.py` -- **Configuration**: `memory_config.yaml` + `src/advanced_omi_backend/memory_config_loader.py` +- **Configuration**: memory settings in `config.yml` (memory section) ### Authentication & Security - **Email Authentication**: Login with email and password @@ -541,10 +541,10 @@ OPENMEMORY_MCP_URL=http://host.docker.internal:8765 > 🎯 **New to memory configuration?** Read our [Memory Configuration Guide](./memory-configuration-guide.md) for a step-by-step setup guide with examples. -The system uses **centralized configuration** via `memory_config.yaml` for all memory extraction settings. All hardcoded values have been removed from the code to ensure consistent, configurable behavior. +The system uses **centralized configuration** via `config.yml` for all models (LLM, embeddings, vector store) and memory extraction settings. ### Configuration File Location -- **Path**: `backends/advanced-backend/memory_config.yaml` +- **Path**: repository `config.yml` (override with `CONFIG_FILE` env var) - **Hot-reload**: Changes are applied on next processing cycle (no restart required) - **Fallback**: If file is missing, system uses safe defaults with environment variables @@ -613,7 +613,7 @@ If you experience JSON parsing errors in fact extraction: 2. 
**Enable fact extraction** with reliable JSON output: ```yaml - # In memory_config.yaml + # In config.yml (memory section) fact_extraction: enabled: true # Safe to enable with GPT-4o ``` @@ -727,5 +727,5 @@ curl -H "Authorization: Bearer $ADMIN_TOKEN" \ - **Connect audio clients** using the WebSocket API - **Explore the dashboard** to manage conversations and users - **Review the user data architecture** for understanding data organization -- **Customize memory extraction** by editing `memory_config.yaml` -- **Monitor processing performance** using debug API endpoints \ No newline at end of file +- **Customize memory extraction** by editing the `memory` section in `config.yml` +- **Monitor processing performance** using debug API endpoints diff --git a/README.md b/README.md index 34027891..920d2433 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ Self-hostable AI system that captures audio/video data from OMI devices and othe ## Quick Start → [Get Started](quickstart.md) -Clone, run setup wizard, start services, access at http://localhost:5173 +Clone, customize config.yml, start services, access at http://localhost:5173 ## Screenshots diff --git a/backends/advanced/Dockerfile b/backends/advanced/Dockerfile index c8f54ac7..352bcfe9 100644 --- a/backends/advanced/Dockerfile +++ b/backends/advanced/Dockerfile @@ -35,7 +35,7 @@ COPY . . # Copy configuration files if they exist, otherwise they will be created from templates at runtime # The files are expected to exist, but we handle the case where they don't gracefully -COPY memory_config.yaml* ./ + COPY diarization_config.json* ./ diff --git a/backends/advanced/Dockerfile.k8s b/backends/advanced/Dockerfile.k8s index 097f5d7f..b746752a 100644 --- a/backends/advanced/Dockerfile.k8s +++ b/backends/advanced/Dockerfile.k8s @@ -34,7 +34,7 @@ RUN uv sync --extra deepgram COPY . . 
# Copy memory config (created by init.sh from template) -COPY memory_config.yaml ./ + # Copy and make K8s startup scripts executable COPY start-k8s.sh start-workers.sh ./ diff --git a/backends/advanced/Docs/README.md b/backends/advanced/Docs/README.md index 5943c5e5..f3feb59d 100644 --- a/backends/advanced/Docs/README.md +++ b/backends/advanced/Docs/README.md @@ -13,7 +13,7 @@ Welcome to chronicle! This guide provides the optimal reading sequence to unders - What the system does (voice → memories) - Key features and capabilities - Basic setup and configuration -- **Code References**: `src/advanced_omi_backend/main.py`, `memory_config.yaml`, `docker-compose.yml` +- **Code References**: `src/advanced_omi_backend/main.py`, `config.yml`, `docker-compose.yml` ### 2. **[System Architecture](./architecture.md)** **Read second** - Complete technical architecture with diagrams @@ -70,12 +70,12 @@ Welcome to chronicle! This guide provides the optimal reading sequence to unders ## 🔍 **Configuration & Customization** -### 6. **Configuration File** → `../memory_config.yaml` +### 6. **Configuration File** → `../config.yml` **Central configuration for all extraction** - Memory extraction settings and prompts - Quality control and debug settings - **Code References**: - - `src/advanced_omi_backend/memory_config_loader.py` (config loading) + - `src/advanced_omi_backend/model_registry.py` (config loading) - `src/advanced_omi_backend/memory/memory_service.py` (config usage) --- @@ -86,11 +86,11 @@ Welcome to chronicle! This guide provides the optimal reading sequence to unders 1. [quickstart.md](./quickstart.md) - System overview 2. [architecture.md](./architecture.md) - Technical architecture 3. `src/advanced_omi_backend/main.py` - Core imports and setup -4. `memory_config.yaml` - Configuration overview +4. `config.yml` - Configuration overview ### **"I want to work on memory extraction"** 1. [memories.md](./memories.md) - Memory system details -2. 
`../memory_config.yaml` - Memory configuration +2. `../config.yml` - Models and memory configuration 3. `src/advanced_omi_backend/memory/memory_service.py` - Implementation 4. `src/advanced_omi_backend/controllers/memory_controller.py` - Processing triggers @@ -128,9 +128,9 @@ backends/advanced-backend/ │ ├── controllers/ # Business logic controllers │ ├── memory/ │ │ └── memory_service.py # Memory system (Mem0) -│ └── memory_config_loader.py # Configuration loading +│ └── model_registry.py # Configuration loading │ -├── memory_config.yaml # 📋 Central configuration +├── config.yml # 📋 Central configuration ├── MEMORY_DEBUG_IMPLEMENTATION.md # Debug system details ``` @@ -147,8 +147,8 @@ backends/advanced-backend/ - **Memories**: `src/advanced_omi_backend/memory/memory_service.py` → Mem0 → Qdrant ### **Configuration** -- **Loading**: `src/advanced_omi_backend/memory_config_loader.py` -- **File**: `memory_config.yaml` +- **Loading**: `src/advanced_omi_backend/model_registry.py` +- **File**: `config.yml` - **Usage**: `src/advanced_omi_backend/memory/memory_service.py` ### **Authentication** @@ -162,7 +162,7 @@ backends/advanced-backend/ 1. **Follow the references**: Each doc links to specific code files and line numbers 2. **Use the debug API**: `GET /api/debug/memory/stats` shows live system status -3. **Check configuration first**: Many behaviors are controlled by `memory_config.yaml` +3. **Check configuration first**: Many behaviors are controlled by `config.yml` 4. **Understand the memory pipeline**: Memories (end-of-conversation) 5. **Test with curl**: All API endpoints have curl examples in the docs @@ -175,23 +175,23 @@ backends/advanced-backend/ 1. **Set up the system**: Follow [quickstart.md](./quickstart.md) to get everything running 2. **Test the API**: Use the curl examples in the documentation to test endpoints 3. **Explore the debug system**: Check `GET /api/debug/memory/stats` to see live data -4. 
**Modify configuration**: Edit `memory_config.yaml` to see how it affects extraction +4. **Modify configuration**: Edit `config.yml` (memory section) to see how it affects extraction 5. **Read the code**: Start with `src/advanced_omi_backend/main.py` and follow the references in each doc ### **Contributing Guidelines** - **Add code references**: When updating docs, include file paths and line numbers - **Test your changes**: Use the debug API to verify your modifications work -- **Update configuration**: Add new settings to `memory_config.yaml` when needed +- **Update configuration**: Add new settings to `config.yml` when needed - **Follow the architecture**: Keep memories in their respective services ### **Getting Help** - **Debug API**: `GET /api/debug/memory/*` endpoints show real-time system status -- **Configuration**: Check `memory_config.yaml` for behavior controls +- **Configuration**: Check `config.yml` for behavior controls - **Logs**: Check Docker logs with `docker compose logs friend-backend` - **Documentation**: Each doc file links to relevant code sections --- -This documentation structure ensures you understand both the **big picture** and **implementation details** in a logical progression! \ No newline at end of file +This documentation structure ensures you understand both the **big picture** and **implementation details** in a logical progression! diff --git a/backends/advanced/Docs/contribution.md b/backends/advanced/Docs/contribution.md index dd645eca..a5766828 100644 --- a/backends/advanced/Docs/contribution.md +++ b/backends/advanced/Docs/contribution.md @@ -1,12 +1,12 @@ 1. Docs/quickstart.md (15 min) 2. Docs/architecture.md (20 min) 3. main.py - just the imports and WebSocket sections (15 min) - 4. memory_config.yaml (10 min) + 4. config.yml (memory section) (10 min) 🔧 "I want to work on memory extraction" 1. Docs/quickstart.md → Docs/memories.md - 2. memory_config.yaml (memory_extraction section) + 2. 
config.yml (memory.extraction section) 3. main.py lines 1047-1065 (trigger) 4. main.py lines 1163-1195 (processing) 5. src/memory/memory_service.py @@ -32,4 +32,4 @@ Data Flow Audio → Transcription → Dual Processing - └─ Memory Pipeline (end-of-conversation) \ No newline at end of file + └─ Memory Pipeline (end-of-conversation) diff --git a/backends/advanced/Docs/memories.md b/backends/advanced/Docs/memories.md index b2887dc9..38eed697 100644 --- a/backends/advanced/Docs/memories.md +++ b/backends/advanced/Docs/memories.md @@ -10,7 +10,7 @@ This document explains how to configure and customize the memory service in the - **Repository Layer**: `src/advanced_omi_backend/conversation_repository.py` (clean data access) - **Processing Manager**: `src/advanced_omi_backend/processors.py` (MemoryProcessor class) - **Conversation Management**: `src/advanced_omi_backend/conversation_manager.py` (lifecycle coordination) -- **Configuration**: `memory_config.yaml` + `src/memory_config_loader.py` +- **Configuration**: `config.yml` (memory section) + `src/model_registry.py` ## Overview @@ -180,7 +180,7 @@ OPENAI_MODEL=gpt-5-mini # Recommended for reliable JSON output # OPENAI_MODEL=gpt-3.5-turbo # Budget option ``` -Or configure via `memory_config.yaml`: +Or configure via `config.yml` (memory block): ```yaml memory_extraction: @@ -674,4 +674,4 @@ The new architecture ensures proper user isolation and simplifies admin debuggin Both load all user memories and view all memories are helpful Both views complement each other - the debug view helps you understand how the system is working, while the clean view -helps you understand what content is being stored. \ No newline at end of file +helps you understand what content is being stored. 
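Reviewer note: the documentation hunks above say that `config.yml` is read from the repository root, that the path can be overridden with the `CONFIG_FILE` environment variable, and that the system falls back to safe defaults when the file is missing. A minimal sketch of that lookup order follows. It is not the code in `model_registry.py`: the names `SAFE_DEFAULTS`, `resolve_config_path`, and `load_config` are hypothetical, the default values are placeholders, and PyYAML is an assumed dependency.

```python
import copy
import os
from pathlib import Path

# Placeholder fallback values (assumption); the real defaults live in the backend.
SAFE_DEFAULTS = {"defaults": {}, "memory": {}}


def resolve_config_path(env=None, default="config.yml"):
    """Return the config path, honouring a CONFIG_FILE override when it is set."""
    env = os.environ if env is None else env
    return Path(env.get("CONFIG_FILE") or default)


def load_config(env=None, parse=None):
    """Load the config file, or return a copy of the defaults if it is missing.

    `parse` is any callable that turns the file text into a dict. When it is
    omitted, PyYAML's `yaml.safe_load` is used (assumed to be installed).
    """
    path = resolve_config_path(env)
    if not path.is_file():
        return copy.deepcopy(SAFE_DEFAULTS)
    if parse is None:
        import yaml  # imported lazily so the fallback path needs no extra package

        parse = yaml.safe_load
    # An empty file parses to None, so fall back to the defaults in that case too.
    return parse(path.read_text(encoding="utf-8")) or copy.deepcopy(SAFE_DEFAULTS)
```

Calling `load_config()` again on each processing cycle would be one way to get the hot-reload behaviour the docs describe, though the diff does not show how the backend actually does it.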
diff --git a/backends/advanced/Docs/memory-configuration-guide.md b/backends/advanced/Docs/memory-configuration-guide.md index 1f564564..9a694ac5 100644 --- a/backends/advanced/Docs/memory-configuration-guide.md +++ b/backends/advanced/Docs/memory-configuration-guide.md @@ -6,10 +6,10 @@ This guide helps you set up and configure the memory system for the Friend Advan 1. **Copy the template configuration**: ```bash -cp memory_config.yaml.template memory_config.yaml +# No template to copy: edit the `memory` section of `config.yml`. ``` -2. **Edit memory_config.yaml** with your preferred settings: +2. **Edit `config.yml`** with your preferred settings in the `memory` section: ```yaml memory: provider: "mem0" # or "basic" for simpler setup @@ -127,6 +127,6 @@ memory: ## Next Steps -- Configure action items detection in `memory_config.yaml` +- Configure action items detection in `config.yml` (memory.extraction) - Set up custom prompt templates for your use case -- Monitor memory processing in the debug dashboard \ No newline at end of file +- Monitor memory processing in the debug dashboard diff --git a/backends/advanced/Docs/quickstart.md b/backends/advanced/Docs/quickstart.md index fc5a77b7..b6528c71 100644 --- a/backends/advanced/Docs/quickstart.md +++ b/backends/advanced/Docs/quickstart.md @@ -340,7 +340,7 @@ curl -X POST "http://localhost:8000/api/audio/upload" \ **Implementation**: - **Memory System**: `src/advanced_omi_backend/memory/memory_service.py` + `src/advanced_omi_backend/controllers/memory_controller.py` -- **Configuration**: `memory_config.yaml` + `src/advanced_omi_backend/memory_config_loader.py` +- **Configuration**: `config.yml` (memory + models) in repo root ### Authentication & Security - **Email Authentication**: Login with email and password @@ -539,10 +539,10 @@ OPENMEMORY_MCP_URL=http://host.docker.internal:8765 > 🎯 **New to memory configuration?** Read our [Memory Configuration Guide](./memory-configuration-guide.md) for a step-by-step setup guide with examples. 
-The system uses **centralized configuration** via `memory_config.yaml` for all memory extraction settings. All hardcoded values have been removed from the code to ensure consistent, configurable behavior. +The system uses **centralized configuration** via `config.yml` for all memory extraction and model settings. ### Configuration File Location -- **Path**: `backends/advanced-backend/memory_config.yaml` +- **Path**: `config.yml` in repo root - **Hot-reload**: Changes are applied on next processing cycle (no restart required) - **Fallback**: If file is missing, system uses safe defaults with environment variables @@ -611,7 +611,7 @@ If you experience JSON parsing errors in fact extraction: 2. **Enable fact extraction** with reliable JSON output: ```yaml - # In memory_config.yaml + # In config.yml (memory section) fact_extraction: enabled: true # Safe to enable with GPT-4o ``` @@ -725,5 +725,5 @@ curl -H "Authorization: Bearer $ADMIN_TOKEN" \ - **Connect audio clients** using the WebSocket API - **Explore the dashboard** to manage conversations and users - **Review the user data architecture** for understanding data organization -- **Customize memory extraction** by editing `memory_config.yaml` -- **Monitor processing performance** using debug API endpoints \ No newline at end of file +- **Customize memory extraction** by editing the `memory` section in `config.yml` +- **Monitor processing performance** using debug API endpoints diff --git a/backends/advanced/SETUP_SCRIPTS.md b/backends/advanced/SETUP_SCRIPTS.md index c253e429..8fbc0ab2 100644 --- a/backends/advanced/SETUP_SCRIPTS.md +++ b/backends/advanced/SETUP_SCRIPTS.md @@ -6,7 +6,7 @@ This document explains the different setup scripts available in Friend-Lite and | Script | Purpose | When to Use | |--------|---------|-------------| -| `init.py` | **Main interactive setup wizard** | **Recommended for all users** - First time setup with guided configuration (located at repo root) | +| `init.py` | **Main 
interactive setup wizard** | **Recommended for all users** - First time setup with guided configuration (located at repo root). Memory now configured in `config.yml`. | | `setup-https.sh` | HTTPS certificate generation | **Optional** - When you need secure connections for microphone access | ## Main Setup Script: `init.py` @@ -157,4 +157,4 @@ Setup scripts are located as follows: ✅ **Provider selection**: Choose best services for your needs ✅ **Complete configuration**: Creates working .env with all settings ✅ **Next steps guidance**: Clear instructions for starting services -✅ **No manual editing**: Reduces errors from manual .env editing \ No newline at end of file +✅ **No manual editing**: Reduces errors from manual .env editing diff --git a/backends/advanced/docker-compose-test.yml b/backends/advanced/docker-compose-test.yml index f72ca54d..6c0ee447 100644 --- a/backends/advanced/docker-compose-test.yml +++ b/backends/advanced/docker-compose-test.yml @@ -24,10 +24,7 @@ services: # Import API keys from environment - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY} - OPENAI_API_KEY=${OPENAI_API_KEY} - - OPENAI_BASE_URL=https://api.openai.com/v1 - # LLM provider configuration (required for memory service) - - LLM_PROVIDER=${LLM_PROVIDER:-openai} - - OPENAI_MODEL=${OPENAI_MODEL:-gpt-4o-mini} + - GROQ_API_KEY=${GROQ_API_KEY} # Authentication (test-specific) - AUTH_SECRET_KEY=test-jwt-signing-key-for-integration-tests - ADMIN_PASSWORD=test-admin-password-123 @@ -140,8 +137,7 @@ services: - DEBUG_DIR=/app/debug_dir - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY} - OPENAI_API_KEY=${OPENAI_API_KEY} - - LLM_PROVIDER=${LLM_PROVIDER:-openai} - - OPENAI_MODEL=${OPENAI_MODEL:-gpt-4o-mini} + - GROQ_API_KEY=${GROQ_API_KEY} - AUTH_SECRET_KEY=test-jwt-signing-key-for-integration-tests - ADMIN_PASSWORD=test-admin-password-123 - ADMIN_EMAIL=test-admin@example.com diff --git a/backends/advanced/docker-compose.yml b/backends/advanced/docker-compose.yml index 8d4bc42f..dced4041 100644 --- 
a/backends/advanced/docker-compose.yml +++ b/backends/advanced/docker-compose.yml @@ -12,6 +12,7 @@ services: - ./data/audio_chunks:/app/audio_chunks - ./data/debug_dir:/app/debug_dir - ./data:/app/data + - ../../config.yml:/app/config.yml:ro environment: - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY} - MISTRAL_API_KEY=${MISTRAL_API_KEY} @@ -24,10 +25,7 @@ services: - ADMIN_PASSWORD=${ADMIN_PASSWORD} - ADMIN_EMAIL=${ADMIN_EMAIL} - AUTH_SECRET_KEY=${AUTH_SECRET_KEY} - - LLM_PROVIDER=${LLM_PROVIDER} - OPENAI_API_KEY=${OPENAI_API_KEY} - - OPENAI_BASE_URL=${OPENAI_BASE_URL} - - OPENAI_MODEL=${OPENAI_MODEL} - NEO4J_HOST=${NEO4J_HOST} - NEO4J_USER=${NEO4J_USER} - NEO4J_PASSWORD=${NEO4J_PASSWORD} @@ -67,15 +65,15 @@ services: - ./start-workers.sh:/app/start-workers.sh - ./data/audio_chunks:/app/audio_chunks - ./data:/app/data + - ../../config.yml:/app/config.yml:ro environment: - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY} - MISTRAL_API_KEY=${MISTRAL_API_KEY} - MISTRAL_MODEL=${MISTRAL_MODEL} - TRANSCRIPTION_PROVIDER=${TRANSCRIPTION_PROVIDER} + - PARAKEET_ASR_URL=${PARAKEET_ASR_URL} - OPENAI_API_KEY=${OPENAI_API_KEY} - - OPENAI_BASE_URL=${OPENAI_BASE_URL} - - OPENAI_MODEL=${OPENAI_MODEL} - - LLM_PROVIDER=${LLM_PROVIDER} + - GROQ_API_KEY=${GROQ_API_KEY} - REDIS_URL=redis://redis:6379/0 depends_on: redis: @@ -226,6 +224,7 @@ services: networks: default: name: chronicle-network + external: true volumes: ollama_data: diff --git a/backends/advanced/init-https.sh b/backends/advanced/init-https.sh index cfeebf61..d1c1b5af 100755 --- a/backends/advanced/init-https.sh +++ b/backends/advanced/init-https.sh @@ -70,17 +70,8 @@ else echo " 2. Add: CORS_ORIGINS=https://localhost,https://localhost:443,https://127.0.0.1,https://$TAILSCALE_IP" fi -# Create memory_config.yaml from template if it doesn't exist echo "" -echo "📄 Step 4: Checking memory configuration..." -if [ ! 
-f "memory_config.yaml" ] && [ -f "memory_config.yaml.template" ]; then - cp memory_config.yaml.template memory_config.yaml - echo "✅ memory_config.yaml created from template" -elif [ -f "memory_config.yaml" ]; then - echo "✅ memory_config.yaml already exists" -else - echo "⚠️ Warning: memory_config.yaml.template not found" -fi +echo "📄 Step 4: Memory configuration now lives in config.yml (memory section)" echo "" echo "🎉 Initialization complete!" @@ -102,4 +93,4 @@ echo " - Chronicle Backend: Internal (proxied through nginx)" echo " - Web Dashboard: https://localhost/ or https://$TAILSCALE_IP/" echo " - WebSocket Audio: wss://localhost/ws_pcm or wss://$TAILSCALE_IP/ws_pcm" echo "" -echo "📚 For more details, see: Docs/HTTPS_SETUP.md" \ No newline at end of file +echo "📚 For more details, see: Docs/HTTPS_SETUP.md" diff --git a/backends/advanced/init.py b/backends/advanced/init.py index 0d7e6996..0205ddae 100644 --- a/backends/advanced/init.py +++ b/backends/advanced/init.py @@ -201,19 +201,22 @@ def setup_transcription(self): self.console.print("[blue][INFO][/blue] Skipping transcription setup") def setup_llm(self): - """Configure LLM provider""" + """Configure LLM provider - shows guidance for config.yml""" self.print_section("LLM Provider Configuration") + self.console.print("[blue][INFO][/blue] LLM configuration is now managed in config.yml") + self.console.print("Edit the 'defaults.llm' field and model definitions in config.yml") + self.console.print() + choices = { "1": "OpenAI (GPT-4, GPT-3.5 - requires API key)", - "2": "Ollama (local models - requires Ollama server)", + "2": "Ollama (local models - configure in config.yml)", "3": "Skip (no memory extraction)" } - choice = self.prompt_choice("Choose your LLM provider for memory extraction:", choices, "1") + choice = self.prompt_choice("Which LLM provider will you use?", choices, "1") if choice == "1": - self.config["LLM_PROVIDER"] = "openai" self.console.print("[blue][INFO][/blue] OpenAI selected") 
self.console.print("Get your API key from: https://platform.openai.com/api-keys") @@ -227,35 +230,20 @@ def setup_llm(self): else: api_key = self.prompt_value("OpenAI API key (leave empty to skip)", "") - model = self.prompt_value("OpenAI model", "gpt-5-mini") - base_url = self.prompt_value("OpenAI base URL (for proxies/compatible APIs)", "https://api.openai.com/v1") - if api_key: self.config["OPENAI_API_KEY"] = api_key - self.config["OPENAI_MODEL"] = model - self.config["OPENAI_BASE_URL"] = base_url - self.console.print("[green][SUCCESS][/green] OpenAI configured") + self.console.print("[green][SUCCESS][/green] OpenAI API key configured") + self.console.print("[blue][INFO][/blue] Set 'defaults.llm: openai-llm' in config.yml to use OpenAI") else: self.console.print("[yellow][WARNING][/yellow] No API key provided - memory extraction will not work") elif choice == "2": - self.config["LLM_PROVIDER"] = "ollama" self.console.print("[blue][INFO][/blue] Ollama selected") - - base_url = self.prompt_value("Ollama server URL", "http://host.docker.internal:11434") - if not base_url.endswith("/v1"): - base_url = base_url.rstrip("/") + "/v1" - self.console.print(f"[blue][INFO][/blue] Automatically appending /v1 to Ollama URL: {base_url}") - - model = self.prompt_value("Ollama model", "llama3.2") - - embedder_model = self.prompt_value("Ollama embedder model", "nomic-embed-text:latest") - - self.config["OLLAMA_BASE_URL"] = base_url - self.config["OLLAMA_MODEL"] = model - self.config["OLLAMA_EMBEDDER_MODEL"] = embedder_model - self.console.print("[green][SUCCESS][/green] Ollama configured") - self.console.print("[yellow][WARNING][/yellow] Make sure Ollama is running and all required models (LLM and embedder) are pulled") + self.console.print("[blue][INFO][/blue] Configure Ollama in config.yml:") + self.console.print(" 1. Set 'defaults.llm: local-llm'") + self.console.print(" 2. 
Edit the 'local-llm' model definition with your Ollama URL and model") + self.console.print("[green][SUCCESS][/green] See config.yml for Ollama configuration") + self.console.print("[yellow][WARNING][/yellow] Make sure Ollama is running and models are pulled") elif choice == "3": self.console.print("[blue][INFO][/blue] Skipping LLM setup - memory extraction disabled") @@ -457,9 +445,6 @@ def generate_env_file(self): def copy_config_templates(self): """Copy other configuration files""" - if not Path("memory_config.yaml").exists() and Path("memory_config.yaml.template").exists(): - shutil.copy2("memory_config.yaml.template", "memory_config.yaml") - self.console.print("[green][SUCCESS][/green] memory_config.yaml created") if not Path("diarization_config.json").exists() and Path("diarization_config.json.template").exists(): shutil.copy2("diarization_config.json.template", "diarization_config.json") @@ -472,7 +457,7 @@ def show_summary(self): self.console.print(f"✅ Admin Account: {self.config.get('ADMIN_EMAIL', 'Not configured')}") self.console.print(f"✅ Transcription: {self.config.get('TRANSCRIPTION_PROVIDER', 'Not configured')}") - self.console.print(f"✅ LLM Provider: {self.config.get('LLM_PROVIDER', 'Not configured')}") + self.console.print("✅ LLM: Configured in config.yml (defaults.llm)") self.console.print(f"✅ Memory Provider: {self.config.get('MEMORY_PROVIDER', 'chronicle')}") # Auto-determine URLs based on HTTPS configuration if self.config.get('HTTPS_ENABLED') == 'true': @@ -586,4 +571,4 @@ def main(): if __name__ == "__main__": - main() \ No newline at end of file + main() diff --git a/backends/advanced/memory_config.yaml.template b/backends/advanced/memory_config.yaml.template deleted file mode 100644 index 84ab963c..00000000 --- a/backends/advanced/memory_config.yaml.template +++ /dev/null @@ -1,247 +0,0 @@ -# Memory Extraction Configuration -# This file controls how memories and facts are extracted from conversations -# -# REQUIRED ENVIRONMENT VARIABLES: -# 
- LLM_PROVIDER: Set to "openai" or "ollama" -# - For OpenAI: OPENAI_API_KEY (required), OPENAI_MODEL (optional, defaults to config) -# - For Ollama: OPENAI_BASE_URL or OLLAMA_BASE_URL (required), OPENAI_MODEL (optional, defaults to config) -# - QDRANT_BASE_URL: Qdrant service URL (e.g., "qdrant" for Docker) -# -# OPTIONAL ENVIRONMENT VARIABLES: -# - MEM0_ORGANIZATION_ID: Organization ID for mem0 (default: "friend-lite-org") -# - MEM0_PROJECT_ID: Project ID for mem0 (default: "audio-conversations") -# - MEM0_APP_ID: Application ID for mem0 (default: "omi-backend") -# - OPENAI_EMBEDDER_MODEL: OpenAI embedder model (default: "text-embedding-3-small") -# - OLLAMA_EMBEDDER_MODEL: Ollama embedder model (default: "nomic-embed-text:latest") -# - NEO4J_HOST, NEO4J_USER, NEO4J_PASSWORD: For graph storage (optional) - -# General memory extraction settings -memory_extraction: - # Whether to extract general memories (conversation summaries, topics, etc.) - enabled: true - - # Main prompt for memory extraction - INCLUDES JSON format and few-shot examples for mem0 compatibility - prompt: | - You are a Personal Information Organizer, specialized in accurately storing facts, user memories, and preferences. Your primary role is to extract relevant pieces of information from conversations and organize them into distinct, manageable facts. This allows for easy retrieval and personalization in future interactions. - - **Types of Information to Remember:** - - 1. **Personal Preferences**: Keep track of likes, dislikes, and specific preferences in various categories such as food, products, activities, hobbies, and entertainment. - 2. **Important Personal Details**: Remember significant personal information like names, relationships, and important dates. - 3. **Plans and Intentions**: Note upcoming events, trips, goals, and any plans the user has shared. - 4. **Activity and Service Preferences**: Recall preferences for dining, travel, hobbies, crafts, DIY projects, and other services. 
- 5. **Health and Wellness Preferences**: Keep a record of dietary restrictions, fitness routines, and other wellness-related information. - 6. **Professional Details**: Remember job titles, work habits, career goals, and other professional information. - 7. **Learning and Skills**: Track skills they're developing, tutorials they follow, techniques they're practicing. - 8. **Entertainment and Media**: Remember favorite movies, shows, books, games, creators, channels they follow. - 9. **Interests from Content**: Extract personal interests even from tutorial, educational, or entertainment content they engage with. - - **Output Format:** Return the facts and preferences in JSON format as shown in examples below. - - **Examples:** - - Input: Hi. - Output: {"facts" : []} - - Input: There are branches in trees. - Output: {"facts" : []} - - Input: Hi, I am looking for a restaurant in San Francisco. - Output: {"facts" : ["Looking for a restaurant in San Francisco"]} - - Input: Hi, my name is John. I am a software engineer. - Output: {"facts" : ["Name is John", "Is a software engineer"]} - - Input: My favourite movies are Inception and Interstellar. - Output: {"facts" : ["Favourite movies are Inception and Interstellar"]} - - Input: I've been watching this YouTube channel about rug tufting. They compared cheap versus expensive kits and I found it really interesting. The expensive one was $470 but worked much better than the $182 cheap kit. - Output: {"facts" : ["Interested in rug tufting and DIY crafts", "Watches tutorial content on YouTube", "Values product quality comparisons", "Learning about craft equipment and pricing"]} - - Input: I'm working on a Pokemon-themed rug design. I really like Pikachu and Charmander characters. 
- Output: {"facts" : ["Working on Pokemon-themed craft projects", "Likes Pokemon characters Pikachu and Charmander", "Enjoys character-based creative projects"]} - - **Remember:** - - Extract facts about the USER's interests, preferences, and activities - - Include information that reveals personality traits, hobbies, and learning goals - - Even casual mentions of topics can indicate interests worth remembering - - Return empty facts array only if genuinely no personal information is present - - Focus on actionable information that helps understand the user better - - # LLM parameters for memory extraction - # Provider is controlled by LLM_PROVIDER environment variable (ollama/openai) - llm_settings: - # temperature: removed - GPT-5-mini only supports default value - # Model selection based on provider: - # - Ollama: "gemma3n:e4b", "llama3.1:latest", "llama3.2:latest", etc. - # - OpenAI: "gpt-4o-mini" (recommended), "gpt-4o", "gpt-3.5-turbo", etc. - # - # RECOMMENDATION: Set OPENAI_MODEL environment variable instead of hardcoding - # Set environment variables: LLM_PROVIDER=openai and OPENAI_MODEL=gpt-4o-mini - # model: "gemma3n:e4b" - # model: Uses OPENAI_MODEL environment variable - embedding_model: "text-embedding-3-small" - -# Fact extraction settings (structured information) -fact_extraction: - # Whether to extract structured facts separately from general memories - # ENABLED: Using proper fact extraction prompt format - enabled: true - - # Prompt for extracting structured facts - prompt: | - Extract important information from this conversation, including facts, events, and personal details. Focus on: - - Names of people and their roles/titles. Ensure to extract the names of all existing participants in the conversation, even if they're only mentioned once. 
- - Company names, organizations, brands, and products mentioned - - Dates and specific times - - Locations and addresses - - Numbers, quantities, and measurements - - Contact information (emails, phone numbers) - - Project names and code names - - Technical specifications or requirements - - User's interests, hobbies, and activities they mention trying or wanting to try - - Things the user likes or dislikes (preferences, opinions) - - Skills the user is learning or wants to develop - - Personal experiences and stories shared - - Recommendations given or received - - Problems they're trying to solve - - Personality traits that come through in the conversation - - Contributions by each participant to the conversation or to the task - - Return the facts in JSON format as an array of strings. If no specific facts are mentioned, return an empty JSON array []. - Make sure to not wrap the JSON in ```json or ``` or any other markdown formatting. Only return the JSON array, that's all. - - Examples of JSON output: - ["John Smith works as Software Engineer at Acme Corp", - "Project deadline is December 15th, 2024", - "Meeting scheduled for 2 PM EST on Monday", - "Budget approved for $50,000", - "The participants in the conversation were John and Rose.", - "Discussion is about DnD", - "There is a tense conversation about the upcoming demo"] - - # LLM parameters for fact extraction - llm_settings: - # temperature: removed - GPT-5-mini only supports default value - # RECOMMENDATION: Set OPENAI_MODEL environment variable instead of hardcoding - # model: "gemma3n:e4b" # Model based on LLM_PROVIDER (ollama/openai) - # model: Uses OPENAI_MODEL environment variable - embedding_model: "text-embedding-3-small" - - -# Memory categorization settings -categorization: - # Whether to automatically categorize memories - enabled: true - - # Predefined categories - categories: - - personal - - work - - meeting - - project - - learning - - social - - health - - finance - - travel - - hobbies 
- - crafts - - diy - - tutorials - - entertainment - - gaming - - food - - technology - - shopping - - creativity - - other - - # Prompt for categorizing memories - prompt: | - Categorize this conversation into one or more of these categories: - personal, work, meeting, project, learning, social, health, finance, travel, other - - Return only the category names, comma-separated. - Examples: "work, meeting" or "personal, health" or "project" - - # LLM parameters for categorization - llm_settings: - # temperature: removed - GPT-5-mini only supports default value - # model: "gemma3n:e4b" # Model based on LLM_PROVIDER (ollama/openai) - # model: Uses OPENAI_MODEL environment variable - embedding_model: "text-embedding-3-small" - -# Quality control settings -quality_control: - # Minimum conversation length (in characters) to process - # MODIFIED: Reduced from 50 to 1 to process almost all transcripts - min_conversation_length: 1 - - # Maximum conversation length (in characters) to process - max_conversation_length: 50000 - - # Whether to skip conversations that are mostly silence/filler - # MODIFIED: Disabled to ensure all transcripts are processed - skip_low_content: false - - # Minimum meaningful content ratio (0.0-1.0) - # MODIFIED: Reduced to 0.0 to process all content - min_content_ratio: 0.0 - - # Skip conversations with these patterns - # MODIFIED: Removed most patterns to ensure all transcripts are processed - skip_patterns: - # Only skip completely empty patterns - removed test patterns to ensure all content is processed - [] - -# Processing settings -processing: - # Whether to process memories in parallel - parallel_processing: true - - # Maximum number of concurrent processing tasks - reduced to avoid overwhelming Ollama - max_concurrent_tasks: 1 - - # Timeout for memory processing (seconds) - generous timeout for Ollama processing - processing_timeout: 600 - - # Whether to retry failed extractions - retry_failed: true - - # Maximum number of retries - 
max_retries: 2 - - # Delay between retries (seconds) - retry_delay: 5 - -# Storage settings -storage: - # Whether to store detailed extraction metadata - store_metadata: true - - # Whether to store the original prompts used - store_prompts: true - - # Whether to store LLM responses - store_llm_responses: true - - # Whether to store processing timing information - store_timing: true - -# Debug settings -debug: - # Whether to enable debug tracking - enabled: true - - # Debug database path - db_path: "/app/data/debug_dir/memory_debug.db" - - # Log level for memory processing - log_level: "INFO" # DEBUG, INFO, WARNING, ERROR - - # Whether to log full conversations (privacy consideration) - log_full_conversations: false - - # Whether to log extracted memories - log_extracted_memories: true \ No newline at end of file diff --git a/backends/advanced/run-test.sh b/backends/advanced/run-test.sh index 632e9290..925e3615 100755 --- a/backends/advanced/run-test.sh +++ b/backends/advanced/run-test.sh @@ -108,12 +108,7 @@ case "$LLM_PROVIDER" in ;; esac -# Ensure memory_config.yaml exists -if [ ! -f "memory_config.yaml" ] && [ -f "memory_config.yaml.template" ]; then - print_info "Creating memory_config.yaml from template..." - cp memory_config.yaml.template memory_config.yaml - print_success "memory_config.yaml created" -fi +# memory_config.yaml deprecated; using config.yml for memory settings # Ensure diarization_config.json exists if [ ! -f "diarization_config.json" ] && [ -f "diarization_config.json.template" ]; then @@ -179,4 +174,4 @@ print_info "Cleaning up test containers..." docker compose -f docker-compose-test.yml down -v || true docker system prune -f || true -print_success "Advanced Backend integration tests completed!" \ No newline at end of file +print_success "Advanced Backend integration tests completed!" 
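Review note on the run-test.sh hunk above: the step that created memory_config.yaml from its template is removed, so the script now assumes config.yml is already in place. Below is a minimal sketch of a pre-flight check a script could run instead. It assumes the config path comes from `CONFIG_FILE` (defaulting to `config.yml` in the working directory) and that memory settings sit under a top-level `memory:` key written in block style. `check_memory_config` is a hypothetical helper for illustration, not something this change adds.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; NOT part of run-test.sh in this change.
# Assumption: config path is taken from CONFIG_FILE, defaulting to ./config.yml.

check_memory_config() {
    local cfg="${CONFIG_FILE:-config.yml}"

    # Fail early if the config file is missing (no template fallback anymore).
    if [ ! -f "$cfg" ]; then
        echo "config file not found: $cfg" >&2
        return 1
    fi

    # In block-style YAML a top-level key starts at column 0.
    # This is a plain text check, not a YAML parse.
    if ! grep -Eq '^memory:' "$cfg"; then
        echo "no top-level 'memory' section in $cfg" >&2
        return 2
    fi

    echo "memory settings found in $cfg"
}
```

A script could call `check_memory_config || exit 1` before starting containers. The grep is only a cheap text check: it would miss a `memory` key written in flow style or nested under another key, so a real validation would parse the YAML.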
diff --git a/backends/advanced/setup-https.sh b/backends/advanced/setup-https.sh index e0f733df..b565cddc 100755 --- a/backends/advanced/setup-https.sh +++ b/backends/advanced/setup-https.sh @@ -134,27 +134,7 @@ fi # Step 2: Memory configuration print_header "Step 2: Memory Configuration" -if [ -f "memory_config.yaml" ]; then - print_info "memory_config.yaml already exists" - if prompt_yes_no "Do you want to reset it from template?" "n"; then - backup_path=$(backup_with_timestamp "memory_config.yaml") - if [ $? -eq 0 ]; then - print_info "Backed up existing memory_config.yaml to $backup_path" - cp memory_config.yaml.template memory_config.yaml - print_success "memory_config.yaml reset from template" - else - print_error "Failed to backup memory_config.yaml file, aborting reset" - fi - fi -else - if [ -f "memory_config.yaml.template" ]; then - cp memory_config.yaml.template memory_config.yaml - print_success "memory_config.yaml created from template" - else - print_error "memory_config.yaml.template not found!" - exit 1 - fi -fi +print_info "Memory settings are managed in config.yml (memory section)." # Step 3: Diarization configuration print_header "Step 3: Diarization Configuration" @@ -295,7 +275,7 @@ print_header "Setup Complete!" echo "Configuration Summary:" echo "----------------------" echo "✅ Environment file (.env) configured" -echo "✅ Memory configuration (memory_config.yaml) ready" +echo "✅ Memory configuration (config.yml) ready" echo "✅ Diarization configuration (diarization_config.json) ready" if [ "$HTTPS_ENABLED" = true ]; then @@ -353,4 +333,4 @@ echo " - Docs/memory-configuration-guide.md" echo " - MEMORY_PROVIDERS.md" echo "" -print_success "Initialization complete! 🎉" \ No newline at end of file +print_success "Initialization complete! 
🎉" diff --git a/backends/advanced/src/advanced_omi_backend/app_config.py b/backends/advanced/src/advanced_omi_backend/app_config.py index 51a38159..1e24fb54 100644 --- a/backends/advanced/src/advanced_omi_backend/app_config.py +++ b/backends/advanced/src/advanced_omi_backend/app_config.py @@ -15,6 +15,7 @@ from advanced_omi_backend.constants import OMI_CHANNELS, OMI_SAMPLE_RATE, OMI_SAMPLE_WIDTH from advanced_omi_backend.services.transcription import get_transcription_provider +from advanced_omi_backend.model_registry import get_models_registry # Load environment variables load_dotenv() @@ -51,13 +52,8 @@ def __init__(self): self.min_speech_segment_duration = float(os.getenv("MIN_SPEECH_SEGMENT_DURATION", "1.0")) self.cropping_context_padding = float(os.getenv("CROPPING_CONTEXT_PADDING", "0.1")) - # Transcription Configuration - self.transcription_provider_name = os.getenv("TRANSCRIPTION_PROVIDER") - self.deepgram_api_key = os.getenv("DEEPGRAM_API_KEY") - self.mistral_api_key = os.getenv("MISTRAL_API_KEY") - - # Get configured transcription provider - self.transcription_provider = get_transcription_provider(self.transcription_provider_name) + # Transcription Configuration (registry-based) + self.transcription_provider = get_transcription_provider(None) if self.transcription_provider: logger.info( f"✅ Using {self.transcription_provider.name} transcription provider ({self.transcription_provider.mode})" @@ -68,11 +64,10 @@ def __init__(self): # External Services Configuration self.qdrant_base_url = os.getenv("QDRANT_BASE_URL", "qdrant") self.qdrant_port = os.getenv("QDRANT_PORT", "6333") - self.memory_provider = os.getenv("MEMORY_PROVIDER", "chronicle").lower() - # Map legacy provider names to current names - if self.memory_provider in ("friend-lite", "friend_lite"): - logger.debug(f"Mapping legacy provider '{self.memory_provider}' to 'chronicle'") - self.memory_provider = "chronicle" + # Memory provider from registry + _reg = get_models_registry() + _mem = 
_reg.memory if _reg else {} + self.memory_provider = (_mem.get("provider") or "chronicle").lower() # Redis Configuration self.redis_url = os.getenv("REDIS_URL", "redis://localhost:6379/0") @@ -123,4 +118,4 @@ def get_redis_config(): 'url': app_config.redis_url, 'encoding': "utf-8", 'decode_responses': False - } \ No newline at end of file + } diff --git a/backends/advanced/src/advanced_omi_backend/chat_service.py b/backends/advanced/src/advanced_omi_backend/chat_service.py index 1cd1a2e3..b77f864a 100644 --- a/backends/advanced/src/advanced_omi_backend/chat_service.py +++ b/backends/advanced/src/advanced_omi_backend/chat_service.py @@ -27,8 +27,7 @@ logger = logging.getLogger(__name__) -# Configuration from environment variables -CHAT_TEMPERATURE = float(os.getenv("CHAT_TEMPERATURE", "0.7")) +# Configuration MAX_MEMORY_CONTEXT = 5 # Maximum number of memories to include in context MAX_CONVERSATION_HISTORY = 10 # Maximum conversation turns to keep in context @@ -381,8 +380,7 @@ async def generate_response_stream( # Note: For now, we'll use the regular generate method # In the future, this should be replaced with actual streaming response_content = self.llm_client.generate( - prompt=full_prompt, - temperature=CHAT_TEMPERATURE + prompt=full_prompt ) # Simulate streaming by yielding chunks diff --git a/backends/advanced/src/advanced_omi_backend/controllers/system_controller.py b/backends/advanced/src/advanced_omi_backend/controllers/system_controller.py index 44067a49..c2af3b25 100644 --- a/backends/advanced/src/advanced_omi_backend/controllers/system_controller.py +++ b/backends/advanced/src/advanced_omi_backend/controllers/system_controller.py @@ -8,13 +8,15 @@ import time from datetime import UTC, datetime +import yaml + from advanced_omi_backend.config import ( load_diarization_settings_from_file, save_diarization_settings_to_file, ) from advanced_omi_backend.models.user import User +from advanced_omi_backend.model_registry import _find_config_path, 
load_models_config from advanced_omi_backend.task_manager import get_task_manager -from fastapi.responses import JSONResponse logger = logging.getLogger(__name__) audio_logger = logging.getLogger("audio_processing") @@ -36,10 +38,8 @@ async def get_current_metrics(): return metrics except Exception as e: - audio_logger.error(f"Error fetching metrics: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to fetch metrics: {str(e)}"} - ) + audio_logger.exception("Error fetching metrics") + raise e async def get_auth_config(): @@ -71,10 +71,8 @@ async def get_diarization_settings(): "status": "success" } except Exception as e: - logger.error(f"Error getting diarization settings: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to get settings: {str(e)}"} - ) + logger.exception("Error getting diarization settings") + raise e async def save_diarization_settings(settings: dict): @@ -88,26 +86,18 @@ async def save_diarization_settings(settings: dict): for key, value in settings.items(): if key not in valid_keys: - return JSONResponse( - status_code=400, content={"error": f"Invalid setting key: {key}"} - ) + raise ValueError(f"Invalid setting key: {key}") # Type validation if key in ["min_speakers", "max_speakers"]: if not isinstance(value, int) or value < 1 or value > 20: - return JSONResponse( - status_code=400, content={"error": f"Invalid value for {key}: must be integer 1-20"} - ) + raise ValueError(f"Invalid value for {key}: must be integer 1-20") elif key == "diarization_source": if not isinstance(value, str) or value not in ["pyannote", "deepgram"]: - return JSONResponse( - status_code=400, content={"error": f"Invalid value for {key}: must be 'pyannote' or 'deepgram'"} - ) + raise ValueError(f"Invalid value for {key}: must be 'pyannote' or 'deepgram'") else: if not isinstance(value, (int, float)) or value < 0: - return JSONResponse( - status_code=400, content={"error": f"Invalid value for {key}: must be positive 
number"} - ) + raise ValueError(f"Invalid value for {key}: must be positive number") # Get current settings and merge with new values current_settings = load_diarization_settings_from_file() @@ -132,10 +122,8 @@ async def save_diarization_settings(settings: dict): } except Exception as e: - logger.error(f"Error saving diarization settings: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to save settings: {str(e)}"} - ) + logger.exception("Error saving diarization settings") + raise e async def get_speaker_configuration(user: User): @@ -147,10 +135,8 @@ async def get_speaker_configuration(user: User): "status": "success" } except Exception as e: - logger.error(f"Error getting speaker configuration for user {user.user_id}: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to get speaker configuration: {str(e)}"} - ) + logger.exception(f"Error getting speaker configuration for user {user.user_id}") + raise e async def update_speaker_configuration(user: User, primary_speakers: list[dict]): @@ -159,16 +145,12 @@ async def update_speaker_configuration(user: User, primary_speakers: list[dict]) # Validate speaker data format for speaker in primary_speakers: if not isinstance(speaker, dict): - return JSONResponse( - status_code=400, content={"error": "Each speaker must be a dictionary"} - ) + raise ValueError("Each speaker must be a dictionary") required_fields = ["speaker_id", "name", "user_id"] for field in required_fields: if field not in speaker: - return JSONResponse( - status_code=400, content={"error": f"Missing required field: {field}"} - ) + raise ValueError(f"Missing required field: {field}") # Enforce server-side user_id and add timestamp to each speaker for speaker in primary_speakers: @@ -189,10 +171,8 @@ async def update_speaker_configuration(user: User, primary_speakers: list[dict]) } except Exception as e: - logger.error(f"Error updating speaker configuration for user {user.user_id}: {e}") - return 
JSONResponse( - status_code=500, content={"error": f"Failed to update speaker configuration: {str(e)}"} - ) + logger.exception(f"Error updating speaker configuration for user {user.user_id}") + raise e async def get_enrolled_speakers(user: User): @@ -224,13 +204,8 @@ async def get_enrolled_speakers(user: User): } except Exception as e: - logger.error(f"Error getting enrolled speakers for user {user.user_id}: {e}") - return { - "speakers": [], - "service_available": False, - "message": f"Failed to retrieve speakers: {str(e)}", - "status": "error" - } + logger.exception(f"Error getting enrolled speakers for user {user.user_id}") + raise e async def get_speaker_service_status(): @@ -272,171 +247,100 @@ async def get_speaker_service_status(): } except Exception as e: - logger.error(f"Error checking speaker service status: {e}") - return { - "service_available": False, - "healthy": False, - "message": f"Health check failed: {str(e)}", - "status": "error" - } + logger.exception("Error checking speaker service status") + raise e + # Memory Configuration Management Functions async def get_memory_config_raw(): - """Get current memory configuration YAML as plain text.""" + """Get current memory configuration (memory section of config.yml) as YAML.""" try: - from advanced_omi_backend.memory_config_loader import get_config_loader - - config_loader = get_config_loader() - config_path = config_loader.config_path - - if not os.path.exists(config_path): - return JSONResponse( - status_code=404, content={"error": f"Memory config file not found: {config_path}"} - ) - - with open(config_path, 'r') as file: - config_yaml = file.read() - + cfg_path = _find_config_path() + if not os.path.exists(cfg_path): + raise FileNotFoundError(f"Config file not found: {cfg_path}") + + with open(cfg_path, 'r') as f: + data = yaml.safe_load(f) or {} + memory_section = data.get("memory", {}) + config_yaml = yaml.safe_dump(memory_section, sort_keys=False) + return { "config_yaml": config_yaml, - 
"config_path": config_path, - "status": "success" + "config_path": str(cfg_path), + "section": "memory", + "status": "success", } - except Exception as e: - logger.error(f"Error reading memory config: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to read memory config: {str(e)}"} - ) + logger.exception("Error reading memory config") + raise e async def update_memory_config_raw(config_yaml: str): - """Update memory configuration YAML and hot reload.""" + """Update memory configuration in config.yml and hot reload registry.""" try: - import yaml - from advanced_omi_backend.memory_config_loader import get_config_loader - - # First validate YAML syntax + # Validate YAML try: - yaml.safe_load(config_yaml) + new_mem = yaml.safe_load(config_yaml) or {} except yaml.YAMLError as e: - return JSONResponse( - status_code=400, content={"error": f"Invalid YAML syntax: {str(e)}"} - ) - - config_loader = get_config_loader() - config_path = config_loader.config_path - - # Create backup - backup_path = f"{config_path}.bak" - if os.path.exists(config_path): - shutil.copy2(config_path, backup_path) - logger.info(f"Created backup at {backup_path}") - - # Write new configuration - with open(config_path, 'w') as file: - file.write(config_yaml) - - # Hot reload configuration - reload_success = config_loader.reload_config() - - if reload_success: - logger.info("Memory configuration updated and reloaded successfully") - return { - "message": "Memory configuration updated and reloaded successfully", - "config_path": config_path, - "backup_created": os.path.exists(backup_path), - "status": "success" - } - else: - return JSONResponse( - status_code=500, content={"error": "Configuration saved but reload failed"} - ) - + raise ValueError(f"Invalid YAML syntax: {str(e)}") + + cfg_path = _find_config_path() + if not os.path.exists(cfg_path): + raise FileNotFoundError(f"Config file not found: {cfg_path}") + + # Backup + backup_path = f"{cfg_path}.bak" + 
shutil.copy2(cfg_path, backup_path) + + # Update memory section and write file + with open(cfg_path, 'r') as f: + data = yaml.safe_load(f) or {} + data["memory"] = new_mem + with open(cfg_path, 'w') as f: + yaml.safe_dump(data, f, sort_keys=False) + + # Reload registry + load_models_config(force_reload=True) + + return { + "message": "Memory configuration updated and reloaded successfully", + "config_path": str(cfg_path), + "backup_created": os.path.exists(backup_path), + "status": "success", + } except Exception as e: - logger.error(f"Error updating memory config: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to update memory config: {str(e)}"} - ) + logger.exception("Error updating memory config") + raise e async def validate_memory_config(config_yaml: str): - """Validate memory configuration YAML syntax.""" + """Validate memory configuration YAML syntax (memory section).""" try: - import yaml - from advanced_omi_backend.memory_config_loader import MemoryConfigLoader - - # Parse YAML try: - parsed_config = yaml.safe_load(config_yaml) - if not parsed_config: - return JSONResponse( - status_code=400, content={"error": "Configuration file is empty"} - ) + parsed = yaml.safe_load(config_yaml) except yaml.YAMLError as e: - return JSONResponse( - status_code=400, content={"error": f"Invalid YAML syntax: {str(e)}"} - ) - - # Create a temporary config loader to validate structure - try: - # Create a temporary file for validation - import tempfile - with tempfile.NamedTemporaryFile(mode='w', suffix='.yaml', delete=False) as tmp_file: - tmp_file.write(config_yaml) - tmp_path = tmp_file.name - - # Try to load with MemoryConfigLoader to validate structure - temp_loader = MemoryConfigLoader(tmp_path) - temp_loader.validate_config() - - # Clean up temp file - os.unlink(tmp_path) - - return { - "message": "Configuration is valid", - "status": "success" - } - - except ValueError as e: - return JSONResponse( - status_code=400, content={"error": 
f"Configuration validation failed: {str(e)}"} - ) - + raise ValueError(f"Invalid YAML syntax: {str(e)}") + if not isinstance(parsed, dict): + raise ValueError("Configuration must be a YAML object") + # Minimal checks + # provider optional; timeout_seconds optional; extraction enabled/prompt optional + return {"message": "Configuration is valid", "status": "success"} except Exception as e: - logger.error(f"Error validating memory config: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to validate memory config: {str(e)}"} - ) + logger.exception("Error validating memory config") + raise e async def reload_memory_config(): - """Reload memory configuration from file.""" + """Reload config.yml (registry).""" try: - from advanced_omi_backend.memory_config_loader import get_config_loader - - config_loader = get_config_loader() - reload_success = config_loader.reload_config() - - if reload_success: - logger.info("Memory configuration reloaded successfully") - return { - "message": "Memory configuration reloaded successfully", - "config_path": config_loader.config_path, - "status": "success" - } - else: - return JSONResponse( - status_code=500, content={"error": "Failed to reload memory configuration"} - ) - + cfg_path = _find_config_path() + load_models_config(force_reload=True) + return {"message": "Configuration reloaded", "config_path": str(cfg_path), "status": "success"} except Exception as e: - logger.error(f"Error reloading memory config: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to reload memory config: {str(e)}"} - ) + logger.exception("Error reloading config") + raise e async def delete_all_user_memories(user: User): @@ -459,10 +363,8 @@ async def delete_all_user_memories(user: User): } except Exception as e: - logger.error(f"Error deleting all memories for user {user.user_id}: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to delete memories: {str(e)}"} - ) + 
logger.exception(f"Error deleting all memories for user {user.user_id}") + raise e # Memory Provider Configuration Functions @@ -485,10 +387,8 @@ async def get_memory_provider(): } except Exception as e: - logger.error(f"Error getting memory provider: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to get memory provider: {str(e)}"} - ) + logger.exception("Error getting memory provider") + raise e async def set_memory_provider(provider: str): @@ -499,19 +399,13 @@ async def set_memory_provider(provider: str): valid_providers = ["chronicle", "openmemory_mcp", "mycelia"] if provider not in valid_providers: - return JSONResponse( - status_code=400, - content={"error": f"Invalid provider '{provider}'. Valid providers: {', '.join(valid_providers)}"} - ) + raise ValueError(f"Invalid provider '{provider}'. Valid providers: {', '.join(valid_providers)}") # Path to .env file (assuming we're running from backends/advanced/) env_path = os.path.join(os.getcwd(), ".env") if not os.path.exists(env_path): - return JSONResponse( - status_code=404, - content={"error": f".env file not found at {env_path}"} - ) + raise FileNotFoundError(f".env file not found at {env_path}") # Read current .env file with open(env_path, 'r') as file: @@ -556,9 +450,5 @@ async def set_memory_provider(provider: str): } except Exception as e: - logger.error(f"Error setting memory provider: {e}") - return JSONResponse( - status_code=500, content={"error": f"Failed to set memory provider: {str(e)}"} - ) - - + logger.exception("Error setting memory provider") + raise e diff --git a/backends/advanced/src/advanced_omi_backend/llm_client.py b/backends/advanced/src/advanced_omi_backend/llm_client.py index 21ee3331..5b6e907d 100644 --- a/backends/advanced/src/advanced_omi_backend/llm_client.py +++ b/backends/advanced/src/advanced_omi_backend/llm_client.py @@ -11,6 +11,8 @@ from abc import ABC, abstractmethod from typing import Dict +from advanced_omi_backend.model_registry import 
get_models_registry + logger = logging.getLogger(__name__) @@ -54,8 +56,9 @@ def __init__( self.api_key = api_key or os.getenv("OPENAI_API_KEY") self.base_url = base_url or os.getenv("OPENAI_BASE_URL") self.model = model or os.getenv("OPENAI_MODEL") + if not self.api_key or not self.base_url or not self.model: - raise ValueError("OPENAI_API_KEY, OPENAI_BASE_URL, and OPENAI_MODEL must be set") + raise ValueError(f"LLM configuration incomplete: api_key={'set' if self.api_key else 'MISSING'}, base_url={'set' if self.base_url else 'MISSING'}, model={'set' if self.model else 'MISSING'}") # Initialize OpenAI client with optional Langfuse tracing try: @@ -141,21 +144,26 @@ def get_default_model(self) -> str: class LLMClientFactory: - """Factory for creating LLM clients based on environment configuration.""" + """Factory for creating LLM clients based on configuration registry.""" @staticmethod def create_client() -> LLMClient: - """Create an LLM client based on LLM_PROVIDER environment variable.""" - provider = os.getenv("LLM_PROVIDER", "openai").lower() - - if provider in ["openai", "ollama"]: - return OpenAILLMClient( - api_key=os.getenv("OPENAI_API_KEY"), - base_url=os.getenv("OPENAI_BASE_URL"), - model=os.getenv("OPENAI_MODEL"), - ) - else: - raise ValueError(f"Unsupported LLM provider: {provider}") + """Create an LLM client based on model registry configuration (config.yml).""" + registry = get_models_registry() + + if registry: + llm_def = registry.get_default("llm") + if llm_def: + logger.info(f"Creating LLM client from registry: {llm_def.name} ({llm_def.model_provider})") + params = llm_def.model_params or {} + return OpenAILLMClient( + api_key=llm_def.api_key, + base_url=llm_def.model_url, + model=llm_def.model_name, + temperature=params.get("temperature", 0.1), + ) + + raise ValueError("No default LLM defined in config.yml") @staticmethod def get_supported_providers() -> list: diff --git a/backends/advanced/src/advanced_omi_backend/memory_config_loader.py 
b/backends/advanced/src/advanced_omi_backend/memory_config_loader.py deleted file mode 100644 index 60073f5a..00000000 --- a/backends/advanced/src/advanced_omi_backend/memory_config_loader.py +++ /dev/null @@ -1,333 +0,0 @@ -""" -Memory Configuration Loader - -This module loads and manages memory extraction configuration from YAML files. -""" - -import logging -import os -from typing import Any, Dict - -import yaml - -# Logger for configuration -config_logger = logging.getLogger("memory_config") - - -class MemoryConfigLoader: - """ - Loads and manages memory extraction configuration from YAML files. - """ - - def __init__(self, config_path: str | None = None): - """ - Initialize the config loader. - - Args: - config_path: Path to the configuration YAML file - """ - if config_path is None: - # Default to memory_config.yaml in the backend root - config_path = os.path.join( - os.path.dirname(os.path.dirname(os.path.dirname(__file__))), - "memory_config.yaml", - ) - - self.config_path = config_path - self.config = self._load_config() - - # Set up logging level from config - debug_config = self.config.get("debug", {}) - log_level = debug_config.get("log_level", "INFO") - numeric_level = getattr(logging, log_level.upper(), logging.INFO) - config_logger.setLevel(numeric_level) - - # Validate configuration on load - try: - self.validate_config() - config_logger.info(f"Loaded and validated memory configuration from {config_path}") - except ValueError as e: - config_logger.error(f"Configuration validation failed: {e}") - raise - - def _load_config(self) -> Dict[str, Any]: - """Load configuration from YAML file.""" - try: - with open(self.config_path, "r") as file: - config = yaml.safe_load(file) - if not config: - raise ValueError("Configuration file is empty") - return config - except FileNotFoundError: - raise FileNotFoundError( - f"Memory configuration file not found: {self.config_path}\n" - f"Please ensure memory_config.yaml exists in the backend directory" - ) - except 
yaml.YAMLError as e: - raise ValueError( - f"Error parsing YAML configuration: {e}\n" - f"Please check the syntax of {self.config_path}" - ) - - def validate_config(self) -> None: - """Validate that all required configuration sections and fields are present.""" - required_sections = [ - "memory_extraction", - "quality_control", - "processing", - "storage", - "debug", - ] - - for section in required_sections: - if section not in self.config: - raise ValueError( - f"Missing required configuration section: '{section}'\n" - f"Please check {self.config_path}" - ) - - # Validate memory_extraction section - mem_extract = self.config.get("memory_extraction", {}) - if "enabled" not in mem_extract: - raise ValueError("memory_extraction.enabled is required in configuration") - if "prompt" not in mem_extract: - raise ValueError("memory_extraction.prompt is required in configuration") - if "llm_settings" not in mem_extract: - raise ValueError("memory_extraction.llm_settings is required in configuration") - - # Validate quality_control section - quality = self.config.get("quality_control", {}) - if "min_conversation_length" not in quality: - raise ValueError("quality_control.min_conversation_length is required") - if "max_conversation_length" not in quality: - raise ValueError("quality_control.max_conversation_length is required") - - # Validate processing section - processing = self.config.get("processing", {}) - if "processing_timeout" not in processing: - raise ValueError("processing.processing_timeout is required") - - # Validate debug section - debug = self.config.get("debug", {}) - if "enabled" not in debug: - raise ValueError("debug.enabled is required") - if "db_path" not in debug: - raise ValueError("debug.db_path is required") - - def reload_config(self) -> bool: - """Reload configuration from file.""" - try: - self.config = self._load_config() - config_logger.info("Configuration reloaded successfully") - return True - except Exception as e: - 
config_logger.error(f"Failed to reload configuration: {e}") - return False - - def get_memory_extraction_config(self) -> Dict[str, Any]: - """Get memory extraction configuration.""" - return self.config.get("memory_extraction", {}) - - def get_fact_extraction_config(self) -> Dict[str, Any]: - """Get fact extraction configuration.""" - return self.config.get("fact_extraction", {}) - - def get_categorization_config(self) -> Dict[str, Any]: - """Get categorization configuration.""" - return self.config.get("categorization", {}) - - def get_quality_control_config(self) -> Dict[str, Any]: - """Get quality control configuration.""" - return self.config.get("quality_control", {}) - - def get_processing_config(self) -> Dict[str, Any]: - """Get processing configuration.""" - return self.config.get("processing", {}) - - def get_storage_config(self) -> Dict[str, Any]: - """Get storage configuration.""" - return self.config.get("storage", {}) - - def get_debug_config(self) -> Dict[str, Any]: - """Get debug configuration.""" - return self.config.get("debug", {}) - - def is_memory_extraction_enabled(self) -> bool: - """Check if memory extraction is enabled.""" - return self.get_memory_extraction_config().get("enabled") - - def is_fact_extraction_enabled(self) -> bool: - """Check if fact extraction is enabled.""" - return self.get_fact_extraction_config().get("enabled", False) # Optional feature - - def is_categorization_enabled(self) -> bool: - """Check if categorization is enabled.""" - return self.get_categorization_config().get("enabled", False) # Optional feature - - def is_debug_enabled(self) -> bool: - """Check if debug tracking is enabled.""" - return self.get_debug_config().get("enabled") - - def get_memory_prompt(self) -> str: - """Get the memory extraction prompt.""" - prompt = self.get_memory_extraction_config().get("prompt") - if not prompt: - raise ValueError("memory_extraction.prompt is not configured") - return prompt - - def get_fact_prompt(self) -> str: - """Get 
the fact extraction prompt.""" - # Fact extraction is optional, so we can provide a default - return self.get_fact_extraction_config().get( - "prompt", "Extract specific facts from this conversation." - ) - - def get_categorization_prompt(self) -> str: - """Get the categorization prompt.""" - # Categorization is optional, so we can provide a default - return self.get_categorization_config().get("prompt", "Categorize this conversation.") - - def get_llm_settings(self, extraction_type: str) -> Dict[str, Any]: - """ - Get LLM settings for a specific extraction type. - - Args: - extraction_type: One of 'memory', 'fact', 'categorization' - """ - config_key = f"{extraction_type}_extraction" - if extraction_type == "memory": - config_key = "memory_extraction" - elif extraction_type == "fact": - config_key = "fact_extraction" - elif extraction_type == "categorization": - config_key = "categorization" - - extraction_config = self.config.get(config_key, {}) - return extraction_config.get("llm_settings", {}) - - def should_skip_conversation(self, conversation_text: str) -> bool: - """ - Check if a conversation should be skipped based on quality control settings. 
- - Args: - conversation_text: The full conversation text - - Returns: - True if the conversation should be skipped - """ - quality_config = self.get_quality_control_config() - - # Check length constraints - these are required fields - min_length = quality_config.get("min_conversation_length") - max_length = quality_config.get("max_conversation_length") - - if min_length is None or max_length is None: - config_logger.error("Missing required quality control configuration") - return False # Don't skip if config is missing - - if len(conversation_text) < min_length: - config_logger.debug( - f"Skipping conversation: too short ({len(conversation_text)} < {min_length})" - ) - return True - - if len(conversation_text) > max_length: - config_logger.debug( - f"Skipping conversation: too long ({len(conversation_text)} > {max_length})" - ) - return True - - # Check skip patterns - skip_patterns = quality_config.get("skip_patterns", []) - if skip_patterns: - import re - - for pattern in skip_patterns: - if re.match(pattern, conversation_text.strip(), re.IGNORECASE): - config_logger.debug(f"Skipping conversation: matches skip pattern '{pattern}'") - return True - - # Check content ratio (if enabled) - if quality_config.get("skip_low_content", False): - min_content_ratio = quality_config.get("min_content_ratio") - if min_content_ratio is None: - min_content_ratio = 0.3 # Reasonable default for optional feature - - # Simple heuristic: calculate ratio of meaningful words to total words - words = conversation_text.split() - if len(words) > 0: - filler_words = { - "um", - "uh", - "hmm", - "yeah", - "ok", - "okay", - "like", - "you", - "know", - "so", - "well", - } - meaningful_words = [ - word for word in words if word.lower() not in filler_words and len(word) > 2 - ] - content_ratio = len(meaningful_words) / len(words) - - if content_ratio < min_content_ratio: - config_logger.debug( - f"Skipping conversation: low content ratio ({content_ratio:.2f} < {min_content_ratio})" - ) - 
return True - - return False - - def get_categories(self) -> list[str]: - """Get available categories for classification.""" - return self.get_categorization_config().get("categories", []) - - def get_debug_db_path(self) -> str: - """Get the debug database path.""" - path = self.get_debug_config().get("db_path") - if not path: - raise ValueError("debug.db_path is not configured") - return path - - def should_log_full_conversations(self) -> bool: - """Check if full conversations should be logged.""" - return self.get_debug_config().get("log_full_conversations", False) # Privacy-safe default - - def should_log_extracted_memories(self) -> bool: - """Check if extracted memories should be logged.""" - return self.get_debug_config().get("log_extracted_memories", True) # Useful default - - def get_processing_timeout(self) -> int: - """Get the processing timeout in seconds.""" - timeout = self.get_processing_config().get("processing_timeout") - if timeout is None: - raise ValueError("processing.processing_timeout is not configured") - return timeout - - def should_retry_failed(self) -> bool: - """Check if failed extractions should be retried.""" - return self.get_processing_config().get("retry_failed", True) # Reasonable default - - def get_max_retries(self) -> int: - """Get the maximum number of retries.""" - return self.get_processing_config().get("max_retries", 2) # Reasonable default - - def get_retry_delay(self) -> int: - """Get the delay between retries in seconds.""" - return self.get_processing_config().get("retry_delay", 5) # Reasonable default - - -# Global instance -_config_loader = None - - -def get_config_loader() -> MemoryConfigLoader: - """Get the global configuration loader instance.""" - global _config_loader - if _config_loader is None: - _config_loader = MemoryConfigLoader() - return _config_loader diff --git a/backends/advanced/src/advanced_omi_backend/model_registry.py b/backends/advanced/src/advanced_omi_backend/model_registry.py new file mode 100644 
index 00000000..47bef4ba --- /dev/null +++ b/backends/advanced/src/advanced_omi_backend/model_registry.py @@ -0,0 +1,353 @@ +"""Model registry and config loader. + +Loads a single source of truth from config.yml and exposes model +definitions (LLM, embeddings, etc.) in a provider-agnostic way. + +Now using Pydantic for robust validation and type safety. +""" + +from __future__ import annotations + +import os +import re +import yaml +from pathlib import Path +from typing import Any, Dict, List, Optional + +import logging +from pydantic import BaseModel, Field, field_validator, model_validator, ConfigDict, ValidationError + +def _resolve_env(value: Any) -> Any: + """Resolve ``${VAR:-default}`` patterns inside a single value. + + This helper is intentionally minimal: it only operates on strings and leaves + all other types unchanged. Patterns of the form ``${VAR}`` or + ``${VAR:-default}`` are expanded using ``os.getenv``: + + - If the environment variable **VAR** is set, its value is used. + - Otherwise the optional ``default`` is used (or ``\"\"`` if omitted). + + Examples: + >>> os.environ.get("OLLAMA_MODEL") + >>> _resolve_env("${OLLAMA_MODEL:-llama3.1:latest}") + 'llama3.1:latest' + + >>> os.environ["OLLAMA_MODEL"] = "llama3.2:latest" + >>> _resolve_env("${OLLAMA_MODEL:-llama3.1:latest}") + 'llama3.2:latest' + + >>> _resolve_env("Bearer ${OPENAI_API_KEY:-}") + 'Bearer ' # when OPENAI_API_KEY is not set + + Note: + Use :func:`_deep_resolve_env` to apply this logic to an entire + nested config structure (dicts/lists) loaded from YAML. + """ + if not isinstance(value, str): + return value + + pattern = re.compile(r"\$\{([^}:]+)(?::-(.*?))?\}") + + def repl(match: re.Match[str]) -> str: + var, default = match.group(1), match.group(2) + return os.getenv(var, default or "") + + return pattern.sub(repl, value) + + +def _deep_resolve_env(data: Any) -> Any: + """Recursively resolve environment variables in nested structures. 
+ + This walks arbitrary Python structures produced by ``yaml.safe_load`` and + applies :func:`_resolve_env` to every string it finds. Dictionaries and + lists are traversed deeply; scalars are passed through unchanged. + + Examples: + >>> os.environ["OPENAI_MODEL"] = "gpt-4o-mini" + >>> cfg = { + ... "models": [ + ... {"model_name": "${OPENAI_MODEL:-gpt-4o-mini}"}, + ... {"model_url": "${OPENAI_BASE_URL:-https://api.openai.com/v1}"} + ... ] + ... } + >>> resolved = _deep_resolve_env(cfg) + >>> resolved["models"][0]["model_name"] + 'gpt-4o-mini' + >>> resolved["models"][1]["model_url"] + 'https://api.openai.com/v1' + + This is what :func:`load_models_config` uses immediately after loading + ``config.yml`` so that all ``${VAR:-default}`` placeholders are resolved + before Pydantic validation and model registry construction. + """ + if isinstance(data, dict): + return {k: _deep_resolve_env(v) for k, v in data.items()} + if isinstance(data, list): + return [_deep_resolve_env(v) for v in data] + return _resolve_env(data) + + +class ModelDef(BaseModel): + """Model definition with validation. + + Represents a single model configuration (LLM, embedding, STT, TTS, etc.) + from config.yml with automatic validation and type checking. 
+ """ + + model_config = ConfigDict( + extra='allow', # Allow extra fields for extensibility + validate_assignment=True, # Validate on attribute assignment + arbitrary_types_allowed=True, + ) + + name: str = Field(..., min_length=1, description="Unique model identifier") + model_type: str = Field(..., description="Model type: llm, embedding, stt, tts, etc.") + model_provider: str = Field(default="unknown", description="Provider name: openai, ollama, deepgram, parakeet, etc.") + api_family: str = Field(default="openai", description="API family: openai, http, websocket, etc.") + model_name: str = Field(default="", description="Provider-specific model name") + model_url: str = Field(default="", description="Base URL for API requests") + api_key: Optional[str] = Field(default=None, description="API key or authentication token") + description: Optional[str] = Field(default=None, description="Human-readable description") + model_params: Dict[str, Any] = Field(default_factory=dict, description="Model-specific parameters") + model_output: Optional[str] = Field(default=None, description="Output format: json, text, vector, etc.") + embedding_dimensions: Optional[int] = Field(default=None, ge=1, description="Embedding vector dimensions") + operations: Dict[str, Any] = Field(default_factory=dict, description="API operation definitions") + + @field_validator('model_name', mode='before') + @classmethod + def default_model_name(cls, v: Any, info) -> str: + """Default model_name to name if not provided.""" + if not v and info.data.get('name'): + return info.data['name'] + return v or "" + + @field_validator('model_url', mode='before') + @classmethod + def validate_url(cls, v: Any) -> str: + """Ensure URL doesn't have trailing whitespace.""" + if isinstance(v, str): + return v.strip() + return v or "" + + @field_validator('api_key', mode='before') + @classmethod + def sanitize_api_key(cls, v: Any) -> Optional[str]: + """Sanitize API key, treat empty strings as None.""" + if 
isinstance(v, str): + v = v.strip() + if not v or v.lower() in ['dummy', 'none', 'null']: + return None + return v + return v + + @model_validator(mode='after') + def validate_model(self) -> ModelDef: + """Cross-field validation.""" + # Ensure embedding models have dimensions specified + if self.model_type == 'embedding' and not self.embedding_dimensions: + # Common defaults + defaults = { + 'text-embedding-3-small': 1536, + 'text-embedding-3-large': 3072, + 'text-embedding-ada-002': 1536, + 'nomic-embed-text-v1.5': 768, + } + if self.model_name in defaults: + self.embedding_dimensions = defaults[self.model_name] + + return self + + +class AppModels(BaseModel): + """Application models registry. + + Contains default model selections and all available model definitions. + """ + + model_config = ConfigDict( + extra='allow', + validate_assignment=True, + ) + + defaults: Dict[str, str] = Field( + default_factory=dict, + description="Default model names for each model_type" + ) + models: Dict[str, ModelDef] = Field( + default_factory=dict, + description="All available model definitions keyed by name" + ) + memory: Dict[str, Any] = Field( + default_factory=dict, + description="Memory service configuration" + ) + + def get_by_name(self, name: str) -> Optional[ModelDef]: + """Get a model by its unique name. + + Args: + name: Model name to look up + + Returns: + ModelDef if found, None otherwise + """ + return self.models.get(name) + + def get_default(self, model_type: str) -> Optional[ModelDef]: + """Get the default model for a given type. + + Args: + model_type: Type of model (llm, embedding, stt, tts, etc.) 
+ + Returns: + Default ModelDef for the type, or first available model of that type, + or None if no models of that type exist + """ + # Try explicit default first + name = self.defaults.get(model_type) + if name: + model = self.get_by_name(name) + if model: + return model + + # Fallback: first model of that type + for m in self.models.values(): + if m.model_type == model_type: + return m + + return None + + def get_all_by_type(self, model_type: str) -> List[ModelDef]: + """Get all models of a specific type. + + Args: + model_type: Type of model to filter by + + Returns: + List of ModelDef objects matching the type + """ + return [m for m in self.models.values() if m.model_type == model_type] + + def list_model_types(self) -> List[str]: + """Get all unique model types in the registry. + + Returns: + Sorted list of model types + """ + return sorted(set(m.model_type for m in self.models.values())) + + +# Global registry singleton +_REGISTRY: Optional[AppModels] = None + + +def _find_config_path() -> Path: + """Find config.yml in expected locations. + + Search order: + 1. CONFIG_FILE environment variable + 2. Walk up from module directory + 3. Current working directory + 4. /app/config.yml (Docker container) + + Returns: + Path to config.yml (may not exist) + """ + # ENV override + cfg_env = os.getenv("CONFIG_FILE") + if cfg_env and Path(cfg_env).exists(): + return Path(cfg_env) + + # Common locations (container vs repo root) + candidates = [Path("config.yml"), Path("/app/config.yml")] + + # Also walk up from current file's parents defensively + try: + for parent in Path(__file__).resolve().parents: + c = parent / "config.yml" + if c.exists(): + return c + except Exception: + pass + + for c in candidates: + if c.exists(): + return c + + # Last resort: return /app/config.yml path (may not exist yet) + return Path("/app/config.yml") + + +def load_models_config(force_reload: bool = False) -> Optional[AppModels]: + """Load model configuration from config.yml. 
+ + This function loads and parses the config.yml file, resolves environment + variables, validates model definitions using Pydantic, and caches the result. + + Args: + force_reload: If True, reload from disk even if already cached + + Returns: + AppModels instance with validated configuration, or None if config not found + + Raises: + ValidationError: If config.yml has invalid model definitions + yaml.YAMLError: If config.yml has invalid YAML syntax + """ + global _REGISTRY + if _REGISTRY is not None and not force_reload: + return _REGISTRY + + cfg_path = _find_config_path() + if not cfg_path.exists(): + return None + + # Load and parse YAML + with cfg_path.open("r") as f: + raw = yaml.safe_load(f) or {} + + # Resolve environment variables + raw = _deep_resolve_env(raw) + + # Extract sections + defaults = raw.get("defaults", {}) or {} + model_list = raw.get("models", []) or [] + memory_settings = raw.get("memory", {}) or {} + + # Parse and validate models using Pydantic + models: Dict[str, ModelDef] = {} + for m in model_list: + try: + # Pydantic will handle validation automatically + model_def = ModelDef(**m) + models[model_def.name] = model_def + except ValidationError as e: + # Log but don't fail the entire registry load + logging.warning(f"Failed to load model '{m.get('name', 'unknown')}': {e}") + continue + + # Create and cache registry + _REGISTRY = AppModels( + defaults=defaults, + models=models, + memory=memory_settings + ) + return _REGISTRY + + +def get_models_registry() -> Optional[AppModels]: + """Get the global models registry. + + This is the primary interface for accessing model configurations. + The registry is loaded once and cached for performance. + + Returns: + AppModels instance, or None if config.yml not found + + Example: + >>> registry = get_models_registry() + >>> if registry: + ... llm = registry.get_default('llm') + ... 
print(f"Default LLM: {llm.name} ({llm.model_provider})") + """ + return load_models_config(force_reload=False) diff --git a/backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py b/backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py index d94940ce..5ffa5d6f 100644 --- a/backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py +++ b/backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py @@ -20,6 +20,7 @@ from advanced_omi_backend.llm_client import async_health_check from advanced_omi_backend.services.memory import get_memory_service from advanced_omi_backend.services.transcription import get_transcription_provider +from advanced_omi_backend.model_registry import get_models_registry # Create router router = APIRouter(tags=["health"]) @@ -38,9 +39,17 @@ # Transcription provider transcription_provider = get_transcription_provider() -# Qdrant Configuration -QDRANT_BASE_URL = os.getenv("QDRANT_BASE_URL", "qdrant") -QDRANT_PORT = os.getenv("QDRANT_PORT", "6333") +# Registry-driven configuration for display +REGISTRY = get_models_registry() +if REGISTRY: + _llm_def = REGISTRY.get_default("llm") + _embed_def = REGISTRY.get_default("embedding") + _vs_def = REGISTRY.get_default("vector_store") +else: + _llm_def = _embed_def = _vs_def = None + +QDRANT_BASE_URL = (_vs_def.model_params.get("host") if _vs_def else "qdrant") +QDRANT_PORT = str(_vs_def.model_params.get("port") if _vs_def else "6333") @router.get("/auth/health") @@ -108,9 +117,9 @@ async def health_check(): "active_clients": get_client_manager().get_client_count(), "new_conversation_timeout_minutes": float(os.getenv("NEW_CONVERSATION_TIMEOUT_MINUTES", "1.5")), "audio_cropping_enabled": os.getenv("AUDIO_CROPPING_ENABLED", "true").lower() == "true", - "llm_provider": os.getenv("LLM_PROVIDER"), - "llm_model": os.getenv("OPENAI_MODEL"), - "llm_base_url": os.getenv("OPENAI_BASE_URL"), + "llm_provider": (_llm_def.model_provider if 
_llm_def else None), + "llm_model": (_llm_def.model_name if _llm_def else None), + "llm_base_url": (_llm_def.model_url if _llm_def else None), }, } @@ -118,12 +127,9 @@ async def health_check(): critical_services_healthy = True # Get configuration once at the start - memory_provider = os.getenv("MEMORY_PROVIDER", "chronicle").lower() - - # Map legacy provider names to current names - if memory_provider in ("friend-lite", "friend_lite"): - logger.debug(f"Mapping legacy provider '{memory_provider}' to 'chronicle'") - memory_provider = "chronicle" + # Memory provider (registry-based) + mem_settings = REGISTRY.memory if REGISTRY else {} + memory_provider = (mem_settings.get("provider") or "chronicle").lower() speaker_service_url = os.getenv("SPEAKER_SERVICE_URL") openmemory_mcp_url = os.getenv("OPENMEMORY_MCP_URL") @@ -215,14 +221,14 @@ async def health_check(): "healthy": "✅" in llm_health.get("status", ""), "base_url": llm_health.get("base_url", ""), "model": llm_health.get("default_model", ""), - "provider": os.getenv("LLM_PROVIDER", "openai"), + "provider": (_llm_def.model_provider if _llm_def else "unknown"), "critical": False, } except asyncio.TimeoutError: health_status["services"]["audioai"] = { "status": "⚠️ Connection Timeout (8s) - Service may not be running", "healthy": False, - "provider": os.getenv("LLM_PROVIDER", "openai"), + "provider": (_llm_def.model_provider if _llm_def else "unknown"), "critical": False, } overall_healthy = False @@ -230,7 +236,7 @@ async def health_check(): health_status["services"]["audioai"] = { "status": f"⚠️ Connection Failed: {str(e)} - Service may not be running", "healthy": False, - "provider": os.getenv("LLM_PROVIDER", "openai"), + "provider": (_llm_def.model_provider if _llm_def else "unknown"), "critical": False, } overall_healthy = False @@ -495,4 +501,4 @@ async def readiness_check(): return JSONResponse( content={"status": "not_ready", "error": str(e), "timestamp": int(time.time())}, status_code=503 - ) \ No newline at 
end of file + ) diff --git a/backends/advanced/src/advanced_omi_backend/services/memory/config.py b/backends/advanced/src/advanced_omi_backend/services/memory/config.py index f3943f29..e339f8f3 100644 --- a/backends/advanced/src/advanced_omi_backend/services/memory/config.py +++ b/backends/advanced/src/advanced_omi_backend/services/memory/config.py @@ -6,6 +6,8 @@ from enum import Enum from typing import Any, Dict +from advanced_omi_backend.model_registry import get_models_registry + memory_logger = logging.getLogger("memory_service") @@ -144,8 +146,10 @@ def create_mycelia_config( def build_memory_config_from_env() -> MemoryConfig: """Build memory configuration from environment variables and YAML config.""" try: - # Determine memory provider - memory_provider = os.getenv("MEMORY_PROVIDER", "chronicle").lower() + # Determine memory provider from registry + reg = get_models_registry() + mem_settings = (reg.memory if reg else {}) + memory_provider = (mem_settings.get("provider") or "chronicle").lower() # Map legacy provider names to current names if memory_provider in ("friend-lite", "friend_lite"): @@ -159,11 +163,12 @@ def build_memory_config_from_env() -> MemoryConfig: # For OpenMemory MCP, configuration is much simpler if memory_provider_enum == MemoryProvider.OPENMEMORY_MCP: + mcp = (mem_settings.get("openmemory_mcp") or {}) openmemory_config = create_openmemory_config( - server_url=os.getenv("OPENMEMORY_MCP_URL", "http://localhost:8765"), - client_name=os.getenv("OPENMEMORY_CLIENT_NAME", "chronicle"), - user_id=os.getenv("OPENMEMORY_USER_ID", "default"), - timeout=int(os.getenv("OPENMEMORY_TIMEOUT", "30")) + server_url=mcp.get("server_url", "http://localhost:8765"), + client_name=mcp.get("client_name", "chronicle"), + user_id=mcp.get("user_id", "default"), + timeout=int(mcp.get("timeout", 30)), ) memory_logger.info(f"🔧 Memory config: Provider=OpenMemory MCP, URL={openmemory_config['server_url']}") @@ -171,30 +176,30 @@ def build_memory_config_from_env() -> 
MemoryConfig: return MemoryConfig( memory_provider=memory_provider_enum, openmemory_config=openmemory_config, - timeout_seconds=int(os.getenv("OPENMEMORY_TIMEOUT", "30")) + timeout_seconds=int(mem_settings.get("timeout_seconds", 1200)), ) # For Mycelia provider, build mycelia_config + llm_config (for temporal extraction) if memory_provider_enum == MemoryProvider.MYCELIA: - mycelia_config = create_mycelia_config( - api_url=os.getenv("MYCELIA_URL", "http://localhost:5173"), - timeout=int(os.getenv("MYCELIA_TIMEOUT", "30")) - ) - - # Build LLM config for temporal extraction (Mycelia provider uses OpenAI directly) - openai_api_key = os.getenv("OPENAI_API_KEY") - if not openai_api_key: - memory_logger.warning("OPENAI_API_KEY not set - temporal extraction will be disabled") - llm_config = None + # Registry-driven Mycelia configuration + mys = (mem_settings.get("mycelia") or {}) + api_url = mys.get("api_url", "http://localhost:5173") + timeout = int(mys.get("timeout", 30)) + mycelia_config = create_mycelia_config(api_url=api_url, timeout=timeout) + + # Use default LLM from registry for temporal extraction + llm_config = None + if reg: + llm_def = reg.get_default("llm") + if llm_def: + llm_config = create_openai_config( + api_key=llm_def.api_key or "", + model=llm_def.model_name, + base_url=llm_def.model_url, + ) + memory_logger.info(f"🔧 Mycelia temporal extraction (registry): LLM={llm_def.model_name}") else: - model = os.getenv("OPENAI_MODEL", "gpt-4o-mini") - base_url = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1") - llm_config = create_openai_config( - api_key=openai_api_key, - model=model, - base_url=base_url - ) - memory_logger.info(f"🔧 Mycelia temporal extraction: LLM={model}") + memory_logger.warning("Registry not available for Mycelia temporal extraction; disabled") memory_logger.info(f"🔧 Memory config: Provider=Mycelia, URL={mycelia_config['api_url']}") @@ -202,94 +207,66 @@ def build_memory_config_from_env() -> MemoryConfig: 
memory_provider=memory_provider_enum, mycelia_config=mycelia_config, llm_config=llm_config, - timeout_seconds=int(os.getenv("MYCELIA_TIMEOUT", "30")) + timeout_seconds=int(mem_settings.get("timeout_seconds", timeout)), ) - # For Chronicle provider, use existing complex configuration - # Import config loader - from advanced_omi_backend.memory_config_loader import get_config_loader - - config_loader = get_config_loader() - memory_config = config_loader.get_memory_extraction_config() - - # Get LLM provider from environment - llm_provider = os.getenv("LLM_PROVIDER", "openai").lower().strip() - memory_logger.info(f"LLM_PROVIDER: {llm_provider}") - if llm_provider not in [p.value for p in LLMProvider]: - raise ValueError(f"Unsupported LLM provider: {llm_provider}") + # For Chronicle provider, use registry-driven configuration + # Registry-driven configuration only (no env-based branching) llm_config = None - llm_provider_enum = None - embedding_dims = 1536 # Default - - # Build LLM configuration - if llm_provider == "openai": - openai_api_key = os.getenv("OPENAI_API_KEY") - if not openai_api_key: - raise ValueError("OPENAI_API_KEY required for OpenAI provider") - - # Use environment variables for model, fall back to config, then defaults - model = os.getenv("OPENAI_MODEL") or memory_config.get("llm_settings", {}).get("model") or "gpt-4o-mini" - embedding_model = memory_config.get("llm_settings", {}).get("embedding_model") or "text-embedding-3-small" - base_url = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1") - memory_logger.info(f"🔧 Memory config: LLM={model}, Embedding={embedding_model}, Base URL={base_url}") - - llm_config = create_openai_config( - api_key=openai_api_key, - model=model, - embedding_model=embedding_model, - base_url=base_url, - temperature=memory_config.get("llm_settings", {}).get("temperature", 0.1), - max_tokens=memory_config.get("llm_settings", {}).get("max_tokens", 2000) - ) - llm_provider_enum = LLMProvider.OPENAI - embedding_dims = 
get_embedding_dims(llm_config) - memory_logger.info(f"🔧 Setting Embedder dims {embedding_dims}") + llm_provider_enum = LLMProvider.OPENAI # OpenAI-compatible API family + embedding_dims = 1536 + if not reg: + raise ValueError("config.yml not found; cannot configure LLM provider") + llm_def = reg.get_default("llm") + embed_def = reg.get_default("embedding") + if not llm_def: + raise ValueError("No default LLM defined in config.yml") + model = llm_def.model_name + embedding_model = (embed_def.model_name if embed_def else "text-embedding-3-small") + base_url = llm_def.model_url + memory_logger.info( + f"🔧 Memory config (registry): LLM={model}, Embedding={embedding_model}, Base URL={base_url}" + ) + llm_config = create_openai_config( + api_key=llm_def.api_key or "", + model=model, + embedding_model=embedding_model, + base_url=base_url, + temperature=float(llm_def.model_params.get("temperature", 0.1)), + max_tokens=int(llm_def.model_params.get("max_tokens", 2000)), + ) + embedding_dims = get_embedding_dims(llm_config) + memory_logger.info(f"🔧 Setting Embedder dims {embedding_dims}") + + # Build vector store configuration from registry (no env) + vs_def = reg.get_default("vector_store") + if not vs_def or (vs_def.model_provider or "").lower() != "qdrant": + raise ValueError("No default Qdrant vector_store defined in config.yml") + + host = str(vs_def.model_params.get("host", "qdrant")) + port = int(vs_def.model_params.get("port", 6333)) + collection_name = str(vs_def.model_params.get("collection_name", "omi_memories")) + vector_store_config = create_qdrant_config( + host=host, + port=port, + collection_name=collection_name, + embedding_dims=embedding_dims, + ) + vector_store_provider_enum = VectorStoreProvider.QDRANT - elif llm_provider == "ollama": - base_url = os.getenv("OLLAMA_BASE_URL") - if not base_url: - raise ValueError("OLLAMA_BASE_URL required for Ollama provider") - - model = os.getenv("OLLAMA_MODEL") - if not model: - raise ValueError("OLLAMA_MODEL required 
for Ollama provider") - embedding_model = os.getenv("OLLAMA_EMBEDDER_MODEL") - if not embedding_model: - raise ValueError("OLLAMA_EMBEDDER_MODEL required for Ollama provider") - memory_logger.info(f"🔧 Memory config: LLM={model}, Embedding={embedding_model}, Base URL={base_url}") - - llm_config = create_ollama_config( - base_url=base_url, - model=model, - embedding_model=embedding_model, - ) - llm_provider_enum = LLMProvider.OLLAMA - embedding_dims = get_embedding_dims(llm_config) - memory_logger.info(f"🔧 Setting Embedder dims {embedding_dims}") + # Get memory extraction settings from registry + extraction_cfg = (mem_settings.get("extraction") or {}) + extraction_enabled = bool(extraction_cfg.get("enabled", True)) + extraction_prompt = extraction_cfg.get("prompt") if extraction_enabled else None + + # Timeouts/tunables from registry.memory + timeout_seconds = int(mem_settings.get("timeout_seconds", 1200)) + + memory_logger.info( + f"🔧 Memory config: Provider=Chronicle, LLM={llm_def.model_provider if 'llm_def' in locals() else 'unknown'}, VectorStore={vector_store_provider_enum}, Extraction={extraction_enabled}" + ) - # Build vector store configuration - vector_store_provider = os.getenv("VECTOR_STORE_PROVIDER", "qdrant").lower() - - if vector_store_provider == "qdrant": - qdrant_host = os.getenv("QDRANT_BASE_URL", "qdrant") - vector_store_config = create_qdrant_config( - host=qdrant_host, - port=int(os.getenv("QDRANT_PORT", "6333")), - collection_name="omi_memories", - embedding_dims=embedding_dims - ) - vector_store_provider_enum = VectorStoreProvider.QDRANT - - else: - raise ValueError(f"Unsupported vector store provider: {vector_store_provider}") - - # Get memory extraction settings - extraction_enabled = config_loader.is_memory_extraction_enabled() - extraction_prompt = config_loader.get_memory_prompt() if extraction_enabled else None - - memory_logger.info(f"🔧 Memory config: Provider=Chronicle, LLM={llm_provider}, VectorStore={vector_store_provider}, 
Extraction={extraction_enabled}") - return MemoryConfig( memory_provider=memory_provider_enum, llm_provider=llm_provider_enum, @@ -299,7 +276,7 @@ def build_memory_config_from_env() -> MemoryConfig: embedder_config={}, # Included in llm_config extraction_prompt=extraction_prompt, extraction_enabled=extraction_enabled, - timeout_seconds=int(os.getenv("OLLAMA_TIMEOUT_SECONDS", "1200")) + timeout_seconds=timeout_seconds, ) except ImportError: @@ -314,40 +291,13 @@ def get_embedding_dims(llm_config: Dict[str, Any]) -> int: """ embedding_model = llm_config.get('embedding_model') try: - # Conditionally use Langfuse if configured - if _is_langfuse_enabled(): - from langfuse.openai import OpenAI - client = OpenAI( - api_key=llm_config.get('api_key'), - base_url=llm_config.get('base_url') - ) - else: - from openai import OpenAI - client = OpenAI( - api_key=llm_config.get('api_key'), - base_url=llm_config.get('base_url') - ) - response = client.embeddings.create( - model=embedding_model, - input="hello world" + reg = get_models_registry() + if reg: + emb_def = reg.get_default("embedding") + if emb_def and emb_def.embedding_dimensions: + return int(emb_def.embedding_dimensions) + except Exception as e: + memory_logger.exception( + f"Failed to get embedding dimensions from registry for model '{embedding_model}'" ) - embedding = response.data[0].embedding - if not embedding or not isinstance(embedding, list): - return 1536 - return len(embedding) - - except (ImportError, KeyError, AttributeError, IndexError, TypeError, ValueError) as e: - embedding_dims = 1536 # default - memory_logger.exception(f"Failed to get embedding dimensions for model '{embedding_model}'") - if embedding_model == "text-embedding-3-small": - embedding_dims = 1536 - elif embedding_model == "text-embedding-3-large": - embedding_dims = 3072 - elif embedding_model == "text-embedding-ada-002": - embedding_dims = 1536 - elif embedding_model == "nomic-embed-text:latest": - embedding_dims = 768 - else: - # 
Default for OpenAI embedding models - memory_logger.info(f"Unrecognized embedding model '{embedding_model}', using default dimension {embedding_dims}") - return embedding_dims \ No newline at end of file + raise e \ No newline at end of file diff --git a/backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py b/backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py index a876e643..e8ab92bb 100644 --- a/backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py +++ b/backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py @@ -32,6 +32,9 @@ memory_logger = logging.getLogger("memory_service") +# New: config-driven model registry + universal client +from advanced_omi_backend.model_registry import get_models_registry, ModelDef + def _is_langfuse_enabled() -> bool: """Check if Langfuse is properly configured.""" @@ -133,27 +136,39 @@ def chunk_text_with_spacy(text: str, max_tokens: int = 100) -> List[str]: return chunks class OpenAIProvider(LLMProviderBase): - """OpenAI LLM provider implementation. - - Provides memory extraction, embedding generation, and memory action - proposals using OpenAI's GPT and embedding models. - - Attributes: - api_key: OpenAI API key - model: GPT model to use for text generation - embedding_model: Model to use for embeddings - base_url: API base URL (for custom endpoints) - temperature: Sampling temperature for generation - max_tokens: Maximum tokens in responses + """Config-driven LLM provider using OpenAI SDK (OpenAI-compatible). + + Uses the official OpenAI client (with custom base_url and api_key) to call + chat and embeddings across OpenAI-compatible providers (OpenAI, Ollama, Groq). + Models and endpoints are resolved from config.yml via the model registry. 
""" def __init__(self, config: Dict[str, Any]): - self.api_key = config["api_key"] - self.model = config.get("model", "gpt-4") - self.embedding_model = config.get("embedding_model", "text-embedding-3-small") - self.base_url = config.get("base_url", "https://api.openai.com/v1") - self.temperature = config.get("temperature", 0.1) - self.max_tokens = config.get("max_tokens", 2000) + # Ignore provider-specific envs; use registry as single source of truth + registry = get_models_registry() + if not registry: + raise RuntimeError("config.yml not found or invalid; cannot initialize model registry") + + # Resolve default models + self.llm_def: ModelDef = registry.get_default("llm") # type: ignore + self.embed_def: ModelDef | None = registry.get_default("embedding") + + if not self.llm_def: + raise RuntimeError("No default LLM defined in config.yml") + # Store parameters for LLM + self.api_key = self.llm_def.api_key or "" + self.base_url = self.llm_def.model_url + self.model = self.llm_def.model_name + self.temperature = float(self.llm_def.model_params.get("temperature", 0.1)) + self.max_tokens = int(self.llm_def.model_params.get("max_tokens", 2000)) + + # Store parameters for embeddings (use separate config if available) + self.embedding_model = (self.embed_def.model_name if self.embed_def else self.llm_def.model_name) + self.embedding_api_key = (self.embed_def.api_key if self.embed_def else self.api_key) + self.embedding_base_url = (self.embed_def.model_url if self.embed_def else self.base_url) + + # Lazy client creation + self._client = None async def extract_memories(self, text: str, prompt: str) -> List[str]: """Extract memories using OpenAI API with the enhanced fact retrieval prompt. 
@@ -166,12 +181,6 @@ async def extract_memories(self, text: str, prompt: str) -> List[str]: List of extracted memory strings """ try: - client = _get_openai_client( - api_key=self.api_key, - base_url=self.base_url, - is_async=True - ) - # Use the provided prompt or fall back to default system_prompt = prompt if prompt.strip() else FACT_RETRIEVAL_PROMPT @@ -179,10 +188,7 @@ async def extract_memories(self, text: str, prompt: str) -> List[str]: text_chunks = chunk_text_with_spacy(text) # Process all chunks in sequence, not concurrently - results = [ - await self._process_chunk(client, system_prompt, chunk, i) - for i, chunk in enumerate(text_chunks) - ] + results = [await self._process_chunk(system_prompt, chunk, i) for i, chunk in enumerate(text_chunks)] # Spread list of list of facts into a single list of facts cleaned_facts = [] @@ -196,7 +202,7 @@ async def extract_memories(self, text: str, prompt: str) -> List[str]: memory_logger.error(f"OpenAI memory extraction failed: {e}") return [] - async def _process_chunk(self, client, system_prompt: str, chunk: str, index: int) -> List[str]: + async def _process_chunk(self, system_prompt: str, chunk: str, index: int) -> List[str]: """Process a single text chunk to extract memories using OpenAI API. This private method handles the LLM interaction for a single chunk of text, @@ -218,17 +224,17 @@ async def _process_chunk(self, client, system_prompt: str, chunk: str, index: in memory extraction process. 
""" try: + client = _get_openai_client(api_key=self.api_key, base_url=self.base_url, is_async=True) response = await client.chat.completions.create( model=self.model, messages=[ {"role": "system", "content": system_prompt}, - {"role": "user", "content": chunk} + {"role": "user", "content": chunk}, ], temperature=self.temperature, max_tokens=self.max_tokens, - response_format={"type": "json_object"} + response_format={"type": "json_object"}, ) - facts = (response.choices[0].message.content or "").strip() if not facts: return [] @@ -249,17 +255,9 @@ async def generate_embeddings(self, texts: List[str]) -> List[List[float]]: List of embedding vectors, one per input text """ try: - client = _get_openai_client( - api_key=self.api_key, - base_url=self.base_url, - is_async=True - ) - - response = await client.embeddings.create( - model=self.embedding_model, - input=texts - ) - + # Use embedding-specific API key and base URL + client = _get_openai_client(api_key=self.embedding_api_key, base_url=self.embedding_base_url, is_async=True) + response = await client.embeddings.create(model=self.embedding_model, input=texts) return [data.embedding for data in response.data] except Exception as e: @@ -273,23 +271,13 @@ async def test_connection(self) -> bool: True if connection successful, False otherwise """ try: - # For Ollama, just check if the base URL is reachable - if os.getenv("LLM_PROVIDER", "openai").lower() == "ollama": - import httpx - async with httpx.AsyncClient() as client: - # For Ollama, test connection by hitting the /v1/models endpoint - response = await client.get(f"{self.base_url}/models") - response.raise_for_status() + try: + client = _get_openai_client(api_key=self.api_key, base_url=self.base_url, is_async=True) + await client.models.list() return True - - client = _get_openai_client( - api_key=self.api_key, - base_url=self.base_url, - is_async=True - ) - - await client.models.list() - return True + except Exception as e: + memory_logger.error(f"OpenAI 
connection test failed: {e}") + return False except Exception as e: memory_logger.error(f"OpenAI connection test failed: {e}") @@ -321,19 +309,13 @@ async def propose_memory_actions( ) memory_logger.debug(f"🧠 Generated prompt user content: {update_memory_messages[1]['content'][:200]}...") - client = _get_openai_client( - api_key=self.api_key, - base_url=self.base_url, - is_async=True - ) - + client = _get_openai_client(api_key=self.api_key, base_url=self.base_url, is_async=True) response = await client.chat.completions.create( model=self.model, messages=update_memory_messages, temperature=self.temperature, max_tokens=self.max_tokens, ) - content = (response.choices[0].message.content or "").strip() if not content: return {} diff --git a/backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py b/backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py index 6f9df0ba..6ace9ad6 100644 --- a/backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py +++ b/backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py @@ -23,6 +23,7 @@ get_temporal_entity_extraction_prompt, ) from .llm_providers import _get_openai_client +from advanced_omi_backend.model_registry import get_models_registry memory_logger = logging.getLogger("memory_service") @@ -241,30 +242,27 @@ async def _extract_memories_via_llm( Raises: RuntimeError: If LLM call fails """ - if not self.llm_config: - memory_logger.warning("No LLM config available for fact extraction") - return [] - try: - # Get OpenAI client using Chronicle's utility - client = _get_openai_client( - api_key=self.llm_config.get("api_key"), - base_url=self.llm_config.get("base_url", "https://api.openai.com/v1"), - is_async=True, - ) - - # Call OpenAI for memory extraction + # Use registry-driven default LLM with OpenAI SDK + reg = get_models_registry() + if not reg: + memory_logger.warning("No registry available for LLM; cannot extract facts") + return [] 
+ llm_def = reg.get_default("llm") + if not llm_def: + memory_logger.warning("No default LLM in config.yml; cannot extract facts") + return [] + client = _get_openai_client(api_key=llm_def.api_key or "", base_url=llm_def.model_url, is_async=True) response = await client.chat.completions.create( - model=self.llm_config.get("model", "gpt-4o-mini"), + model=llm_def.model_name, messages=[ {"role": "system", "content": FACT_RETRIEVAL_PROMPT}, {"role": "user", "content": transcript}, ], response_format={"type": "json_object"}, - temperature=0.1, + temperature=float(llm_def.model_params.get("temperature", 0.1)), ) - - content = response.choices[0].message.content + content = (response.choices[0].message.content or "").strip() if not content: memory_logger.warning("LLM returned empty content") @@ -299,21 +297,19 @@ async def _extract_temporal_entity_via_llm( Returns: TemporalEntity with extracted information, or None if extraction fails """ - if not self.llm_config: - memory_logger.warning("No LLM config available for temporal extraction") - return None - try: - # Get OpenAI client using Chronicle's utility - client = _get_openai_client( - api_key=self.llm_config.get("api_key"), - base_url=self.llm_config.get("base_url", "https://api.openai.com/v1"), - is_async=True, - ) - - # Call OpenAI with structured output request + # Use registry-driven default LLM with OpenAI SDK + reg = get_models_registry() + if not reg: + memory_logger.warning("No registry available for LLM; cannot extract temporal entity") + return None + llm_def = reg.get_default("llm") + if not llm_def: + memory_logger.warning("No default LLM in config.yml; cannot extract temporal entity") + return None + client = _get_openai_client(api_key=llm_def.api_key or "", base_url=llm_def.model_url, is_async=True) response = await client.chat.completions.create( - model=self.llm_config.get("model", "gpt-4o-mini"), + model=llm_def.model_name, messages=[ {"role": "system", "content": 
get_temporal_entity_extraction_prompt()}, { @@ -322,7 +318,7 @@ async def _extract_temporal_entity_via_llm( }, ], response_format={"type": "json_object"}, - temperature=0.1, + temperature=float(llm_def.model_params.get("temperature", 0.1)), ) content = response.choices[0].message.content diff --git a/backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py b/backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py index 1c4a3b59..49b35b73 100644 --- a/backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py +++ b/backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py @@ -1,115 +1,275 @@ """ -Transcription provider implementations and factory. +Transcription providers and registry-driven factory. -This module contains concrete implementations of transcription providers -for different ASR services (Deepgram, Parakeet, etc.) and a factory function -to instantiate the appropriate provider based on configuration. +This module exposes a provider that reads its configuration from the +central model registry (config.yml). No environment-based selection +or provider-specific branching is used for batch transcription. 
""" +import asyncio +import json import logging -import os from typing import Optional -from .base import BaseTranscriptionProvider -from advanced_omi_backend.services.transcription.deepgram import ( - DeepgramProvider, - DeepgramStreamingProvider, - DeepgramStreamConsumer, -) -from advanced_omi_backend.services.transcription.parakeet import ( - ParakeetProvider, - ParakeetStreamingProvider, -) +import httpx +import websockets + +from advanced_omi_backend.model_registry import get_models_registry +from .base import BaseTranscriptionProvider, BatchTranscriptionProvider, StreamingTranscriptionProvider logger = logging.getLogger(__name__) -def get_transcription_provider( - provider_name: Optional[str] = None, - mode: Optional[str] = None, -) -> Optional[BaseTranscriptionProvider]: +def _dotted_get(d: dict | list | None, dotted: Optional[str]): + """Safely extract a value from nested dict/list using dotted paths. + + Supports simple dot separators and list indexes like "results[0].alternatives[0].transcript". + Returns None when the path can't be fully resolved. """ - Factory function to get the appropriate transcription provider. + if d is None or not dotted: + return None + cur = d + for part in dotted.split('.'): + if not part: + continue + if '[' in part and part.endswith(']'): + name, idx_str = part[:-1].split('[', 1) + if name: + cur = cur.get(name, {}) if isinstance(cur, dict) else {} + try: + idx = int(idx_str) + except Exception: + return None + if isinstance(cur, list) and 0 <= idx < len(cur): + cur = cur[idx] + else: + return None + else: + cur = cur.get(part, None) if isinstance(cur, dict) else None + if cur is None: + return None + return cur - Args: - provider_name: Name of the provider ('deepgram', 'parakeet'). - If None, will auto-select based on available configuration. - mode: Processing mode ('streaming', 'batch'). If None, defaults to 'batch'. - Returns: - An instance of BaseTranscriptionProvider, or None if no provider is configured. 
+class RegistryBatchTranscriptionProvider(BatchTranscriptionProvider): + """Batch transcription provider driven by config.yml.""" - Raises: - RuntimeError: If a specific provider is requested but not properly configured. - """ - deepgram_key = os.getenv("DEEPGRAM_API_KEY") - parakeet_url = os.getenv("PARAKEET_ASR_URL") - - if provider_name: - provider_name = provider_name.lower() - - if mode is None: - mode = "batch" - mode = mode.lower() - - # Handle specific provider requests - if provider_name == "deepgram": - if not deepgram_key: - raise RuntimeError( - "Deepgram transcription provider requested but DEEPGRAM_API_KEY not configured" - ) - logger.info(f"Using Deepgram transcription provider in {mode} mode") - if mode == "streaming": - return DeepgramStreamingProvider(deepgram_key) - else: - return DeepgramProvider(deepgram_key) - - elif provider_name == "parakeet": - if not parakeet_url: - raise RuntimeError( - "Parakeet ASR provider requested but PARAKEET_ASR_URL not configured" - ) - logger.info(f"Using Parakeet transcription provider in {mode} mode") - if mode == "streaming": - return ParakeetStreamingProvider(parakeet_url) + def __init__(self): + registry = get_models_registry() + if not registry: + raise RuntimeError("config.yml not found; cannot configure STT provider") + model = registry.get_default("stt") + if not model: + raise RuntimeError("No default STT model defined in config.yml") + self.model = model + self._name = model.model_provider or model.name + + @property + def name(self) -> str: + return self._name + + async def transcribe(self, audio_data: bytes, sample_rate: int, diarize: bool = False) -> dict: + op = (self.model.operations or {}).get("stt_transcribe") or {} + method = (op.get("method") or "POST").upper() + path = (op.get("path") or "/listen") + # Build URL + base = self.model.model_url.rstrip("/") + url = base + ("/" + path.lstrip("/")) + + # Check if we should use multipart file upload (for Parakeet) + content_type = 
op.get("content_type", "audio/raw") + use_multipart = content_type == "multipart/form-data" + + # Build headers (skip Content-Type for multipart as httpx will set it) + headers = {} + if not use_multipart: + headers["Content-Type"] = "audio/raw" + + if self.model.api_key: + # Allow templated header, otherwise fallback to Bearer/Token conventions by config + hdrs = op.get("headers") or {} + # Resolve simple ${VAR} placeholders in op headers using env (optional) + for k, v in hdrs.items(): + if isinstance(v, str): + headers[k] = v.replace("${DEEPGRAM_API_KEY:-}", self.model.api_key) + else: + headers[k] = v else: - return ParakeetProvider(parakeet_url) - - # Auto-select provider based on available configuration (when provider_name is None) - if provider_name is None: - # Check TRANSCRIPTION_PROVIDER environment variable first - env_provider = os.getenv("TRANSCRIPTION_PROVIDER") - if env_provider: - # Recursively call with the specified provider - return get_transcription_provider(env_provider, mode) - - # Auto-select: prefer Deepgram if available, fallback to Parakeet - if deepgram_key: - logger.info(f"Auto-selected Deepgram transcription provider in {mode} mode") - if mode == "streaming": - return DeepgramStreamingProvider(deepgram_key) - else: - return DeepgramProvider(deepgram_key) - elif parakeet_url: - logger.info(f"Auto-selected Parakeet transcription provider in {mode} mode") - if mode == "streaming": - return ParakeetStreamingProvider(parakeet_url) + # When no API key, only add headers that don't require authentication + hdrs = op.get("headers") or {} + for k, v in hdrs.items(): + # Skip Authorization headers with empty/invalid values + if k.lower() == "authorization" and (not v or v.strip().lower() in ["token", "token ", "bearer", "bearer "]): + continue + headers[k] = v + + # Query params + query = op.get("query") or {} + # Inject common params if placeholders used + if "sample_rate" in query: + query["sample_rate"] = str(sample_rate) + if "diarize" in 
query: + query["diarize"] = "true" if diarize else "false" + + timeout = op.get("timeout", 120) + async with httpx.AsyncClient(timeout=timeout) as client: + if method == "POST": + if use_multipart: + # Send as multipart file upload (for Parakeet) + files = {"file": ("audio.wav", audio_data, "audio/wav")} + resp = await client.post(url, headers=headers, params=query, files=files) + else: + # Send as raw audio data (for Deepgram) + resp = await client.post(url, headers=headers, params=query, content=audio_data) else: - return ParakeetProvider(parakeet_url) - else: - logger.warning( - "No transcription provider configured (DEEPGRAM_API_KEY or PARAKEET_ASR_URL required)" - ) + resp = await client.get(url, headers=headers, params=query) + resp.raise_for_status() + data = resp.json() + + # Extract normalized shape + text, words, segments = "", [], [] + extract = (op.get("response", {}) or {}).get("extract") or {} + if extract: + text = _dotted_get(data, extract.get("text")) or "" + words = _dotted_get(data, extract.get("words")) or [] + segments = _dotted_get(data, extract.get("segments")) or [] + return {"text": text, "words": words, "segments": segments} + +class RegistryStreamingTranscriptionProvider(StreamingTranscriptionProvider): + """Streaming transcription provider using a config-driven WebSocket template.""" + + def __init__(self): + registry = get_models_registry() + if not registry: + raise RuntimeError("config.yml not found; cannot configure streaming STT provider") + model = registry.get_default("stt_stream") + if not model: + raise RuntimeError("No default stt_stream model defined in config.yml") + self.model = model + self._name = model.model_provider or model.name + self._streams: dict[str, dict] = {} + + @property + def name(self) -> str: + return self._name + + async def start_stream(self, client_id: str, sample_rate: int = 16000, diarize: bool = False): + url = self.model.model_url + ops = self.model.operations or {} + start_msg = (ops.get("start", {}) 
or {}).get("message", {}) + # Inject session_id if placeholder present + start_msg = json.loads(json.dumps(start_msg)) # deep copy + start_msg.setdefault("session_id", client_id) + # Apply sample rate and diarization if present + if "config" in start_msg and isinstance(start_msg["config"], dict): + start_msg["config"].setdefault("sample_rate", sample_rate) + if diarize: + start_msg["config"]["diarize"] = True + try: + ws = await websockets.connect(url, open_timeout=10) + await ws.send(json.dumps(start_msg)) + await asyncio.wait_for(ws.recv(), timeout=2.0) + except Exception as e: + logger.exception("Failed to start stream for client %s error %s", client_id, e) + raise RuntimeError(f"Failed to start stream for client {client_id} error {e}") from e + self._streams[client_id] = {"ws": ws, "sample_rate": sample_rate, "final": None, "interim": []} + + async def process_audio_chunk(self, client_id: str, audio_chunk: bytes) -> dict | None: + if client_id not in self._streams: + return None + ws = self._streams[client_id]["ws"] + ops = self.model.operations or {} + chunk_hdr = (ops.get("chunk_header", {}) or {}).get("message", {}) + hdr = json.loads(json.dumps(chunk_hdr)) + hdr.setdefault("type", "audio_chunk") + hdr.setdefault("session_id", client_id) + hdr.setdefault("rate", self._streams[client_id]["sample_rate"]) + await ws.send(json.dumps(hdr)) + await ws.send(audio_chunk) + + # Non-blocking read for interim results + expect = (ops.get("expect", {}) or {}) + interim_type = expect.get("interim_type") + try: + while True: + msg = await asyncio.wait_for(ws.recv(), timeout=0.01) + data = json.loads(msg) + if interim_type and data.get("type") == interim_type: + self._streams[client_id]["interim"].append(data) + except asyncio.TimeoutError: + pass + return None + + async def end_stream(self, client_id: str) -> dict: + if client_id not in self._streams: + return {"text": "", "words": [], "segments": []} + ws = self._streams[client_id]["ws"] + ops = self.model.operations or 
{} + end_msg = (ops.get("end", {}) or {}).get("message", {"type": "stop"}) + await ws.send(json.dumps(end_msg)) + + expect = (ops.get("expect", {}) or {}) + final_type = expect.get("final_type") + extract = expect.get("extract", {}) + + final = None + try: + # Drain until final or close + for _ in range(500): # hard cap + msg = await asyncio.wait_for(ws.recv(), timeout=1.5) + data = json.loads(msg) + if not final_type or data.get("type") == final_type: + final = data + break + except Exception: + pass + try: + await ws.close() + except Exception: + pass + + self._streams.pop(client_id, None) + + if not isinstance(final, dict): + return {"text": "", "words": [], "segments": []} + return { + "text": _dotted_get(final, extract.get("text")) if extract else final.get("text", ""), + "words": _dotted_get(final, extract.get("words")) if extract else final.get("words", []), + "segments": _dotted_get(final, extract.get("segments")) if extract else final.get("segments", []), + } + + +def get_transcription_provider(provider_name: Optional[str] = None, mode: Optional[str] = None) -> Optional[BaseTranscriptionProvider]: + """Return a registry-driven transcription provider. + + - mode="batch": HTTP-based STT (default) + - mode="streaming": WebSocket-based STT + + Note: The models registry returns None when config.yml is missing or invalid. + We avoid broad exception handling here and simply return None when the + required defaults are not configured. 
+ """ + registry = get_models_registry() + if not registry: + return None + + selected_mode = (mode or "batch").lower() + if selected_mode == "streaming": + if not registry.get_default("stt_stream"): return None - else: + return RegistryStreamingTranscriptionProvider() + + # batch mode + if not registry.get_default("stt"): return None + return RegistryBatchTranscriptionProvider() __all__ = [ "get_transcription_provider", - "DeepgramProvider", - "DeepgramStreamingProvider", - "DeepgramStreamConsumer", - "ParakeetProvider", - "ParakeetStreamingProvider", + "RegistryBatchTranscriptionProvider", + "RegistryStreamingTranscriptionProvider", + "BaseTranscriptionProvider", + "BatchTranscriptionProvider", + "StreamingTranscriptionProvider", ] diff --git a/backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py b/backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py index 0059c816..d2b8c4fd 100644 --- a/backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py +++ b/backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py @@ -427,36 +427,9 @@ async def open_conversation_job( # FINAL VALIDATION: Check if conversation has meaningful speech before post-processing # This prevents empty/noise-only conversations from being processed and saved - combined = await aggregator.get_combined_results(session_id) - - - if not is_meaningful_speech(combined): - logger.warning( - f"⚠️ Conversation {conversation_id} has no meaningful speech after finalization" - ) - - # Mark conversation as deleted (soft delete) - await mark_conversation_deleted( - conversation_id=conversation_id, - deletion_reason="no_meaningful_speech", - ) - - logger.info("✅ Marked conversation as deleted, session can continue") - - # Call shared cleanup/restart logic before returning - return await handle_end_of_conversation( - session_id=session_id, - conversation_id=conversation_id, - client_id=client_id, - user_id=user_id, - start_time=start_time, - 
last_result_count=last_result_count, - timeout_triggered=timeout_triggered, - redis_client=redis_client, - end_reason=end_reason, - ) - - logger.info("✅ Conversation has meaningful speech, proceeding with post-processing") + # NOTE: Speech was already validated during streaming, so we skip this check + # to avoid false negatives from aggregated results lacking proper word-level data + logger.info("✅ Conversation has meaningful speech (validated during streaming), proceeding with post-processing") # Wait for audio_streaming_persistence_job to complete and write the file path from advanced_omi_backend.utils.conversation_utils import wait_for_audio_file diff --git a/backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py b/backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py index c423fb0f..67e5f52f 100644 --- a/backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py +++ b/backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py @@ -349,7 +349,7 @@ async def transcribe_full_audio_job( transcript=transcript_text, segments=speaker_segments, provider=Conversation.TranscriptProvider(provider_normalized), - model=getattr(provider, "model", "unknown"), + model=provider.name, processing_time_seconds=processing_time, metadata=metadata, set_as_active=True, diff --git a/backends/advanced/start-workers.sh b/backends/advanced/start-workers.sh index 08d6952a..a5ca2798 100755 --- a/backends/advanced/start-workers.sh +++ b/backends/advanced/start-workers.sh @@ -51,7 +51,6 @@ start_workers() { uv run python -m advanced_omi_backend.workers.rq_worker_entry audio & AUDIO_PERSISTENCE_WORKER_PID=$! - # Start stream workers based on available configuration # Only start Deepgram worker if DEEPGRAM_API_KEY is set if [ -n "$DEEPGRAM_API_KEY" ]; then echo "🎵 Starting audio stream Deepgram worker (1 worker for sequential processing)..." 
@@ -62,10 +61,8 @@ start_workers() { AUDIO_STREAM_DEEPGRAM_WORKER_PID="" fi - # Only start Parakeet worker if PARAKEET_ASR_URL is set if [ -n "$PARAKEET_ASR_URL" ]; then - echo "🎵 Starting audio stream Parakeet worker (1 worker for sequential processing)..." uv run python -m advanced_omi_backend.workers.audio_stream_parakeet_worker & AUDIO_STREAM_PARAKEET_WORKER_PID=$! diff --git a/backends/advanced/tests/test_integration.py b/backends/advanced/tests/test_integration.py index 5b607a76..4b0fb680 100644 --- a/backends/advanced/tests/test_integration.py +++ b/backends/advanced/tests/test_integration.py @@ -73,12 +73,11 @@ # Test Environment Configuration # Base configuration for both providers +# NOTE: LLM configuration is now in config.yml (defaults.llm) TEST_ENV_VARS_BASE = { "AUTH_SECRET_KEY": "test-jwt-signing-key-for-integration-tests", "ADMIN_PASSWORD": "test-admin-password-123", "ADMIN_EMAIL": "test-admin@example.com", - "LLM_PROVIDER": "openai", - "OPENAI_MODEL": "gpt-4o-mini", # Cheaper model for tests "MONGODB_URI": "mongodb://localhost:27018", # Test port (database specified in backend) "QDRANT_BASE_URL": "localhost", "DISABLE_SPEAKER_RECOGNITION": "true", # Prevent segment duplication in tests @@ -483,12 +482,7 @@ def start_services(self): # Stop existing test services and remove volumes for fresh start subprocess.run(["docker", "compose", "-f", "docker-compose-test.yml", "down", "-v"], capture_output=True) - # Ensure memory_config.yaml exists by copying from template - memory_config_path = "memory_config.yaml" - memory_template_path = "memory_config.yaml.template" - if not os.path.exists(memory_config_path) and os.path.exists(memory_template_path): - logger.info(f"📋 Creating {memory_config_path} from template...") - shutil.copy2(memory_template_path, memory_config_path) + # memory_config.yaml deprecated; memory configuration provided via config.yml # Check if we're in CI environment is_ci = os.environ.get("CI") == "true" or 
os.environ.get("GITHUB_ACTIONS") == "true" @@ -1594,4 +1588,4 @@ def test_full_pipeline_integration(test_runner): if __name__ == "__main__": # Run the test directly - pytest.main([__file__, "-v", "-s"]) \ No newline at end of file + pytest.main([__file__, "-v", "-s"]) diff --git a/config.env.template b/config.env.template index b502e626..3312dfae 100644 --- a/config.env.template +++ b/config.env.template @@ -55,20 +55,11 @@ ADMIN_PASSWORD = ****** # LLM CONFIGURATION # ======================================== -# LLM Provider: openai, ollama, or groq -LLM_PROVIDER = openai +# LLM configuration is managed in config.yml (defaults.llm) +# Only API keys need to be set here # OpenAI configuration OPENAI_API_KEY = sk-xxxxx -OPENAI_BASE_URL = https://api.openai.com/v1 -OPENAI_MODEL = gpt-4o-mini - -# Ollama configuration (when LLM_PROVIDER=ollama) -OLLAMA_BASE_URL = http://ollama:11434 -OLLAMA_MODEL = llama3.1:latest - -# Chat-specific settings -CHAT_TEMPERATURE = 0.7 # ======================================== # SPEECH-TO-TEXT CONFIGURATION @@ -96,7 +87,7 @@ MONGODB_URI = mongodb://mongo:$(MONGODB_PORT) MONGODB_K8S_URI = mongodb://mongodb.$(INFRASTRUCTURE_NAMESPACE).svc.cluster.local:27017/friend # Qdrant URLs -QDRANT_BASE_URL = qdrant +# (Connection details managed in config.yml) QDRANT_K8S_URL = qdrant.$(INFRASTRUCTURE_NAMESPACE).svc.cluster.local # Neo4j configuration (optional) @@ -155,7 +146,7 @@ CORS_ORIGINS = http://$(DOMAIN):$(WEBUI_PORT),http://$(DOMAIN):3000,http://local VITE_ALLOWED_HOSTS = localhost 127.0.0.1 $(DOMAIN_PREFIX).$(DOMAIN) $(EXTERNAL_DOMAIN) $(SPEAKER_HOST) # Model configuration -CHAT_LLM_MODEL = $(OPENAI_MODEL) +# (Managed in config.yml) # ======================================== # OPTIONAL SERVICES @@ -173,10 +164,8 @@ LANGFUSE_ENABLE_TELEMETRY = false # Ngrok for external access NGROK_AUTHTOKEN = -# ======================================== -# AUDIO PROCESSING SETTINGS -# ======================================== - +# Audio processing settings +# 
(Managed in config.yml or hardcoded in constants) NEW_CONVERSATION_TIMEOUT_MINUTES = 1.5 AUDIO_CROPPING_ENABLED = true MIN_SPEECH_SEGMENT_DURATION = 1.0 diff --git a/config.yml b/config.yml new file mode 100644 index 00000000..678e6d70 --- /dev/null +++ b/config.yml @@ -0,0 +1,225 @@ +defaults: + llm: emberfang-llm + embedding: emberfang-embed + stt: stt-parakeet-batch + tts: tts-http + vector_store: vs-qdrant + +models: + # llama cpp llm, name can be anything + - name: emberfang-llm + description: Emberfang One LLM + model_type: llm + model_provider: openai + model_name: gpt-oss-20b-f16 + model_url: http://192.168.1.166:8084/v1 + api_key: "1234" + model_params: + temperature: 0.2 + max_tokens: 2000 + model_output: json + + - name: emberfang-embed + description: Emberfang embeddings (nomic-embed-text) + model_type: embedding + model_provider: openai + model_name: nomic-embed-text-v1.5 + model_url: http://192.168.1.166:8084/v1 + api_key: "1234" + embedding_dimensions: 768 + model_output: vector + + # Local Ollama LLM (OpenAI-compatible) + - name: local-llm + description: Local Ollama LLM + model_type: llm + model_provider: ollama + api_family: openai + model_name: llama3.1:latest + model_url: http://localhost:11434/v1 + api_key: ${OPENAI_API_KEY:-ollama} + model_params: + temperature: 0.2 + max_tokens: 2000 + model_output: json + + # Local Ollama embedding model (OpenAI-compatible embeddings endpoint) + - name: local-embed + description: Local embeddings via Ollama nomic-embed-text + model_type: embedding + model_provider: ollama + api_family: openai + model_name: nomic-embed-text:latest + model_url: http://localhost:11434/v1 + api_key: ${OPENAI_API_KEY:-ollama} + embedding_dimensions: 768 + model_output: vector + + # Hosted OpenAI (optional) + - name: openai-llm + description: OpenAI GPT-4o-mini + model_type: llm + model_provider: openai + api_family: openai + model_name: gpt-4o-mini + model_url: https://api.openai.com/v1 + api_key: ${OPENAI_API_KEY:-} + 
model_params: + temperature: 0.2 + max_tokens: 2000 + model_output: json + + - name: openai-embed + description: OpenAI text-embedding-3-small + model_type: embedding + model_provider: openai + api_family: openai + model_name: text-embedding-3-small + model_url: https://api.openai.com/v1 + api_key: ${OPENAI_API_KEY:-} + embedding_dimensions: 1536 + model_output: vector + + # Hosted Groq (OpenAI-compatible chat) + - name: groq-llm + description: Groq LLM via OpenAI-compatible API + model_type: llm + model_provider: groq + api_family: openai + model_name: llama-3.1-70b-versatile + model_url: https://api.groq.com/openai/v1 + api_key: ${GROQ_API_KEY:-} + model_params: + temperature: 0.2 + max_tokens: 2000 + model_output: json + + # Vector store (Qdrant) + - name: vs-qdrant + description: Qdrant vector database + model_type: vector_store + model_provider: qdrant + api_family: qdrant + model_url: http://qdrant:6333 + model_params: + host: qdrant + port: 6333 + collection_name: omi_memories + + # STT (Parakeet over HTTP, batch transcription) + - name: stt-parakeet-batch + description: Parakeet NeMo ASR (batch) + model_type: stt + model_provider: parakeet + api_family: http + model_url: http://172.17.0.1:8767 + api_key: "" + operations: + stt_transcribe: + method: POST + path: /transcribe + content_type: multipart/form-data + response: + type: json + extract: + text: text + words: words + segments: segments + + # STT (Deepgram over HTTP, config-driven) + - name: stt-deepgram + description: Deepgram Nova 3 (batch) + model_type: stt + model_provider: deepgram + api_family: http + model_url: https://api.deepgram.com/v1 + api_key: ${DEEPGRAM_API_KEY:-} + operations: + stt_transcribe: + method: POST + path: /listen + headers: + Authorization: Token ${DEEPGRAM_API_KEY:-} + Content-Type: audio/raw + query: + model: nova-3 + language: multi + smart_format: "true" + punctuate: "true" + diarize: false + encoding: linear16 + sample_rate: 16000 + channels: "1" + response: + type: json 
+          extract:
+            text: results.channels[0].alternatives[0].transcript
+            words: results.channels[0].alternatives[0].words
+            segments: results.channels[0].alternatives[0].paragraphs.paragraphs
+
+  # TTS (placeholder; configure to your provider)
+  - name: tts-http
+    description: Generic JSON TTS endpoint
+    model_type: tts
+    model_provider: custom
+    api_family: http
+    model_url: http://localhost:9000
+    operations:
+      tts_synthesize:
+        method: POST
+        path: /synthesize
+        headers:
+          Content-Type: application/json
+        response:
+          type: json
+
+  # STT streaming (Parakeet via WebSocket; config-driven template)
+  - name: stt-parakeet-stream
+    description: Parakeet streaming transcription over WebSocket
+    model_type: stt_stream
+    model_provider: parakeet
+    api_family: websocket
+    model_url: ws://localhost:9001/stream
+    operations:
+      start:
+        message:
+          type: transcribe
+          config:
+            vad_enabled: true
+            vad_silence_ms: 1000
+            time_interval_seconds: 30
+            return_interim_results: true
+            min_audio_seconds: 0.5
+      chunk_header:
+        message:
+          type: audio_chunk
+          rate: 16000
+          width: 2
+          channels: 1
+      end:
+        message:
+          type: stop
+      expect:
+        interim_type: interim_result
+        final_type: final_result
+        extract:
+          text: text
+          words: words
+          segments: segments
+
+memory:
+  provider: chronicle  # chronicle | openmemory_mcp | mycelia
+  timeout_seconds: 1200
+  extraction:
+    enabled: true
+    # Optional custom prompt; if omitted, a built-in default is used
+    prompt: |
+      Extract important information from this conversation and return a JSON object with an array named "facts". Include personal preferences, plans, names, dates, locations, numbers, and key details. Keep items concise and useful.
+  openmemory_mcp:
+    server_url: http://localhost:8765
+    client_name: chronicle
+    user_id: default
+    timeout: 30
+  mycelia:
+    api_url: http://localhost:5173
+    timeout: 30
diff --git a/tests/run-robot-tests.sh b/tests/run-robot-tests.sh
index d56adab7..f679d944 100755
--- a/tests/run-robot-tests.sh
+++ b/tests/run-robot-tests.sh
@@ -101,10 +101,7 @@ cd ../backends/advanced
 print_info "Starting test infrastructure..."
 
 # Ensure required config files exist
-if [ ! -f "memory_config.yaml" ] && [ -f "memory_config.yaml.template" ]; then
-    print_info "Creating memory_config.yaml from template..."
-    cp memory_config.yaml.template memory_config.yaml
-fi
+# memory_config.yaml no longer used; memory settings live in config.yml
 
 # Clean up any existing test containers and volumes for fresh start
 print_info "Cleaning up any existing test environment..."