Important: Review skipped. Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI.

Walkthrough

Consolidates configuration by removing `memory_config.yaml` and its loader, adding a centralized `config.yml` and a Pydantic model registry; updates controllers, services, init/setup scripts, Docker/compose, tests, and docs to read memory, LLM, embedding, STT, TTS, and vector-store settings via the new registry.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant UI as UI/Client
    participant Controller as system_controller
    participant Registry as model_registry
    participant Provider as Memory/LLM Provider
    participant Store as VectorStore/DB
    UI->>Controller: GET or UPDATE memory config
    Controller->>Registry: _find_config_path() / load_models_config()
    Registry-->>Controller: AppModels (includes memory section)
    Controller->>Controller: validate & write memory section to config.yml
    Controller->>Registry: load_models_config(force_reload=True)
    Registry-->>Controller: refreshed AppModels
    UI->>Controller: Trigger memory extraction job
    Controller->>Registry: get_models_registry() -> ModelDef(s)
    Registry-->>Controller: ModelDef(s) (LLM, embeddings, vector-store)
    Controller->>Provider: instantiate provider from ModelDef
    Provider->>Store: store/read vectors & extraction results
    Store-->>Provider: ack
    Provider-->>Controller: result
    Controller-->>UI: return status/result
```
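The reload step in the diagram implies a cached, force-reloadable registry. A minimal sketch of that pattern (the names `load_models_config`, `AppModels`, and `force_reload` come from the diagram; the body is illustrative, not the PR's implementation):

```python
from typing import Optional

class AppModels:
    """Stand-in for the PR's Pydantic AppModels; real fields are assumed."""
    def __init__(self, raw: dict):
        self.raw = raw

_cached: Optional[AppModels] = None

def load_models_config(force_reload: bool = False) -> AppModels:
    """Return the cached registry, re-reading config only when forced."""
    global _cached
    if _cached is None or force_reload:
        # The real code would parse config.yml here
        _cached = AppModels(raw={"memory": {"provider": "chronicle"}})
    return _cached

a = load_models_config()
b = load_models_config()                   # same cached object
c = load_models_config(force_reload=True)  # fresh object replaces the cache
print(a is b, a is c)  # True False
```

This is why the controller calls `load_models_config(force_reload=True)` after writing the memory section: without the flag, subsequent readers would keep seeing the stale cached config.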
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60–90 minutes
Areas requiring extra attention:
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (7)
backends/advanced/start-workers.sh (1)
54-74: Risk of duplicate Deepgram worker startup. Lines 54-62 start a Deepgram worker if `DEEPGRAM_API_KEY` is set, and lines 64-74 also start a Deepgram worker if `TRANSCRIPTION_PROVIDER` is not "parakeet" or "offline". This could result in two Deepgram workers running simultaneously if both conditions are met. Additionally, line 74 has trailing whitespace.
🔎 Apply this diff to fix the duplicate worker issue and remove trailing whitespace:

```diff
-    # Only start Deepgram worker if DEEPGRAM_API_KEY is set
-    if [ -n "$DEEPGRAM_API_KEY" ]; then
-        echo "🎵 Starting audio stream Deepgram worker (1 worker for sequential processing)..."
-        uv run python -m advanced_omi_backend.workers.audio_stream_deepgram_worker &
-        AUDIO_STREAM_DEEPGRAM_WORKER_PID=$!
-    else
-        echo "⏭️ Skipping Deepgram stream worker (DEEPGRAM_API_KEY not set)"
-        AUDIO_STREAM_DEEPGRAM_WORKER_PID=""
-    fi
-
     # Start the appropriate audio stream worker based on TRANSCRIPTION_PROVIDER
     TRANSCRIPTION_PROVIDER=${TRANSCRIPTION_PROVIDER:-deepgram}
     if [ "$TRANSCRIPTION_PROVIDER" = "parakeet" ] || [ "$TRANSCRIPTION_PROVIDER" = "offline" ]; then
         echo "🎵 Starting audio stream Parakeet worker (1 worker for sequential processing)..."
         uv run python -m advanced_omi_backend.workers.audio_stream_parakeet_worker &
         AUDIO_STREAM_WORKER_PID=$!
     else
         echo "🎵 Starting audio stream Deepgram worker (1 worker for sequential processing)..."
         uv run python -m advanced_omi_backend.workers.audio_stream_deepgram_worker &
         AUDIO_STREAM_WORKER_PID=$!
+    fi
-    
+
     echo "✅ All workers started:"
```

backends/advanced/Docs/memory-configuration-guide.md (1)
21-22: Incorrect model name: `gpt-5-mini` doesn't exist. OpenAI doesn't have a model called `gpt-5-mini`. Based on other documentation in this PR, this should likely be `gpt-4o-mini`. This typo appears multiple times in this file (lines 21, 47, 64, 109).
🔎 Apply this fix across the file
```diff
- llm_model: "gpt-5-mini"
+ llm_model: "gpt-4o-mini"
```

backends/advanced/Docs/quickstart.md (1)
559-561: Same model name typo: `gpt-5-mini` should be `gpt-4o-mini`. The comment references a non-existent model. This appears at multiple locations in the file (lines 559, 583-584).
Docs/getting-started.md (1)
561-561: Typo: `gpt-5-mini` should be `gpt-4o-mini`. Same issue as in other documentation files.
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
267-284: Remove redundant nested try-except and use `logging.exception()`. The outer try-except (lines 282-284) is unreachable because the inner block (lines 274-280) already catches all exceptions. Additionally, per project conventions, use `logging.exception()` instead of `logging.error()` to capture the full stack trace.
🔎 Apply this fix
```diff
 async def test_connection(self) -> bool:
     """Test OpenAI connection.

     Returns:
         True if connection successful, False otherwise
     """
     try:
-        try:
-            client = _get_openai_client(api_key=self.api_key, base_url=self.base_url, is_async=True)
-            await client.models.list()
-            return True
-        except Exception as e:
-            memory_logger.error(f"OpenAI connection test failed: {e}")
-            return False
-
-    except Exception as e:
-        memory_logger.error(f"OpenAI connection test failed: {e}")
+        client = _get_openai_client(api_key=self.api_key, base_url=self.base_url, is_async=True)
+        await client.models.list()
+        return True
+    except Exception:
+        memory_logger.exception("OpenAI connection test failed")
         return False
```

Based on learnings, prefer `logging.exception()` inside except blocks to automatically log the full stack trace.
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
297-315: Function has missing return path and incomplete fallback logic. The function only returns a value when the registry successfully provides embedding dimensions (line 309). If the registry is unavailable, or if `emb_def` is None or lacks `embedding_dimensions`, the function falls through without returning, implicitly returning `None`. This breaks the declared return type of `int`.
🔎 Apply this fix
```diff
 def get_embedding_dims(llm_config: Dict[str, Any]) -> int:
     """
     Query the embedding endpoint and return the embedding vector length.
     Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama).
     """
     embedding_model = llm_config.get('embedding_model')
     try:
         from advanced_omi_backend.model_registry import get_models_registry
         reg = get_models_registry()
         if reg:
             emb_def = reg.get_default("embedding")
             if emb_def and emb_def.embedding_dimensions:
                 return int(emb_def.embedding_dimensions)
-    except Exception:
-        memory_logger.error(
-            f"Failed to get embedding dimensions for model '{embedding_model}': {e}. "
-            "Falling back to default dimensions."
-        )
-        raise
+    except Exception as e:
+        memory_logger.exception(
+            f"Failed to get embedding dimensions for model '{embedding_model}'"
+        )
+        raise RuntimeError(
+            f"Failed to get embedding dimensions for '{embedding_model}'"
+        ) from e
+
+    # Fallback to common defaults
+    defaults = {
+        'text-embedding-3-small': 1536,
+        'text-embedding-3-large': 3072,
+        'text-embedding-ada-002': 1536,
+        'nomic-embed-text': 768,
+    }
+    return defaults.get(embedding_model, 1536)
```

backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
275-282: Update stale `__all__` exports. The `__all__` list references `DeepgramProvider`, `DeepgramStreamingProvider`, `DeepgramStreamConsumer`, `ParakeetProvider`, and `ParakeetStreamingProvider`, which are no longer defined in this file. This will cause `ImportError` for any code importing these names. Update to reflect the new registry-driven providers.
🔎 Apply this fix
```diff
 __all__ = [
     "get_transcription_provider",
-    "DeepgramProvider",
-    "DeepgramStreamingProvider",
-    "DeepgramStreamConsumer",
-    "ParakeetProvider",
-    "ParakeetStreamingProvider",
+    "RegistryBatchTranscriptionProvider",
+    "RegistryStreamingTranscriptionProvider",
+    "BaseTranscriptionProvider",
+    "BatchTranscriptionProvider",
+    "StreamingTranscriptionProvider",
 ]
```
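The failure mode is easy to reproduce: a stale name in `__all__` only surfaces when the star import resolves each exported name. A small self-contained illustration (using a synthetic module, not the real transcription package; in practice the failure surfaces as an `AttributeError` during `from ... import *`):

```python
import types

# Synthetic module with one real export and one stale __all__ entry
mod = types.ModuleType("fake_transcription")
mod.get_transcription_provider = lambda: None
mod.__all__ = ["get_transcription_provider", "DeepgramProvider"]  # stale entry

# `from fake_transcription import *` resolves each __all__ name via getattr
ns = {}
err = None
try:
    for name in mod.__all__:
        ns[name] = getattr(mod, name)
except AttributeError as e:
    err = e

print(err)  # the stale name fails to resolve
```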
🧹 Nitpick comments (13)
backends/advanced/init-https.sh (1)
74-74: Consider verifying config.yml existence. The script now displays an informational message about `config.yml` but doesn't verify that the file exists or is properly configured. This could lead to runtime issues if the configuration is missing.
🔎 Consider adding a verification step:
```diff
 echo ""
-echo "📄 Step 4: Memory configuration now lives in config.yml (memory section)"
+echo "📄 Step 4: Memory Configuration"
+if [ ! -f "config.yml" ]; then
+    print_warning "config.yml not found. Memory configuration may be incomplete."
+    echo "   Please ensure config.yml exists with proper memory section."
+else
+    print_info "Memory configuration managed in config.yml (memory section)"
+fi
```

backends/advanced/setup-https.sh (1)
137-137: Consider verifying config.yml existence. Similar to `init-https.sh`, this script now displays an informational message without verifying that `config.yml` exists. For consistency and robustness, consider adding a check to ensure the configuration file is present.
🔎 Consider adding a verification step:
```diff
 # Step 2: Memory configuration
 print_header "Step 2: Memory Configuration"
-print_info "Memory settings are managed in config.yml (memory section)."
+if [ ! -f "config.yml" ]; then
+    print_warning "config.yml not found. Memory configuration may be incomplete."
+    echo "   Please ensure config.yml exists with proper memory section."
+else
+    print_success "Memory configuration managed in config.yml (memory section)"
+fi
```

tests/run-robot-tests.sh (1)
104-104: Consider making the comment more actionable. The comment correctly notes that memory_config.yaml is deprecated, but could be more helpful by pointing users to the replacement location.
🔎 Suggested improvement:
```diff
-# memory_config.yaml no longer used; memory settings live in config.yml
+# memory_config.yaml no longer used; memory settings now configured in ../../config.yml (memory section)
+# See Docs/memory-configuration-guide.md for configuration details
```

backends/advanced/Dockerfile.k8s (1)
36-37: Update or remove outdated comment. The comment on line 36 references copying memory_config.yaml "created by init.sh from template", but this file is no longer copied or used. The comment should be updated to reflect the current configuration approach or removed entirely.
🔎 Apply this diff to remove the outdated comment:
```diff
-# Copy memory config (created by init.sh from template)
```

Or update it to reference the new approach:

```diff
-# Copy memory config (created by init.sh from template)
+# Memory configuration now sourced from mounted config.yml at runtime
```

backends/advanced/init.py (1)
458-464: Consider updating the function name for clarity. The `copy_config_templates()` function now only copies `diarization_config.json`, making the plural "templates" in the function name and docstring somewhat misleading. Consider renaming for clarity.
🔎 Optional refactor:
```diff
-    def copy_config_templates(self):
-        """Copy other configuration files"""
+    def copy_config_files(self):
+        """Copy configuration files from templates if they don't exist"""
         if not Path("diarization_config.json").exists() and Path("diarization_config.json.template").exists():
             shutil.copy2("diarization_config.json.template", "diarization_config.json")
             self.console.print("[green][SUCCESS][/green] diarization_config.json created")
```

Don't forget to update the call site on line 544.
backends/advanced/run-test.sh (1)
70-70: Deprecation note could be more actionable. The comment indicates deprecation but doesn't guide developers on what to do. Consider expanding to help users understand the migration path.
🔎 Suggested improvement
```diff
-# memory_config.yaml deprecated; using config.yml for memory settings
+# NOTE: memory_config.yaml is deprecated and no longer used.
+# Memory settings are now configured in the 'memory' section of config.yml.
+# See Docs/memory-configuration-guide.md for details.
```

backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (2)
146-171: Remove or use the `config` parameter. The `config` parameter is accepted but never used. Either remove it if the registry is now the sole configuration source, or document why it's retained for backward compatibility.
🔎 Option 1: Remove unused parameter
```diff
-    def __init__(self, config: Dict[str, Any]):
+    def __init__(self):
```

🔎 Option 2: Keep for interface compatibility (add docstring note)
```diff
     def __init__(self, config: Dict[str, Any]):
+        """Initialize OpenAI provider.
+
+        Args:
+            config: Deprecated, retained for interface compatibility.
+                Configuration is now sourced from the model registry.
+        """
+        _ = config  # Retained for interface compatibility
```
205-225: Update stale docstring. The docstring at lines 212-213 still references a `client` parameter that was removed from the method signature. Update the docstring to reflect the current implementation.
🔎 Apply this fix
```diff
     """Process a single text chunk to extract memories using OpenAI API.

     This private method handles the LLM interaction for a single chunk
     of text, sending it to OpenAI's chat completion API with the
     specified system prompt to extract structured memory facts.

     Args:
-        client: OpenAI async client instance for API communication
         system_prompt: System prompt that guides the memory extraction behavior
         chunk: Individual text chunk to process for memory extraction
         index: Index of the chunk for logging and error tracking purposes
```

backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
190-206: Log exception instead of silently catching. Lines 193-194 silently catch and ignore exceptions when loading the registry for Mycelia temporal extraction. Per project conventions, log the exception to aid debugging.
🔎 Apply this fix
```diff
 try:
     from advanced_omi_backend.model_registry import get_models_registry
     reg = get_models_registry()
-except Exception:
+except Exception as e:
+    memory_logger.warning(f"Failed to load registry for Mycelia: {e}")
     reg = None
```

Based on learnings, prefer logging exceptions to improve debuggability.
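The practical difference between `logging.error()` and `logging.exception()` inside an except block is concrete: only the latter attaches the traceback to the record. A quick standalone check (using a throwaway logger, unrelated to the repo's `memory_logger`):

```python
import io
import logging

def captured_log(use_exception: bool) -> str:
    """Raise an error, log it one of two ways, and return the captured log text."""
    stream = io.StringIO()
    logger = logging.getLogger(f"demo_{use_exception}")
    logger.handlers = [logging.StreamHandler(stream)]
    logger.propagate = False
    logger.setLevel(logging.DEBUG)
    try:
        raise ValueError("boom")
    except ValueError:
        if use_exception:
            logger.exception("registry load failed")  # message + traceback
        else:
            logger.error("registry load failed")      # message only
    return stream.getvalue()

print("Traceback" in captured_log(True), "Traceback" in captured_log(False))  # True False
```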
backends/advanced/src/advanced_omi_backend/model_registry.py (1)
277-281: Move `logging` import to top of file. The `logging` module is imported inside the except block at line 279. Per coding guidelines, imports should be at the top of the file. Since this module handles configuration loading and errors are expected during development, logging should be readily available.
🔎 Apply this fix
Add at top of file (around line 15):
```python
import logging
```

Then update the except block:
```diff
 except Exception as e:
     # Log but don't fail the entire registry load
-    import logging
     logging.warning(f"Failed to load model '{m.get('name', 'unknown')}': {e}")
     continue
```

As per coding guidelines, imports should be at the top of the file.
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (3)
101-122: Extract duplicate `_extract` helper to module level. The `_extract` function is duplicated verbatim in both `RegistryBatchTranscriptionProvider.transcribe` (lines 101-122) and `RegistryStreamingTranscriptionProvider.end_stream` (lines 228-249). Extract this to a module-level helper to eliminate duplication.
🔎 Add module-level helper
Add near the top of the file (after imports):
```python
def _extract_dotted_path(data: dict, dotted: str):
    """Extract value from nested dict using dotted path notation (e.g., 'results[0].text')."""
    if not dotted:
        return None
    cur = data
    for part in dotted.split('.'):
        if not part:
            continue
        if '[' in part and part.endswith(']'):
            name, idx_str = part[:-1].split('[', 1)
            if name:
                cur = cur.get(name, {}) if isinstance(cur, dict) else {}
            try:
                idx = int(idx_str)
            except ValueError:
                return None
            if isinstance(cur, list) and 0 <= idx < len(cur):
                cur = cur[idx]
            else:
                return None
        else:
            cur = cur.get(part, None) if isinstance(cur, dict) else None
        if cur is None:
            return None
    return cur
```

Then replace both inline `_extract` definitions with calls to `_extract_dotted_path`.
Also applies to: 228-249
166-170: Add debug logging for WebSocket errors. Lines 168-169 silently swallow exceptions when waiting for WebSocket confirmation. While non-fatal, logging at debug level would aid troubleshooting connection issues.
🔎 Apply this fix
```diff
 try:
     await asyncio.wait_for(ws.recv(), timeout=2.0)
-except Exception:
-    pass
+except asyncio.TimeoutError:
+    logger.debug(f"No confirmation received for client {client_id}, continuing")
+except Exception as e:
+    logger.debug(f"WebSocket confirmation error for {client_id}: {e}")
```
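For reference, the timeout path is easy to exercise in isolation: `asyncio.wait_for` raises `asyncio.TimeoutError` when the awaited receive never completes. Here a never-resolving future stands in for `ws.recv()`:

```python
import asyncio

async def wait_for_confirmation() -> str:
    # Stand-in for ws.recv(): a future that never resolves
    never = asyncio.get_running_loop().create_future()
    try:
        await asyncio.wait_for(never, timeout=0.05)
        return "confirmed"
    except asyncio.TimeoutError:
        return "timed out, continuing"

result = asyncio.run(wait_for_confirmation())
print(result)  # timed out, continuing
```

Catching `asyncio.TimeoutError` separately from a bare `Exception` lets the normal "no confirmation" case be logged distinctly from real protocol errors.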
260-272: Unused `provider_name` parameter. The `provider_name` parameter is accepted but never used since the registry determines the provider. Either remove it or document that it's deprecated and retained for backward compatibility.
🔎 Option: Add deprecation note
```diff
 def get_transcription_provider(provider_name: Optional[str] = None, mode: Optional[str] = None) -> Optional[BaseTranscriptionProvider]:
     """Return a registry-driven transcription provider.

+    Args:
+        provider_name: Deprecated, ignored. Provider is determined by config.yml.
+        mode: "batch" (default) for HTTP-based STT, "streaming" for WebSocket-based STT.
+
     - mode="batch": HTTP-based STT (default)
     - mode="streaming": WebSocket-based STT
     """
+    del provider_name  # Unused, retained for interface compatibility
```
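A fuller version of the deprecated-but-retained parameter pattern might also emit a `DeprecationWarning` so old callers get nudged. This sketch is illustrative only; `get_provider` is a hypothetical stand-in, not the repo's function:

```python
from typing import Optional
import warnings

def get_provider(provider_name: Optional[str] = None, mode: str = "batch") -> str:
    """Provider comes from config; provider_name is kept only for old callers."""
    if provider_name is not None:
        warnings.warn(
            "provider_name is deprecated and ignored; configure providers in config.yml",
            DeprecationWarning,
            stacklevel=2,
        )
    del provider_name  # retained for interface compatibility only
    return f"registry-{mode}-provider"

print(get_provider("deepgram", mode="streaming"))  # registry-streaming-provider
```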
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (31)
- Docs/getting-started.md (4 hunks)
- README-K8S.md (2 hunks)
- README.md (1 hunks)
- backends/advanced/Dockerfile (1 hunks)
- backends/advanced/Dockerfile.k8s (1 hunks)
- backends/advanced/Docs/README.md (7 hunks)
- backends/advanced/Docs/contribution.md (2 hunks)
- backends/advanced/Docs/memories.md (3 hunks)
- backends/advanced/Docs/memory-configuration-guide.md (2 hunks)
- backends/advanced/Docs/quickstart.md (4 hunks)
- backends/advanced/SETUP_SCRIPTS.md (2 hunks)
- backends/advanced/docker-compose.yml (2 hunks)
- backends/advanced/init-https.sh (2 hunks)
- backends/advanced/init.py (1 hunks)
- backends/advanced/memory_config.yaml.template (0 hunks)
- backends/advanced/run-test.sh (2 hunks)
- backends/advanced/setup-https.sh (3 hunks)
- backends/advanced/src/advanced_omi_backend/app_config.py (3 hunks)
- backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (1 hunks)
- backends/advanced/src/advanced_omi_backend/memory_config_loader.py (0 hunks)
- backends/advanced/src/advanced_omi_backend/model_registry.py (1 hunks)
- backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (4 hunks)
- backends/advanced/src/advanced_omi_backend/services/memory/config.py (4 hunks)
- backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (8 hunks)
- backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py (4 hunks)
- backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1 hunks)
- backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py (1 hunks)
- backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (1 hunks)
- backends/advanced/start-workers.sh (1 hunks)
- backends/advanced/tests/test_integration.py (2 hunks)
- tests/run-robot-tests.sh (1 hunks)
💤 Files with no reviewable changes (2)
- backends/advanced/src/advanced_omi_backend/memory_config_loader.py
- backends/advanced/memory_config.yaml.template
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for import sorting in Python code
ALL imports must be at the top of the file after the docstring. NEVER import modules in the middle of functions or files. Use lazy imports sparingly and only when absolutely necessary for circular import issues.
Group imports in order: standard library, third-party, local imports
Always raise errors, never silently ignore. Use explicit error handling with proper exceptions rather than silent failures.
Do not add defensive `hasattr()` checks. Research and understand input/response or class structure instead.
Use `uv run python` or `uv run python3` instead of direct `python` or `python3` commands for Python execution
Files:
- backends/advanced/tests/test_integration.py
- backends/advanced/init.py
- backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
- backends/advanced/src/advanced_omi_backend/app_config.py
- backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
- backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
- backends/advanced/src/advanced_omi_backend/services/memory/config.py
- backends/advanced/src/advanced_omi_backend/model_registry.py
- backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py
- backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py
- backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
- backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
**/Dockerfile
📄 CodeRabbit inference engine (CLAUDE.md)
Use `docker compose build` without `--no-cache` by default for faster builds. Only use `--no-cache` when explicitly needed (e.g., if cached layers are causing issues or when troubleshooting build problems).
Files:
backends/advanced/Dockerfile
**/{auth,permission,security,model,database,db}*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Scope all data by user_id with efficient permission checking for data isolation
Files:
backends/advanced/src/advanced_omi_backend/model_registry.py
**/{conversation,audio,memory,processing}*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Only create conversations when speech is detected, eliminating noise-only sessions. Store all audio sessions in the `audio_chunks` collection, conversations only in the `conversations` collection when speech is detected.
Files:
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
**/docker-compose*.{yml,yaml}
📄 CodeRabbit inference engine (CLAUDE.md)
Check if src/ is volume mounted. If not, run `docker compose build` so that code changes are reflected. Do not simply run `docker compose restart` as it will not rebuild the image.
Files:
backends/advanced/docker-compose.yml
🧠 Learnings (7)
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/{job,queue,pipeline,processor}*.py : Use a Job-based tracking system for all audio processing (WebSocket and file uploads) with stage events (audio → transcription → memory) and completion status
Applied to files:
backends/advanced/start-workers.sh
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/{memory,provider}*.py : Implement pluggable memory providers supporting both Chronicle native (individual facts) and OpenMemory MCP delegation
Applied to files:
- backends/advanced/Docs/README.md
- backends/advanced/src/advanced_omi_backend/services/memory/config.py
- backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
- backends/advanced/tests/test_integration.py
- backends/advanced/init.py
- backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
- backends/advanced/src/advanced_omi_backend/app_config.py
- backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
- backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
- backends/advanced/src/advanced_omi_backend/services/memory/config.py
- backends/advanced/src/advanced_omi_backend/model_registry.py
- backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py
- backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py
- backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
- backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/{conversation,audio,memory,processing}*.py : Only create conversations when speech is detected, eliminating noise-only sessions. Store all audio sessions in `audio_chunks` collection, conversations only in `conversations` collection when speech is detected.
Applied to files:
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/docker-compose*.{yml,yaml} : Check if src/ is volume mounted. If not, run `docker compose build` so that code changes are reflected. Do not simply run `docker compose restart` as it will not rebuild the image.
Applied to files:
backends/advanced/docker-compose.yml
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/*.robot : When writing or modifying Robot Framework tests, follow the testing guidelines in [tests/TESTING_GUIDELINES.md](tests/TESTING_GUIDELINES.md) and approved tags in [tests/tags.md](tests/tags.md). Only use the 11 approved tags, tab-separated.
Applied to files:
tests/run-robot-tests.sh
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/*.robot : Only create Robot Framework keywords for reusable setup/action operations AFTER confirming no existing keyword exists. Keep verifications inline in tests, not in resource keywords.
Applied to files:
tests/run-robot-tests.sh
🧬 Code graph analysis (10)
backends/advanced/run-test.sh (4)
backends/advanced/setup-https.sh (1)
print_success (19-21)
tests/run-robot-tests.sh (1)
print_success (20-22)
run-test.sh (1)
print_success (21-23)
extras/speaker-recognition/run-test.sh (1)
print_success (38-40)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (2)
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
_find_config_path (197-231)
load_models_config (234-289)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (3)
update_memory_config_raw (84-89)
validate_memory_config (93-98)
reload_memory_config (102-104)
backends/advanced/setup-https.sh (3)
backends/advanced/run-test.sh (2)
print_info (16-18)
print_success (20-22)
tests/run-robot-tests.sh (2)
print_info (16-18)
print_success (20-22)
run-test.sh (2)
print_info (17-19)
print_success (21-23)
backends/advanced/src/advanced_omi_backend/app_config.py (2)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
get_transcription_provider (260-272)
backends/advanced/src/advanced_omi_backend/model_registry.py (1)
get_models_registry(292-307)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (1)
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
get_models_registry (292-307)
get_default (149-171)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (2)
backends/advanced/src/advanced_omi_backend/services/transcription/base.py (13)
BaseTranscriptionProvider (42-81)
BatchTranscriptionProvider (118-134)
StreamingTranscriptionProvider (84-115)
name (65-67)
transcribe (46-61)
transcribe (126-134)
start_stream (92-100)
connect (75-77)
process_audio_chunk (103-110)
end_stream (113-115)
mode (71-73)
mode (88-89)
mode (122-123)
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
get_models_registry (292-307)
get_default (149-171)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (3)
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
get_models_registry (292-307)
get_default (149-171)
backends/advanced/src/advanced_omi_backend/models/user.py (1)
user_id (71-73)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
MemoryProvider(29-34)
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (4)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (2)
name (36-37)
name (147-148)
backends/advanced/src/advanced_omi_backend/services/transcription/base.py (1)
name (65-67)
backends/advanced/src/advanced_omi_backend/services/transcription/parakeet.py (2)
name (35-36)
name (106-107)
backends/advanced/src/advanced_omi_backend/services/transcription/deepgram.py (2)
name (31-32)
name (226-227)
backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py (3)
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
get_models_registry (292-307)
get_default (149-171)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
_get_openai_client (48-72)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
transcript(191-193)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
backends/advanced/src/advanced_omi_backend/model_registry.py (3)
get_models_registry (292-307)
ModelDef (43-111)
get_default (149-171)
🪛 LanguageTool
backends/advanced/Docs/README.md
[style] ~91-~91: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...yml` - Configuration overview ### "I want to work on memory extraction" 1. [memori...
(REP_WANT_TO_VB)
🪛 markdownlint-cli2 (0.18.1)
backends/advanced/Docs/README.md
74-74: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
README.md
7-7: Bare URL used
(MD034, no-bare-urls)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
307-307: Do not catch blind exception: Exception
(BLE001)
308-308: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
309-309: Use explicit conversion flag
Replace with conversion flag
(RUF010)
322-322: Use explicit conversion flag
Replace with conversion flag
(RUF010)
348-348: Do not catch blind exception: Exception
(BLE001)
349-349: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
350-350: Use explicit conversion flag
Replace with conversion flag
(RUF010)
360-360: Use explicit conversion flag
Replace with conversion flag
(RUF010)
365-365: Consider moving this statement to an else block
(TRY300)
366-366: Do not catch blind exception: Exception
(BLE001)
367-367: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
368-368: Use explicit conversion flag
Replace with conversion flag
(RUF010)
378-378: Do not catch blind exception: Exception
(BLE001)
379-379: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
380-380: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backends/advanced/src/advanced_omi_backend/app_config.py
70-70: Do not catch blind exception: Exception
(BLE001)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
28-28: Avoid specifying long messages outside the exception class
(TRY003)
31-31: Avoid specifying long messages outside the exception class
(TRY003)
112-112: Do not catch blind exception: Exception
(BLE001)
138-138: Avoid specifying long messages outside the exception class
(TRY003)
141-141: Avoid specifying long messages outside the exception class
(TRY003)
168-169: try-except-pass detected, consider logging the exception
(S110)
168-168: Do not catch blind exception: Exception
(BLE001)
219-220: try-except-pass detected, consider logging the exception
(S110)
219-219: Do not catch blind exception: Exception
(BLE001)
223-224: try-except-pass detected, consider logging the exception
(S110)
223-223: Do not catch blind exception: Exception
(BLE001)
239-239: Do not catch blind exception: Exception
(BLE001)
260-260: Unused function argument: provider_name
(ARG001)
270-270: Do not catch blind exception: Exception
(BLE001)
backends/advanced/src/advanced_omi_backend/services/memory/config.py
193-193: Do not catch blind exception: Exception
(BLE001)
226-226: Avoid specifying long messages outside the exception class
(TRY003)
230-230: Avoid specifying long messages outside the exception class
(TRY003)
252-252: Avoid specifying long messages outside the exception class
(TRY003)
255-255: Avoid specifying long messages outside the exception class
(TRY003)
311-314: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
312-312: Undefined name e
(F821)
backends/advanced/src/advanced_omi_backend/model_registry.py
223-224: try-except-pass detected, consider logging the exception
(S110)
223-223: Do not catch blind exception: Exception
(BLE001)
277-277: Do not catch blind exception: Exception
(BLE001)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
146-146: Unused method argument: config
(ARG002)
150-150: Avoid specifying long messages outside the exception class
(TRY003)
157-157: Avoid specifying long messages outside the exception class
(TRY003)
277-277: Consider moving this statement to an else block
(TRY300)
278-278: Do not catch blind exception: Exception
(BLE001)
279-279: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🔇 Additional comments (23)
backends/advanced/SETUP_SCRIPTS.md (1)
9-9: LGTM! Documentation accurately reflects the migration to centralized `config.yml` for memory configuration.
backends/advanced/Docs/contribution.md (1)
4-4: LGTM! Documentation correctly updated to reference the centralized `config.yml` with appropriate section paths.
Also applies to: 9-9
backends/advanced/Docs/README.md (1)
16-16: LGTM! Documentation comprehensively updated to reflect the centralized configuration approach with `config.yml` and `model_registry.py`. All references are consistent throughout the document.
Also applies to: 73-79, 89-94, 133-133, 150-152, 165-165, 178-178, 185-185, 191-191
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py (1)
430-432: The removal is appropriate; streaming validation during job initiation is sufficient. Speech is validated in `stream_speech_detection_job` using `analyze_speech()` before the conversation job is even enqueued. The `open_conversation_job` only runs after meaningful speech has already been detected, so the removed check was redundant. Additionally, the aggregator explicitly preserves word-level data across chunks, and `analyze_speech()` includes fallback logic for text-only analysis, so there is no risk of data loss during aggregation. The code comment about "aggregated results lacking proper word-level data" is misleading; the real issue was that the removed check duplicated validation that already occurred upstream.
Likely an incorrect or invalid review comment.
README.md (1)
7-7: LGTM! The updated quick start instruction accurately reflects the migration to config.yml-based configuration and is more concise.
Note: The static analysis hint about "bare URL" is a false positive; the URL is part of the instruction text, not an improperly formatted link.
backends/advanced/docker-compose.yml (2)
15-15: LGTM! The config.yml volume mounts are correctly configured as read-only and consistently applied to both the backend and workers services. This properly centralizes configuration in the repository-root config.yml file.
Also applies to: 72-72
78-78: Good consistency fix! Adding `PARAKEET_ASR_URL` to the workers service environment ensures both the backend and workers have consistent access to the Parakeet ASR service configuration.
backends/advanced/Dockerfile (1)
36-39: LGTM! The removal of `memory_config.yaml` copying and retention of `diarization_config.json` correctly reflects the migration to centralized config.yml while preserving other necessary configuration files. The wildcard pattern appropriately handles cases where the file might not exist.
backends/advanced/Docs/memories.md (2)
13-13: LGTM! The updated configuration reference correctly points users to `config.yml` (memory section) and adds helpful context about `model_registry.py`, making it easier for users to find where memory settings are configured.
183-183: LGTM! The configuration instruction correctly directs users to configure memory settings via `config.yml` (memory block), consistent with the centralized configuration approach.
backends/advanced/tests/test_integration.py (1)
486-486: LGTM! The comment clearly documents the deprecation of memory_config.yaml and explains that configuration is now provided via config.yml, which helps future test maintainers understand why the previous config file creation step was removed.
backends/advanced/Docs/memory-configuration-guide.md (1)
130-132: LGTM! The Next Steps section correctly guides users to `config.yml` for memory configuration, aligning with the centralized configuration approach.
backends/advanced/Docs/quickstart.md (1)
542-549: LGTM! The configuration file location and override mechanism via the `CONFIG_FILE` environment variable is clearly documented, with appropriate hot-reload and fallback behavior noted.
Docs/getting-started.md (1)
544-549: LGTM! The centralized configuration documentation is clear and consistent with the memory configuration guide. The `CONFIG_FILE` environment variable override is properly documented.
backends/advanced/src/advanced_omi_backend/app_config.py (1)
54-61: LGTM! The registry-based transcription provider initialization with appropriate logging for both success and warning cases is well-implemented.
backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py (2)
246-265: LGTM with minor observation. The registry-driven LLM configuration is well-implemented with appropriate fallbacks. The early returns with warnings when registry or LLM definition is unavailable prevent runtime errors.
Consider extracting the repeated registry-lookup pattern (lines 247-255 and 302-310) into a helper method if this pattern is used elsewhere, but it's acceptable as-is for two usages.
301-322: LGTM! Consistent with the pattern in `_extract_memories_via_llm`. The registry-driven approach properly handles missing configuration with graceful degradation.
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (3)
120-122: LGTM! The health endpoint now displays registry-driven LLM configuration, providing an accurate reflection of the centralized config.yml settings.
130-132: LGTM! Memory provider resolution from the registry is consistent with the centralized configuration approach.
42-52: LGTM with a consideration for startup order. The registry initialization at module level is appropriate since `get_models_registry()` caches the result. The defensive None handling for `_llm_def`, `_embed_def`, and `_vs_def` ensures graceful degradation if config.yml is unavailable at module import time. Fallback defaults for QDRANT configuration are explicit and reasonable.
backends/advanced/src/advanced_omi_backend/model_registry.py (3)
20-31: LGTM! The environment variable resolution with `${VAR:-default}` pattern support is well-implemented. The regex pattern correctly handles both cases with and without default values.
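As an illustration of the `${VAR:-default}` mechanism this comment refers to, a minimal resolver can be sketched as follows (a re-implementation for illustration only; the function and pattern names are assumptions, not the actual `_resolve_env` code):

```python
import os
import re

# Matches ${VAR} and ${VAR:-default}; group 2 captures the default, if present
ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def resolve_env(value):
    """Replace ${VAR} / ${VAR:-default} occurrences in strings with env values."""
    if not isinstance(value, str):
        return value  # non-strings (ints, bools, None) pass through unchanged

    def substitute(match):
        var, default = match.group(1), match.group(2)
        return os.environ.get(var, default if default is not None else "")

    return ENV_PATTERN.sub(substitute, value)
```

With `MY_HOST=example.com` set and `MY_PORT` unset, `resolve_env("http://${MY_HOST}:${MY_PORT:-8080}")` yields `http://example.com:8080`; an unset variable without a default resolves to the empty string.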
149-171: LGTM!The
get_defaultmethod has sensible fallback behavior: it first checks explicit defaults, then falls back to the first available model of that type. This provides resilience while maintaining predictable behavior.
250-256: Registry is thread-safe in this context; no action needed. The global `_REGISTRY` is accessed primarily during application startup (module imports and service initialization), which happens sequentially. While `force_reload` endpoints exist in system_controller.py, FastAPI's default single event-loop model processes concurrent requests sequentially, and multi-process deployments each have their own global state. Thread-safety is not a concern here.
Resolved review threads (collapsed):
- backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (outdated, 3 threads)
- backends/advanced/src/advanced_omi_backend/services/memory/config.py (outdated)
- backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
- backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (outdated)
- backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (outdated)
Actionable comments posted: 1
🧹 Nitpick comments (2)
backends/advanced/src/advanced_omi_backend/model_registry.py (1)
264-270: Add logging for path resolution failures. The bare `Exception` catch silently suppresses all errors during path resolution, making it difficult to diagnose configuration issues.
🔎 Apply this diff to add logging.
At the top of the file, ensure logging is imported:

```diff
 import os
 import re
 import yaml
+import logging
 from pathlib import Path
 from typing import Any, Dict, List, Optional
```

Then, in the exception handler:

```diff
     # Also walk up from current file's parents defensively
     try:
         for parent in Path(__file__).resolve().parents:
             c = parent / "config.yml"
             if c.exists():
                 return c
-    except Exception:
-        pass
+    except Exception as e:
+        logging.debug(f"Could not walk parent directories for config.yml: {e}")
```

Based on learnings, when re-raising exceptions, always chain with 'raise ... from e' to preserve context; here we're logging and continuing, which is appropriate.
config.yml (1)
10-30: Consider documenting hardcoded values for development. The emberfang models use hardcoded local IP addresses (`192.168.1.166:8084`) and dummy API keys ("1234"). While this is acceptable for development and example configurations, consider adding a comment noting these should be customized for each deployment.
💡 Example documentation addition:

```diff
 models:
-  # llama cpp llm, name can be anything
+  # Example: llama.cpp LLM on local network
+  # Replace IP address and API key with your own deployment
   - name: emberfang-llm
     description: Emberfang One LLM
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- backends/advanced/src/advanced_omi_backend/model_registry.py (1 hunks)
- config.yml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for import sorting in Python code
ALL imports must be at the top of the file after the docstring. NEVER import modules in the middle of functions or files. Use lazy imports sparingly and only when absolutely necessary for circular import issues.
Group imports in order: standard library, third-party, local imports
Always raise errors, never silently ignore. Use explicit error handling with proper exceptions rather than silent failures.
Do not add defensive `hasattr()` checks. Research and understand input/response or class structure instead.
Use `uv run python` or `uv run python3` instead of direct `python` or `python3` commands for Python execution
Files:
backends/advanced/src/advanced_omi_backend/model_registry.py
**/{auth,permission,security,model,database,db}*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Scope all data by user_id with efficient permission checking for data isolation
Files:
backends/advanced/src/advanced_omi_backend/model_registry.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/model_registry.py
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/model_registry.py
269-270: try-except-pass detected, consider logging the exception
(S110)
269-269: Do not catch blind exception: Exception
(BLE001)
323-323: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (9)
backends/advanced/src/advanced_omi_backend/model_registry.py (6)
1-17: LGTM: Clean module setup with proper imports. The module docstring is clear, and imports are properly organized (standard library, then third-party). Using Pydantic for validation provides robust type safety.
19-54: LGTM: Robust environment variable resolution. The regex pattern correctly handles both `${VAR}` and `${VAR:-default}` forms, and the function appropriately leaves non-string values unchanged. Documentation with examples is excellent.
57-86: LGTM: Correct recursive environment resolution. The function properly handles all YAML data structures (dicts, lists, scalars) and applies `_resolve_env` at the appropriate level.
89-157: LGTM: Well-structured Pydantic model with sensible validation. The field validators handle edge cases appropriately (URL trimming, API key sanitization), and the model validator provides sensible defaults for common embedding models.
160-236: LGTM: Clean registry API with sensible fallback logic. The public methods provide intuitive access patterns, and the fallback in `get_default` (returning the first model of a type if no explicit default exists) is a reasonable design choice.
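The fallback behavior praised here can be illustrated with a simplified lookup (hypothetical dict-based data, not the real ModelDef objects):

```python
def get_default(defaults: dict, models: list, model_type: str):
    """Return the model named in defaults[model_type]; otherwise fall back to
    the first available model of that type, or None if none exists."""
    name = defaults.get(model_type)
    if name:
        for model in models:
            if model["name"] == name:
                return model
    # Fallback: first model of the requested type
    for model in models:
        if model["type"] == model_type:
            return model
    return None
```

The two-step lookup keeps behavior predictable: an explicit default always wins, and a missing default degrades to "first configured model of that type" rather than an error.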
338-353: LGTM: Clean public API for registry access. The function provides a simple, cached interface to the global registry and includes helpful usage examples in the docstring.
config.yml (3)
1-6: LGTM: Clear default model selections. The defaults section provides a clean way to specify which models should be used by default for each service type.
210-225: LGTM: Well-organized memory configuration. The memory section cleanly separates provider selection, timeout settings, extraction configuration, and provider-specific connection details. The structure supports multiple memory backends without coupling.
80-80: No action required; type coercion for embedding_dimensions works correctly. The `${OPENAI_EMBEDDING_DIMS:-1536}` environment variable pattern resolves to the string "1536" via the `_resolve_env` function. Pydantic then receives this string and automatically coerces it to the integer type expected by `ModelDef.embedding_dimensions` because the model uses the default lax validation mode (strict mode is not configured). This is standard Pydantic behavior and functions correctly.
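The coercion described above can be reproduced directly with Pydantic (assuming Pydantic is installed, as the project already depends on it; the class here is a reduced stand-in for the real ModelDef):

```python
from typing import Optional

from pydantic import BaseModel


class ModelDefSketch(BaseModel):
    """Reduced stand-in: lax (default) validation coerces numeric strings to int."""

    embedding_dimensions: Optional[int] = None


# This is what happens after ${OPENAI_EMBEDDING_DIMS:-1536} resolves to the string "1536"
model = ModelDefSketch(embedding_dimensions="1536")
```

In strict mode the same input would raise a validation error, which is why the comment notes that strict mode is not configured.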
I like this much better than a massive raw .env file. One thing: there isn't a need for configuring OpenAI, for example; it being in the template is just noisy. Can we not cascade this?
All env will change later. This fixes the issue.
Not clear, this PR is also changing the env vars |
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backends/advanced/start-workers.sh (1)
54-74: Duplicate Deepgram worker will be started. When `TRANSCRIPTION_PROVIDER` is set to `deepgram` (or defaults to it) AND `DEEPGRAM_API_KEY` is set, two Deepgram audio stream workers will be launched:

- Lines 56-58: Starts worker because `DEEPGRAM_API_KEY` is set
- Lines 71-73: Starts another worker because `TRANSCRIPTION_PROVIDER` is not `parakeet`/`offline`

This results in duplicate workers competing for the same audio stream, which can cause race conditions or duplicated transcriptions.
🔎 Suggested fix - remove the redundant first block:

```diff
-# Only start Deepgram worker if DEEPGRAM_API_KEY is set
-if [ -n "$DEEPGRAM_API_KEY" ]; then
-    echo "🎵 Starting audio stream Deepgram worker (1 worker for sequential processing)..."
-    uv run python -m advanced_omi_backend.workers.audio_stream_deepgram_worker &
-    AUDIO_STREAM_DEEPGRAM_WORKER_PID=$!
-else
-    echo "⏭️ Skipping Deepgram stream worker (DEEPGRAM_API_KEY not set)"
-    AUDIO_STREAM_DEEPGRAM_WORKER_PID=""
-fi
-
 # Start the appropriate audio stream worker based on TRANSCRIPTION_PROVIDER
 TRANSCRIPTION_PROVIDER=${TRANSCRIPTION_PROVIDER:-deepgram}
 if [ "$TRANSCRIPTION_PROVIDER" = "parakeet" ] || [ "$TRANSCRIPTION_PROVIDER" = "offline" ]; then
     echo "🎵 Starting audio stream Parakeet worker (1 worker for sequential processing)..."
     uv run python -m advanced_omi_backend.workers.audio_stream_parakeet_worker &
     AUDIO_STREAM_PARAKEET_WORKER_PID=$!
-else
+elif [ -n "$DEEPGRAM_API_KEY" ]; then
     echo "🎵 Starting audio stream Deepgram worker (1 worker for sequential processing)..."
     uv run python -m advanced_omi_backend.workers.audio_stream_deepgram_worker &
-    AUDIO_STREAM_WORKER_PID=$!
+    AUDIO_STREAM_DEEPGRAM_WORKER_PID=$!
+else
+    echo "⏭️ Skipping Deepgram stream worker (DEEPGRAM_API_KEY not set)"
+    AUDIO_STREAM_DEEPGRAM_WORKER_PID=""
 fi
```

Also note: the new branch uses `AUDIO_STREAM_WORKER_PID` (line 73) but the shutdown/cleanup logic references `AUDIO_STREAM_DEEPGRAM_WORKER_PID` (lines 126, 152, 186), so this worker won't be properly terminated on shutdown.
🧹 Nitpick comments (5)
backends/advanced/src/advanced_omi_backend/llm_client.py (1)
146-166: LGTM with minor suggestion. The registry-based client creation is clean and follows the centralized configuration approach. The explicit `ValueError` when no default LLM is configured is appropriate.
Consider distinguishing between "config.yml not found" and "no default LLM defined" for easier troubleshooting:
🔎 Optional: More specific error messages

```diff
 if registry:
     llm_def = registry.get_default("llm")
     if llm_def:
         logger.info(f"Creating LLM client from registry: {llm_def.name} ({llm_def.model_provider})")
         params = llm_def.model_params or {}
         return OpenAILLMClient(
             api_key=llm_def.api_key,
             base_url=llm_def.model_url,
             model=llm_def.model_name,
             temperature=params.get("temperature", 0.1),
         )
+    raise ValueError("No default LLM model defined in config.yml (set defaults.llm)")
-raise ValueError("No default LLM defined in config.yml")
+raise ValueError("config.yml not found or failed to load")
```

config.yml (2)
115-115: Update placeholder URL to use localhost or environment variable. Line 115 uses `http://172.17.0.1:8767`, which is a Docker bridge network address. For better portability, consider using `host.docker.internal` or an environment variable.
🔎 Consider this change:

```diff
-    model_url: http://172.17.0.1:8767
+    model_url: ${PARAKEET_ASR_URL:-http://host.docker.internal:8767}
```
216-217: Clarify extraction prompt guidance. The comment on line 215 states "Optional custom prompt; if omitted, a built-in default is used," but a prompt is provided on lines 216-217. This may confuse users about whether they need to customize this prompt or whether they can remove it to use the built-in default.
Consider adding a clearer comment:

```diff
-  # Optional custom prompt; if omitted, a built-in default is used
+  # Custom extraction prompt. To use the built-in default, delete the 'prompt' key entirely.
   prompt: |
     Extract important information from this conversation and return a JSON object with an array
     named "facts". Include personal preferences, plans, names, dates, locations, numbers, and
     key details. Keep items concise and useful.
```

backends/advanced/init.py (1)
242-246: Enhance Ollama guidance with configuration example. The guidance messages are clear, but users might benefit from seeing an example of what the config.yml entry should look like, especially for the model_url field, which needs to match their local Ollama installation.
Consider adding:

```diff
 self.console.print("[blue][INFO][/blue] Configure Ollama in config.yml:")
 self.console.print("  1. Set 'defaults.llm: local-llm'")
 self.console.print("  2. Edit the 'local-llm' model definition with your Ollama URL and model")
+self.console.print("  3. Example: model_url: http://localhost:11434/v1")
 self.console.print("[green][SUCCESS][/green] See config.yml for Ollama configuration")
```

config.env.template (1)
167-172: Document the purpose and impact of CROPPING_CONTEXT_PADDING. The new variable CROPPING_CONTEXT_PADDING is introduced with a value of 0.1, but there's no inline documentation explaining what this padding value represents (seconds? percentage? samples?) or how it affects audio processing behavior.
Add a brief comment:

```diff
+# Audio cropping context padding (seconds added before/after speech segments)
 CROPPING_CONTEXT_PADDING = 0.1
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- backends/advanced/docker-compose-test.yml (2 hunks)
- backends/advanced/docker-compose.yml (3 hunks)
- backends/advanced/init.py (4 hunks)
- backends/advanced/src/advanced_omi_backend/chat_service.py (2 hunks)
- backends/advanced/src/advanced_omi_backend/llm_client.py (3 hunks)
- backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (5 hunks)
- backends/advanced/start-workers.sh (1 hunks)
- backends/advanced/tests/test_integration.py (3 hunks)
- config.env.template (4 hunks)
- config.yml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backends/advanced/tests/test_integration.py
🧰 Additional context used
📓 Path-based instructions (2)
**/docker-compose*.{yml,yaml}
📄 CodeRabbit inference engine (CLAUDE.md)
Check if src/ is volume mounted. If not, run `docker compose build` so that code changes are reflected. Do not simply run `docker compose restart` as it will not rebuild the image.
Files:
- backends/advanced/docker-compose-test.yml
- backends/advanced/docker-compose.yml
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for import sorting in Python code
ALL imports must be at the top of the file after the docstring. NEVER import modules in the middle of functions or files. Use lazy imports sparingly and only when absolutely necessary for circular import issues.
Group imports in order: standard library, third-party, local imports
Always raise errors, never silently ignore. Use explicit error handling with proper exceptions rather than silent failures.
Do not add defensive `hasattr()` checks. Research and understand input/response or class structure instead.
Use `uv run python` or `uv run python3` instead of direct `python` or `python3` commands for Python execution
Files:
- backends/advanced/src/advanced_omi_backend/chat_service.py
- backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
- backends/advanced/src/advanced_omi_backend/llm_client.py
- backends/advanced/init.py
🧠 Learnings (3)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
- backends/advanced/src/advanced_omi_backend/chat_service.py
- backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
- backends/advanced/src/advanced_omi_backend/llm_client.py
- backends/advanced/init.py
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/{conversation,audio,memory,processing}*.py : Only create conversations when speech is detected, eliminating noise-only sessions. Store all audio sessions in `audio_chunks` collection, conversations only in `conversations` collection when speech is detected.
Applied to files:
config.env.template
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/{job,queue,pipeline,processor}*.py : Use a Job-based tracking system for all audio processing (WebSocket and file uploads) with stage events (audio → transcription → memory) and completion status
Applied to files:
backends/advanced/start-workers.sh
🧬 Code graph analysis (3)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (1)
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
get_models_registry (338-353), get_default (195-217)
backends/advanced/src/advanced_omi_backend/llm_client.py (1)
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
get_models_registry (338-353), get_default (195-217)
backends/advanced/init.py (1)
extras/speaker-recognition/init.py (2)
print_section (42-46), prompt_choice (89-104)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
235-235: Do not catch blind exception: Exception
(BLE001)
237-237: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backends/advanced/src/advanced_omi_backend/llm_client.py
61-61: Avoid specifying long messages outside the exception class
(TRY003)
166-166: Avoid specifying long messages outside the exception class
(TRY003)
backends/advanced/init.py
460-460: f-string without any placeholders
Remove extraneous f prefix
(F541)
🔇 Additional comments (14)
backends/advanced/docker-compose-test.yml (1)
27-27: LGTM! The
`GROQ_API_KEY` environment variable additions follow the existing pattern for API key passthrough and are consistently applied to both `chronicle-backend-test` and `workers-test` services.
Also applies to: 140-140
backends/advanced/src/advanced_omi_backend/chat_service.py (2)
30-32: LGTM! The new constants
`MAX_MEMORY_CONTEXT` and `MAX_CONVERSATION_HISTORY` provide clear, documented defaults for context management.
382-384: Verify temperature is configured in config.yml. The temperature parameter was removed from the
`generate()` call, which means the LLM client will now use its default temperature from the model registry configuration. Ensure your config.yml includes the appropriate `temperature` in `model_params` for the default LLM; otherwise it will fall back to 0.1.
backends/advanced/src/advanced_omi_backend/llm_client.py (2)
14-15: LGTM! Import follows coding guidelines - placed at the top of the file with local imports grouped after standard library and third-party imports.
59-61: LGTM! The enhanced error message with per-field status is helpful for debugging configuration issues. While static analysis flags TRY003, the diagnostic clarity outweighs the verbosity concern for this initialization-time error.
backends/advanced/docker-compose.yml (2)
backends/advanced/docker-compose.yml (2)
15-15: LGTM!The
config.ymlmount to/app/config.yml:rocorrectly enables the centralized configuration approach. Bothchronicle-backendandworkersservices have consistent mounts.Also applies to: 68-68
225-228: Verify network creation is documented.Setting
external: trueonchronicle-networkmeans the network must be created manually before runningdocker-compose up:docker network create chronicle-networkEnsure this is documented in the setup instructions or handled by an initialization script, otherwise first-time users will encounter:
Network chronicle-network declared as external, but could not be foundbackends/advanced/init.py (1)
234-236: LGTM: API key collection preserved while delegating model configuration to config.yml.The script correctly maintains API key collection in the environment while guiding users to configure model selection in config.yml. This separation of concerns (secrets in .env, configuration in config.yml) aligns well with the PR objectives.
config.env.template (1)
58-59: LGTM: Clear guidance on configuration migration.The updated comments clearly indicate that LLM configuration has moved to config.yml while API keys remain in the environment file. This maintains good separation of secrets and configuration.
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (4)
42-52: Registry initialization and fallback handling looks correct. The registry-driven configuration properly initializes with None fallbacks when the registry or specific defaults are unavailable. The approach of extracting defaults at module level is appropriate for read-only configuration.
235-242: Catch-all exception handling is acceptable in health check context. The static analysis tool flags catching bare
`Exception` on line 235, but this is appropriate for a health check endpoint where we want to report all failures gracefully without crashing. The exception is logged and reported in the health status. The broad exception handling is intentional and correct for health monitoring. As per learnings, the exception is logged with proper context.
51-52: No type safety issue here. Lines 51-52 safely access `model_params` because `ModelDef.model_params` is always initialized as a dict via `default_factory=dict` and can never be None. The code correctly guards the access with a check for `_vs_def` itself, which is the only object that could be None.
130-132: No changes needed. The code at lines 131-132 is safe from `AttributeError`. The `memory` attribute on `AppModels` is defined with `Field(default_factory=dict, ...)`, ensuring it always exists on any instance. The guard `if REGISTRY else {}` already prevents accessing attributes on `None`, so when `REGISTRY` is truthy, accessing `REGISTRY.memory` is guaranteed to succeed.
config.yml (1)
142-142: No action needed - environment variables in nested headers are correctly resolved.The
_deep_resolve_envfunction in model_registry.py is designed to recursively traverse nested dictionaries and lists, applying environment variable resolution to every string value it encounters. Line 142'sAuthorization: Token ${DEEPGRAM_API_KEY:-}will be properly resolved regardless of its nesting depth within the operations structure. The implementation handles this scenario correctly.Likely an incorrect or invalid review comment.
Actionable comments posted: 0
♻️ Duplicate comments (7)
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (1)
352-352: LGTM! Defensive checks removed per coding guidelines. The change correctly removes the defensive `getattr()` fallback and uses `provider.name` directly, consistent with line 179 and the provider interface contract.
backends/advanced/src/advanced_omi_backend/app_config.py (2)
18-18: LGTM! Import correctly moved to top of file. The import of `get_models_registry` is now at the module level, addressing the previous review feedback.
67-70: LGTM! Registry-based configuration without unnecessary exception handling. The memory provider resolution now uses the registry directly without a try/except wrapper, as `get_models_registry()` already returns `None` when config.yml is not found. This addresses the previous review feedback.
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
9-9: LGTM! Import correctly placed at top of file. The import has been moved to the module level per coding guidelines.
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (1)
11-11: LGTM! Imports correctly moved to module level. The `yaml` import and model_registry imports are now at the top of the file per coding guidelines.
Also applies to: 18-18
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
17-18: LGTM! All imports at module level per coding guidelines. The `logging` import and other dependencies are properly placed at the top of the file.
319-327: LGTM! Specific exception handling with graceful degradation. The code correctly catches `ValidationError` specifically and logs individual model failures while continuing to load other models. This provides robust graceful degradation.
🧹 Nitpick comments (2)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
299-303: Use bare `raise` to preserve the original traceback. Line 303 should use bare `raise` instead of `raise e` to preserve the full exception traceback chain.
🔎 Apply this diff:

```diff
 except Exception as e:
     memory_logger.exception(
         f"Failed to get embedding dimensions from registry for model '{embedding_model}'"
     )
-    raise e
+    raise
```
42-42: Replace `raise e` with bare `raise` throughout the file. Multiple exception handlers use `raise e`, which breaks the traceback chain. Use bare `raise` instead to preserve the full exception context.
Affected lines: 42, 75, 126, 139, 175, 208, 251, 277, 315, 332, 343, 367, 391, 454.
🔎 Example fix pattern:

```diff
 except Exception as e:
     logger.exception("Error message")
-    raise e
+    raise
```
Also applies to: 75-75, 126-126, 139-139, 175-175, 208-208, 251-251, 277-277, 315-315, 332-332, 343-343, 367-367, 391-391, 454-454
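The effect of re-raising with a bare `raise` can be seen in a tiny standalone sketch (function names here are made up for illustration):

```python
import traceback

def inner():
    raise ValueError("boom")

def rethrow_bare():
    try:
        inner()
    except ValueError:
        raise  # re-raises the active exception; traceback keeps the inner() frame

try:
    rethrow_bare()
except ValueError:
    tb = traceback.format_exc()

# The formatted traceback still names the original raise site and message.
```

The bare `raise` avoids adding a redundant re-raise frame at the `raise e` line, so the traceback points cleanly at where the error actually occurred.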
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
backends/advanced/src/advanced_omi_backend/app_config.py (4 hunks)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (14 hunks)
backends/advanced/src/advanced_omi_backend/model_registry.py (1 hunks)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (5 hunks)
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for import sorting in Python code
ALL imports must be at the top of the file after the docstring. NEVER import modules in the middle of functions or files. Use lazy imports sparingly and only when absolutely necessary for circular import issues.
Group imports in order: standard library, third-party, local imports
Always raise errors, never silently ignore. Use explicit error handling with proper exceptions rather than silent failures.
Do not add defensive `hasattr()` checks. Research and understand input/response or class structure instead.
Use `uv run python` or `uv run python3` instead of direct `python` or `python3` commands for Python execution
Files:
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py
backends/advanced/src/advanced_omi_backend/model_registry.py
backends/advanced/src/advanced_omi_backend/services/memory/config.py
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
backends/advanced/src/advanced_omi_backend/app_config.py
**/{auth,permission,security,model,database,db}*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Scope all data by user_id with efficient permission checking for data isolation
Files:
backends/advanced/src/advanced_omi_backend/model_registry.py
🧠 Learnings (4)
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/*.py : Do not add defensive `hasattr()` checks. Research and understand input/response or class structure instead.
Applied to files:
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py
backends/advanced/src/advanced_omi_backend/model_registry.py
backends/advanced/src/advanced_omi_backend/services/memory/config.py
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
backends/advanced/src/advanced_omi_backend/app_config.py
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/*.py : ALL imports must be at the top of the file after the docstring. NEVER import modules in the middle of functions or files. Use lazy imports sparingly and only when absolutely necessary for circular import issues.
Applied to files:
backends/advanced/src/advanced_omi_backend/model_registry.py
backends/advanced/src/advanced_omi_backend/app_config.py
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/{memory,provider}*.py : Implement pluggable memory providers supporting both Chronicle native (individual facts) and OpenMemory MCP delegation
Applied to files:
backends/advanced/src/advanced_omi_backend/services/memory/config.py
🧬 Code graph analysis (3)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
get_models_registry (338-353)
get_default (196-218)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
MemoryProvider(29-34)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (3)
backends/advanced/src/advanced_omi_backend/config.py (2)
load_diarization_settings_from_file (72-107)
save_diarization_settings_to_file (110-131)
backends/advanced/src/advanced_omi_backend/model_registry.py (2)
_find_config_path (244-278)
load_models_config (281-335)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (2)
get_memory_config_raw (78-80)
update_memory_config_raw (84-89)
backends/advanced/src/advanced_omi_backend/app_config.py (2)
backends/advanced/src/advanced_omi_backend/model_registry.py (1)
get_models_registry (338-353)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
get_transcription_provider(260-272)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/model_registry.py
270-271: try-except-pass detected, consider logging the exception
(S110)
270-270: Do not catch blind exception: Exception
(BLE001)
backends/advanced/src/advanced_omi_backend/services/memory/config.py
220-220: Avoid specifying long messages outside the exception class
(TRY003)
224-224: Avoid specifying long messages outside the exception class
(TRY003)
245-245: Avoid specifying long messages outside the exception class
(TRY003)
303-303: Use raise without specifying exception name
Remove exception name
(TRY201)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
42-42: Use raise without specifying exception name
Remove exception name
(TRY201)
75-75: Use raise without specifying exception name
Remove exception name
(TRY201)
89-89: Abstract raise to an inner function
(TRY301)
89-89: Avoid specifying long messages outside the exception class
(TRY003)
94-94: Abstract raise to an inner function
(TRY301)
94-94: Avoid specifying long messages outside the exception class
(TRY003)
97-97: Abstract raise to an inner function
(TRY301)
97-97: Avoid specifying long messages outside the exception class
(TRY003)
100-100: Abstract raise to an inner function
(TRY301)
100-100: Avoid specifying long messages outside the exception class
(TRY003)
126-126: Use raise without specifying exception name
Remove exception name
(TRY201)
139-139: Use raise without specifying exception name
Remove exception name
(TRY201)
148-148: Prefer TypeError exception for invalid type
(TRY004)
148-148: Abstract raise to an inner function
(TRY301)
148-148: Avoid specifying long messages outside the exception class
(TRY003)
153-153: Abstract raise to an inner function
(TRY301)
153-153: Avoid specifying long messages outside the exception class
(TRY003)
175-175: Use raise without specifying exception name
Remove exception name
(TRY201)
208-208: Use raise without specifying exception name
Remove exception name
(TRY201)
251-251: Use raise without specifying exception name
Remove exception name
(TRY201)
262-262: Abstract raise to an inner function
(TRY301)
262-262: Avoid specifying long messages outside the exception class
(TRY003)
277-277: Use raise without specifying exception name
Remove exception name
(TRY201)
287-287: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
287-287: Avoid specifying long messages outside the exception class
(TRY003)
287-287: Use explicit conversion flag
Replace with conversion flag
(RUF010)
291-291: Abstract raise to an inner function
(TRY301)
291-291: Avoid specifying long messages outside the exception class
(TRY003)
315-315: Use raise without specifying exception name
Remove exception name
(TRY201)
324-324: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
324-324: Avoid specifying long messages outside the exception class
(TRY003)
324-324: Use explicit conversion flag
Replace with conversion flag
(RUF010)
326-326: Prefer TypeError exception for invalid type
(TRY004)
326-326: Abstract raise to an inner function
(TRY301)
326-326: Avoid specifying long messages outside the exception class
(TRY003)
329-329: Consider moving this statement to an else block
(TRY300)
332-332: Use raise without specifying exception name
Remove exception name
(TRY201)
343-343: Use raise without specifying exception name
Remove exception name
(TRY201)
367-367: Use raise without specifying exception name
Remove exception name
(TRY201)
391-391: Use raise without specifying exception name
Remove exception name
(TRY201)
402-402: Abstract raise to an inner function
(TRY301)
402-402: Avoid specifying long messages outside the exception class
(TRY003)
408-408: Abstract raise to an inner function
(TRY301)
408-408: Avoid specifying long messages outside the exception class
(TRY003)
454-454: Use raise without specifying exception name
Remove exception name
(TRY201)
🔇 Additional comments (1)
backends/advanced/src/advanced_omi_backend/model_registry.py (1)
265-271: Defensive exception handling is appropriate here. The bare `except Exception: pass` is intentional for the defensive directory-walking fallback (as noted in the comment "defensively"). Any filesystem error during parent traversal should be suppressed to allow fallback to the candidate list.
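A minimal sketch of such a parent-directory search (the function name is illustrative, not the actual `_find_config_path`), where a filesystem error on one level skips that candidate rather than aborting the whole walk:

```python
import tempfile
from pathlib import Path

def find_upwards(start: Path, name: str = "config.yml"):
    """Walk `start` and its parents, returning the first `name` file found."""
    for parent in [start, *start.parents]:
        candidate = parent / name
        try:
            if candidate.is_file():
                return candidate
        except OSError:
            # Defensive: a permission error at one level should not stop the walk.
            continue
    return None

# Demo: config.yml at the repo root, search started from a nested directory.
root = Path(tempfile.mkdtemp())
(root / "config.yml").write_text("defaults: {}\n")
nested = root / "a" / "b"
nested.mkdir(parents=True)
found = find_upwards(nested)
```

The tradeoff the review points at: swallowing errors keeps the fallback path alive, at the cost of hiding the reason a candidate was skipped unless it is logged.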
Actionable comments posted: 1
♻️ Duplicate comments (1)
backends/advanced/init.py (1)
460-460: Fix f-string without placeholders. This line uses an f-string prefix but contains no placeholders. Remove the unnecessary `f` prefix.
🔎 Apply this diff:

```diff
-        self.console.print(f"✅ LLM: Configured in config.yml (defaults.llm)")
+        self.console.print("✅ LLM: Configured in config.yml (defaults.llm)")
```
As per coding guidelines and static analysis hint.
🧹 Nitpick comments (1)
backends/advanced/init.py (1)
204-250: Consider validating config.yml existence and providing clearer guidance. The function now guides users to manually edit config.yml for LLM configuration, but doesn't verify that:
- The config.yml file exists at the expected location
- The file has the correct structure (e.g., the `defaults.llm` field exists)
- Referenced model definitions are present
This could lead to user confusion if they complete the setup wizard but the services fail to start due to missing or malformed config.yml.
Consider adding a validation step or clearer documentation:
🔎 Suggested improvements
Option 1: Add config.yml validation (recommended)
```python
def validate_config_yml(self):
    """Verify config.yml exists and has required structure"""
    config_path = Path("../../config.yml")
    if not config_path.exists():
        self.console.print("[red][ERROR][/red] config.yml not found at repository root")
        self.console.print("[yellow][INFO][/yellow] Please ensure config.yml exists before continuing")
        return False
    # Optional: Add YAML structure validation here
    return True
```
Then call it in `setup_llm`:

```python
def setup_llm(self):
    """Configure LLM provider - shows guidance for config.yml"""
    self.print_section("LLM Provider Configuration")

    # Validate config.yml exists
    if not self.validate_config_yml():
        self.console.print("[yellow][WARNING][/yellow] config.yml validation failed")

    self.console.print("[blue][INFO][/blue] LLM configuration is now managed in config.yml")
    # ... rest of function
```
Option 2: Enhance user guidance
Add example config.yml snippets in the console output to show users exactly what to edit.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backends/advanced/docker-compose.yml (3 hunks)
backends/advanced/init.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for import sorting in Python code
ALL imports must be at the top of the file after the docstring. NEVER import modules in the middle of functions or files. Use lazy imports sparingly and only when absolutely necessary for circular import issues.
Group imports in order: standard library, third-party, local imports
Always raise errors, never silently ignore. Use explicit error handling with proper exceptions rather than silent failures.
Do not add defensive `hasattr()` checks. Research and understand input/response or class structure instead.
Use `uv run python` or `uv run python3` instead of direct `python` or `python3` commands for Python execution
Files:
backends/advanced/init.py
**/docker-compose*.{yml,yaml}
📄 CodeRabbit inference engine (CLAUDE.md)
Check if src/ is volume mounted. If not, run
`docker compose build` so that code changes are reflected. Do not simply run `docker compose restart` as it will not rebuild the image.
Files:
backends/advanced/docker-compose.yml
🧠 Learnings (2)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/init.py
📚 Learning: 2025-12-16T14:24:28.301Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T14:24:28.301Z
Learning: Applies to **/docker-compose*.{yml,yaml} : Check if src/ is volume mounted. If not, run `docker compose build` so that code changes are reflected. Do not simply run `docker compose restart` as it will not rebuild the image.
Applied to files:
backends/advanced/docker-compose.yml
🧬 Code graph analysis (1)
backends/advanced/init.py (1)
extras/speaker-recognition/init.py (2)
print_section (42-46)
prompt_choice (89-104)
🔇 Additional comments (2)
backends/advanced/docker-compose.yml (2)
21-21: LGTM! Environment variables properly configured. The new environment variables align with the PR's support for additional providers:
- `PARAKEET_ASR_URL` added to both chronicle-backend and workers for offline transcription
- `GROQ_API_KEY` added to workers for LLM support

The configuration is consistent and properly placed.
Also applies to: 74-74, 76-76
15-15: The config.yml file exists at the repository root and is correctly mounted in both services with the proper relative path. The read-only mounts are appropriate for configuration files. No changes needed.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
282-289: Remove non-existent provider names from `__all__`. The `__all__` list exports `DeepgramProvider`, `DeepgramStreamingProvider`, `DeepgramStreamConsumer`, `ParakeetProvider`, and `ParakeetStreamingProvider`, but these are not imported into this module and will cause `ImportError` when other modules attempt to import them from the package. Replace with the registry-driven providers that are actually defined here.

```diff
 __all__ = [
     "get_transcription_provider",
-    "DeepgramProvider",
-    "DeepgramStreamingProvider",
-    "DeepgramStreamConsumer",
-    "ParakeetProvider",
-    "ParakeetStreamingProvider",
+    "RegistryBatchTranscriptionProvider",
+    "RegistryStreamingTranscriptionProvider",
+    "BaseTranscriptionProvider",
+    "BatchTranscriptionProvider",
+    "StreamingTranscriptionProvider",
 ]
```
🧹 Nitpick comments (4)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (4)
112-112: Specify the exception type being caught. Catching bare `Exception` hides errors and makes debugging difficult. Specify `ValueError` or `TypeError` if you're guarding against invalid index conversions.

```diff
 try:
     idx = int(idx_str)
-except Exception:
+except (ValueError, TypeError):
     return None
```
Based on learnings, this improves debuggability by allowing unexpected exceptions to propagate with proper context.
162-165: Log exceptions instead of silently ignoring them. Multiple `try-except-pass` blocks (lines 162-165, 215-220) silently suppress errors, making debugging difficult. At minimum, log the exceptions.
🔎 Apply these fixes:

```diff
 # Wait for confirmation; non-fatal if not provided
 try:
     await asyncio.wait_for(ws.recv(), timeout=2.0)
-except Exception:
-    pass
+except Exception as e:
+    logger.debug("No confirmation received from streaming STT: %s", e)
```

```diff
 try:
     # Drain until final or close
     for _ in range(500):  # hard cap
         msg = await asyncio.wait_for(ws.recv(), timeout=1.5)
         data = json.loads(msg)
         if not final_type or data.get("type") == final_type:
             final = data
             break
-except Exception:
-    pass
+except Exception as e:
+    logger.warning("Failed to receive final transcription: %s", e)
 try:
     await ws.close()
-except Exception:
-    pass
+except Exception as e:
+    logger.debug("Failed to close WebSocket cleanly: %s", e)
```
Based on learnings, logging exceptions improves debuggability. Consider using `logger.exception()` inside except blocks for automatic stack trace logging when appropriate.
Also applies to: 215-220
151-151: Use `copy.deepcopy()` for more efficient deep copying. The pattern `json.loads(json.dumps(obj))` works but is inefficient compared to Python's standard library.
Add `import copy` at the top, then:

```diff
-        start_msg = json.loads(json.dumps(start_msg))  # deep copy
+        start_msg = copy.deepcopy(start_msg)
```
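To illustrate the tradeoff: both approaches yield an independent copy, but `copy.deepcopy` also preserves non-JSON types (tuples, sets) and skips the serialize/parse round-trip. A quick sketch:

```python
import copy
import json

start_msg = {"type": "start", "config": {"sample_rate": 16000, "tags": ["a"]}}

via_json = json.loads(json.dumps(start_msg))  # round-trip copy
via_copy = copy.deepcopy(start_msg)           # stdlib deep copy

# Mutating the copy leaves the original untouched in both cases.
via_copy["config"]["tags"].append("b")
```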
256-256: Remove or document the unused `provider_name` parameter. The `provider_name` parameter is not used in the function body. If it's reserved for future extensibility or API compatibility, add a comment explaining this; otherwise, remove it.

```diff
-def get_transcription_provider(provider_name: Optional[str] = None, mode: Optional[str] = None) -> Optional[BaseTranscriptionProvider]:
+def get_transcription_provider(mode: Optional[str] = None) -> Optional[BaseTranscriptionProvider]:
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for import sorting in Python code
ALL imports must be at the top of the file after the docstring. NEVER import modules in the middle of functions or files. Use lazy imports sparingly and only when absolutely necessary for circular import issues.
Group imports in order: standard library, third-party, local imports
Always raise errors, never silently ignore. Use explicit error handling with proper exceptions rather than silent failures.
Do not add defensive `hasattr()` checks. Research and understand input/response or class structure instead.
Use `uv run python` or `uv run python3` instead of direct `python` or `python3` commands for Python execution
Files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
28-28: Avoid specifying long messages outside the exception class
(TRY003)
31-31: Avoid specifying long messages outside the exception class
(TRY003)
112-112: Do not catch blind exception: Exception
(BLE001)
134-134: Avoid specifying long messages outside the exception class
(TRY003)
137-137: Avoid specifying long messages outside the exception class
(TRY003)
164-165: try-except-pass detected, consider logging the exception
(S110)
164-164: Do not catch blind exception: Exception
(BLE001)
215-216: try-except-pass detected, consider logging the exception
(S110)
215-215: Do not catch blind exception: Exception
(BLE001)
219-220: try-except-pass detected, consider logging the exception
(S110)
219-219: Do not catch blind exception: Exception
(BLE001)
235-235: Do not catch blind exception: Exception
(BLE001)
256-256: Unused function argument: provider_name
(ARG001)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
Actionable comments posted: 2
🧹 Nitpick comments (3)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (3)
23-51: Good: Centralized dotted-path extraction. The `_dotted_get` helper addresses the previous code duplication concern. The implementation correctly handles nested dicts, lists, and bracketed indices.
Optional: Narrow the exception type
Line 41 could catch `ValueError` specifically instead of bare `Exception`, since `int()` only raises `ValueError`:

```diff
 try:
     idx = int(idx_str)
-except Exception:
+except ValueError:
     return None
```
71-136: Transcription implementation is sound. The method correctly builds requests from registry config, handles both multipart and raw audio formats, and normalizes responses.
Optional: Consider more robust templating
Line 94 uses simple string replacement for API key injection. If more complex templating is needed in the future, consider using `string.Template` or a similar mechanism:

```python
from string import Template

# ...
template = Template(v)
headers[k] = template.safe_substitute(DEEPGRAM_API_KEY=self.model.api_key)
```
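For context, `string.Template.safe_substitute` fills known placeholders and leaves unknown ones intact rather than raising, which suits header templates (the header value below is a made-up example, not the project's actual config):

```python
from string import Template

headers = {"Authorization": "Token ${DEEPGRAM_API_KEY}"}
resolved = {
    k: Template(v).safe_substitute(DEEPGRAM_API_KEY="dg-secret")
    for k, v in headers.items()
}

# Unknown placeholders survive untouched instead of raising KeyError.
partial = Template("x ${MISSING}").safe_substitute()
```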
243-266: Remove unused `provider_name` parameter. The `provider_name` parameter is not used in the function body. Either remove it or document why it's retained for backward compatibility.
🔎 Remove unused parameter

```diff
-def get_transcription_provider(provider_name: Optional[str] = None, mode: Optional[str] = None) -> Optional[BaseTranscriptionProvider]:
+def get_transcription_provider(mode: Optional[str] = None) -> Optional[BaseTranscriptionProvider]:
     """Return a registry-driven transcription provider.
 
     - mode="batch": HTTP-based STT (default)
     - mode="streaming": WebSocket-based STT
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for import sorting in Python code
ALL imports must be at the top of the file after the docstring. NEVER import modules in the middle of functions or files. Use lazy imports sparingly and only when absolutely necessary for circular import issues.
Group imports in order: standard library, third-party, local imports
Always raise errors, never silently ignore. Use explicit error handling with proper exceptions rather than silent failures.
Do not add defensive `hasattr()` checks. Research and understand input/response or class structure instead.
Use `uv run python` or `uv run python3` instead of direct `python` or `python3` commands for Python execution
Files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
41-41: Do not catch blind exception: Exception
(BLE001)
60-60: Avoid specifying long messages outside the exception class
(TRY003)
63-63: Avoid specifying long messages outside the exception class
(TRY003)
144-144: Avoid specifying long messages outside the exception class
(TRY003)
147-147: Avoid specifying long messages outside the exception class
(TRY003)
174-175: try-except-pass detected, consider logging the exception
(S110)
174-174: Do not catch blind exception: Exception
(BLE001)
225-226: try-except-pass detected, consider logging the exception
(S110)
225-225: Do not catch blind exception: Exception
(BLE001)
229-230: try-except-pass detected, consider logging the exception
(S110)
229-229: Do not catch blind exception: Exception
(BLE001)
243-243: Unused function argument: provider_name
(ARG001)
🔇 Additional comments (5)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (5)
9-18: Import ordering is now correct. The imports are properly grouped per coding guidelines: standard library, third-party, then local imports with appropriate spacing. This addresses the previous review feedback.
54-70: Registry-driven initialization looks good. The class correctly loads the default STT model from the registry and provides clear error messages when configuration is missing.
138-154: Streaming provider initialization follows the same registry pattern. Correctly loads the streaming STT model and initializes session tracking.
178-202: Audio chunk processing looks correct. The non-blocking read pattern with `asyncio.TimeoutError` at line 200 is appropriate for polling interim results without blocking the audio stream.
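The polling pattern in question: wrap the blocking read in `asyncio.wait_for` with a short timeout and treat `TimeoutError` as "no data yet". A generic sketch using a queue in place of the WebSocket:

```python
import asyncio

async def poll(q: asyncio.Queue, timeout: float = 0.01):
    """Return the next item if one arrives within `timeout`, else None."""
    try:
        return await asyncio.wait_for(q.get(), timeout=timeout)
    except asyncio.TimeoutError:
        return None  # nothing ready; caller keeps streaming audio

async def main():
    q = asyncio.Queue()
    first = await poll(q)          # no interim result yet
    await q.put("interim result")
    second = await poll(q)
    return first, second

first, second = asyncio.run(main())
```

Because the timeout is short, the audio loop is never blocked for long waiting on interim transcripts.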
269-276: Missing imports for `__all__` exports. The classes declared in `__all__` (`DeepgramProvider`, `DeepgramStreamingProvider`, `DeepgramStreamConsumer`, `ParakeetProvider`, `ParakeetStreamingProvider`) are defined in sibling modules but not imported in this file. Add imports at the top after the existing imports:

```python
from .deepgram import DeepgramProvider, DeepgramStreamingProvider, DeepgramStreamConsumer
from .parakeet import ParakeetProvider, ParakeetStreamingProvider
```
Likely an incorrect or invalid review comment.
```python
async def start_stream(self, client_id: str, sample_rate: int = 16000, diarize: bool = False):
    url = self.model.model_url
    ops = self.model.operations or {}
    start_msg = (ops.get("start", {}) or {}).get("message", {})
    # Inject session_id if placeholder present
    start_msg = json.loads(json.dumps(start_msg))  # deep copy
    start_msg.setdefault("session_id", client_id)
    # Apply sample rate and diarization if present
    if "config" in start_msg and isinstance(start_msg["config"], dict):
        start_msg["config"].setdefault("sample_rate", sample_rate)
        if diarize:
            start_msg["config"]["diarize"] = True

    ws = await websockets.connect(url, open_timeout=10)
    await ws.send(json.dumps(start_msg))
    # Wait for confirmation; non-fatal if not provided
    try:
        await asyncio.wait_for(ws.recv(), timeout=2.0)
    except Exception:
        pass
    self._streams[client_id] = {"ws": ws, "sample_rate": sample_rate, "final": None, "interim": []}
```
WebSocket timeout now present; consider logging exceptions.
Line 169 correctly adds `open_timeout=10`, addressing the previous review concern about indefinite hangs.
However, lines 174-175 silently ignore exceptions when waiting for confirmation. Per coding guidelines and learnings, exceptions should be logged for debuggability.
🔎 Add logging to the exception handler
```diff
 try:
     await asyncio.wait_for(ws.recv(), timeout=2.0)
-except Exception:
-    pass
+except Exception as e:
+    logger.debug("No confirmation received from streaming STT: %s", e)
```
Based on learnings, prefer logging.exception() inside except blocks or log with context when suppressing errors.
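For reference, `logging.exception()` inside an except block attaches the active exception's stack trace to the log record automatically, while keeping the failure non-fatal; a minimal self-contained sketch:

```python
import io
import logging

log = logging.getLogger("stt-demo")
stream = io.StringIO()
log.addHandler(logging.StreamHandler(stream))
log.setLevel(logging.DEBUG)

def wait_for_confirmation(received: bool):
    try:
        if not received:
            raise TimeoutError("no confirmation within 2s")
    except TimeoutError:
        # Logged with the full traceback, but not re-raised: non-fatal by design.
        log.exception("No confirmation received from streaming STT")

wait_for_confirmation(False)
output = stream.getvalue()
```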
🧰 Tools
🪛 Ruff (0.14.8)
174-175: try-except-pass detected, consider logging the exception
(S110)
174-174: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
around lines 156-177, the try/except that awaits the WebSocket confirmation
currently swallows all exceptions; update the except block to log the exception
(e.g., logging.exception or module logger) including contextual fields like
client_id and url so failures are visible, and add an import for logging or use
the existing logger if present; keep behavior non-fatal but ensure the caught
exception and context are emitted to logs.
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
Sorry wrong thread! My bad
…On Thu, 18 Dec 2025 at 15:48, Ankush Malaker ***@***.***> wrote:
*AnkushMalaker* left a comment (SimpleOpenSoftware/chronicle#208)
<#208 (comment)>
@thestumonkey <https://github.com/thestumonkey>
“All env will change later. This fixes the issue”
Not clear, this PR is also changing the env vars
If you have suggestions on this work please add it here and review this PR
so you dont redo or have excess conflicts
Or wait till this is merged to start your work on env vars
Pushed a branch with resolved conflicts if that helps you :)
#208
Thanks! #219 is merged which has these commits!
Summary by CodeRabbit
Configuration
Documentation
Chores
Tests