feat: add obsidian ingestion support#222
feat: add obsidian ingestion support#2220xrushi wants to merge 28 commits intoSimpleOpenSoftware:devfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the 📝 WalkthroughWalkthroughAdds full Obsidian ingestion: new frontend components and UI, backend API routes for upload/start/status, Redis/RQ enqueueing and worker job, Obsidian parsing/chunking/embedding service, Neo4j client/storage, chat integration flag, utilities, tests, and deployment/profile adjustments. Changes
Sequence Diagram(s)sequenceDiagram
participant User as Frontend User
participant UI as WebUI / Mobile UI
participant API as Backend API
participant Redis as Redis / RQ
participant Worker as RQ Worker
participant Neo4j as Neo4j DB
Note over User,Neo4j: Obsidian Ingest — Upload → Queue → Worker → DB
User->>UI: Upload vault.zip
UI->>API: POST /api/obsidian/upload_zip (multipart)
API->>API: extract zip, count .md, set pending key
API-->>UI: { job_id, vault_path, total_files }
User->>UI: Start ingestion
UI->>API: POST /api/obsidian/start { job_id }
API->>Redis: enqueue ingest_obsidian_vault_job(job_id, vault_path)
API->>Redis: delete pending key
API-->>UI: queued job_id
loop Poll status
UI->>API: GET /api/obsidian/status?job_id
API->>Redis: fetch RQ job + meta
API-->>UI: { status, processed, total, last_file, errors }
end
par Worker execution
Redis->>Worker: run ingest_obsidian_vault_job
Worker->>Worker: iterate .md files → parse → chunk → embed
Worker->>Neo4j: MERGE nodes/relationships / write vectors
Worker->>Redis: update job metadata (processed, last_file)
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 15
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (6)
backends/advanced/src/advanced_omi_backend/services/memory/config.py-165-174 (1)
165-174: Potential TypeError ifembedding_dimensionsis not set.If
embed_def.get('embedding_dimensions')returnsNone, theint()conversion on line 167 will raise aTypeError. Consider providing a default value or adding validation.🔎 Proposed fix with default value
- embedding_dims = int(resolve_value(embed_def.get('embedding_dimensions'))) + raw_dims = embed_def.get('embedding_dimensions') + if raw_dims is None: + raise ValueError("'embedding_dimensions' must be specified in the embedding model config") + embedding_dims = int(resolve_value(raw_dims))Or with a sensible default:
- embedding_dims = int(resolve_value(embed_def.get('embedding_dimensions'))) + embedding_dims = int(resolve_value(embed_def.get('embedding_dimensions', 1536)))backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py-23-24 (1)
23-24: Reconsider the underscore prefixes in import aliases for public functions.The imported functions
load_config_ymlandresolve_valueare defined as public APIs in the source module (no underscore prefix), but the aliases use underscore prefixes (_load_root_config,_resolve_value), which conventionally marks them as private/internal. Either import them without the underscore aliases for clarity, or document why these public APIs are being treated as internal in this module.extras/obsidian-injest/config.py-51-54 (1)
51-54: Missing config.yml doesn't raise — silent empty config.If
config.ymldoesn't exist,config_dataremains an empty dict and the code continues until it fails atget_model_config. Consider raising early with a clear error message.🔎 Proposed fix
config_data = {} if CONFIG_YML_PATH.exists(): with open(CONFIG_YML_PATH, 'r') as f: config_data = yaml.safe_load(f) +else: + raise FileNotFoundError(f"config.yml not found at {CONFIG_YML_PATH}")backends/advanced/src/advanced_omi_backend/llm_client.py-57-62 (1)
57-62: Error message references environment variables but config is from YAML.The error message mentions
OPENAI_API_KEY, OPENAI_BASE_URL, and OPENAI_MODELwhich are environment variable names, but the values now come fromconfig.yml. Update the message to reflect the actual config source.🔎 Proposed fix
self.api_key = api_key self.base_url = base_url self.model = model if not self.api_key or not self.base_url or not self.model: - raise ValueError("OPENAI_API_KEY, OPENAI_BASE_URL, and OPENAI_MODEL must be set") + raise ValueError("api_key, base_url, and model must be provided (check config.yml)")extras/obsidian-injest/config.py-84-85 (1)
84-85: Type annotation mismatch —.get()can returnNone.
NEO4J_USERandNEO4J_PASSWORDare typed asstrbutdict.get()returnsOptional[str]. This could cause runtime issues if the values are missing.🔎 Proposed fix
-NEO4J_USER: str = env_data.get("NEO4J_USER") -NEO4J_PASSWORD: str = env_data.get("NEO4J_PASSWORD") +NEO4J_USER: str = env_data.get("NEO4J_USER", "neo4j") +NEO4J_PASSWORD: str = env_data.get("NEO4J_PASSWORD", "")Or add validation similar to NEO4J_HOST if these are required.
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py-192-200 (1)
192-200: Cancellation endpoint sets status but worker never checks it.The
/cancelendpoint setsstatusto"cancel_requested", but the_ingest_workerloop never checks this flag. Users calling this endpoint will receive a success response but ingestion will continue until completion. Consider either implementing cooperative cancellation in the worker or removing this endpoint to avoid user confusion.Option 1: Implement cooperative cancellation in worker
Add a check in the worker loop:
for file in files: + # Check for cancellation + with _jobs_lock: + if _jobs.get(job_id, {}).get("status") == "cancel_requested": + _jobs[job_id]["status"] = "cancelled" + logger.info(f"Job {job_id} cancelled by user") + return if not file.endswith('.md'): continueCommittable suggestion skipped: line range outside the PR's diff.
🧹 Nitpick comments (14)
backends/advanced/webui/src/components/MemorySettings.tsx (2)
4-4: Admin gating implemented correctly.The component properly restricts access to admins using the auth context and returns null for non-admin users, effectively hiding the entire UI. This is a solid security pattern.
💡 Optional: Prevent unnecessary API call for non-admins
The
useEffecthook at line 21 callsloadMemoryConfig()even when the user is not an admin (hooks execute before the early return). While this is likely protected by routing, you could optimize it by guarding the effect:useEffect(() => { + if (!isAdmin) return loadMemoryConfig() -}, []) +}, [isAdmin])This prevents the API call from being made for non-admin users who might briefly access this component.
Also applies to: 18-18, 134-137
210-215: Contextual admin guidance is helpful.The conditional helper text guides admins when authorization or endpoint availability errors occur, improving the troubleshooting experience.
💡 Optional: Use error codes instead of string matching
The current implementation uses string matching (
error.includes('does not expose')) to determine which guidance to show. While this works, it's fragile—if the error message text changes (lines 37, 39), the condition will silently break.Consider introducing error codes or flags:
const [errorCode, setErrorCode] = useState<'unauthorized' | 'no_endpoints' | 'generic' | null>(null) // In loadMemoryConfig catch block: if (status === 401) { setErrorCode('unauthorized') setError('Unauthorized: admin privileges required...') } else if (status === 404 || status === 405) { setErrorCode('no_endpoints') setError('Backend does not expose memory configuration endpoints') } // In helper text: {errorCode && ['unauthorized', 'no_endpoints'].includes(errorCode) && ( // ... guidance )}This decouples the UI logic from the exact error message wording.
backends/advanced/webui/src/pages/Upload.tsx (2)
36-36: Consider adding a proper type forobsidianStatus.Using
anytype loses type safety. Define an interface for the status response to improve maintainability.🔎 Suggested type definition
interface ObsidianIngestionStatus { status: 'pending' | 'running' | 'completed' | 'failed' processed: number total: number percent: number last_file?: string error?: string errors?: string[] } // Then use: const [obsidianStatus, setObsidianStatus] = useState<ObsidianIngestionStatus | null>(null)
190-212: Minor: Consider adding a mounted flag to prevent state updates after unmount.If the component unmounts while a
poll()request is in flight,setObsidianStatusorsetObsidianPollingmay be called on an unmounted component.🔎 Suggested fix with mounted flag
useEffect(() => { let interval: number | undefined + let mounted = true if (obsidianPolling && obsidianJobId) { const poll = async () => { try { const res = await obsidianApi.status(obsidianJobId) + if (!mounted) return setObsidianStatus(res.data) if (res.data.status === 'completed' || res.data.status === 'failed') { setObsidianPolling(false) setObsidianMessage(res.data.status === 'completed' ? '✅ Ingestion completed' : `❌ Failed: ${res.data.error || 'unknown error'}`) } } catch (err) { + if (!mounted) return setObsidianPolling(false) setObsidianError('Failed to fetch status') } } poll() interval = window.setInterval(poll, 1500) } return () => { + mounted = false if (interval) window.clearInterval(interval) } }, [obsidianPolling, obsidianJobId])app/app/components/ObsidianIngest.tsx (2)
1-4: Remove unused import and blank line.
ActivityIndicatoris imported but never used in the component. Also, there's an unnecessary blank line at the start of the file.🔎 Proposed fix
- -import React, { useState } from 'react'; -import { View, Text, TextInput, TouchableOpacity, StyleSheet, Alert, ActivityIndicator } from 'react-native'; +import React, { useState } from 'react'; +import { View, Text, TextInput, TouchableOpacity, StyleSheet, Alert } from 'react-native';
39-46: Consider adding a timeout for the fetch request.The ingestion operation could take a while for large vaults. Without a timeout, the request could hang indefinitely if the server becomes unresponsive. Consider using
AbortControllerwith a timeout.🔎 Proposed fix using AbortController
+ const controller = new AbortController(); + const timeoutId = setTimeout(() => controller.abort(), 300000); // 5 min timeout + const response = await fetch(`${baseUrl}/api/obsidian/ingest`, { method: 'POST', headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${jwtToken}` }, - body: JSON.stringify({ vault_path: vaultPath }) + body: JSON.stringify({ vault_path: vaultPath }), + signal: controller.signal }); + + clearTimeout(timeoutId);backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
90-91: Misleading function name.The function is named
build_memory_config_from_envbut the docstring correctly states it builds configuration "strictly from config.yml". Consider renaming tobuild_memory_config_from_ymlorbuild_memory_configfor clarity.backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
41-65: Duplicate utility functions — consider reusing from shared module.
load_env_fileandresolve_valueare duplicated across multiple files:
backends/advanced/src/advanced_omi_backend/services/memory/config.pyextras/obsidian-injest/config.py- This file
Consider importing from the shared
services/memory/config.pymodule instead of duplicating these functions.🔎 Proposed refactor
-from advanced_omi_backend.services.memory.providers.llm_providers import _get_openai_client -from advanced_omi_backend.services.memory.config import load_config_yml as load_root_config +from advanced_omi_backend.services.memory.providers.llm_providers import _get_openai_client +from advanced_omi_backend.services.memory.config import ( + load_config_yml as load_root_config, + resolve_value, +) +from advanced_omi_backend.services.memory.config import load_env_file # If added thereThen remove the local
load_env_fileandresolve_valuedefinitions.Also applies to: 67-81
107-113: Redundant checks afterget_model_configalready raises.The inner
get_model_configfunction raisesValueErrorif the config is not found (lines 101, 105), so the subsequentif not llm_configandif not embed_configchecks on lines 108-113 are unreachable. Remove the redundant checks or adjust the helper to returnNoneinstead of raising.🔎 Proposed fix — remove redundant checks
llm_config = get_model_config('llm') - if not llm_config: - raise ValueError("Configuration for 'defaults.llm' not found in config.yml") embed_config = get_model_config('embedding') - if not embed_config: - raise ValueError("Configuration for 'defaults.embedding' not found in config.yml")extras/obsidian-injest/search_query.py (1)
24-27: Consider adding error handling for embedding failures.Unlike the
ObsidianService.get_embedding()which returnsNoneon failure, this version will propagate exceptions. For a CLI test tool this may be acceptable, but adding a try-except withlogging.exception()would improve debuggability.backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (1)
496-526: Consider atomic write for config updates.
update_memory_config_rawwrites directly to the config file. If the write fails mid-way, the file could be corrupted. Consider writing to a temp file first, then renaming (atomic on most filesystems).🔎 Proposed atomic write pattern
+import tempfile + async def update_memory_config_raw(config_yaml: str): """Save memory YAML into config.yml and hot-reload memory service.""" try: mem = _parse_memory_yaml(config_yaml) path, data = _find_config_yml_path() # Merge and write data["memory"] = mem # Ensure parents path.parent.mkdir(parents=True, exist_ok=True) - with open(path, "w") as f: - yaml.safe_dump(data, f, sort_keys=False) + # Write atomically via temp file + with tempfile.NamedTemporaryFile( + mode="w", dir=path.parent, delete=False, suffix=".tmp" + ) as tmp: + yaml.safe_dump(data, tmp, sort_keys=False) + tmp_path = Path(tmp.name) + tmp_path.replace(path)extras/obsidian-injest/config.py (2)
16-31: Significant code duplication across files.
load_env_file,resolve_value, andget_model_configare duplicated in:
backends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/services/memory/config.py- This file
Since
extras/obsidian-injest/is a separate utility, some duplication may be intentional for standalone operation. However, if these utilities share the same codebase, consider extracting to a shared package.Also applies to: 34-42, 59-67
124-139: Minor:__all__is unsorted.Static analysis suggests sorting
__all__for consistency. This is a minor style issue.🔎 Sorted __all__
__all__ = [ - "VAULT_PATH", - "NEO4J_URI", - "NEO4J_USER", - "NEO4J_PASSWORD", + "CHUNK_CHAR_LIMIT", + "CHUNK_OVERLAP", "EMBEDDING_MODEL", "EMBEDDING_DIMENSIONS", "LLM_MODEL", - "CHUNK_CHAR_LIMIT", - "CHUNK_OVERLAP", + "LOG_FORMAT", + "LOG_LEVEL", + "NEO4J_PASSWORD", + "NEO4J_URI", + "NEO4J_USER", "OPENAI_BASE_URL", "OPENAI_API_KEY", - "LOG_LEVEL", - "LOG_FORMAT", + "VAULT_PATH", "client", ]backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (1)
10-10: Remove unused importBackgroundTasks.
BackgroundTasksis imported but never used in this module.Proposed fix
-from fastapi import APIRouter, Depends, HTTPException, BackgroundTasks, UploadFile, File, Body +from fastapi import APIRouter, Depends, HTTPException, UploadFile, File, Body
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
app/app/components/ObsidianIngest.tsx(1 hunks)app/app/index.tsx(2 hunks)backends/advanced/docker-compose.yml(5 hunks)backends/advanced/pyproject.toml(1 hunks)backends/advanced/src/advanced_omi_backend/controllers/system_controller.py(4 hunks)backends/advanced/src/advanced_omi_backend/llm_client.py(3 hunks)backends/advanced/src/advanced_omi_backend/memory_config_loader.py(0 hunks)backends/advanced/src/advanced_omi_backend/routers/api_router.py(2 hunks)backends/advanced/src/advanced_omi_backend/routers/modules/__init__.py(2 hunks)backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py(5 hunks)backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py(1 hunks)backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py(1 hunks)backends/advanced/src/advanced_omi_backend/services/memory/config.py(2 hunks)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py(1 hunks)backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py(2 hunks)backends/advanced/webui/src/components/MemorySettings.tsx(8 hunks)backends/advanced/webui/src/pages/Upload.tsx(4 hunks)backends/advanced/webui/src/services/api.ts(2 hunks)extras/obsidian-injest/config.py(1 hunks)extras/obsidian-injest/search_query.py(1 hunks)
💤 Files with no reviewable changes (1)
- backends/advanced/src/advanced_omi_backend/memory_config_loader.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/routers/api_router.pybackends/advanced/src/advanced_omi_backend/services/transcription/__init__.pybackends/advanced/src/advanced_omi_backend/services/memory/config.pyextras/obsidian-injest/config.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.pyextras/obsidian-injest/search_query.pybackends/advanced/src/advanced_omi_backend/routers/modules/__init__.pybackends/advanced/src/advanced_omi_backend/controllers/system_controller.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.pybackends/advanced/src/advanced_omi_backend/routers/modules/system_routes.pybackends/advanced/src/advanced_omi_backend/llm_client.py
app/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use TypeScript following standard React Native conventions for mobile app development
Files:
app/app/index.tsxapp/app/components/ObsidianIngest.tsx
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/routers/api_router.pybackends/advanced/src/advanced_omi_backend/services/transcription/__init__.pybackends/advanced/src/advanced_omi_backend/services/memory/config.pyextras/obsidian-injest/config.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.pyextras/obsidian-injest/search_query.pybackends/advanced/src/advanced_omi_backend/routers/modules/__init__.pybackends/advanced/src/advanced_omi_backend/controllers/system_controller.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.pybackends/advanced/src/advanced_omi_backend/routers/modules/system_routes.pybackends/advanced/src/advanced_omi_backend/llm_client.py
🧬 Code graph analysis (12)
backends/advanced/webui/src/components/MemorySettings.tsx (1)
backends/advanced/webui/src/contexts/AuthContext.tsx (1)
useAuth(115-121)
app/app/index.tsx (1)
app/app/components/ObsidianIngest.tsx (1)
ObsidianIngest(10-91)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
load_config_yml(72-88)resolve_value(57-70)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (3)
extras/obsidian-injest/config.py (1)
resolve_value(34-42)backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
MemoryProvider(29-34)backends/advanced/src/advanced_omi_backend/services/transcription/base.py (1)
name(65-67)
backends/advanced/webui/src/pages/Upload.tsx (1)
backends/advanced/webui/src/services/api.ts (1)
obsidianApi(230-247)
extras/obsidian-injest/config.py (2)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (3)
load_env_file(41-65)resolve_value(67-81)get_model_config(98-105)backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
resolve_value(57-70)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (3)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
_get_openai_client(45-69)extras/obsidian-injest/config.py (3)
load_env_file(16-31)resolve_value(34-42)get_model_config(59-67)extras/obsidian-injest/search_query.py (1)
get_embedding(24-27)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (1)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
load_config_yml(72-88)resolve_value(57-70)
extras/obsidian-injest/search_query.py (1)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
get_embedding(199-216)close(171-175)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (3)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
load_config_yml(72-88)backends/advanced/src/advanced_omi_backend/services/memory/service_factory.py (2)
reset_memory_service(127-133)get_memory_service(73-110)backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (3)
get_memory_config_raw(77-79)validate_memory_config_raw(82-87)update_memory_config_raw(90-95)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (1)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (4)
get_memory_config_raw(441-451)validate_memory_config_raw(470-493)update_memory_config_raw(496-526)reload_memory_config(529-538)
backends/advanced/src/advanced_omi_backend/llm_client.py (2)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
load_config_yml(72-88)resolve_value(57-70)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
resolve_value(67-81)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
62-62: Local variable api_family is assigned to but never used
Remove assignment to unused variable api_family
(F841)
80-80: Use raise without specifying exception name
Remove exception name
(TRY201)
backends/advanced/src/advanced_omi_backend/services/memory/config.py
88-88: Avoid specifying long messages outside the exception class
(TRY003)
98-98: Avoid specifying long messages outside the exception class
(TRY003)
137-137: Avoid specifying long messages outside the exception class
(TRY003)
141-141: Avoid specifying long messages outside the exception class
(TRY003)
159-159: Avoid specifying long messages outside the exception class
(TRY003)
extras/obsidian-injest/config.py
71-71: Avoid specifying long messages outside the exception class
(TRY003)
75-75: Avoid specifying long messages outside the exception class
(TRY003)
80-80: Avoid specifying long messages outside the exception class
(TRY003)
124-139: __all__ is not sorted
Apply an isort-style sorting to __all__
(RUF022)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
101-101: Avoid specifying long messages outside the exception class
(TRY003)
105-105: Avoid specifying long messages outside the exception class
(TRY003)
109-109: Avoid specifying long messages outside the exception class
(TRY003)
113-113: Avoid specifying long messages outside the exception class
(TRY003)
133-133: Avoid specifying long messages outside the exception class
(TRY003)
210-210: Multiple statements on one line (colon)
(E701)
214-214: Do not catch blind exception: Exception
(BLE001)
215-215: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
329-329: Probable use of insecure hash functions in hashlib: md5
(S324)
362-362: Avoid specifying long messages outside the exception class
(TRY003)
370-370: Unused blanket noqa directive
Remove unused noqa directive
(RUF100)
380-380: Do not catch blind exception: Exception
(BLE001)
381-381: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
382-382: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
100-100: Do not catch blind exception: Exception
(BLE001)
102-102: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
102-102: Avoid specifying long messages outside the exception class
(TRY003)
102-102: Undefined name e
(F821)
246-246: Do not catch blind exception: Exception
(BLE001)
248-248: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
448-448: Consider moving this statement to an else block
(TRY300)
449-449: Do not catch blind exception: Exception
(BLE001)
450-450: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
451-451: Use explicit conversion flag
Replace with conversion flag
(RUF010)
458-458: Prefer TypeError exception for invalid type
(TRY004)
458-458: Abstract raise to an inner function
(TRY301)
458-458: Avoid specifying long messages outside the exception class
(TRY003)
464-464: Prefer TypeError exception for invalid type
(TRY004)
464-464: Abstract raise to an inner function
(TRY301)
464-464: Avoid specifying long messages outside the exception class
(TRY003)
465-465: Consider moving this statement to an else block
(TRY300)
466-466: Do not catch blind exception: Exception
(BLE001)
467-467: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
467-467: Avoid specifying long messages outside the exception class
(TRY003)
488-488: Consider moving this statement to an else block
(TRY300)
491-491: Do not catch blind exception: Exception
(BLE001)
492-492: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
493-493: Use explicit conversion flag
Replace with conversion flag
(RUF010)
513-513: Do not catch blind exception: Exception
(BLE001)
524-524: Do not catch blind exception: Exception
(BLE001)
525-525: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
526-526: Use explicit conversion flag
Replace with conversion flag
(RUF010)
535-535: Consider moving this statement to an else block
(TRY300)
536-536: Do not catch blind exception: Exception
(BLE001)
537-537: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
538-538: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
27-27: Unused function argument: current_user
(ARG001)
27-27: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
38-38: Consider moving this statement to an else block
(TRY300)
39-39: Do not catch blind exception: Exception
(BLE001)
40-40: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
41-41: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
51-51: Loop control variable root not used within loop body
Rename unused root to _root
(B007)
64-64: Do not catch blind exception: Exception
(BLE001)
94-94: Do not catch blind exception: Exception
(BLE001)
95-95: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
98-98: Use explicit conversion flag
Replace with conversion flag
(RUF010)
103-103: Do not catch blind exception: Exception
(BLE001)
104-104: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
113-113: Do not perform function call File in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
114-114: Unused function argument: current_user
(ARG001)
114-114: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
149-149: Do not catch blind exception: Exception
(BLE001)
150-150: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
151-151: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
157-157: Unused function argument: current_user
(ARG001)
157-157: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
179-179: Unused function argument: current_user
(ARG001)
179-179: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
193-193: Unused function argument: current_user
(ARG001)
193-193: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py
77-77: Unused function argument: current_user
(ARG001)
77-77: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
84-84: Unused function argument: current_user
(ARG001)
84-84: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
92-92: Unused function argument: current_user
(ARG001)
92-92: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
98-98: Unused function argument: current_user
(ARG001)
98-98: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
backends/advanced/src/advanced_omi_backend/llm_client.py
158-158: Avoid specifying long messages outside the exception class
(TRY003)
162-162: Avoid specifying long messages outside the exception class
(TRY003)
166-166: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (21)
backends/advanced/webui/src/components/MemorySettings.tsx (2)
34-42: Enhanced error handling improves UX.The differentiated error messages (401 unauthorized, 404/405 endpoints not exposed, generic errors) help users quickly understand the root cause and take appropriate action.
224-224: Button disabling logic prevents futile operations.Disabling validate and save actions when an error exists (e.g., unauthorized, endpoints unavailable) prevents users from attempting operations that will fail. Users can still reload or reset to recover.
Also applies to: 242-242
backends/advanced/docker-compose.yml (1)
11-11: LGTM!The
config.ymlvolume mounts enable the config-driven architecture for both the backend and workers containers.Also applies to: 67-67
backends/advanced/src/advanced_omi_backend/routers/modules/__init__.py (1)
23-23: LGTM!The import and export follow the established pattern for router modules in this package.
Also applies to: 36-36
backends/advanced/src/advanced_omi_backend/routers/api_router.py (1)
18-18: LGTM!The obsidian router is correctly imported and included in the main API router, consistent with other sub-routers.
Also applies to: 38-38
backends/advanced/webui/src/pages/Upload.tsx (1)
449-542: LGTM!The Obsidian Vault Import UI is well-structured with appropriate loading states, progress indicators, and error handling. The
isAdminguard at line 451 is redundant (the component returns early at line 239 for non-admins) but serves as defensive programming.app/app/index.tsx (1)
31-31: LGTM!The
ObsidianIngestcomponent is correctly integrated with authentication-gated rendering. The component appropriately handles WebSocket-to-HTTP URL conversion internally for API calls.Also applies to: 542-548
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
53-77: Config-driven provider selection looks correct.The logic properly handles provider selection from
config.ymlwith appropriate warnings for missing configurations. The transition from environment variables to config-driven architecture is consistent with the broader PR goals.Verify that all deployment environments have been updated with
config.ymlcontaining the required STT configuration, as this change removes environment variable fallbacks.backends/advanced/pyproject.toml (1)
13-13: LGTM!The
neo4j>=5.0.0dependency is required for the ObsidianService Neo4j integration. The version constraint is appropriate for Neo4j 5.x and newer versions.backends/advanced/webui/src/services/api.ts (1)
230-247: LGTM! Well-structured Obsidian API client.The implementation follows existing patterns in the file (similar to
uploadApi) and correctly handles:
- Multipart form data upload with progress tracking
- Reasonable 5-minute timeout for large vault uploads
- Consistent endpoint naming matching the backend routes
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (1)
75-100: LGTM! Clean transition to config.yml-based configuration.The endpoint updates correctly:
- Added dedicated validation endpoint (
/admin/memory/config/validate) that doesn't persist- Switched to
media_type="text/plain"for accepting raw YAML content- Updated docstrings to reflect config.yml usage
Regarding the static analysis hints: The
Depends()in argument defaults and "unused"current_userparameters are standard FastAPI patterns for dependency injection and authorization enforcement—these are false positives.app/app/components/ObsidianIngest.tsx (1)
61-91: LGTM! Clean and accessible UI implementation.The component structure is well-organized with appropriate:
- Visual feedback for loading state via button text and disabled styling
- Helpful instructions for users about the vault path requirement
- Proper accessibility attributes (
autoCapitalize,autoCorrect)backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
56-70: LGTM! Environment variable resolution logic.The implementation correctly handles both
${VAR}and${VAR:-default}syntax patterns and safely returns the original value for non-matching inputs.Note: Similar logic exists in
extras/obsidian-injest/config.py. Consider extracting to a shared utility if this pattern is used across more modules to avoid duplication.
72-88: LGTM! Robust config file discovery.The multi-path search strategy appropriately handles Docker, development, and relative path scenarios. Using
yaml.safe_loadis the correct secure choice.backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (1)
88-101: Consider graceful degradation vs failing the health check.The current logic sets fallback values but then immediately raises an exception, so the fallbacks are never used. Decide on the intended behavior:
- Fail health check if config is missing - remove fallback assignments (lines 101)
- Gracefully degrade - log the error and continue with fallbacks (remove the raise)
The current implementation is contradictory.
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
326-336: MD5 for chunk IDs is fine for non-cryptographic use.The static analysis flags MD5 as insecure (S324), but here it's used for generating deterministic chunk identifiers, not for security purposes. This is acceptable. However, consider using
hashlib.sha256if you want to avoid lint warnings without suppression.backends/advanced/src/advanced_omi_backend/llm_client.py (1)
150-172: Config-driven factory implementation looks good.The factory correctly loads config, validates the
api_family, resolves values, and constructs the client. Good use of shared utilities frommemory/config.pyinstead of duplicating code.extras/obsidian-injest/search_query.py (1)
108-117: Logger output won't appear without explicit handler setup.The
__main__block useslogger.info()but since this is module-level logging and Python's default logger has no handlers, output won't appear unlessconfig.py's_configure_logging()runs first. Since config is imported, this should work, but consider usingprint()for CLI output or verifying the logging setup.backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (1)
420-438: Good: Consistent config path resolution matching other modules.The
_find_config_yml_pathfunction follows the same search order asload_config_ymlinmemory/config.py, maintaining consistency across the codebase.extras/obsidian-injest/config.py (1)
118-119: Module-level OpenAI client instantiation may crash imports.Similar to
obsidian_service.py, instantiating the OpenAI client at module load time means any configuration error will prevent the entire module from being imported. For a CLI utility this is likely acceptable, but worth noting.backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (1)
178-189: LGTM!The status endpoint correctly computes progress percentage and handles the job-not-found case appropriately.
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backends/advanced/docker-compose.yml (1)
44-45: Document the steps to enable Neo4j for Obsidian ingestion.The PR description states that Neo4j is required for Obsidian ingestion but is disabled by default. Users will need to manually uncomment the
neo4j-mem0service (lines 184-203) and its correspondingdepends_onentry (lines 44-45) to use this feature.Consider adding a comment block above the neo4j service or updating the README with clear instructions on:
- Which lines to uncomment
- How to configure NEO4J_PASSWORD in .env
- Any initialization steps required after enabling the service
Note: The past review comment about a critical mismatch between depends_on and service definition is now resolved since both are consistently commented out.
Also applies to: 184-203
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backends/advanced/docker-compose.yml(4 hunks)
🔇 Additional comments (1)
backends/advanced/docker-compose.yml (1)
186-186: LGTM! Configuration improvements for Neo4j service.The additions to the commented neo4j-mem0 service definition are well-chosen:
hostnameenables reliable service discovery- Listen address configurations on
0.0.0.0allow proper network access- Memory heap settings (512m initial, 2G max) provide reasonable defaults
These will be valuable when users enable the Neo4j service.
Also applies to: 195-199
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (6)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (2)
491-494: Uselogging.exception()instead oflogging.error()in except block.This was flagged in a previous review. Per project learnings,
logging.exception()should be used in except blocks to automatically log the full stack trace.
530-539: LGTM with minor note on logging.The reload logic is straightforward and correctly uses
reset_memory_service()followed byget_memory_service()to reinitialize. Thelogger.errorvslogger.exceptionissue was already flagged in a previous review.backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (2)
62-62: Remove unusedapi_familyvariable.This issue was already flagged in a previous review. The
api_familyvariable is assigned but never used.
78-80: Consider usinglogger.exception()instead oflogger.warning()in the except block.The bare
raiseis correct now. However, the past review comment also suggested usinglogger.exception()to automatically log the full stack trace. Based on learnings from this repository.backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (1)
100-102: Exception handling has dead code and missing exception chaining.Two issues remain:
Dead code: Line 101 assigns default values, but line 102 immediately raises an exception, so those defaults are never used.
Missing exception chaining: Per static analysis (B904) and project conventions, use
raise ... from eto preserve the original context.For a health check endpoint, consider gracefully degrading instead of failing:
🔎 Proposed fix (graceful degradation)
except Exception as e: + logger.warning(f"Failed to load configuration from config.yml for health check: {e}") _llm_provider, _llm_model, _llm_base_url, _stt_name = "openai", None, None, None - raise ValueError(f"Failed to load configuration from config.yml for health check: {e}")🔎 Alternative fix (if raising is intended)
except Exception as e: - _llm_provider, _llm_model, _llm_base_url, _stt_name = "openai", None, None, None - raise ValueError(f"Failed to load configuration from config.yml for health check: {e}") + raise ValueError("Failed to load configuration from config.yml for health check") from ebackends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
208-216: Uselogging.exception()and fix multi-statement line.Line 210 has multiple statements on one line. Per project learnings, use
logging.exception()inside except blocks to automatically capture the full stack trace.🔎 Proposed fix
def get_embedding(self, text: str) -> Optional[List[float]]: ... try: clean_text = re.sub(r'\s+', ' ', text).strip() - if not clean_text: return None + if not clean_text: + return None response = self.client.embeddings.create(input=[clean_text], model=self.embedding_model) return response.data[0].embedding except Exception as e: - logger.error(f"Embedding failed for text length {len(text)}: {e}") + logger.exception("Embedding failed for text length %d", len(text)) return NoneBased on learnings, always use
logging.exception()in except blocks to preserve the full stack trace for debugging.
🧹 Nitpick comments (7)
extras/obsidian-injest/search_query.py (1)
106-115: Consider adding error handling in the CLI test block.While this is a testing script and crashing on errors might be acceptable, adding a try/except block would provide better user experience with informative error messages rather than stack traces.
🔎 Proposed enhancement
if __name__ == "__main__": user_query: str = "How do I create a table in dataview?" - final_response = graph_rag_query(user_query) - - logger.info("%s", "=" * 50) - logger.info("QUESTION: %s", user_query) - logger.info("%s", "-" * 50) - logger.info("AI ANSWER:\n%s", final_response) - logger.info("%s", "=" * 50) + try: + final_response = graph_rag_query(user_query) + + logger.info("%s", "=" * 50) + logger.info("QUESTION: %s", user_query) + logger.info("%s", "-" * 50) + logger.info("AI ANSWER:\n%s", final_response) + logger.info("%s", "=" * 50) + except Exception as e: + logger.exception("Query failed") + raisebackends/advanced/src/advanced_omi_backend/controllers/system_controller.py (4)
19-23: Group third-party imports together per coding guidelines.The
yamlimport (third-party) at line 20 is placed between local import blocks. Per coding guidelines, imports should be grouped: standard library, third-party, then local imports.🔎 Proposed fix
from advanced_omi_backend.models.user import User from advanced_omi_backend.task_manager import get_task_manager from fastapi.responses import JSONResponse -import yaml - -from advanced_omi_backend.services.memory.config import load_config_yml as _load_root_config +import yaml # Move after fastapi.responses in third-party group + +from advanced_omi_backend.config import ( + load_diarization_settings_from_file, + save_diarization_settings_to_file, +) +from advanced_omi_backend.models.user import User +from advanced_omi_backend.task_manager import get_task_manager +from advanced_omi_backend.services.memory.config import load_config_yml as _load_root_config from advanced_omi_backend.services.memory.service_factory import reset_memory_service, get_memory_serviceOr simply reorganize so all third-party imports (
fastapi,yaml) are together before local imports.
22-22: Unused import and code duplication withload_config_yml.
_load_root_configis imported at line 22 but never used. Additionally,_find_config_yml_pathduplicates the path search logic fromload_config_ymlinservices/memory/config.py.Consider either:
- Removing the unused import and keeping the local implementation, or
- Refactoring
load_config_ymlto also return the resolved path, then reusing it here🔎 Option 1: Remove unused import
-from advanced_omi_backend.services.memory.config import load_config_yml as _load_root_config from advanced_omi_backend.services.memory.service_factory import reset_memory_service, get_memory_serviceAlso applies to: 420-438
449-451: Simplify exception handling in except block.
logger.exception()automatically logs the full traceback, making the{e}in the message redundant. Also, use bareraiseinstead ofraise eto re-raise the original exception cleanly.🔎 Proposed fix
except Exception as e: - logger.exception(f"Failed to load memory configuration: {e}") - raise e + logger.exception("Failed to load memory configuration") + raise
454-468: ConsiderTypeErrorfor type checks and simplify exception handling.
- Lines 458 and 464 use
ValueErrorfor type validation, butTypeErroris more semantically appropriate when validating types.- Same as above:
{e}is redundant inlogger.exception()andraise eshould be bareraise.🔎 Proposed fix
def _parse_memory_yaml(config_yaml: str) -> dict: try: doc = yaml.safe_load(config_yaml) or {} if not isinstance(doc, dict): - raise ValueError("YAML must define a mapping") + raise TypeError("YAML must define a mapping") if "memory" in doc: mem = doc.get("memory") or {} else: mem = doc if not isinstance(mem, dict): - raise ValueError("'memory' section must be a mapping") + raise TypeError("'memory' section must be a mapping") return mem except Exception as e: - logger.exception(f"Invalid YAML: {e}") - raise e + logger.exception("Invalid YAML") + raisebackends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
41-81: Consolidate duplicated utility functions.Both
load_env_fileandresolve_valueare duplicated fromextras/obsidian-injest/config.py. Theresolve_valuefunction also exists inbackends/advanced/src/advanced_omi_backend/services/memory/config.py(line 71-87).Consider moving these utilities to a shared location (e.g., a common config utilities module) to reduce duplication and improve maintainability.
328-336: Consider using SHA-256 or UUID instead of MD5 for chunk IDs.While MD5 is acceptable for non-cryptographic ID generation, using a more modern hash function like SHA-256 or UUID would avoid static analysis warnings and follow current best practices.
🔎 Proposed alternatives
Option 1: Use SHA-256
- chunk_id = hashlib.md5(f"{note_data['path']}_{i}".encode()).hexdigest() + chunk_id = hashlib.sha256(f"{note_data['path']}_{i}".encode()).hexdigest()Option 2: Use UUID5 (namespace-based)
+import uuid ... - chunk_id = hashlib.md5(f"{note_data['path']}_{i}".encode()).hexdigest() + chunk_id = str(uuid.uuid5(uuid.NAMESPACE_DNS, f"{note_data['path']}_{i}"))
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py(4 hunks)backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py(5 hunks)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py(1 hunks)backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py(2 hunks)extras/obsidian-injest/search_query.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/controllers/system_controller.pyextras/obsidian-injest/search_query.pybackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/controllers/system_controller.pyextras/obsidian-injest/search_query.pybackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
🧬 Code graph analysis (4)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (4)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
_get_openai_client(45-69)backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
load_config_yml(72-88)resolve_value(57-70)extras/obsidian-injest/config.py (3)
load_env_file(16-31)resolve_value(34-42)get_model_config(59-67)extras/obsidian-injest/search_query.py (1)
get_embedding(24-27)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (3)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
load_config_yml(72-88)backends/advanced/src/advanced_omi_backend/services/memory/service_factory.py (2)
reset_memory_service(127-133)get_memory_service(73-110)backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (4)
get_memory_config_raw(77-79)validate_memory_config_raw(82-87)update_memory_config_raw(90-95)reload_memory_config(98-100)
extras/obsidian-injest/search_query.py (1)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
get_embedding(199-216)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (1)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
load_config_yml(72-88)resolve_value(57-70)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
62-62: Local variable api_family is assigned to but never used
Remove assignment to unused variable api_family
(F841)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
101-101: Avoid specifying long messages outside the exception class
(TRY003)
105-105: Avoid specifying long messages outside the exception class
(TRY003)
109-109: Avoid specifying long messages outside the exception class
(TRY003)
113-113: Avoid specifying long messages outside the exception class
(TRY003)
133-133: Avoid specifying long messages outside the exception class
(TRY003)
210-210: Multiple statements on one line (colon)
(E701)
214-214: Do not catch blind exception: Exception
(BLE001)
215-215: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
329-329: Probable use of insecure hash functions in hashlib: md5
(S324)
362-362: Avoid specifying long messages outside the exception class
(TRY003)
382-382: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
448-448: Consider moving this statement to an else block
(TRY300)
450-450: Redundant exception object included in logging.exception call
(TRY401)
451-451: Use raise without specifying exception name
Remove exception name
(TRY201)
458-458: Prefer TypeError exception for invalid type
(TRY004)
458-458: Abstract raise to an inner function
(TRY301)
458-458: Avoid specifying long messages outside the exception class
(TRY003)
464-464: Prefer TypeError exception for invalid type
(TRY004)
464-464: Abstract raise to an inner function
(TRY301)
464-464: Avoid specifying long messages outside the exception class
(TRY003)
465-465: Consider moving this statement to an else block
(TRY300)
467-467: Redundant exception object included in logging.exception call
(TRY401)
468-468: Use raise without specifying exception name
Remove exception name
(TRY201)
489-489: Consider moving this statement to an else block
(TRY300)
492-492: Do not catch blind exception: Exception
(BLE001)
493-493: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
494-494: Use explicit conversion flag
Replace with conversion flag
(RUF010)
514-514: Do not catch blind exception: Exception
(BLE001)
525-525: Do not catch blind exception: Exception
(BLE001)
526-526: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
527-527: Use explicit conversion flag
Replace with conversion flag
(RUF010)
536-536: Consider moving this statement to an else block
(TRY300)
537-537: Do not catch blind exception: Exception
(BLE001)
538-538: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
539-539: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
100-100: Do not catch blind exception: Exception
(BLE001)
102-102: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
102-102: Avoid specifying long messages outside the exception class
(TRY003)
246-246: Do not catch blind exception: Exception
(BLE001)
248-248: Use explicit conversion flag
Replace with conversion flag
(RUF010)
🔇 Additional comments (16)
extras/obsidian-injest/search_query.py (2)
1-22: LGTM! Clean module structure.The module docstring clearly explains the CLI testing purpose, and imports are properly organized following the coding guidelines (standard library, third-party, then local imports).
55-104: LGTM! Neo4j driver resource leak fixed.The function now correctly uses context managers for both the Neo4j driver (line 83) and session (line 84), ensuring they are always properly closed even when exceptions occur. This addresses the resource leak concern from the previous review.
The query logic is well-structured: vector search, graph traversal for tags and links, context formatting, and fallback handling for empty results.
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
53-77: Config-driven provider selection looks good.The logic correctly handles:
- Loading configuration with fallback to empty dict
- Resolving provider-specific values via
_resolve_value- Returning appropriate provider instance (streaming vs batch)
- Warning and returning
Nonewhen required values are missingbackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (2)
128-130: LGTM - Config-driven display fields properly wired.The LLM configuration values are now correctly sourced from
config.ymlrather than environment variables, aligning with the broader migration to config-driven settings across the backend.
235-250: Provider field usage is consistent.The
_llm_provideris correctly used in the audioai service status for all code paths (success, timeout, and error). This provides consistent display of the configured LLM provider in health responses.backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (11)
1-26: LGTM! Well-structured module with clear documentation.The module docstring is comprehensive, and imports are properly organized following the coding guidelines (grouped by standard library, third-party, and local imports).
28-39: LGTM! Clear type definitions.The TypedDict definitions provide good type safety for note data and chunk payloads.
86-155: LGTM! Complex initialization appropriately handles multiple deployment scenarios.The initialization logic properly handles both containerized (Docker) and local development environments with appropriate fallbacks for configuration loading. The inline
get_model_confighelper is reasonable given its single use case.
171-179: LGTM! Clean resource management.The close and reset_driver methods properly manage the Neo4j driver lifecycle.
181-197: LGTM! Proper database schema initialization.The method correctly creates constraints and vector indexes with appropriate idempotent
IF NOT EXISTSclauses.
218-238: LGTM! Straightforward chunking implementation.The character-based chunking with overlap is implemented correctly and handles edge cases appropriately.
240-285: LGTM! Robust file parsing with encoding fallbacks.The method handles various encoding issues gracefully and correctly extracts Obsidian metadata including frontmatter, links, and tags.
287-309: LGTM! Resilient chunk processing.The method appropriately handles embedding failures for individual chunks while continuing to process the rest, with clear logging.
369-384: LGTM! Past review issues have been addressed.The method now correctly uses
logger.exception()to capture full stack traces (line 381), and the unusednoqadirective has been removed (line 370). The resilient error handling ensures the ingestion process continues even when individual files fail.
386-407: LGTM! Excellent lazy initialization pattern addressing past concerns.The lazy initialization with proxy pattern elegantly solves the module-level instantiation issue flagged in previous reviews. This prevents configuration errors from crashing the application at import time while maintaining backward compatibility for callers expecting a module-level
obsidian_serviceinstance.
341-344: This code correctly creates placeholder Note nodes for unresolved links.The MERGE clause with ON CREATE SET is the standard Neo4j pattern for idempotent upserts. When a link references a note that doesn't exist in the vault, this creates a placeholder Note node with a synthesized path, allowing the graph to represent all vault connections. This is intentional behavior, not a problem requiring correction.
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
30-47:provider_nameparameter is declared but never used.The function signature accepts
provider_nameand the docstring documents it as supporting explicit provider selection ('deepgram','parakeet'), but the implementation completely ignores this parameter and always relies onconfig.yml. This could break existing callers.Either remove the parameter and update the docstring, or implement the explicit provider selection logic.
🔎 Option 1: Remove unused parameter
def get_transcription_provider( - provider_name: Optional[str] = None, mode: Optional[str] = None, ) -> Optional[BaseTranscriptionProvider]: """ Factory function to get the appropriate transcription provider. Args: - provider_name: Name of the provider ('deepgram', 'parakeet'). - If None, will auto-select based on available configuration. mode: Processing mode ('streaming', 'batch'). If None, defaults to 'batch'.
🧹 Nitpick comments (2)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (2)
10-10: Unusedosimport.The
osmodule is imported but no longer used since environment variable access is now delegated to_resolve_value.🔎 Proposed fix
import logging -import os from typing import Optional
77-79: Uselogger.exception()to log the full stack trace.Based on learnings, prefer
logging.exception()inside except blocks to automatically log the full stack trace. The bareraiseis correct, butlogger.warningloses traceback context.🔎 Proposed fix
except Exception as e: - logger.warning(f"Failed to load STT provider from config.yml: {e}") + logger.exception("Failed to load STT provider from config.yml") raise
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
🧬 Code graph analysis (1)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
load_config_yml(72-88)resolve_value(57-70)
🔇 Additional comments (1)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
86-93: LGTM!The
__all__exports are consistent and maintain the expected public API surface.
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
57-83: Misleading warning when no STT is configured or model not found.The warning on line 82 runs in two different scenarios with different meanings:
- When
stt_nameisNone(no STT default configured) — logs "STT default 'None' configured..."- When
stt_nameexists but no matching model is found inmodelslistThese are distinct cases requiring different messages. Also, when
stt_nameisNone, returningNonesilently (or with a debug log) may be more appropriate than a warning.🔎 Proposed fix
stt_name = defaults.get("stt") if stt_name: stt_def = next((m for m in models if m.get("name") == stt_name), None) + if stt_def is None: + logger.warning(f"STT default '{stt_name}' configured but no matching model definition found") + return None if stt_def: provider = (stt_def.get("model_provider") or "").lower() if provider == "deepgram": api_key = str(_resolve_value(stt_def.get("api_key", ""))) if not api_key: logger.warning("Deepgram selected in config.yml but api_key missing or empty") return None return DeepgramStreamingProvider(api_key) if mode == "streaming" else DeepgramProvider(api_key) elif provider == "parakeet": service_url = str(_resolve_value(stt_def.get("model_url", ""))) if not service_url: logger.warning("Parakeet selected in config.yml but model_url missing or empty") return None return ParakeetStreamingProvider(service_url) if mode == "streaming" else ParakeetProvider(service_url) else: logger.warning(f"Unsupported STT provider in config.yml: {provider}") return None except Exception as e: logger.warning(f"Failed to load STT provider from config.yml: {e}") raise - # If no config present, no provider configured - logger.warning(f"STT default '{stt_name}' configured but no matching model definition found") + # No STT default configured in config.yml + logger.debug("No STT default configured in config.yml") return Nonebackends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
207-215: Uselogging.exception()and fix multi-statement line.Line 209 has multiple statements on one line. Per project learnings, use
logging.exception()inside except blocks to automatically capture the full stack trace. Also, use%formatting with logger methods instead of f-strings.🔎 Proposed fix
def get_embedding(self, text: str) -> Optional[List[float]]: ... try: clean_text = re.sub(r'\s+', ' ', text).strip() - if not clean_text: return None + if not clean_text: + return None response = self.client.embeddings.create(input=[clean_text], model=self.embedding_model) return response.data[0].embedding except Exception as e: - logger.error(f"Embedding failed for text length {len(text)}: {e}") + logger.exception("Embedding failed for text length %d", len(text)) return NoneBased on learnings, use
logging.exception()to capture full stack traces in except blocks.
🧹 Nitpick comments (6)
extras/obsidian-injest/search_query.py (2)
36-38: Preferlogging.exceptionwith bareraisefor better exception context.The current pattern uses
logger.errorfollowed byraise e. Per Python best practices and learnings, uselogging.exception(which automatically logs the full stack trace) with bareraise(which preserves the exception context).🔎 Proposed refactor
except Exception as e: - logger.error(f"Failed to generate embedding: {e}") - raise e + logger.exception("Failed to generate embedding") + raiseBased on learnings, prefer
logging.exception()inside except blocks to automatically log the full stack trace, and use bareraiseto preserve the original exception context.
74-76: Preferlogging.exceptionwith bareraisefor better exception context.Same pattern as in
get_embedding: uselogging.exceptionwith bareraisefor better exception handling.🔎 Proposed refactor
except Exception as e: - logger.error(f"Failed to generate answer: {e}") - raise e + logger.exception("Failed to generate answer") + raiseBased on learnings, prefer
logging.exception()inside except blocks to automatically log the full stack trace, and use bareraiseto preserve the original exception context.backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (2)
10-10: Unusedosimport.The
osmodule is no longer used in this file after the migration to config-driven selection. Theresolve_valueutility handles environment variable resolution internally.🔎 Proposed fix
import logging -import os from typing import Optional
77-79: Uselogger.exception()to capture full stack trace.The current
logger.warning()only logs the exception message, losing the full traceback. Based on learnings, preferlogging.exception()inside except blocks to automatically capture the complete stack trace for debugging.🔎 Proposed fix
except Exception as e: - logger.warning(f"Failed to load STT provider from config.yml: {e}") + logger.exception("Failed to load STT provider from config.yml") raisebackends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
41-66: Consider using python-dotenv for .env parsing.The manual .env parser reimplements functionality available in the widely-used
python-dotenvlibrary, which handles edge cases like multi-line values, variable expansion, and various quote styles more robustly.🔎 Alternative using python-dotenv
+from dotenv import dotenv_values + -def load_env_file(filepath: Path) -> dict[str, str]: - """Load environment variables from a .env file. - - Args: - filepath: Path to the .env file to load. - - Returns: - Dictionary of key-value pairs from the .env file. - """ - env_vars = {} - if filepath.exists(): - with open(filepath, 'r') as f: - for line in f: - line = line.strip() - if not line or line.startswith('#'): - continue - if '=' in line: - parts = line.split('=', 1) - key = parts[0].strip() - value = parts[1].strip() if len(parts) > 1 else "" - # Handle quotes - if (value.startswith("'") and value.endswith("'")) or (value.startswith('"') and value.endswith('"')): - value = value[1:-1] - env_vars[key] = value - return env_vars +def load_env_file(filepath: Path) -> dict[str, str]: + """Load environment variables from a .env file.""" + return dict(dotenv_values(filepath)) if filepath.exists() else {}
380-381: Use%formatting with logger and remove redundantstr()conversion.Line 380 uses an f-string with
logger.exception, but Python logging best practices recommend%formatting. Line 381 has a redundantstr(e)conversion since exceptions are already stringified when interpolated.🔎 Proposed fix
except Exception as e: - logger.exception(f"Processing {file} failed") - errors.append(f"{file}: {str(e)}") + logger.exception("Processing %s failed", file) + errors.append(f"{file}: {e}")
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py(5 hunks)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py(1 hunks)backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py(2 hunks)extras/obsidian-injest/search_query.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pyextras/obsidian-injest/search_query.pybackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pyextras/obsidian-injest/search_query.pybackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
🧬 Code graph analysis (3)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
load_config_yml(72-88)resolve_value(57-70)
extras/obsidian-injest/search_query.py (1)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
get_embedding(198-215)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (1)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
load_config_yml(72-88)resolve_value(57-70)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
101-101: Avoid specifying long messages outside the exception class
(TRY003)
105-105: Avoid specifying long messages outside the exception class
(TRY003)
109-109: Avoid specifying long messages outside the exception class
(TRY003)
113-113: Avoid specifying long messages outside the exception class
(TRY003)
133-133: Avoid specifying long messages outside the exception class
(TRY003)
209-209: Multiple statements on one line (colon)
(E701)
213-213: Do not catch blind exception: Exception
(BLE001)
214-214: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
328-328: Probable use of insecure hash functions in hashlib: md5
(S324)
361-361: Avoid specifying long messages outside the exception class
(TRY003)
381-381: Use explicit conversion flag
Replace with conversion flag
(RUF010)
extras/obsidian-injest/search_query.py
37-37: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
38-38: Use raise without specifying exception name
Remove exception name
(TRY201)
75-75: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
76-76: Use raise without specifying exception name
Remove exception name
(TRY201)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
101-101: Use raise without specifying exception name
Remove exception name
(TRY201)
245-245: Do not catch blind exception: Exception
(BLE001)
247-247: Use explicit conversion flag
Replace with conversion flag
(RUF010)
🔇 Additional comments (5)
extras/obsidian-injest/search_query.py (1)
105-117: Excellent use of nested context managers!The Neo4j driver and session are now properly managed using context managers, ensuring resources are always released even if exceptions occur. This addresses the resource leak concern from the previous review.
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (4)
23-24: LGTM!The imports correctly reference the config loading utilities from the memory.config module, following the underscore naming convention for internal usage.
88-98: LGTM!The config loading logic properly handles missing sections and uses defensive programming with
.get()and fallback defaults. The use ofresolve_valuecorrectly supports environment variable substitution in config values.
119-119: LGTM!The config-derived values are correctly used for health response display fields. The fallback to "not set" for transcription provider is appropriate when config doesn't specify it.
Also applies to: 127-129
234-234: LGTM!The
_llm_providerfield is consistently added to all audioai service health responses, providing useful configuration visibility for debugging.Also applies to: 241-241, 249-249
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backends/advanced/src/advanced_omi_backend/chat_service.py (1)
284-296: Replace silent exception handling with explicit error propagation.The return type
List[MemoryEntry]correctly matches thememory_service.search_memoriesreturn type. However, the exception handling violates coding guidelines by silently catching all exceptions and returning an empty list (lines 294-296). Instead of masking errors withreturn [], propagate the exception or raise an explicit error. This allows callers to handle failures properly rather than receiving empty data that could be mistaken for a successful but empty result.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backends/advanced/src/advanced_omi_backend/chat_service.py(4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/chat_service.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/chat_service.py
🧬 Code graph analysis (1)
backends/advanced/src/advanced_omi_backend/chat_service.py (2)
backends/advanced/src/advanced_omi_backend/services/memory/base.py (1)
MemoryEntry(26-63)backends/advanced/src/advanced_omi_backend/models/user.py (1)
user_id(71-73)
🔇 Additional comments (3)
backends/advanced/src/advanced_omi_backend/chat_service.py (3)
26-26: LGTM! Import properly placed.The
MemoryEntryimport is correctly positioned within the local imports section and is necessary for the type-safe memory handling refactoring.
307-307: LGTM! Correct attribute access for MemoryEntry.The change from dictionary access to attribute access is correct for the
MemoryEntrydataclass, and the defensive check formemory.idappropriately filters out any invalid IDs.
316-318: LGTM! Simplified content access with MemoryEntry.The migration from
memory.get("memory")tomemory.contentcorrectly uses theMemoryEntrydataclass attribute and simplifies the code by eliminating fallback dictionary key checks.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (2)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
198-215: Previous review comment still applies.This code segment was already reviewed with suggestions to split the multi-statement line (line 209) and use
logging.exception()instead oflogging.error()(line 214) for better stack trace logging.
385-397: Previous review comment about thread safety still applies.The singleton initialization lacks thread synchronization. The check-then-set pattern on lines 395-396 has a race condition where multiple threads could create separate service instances concurrently, leading to multiple Neo4j driver connections and inconsistent state.
Based on the previous review, consider adding a threading lock with double-check pattern to ensure thread-safe initialization.
🧹 Nitpick comments (4)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (4)
41-65: Consider importing from shared module to eliminate duplication.This
load_env_filefunction duplicates the implementation inextras/obsidian-injest/config.py(lines 15-30). Consider importing from the shared module instead to follow the DRY principle and reduce maintenance overhead.🔎 Proposed refactor
If
extras/obsidian-injest/config.pyis accessible, import the function:from neo4j import GraphDatabase, Driver from advanced_omi_backend.services.memory.providers.llm_providers import _get_openai_client from advanced_omi_backend.services.memory.config import load_config_yml as load_root_config +# Import shared utilities +import sys +from pathlib import Path +sys.path.insert(0, str(Path(__file__).parent.parent.parent.parent.parent / "extras" / "obsidian-injest")) +from config import load_env_file, resolve_value logger = logging.getLogger(__name__) -def load_env_file(filepath: Path) -> dict[str, str]: - """Load environment variables from a .env file. - - Args: - filepath: Path to the .env file to load. - - Returns: - Dictionary of key-value pairs from the .env file. - """ - env_vars = {} - if filepath.exists(): - with open(filepath, 'r') as f: - for line in f: - line = line.strip() - if not line or line.startswith('#'): - continue - if '=' in line: - parts = line.split('=', 1) - key = parts[0].strip() - value = parts[1].strip() if len(parts) > 1 else "" - # Handle quotes - if (value.startswith("'") and value.endswith("'")) or (value.startswith('"') and value.endswith('"')): - value = value[1:-1] - env_vars[key] = value - return env_vars - -def resolve_value(value: str | int | float) -> str | int | float: - """Resolve environment variable references in configuration values. - - Supports ${VAR} and ${VAR:-default} syntax. Returns the original value - if it's not a string or doesn't match the pattern. - """ - if isinstance(value, str) and value.startswith("${") and value.endswith("}"): - content = value[2:-1] - if ":-" in content: - var_name, default_val = content.split(":-", 1) - # Try to get from real environment first, then default - return os.getenv(var_name, default_val) - else: - return os.getenv(content, "") - return valueAlternatively, move these utilities to a shared package within
advanced_omi_backendthat both modules can import.
146-147: Consider making chunking parameters configurable.The
chunk_char_limitandchunk_overlapare hardcoded. Consider loading these fromconfig.ymlto allow tuning without code changes, similar to how embedding and model configurations are handled.🔎 Proposed enhancement
Add chunking configuration to config.yml and load it during initialization:
self.openai_api_key = str(resolve_value(llm_config['api_key'])) # Chunking - can have defaults - self.chunk_char_limit = 1000 - self.chunk_overlap = 200 + chunking_config = config_data.get('chunking', {}) + self.chunk_char_limit = int(chunking_config.get('char_limit', 1000)) + self.chunk_overlap = int(chunking_config.get('overlap', 200))
296-298: Consider lazy formatting for logging.Line 297 uses an f-string in the log statement. For consistency with Python logging best practices, consider using lazy formatting with
%style:- logger.info( - f"Processing: {note_data['path']} ({len(note_data['content'])} chars -> {len(text_chunks)} chunks)" - ) + logger.info( + "Processing: %s (%d chars -> %d chunks)", + note_data['path'], len(note_data['content']), len(text_chunks) + )
368-383: Main issue addressed; consider lazy formatting.The switch from
logger.errortologger.exceptionon line 380 properly captures stack traces (addressing the previous review). As a minor refinement, consider using lazy formatting instead of an f-string:except Exception as e: - logger.exception(f"Processing {file} failed") + logger.exception("Processing %s failed", file) errors.append(f"{file}: {str(e)}")
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
🧬 Code graph analysis (1)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (4)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
_get_openai_client(45-69)backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
load_config_yml(72-88)resolve_value(57-70)extras/obsidian-injest/config.py (2)
load_env_file(16-31)resolve_value(34-42)extras/obsidian-injest/search_query.py (1)
get_embedding(24-38)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
101-101: Avoid specifying long messages outside the exception class
(TRY003)
105-105: Avoid specifying long messages outside the exception class
(TRY003)
109-109: Avoid specifying long messages outside the exception class
(TRY003)
113-113: Avoid specifying long messages outside the exception class
(TRY003)
133-133: Avoid specifying long messages outside the exception class
(TRY003)
209-209: Multiple statements on one line (colon)
(E701)
213-213: Do not catch blind exception: Exception
(BLE001)
214-214: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
328-328: Probable use of insecure hash functions in hashlib: md5
(S324)
361-361: Avoid specifying long messages outside the exception class
(TRY003)
381-381: Use explicit conversion flag
Replace with conversion flag
(RUF010)
🔇 Additional comments (10)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (10)
1-26: LGTM!The module docstring is comprehensive, and imports are properly organized following the coding guidelines (standard library, third-party, then local imports, all at the top of the file).
28-39: LGTM!The TypedDict definitions provide clear type hints for note data and chunk payloads.
118-122: Comment clarity improved with TODO note.The updated comment on line 120 now includes a TODO acknowledging the cleanup needed after Kubernetes migration, which addresses the previous concern about the misleading comment. The fallback path structure is now clearer.
157-168: LGTM!The driver initialization correctly passes authentication credentials. The previously flagged misleading comment has been removed.
170-178: LGTM!Resource cleanup is properly implemented with driver closure and reset functionality.
180-196: Verify the necessity of reset_driver() call.Line 183 calls
reset_driver()to "ensure we use current credentials," but credentials are set during__init__and don't appear to change. This closes and recreates the driver connection unnecessarily on everysetup_database()call. Consider removing this unless there's a specific use case where credentials change between initialization and setup.If credentials can indeed change, please clarify the scenario; otherwise, consider simplifying:
def setup_database(self) -> None: """Create database constraints and vector index for notes and chunks.""" - # Reset driver to ensure we use current credentials - self.reset_driver() driver = self._get_driver()
217-237: LGTM!The character-based chunking logic correctly implements overlapping windows and handles edge cases appropriately.
239-284: LGTM!The parsing logic is robust with proper encoding fallbacks, frontmatter extraction, and regex patterns for links and tags. The Windows path fallback is a thoughtful addition.
310-343: LGTM!The Neo4j ingestion logic properly handles notes, chunks, tags, and links with appropriate MERGE operations. The MD5 usage on line 328 is acceptable for generating chunk IDs (non-cryptographic purpose). The forward reference handling for links (lines 341-342) correctly creates placeholder nodes that will be populated when the target note is processed.
399-406: LGTM - clever lazy initialization pattern!The
_ObsidianServiceProxyprovides backward-compatible module-level access while enabling lazy initialization. This addresses the earlier concern about module-level instantiation causing startup failures.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
207-215: Uselogging.exception()and split multi-statement line.Per project learnings and static analysis, use
logging.exception()inside except blocks and avoid multiple statements on one line.🔎 Proposed fix
def get_embedding(self, text: str) -> Optional[List[float]]: ... try: clean_text = re.sub(r'\s+', ' ', text).strip() - if not clean_text: return None + if not clean_text: + return None response = self.client.embeddings.create(input=[clean_text], model=self.embedding_model) return response.data[0].embedding except Exception as e: - logger.error(f"Embedding failed for text length {len(text)}: {e}") + logger.exception("Embedding failed for text length %d", len(text)) return None
444-453: Singleton initialization is not thread-safe.The check-then-set pattern has a race condition: multiple threads could pass the
if _obsidian_service is Nonecheck simultaneously and create multiple instances, leading to resource leaks with multiple Neo4j driver connections.🔎 Proposed fix using threading.Lock
+import threading + # Lazy initialization to avoid startup failures _obsidian_service: Optional[ObsidianService] = None +_service_lock = threading.Lock() def get_obsidian_service() -> ObsidianService: """Get or create the Obsidian service singleton. Returns: ObsidianService instance, creating it on first access. """ global _obsidian_service if _obsidian_service is None: - _obsidian_service = ObsidianService() + with _service_lock: + # Double-check pattern after acquiring lock + if _obsidian_service is None: + _obsidian_service = ObsidianService() return _obsidian_service
🧹 Nitpick comments (3)
backends/advanced/webui/src/pages/Chat.tsx (1)
519-531: Consider disabling the checkbox while sending.The checkbox remains interactive during message sending. While functionally safe (the state is captured at send time), it could confuse users if they toggle it mid-stream and expect the change to affect the current response.
🔎 Proposed fix
<input type="checkbox" id="includeObsidian" checked={includeObsidian} onChange={(e) => setIncludeObsidian(e.target.checked)} - className="h-4 w-4 text-blue-600 rounded border-gray-300 focus:ring-blue-500" + className="h-4 w-4 text-blue-600 rounded border-gray-300 focus:ring-blue-500 disabled:opacity-50" + disabled={isSending} />backends/advanced/tests/test_obsidian_service.py (1)
3-6: Consider using pytest with proper test discovery instead of sys.path manipulation.The
sys.path.appendpattern is a common workaround but can lead to import issues and makes the test less portable. Consider configuring pytest with proper package discovery or usingpip install -e .for development installs.backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
436-438: Uselogging.exception()for better stack traces.Per project learnings, use
logging.exception()inside except blocks to automatically capture the full stack trace.🔎 Proposed fix
except Exception as e: - logger.error(f"Obsidian search failed: {e}") + logger.exception("Obsidian search failed")Based on learnings from previous reviews.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
backends/advanced/src/advanced_omi_backend/chat_service.py(6 hunks)backends/advanced/src/advanced_omi_backend/routers/modules/chat_routes.py(2 hunks)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py(1 hunks)backends/advanced/tests/test_obsidian_service.py(1 hunks)backends/advanced/webui/src/pages/Chat.tsx(3 hunks)backends/advanced/webui/src/services/api.ts(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/chat_routes.pybackends/advanced/src/advanced_omi_backend/chat_service.pybackends/advanced/tests/test_obsidian_service.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/chat_routes.pybackends/advanced/src/advanced_omi_backend/chat_service.pybackends/advanced/tests/test_obsidian_service.py
🧬 Code graph analysis (2)
backends/advanced/src/advanced_omi_backend/chat_service.py (2)
backends/advanced/src/advanced_omi_backend/services/memory/base.py (1)
MemoryEntry(26-63)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
get_obsidian_service(444-453)search_obsidian(385-439)
backends/advanced/tests/test_obsidian_service.py (1)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (4)
search_obsidian(385-439)setup_database(180-196)character_chunker(217-237)ingest_note_and_chunks(310-343)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
101-101: Avoid specifying long messages outside the exception class
(TRY003)
105-105: Avoid specifying long messages outside the exception class
(TRY003)
109-109: Avoid specifying long messages outside the exception class
(TRY003)
113-113: Avoid specifying long messages outside the exception class
(TRY003)
133-133: Avoid specifying long messages outside the exception class
(TRY003)
209-209: Multiple statements on one line (colon)
(E701)
213-213: Do not catch blind exception: Exception
(BLE001)
214-214: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
328-328: Probable use of insecure hash functions in hashlib: md5
(S324)
361-361: Avoid specifying long messages outside the exception class
(TRY003)
381-381: Use explicit conversion flag
Replace with conversion flag
(RUF010)
436-436: Do not catch blind exception: Exception
(BLE001)
437-437: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/chat_service.py
334-334: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
335-335: Use raise without specifying exception name
Remove exception name
(TRY201)
🔇 Additional comments (13)
backends/advanced/src/advanced_omi_backend/routers/modules/chat_routes.py (2)
30-34: Clean addition of the Obsidian memory flag.The
include_obsidian_memoryfield is well-designed with a sensibleFalsedefault, ensuring backward compatibility. The field description clearly communicates its purpose.
310-315: LGTM - flag properly propagated to service layer.The
include_obsidian_memoryflag is correctly passed togenerate_response_stream, maintaining the request's intent through the service call.backends/advanced/webui/src/pages/Chat.tsx (1)
47-47: LGTM - state initialization is clean.The
includeObsidianstate withfalsedefault aligns with the backend's default behavior.backends/advanced/tests/test_obsidian_service.py (3)
12-51: Well-structured test setup with comprehensive mocking.The setUp method properly mocks all external dependencies (config loading, OpenAI client, Neo4j driver, environment variables) and uses
addCleanupfor reliable teardown. The mock session context manager is correctly configured.
53-99: Comprehensive test for the search flow.The test properly validates the embedding call, Neo4j query execution, and result formatting. The assertions check for proper parameter passing and output structure.
155-188: Good failure and edge case coverage.The tests for embedding failure, database failure, and empty results ensure the service degrades gracefully and returns empty lists instead of raising exceptions to callers.
backends/advanced/webui/src/services/api.ts (2)
230-247: Clean API namespace for Obsidian operations.The
obsidianApinamespace is well-structured with appropriate methods for the full ingestion lifecycle. The 5-minute timeout for zip uploads is reasonable for large vaults.
271-278: LGTM - conditional inclusion handles the flag correctly.The flag is only added when truthy, which aligns with the backend's default of
False. The snake_case key matches the Pydantic model.backends/advanced/src/advanced_omi_backend/chat_service.py (2)
26-27: Good type imports for better type safety.Importing
MemoryEntryandget_obsidian_serviceat the top of the file follows the coding guidelines and improves type annotations throughout the service.
285-320: Clean migration to typed MemoryEntry.The refactoring from dictionary access (
memory.get("id"),memory.get("memory")) to typed attribute access (memory.id,memory.content) improves type safety and aligns with theMemoryEntrydataclass.backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (3)
28-81: Clean utility functions and type definitions.The
NoteDataandChunkPayloadTypedDicts provide clear contracts for data structures. Theload_env_fileandresolve_valueutilities handle edge cases well (quotes, comments, default values).
455-462: Proxy pattern enables lazy module-level access.The
_ObsidianServiceProxyclass provides backward-compatible module-level access (obsidian_service.method()) while maintaining lazy initialization. This is a clean solution to avoid eager instantiation at import time.
345-383: Vault ingestion with proper error handling.The
ingest_vaultmethod correctly useslogger.exceptionfor failed note processing and continues processing other notes, collecting errors for the final report. This provides good resilience for large vault ingestion.
| # Add Obsidian context if requested | ||
| if include_obsidian_memory: | ||
| try: | ||
| obsidian_service = get_obsidian_service() | ||
| obsidian_context = obsidian_service.search_obsidian(current_message) | ||
| if obsidian_context: | ||
| context_parts.append("# Relevant Obsidian Notes:") | ||
| for entry in obsidian_context: | ||
| context_parts.append(entry) | ||
| context_parts.append("") | ||
| logger.info(f"Added {len(obsidian_context)} Obsidian notes to context") | ||
| except Exception as e: | ||
| logger.error(f"Failed to get Obsidian context: {e}") | ||
| raise e |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Use logging.exception() and bare raise for better stack traces.
Per project learnings and static analysis hints, use logging.exception() inside except blocks to automatically capture the full stack trace, and use bare raise to preserve the original exception context.
🔎 Proposed fix
# Add Obsidian context if requested
if include_obsidian_memory:
try:
obsidian_service = get_obsidian_service()
obsidian_context = obsidian_service.search_obsidian(current_message)
if obsidian_context:
context_parts.append("# Relevant Obsidian Notes:")
for entry in obsidian_context:
context_parts.append(entry)
context_parts.append("")
logger.info(f"Added {len(obsidian_context)} Obsidian notes to context")
except Exception as e:
- logger.error(f"Failed to get Obsidian context: {e}")
- raise e
+ logger.exception("Failed to get Obsidian context")
+ raiseBased on learnings from previous reviews in this project.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Add Obsidian context if requested | |
| if include_obsidian_memory: | |
| try: | |
| obsidian_service = get_obsidian_service() | |
| obsidian_context = obsidian_service.search_obsidian(current_message) | |
| if obsidian_context: | |
| context_parts.append("# Relevant Obsidian Notes:") | |
| for entry in obsidian_context: | |
| context_parts.append(entry) | |
| context_parts.append("") | |
| logger.info(f"Added {len(obsidian_context)} Obsidian notes to context") | |
| except Exception as e: | |
| logger.error(f"Failed to get Obsidian context: {e}") | |
| raise e | |
| # Add Obsidian context if requested | |
| if include_obsidian_memory: | |
| try: | |
| obsidian_service = get_obsidian_service() | |
| obsidian_context = obsidian_service.search_obsidian(current_message) | |
| if obsidian_context: | |
| context_parts.append("# Relevant Obsidian Notes:") | |
| for entry in obsidian_context: | |
| context_parts.append(entry) | |
| context_parts.append("") | |
| logger.info(f"Added {len(obsidian_context)} Obsidian notes to context") | |
| except Exception as e: | |
| logger.exception("Failed to get Obsidian context") | |
| raise |
🧰 Tools
🪛 Ruff (0.14.8)
334-334: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
335-335: Use raise without specifying exception name
Remove exception name
(TRY201)
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/chat_service.py around lines 322
to 335, the except block logs the error with logger.error and re-raises using
raise e which loses the original traceback; replace logger.error(f"Failed to get
Obsidian context: {e}") with logger.exception("Failed to get Obsidian context")
to capture the full stack trace, and change raise e to a bare raise to preserve
the original exception context.
There was a problem hiding this comment.
Single config has memory and llm models
…to feat/add-obsidian
thestumonkey
left a comment
There was a problem hiding this comment.
This PR has a lot of duplication instead of using standard functions we have already.
1: jobs. all jobs should be processed by queues and status shown on the queue management page using queue controller
2: upload (small issue): we should leverage existing upload functions we have to upload gdrive or audio to extend to upload zip.
3: service and processing. We currently extract memories from conversations, pass this to openmemory which uses neo4j as graph memory for storing inference and memories. The obsidian service is writing direct to neo4j, which creates multiple memory banks for chat to leverage instead of a single one.
4: there are some config changes here and use of llm models. this should be linked into the existing llm provider class
We should use openmemory to process the obsidian files directly, and use a type to send to it. (conversation or note).
Otherwise, if we don't want to do this, we should connect to neo4j via a proper client which isn't part of obsidian service.
There was a problem hiding this comment.
This looks like it should live in utils/job_utils or queue_controller (which should maybe be job controller). this seems like an interface to redis and only the job prefix has anything to do with obsidian service
There was a problem hiding this comment.
we also have a job base class which is used for processing jobs on redis using RQ. Please use this for processing jobs, as this is handled natively with RQ.
There was a problem hiding this comment.
also functions in queue_routes that should provide what you need. This will then hook it into the queue management page so we can see progress.
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
As far as I can see, this service goes through the markdown files creates an LLM and writes directly to NEO4J.
This can be achieved using the current infra by just passing the text to the memory router add_memory
(which writes to neo4J using openmemory)
If there are specific obsidian characteristics, then I would make a model for obsidianmemory/memory and then use a custom prompt to process it using memory controller.
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
Outdated
Show resolved
Hide resolved
extras/obsidian-injest/config.py
Outdated
There was a problem hiding this comment.
why is this in extras? duplicates existing fucntions
There was a problem hiding this comment.
why is this in exteras
|
Can you also provide some obsidian test data and put into tests/test_assets ? |
There was a problem hiding this comment.
Actionable comments posted: 9
♻️ Duplicate comments (4)
backends/advanced/src/advanced_omi_backend/chat_service.py (1)
322-335: Uselogging.exception()and bareraisefor better stack traces.This is the same issue flagged in a previous review. Use
logging.exception()to capture the full stack trace and bareraiseto preserve the original exception context.🔎 Proposed fix
# Add Obsidian context if requested if include_obsidian_memory: try: obsidian_service = get_obsidian_service() obsidian_context = await obsidian_service.search_obsidian(current_message) if obsidian_context: context_parts.append("# Relevant Obsidian Notes:") for entry in obsidian_context: context_parts.append(entry) context_parts.append("") logger.info(f"Added {len(obsidian_context)} Obsidian notes to context") except Exception as e: - logger.error(f"Failed to get Obsidian context: {e}") - raise e + logger.exception("Failed to get Obsidian context") + raisebackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (2)
46-48: Improve exception handling with stack trace and exception chaining.Same issue flagged in previous review. Use
logging.exception()and chain withfrom e.🔎 Proposed fix
except Exception as e: - logger.error(f"Ingestion failed: {e}") - raise HTTPException(status_code=500, detail=str(e)) + logger.exception("Ingestion failed") + raise HTTPException(status_code=500, detail=str(e)) from e
111-113: Improve exception handling with stack trace and exception chaining.Same issue flagged in previous review.
🔎 Proposed fix
except Exception as e: - logger.error(f"Failed to process uploaded zip: {e}") - raise HTTPException(status_code=500, detail=f"Failed to process uploaded zip: {e}") + logger.exception("Failed to process uploaded zip") + raise HTTPException(status_code=500, detail=f"Failed to process uploaded zip: {e}") from ebackends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
265-274: Uselogging.exception()for automatic stack trace capture.Per project learnings,
logging.exception()should be used inside except blocks to automatically capture the full stack trace, improving debuggability.Based on learnings, when re-raising exceptions, chain with
raise ... from e.🔎 Proposed fix
try: vectors = await generate_openai_embeddings( cleaned_chunks, api_key=self.openai_api_key, base_url=self.openai_base_url, model=self.embedding_model, ) except Exception as e: - logger.error(f"Embedding generation failed for {note_data['path']}: {e}") + logger.exception("Embedding generation failed for %s", note_data['path']) return []
🧹 Nitpick comments (12)
backends/advanced/src/advanced_omi_backend/utils/model_utils.py (1)
39-39: Optional: Consider extracting error messages.Ruff's TRY003 rule suggests avoiding long messages directly in exception constructors. You could optionally extract these to constants or a separate error message module for better maintainability, though the current inline messages are clear and contextual.
Also applies to: 45-45
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
277-279: Uselogging.exceptionfor better error diagnostics.Replace
memory_logger.errorwithmemory_logger.exceptionto automatically capture the full stack trace.Based on learnings, using
logging.exception()inside except blocks improves debuggability by logging the complete traceback.🔎 Proposed fix
except Exception as e: - memory_logger.error(f"OpenAI embedding generation failed: {e}") + memory_logger.exception("OpenAI embedding generation failed") raisebackends/advanced/src/advanced_omi_backend/utils/file_utils.py (1)
66-81: Uselogging.exception()in except blocks for full stack traces.Per repository conventions, use
logging.exception()instead oflogging.error()inside except blocks to automatically capture the full stack trace for better debuggability. The exception chaining withfrom eis correctly implemented.🔎 Proposed fix
except zipfile.BadZipFile as e: error_msg = f"Invalid zip file: {zip_path} - {e}" - logger.error(error_msg) + logger.exception(error_msg) raise zipfile.BadZipFile(error_msg) from e except zipfile.LargeZipFile as e: error_msg = f"Zip file too large: {zip_path} - {e}" - logger.error(error_msg) + logger.exception(error_msg) raise ZipExtractionError(error_msg) from e except PermissionError as e: error_msg = f"Permission denied extracting zip file: {zip_path} - {e}" - logger.error(error_msg) + logger.exception(error_msg) raise ZipExtractionError(error_msg) from e except Exception as e: error_msg = f"Error extracting zip file {zip_path}: {e}" - logger.error(error_msg) + logger.exception(error_msg) raise ZipExtractionError(error_msg) from eBased on learnings from previous reviews in this project.
backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py (3)
17-25: Rename unused loop variablerootto_root.The
rootvariable in the loop is not used directly in the loop body (onlydirsandfilesare used). While the in-place modification ofdirsaffects the walk, therootvariable itself is unused.🔎 Proposed fix
def count_markdown_files(vault_path: str) -> int: """Recursively count markdown files in a vault.""" count = 0 - for root, dirs, files in os.walk(vault_path): + for _root, dirs, files in os.walk(vault_path): dirs[:] = [d for d in dirs if not d.startswith(".")] for filename in files: if filename.endswith(".md"): count += 1 return count
93-97: Uselogging.exception()for per-file processing errors.Per repository conventions, use
logging.exception()instead oflogger.error()to capture the full stack trace for debugging.🔎 Proposed fix
except Exception as exc: - logger.error("Processing %s failed: %s", filename, exc) + logger.exception("Processing %s failed", filename) errors.append(f"{filename}: {exc}") job.meta["errors"] = errors job.save_meta()Based on learnings from previous reviews in this project.
46-51: Remove redundant exception object fromlogging.exceptioncall.When using
logging.exception(), the exception info is automatically included, making theexcparameter in the format string redundant.🔎 Proposed fix
except Exception as exc: - logger.exception("Database setup failed for job %s: %s", job.id, exc) + logger.exception("Database setup failed for job %s", job.id) job.meta["status"] = "failed" job.meta["error"] = f"Database setup failed: {exc}" job.save_meta() raisebackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (1)
152-154: Uselogging.exception()and chain the exception.🔎 Proposed fix
except Exception as e: - logger.error(f"Failed to start job {job_id}: {e}") - raise HTTPException(status_code=500, detail=f"Failed to start job: {e}") + logger.exception(f"Failed to start job {job_id}") + raise HTTPException(status_code=500, detail=f"Failed to start job: {e}") from eBased on learnings from previous reviews in this project.
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (5)
48-88: Consider consolidating with existing utility functions.Per previous feedback, environment loading utilities already exist at
upload_audio/load_env_file,scripts/lib/env_utils, andwizard.py. Theresolve_valuefunction is also generic enough to be shared across services. Consider moving these to a shared utilities module to reduce duplication and improve maintainability.Based on learnings, this consolidation would align with the DRY principle and make these utilities available for other services.
118-118: Address technical debt: cleanup after k8s migration.The inline comment indicates this path handling needs cleanup after k8s migration is complete and the root
.envfile is removed. Consider creating a tracking issue for this technical debt to ensure it's addressed.
277-277: Add explicitstrict=Trueparameter tozip().Without
strict=True, mismatched lengths betweenoriginal_chunksandvectorswould silently truncate to the shorter sequence. If embedding generation partially fails, this could mask data loss.🔎 Proposed fix
- for orig_text, vector in zip(original_chunks, vectors): + for orig_text, vector in zip(original_chunks, vectors, strict=True): if vector: chunk_payloads.append({"text": orig_text, "embedding": vector}) else: logger.warning(f"Embedding missing for chunk in {note_data['path']}")
302-302: Consider UUID-based chunk IDs for consistency.While MD5 is acceptable here for generating deterministic chunk IDs (not cryptographic use), consider using
hashlib.sha256()oruuid.uuid5()for consistency with security best practices and to avoid static analysis flags.
426-438: Consider adding thread safety to singleton initialization.Per previous review, the check-then-set pattern has a theoretical race condition where multiple threads could simultaneously create service instances. While Python's GIL makes this less likely in practice, using a
threading.Lockwith double-check pattern would follow best practices.🔎 Proposed fix with threading.Lock
+import threading + # Lazy initialization to avoid startup failures _obsidian_service: Optional[ObsidianService] = None +_service_lock = threading.Lock() def get_obsidian_service() -> ObsidianService: """Get or create the Obsidian service singleton. Returns: ObsidianService instance, creating it on first access. """ global _obsidian_service if _obsidian_service is None: - _obsidian_service = ObsidianService() + with _service_lock: + # Double-check: another thread may have initialized it + if _obsidian_service is None: + _obsidian_service = ObsidianService() return _obsidian_service
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
backends/advanced/src/advanced_omi_backend/chat_service.py(7 hunks)backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py(1 hunks)backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py(2 hunks)backends/advanced/src/advanced_omi_backend/services/neo4j_client.py(1 hunks)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py(1 hunks)backends/advanced/src/advanced_omi_backend/utils/file_utils.py(1 hunks)backends/advanced/src/advanced_omi_backend/utils/model_utils.py(1 hunks)backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py(1 hunks)backends/advanced/tests/test_obsidian_service.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backends/advanced/tests/test_obsidian_service.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.pybackends/advanced/src/advanced_omi_backend/utils/file_utils.pybackends/advanced/src/advanced_omi_backend/chat_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.pybackends/advanced/src/advanced_omi_backend/utils/model_utils.pybackends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.pybackends/advanced/src/advanced_omi_backend/services/neo4j_client.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.py
🧠 Learnings (2)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.pybackends/advanced/src/advanced_omi_backend/utils/file_utils.pybackends/advanced/src/advanced_omi_backend/chat_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.pybackends/advanced/src/advanced_omi_backend/utils/model_utils.pybackends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.pybackends/advanced/src/advanced_omi_backend/services/neo4j_client.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.py
📚 Learning: 2025-12-20T01:11:51.486Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-20T01:11:51.486Z
Learning: Applies to **/*.py : Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Applied to files:
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
🧬 Code graph analysis (4)
backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py (3)
backends/advanced/src/advanced_omi_backend/models/job.py (1)
async_job(203-290)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (4)
setup_database(158-173)parse_obsidian_note(180-233)chunking_and_embedding(235-283)ingest_note_and_chunks(285-317)backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (1)
get_status(171-225)
backends/advanced/src/advanced_omi_backend/chat_service.py (2)
backends/advanced/src/advanced_omi_backend/services/memory/base.py (1)
MemoryEntry(26-63)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
get_obsidian_service(429-438)search_obsidian(359-424)
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (2)
backends/advanced/src/advanced_omi_backend/utils/file_utils.py (2)
extract_zip(21-81)ZipExtractionError(16-18)backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py (2)
count_markdown_files(17-25)ingest_obsidian_vault_job(29-107)
backends/advanced/src/advanced_omi_backend/services/neo4j_client.py (1)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
close(150-152)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py
20-20: Loop control variable root not used within loop body
Rename unused root to _root
(B007)
29-29: Unused function argument: job_id
(ARG001)
29-29: Unused function argument: redis_client
(ARG001)
47-47: Redundant exception object included in logging.exception call
(TRY401)
93-93: Do not catch blind exception: Exception
(BLE001)
94-94: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/utils/file_utils.py
65-65: Consider moving this statement to an else block
(TRY300)
68-68: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
72-72: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
76-76: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
80-80: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/chat_service.py
334-334: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
335-335: Use raise without specifying exception name
Remove exception name
(TRY201)
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
34-34: Unused function argument: current_user
(ARG001)
34-34: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
45-45: Consider moving this statement to an else block
(TRY300)
46-46: Do not catch blind exception: Exception
(BLE001)
47-47: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
48-48: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
53-53: Do not perform function call File in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
54-54: Unused function argument: current_user
(ARG001)
54-54: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
82-82: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
83-83: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
89-89: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
90-90: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
92-92: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
93-93: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
111-111: Do not catch blind exception: Exception
(BLE001)
112-112: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
113-113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
121-121: Use raise without specifying exception name
Remove exception name
(TRY201)
127-127: Unused function argument: current_user
(ARG001)
127-127: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
151-151: Consider moving this statement to an else block
(TRY300)
152-152: Do not catch blind exception: Exception
(BLE001)
153-153: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
154-154: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
167-167: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
171-171: Unused function argument: current_user
(ARG001)
171-171: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
222-222: Do not use bare except
(E722)
222-223: try-except-pass detected, consider logging the exception
(S110)
225-225: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
backends/advanced/src/advanced_omi_backend/utils/model_utils.py
39-39: Avoid specifying long messages outside the exception class
(TRY003)
45-45: Avoid specifying long messages outside the exception class
(TRY003)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
278-278: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
107-107: Avoid specifying long messages outside the exception class
(TRY003)
111-111: Avoid specifying long messages outside the exception class
(TRY003)
131-131: Avoid specifying long messages outside the exception class
(TRY003)
272-272: Do not catch blind exception: Exception
(BLE001)
273-273: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
277-277: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
302-302: Probable use of insecure hash functions in hashlib: md5
(S324)
335-335: Avoid specifying long messages outside the exception class
(TRY003)
355-355: Use explicit conversion flag
Replace with conversion flag
(RUF010)
379-379: Redundant exception object included in logging.exception call
(TRY401)
380-380: Use raise without specifying exception name
Remove exception name
(TRY201)
421-421: Do not catch blind exception: Exception
(BLE001)
422-422: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🔇 Additional comments (9)
backends/advanced/src/advanced_omi_backend/utils/model_utils.py (1)
37-45: LGTM!The implementation logic is clear and correct. It properly validates the configuration structure, safely navigates nested dictionaries, and provides explicit error handling with informative messages when defaults or model definitions are missing.
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
271-279: AI summary inconsistency with exception handling.The AI summary states that "exceptions are logged and swallowed (no explicit return value provided on error)," but the code at line 279 contains a
raisestatement that re-raises the exception. The exception is not swallowed—it propagates to the caller.backends/advanced/src/advanced_omi_backend/utils/file_utils.py (1)
1-18: Clean module structure with proper imports.Good implementation with clear docstring, standard library imports at the top, and a well-defined custom exception class. The function signature with type hints and comprehensive docstring is helpful.
backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py (1)
74-80: Good implementation of cancellation support.The cancellation check pattern with
job.refresh()and status check allows users to cancel long-running jobs gracefully. The metadata is properly updated before returning.backends/advanced/src/advanced_omi_backend/services/neo4j_client.py (1)
7-33: Clean implementation of lazy driver initialization.The
Neo4jClientclass properly implements lazy driver initialization withget_driver(), and theclose()andreset()methods handle cleanup appropriately. The session factory method with optional access mode is well-designed.backends/advanced/src/advanced_omi_backend/chat_service.py (2)
285-297: Correct migration toMemoryEntrytype.The return type change from
List[Dict]toList[MemoryEntry]and the corresponding attribute access pattern (.id,.content) properly align with theMemoryEntrydataclass definition.
299-320: Clean integration of optional Obsidian memory parameter.The
include_obsidian_memoryparameter is properly threaded through the call chain and the context formatting logic handles both memory types appropriately.backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (2)
124-151: Good integration with RQ queue system.The
/startendpoint properly enqueues jobs to the RQ queue with appropriate parameters (job_id,job_timeout,description). The pending state is correctly managed by removing the Redis key after successful enqueueing.
170-203: Well-structured status reporting with fallback logic.The status endpoint correctly checks RQ first and falls back to pending state. The status mapping from RQ terminology (
started→running,canceled→cancelled) and progress calculation are properly implemented.
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/services/neo4j_client.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/services/neo4j_client.py
Outdated
Show resolved
Hide resolved
| def __init__(self): | ||
| """Initialize the Obsidian service with configuration from config.yml and environment.""" | ||
| # Resolve paths relative to this file | ||
| # backends/advanced/src/advanced_omi_backend/services/obsidian_service.py | ||
| self.CURRENT_DIR = Path(__file__).parent.resolve() | ||
|
|
||
| # Load configuration strictly from standard locations | ||
| # Prefer /app/config.yml inside containers (mounted by docker-compose) | ||
| # Fallbacks handled by shared utility | ||
| config_data = load_root_config() | ||
|
|
||
| # Get model configurations using shared utility | ||
| llm_config = get_model_config(config_data, 'llm') | ||
| if not llm_config: | ||
| raise ValueError("Configuration for 'defaults.llm' not found in config.yml") | ||
|
|
||
| embed_config = get_model_config(config_data, 'embedding') | ||
| if not embed_config: | ||
| raise ValueError("Configuration for 'defaults.embedding' not found in config.yml") | ||
|
|
||
| # Neo4j Connection - Prefer environment variables passed by Docker Compose | ||
| neo4j_host = os.getenv("NEO4J_HOST") | ||
| # Load .env file as fallback (for local dev or if env vars not set) | ||
| candidate_env_files = [ | ||
| Path("/app/.env"), | ||
| self.CURRENT_DIR.parent.parent.parent.parent / ".env", # Project root .env file ToDo cleanup needed after k8s is migrated and there is no .env file in the project root. | ||
| self.CURRENT_DIR.parent.parent.parent.parent / "backends" / "advanced" / ".env", # repo path | ||
| ] | ||
| env_data = {} | ||
| for p in candidate_env_files: | ||
| if p.exists(): | ||
| env_data.update(load_env_file(p)) | ||
|
|
||
| # Use env var first, then fallback to .env file | ||
| if not neo4j_host: | ||
| neo4j_host = env_data.get("NEO4J_HOST") | ||
|
|
||
| if not neo4j_host: | ||
| raise KeyError("NEO4J_HOST not found in environment or .env") | ||
|
|
||
| self.neo4j_uri = f"bolt://{neo4j_host}:7687" | ||
| self.neo4j_user = os.getenv("NEO4J_USER") or env_data.get("NEO4J_USER", "neo4j") | ||
| self.neo4j_password = os.getenv("NEO4J_PASSWORD") or env_data.get("NEO4J_PASSWORD", "") | ||
|
|
||
| # Models / API - Loaded strictly from config.yml | ||
| self.embedding_model = str(resolve_value(embed_config['model_name'])) | ||
| self.embedding_dimensions = int(resolve_value(embed_config['embedding_dimensions'])) | ||
| self.openai_base_url = str(resolve_value(llm_config['model_url'])) | ||
| self.openai_api_key = str(resolve_value(llm_config['api_key'])) | ||
|
|
||
| # Chunking - uses shared spaCy/text fallback utility | ||
| self.chunk_word_limit = 120 | ||
|
|
||
| self.neo4j_client = Neo4jClient(self.neo4j_uri, self.neo4j_user, self.neo4j_password) | ||
| self.read_interface = Neo4jReadInterface(self.neo4j_client) | ||
| self.write_interface = Neo4jWriteInterface(self.neo4j_client) |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Re-evaluate architectural approach to leverage existing infrastructure.
Per maintainer feedback, this service appears to duplicate existing functionality. The current memory router's add_memory already writes to Neo4j via OpenMemory. Consider refactoring to use the existing memory infrastructure (services/memory/providers/llm_provider and memory router) rather than implementing a parallel ingestion system. If Obsidian-specific characteristics require custom handling, create an ObsidianMemory model and use a custom prompt through the memory controller.
This approach would reduce code duplication, maintain consistency with the existing memory system, and leverage proven infrastructure.
🧰 Tools
🪛 Ruff (0.14.8)
107-107: Avoid specifying long messages outside the exception class
(TRY003)
111-111: Avoid specifying long messages outside the exception class
(TRY003)
131-131: Avoid specifying long messages outside the exception class
(TRY003)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
Outdated
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/utils/model_utils.py
Outdated
Show resolved
Hide resolved
|
|
||
|
|
||
| @async_job(redis=True, beanie=False) | ||
| async def ingest_obsidian_vault_job(job_id: str, vault_path: str, redis_client=None) -> dict: |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for the function definition and call sites
cd /repo 2>/dev/null || cd .
# Find the file
find . -name "obsidian_jobs.py" -type f 2>/dev/null | head -5Repository: chronicler-ai/chronicle
Length of output: 136
🏁 Script executed:
#!/bin/bash
# Look for the function and how it's called
rg "ingest_obsidian_vault_job" --type=py -B2 -A10 2>/dev/null | head -100Repository: chronicler-ai/chronicle
Length of output: 4187
🏁 Script executed:
#!/bin/bash
# Check decorator pattern and redis_client usage
rg "@async_job|redis_client" --type=py -B1 -A3 2>/dev/null | head -100Repository: chronicler-ai/chronicle
Length of output: 9797
🏁 Script executed:
#!/bin/bash
# Read the obsidian_jobs.py file to see full function implementation
cat -n backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py | head -150Repository: chronicler-ai/chronicle
Length of output: 4161
Remove unused job_id and redis_client parameters.
The job_id parameter is never used; the function always retrieves the job ID via get_current_job().id instead. The redis_client parameter is injected by the @async_job decorator but is never referenced in the function body. Consider removing both parameters or documenting their intended purpose.
Additionally, fix logging inconsistencies: line 47 uses logger.exception() with exc in the format string (redundant), while line 94 uses logger.error() with the same pattern. Use logger.exception() without the exc argument (which automatically includes traceback) or use logger.error() consistently.
🧰 Tools
🪛 Ruff (0.14.8)
29-29: Unused function argument: job_id
(ARG001)
29-29: Unused function argument: redis_client
(ARG001)
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py around
lines 29, 47 and 94: remove the unused function parameters `job_id` and
`redis_client` from the async def ingest_obsidian_vault_job signature (the
function already uses get_current_job().id), and update any decorators/call
sites that inject `redis_client` so they no longer expect those args; also fix
logging: at line 47 remove the redundant `exc` interpolation and call
logger.exception("...") (which includes traceback automatically), and change the
error at line 94 to use logger.exception("...") (or use logger.error without
interpolating the exception) so logging is consistent and does not duplicate the
exception object in the formatted message.
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (8)
backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py (3)
28-29: Remove unused function parameters.The
job_idandredis_clientparameters are never used in the function body. The function always retrieves the job ID viaget_current_job().idinstead of using thejob_idparameter. Based on learnings, remove unused parameters.🔎 Proposed fix
@async_job(redis=True, beanie=False) -async def ingest_obsidian_vault_job(job_id: str, vault_path: str, redis_client=None) -> dict: # type: ignore +async def ingest_obsidian_vault_job(vault_path: str) -> dict: # type: ignore """ Long-running ingestion job enqueued on the default RQ queue. """
46-47: Fix redundant exception logging.The
logger.exception()call automatically includes exception details, so interpolatingexcin the format string is redundant. Based on learnings from this repository, uselogger.exception()without the exception argument.🔎 Proposed fix
except Exception as exc: - logger.exception("Database setup failed for job %s: %s", job.id, exc) + logger.exception("Database setup failed for job %s", job.id) job.meta["status"] = "failed" job.meta["error"] = f"Database setup failed: {exc}" job.save_meta()
93-94: Uselogger.exception()for automatic stack trace capture.Per repository learnings, prefer
logging.exception()inside except blocks to automatically log the full stack trace for better debuggability.🔎 Proposed fix
except Exception as exc: - logger.error("Processing %s failed: %s", filename, exc) + logger.exception("Processing %s failed", filename) errors.append(f"{filename}: {exc}") job.meta["errors"] = errors job.save_meta()backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (3)
46-48: Uselogger.exception()and chain exceptions properly.Per repository learnings, use
logging.exception()inside except blocks to capture the full stack trace, and chain exceptions withfrom eto preserve context.🔎 Proposed fix
except Exception as e: - logger.error(f"Ingestion failed: {e}") - raise HTTPException(status_code=500, detail=str(e)) + logger.exception("Ingestion failed") + raise HTTPException(status_code=500, detail=str(e)) from e
111-113: Uselogger.exception()and add exception chaining.Per repository learnings, use
logging.exception()to capture stack traces and chain exceptions withfrom e.🔎 Proposed fix
except Exception as e: - logger.error(f"Failed to process uploaded zip: {e}") - raise HTTPException(status_code=500, detail=f"Failed to process uploaded zip: {e}") + logger.exception("Failed to process uploaded zip") + raise HTTPException(status_code=500, detail=f"Failed to process uploaded zip: {e}") from e
221-223: Replace bareexceptwith specific exception handling.Per coding guidelines, always raise errors explicitly with proper exceptions. The bare
exceptviolates this principle. Catch the specific exception (json.JSONDecodeError) and add exception chaining.🔎 Proposed fix
try: job_data = json.loads(pending_data) return { "job_id": job_id, "status": "ready", "total": job_data.get("total", 0), "processed": 0, "percent": 0, "errors": [], "vault_path": job_data.get("vault_path") } - except: - raise HTTPException(status_code=500, detail="Failed to get job status") + except json.JSONDecodeError as e: + logger.exception("Failed to parse pending job data for %s", job_id) + raise HTTPException(status_code=500, detail="Failed to get job status") from e raise HTTPException(status_code=404, detail="Job not found")backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
420-421: Uselogger.exception()for search failures.Per repository learnings, use
logging.exception()inside except blocks to automatically capture the full stack trace.🔎 Proposed fix
except Exception as e: - logger.error(f"Obsidian search failed: {e}") + logger.exception("Obsidian search failed")
428-437: Add thread-safe singleton initialization.The check-then-set pattern has a race condition under concurrent access. Multiple threads could simultaneously pass the
if _obsidian_service is Nonecheck and create multiple instances.🔎 Proposed fix using threading.Lock
+import threading + # Lazy initialization to avoid startup failures _obsidian_service: Optional[ObsidianService] = None +_service_lock = threading.Lock() def get_obsidian_service() -> ObsidianService: """Get or create the Obsidian service singleton. Returns: ObsidianService instance, creating it on first access. """ global _obsidian_service if _obsidian_service is None: - _obsidian_service = ObsidianService() + with _service_lock: + # Double-check pattern: another thread may have initialized it + if _obsidian_service is None: + _obsidian_service = ObsidianService() return _obsidian_service
🧹 Nitpick comments (4)
backends/advanced/src/advanced_omi_backend/utils/model_utils.py (1)
37-45: Logic is correct and error handling is appropriate.The function correctly retrieves the default model name, searches for the corresponding model definition, and raises informative
ValueErrorexceptions when configurations are missing. The error messages clearly indicate what went wrong, which aids debugging.Optional: Consider Ruff TRY003 suggestion.
Static analysis flags the long error messages (lines 39, 45) as a stylistic concern. For simple validation errors like these, inline messages are perfectly acceptable. If you prefer to follow Ruff's recommendation strictly, you could create custom exception classes, but this is entirely optional.
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (3)
48-72: Consider consolidating environment loading utilities.Per maintainer feedback, there are existing env loaders in
upload_audio/load_env_file,root/scripts/lib/env_utils, andwizard.py. Consider consolidating to avoid duplication.Based on learnings: Maintainer noted consolidation opportunity.
378-379: Remove redundant exception handler.The exception is caught and immediately re-raised without any handling. This is redundant. Either handle the exception or remove the try-except block.
🔎 Proposed fix
- try: - clean_query = self._clean_text(query) - vectors = await generate_openai_embeddings( - [clean_query], - api_key=self.openai_api_key, - base_url=self.openai_base_url, - model=self.embedding_model, - ) - query_vector = vectors[0] if vectors else None - except Exception: - raise + clean_query = self._clean_text(query) + vectors = await generate_openai_embeddings( + [clean_query], + api_key=self.openai_api_key, + base_url=self.openai_base_url, + model=self.embedding_model, + ) + query_vector = vectors[0] if vectors else None
90-149: Consider leveraging existing memory infrastructure.Per maintainer feedback, this service appears to duplicate existing functionality. The memory router's
add_memoryalready writes to Neo4j via OpenMemory. Consider refactoring to useservices/memory/providers/llm_providerand the memory router rather than implementing a parallel ingestion system. If Obsidian-specific characteristics require custom handling, create anObsidianMemorymodel and use a custom prompt through the memory controller.Based on learnings: Maintainer suggested architectural alignment with existing memory infrastructure.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
tests/test_assets/obsidian_dataview_example_vault.zipis excluded by!**/*.zip
📒 Files selected for processing (5)
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py(1 hunks)backends/advanced/src/advanced_omi_backend/services/neo4j_client.py(1 hunks)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py(1 hunks)backends/advanced/src/advanced_omi_backend/utils/model_utils.py(1 hunks)backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/services/neo4j_client.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.pybackends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.pybackends/advanced/src/advanced_omi_backend/utils/model_utils.py
🧠 Learnings (2)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/neo4j_client.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.pybackends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.pybackends/advanced/src/advanced_omi_backend/utils/model_utils.py
📚 Learning: 2025-12-20T01:11:51.486Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-20T01:11:51.486Z
Learning: Applies to **/*.py : Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Applied to files:
backends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
🧬 Code graph analysis (2)
backends/advanced/src/advanced_omi_backend/services/neo4j_client.py (1)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
close(150-152)
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (3)
backends/advanced/src/advanced_omi_backend/utils/file_utils.py (2)
extract_zip(21-81)ZipExtractionError(16-18)backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py (2)
count_markdown_files(17-25)ingest_obsidian_vault_job(29-107)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
ingest_vault(319-357)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
107-107: Avoid specifying long messages outside the exception class
(TRY003)
111-111: Avoid specifying long messages outside the exception class
(TRY003)
131-131: Avoid specifying long messages outside the exception class
(TRY003)
272-272: Do not catch blind exception: Exception
(BLE001)
273-273: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
277-277: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
302-302: Probable use of insecure hash functions in hashlib: md5
(S324)
335-335: Avoid specifying long messages outside the exception class
(TRY003)
355-355: Use explicit conversion flag
Replace with conversion flag
(RUF010)
378-379: Remove exception handler; error is immediately re-raised
(TRY203)
420-420: Do not catch blind exception: Exception
(BLE001)
421-421: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
34-34: Unused function argument: current_user
(ARG001)
34-34: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
45-45: Consider moving this statement to an else block
(TRY300)
46-46: Do not catch blind exception: Exception
(BLE001)
47-47: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
48-48: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
53-53: Do not perform function call File in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
54-54: Unused function argument: current_user
(ARG001)
54-54: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
82-82: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
83-83: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
89-89: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
90-90: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
92-92: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
93-93: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
111-111: Do not catch blind exception: Exception
(BLE001)
112-112: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
113-113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
119-119: Do not catch blind exception: Exception
(BLE001)
126-126: Unused function argument: current_user
(ARG001)
126-126: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
150-150: Consider moving this statement to an else block
(TRY300)
151-151: Do not catch blind exception: Exception
(BLE001)
152-152: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
153-153: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
166-166: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
170-170: Unused function argument: current_user
(ARG001)
170-170: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
221-221: Do not use bare except
(E722)
222-222: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
223-223: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py
20-20: Loop control variable root not used within loop body
Rename unused root to _root
(B007)
29-29: Unused function argument: job_id
(ARG001)
29-29: Unused function argument: redis_client
(ARG001)
47-47: Redundant exception object included in logging.exception call
(TRY401)
93-93: Do not catch blind exception: Exception
(BLE001)
94-94: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/utils/model_utils.py
39-39: Avoid specifying long messages outside the exception class
(TRY003)
45-45: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (2)
backends/advanced/src/advanced_omi_backend/utils/model_utils.py (1)
11-36: LGTM! Past issue resolved.The return type has been correctly updated to
Dict[str, Any](non-optional), and the docstring accurately reflects that the function either returns a model configuration dictionary or raises aValueError. The documentation is thorough and includes a helpful usage example.backends/advanced/src/advanced_omi_backend/services/neo4j_client.py (1)
1-63: LGTM! Previous issues have been resolved.The implementation correctly:
- Uses
READ_ACCESSandWRITE_ACCESSconstants from the neo4j module (lines 4, 58, 63) instead of string literals- Consumes the result data before the session closes (line 53) by calling
result.data(), preventing issues with closed Result objectsThe Neo4j client abstraction is well-designed and thread-safe for session management.
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/utils/model_utils.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (5)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (4)
281-283: Uselogger.exception()for automatic stack trace capture.Per project learnings, use
logging.exception()inside except blocks to automatically capture the full stack trace for better debuggability.🔎 Proposed fix
except Exception as e: - logger.error(f"Embedding generation failed for {note_data['path']}: {e}") + logger.exception("Embedding generation failed for %s", note_data['path']) return []
387-389: Uselogger.exception()for embedding failure.Per project learnings, use
logging.exception()inside except blocks to automatically capture the full stack trace.🔎 Proposed fix
except Exception as exc: - logger.error("Obsidian search embedding failed: %s", exc) + logger.exception("Obsidian search embedding failed") raise ObsidianSearchError("embedding", str(exc)) from exc
430-432: Uselogger.exception()for database failure.Per project learnings, use
logging.exception()inside except blocks to capture the full stack trace.🔎 Proposed fix
except Exception as e: - logger.error(f"Obsidian search failed: {e}") + logger.exception("Obsidian search failed") raise ObsidianSearchError("database", str(e)) from e
439-448: Singleton initialization is not thread-safe.The check-then-set pattern has a race condition where multiple threads could create separate instances. Consider using a
threading.Lockwith double-check pattern if concurrent access is expected.🔎 Proposed fix
Add at module level:
import threading _service_lock = threading.Lock()Then update the function:
def get_obsidian_service() -> ObsidianService: global _obsidian_service if _obsidian_service is None: - _obsidian_service = ObsidianService() + with _service_lock: + if _obsidian_service is None: + _obsidian_service = ObsidianService() return _obsidian_servicebackends/advanced/src/advanced_omi_backend/chat_service.py (1)
344-346: Uselogging.exception()and bareraisefor better stack traces.Per project learnings, use
logging.exception()inside except blocks for automatic stack trace capture, and use bareraiseto preserve the original exception context.🔎 Proposed fix
except Exception as e: - logger.error(f"Failed to get Obsidian context: {e}") - raise e + logger.exception("Failed to get Obsidian context") + raise
🧹 Nitpick comments (4)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (4)
57-97: Consider consolidating with existing env utilities.Per prior feedback, the codebase already has env loaders in
upload_audio/load_env_file,scripts/lib/env_utils, andwizard.py. Consider importing from an existing utility or consolidating these functions into a shared module to reduce duplication. Similarly,resolve_valuecould be moved to a shared utils module for reuse by other services.
286-288: Addstrict=Truetozip()to catch length mismatches.If
original_chunksandvectorshave different lengths (e.g., API returns fewer embeddings), the current code silently drops items. Addingstrict=Truewill raise aValueErrormaking such issues explicit.🔎 Proposed fix
- for orig_text, vector in zip(original_chunks, vectors): + for orig_text, vector in zip(original_chunks, vectors, strict=True):
310-311: Consider using SHA256 instead of MD5 for chunk IDs.While MD5 is sufficient for generating unique chunk identifiers (not used for security), SHA256 is a more modern choice that avoids static analysis warnings and is future-proof.
🔎 Proposed fix
- chunk_id = hashlib.md5(f"{note_data['path']}_{i}".encode()).hexdigest() + chunk_id = hashlib.sha256(f"{note_data['path']}_{i}".encode()).hexdigest()[:32]
362-364: Remove redundant exception object fromlogger.exception().
logger.exception()automatically includes the exception info, so{e}in the message is redundant.🔎 Proposed fix
except Exception as e: - logger.exception(f"Processing {file} failed") + logger.exception("Processing %s failed", file) errors.append(f"{file}: {str(e)}")
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
backends/advanced/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
backends/advanced/pyproject.toml(1 hunks)backends/advanced/src/advanced_omi_backend/chat_service.py(7 hunks)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py(1 hunks)backends/advanced/tests/test_obsidian_service.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- backends/advanced/tests/test_obsidian_service.py
- backends/advanced/pyproject.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/chat_service.py
🧠 Learnings (2)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/chat_service.py
📚 Learning: 2025-12-20T01:11:51.486Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-20T01:11:51.486Z
Learning: Applies to **/*.py : Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Applied to files:
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
🧬 Code graph analysis (1)
backends/advanced/src/advanced_omi_backend/chat_service.py (2)
backends/advanced/src/advanced_omi_backend/services/memory/base.py (1)
MemoryEntry(26-63)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (3)
get_obsidian_service(439-448)ObsidianSearchError(51-55)search_obsidian(368-434)
🪛 Ruff (0.14.8)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
116-116: Avoid specifying long messages outside the exception class
(TRY003)
120-120: Avoid specifying long messages outside the exception class
(TRY003)
140-140: Avoid specifying long messages outside the exception class
(TRY003)
281-281: Do not catch blind exception: Exception
(BLE001)
282-282: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
286-286: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
311-311: Probable use of insecure hash functions in hashlib: md5
(S324)
344-344: Avoid specifying long messages outside the exception class
(TRY003)
364-364: Use explicit conversion flag
Replace with conversion flag
(RUF010)
388-388: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
431-431: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/chat_service.py
338-342: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
345-345: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
346-346: Use raise without specifying exception name
Remove exception name
(TRY201)
🔇 Additional comments (5)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
1-56: LGTM!The module structure, imports, and type definitions are well-organized. The
ObsidianSearchErrorexception with stage differentiation provides useful error context for callers.
102-158: LGTM with note on configuration approach.The initialization properly uses the shared
load_root_config()utility and falls back to environment variables for Neo4j credentials. The descriptive exception messages for missing configuration are appropriate for startup validation.backends/advanced/src/advanced_omi_backend/chat_service.py (3)
26-30: LGTM!The new imports are properly organized within the local imports section and introduce the necessary types for Obsidian integration.
288-323: LGTM!The return type change to
List[MemoryEntry]and the corresponding attribute access updates (memory.id,memory.content) are cleaner and provide better type safety than the previous dict-based approach.
363-384: LGTM!The
include_obsidian_memoryparameter is properly added to the signature and correctly propagated toformat_conversation_context. The integration follows the existing patterns in the codebase.
|
@0xrushi I think all the places where the config was read has a conflict due to the config refactor PR being merged |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (1)
134-136: Inconsistency: Config display values don't use new variables.Lines 134-136 still reference the module-level
_llm_defvariable (defined at lines 45-49), not the newly created local variables_llm_provider,_llm_model, and_llm_base_urlfrom the try block above. This is inconsistent with the AI summary stating that display values are now sourced from registry-based defaults.If the intent is to use the new config-derived values for health display, these lines should be updated to:
- "llm_provider": (_llm_def.model_provider if _llm_def else None), - "llm_model": (_llm_def.model_name if _llm_def else None), - "llm_base_url": (_llm_def.model_url if _llm_def else None), + "llm_provider": _llm_provider, + "llm_model": _llm_model, + "llm_base_url": _llm_base_url,backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (1)
81-104: Critical: Duplicate route path causes FastAPI conflict.Lines 81-87 define
POST /admin/memory/config/validatewith functionvalidate_memory_config_raw, and lines 98-104 define the exact same routePOST /admin/memory/config/validatewith functionvalidate_memory_config.FastAPI does not allow duplicate route paths with the same HTTP method. This will cause a routing conflict—either FastAPI will fail to start, raise a validation error, or only register the first endpoint.
Based on the relevant code snippets from
system_controller.py, these appear to be two distinct validation functions. One path needs to be renamed.🔎 Proposed fix
Rename one of the endpoints to differentiate them:
-@router.post("/admin/memory/config/validate") +@router.post("/admin/memory/config/validate-raw") async def validate_memory_config_raw( config_yaml: str = Body(..., media_type="text/plain"), current_user: User = Depends(current_superuser), ): """Validate posted memory YAML (does not save). Admin only.""" return await system_controller.validate_memory_config_raw(config_yaml)Alternatively, consolidate into a single validation endpoint if the functionality overlaps.
♻️ Duplicate comments (2)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py (1)
95-108: Critical: Multiple issues including undefined functions and dead code.This code block has several critical problems:
Undefined functions:
_load_root_config()(line 97) and_resolve_value()(lines 104-105) are not imported or defined, causingNameErrorat runtime.Dead code pattern (duplicate of past review): Lines 107-108 set fallback values then immediately re-raise the exception, making line 107 unreachable dead code. Past review comments already flagged this exact pattern.
Unused variables: The new variables
_llm_provider,_llm_model, and_llm_base_url(lines 103-105) are created but never used. Lines 134-136 still reference the module-level_llm_defvariable, not these new local variables.Exception chaining: Line 108 should use
raiseinstead ofraise e, and should chain withfrom eper project learnings.🔎 Proposed fix
First, ensure
_load_root_configand_resolve_valueare imported at the top of the file. Then apply this fix:# Load model config once for display fields try: _cfg = _load_root_config() or {} _defaults = _cfg.get("defaults", {}) or {} _models = _cfg.get("models", []) or [] _llm_name = _defaults.get("llm") _stt_name = _defaults.get("stt") _llm_def = next((m for m in _models if m.get("name") == _llm_name), None) _llm_provider = (_llm_def.get("model_provider") if _llm_def else None) or "openai" _llm_model = str(_resolve_value(_llm_def.get("model_name", ""))) if _llm_def else None _llm_base_url = str(_resolve_value(_llm_def.get("model_url", ""))) if _llm_def else None except Exception as e: + logger.warning(f"Failed to load configuration from config.yml for health check: {e}") _llm_provider, _llm_model, _llm_base_url, _stt_name = "openai", None, None, None - raise eBased on learnings, graceful degradation is preferred for health endpoints, and exception chaining with
from epreserves context when re-raising.Run this script to verify the required imports exist:
#!/bin/bash # Search for definitions of _load_root_config and _resolve_value rg -nP --type=py -C3 'def _load_root_config|def _resolve_value'backends/advanced/src/advanced_omi_backend/chat_service.py (1)
336-345: Uselogging.exception()and bareraisefor better stack traces.Both exception handlers should use
logging.exception()to automatically capture the full stack trace, and the genericExceptionhandler should use bareraiseto preserve the original exception context.🔎 Proposed fix
except ObsidianSearchError as exc: - logger.error( - "Failed to get Obsidian context (%s stage): %s", + logger.exception( + "Failed to get Obsidian context (%s stage)", exc.stage, - exc, ) raise except Exception as e: - logger.error(f"Failed to get Obsidian context: {e}") - raise e + logger.exception("Failed to get Obsidian context") + raiseBased on learnings from previous reviews in this project and confirmed by static analysis (TRY400, TRY201).
🧹 Nitpick comments (1)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
59-72: Code duplication:resolve_valueexists in multiple files.This function is duplicated in
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py(lines 82-96). Consider consolidating into a shared utility module to maintain a single source of truth.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
backends/advanced/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (8)
backends/advanced/docker-compose.ymlbackends/advanced/src/advanced_omi_backend/chat_service.pybackends/advanced/src/advanced_omi_backend/controllers/system_controller.pybackends/advanced/src/advanced_omi_backend/llm_client.pybackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.pybackends/advanced/src/advanced_omi_backend/routers/modules/system_routes.pybackends/advanced/src/advanced_omi_backend/services/memory/config.pybackends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
🚧 Files skipped from review as they are similar to previous changes (2)
- backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
- backends/advanced/docker-compose.yml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/services/memory/config.pybackends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.pybackends/advanced/src/advanced_omi_backend/chat_service.pybackends/advanced/src/advanced_omi_backend/llm_client.pybackends/advanced/src/advanced_omi_backend/routers/modules/system_routes.pybackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/memory/config.pybackends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.pybackends/advanced/src/advanced_omi_backend/chat_service.pybackends/advanced/src/advanced_omi_backend/llm_client.pybackends/advanced/src/advanced_omi_backend/routers/modules/system_routes.pybackends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
🧬 Code graph analysis (3)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
resolve_value(83-97)backends/advanced/init.py (1)
load_config_yml(129-143)
backends/advanced/src/advanced_omi_backend/chat_service.py (2)
backends/advanced/src/advanced_omi_backend/services/memory/base.py (1)
MemoryEntry(26-63)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (3)
get_obsidian_service(439-448)ObsidianSearchError(51-55)search_obsidian(368-434)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (1)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (2)
get_memory_config_raw(259-279)update_memory_config_raw(282-317)
🪛 Ruff (0.14.10)
backends/advanced/src/advanced_omi_backend/services/memory/config.py
90-90: Avoid specifying long messages outside the exception class
(TRY003)
281-281: Use raise without specifying exception name
Remove exception name
(TRY201)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
284-284: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/chat_service.py
337-341: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
344-344: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
345-345: Use raise without specifying exception name
Remove exception name
(TRY201)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py
77-77: Unused function argument: current_user
(ARG001)
77-77: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
84-84: Unused function argument: current_user
(ARG001)
84-84: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
92-92: Unused function argument: current_user
(ARG001)
92-92: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py
97-97: Undefined name _load_root_config
(F821)
104-104: Undefined name _resolve_value
(F821)
105-105: Undefined name _resolve_value
(F821)
108-108: Use raise without specifying exception name
Remove exception name
(TRY201)
🔇 Additional comments (11)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
75-92: LGTM! Well-structured async embedding helper.The new
generate_openai_embeddingsfunction cleanly extracts the embedding generation logic and reuses the existing_get_openai_clienthelper for consistent client creation with optional Langfuse tracing.backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
74-90: LGTM! Config loading with multiple fallback paths.The function appropriately searches multiple standard locations for config.yml with sensible defaults for both Docker and development environments.
backends/advanced/src/advanced_omi_backend/llm_client.py (1)
59-64: LGTM! Config-driven initialization without environment fallbacks.The constructor correctly requires all parameters to be explicitly provided (removing environment variable fallbacks), which aligns with the config-driven architecture. The validation on lines 63-64 provides clear error messaging for missing configuration.
backends/advanced/src/advanced_omi_backend/chat_service.py (8)
26-30: LGTM: Clean import additions.The new imports for
MemoryEntryand Obsidian service components are properly placed and grouped with local imports, following the project's coding guidelines.
287-287: LGTM: Type safety improvement.Changing the return type to
List[MemoryEntry]improves type safety and aligns with the memory service API, enabling cleaner attribute-based access to memory data.
302-302: LGTM: Backward-compatible signature change.The addition of
include_obsidian_memory: bool = Falseis backward compatible and provides a clean feature toggle for Obsidian integration.
310-310: LGTM: Cleaner attribute access.The migration from dict-style access (
memory.get("id"),memory.get("memory")) to direct attribute access (memory.id,memory.content) is cleaner and type-safe with theMemoryEntrydataclass.Also applies to: 319-319
324-335: LGTM: Well-structured Obsidian integration.The core logic for integrating Obsidian vault search is clean:
- Properly awaits the async search
- Defensively checks for results before appending
- Maintains consistent formatting with memory context
- Includes helpful logging for observability
363-363: LGTM: Consistent parameter propagation.The addition of
include_obsidian_memory: bool = Falsetogenerate_response_streamproperly propagates the feature flag through the call chain and maintains backward compatibility.
381-383: LGTM: Correct parameter forwarding.The
include_obsidian_memoryparameter is correctly forwarded fromgenerate_response_streamtoformat_conversation_context, enabling the Obsidian integration when requested.
577-583: LGTM: Clean service lifecycle logging.The cleanup log message provides good visibility into the service lifecycle and is symmetric with the initialization log.
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (1)
92-115: Critical: Duplicate route registration.Both
validate_memory_config_raw(line 92) andvalidate_memory_config(line 109) are registered to the same path and HTTP method:POST /admin/memory/config/validate. FastAPI cannot handle duplicate routes and will either fail at startup or unpredictably override one handler with the other.Looking at the pattern established by other endpoints:
get_memory_config_rawuses/admin/memory/config/raw(GET)update_memory_config_rawuses/admin/memory/config/raw(POST)The newly added
validate_memory_config_rawshould either:
- Be removed entirely if
validate_memory_config(which accepts a structured model) is sufficient, or- Use a different path like
/admin/memory/config/raw/validateto follow the established naming pattern🔎 Recommended fix: remove duplicate or use distinct path
Option 1: Remove the duplicate
validate_memory_config_rawendpoint if the structured model approach is sufficient:-@router.post("/admin/memory/config/validate") -async def validate_memory_config_raw( - config_yaml: str = Body(..., media_type="text/plain"), - current_user: User = Depends(current_superuser), -): - """Validate posted memory YAML (does not save). Admin only.""" - return await system_controller.validate_memory_config_raw(config_yaml) -Option 2: Use a distinct path following the established pattern:
-@router.post("/admin/memory/config/validate") +@router.post("/admin/memory/config/raw/validate") async def validate_memory_config_raw( config_yaml: str = Body(..., media_type="text/plain"), current_user: User = Depends(current_superuser), ): """Validate posted memory YAML (does not save). Admin only.""" return await system_controller.validate_memory_config_raw(config_yaml)
♻️ Duplicate comments (11)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (1)
86-86: Already flagged: Misleading comment.This comment was already identified in a previous review as misleading because the endpoints still exist below.
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (5)
102-157: Re-evaluate architectural approach per maintainer feedback.Per maintainer feedback, this service appears to duplicate existing functionality. The current memory router's
add_memoryalready writes to Neo4j via OpenMemory. Consider refactoring to use the existing memory infrastructure (services/memory/providers/llm_providerand memory router) rather than implementing a parallel ingestion system.If Obsidian-specific characteristics require custom handling, consider creating an
ObsidianMemorymodel and using a custom prompt through the memory controller. This approach would reduce code duplication and leverage proven infrastructure.Based on maintainer feedback: "This can be achieved using the current infra by just passing the text to the memory router add_memory (which writes to neo4J using openmemory)."
281-283: Remove redundant exception from log message.
logger.exception()automatically includes exception details, so thef"Embedding generation failed for {note_data['path']}: {e}"is redundant.🔎 Proposed fix
except Exception as e: - logger.exception(f"Embedding generation failed for {note_data['path']}: {e}") + logger.exception("Embedding generation failed for %s", note_data['path']) return []Based on learnings from this repository.
362-364: Remove redundant exception from log message.
logger.exception()automatically includes exception details, so thef"Processing {file} failed"message is sufficient without the redundant exception reference.🔎 Proposed fix
except Exception as e: - logger.exception(f"Processing {file} failed") + logger.exception("Processing %s failed", file) errors.append(f"{file}: {str(e)}")Based on learnings from this repository.
445-448: Lazy initialization is not thread-safe.The check-then-set pattern has a race condition: multiple threads could simultaneously pass the
if _obsidian_service is Nonecheck and create multiple service instances, each with separate Neo4j driver connections.🔎 Proposed fix using double-checked locking
+import threading + # Lazy initialization to avoid startup failures _obsidian_service: Optional[ObsidianService] = None +_service_lock = threading.Lock() def get_obsidian_service() -> ObsidianService: """Get or create the Obsidian service singleton. Returns: ObsidianService instance, creating it on first access. """ global _obsidian_service if _obsidian_service is None: - _obsidian_service = ObsidianService() + with _service_lock: + # Double-check pattern + if _obsidian_service is None: + _obsidian_service = ObsidianService() return _obsidian_service
430-432: Uselogger.exception()for better stack traces.Per repository learnings, use
logger.exception()instead oflogger.error()to capture the full stack trace for better debuggability.🔎 Proposed fix
except Exception as e: - logger.error(f"Obsidian search failed: {e}") + logger.exception("Obsidian search failed") raise ObsidianSearchError("database", str(e)) from eBased on learnings from this repository.
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (5)
46-48: Uselogger.exception()and add exception chaining.Per repository learnings and past review comments, replace
logger.error()withlogger.exception()to capture the full stack trace, and chain the exception withfrom ewhen re-raising.🔎 Proposed fix
except Exception as e: - logger.error(f"Ingestion failed: {e}") - raise HTTPException(status_code=500, detail=str(e)) + logger.exception("Ingestion failed") + raise HTTPException(status_code=500, detail=str(e)) from eBased on learnings from this repository.
82-83: Uselogger.exception()and add exception chaining.Per repository learnings, use
logger.exception()inside except blocks to capture the full stack trace, and chain exceptions withfrom e.🔎 Proposed fix
except IOError as e: - logger.error(f"Error writing zip file {zip_path}: {e}") - raise HTTPException(status_code=500, detail=f"Failed to save uploaded zip: {e}") + logger.exception("Error writing zip file %s", zip_path) + raise HTTPException(status_code=500, detail=f"Failed to save uploaded zip: {e}") from eBased on learnings from this repository.
88-93: Fix redundant exception in log and add exception chaining.Line 89:
logger.exception()automatically includes exception details, so thef"Invalid zip file: {e}"is redundant. Line 92: Uselogger.exception()instead oflogger.error(). Both lines need exception chaining.🔎 Proposed fix
except zipfile.BadZipFile as e: - logger.exception(f"Invalid zip file: {e}") - raise HTTPException(status_code=400, detail=f"Invalid zip file: {e}") + logger.exception("Invalid zip file") + raise HTTPException(status_code=400, detail=f"Invalid zip file: {e}") from e except ZipExtractionError as e: - logger.error(f"Error extracting zip file: {e}") - raise HTTPException(status_code=500, detail=f"Failed to extract zip file: {e}") + logger.exception("Error extracting zip file") + raise HTTPException(status_code=500, detail=f"Failed to extract zip file: {e}") from eBased on learnings from this repository.
111-113: Fix redundant exception in log and add exception chaining.
logger.exception()automatically includes exception details, so thef"Failed to process uploaded zip: {e}"is redundant. Also add exception chaining.🔎 Proposed fix
except Exception as e: - logger.exception(f"Failed to process uploaded zip: {e}") - raise HTTPException(status_code=500, detail=f"Failed to process uploaded zip: {e}") + logger.exception("Failed to process uploaded zip") + raise HTTPException(status_code=500, detail=f"Failed to process uploaded zip: {e}") from eBased on learnings from this repository.
151-153: Fix redundant exception in log and add exception chaining.
logger.exception()automatically includes exception details, so thef"Failed to start job {job_id}: {e}"is redundant. Also add exception chaining.🔎 Proposed fix
except Exception as e: - logger.exception(f"Failed to start job {job_id}: {e}") - raise HTTPException(status_code=500, detail=f"Failed to start job: {e}") + logger.exception("Failed to start job %s", job_id) + raise HTTPException(status_code=500, detail=f"Failed to start job: {e}") from eBased on learnings from this repository.
🧹 Nitpick comments (2)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (2)
57-97: Consider consolidating env loading with existing utilities.Per maintainer feedback, there are existing env loaders in
upload_audio/load_env_file,root/scripts/lib/env_utils, andwizard.py. Consider consolidating these utilities rather than duplicating env loading logic in this service.Based on maintainer feedback.
286-291: Addstrict=Truetozip()for safety.Without
strict=True, iforiginal_chunksandvectorshave different lengths,zip()will silently truncate to the shorter length, potentially losing chunks. This could happen if embedding generation partially fails.🔎 Proposed fix
chunk_payloads: List[ChunkPayload] = [] - for orig_text, vector in zip(original_chunks, vectors): + for orig_text, vector in zip(original_chunks, vectors, strict=True): if vector: chunk_payloads.append({"text": orig_text, "embedding": vector}) else: logger.warning(f"Embedding missing for chunk in {note_data['path']}")Note: This requires Python 3.10+. If targeting older Python, keep the current behavior but consider adding a length check before the loop.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.pybackends/advanced/src/advanced_omi_backend/routers/modules/system_routes.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/utils/model_utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/utils/model_utils.pybackends/advanced/src/advanced_omi_backend/routers/modules/system_routes.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
🧠 Learnings (2)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/utils/model_utils.pybackends/advanced/src/advanced_omi_backend/routers/modules/system_routes.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
📚 Learning: 2025-12-20T01:11:51.495Z
Learnt from: CR
Repo: chronicler-ai/chronicle PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-20T01:11:51.495Z
Learning: Applies to **/*.py : Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Applied to files:
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
🧬 Code graph analysis (3)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (1)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (2)
get_memory_config_raw(260-280)update_memory_config_raw(283-318)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (4)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (1)
generate_openai_embeddings(75-91)backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
load_config_yml(74-90)backends/advanced/src/advanced_omi_backend/utils/model_utils.py (1)
get_model_config(11-45)backends/advanced/src/advanced_omi_backend/services/neo4j_client.py (5)
Neo4jClient(7-32)close(26-29)reset(31-32)session(20-24)session(42-43)
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py (3)
backends/advanced/src/advanced_omi_backend/utils/file_utils.py (2)
extract_zip(21-81)ZipExtractionError(16-18)backends/advanced/src/advanced_omi_backend/workers/obsidian_jobs.py (2)
count_markdown_files(17-25)ingest_obsidian_vault_job(29-107)backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
ingest_vault(328-366)
🪛 Ruff (0.14.10)
backends/advanced/src/advanced_omi_backend/utils/model_utils.py
39-39: Avoid specifying long messages outside the exception class
(TRY003)
45-45: Avoid specifying long messages outside the exception class
(TRY003)
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py
88-88: Unused function argument: current_user
(ARG001)
88-88: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
95-95: Unused function argument: current_user
(ARG001)
95-95: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
103-103: Unused function argument: current_user
(ARG001)
103-103: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
116-116: Avoid specifying long messages outside the exception class
(TRY003)
120-120: Avoid specifying long messages outside the exception class
(TRY003)
140-140: Avoid specifying long messages outside the exception class
(TRY003)
282-282: Redundant exception object included in logging.exception call
(TRY401)
286-286: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
311-311: Probable use of insecure hash functions in hashlib: md5
(S324)
344-344: Avoid specifying long messages outside the exception class
(TRY003)
364-364: Use explicit conversion flag
Replace with conversion flag
(RUF010)
388-388: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
431-431: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
34-34: Unused function argument: current_user
(ARG001)
34-34: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
45-45: Consider moving this statement to an else block
(TRY300)
46-46: Do not catch blind exception: Exception
(BLE001)
47-47: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
48-48: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
53-53: Do not perform function call File in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
54-54: Unused function argument: current_user
(ARG001)
54-54: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
82-82: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
83-83: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
89-89: Redundant exception object included in logging.exception call
(TRY401)
90-90: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
92-92: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
93-93: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
112-112: Redundant exception object included in logging.exception call
(TRY401)
113-113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
119-119: Do not catch blind exception: Exception
(BLE001)
126-126: Unused function argument: current_user
(ARG001)
126-126: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
150-150: Consider moving this statement to an else block
(TRY300)
152-152: Redundant exception object included in logging.exception call
(TRY401)
153-153: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
166-166: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
170-170: Unused function argument: current_user
(ARG001)
170-170: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
221-221: Do not use bare except
(E722)
222-222: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
223-223: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🔇 Additional comments (3)
backends/advanced/src/advanced_omi_backend/utils/model_utils.py (1)
1-46: LGTM! Well-structured utility with proper error handling.The implementation correctly looks up model configurations from the config data structure, with appropriate error handling that raises
ValueErrorwhen defaults or models are not found. The function signature accurately reflects that it always returns a dictionary (addressed in previous reviews), and the docstring provides clear usage examples.The code follows the project's coding guidelines:
- All imports at the top after the docstring
- Explicit error raising rather than silent failures
- Avoids defensive
hasattr()checks, relying on the documented contract thatconfig_datahas the expected structureNote: The Ruff TRY003 hints about "long messages outside exception class" can be safely ignored—inline error messages are standard practice for simple utilities like this, and custom exception classes would reduce readability without adding value.
backends/advanced/src/advanced_omi_backend/routers/modules/system_routes.py (2)
89-89: Docstring updates look good.The updated docstrings correctly reflect that
config.ymlis the source of truth for memory configuration, which aligns with the implementation shown in the system_controller snippets.Also applies to: 105-105, 120-120
92-98: Verify that validation methods exist in system_controller.Ensure that both
validate_memory_config_rawandvalidate_memory_configmethods are implemented insystem_controller. The provided code snippets only showget_memory_config_rawandupdate_memory_config_raw.#!/bin/bash # Description: Verify that the validation methods exist in system_controller # Search for the validation method definitions in system_controller rg -n -A 5 'async def validate_memory_config' --type=py -g '**/system_controller.py'Also applies to: 109-115
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Show resolved
Hide resolved
backends/advanced/src/advanced_omi_backend/routers/modules/obsidian_routes.py
Show resolved
Hide resolved
…to feat/add-obsidian
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
316-316: Use bareraiseto preserve exception context.This issue was flagged in a previous review. When re-raising exceptions, use bare
raiseinstead ofraise eto preserve the full traceback.🔎 Proposed fix
except Exception as e: memory_logger.exception( f"Failed to get embedding dimensions from registry for model '{embedding_model}'" ) - raise e + raiseBased on learnings, bare
raisepreserves the original exception context and full traceback.
🧹 Nitpick comments (1)
backends/advanced/pyproject.toml (1)
59-59: Consider pinning to a specific commit SHA instead of the "main" branch.Using
rev = "main"instead of a specific commit SHA or tag means different builds at different times will pull different versions of mem0ai, reducing build reproducibility and potentially introducing unexpected changes.🔎 Suggested approach to pin to a specific commit
After verifying that PR #3250 is merged to main, pin to the specific commit SHA:
-mem0ai = { git = "https://github.com/AnkushMalaker/mem0.git", rev = "main" } +mem0ai = { git = "https://github.com/AnkushMalaker/mem0.git", rev = "<commit-sha>" }Replace
<commit-sha>with the actual commit hash from the main branch that includes the fix.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
backends/advanced/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
backends/advanced/docker-compose.ymlbackends/advanced/pyproject.tomlbackends/advanced/src/advanced_omi_backend/services/memory/config.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backends/advanced/docker-compose.yml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/services/memory/config.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/services/memory/config.py
🧬 Code graph analysis (1)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (2)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (1)
resolve_value(83-97)backends/advanced/init.py (1)
load_config_yml(129-143)
🪛 Ruff (0.14.10)
backends/advanced/src/advanced_omi_backend/services/memory/config.py
90-90: Avoid specifying long messages outside the exception class
(TRY003)
316-316: Use raise without specifying exception name
Remove exception name
(TRY201)
backends/advanced/src/advanced_omi_backend/services/memory/config.py
Outdated
Show resolved
Hide resolved
@thestumonkey https://github.com/s-blu/obsidian_dataview_example_vault |
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (4)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
315-315: Use bareraiseto preserve exception context.As noted in previous review, use bare
raiseinstead ofraise eto preserve the full traceback.🔎 Proposed fix
except Exception as e: memory_logger.exception( f"Failed to get embedding dimensions from registry for model '{embedding_model}'" ) - raise e + raiseBased on learnings, when re-raising exceptions, use bare
raiseto preserve the original context.backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (3)
406-408: Uselogger.exception()for automatic stack trace capture.Per project learnings, use
logger.exception()inside except blocks to automatically capture the full stack trace for better debuggability, even when re-raising the exception.🔎 Proposed fix
except Exception as exc: - logger.error("Obsidian search embedding failed: %s", exc) + logger.exception("Obsidian search embedding failed") raise ObsidianSearchError("embedding", str(exc)) from excBased on learnings from this repository.
449-451: Uselogger.exception()for better stack traces.Per repository learnings, use
logger.exception()instead oflogger.error()inside except blocks to capture the full stack trace for better debuggability.🔎 Proposed fix
except Exception as e: - logger.error(f"Obsidian search failed: {e}") + logger.exception("Obsidian search failed") raise ObsidianSearchError("database", str(e)) from eBased on learnings from this repository.
456-469: Singleton initialization is not thread-safe.The check-then-set pattern in
get_obsidian_service()has a race condition. In a concurrent environment (e.g., FastAPI handling multiple requests), multiple threads could simultaneously pass theif _obsidian_service is Nonecheck and create multiple service instances, each with separate Neo4j driver connections. This could lead to resource leaks and inconsistent state.🔎 Proposed fix using threading.Lock
+import threading + # Lazy initialization to avoid startup failures _obsidian_service: Optional[ObsidianService] = None +_service_lock = threading.Lock() def get_obsidian_service() -> ObsidianService: """Get or create the Obsidian service singleton. Returns: ObsidianService instance, creating it on first access. """ global _obsidian_service if _obsidian_service is None: - _obsidian_service = ObsidianService() + with _service_lock: + # Double-check pattern: another thread may have initialized it + if _obsidian_service is None: + _obsidian_service = ObsidianService() return _obsidian_service
🧹 Nitpick comments (6)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
102-102: Add explicitOptionaltype hint forapi_keyparameter.The
api_keyparameter has a default value ofNonebut lacks an explicitOptionaltype hint, which violates PEP 484.🔎 Proposed fix
def create_mycelia_config( - api_url: str = "http://localhost:8080", api_key: str = None, timeout: int = 30, **kwargs + api_url: str = "http://localhost:8080", api_key: Optional[str] = None, timeout: int = 30, **kwargs ) -> Dict[str, Any]:backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (5)
64-91: Consider consolidating with existing env loaders.Per maintainer feedback, there are existing env loaders in
upload_audio/load_env_file,root/scripts/lib/env_utils, andwizard.py. This function duplicates that functionality. Consider using or consolidating with existing utilities to reduce code duplication.
186-189: Consider moving to shared utils for reusability.Per maintainer feedback, this text normalization function could be useful for other LLM services and isn't specific to Obsidian. Consider moving it to a shared utilities module (e.g.,
utils/text_utils.pyorservices/memory/utils.py) so other services can leverage it.
283-285: Remove redundant exception object from log message.
logger.exception()automatically includes exception details in the log output, so including{e}in the message is redundant.🔎 Proposed fix
except Exception as e: - logger.exception(f"Embedding generation failed for {note_data['path']}: {e}") + logger.exception("Embedding generation failed for %s", note_data['path']) return []Based on learnings from this repository.
288-290: Add strict parameter to zip() for safety.The
zip()call lacks an explicitstrict=parameter. Ifgenerate_openai_embeddingsreturns fewer vectors than expected (due to API failures or rate limiting), the silent truncation could cause data loss without any error signal. Usingstrict=Trueensures lengths match and raises an error on mismatch.🔎 Proposed fix
chunk_payloads: List[ChunkPayload] = [] - for orig_text, vector in zip(original_chunks, vectors): + for orig_text, vector in zip(original_chunks, vectors, strict=True): if vector: chunk_payloads.append({"text": orig_text, "embedding": vector}) else:
383-383: Consider using conversion flag instead of explicit str().For consistency with modern Python string formatting conventions, consider using the
!sconversion flag instead of explicitstr()call.🔎 Proposed fix
- errors.append(f"{file}: {str(e)}") + errors.append(f"{file}: {e!s}")
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
backends/advanced/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
backends/advanced/pyproject.tomlbackends/advanced/src/advanced_omi_backend/services/memory/config.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/utils/config_utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backends/advanced/pyproject.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
backends/advanced/src/advanced_omi_backend/utils/config_utils.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/services/memory/config.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
backends/advanced/src/advanced_omi_backend/utils/config_utils.pybackends/advanced/src/advanced_omi_backend/services/obsidian_service.pybackends/advanced/src/advanced_omi_backend/services/memory/config.py
🧬 Code graph analysis (1)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py (5)
backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (2)
generate_openai_embeddings(75-91)chunk_text_with_spacy(102-155)backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
load_config_yml(67-83)backends/advanced/src/advanced_omi_backend/utils/model_utils.py (1)
get_model_config(11-45)backends/advanced/src/advanced_omi_backend/utils/config_utils.py (1)
resolve_value(7-20)backends/advanced/src/advanced_omi_backend/services/neo4j_client.py (7)
Neo4jClient(7-32)Neo4jReadInterface(56-58)Neo4jWriteInterface(61-63)close(26-29)reset(31-32)session(20-24)session(42-43)
🪛 Ruff (0.14.10)
backends/advanced/src/advanced_omi_backend/services/obsidian_service.py
110-110: Avoid specifying long messages outside the exception class
(TRY003)
114-114: Avoid specifying long messages outside the exception class
(TRY003)
138-138: Avoid specifying long messages outside the exception class
(TRY003)
284-284: Redundant exception object included in logging.exception call
(TRY401)
288-288: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
318-318: Probable use of insecure hash functions in hashlib: md5
(S324)
363-363: Avoid specifying long messages outside the exception class
(TRY003)
383-383: Use explicit conversion flag
Replace with conversion flag
(RUF010)
407-407: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
450-450: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/services/memory/config.py
83-83: Avoid specifying long messages outside the exception class
(TRY003)
102-102: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
315-315: Use raise without specifying exception name
Remove exception name
(TRY201)
🔇 Additional comments (4)
backends/advanced/src/advanced_omi_backend/utils/config_utils.py (1)
7-20: Well-implemented utility function that resolves code duplication.This extraction addresses the duplication issue previously flagged in past reviews. The implementation cleanly handles both
${VAR}and${VAR:-default}patterns for environment variable resolution.backends/advanced/src/advanced_omi_backend/services/memory/config.py (3)
67-83: Clean implementation for multi-environment config loading.The function appropriately searches standard locations (Docker, relative paths) and provides helpful error messages when config.yml is not found.
115-132: Excellent use of keyword-only parameters and sensible defaults.The function is well-structured with clear parameter organization and appropriate default values for OpenAI-compatible configurations.
135-147: Clean configuration helper with appropriate defaults.The function provides sensible defaults for Qdrant vector store setup, including the standard embedding dimension for OpenAI models.
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
status.py (1)
71-71: Remove redundant import inside function.The
dotenv_valuesfunction is already imported at the module level (Line 21). This mid-function import violates the coding guidelines which state: "ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files."As per coding guidelines, all imports should be at the top of the file.
🔎 Proposed fix
# Handle speaker-recognition profiles if service_name == 'speaker-recognition': - from dotenv import dotenv_values env_file = service_path / '.env'
🧹 Nitpick comments (2)
wizard.py (2)
400-400: Remove unnecessary f-string prefix.This string has no placeholders, so the
fprefix is unnecessary.🔎 Proposed fix
- console.print(f"\n🎊 [bold green]Setup Complete![/bold green]") + console.print("\n🎊 [bold green]Setup Complete![/bold green]")
408-410: Remove unnecessary f-string prefix.This string has no placeholders, so the
fprefix is unnecessary.🔎 Proposed fix
- console.print(f"\n📚 [bold cyan]Obsidian Integration Detected[/bold cyan]") + console.print("\n📚 [bold cyan]Obsidian Integration Detected[/bold cyan]")
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
backends/advanced/docker-compose.ymlservices.pystatus.pywizard.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backends/advanced/docker-compose.yml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use Black formatter with 100-character line length for Python code
Use isort for Python import organization
ALL imports must be at the top of the file after the docstring - never import modules in the middle of functions or files
Group imports in Python files: standard library, third-party, then local imports
Use lazy imports sparingly and only when absolutely necessary for circular import issues in Python
Always raise errors in Python, never silently ignore - use explicit error handling with proper exceptions rather than silent failures
Avoid defensivehasattr()checks in Python - research and understand input/response or class structure instead
Files:
services.pywizard.pystatus.py
🧠 Learnings (1)
📚 Learning: 2025-12-08T23:52:34.959Z
Learnt from: AnkushMalaker
Repo: chronicler-ai/chronicle PR: 178
File: backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py:218-223
Timestamp: 2025-12-08T23:52:34.959Z
Learning: In Python code (chronicle project), prefer logging.exception() inside except blocks to automatically log the full stack trace. When re-raising exceptions, always chain with 'raise ... from e' to preserve the original context; use 'raise ... from None' only if you explicitly want to suppress the context. This improves debuggability across Python files.
Applied to files:
services.pywizard.pystatus.py
🪛 Ruff (0.14.10)
wizard.py
400-400: f-string without any placeholders
Remove extraneous f prefix
(F541)
408-408: f-string without any placeholders
Remove extraneous f prefix
(F541)
status.py
60-60: Local variable dotenv_values referenced before assignment
(F823)
🔇 Additional comments (3)
services.py (1)
77-86: LGTM! Obsidian/Neo4j profile detection implemented correctly.The logic properly reads the environment file, validates the NEO4J_HOST value against placeholders, and conditionally enables the obsidian profile. The implementation follows the established pattern used for HTTPS profile detection above.
Minor note: The empty string
''in the exclusion list on Line 84 is redundant since the conditionif neo4j_hostalready filters out empty strings, but this doesn't affect correctness.wizard.py (1)
390-397: LGTM! Obsidian/Neo4j detection logic is sound.The implementation correctly detects Obsidian configuration by reading the NEO4J_HOST value from the backend .env file and checking against known placeholders. The logic appropriately gates the check to only run when the advanced backend was selected and successfully configured.
status.py (1)
48-67: LGTM! Multiple profile detection implemented correctly.The refactored logic properly accumulates multiple profiles (https and obsidian) and constructs the docker compose command with all applicable
--profileflags. The command construction on Line 67 correctly flattens the profile list into the command arguments.
- Removed the deprecated `validate_memory_config_raw` endpoint and replaced it with a new endpoint that accepts plain text for validation. - Updated the existing `validate_memory_config` endpoint to clarify that it now accepts JSON input. - Adjusted the API call in the frontend to point to the new validation endpoint.
- Updated the health check function to load model configuration from the models registry instead of the root config. - Improved error handling by logging warnings when model configuration loading fails.
Feat/add obsidian 3
Needs neo4j to be enabled, thats disabled by default
new note
client connected to chronicle neo4j
enabled obsidian memory, asked a query, then disabled and asked same query
new chat with obsidian memory disabled
Summary by CodeRabbit
New Features
Enhancements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.