Feat/better reprocess memory #300
Conversation
…290)

- Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general.
- Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection.
- Removed redundant Obsidian and Knowledge Graph configuration checks from services.py, streamlining the command execution process.
- Updated wizard.py to enhance user experience by setting default options for speaker recognition during service selection.
- Improved Neo4j password handling in setup processes, ensuring consistent configuration prompts and feedback.
- Introduced a new cron scheduler for managing scheduled tasks, enhancing the backend's automation capabilities.
- Added new entity annotation features, allowing for corrections and updates to knowledge graph entities directly through the API.
- Added new configuration options for VibeVoice ASR in defaults.yml, including batching parameters for audio processing.
- Updated Docker Compose files to mount the config directory, ensuring access to ASR service configurations.
- Enhanced the VibeVoice transcriber to load configuration settings from defaults.yml, allowing for dynamic adjustments via environment variables.
- Introduced quantization options for model loading in the VibeVoice transcriber, improving performance and flexibility.
- Refactored the speaker identification process to streamline audio handling and improve logging for better debugging.
- Updated documentation to reflect new configuration capabilities and usage instructions for the VibeVoice ASR provider.
- Introduced functions for checking LangFuse configuration in services.py, ensuring proper setup for observability.
- Updated wizard.py to facilitate user input for LangFuse configuration, including options for local and external setups.
- Implemented memory reprocessing logic in memory services to update existing memories based on speaker re-identification.
- Enhanced speaker recognition client to support per-segment identification, improving accuracy during reprocessing.
- Refactored various components to streamline handling of LangFuse parameters and improve overall service management.
Important: Review skipped. Auto reviews are disabled on this repository; check the settings in the CodeRabbit UI.
📝 Walkthrough

This pull request introduces significant enhancements to speaker reprocessing, ASR context-aware transcription, cron-based job scheduling, entity-level annotations with knowledge graph integration, and refined setup workflows. Core changes include a new async cron scheduler with Redis state persistence, speaker re-identification flows with memory reprocessing, per-segment speaker identification mode, hot words and jargon extraction for ASR, entity annotation support, and simplified Neo4j/Knowledge Graph setup in initialization. Multiple timeout reductions are applied across transcription jobs, and the UI expands to support entity editing and comprehensive fine-tuning/cron management.

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches: 🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 13
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (7)
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py (1)
628-680: ⚠️ Potential issue | 🔴 Critical: Duplicate "wait for streaming transcription" block; remove one.

The wait-for-streaming-completion logic is duplicated verbatim. Lines 631–656 and lines 661–680 are identical: same `completion_key`, same `max_wait_streaming = 30`, same polling loop. This doubles the maximum wait time from 30s to 60s unintentionally and is clearly a copy-paste accident. Remove the second block (lines 658–680).
🐛 Proposed fix — remove duplicate block
```diff
     if waited_streaming >= max_wait_streaming:
         logger.warning(
             f"⚠️ Timed out waiting for streaming completion signal for {session_id} "
             f"(waited {max_wait_streaming}s), proceeding with available transcript"
         )
-    # Wait for streaming transcription consumer to complete before reading transcript
-    # This fixes the race condition where conversation job reads transcript before
-    # streaming consumer stores all final results (seen as 24+ second delay in logs)
-    completion_key = f"transcription:complete:{session_id}"
-    max_wait_streaming = 30  # seconds
-    waited_streaming = 0.0
-    while waited_streaming < max_wait_streaming:
-        completion_status = await redis_client.get(completion_key)
-        if completion_status:
-            status_str = completion_status.decode() if isinstance(completion_status, bytes) else completion_status
-            if status_str == "error":
-                logger.warning(f"⚠️ Streaming transcription ended with error for {session_id}, proceeding anyway")
-            else:
-                logger.info(f"✅ Streaming transcription confirmed complete for {session_id}")
-            break
-        await asyncio.sleep(0.5)
-        waited_streaming += 0.5
-
-    if waited_streaming >= max_wait_streaming:
-        logger.warning(
-            f"⚠️ Timed out waiting for streaming completion signal for {session_id} "
-            f"(waited {max_wait_streaming}s), proceeding with available transcript"
-        )
     # Wait for audio_streaming_persistence_job to complete and write MongoDB chunks
```

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (1)
632-654: ⚠️ Potential issue | 🟠 Major: `version_id` is generated but never passed to the processing job.

`version_id` is created at line 633 and returned in the response (line 651), but `enqueue_memory_processing` on line 637 only receives `conversation_id` and `priority` — it never sees `version_id`. The client receives a `version_id` that has no effect on the actual job, which is misleading. Either pass `version_id` to `enqueue_memory_processing` or stop generating/returning it.

backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py (1)
181-196: ⚠️ Potential issue | 🟡 Minor: Segment access assumes objects with `.speaker` and `.text` attributes, not dicts.

Lines 185-186 call `segment.speaker` and `segment.text` directly. This is consistent with `generate_detailed_summary` below (line 281), but the docstring at line 171 describes segments as dicts (`[{"speaker": str, "text": str, ...}]`). Either update the docstring to reflect that these are `SpeakerSegment` objects or add duck-type handling.

wizard.py (1)
850-883: ⚠️ Potential issue | 🟡 Minor: Duplicate step number "3" in the next steps output.

Line 853 shows "3. Or start individual services:" and line 882 shows "3. Check service status:". The second should be "4" and subsequent steps renumbered.
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
214-226: ⚠️ Potential issue | 🟡 Minor: Dead code: debug logging block is unreachable after `raise`.

Lines 221–226 follow a `raise RuntimeError(…) from e` on Line 216, so they will never execute. This appears to be pre-existing, but since this file is being modified in this PR, it's worth fixing. The debug block likely belongs after Line 206 (`data = resp.json()`) on the success path.

🐛 Suggested fix — move debug logging to the success path

```diff
         resp.raise_for_status()
         data = resp.json()
+
+        # DEBUG: Log Deepgram response structure
+        if "results" in data and "channels" in data.get("results", {}):
+            channels = data["results"]["channels"]
+            if channels and "alternatives" in channels[0]:
+                alt = channels[0]["alternatives"][0]
+                logger.debug(f"DEBUG Registry: Deepgram alternative keys: {list(alt.keys())}")
+
     except httpx.ConnectError as e:
         raise ConnectionError(
             ...
         ) from e
     except httpx.HTTPStatusError as e:
         status = e.response.status_code
         raise RuntimeError(
             f"Transcription service '{self._name}' at {url} returned HTTP {status}. "
             f"{'Check your API key.' if status in (401, 403) else ''}"
         ) from e
-
-    # DEBUG: Log Deepgram response structure
-    if "results" in data and "channels" in data.get("results", {}):
-        channels = data["results"]["channels"]
-        if channels and "alternatives" in channels[0]:
-            alt = channels[0]["alternatives"][0]
-            logger.debug(f"DEBUG Registry: Deepgram alternative keys: {list(alt.keys())}")
```

backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py (1)
322-336: ⚠️ Potential issue | 🟡 Minor: `use_per_segment` only affects `identify_provider_segments`, not the full diarization path.

When the code falls through to the `else` branch (lines 337–376, `diarize_identify_match`), the `per_segment`/`min_segment_duration` parameters are not forwarded to the speaker service. The metadata at line 520 will still record `identification_mode: "per_segment"` even though the full diarization path was used without per-segment behavior.

If this is intentional (per-segment only makes sense for pre-diarized segments), consider gating the metadata value on the actual code path taken, not just the config flag.

🧹 Suggested metadata fix

```diff
 transcript_version.metadata["speaker_recognition"] = {
     "enabled": True,
-    "identification_mode": "per_segment" if use_per_segment else "majority_vote",
+    "identification_mode": (
+        "per_segment" if use_per_segment and (transcript_version.segments and not can_run_pyannote)
+        else "majority_vote"
+    ),
     "identified_speakers": list(identified_speakers),
```

backends/advanced/src/advanced_omi_backend/routers/modules/finetuning_routes.py (1)
25-62: ⚠️ Potential issue | 🟠 Major: `annotation_type` query parameter is accepted but ignored in the filter.

The endpoint declares `annotation_type` as a query parameter (line 28, defaulting to `"diarization"`) but the actual database query on line 54 always hard-codes `AnnotationType.DIARIZATION`. The parameter has no effect on behavior.

Either use the parameter in the query or remove it from the signature to avoid misleading callers.

🐛 Proposed fix: use the parameter

```diff
+    # Validate and convert annotation type
+    try:
+        ann_type = AnnotationType(annotation_type) if annotation_type else AnnotationType.DIARIZATION
+    except ValueError:
+        raise HTTPException(status_code=400, detail=f"Unknown annotation type: {annotation_type}")
+
     annotations = await Annotation.find(
-        Annotation.annotation_type == AnnotationType.DIARIZATION,
+        Annotation.annotation_type == ann_type,
         Annotation.processed == True,
     ).to_list()
```
🤖 Fix all issues with AI agents
In `@backends/advanced/src/advanced_omi_backend/cron_scheduler.py`:
- Around line 106-110: run_job_now currently allows concurrent manual and
scheduled runs because it doesn't honor the job's running flag; update
run_job_now to first look up the job's cfg (e.g., job = self.jobs[job_id]; cfg =
job.cfg) and if cfg.running is true, raise a ValueError (or return an error)
indicating the job is already running; if you have a per-job asyncio.Lock or
similar (e.g., cfg.lock), perform the running check and any mutation under that
lock to avoid races, then call await self._execute_job(job_id) as before so you
don't bypass the existing _execute_job behavior.
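A minimal sketch of that guard, where `JobCfg`, its `running`/`lock` fields, and the sleep-based `_execute_job` are illustrative stand-ins for the scheduler's real types, not the PR's actual API:

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class JobCfg:
    # Illustrative stand-in for the scheduler's per-job config
    running: bool = False
    lock: asyncio.Lock = field(default_factory=asyncio.Lock)


class CronScheduler:
    def __init__(self):
        self.jobs: dict[str, JobCfg] = {}

    async def _execute_job(self, job_id: str) -> None:
        await asyncio.sleep(0.05)  # stand-in for the real job body

    async def run_job_now(self, job_id: str) -> None:
        cfg = self.jobs[job_id]
        async with cfg.lock:  # check-and-set under the per-job lock to avoid races
            if cfg.running:
                raise ValueError(f"Job {job_id} is already running")
            cfg.running = True
        try:
            await self._execute_job(job_id)  # still routes through _execute_job
        finally:
            cfg.running = False
```

A second manual trigger issued while the job is mid-run then fails fast with `ValueError` instead of running concurrently.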
- Around line 167-181: The _load_jobs_from_config currently calls
croniter(schedule, now) without validating the schedule string; add a validation
step using croniter.is_valid(schedule) before creating the CronJobConfig and
calling croniter. If croniter.is_valid(schedule) is False, log an error (e.g.
via self.logger.error) including the job_id and bad schedule and skip adding
that job to self.jobs (or mark it disabled), otherwise proceed to compute
next_run with croniter(schedule, now). This mirrors the validation used in
update_job and prevents unhandled exceptions during startup.
- Around line 250-262: The loop currently fire-and-forgets
asyncio.create_task(self._execute_job(job_id)) which risks GC and lost
exceptions; add a tasks container (e.g., self._tasks = set()) in __init__,
change _loop to store each created Task in that set, add a done-callback that
removes the Task from self._tasks and logs any exception (inspect
task.exception() or use task.add_done_callback to call a handler that logs
unhandled exceptions), and ensure shutdown/stop logic awaits or cancels
remaining tasks from self._tasks; update references to _loop, _execute_job, and
__init__ accordingly.
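The tracked-task pattern described above can be sketched as follows; `Scheduler`, `spawn`, and the `errors` list are illustrative names rather than the PR's real symbols:

```python
import asyncio


class Scheduler:
    """Sketch: keep strong references to background tasks and surface their exceptions."""

    def __init__(self):
        self._tasks: set[asyncio.Task] = set()
        self.errors: list[BaseException] = []

    def _on_done(self, task: asyncio.Task) -> None:
        self._tasks.discard(task)  # drop the strong reference once finished
        if not task.cancelled() and task.exception() is not None:
            self.errors.append(task.exception())  # record instead of silently losing it

    def spawn(self, coro) -> asyncio.Task:
        task = asyncio.create_task(coro)
        self._tasks.add(task)  # strong reference prevents mid-flight garbage collection
        task.add_done_callback(self._on_done)
        return task

    async def shutdown(self) -> None:
        pending = list(self._tasks)
        for t in pending:
            t.cancel()
        await asyncio.gather(*pending, return_exceptions=True)
```

Without the set, a bare `asyncio.create_task(...)` result can be garbage-collected while running, and any exception it raises is only reported as "Task exception was never retrieved" at interpreter shutdown, if at all.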
In
`@backends/advanced/src/advanced_omi_backend/routers/modules/knowledge_graph_routes.py`:
- Around line 170-175: The validation currently rejects valid "clear" requests
because it uses truthiness checks on request.name, request.details, and
request.icon (so empty strings are treated as missing); change the guard in the
update handler to check for None explicitly (use "is None" on request.name,
request.details, request.icon) and only raise the HTTPException when all three
are None so that {"details": ""} is allowed to clear the field.
In
`@backends/advanced/src/advanced_omi_backend/services/memory/providers/chronicle.py`:
- Around line 596-606: The current code uses zip(texts_needing_embeddings,
embeddings) which can silently truncate if lengths differ; ensure the embedding
count invariant by either using zip(..., strict=True) when building
text_to_embedding or explicitly check len(embeddings) ==
len(texts_needing_embeddings) after calling
self.llm_provider.generate_embeddings and raise/log a ValueError if they differ
so the exception handler runs; update the block around generate_embeddings,
embeddings, text_to_embedding (and keep behavior consistent with add_memory) to
enforce and surface mismatches instead of silently truncating.
In
`@backends/advanced/src/advanced_omi_backend/services/transcription/context.py`:
- Around line 57-61: The call to registry.get_prompt("asr.hot_words") can raise
KeyError if prompts aren't registered yet; update the code around
get_prompt_registry() / registry.get_prompt(...) to catch KeyError (and
optionally Exception) and set hot_words to an empty string as a safe fallback
(matching the Redis handling pattern), so registry.get_prompt("asr.hot_words")
is wrapped in a try-except and does not propagate the error during startup or
LangFuse outages.
In `@backends/advanced/src/advanced_omi_backend/speaker_recognition_client.py`:
- Around line 277-280: The code currently hardcodes
form_data.add_field("user_id", "1") causing all users to share a single speaker
pool; replace this by computing a stable integer speaker_id from the provided
MongoDB ObjectId instead of "1". Implement a helper (e.g.,
map_user_oid_to_speaker_id(user_id)) referenced from the methods that call
form_data.add_field("user_id", ...) to: 1) look up or create a persistent
mapping in a new MongoDB collection (e.g., speaker_user_map) that stores
{mongo_oid -> integer_id}, or 2) if you prefer deterministic mapping, compute a
non-negative 32-bit hash of the ObjectId and use that as the integer id; ensure
the helper returns an int and raise/log a clear error if user_id is None or
mapping fails. Update all places in speaker_recognition_client.py that currently
add "user_id","1" to call this helper.
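The deterministic-hash variant (option 2 above) can be sketched as follows; `map_user_oid_to_speaker_id` is the hypothetical helper name from the comment, and the persistent-collection variant (option 1) would avoid even the negligible collision risk of hashing:

```python
import hashlib


def map_user_oid_to_speaker_id(user_id: str) -> int:
    """Derive a stable, non-negative 31-bit speaker id from a Mongo ObjectId string."""
    if not user_id:
        raise ValueError("user_id is required to scope the speaker pool")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Take the first 4 bytes and mask off the sign bit to keep a positive int32
    return int.from_bytes(digest[:4], "big") & 0x7FFFFFFF
```

The same ObjectId always maps to the same integer, so enrollments and lookups stay scoped per user without any extra storage.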
In `@backends/advanced/src/advanced_omi_backend/workers/finetuning_jobs.py`:
- Line 173: The current logger.debug call logs user-generated jargon content
(logger.debug(f"Cached jargon for user {user_id}: {jargon[:80]}...")), which can
leak PII; change it to avoid printing any jargon content and instead log a safe
metric (e.g., length or token count) or a redacted indicator. Locate the
logger.debug that references jargon and user_id, remove the substring output of
the jargon variable, and replace it with a message like "Cached jargon for user
{user_id}: length={len(jargon)}" or "cached_jargon=REDACTED" so no user text is
emitted.
- Around line 108-126: Replace the hardcoded user_id=1 when calling
speaker_client.get_speaker_by_name and speaker_client.enroll_new_speaker with
the conversation owner's actual user id from the already-fetched conversation
object (e.g., pass user_id=conversation["user_id"] or
user_id=conversation.user_id depending on how conversation is represented) so
lookups and enrollments are scoped to the correct user.
In `@extras/asr-services/pyproject.toml`:
- Around line 118-119: Remove "langfuse>=3.13.0,<4.0" from the vibevoice
dependency group in pyproject.toml (it is not imported or used by vibevoice);
either delete that entry entirely or move it into a new optional group named
observability (e.g., observability = ["langfuse>=3.13.0,<4.0"]) so users can opt
into LangFuse without it being a hard dependency; leave "bitsandbytes>=0.43.0"
in the vibevoice group as-is for quantization support.
In `@extras/speaker-recognition/webui/src/services/speakerIdentification.ts`:
- Around line 280-287: In processWithDiarizeIdentifyMatch, fix the inconsistent
field mapping for speaker IDs: replace usages of the nonexistent
segment.identified_id with segment.speaker_id inside the speakers
mapping/processing logic (the same field used elsewhere for speaker_id and
consistent with the backend /v1/diarize-identify-match response) so speaker
identification no longer falls back to segment.speaker.
In `@services.py`:
- Around line 410-412: The pre-filtering logic excludes LangFuse from the --all
run before _ensure_langfuse_env() can create its .env; update the filtering so
LangFuse is included when the backend indicates it is enabled (or call
_ensure_langfuse_env() prior to filtering). Concretely, modify the service list
pre-filter (used when assembling services for start --all, which currently calls
check_service_configured(s)) to treat "langfuse" as configured if
backend_has_langfuse_enabled() (or similar backend flag), or invoke
_ensure_langfuse_env() for "langfuse" before running check_service_configured;
ensure you reference the existing functions _ensure_langfuse_env and
check_service_configured and keep the rest of the filtering behavior unchanged.
In `@wizard.py`:
- Around line 745-756: Replace the plaintext prompt using console.input in the
wizard loop so the Neo4j password is masked: stop calling console.input("...")
and instead call the shared masked prompt utility (e.g., prompt_password from
setup_utils or the wrapper method self.prompt_password if available) to read
neo4j_password, preserve the default behavior (use "neo4jpassword" when empty or
on EOF), and keep the length check on neo4j_password; also change the
console.print(f"Using default password") to a plain string literal to avoid the
Ruff F541 f-string warning. Reference symbols: neo4j_password, console.input,
prompt_password (or getpass.getpass), and the EOFError handling in the same
while True loop.
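A sketch of the masked prompt using stdlib `getpass`; the `reader` parameter is an injection point for testing, and the wizard's exact length-check behavior is omitted since it isn't shown here:

```python
import getpass


def prompt_neo4j_password(default: str = "neo4jpassword", reader=getpass.getpass) -> str:
    """Masked Neo4j password prompt with default-on-empty/EOF behavior."""
    try:
        value = reader("Neo4j password [neo4jpassword]: ").strip()
    except EOFError:
        value = ""
    if not value:
        print("Using default password")  # plain string, no f-prefix (avoids Ruff F541)
        return default
    return value
```

Unlike `console.input`, `getpass.getpass` suppresses terminal echo, so the password never appears on screen or in scrollback.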
🟡 Minor comments (16)
extras/asr-services/scripts/convert_to_ct2.py-63-66 (1)

63-66: ⚠️ Potential issue | 🟡 Minor: Fix unused variable and extraneous f-string prefix (static analysis).

Line 64: `result` is assigned but never used. Line 66: f-string has no placeholders.

Proposed fix

```diff
-    result = subprocess.run(cmd, check=True)
+    subprocess.run(cmd, check=True)
     print()
-    print(f"Conversion successful!")
+    print("Conversion successful!")
```

services.py-123-130 (1)
123-130: ⚠️ Potential issue | 🟡 Minor: Missing timeout on `subprocess.run`; could hang indefinitely.

Every other `subprocess.run` call in this file uses `timeout=120`. This call to `init.py` has no timeout, so a misbehaving init script would block the entire startup flow forever.

Proposed fix

```diff
-    result = subprocess.run(cmd, cwd=service_path)
+    result = subprocess.run(cmd, cwd=service_path, timeout=120)
```

extras/speaker-recognition/webui/src/services/speakerIdentification.ts-290-307 (1)
290-307: ⚠️ Potential issue | 🟡 Minor: Confidence summary doesn't account for segments below 0.4.

Segments with `confidence < 0.4` (including the `0` default for unidentified speakers) are counted in `total_segments` but not in any of the three buckets. This means `total_segments != high + medium + low`, which could confuse downstream consumers or dashboard displays.

Consider either adding a fourth bucket (e.g., `unidentified` or `very_low_confidence`) or adjusting `low_confidence` to cover `< 0.6` (i.e., all remaining segments).

backends/advanced/src/advanced_omi_backend/config.py-201-208 (1)
201-208: ⚠️ Potential issue | 🟡 Minor: Architectural mismatch: `per_segment_speaker_id` reads from `backend.speaker_recognition` but `speaker_recognition` is top-level in defaults.yml.

The code reads from `backend.speaker_recognition` (via `get_backend_config('speaker_recognition')`), but in defaults.yml, `speaker_recognition` is a top-level section (line 262), not nested under `backend`. This creates two separate config namespaces:

- Top-level `speaker_recognition` in defaults.yml contains existing settings like `enabled`, `service_url`, `timeout`, `similarity_threshold`, etc., but is never read for this feature.
- `backend.speaker_recognition` is where the code reads/writes, but doesn't exist in defaults.yml.

The code works (defaults to `False` and persists correctly after UI changes), but the architectural inconsistency means adding `per_segment_speaker_id` to the top-level `speaker_recognition` in defaults.yml won't be picked up.

Consider moving `per_segment_speaker_id` to the top-level `speaker_recognition` section in defaults.yml and using the appropriate service config loader, or explicitly define `backend.speaker_recognition` in defaults.yml for backend-specific speaker settings.

backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py-786-789 (1)
786-789: ⚠️ Potential issue | 🟡 Minor: Setting `diarization_source = "provider"` when the provider didn't diarize is misleading.

When `not has_words and has_segments and not provider_has_diarization` (the segment-only fallback from lines 738-742), this condition still sets `diarization_source = "provider"` at line 789. This is semantically incorrect: the segments exist but weren't produced by a diarizing provider.

If this is intentional to signal the downstream speaker job to use segment-based identification, consider using a distinct value (e.g., `"segments"`) to avoid confusion in logs, debugging, and future maintenance.

backends/advanced/src/advanced_omi_backend/services/transcription/context.py-64-71 (1)
64-71: ⚠️ Potential issue | 🟡 Minor: Silent `except: pass` hides Redis connectivity problems.

When Redis is misconfigured (wrong URL, auth failure, etc.), the silent `pass` on line 71 makes it impossible to diagnose why user jargon is never applied. At minimum, add a debug-level log.

🔧 Proposed fix

```diff
 except Exception:
-    pass  # Redis unavailable → skip dynamic jargon
+    logger.debug("Redis unavailable — skipping dynamic jargon for user %s", user_id, exc_info=True)
```

wizard.py-868-875 (1)
868-875: ⚠️ Potential issue | 🟡 Minor: Remove extraneous `f` prefix on line 874.

As flagged by Ruff (F541), this f-string has no placeholders:

🔧 Proposed fix

```diff
-    console.print(f"[bold cyan]Prompt Management:[/bold cyan] Edit AI prompts at your LangFuse instance:")
+    console.print("[bold cyan]Prompt Management:[/bold cyan] Edit AI prompts at your LangFuse instance:")
```

backends/advanced/init.py-447-464 (1)
447-464: ⚠️ Potential issue | 🟡 Minor: Remove extraneous `f` prefix on f-strings without placeholders.

Lines 452 and 470 use f-strings with no interpolation variables, as flagged by Ruff (F541).

🔧 Proposed fix

```diff
-    self.console.print(f"[green]✅[/green] Neo4j: password configured via wizard")
+    self.console.print("[green]✅[/green] Neo4j: password configured via wizard")
```

And in `setup_obsidian` (line 470):

```diff
-    self.console.print(f"[green]✅[/green] Obsidian: enabled (configured via wizard)")
+    self.console.print("[green]✅[/green] Obsidian: enabled (configured via wizard)")
```

backends/advanced/webui/src/components/knowledge-graph/EntityCard.tsx-75-96 (1)
75-96: ⚠️ Potential issue | 🟡 Minor: No user-facing error feedback on save failure.

When `updateEntity` throws (Line 91), the error is only logged to `console.error`. The user stays in editing mode (good for retry) but receives no visible indication of what went wrong. Consider adding a brief error state or toast notification.

💡 Minimal error state approach

```diff
 const [saving, setSaving] = useState(false)
+const [error, setError] = useState<string | null>(null)
 // ...
 const handleSave = async (e: React.MouseEvent) => {
   e.stopPropagation()
+  setError(null)
   // ...
   try {
     setSaving(true)
     const response = await knowledgeGraphApi.updateEntity(entity.id, updates)
     setIsEditing(false)
     onEntityUpdated?.(response.data.entity)
   } catch (err) {
     console.error('Failed to update entity:', err)
+    setError('Failed to save changes')
   } finally {
     setSaving(false)
   }
 }
```

Then render the error near the save/cancel buttons or below the textarea.

backends/advanced/src/advanced_omi_backend/routers/modules/finetuning_routes.py-376-384 (1)
376-384: ⚠️ Potential issue | 🟡 Minor: Chain the exception with `raise ... from` for proper traceback.

When re-raising as `HTTPException` from a caught `ValueError`, use `raise ... from` to preserve the exception chain.

🧹 Proposed fix

```diff
 try:
     requested_type = AnnotationType(annotation_type)
 except ValueError:
-    raise HTTPException(status_code=400, detail=f"Unknown annotation type: {annotation_type}")
+    raise HTTPException(status_code=400, detail=f"Unknown annotation type: {annotation_type}") from None
```

extras/asr-services/providers/vibevoice/transcriber.py-121-131 (1)
121-131: ⚠️ Potential issue | 🟡 Minor: Inline comment contradicts actual priority order.

Line 121 says `"config.yml > env vars > hardcoded defaults"` but the code evaluates `os.getenv(...) or config.get(...)`, which gives env vars priority over config. The module docstring at lines 11–12 and line 82 correctly document the priority as env vars > config.yml > defaults.yml > hardcoded.

🧹 Fix the comment

```diff
-# Batching config: config.yml > env vars > hardcoded defaults
+# Batching config: env vars > config.yml > defaults.yml > hardcoded defaults
```

backends/advanced/src/advanced_omi_backend/services/memory/providers/chronicle.py-625-636 (1)
625-636: ⚠️ Potential issue | 🟡 Minor: Outer `except` fallback: remove extraneous `f` prefix and prefer `logging.exception`.

Line 631 is an f-string with no placeholders. Also, using `memory_logger.exception(...)` instead of `memory_logger.error(...)` on line 626 would automatically include the traceback.

🧹 Proposed fix

```diff
 except Exception as e:
-    memory_logger.error(
+    memory_logger.exception(
         f"❌ Reprocess memory failed for {source_id}: {e}"
     )
     # Fall back to normal extraction on any unexpected error
     memory_logger.info(
-        f"🔄 Falling back to normal extraction after reprocess error"
+        "🔄 Falling back to normal extraction after reprocess error"
     )
```

backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py-282-300 (1)
282-300: ⚠️ Potential issue | 🟡 Minor: Remove extraneous `f` prefixes from plain strings.

Lines 285-286 are f-strings with no interpolation placeholders. The `f` prefix is unnecessary.

🧹 Fix

```diff
 logger.warning(
-    f"🎤 No word timestamps available, provider didn't diarize, "
-    f"and no existing segments to identify."
+    "🎤 No word timestamps available, provider didn't diarize, "
+    "and no existing segments to identify."
 )
```

extras/asr-services/providers/vibevoice/transcriber.py-50-72 (1)
50-72: ⚠️ Potential issue | 🟡 Minor: `OmegaConf.to_container` will fail on a plain `dict` if config keys are missing.

If `"asr_services"` or `"vibevoice"` keys don't exist in the merged config, `merged.get("asr_services", {})` returns a plain Python `dict`, not an OmegaConf node. Calling `OmegaConf.to_container({}, resolve=True)` on a plain dict raises a `ValueError`. The outer `try/except` catches it gracefully, but it masks a valid config-missing scenario with a warning about "Failed to load config."

🧹 Proposed fix: guard the to_container call

```diff
+from omegaconf import DictConfig
+
 asr_config = merged.get("asr_services", {}).get("vibevoice", {})
-resolved = OmegaConf.to_container(asr_config, resolve=True)
+if isinstance(asr_config, DictConfig):
+    resolved = OmegaConf.to_container(asr_config, resolve=True)
+else:
+    resolved = asr_config if isinstance(asr_config, dict) else {}
 logger.info(f"Loaded vibevoice config: {resolved}")
 return resolved
```

backends/advanced/webui/src/pages/Finetuning.tsx-171-178 (1)
171-178: ⚠️ Potential issue | 🟡 Minor: `handleReattach` relies on the catch block for its primary flow, an anti-pattern.

This function expects the API call to fail and uses the error handler to display "Coming soon" as a `successMessage`. This conflates error handling with intentional UI messaging, and if the API is later implemented and succeeds, the success path is a silent no-op (no feedback to the user).

Proposed fix

If the feature isn't implemented yet, communicate that directly without making the API call:

```diff
 const handleReattach = async () => {
-  try {
-    await finetuningApi.reattachOrphanedAnnotations()
-  } catch (err: any) {
-    const detail = err.response?.data?.detail || 'Reattach functionality coming soon'
-    setSuccessMessage(detail)
-  }
+  // TODO: Implement once the backend endpoint is ready
+  setSuccessMessage('Reattach functionality coming soon')
 }
```

backends/advanced/src/advanced_omi_backend/cron_scheduler.py-78-89 (1)
78-89: ⚠️ Potential issue | 🟡 Minor: `start()` will crash if Redis is unreachable; consider handling the connection error gracefully.

`aioredis.from_url` creates a lazy connection pool, so it won't fail here, but the subsequent `_restore_state` call (line 85) will raise if Redis is down. Since the scheduler can function without persisted state (it recomputes `next_run` from config), consider catching the Redis error in `_restore_state` at the top level so startup isn't blocked.

Currently `_restore_state` does catch per-job errors, but a connection-level failure (e.g., `ConnectionRefusedError`) from the first `.get()` call will propagate up and prevent the scheduler from starting entirely.

Proposed fix

```diff
 async def _restore_state(self) -> None:
     """Restore last_run / next_run from Redis."""
     if not self._redis:
         return
+    try:
+        await self._redis.ping()
+    except Exception:
+        logger.warning("Redis unreachable — skipping state restore")
+        return
     for job_id, cfg in self.jobs.items():
```
🧹 Nitpick comments (28)
extras/asr-services/scripts/convert_to_ct2.py (1)
38-41: `sys.exit()` inside `convert_model` prevents clean reuse as a library function.

If `convert_model` is ever called programmatically (e.g., from another script), `sys.exit` will raise `SystemExit` instead of a catchable exception. Consider raising `ValueError`/`RuntimeError` and reserving `sys.exit` for `main()`.

extras/langfuse/docker-compose.yml (1)
14-60: Consider documenting that these defaults are for local development only.

The environment block contains numerous default credentials (`postgres:postgres`, `minio:miniosecret`, `clickhouse:clickhouse`, `myredissecret`). While this is standard for a local dev compose file in `extras/`, a brief comment at the top of the file clarifying this is not production-ready would help prevent accidental misuse.

extras/speaker-recognition/webui/src/services/deepgram.ts (1)
91-93: Clean simplification — consider removing now-unused interface fields.

The endpoint change looks good. However, `mode` (line 27) and `minDuration` (line 28) in `DeepgramTranscriptionOptions` are no longer consumed by the function logic, and `mode` is still set in `DEFAULT_DEEPGRAM_OPTIONS` (line 76). These fields should be cleaned up to avoid confusion, especially since the interface is exported and may mislead external consumers about what parameters are actually used.

services.py (2)
143-151: Langfuse branch is identical to every other branch.

All three paths (`langfuse`, `backend`, `else`) return the same `(service_path / '.env').exists()`. If this was added in anticipation of future differentiation, a brief comment would help; otherwise it's dead branching.

extras/speaker-recognition/webui/src/services/speakerIdentification.ts (2)
310-312: Unguarded `error.message` access in catch block.

If the caught value isn't an `Error` instance, `error.message` will be `undefined`, producing `"Hybrid processing failed: undefined"`. The outer `processAudio` handler (Line 114) correctly uses `error instanceof Error ? error.message : 'Unknown error'`; apply the same pattern here for consistent error messages.

This same issue also exists in the other private method catch blocks (Lines 219, 371, 445, 532).

Proposed fix

```diff
 } catch (error) {
-  throw new Error(`Hybrid processing failed: ${error.message}`)
+  throw new Error(`Hybrid processing failed: ${error instanceof Error ? error.message : 'Unknown error'}`)
 }
```
`246-275`: **Duplicated FormData construction and API call for `/v1/diarize-identify-match`.**

The FormData building (file, transcript_data, optional params) and the `apiService.post('/v1/diarize-identify-match', ...)` call are nearly identical between `processWithHybrid` (Lines 246-275) and `processWithDiarizeIdentifyMatch` (Lines 467-497). Extracting a shared helper (e.g., `callDiarizeIdentifyMatch(audioFile, transcriptData, options)`) would reduce duplication and ensure both paths stay in sync — including the field mapping and timeout values that currently diverge.

`backends/advanced/src/advanced_omi_backend/config.py` (1)
`236-240`: **Docstring is stale — does not mention `per_segment_speaker_id`.**

The docstring on line 217 mentions `always_persist_enabled` and/or `use_provider_segments` but doesn't cover the new `per_segment_speaker_id` parameter.

📝 Suggested docstring update:

```diff
 """
 Save miscellaneous settings to config.yml using OmegaConf.

 Args:
-    settings: Dict with always_persist_enabled and/or use_provider_segments
+    settings: Dict with always_persist_enabled, use_provider_segments, and/or per_segment_speaker_id

 Returns:
     True if saved successfully, False otherwise
 """
```

`backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py` (1)
`800-854`: **Job chaining for speaker → memory → title/summary is well-structured.**

The dependency chain correctly ensures sequential execution. Timeouts are reasonable for the workloads (20 min speakers, 30 min memory, 5 min title/summary).

One nit: job IDs here use `conversation_id[:12]` while `reprocess_transcript` (lines 493, 517, 539, 560) uses `[:8]`. Consider making this consistent to avoid confusion when debugging job queues.

`backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py` (1)
`162-244`: **Good consolidation of title + summary into a single LLM call.**

The combined approach reduces latency and cost by halving LLM calls. The parsing at lines 225-230 is simple but adequate given the fallbacks at lines 232-233 and the exception handler at lines 237-244.

One minor robustness note: `line.replace("Title:", "")` replaces all occurrences of `"Title:"` in the line, not just the prefix. Using `removeprefix` (Python 3.9+) would be more precise:

♻️ Optional: use removeprefix for safer parsing

```diff
 if line.startswith("Title:"):
-    title = line.replace("Title:", "").strip().strip('"').strip("'")
+    title = line.removeprefix("Title:").strip().strip('"').strip("'")
 elif line.startswith("Summary:"):
-    summary = line.replace("Summary:", "").strip().strip('"').strip("'")
+    summary = line.removeprefix("Summary:").strip().strip('"').strip("'")
```

`backends/advanced/src/advanced_omi_backend/services/transcription/context.py` (1)
`64-69`: **New Redis connection created and destroyed per call.**

`aioredis.from_url()` creates a fresh connection pool on every invocation. If `gather_transcription_context` is called for each transcription request, this adds unnecessary overhead. Consider accepting an optional Redis client parameter or using a module-level shared pool.

`backends/advanced/src/advanced_omi_backend/services/knowledge_graph/service.py` (1)
`538-574`: **Use explicit `Optional[str]` (or `str | None`) for parameters defaulting to `None`.**

Ruff (RUF013) correctly flags that PEP 484 prohibits implicit `Optional`. Using `str = None` is ambiguous — it should be `Optional[str] = None` or `str | None = None` to match the rest of this file's conventions and the type checker's expectations.

Proposed fix:

```diff
 async def update_entity(
     self,
     entity_id: str,
     user_id: str,
-    name: str = None,
-    details: str = None,
-    icon: str = None,
+    name: Optional[str] = None,
+    details: Optional[str] = None,
+    icon: Optional[str] = None,
 ) -> Optional[Entity]:
```

`backends/advanced/src/advanced_omi_backend/app_factory.py` (1)
`339-347`: **Shutdown may instantiate a new scheduler if startup failed; use `logging.exception` for traceback.**

Two minor points:

- If the startup block failed (Line 238), `get_scheduler()` in shutdown will create a brand-new `CronScheduler` just to call `stop()` on it. Consider guarding with a check or storing a reference.
- Line 347: `application_logger.error(...)` swallows the traceback. Use `application_logger.exception(...)` instead (per Ruff TRY400).

Proposed fix:

```diff
 # Shutdown cron scheduler
 try:
     from advanced_omi_backend.cron_scheduler import get_scheduler

     scheduler = get_scheduler()
-    await scheduler.stop()
-    application_logger.info("Cron scheduler stopped")
+    if scheduler._running:
+        await scheduler.stop()
+        application_logger.info("Cron scheduler stopped")
 except Exception as e:
-    application_logger.error(f"Error stopping cron scheduler: {e}")
+    application_logger.exception(f"Error stopping cron scheduler: {e}")
```

`backends/advanced/webui/src/pages/System.tsx` (1)
`179-191`: **State replacement may drop new fields not yet returned by older backends.**

`setMiscSettings(response.data.settings)` replaces the entire state object. If the backend hasn't been updated to return `per_segment_speaker_id`, it will be missing from state, and `saveMiscSettings` will subsequently omit it. Consider merging with defaults:

♻️ Suggested defensive merge

```diff
 const response = await systemApi.getMiscSettings()
 if (response.data.status === 'success') {
-    setMiscSettings(response.data.settings)
+    setMiscSettings(prev => ({ ...prev, ...response.data.settings }))
 }
```

`backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py` (1)
`360-448`: **Well-implemented reprocess actions method with proper fallback chain.**

The prompt resolution priority (custom → registry → hardcoded), JSON response format enforcement, and separate `JSONDecodeError` handling are all solid.

One minor improvement: consider using `memory_logger.exception()` instead of `memory_logger.error()` in the except blocks (Lines 444, 447) to capture stack traces, which is especially valuable for debugging a new feature path.

♻️ Suggested logging improvement

```diff
 except json.JSONDecodeError as e:
-    memory_logger.error(f"Reprocess LLM returned invalid JSON: {e}")
+    memory_logger.exception(f"Reprocess LLM returned invalid JSON: {e}")
     return {}
 except Exception as e:
-    memory_logger.error(f"propose_reprocess_actions failed: {e}")
+    memory_logger.exception(f"propose_reprocess_actions failed: {e}")
     return {}
```

`backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py` (2)
`416-526`: **Robust fallback chain in `_process_speaker_reprocess`.**

Each step that might fail (no active version, no source_version_id, source version not found, empty diff) gracefully falls back to normal `add_memory`. The integration with `memory_service.reprocess_memory` properly passes the computed diff and previous transcript.

One minor observation: the function has no return type annotation. Consider adding `-> Tuple[bool, List[str]]` for clarity since the docstring (Line 444) already describes the return type.

🔧 Add return type annotation

```diff
+from typing import Tuple
+
 async def _process_speaker_reprocess(
     memory_service,
     conversation_model,
     full_conversation: str,
     client_id: str,
     conversation_id: str,
     user_id: str,
     user_email: str,
-):
+) -> Tuple[bool, List[str]]:
```
`203-237`: **`enqueue_memory_processing` doesn't support setting the `trigger` metadata needed for the reprocess pathway.**

The trigger is read from `current_rq_job.meta.get("trigger")` (line 208), but `enqueue_memory_processing` doesn't accept or forward a `meta` dict. When jobs are enqueued through this helper, the reprocess pathway at line 216 is never activated.

Callers requiring `"reprocess_after_speaker"` behavior must use direct `memory_queue.enqueue()` calls with appropriate `meta` (as shown at line 822). Consider adding an optional `meta` parameter to `enqueue_memory_processing` for consistency:

♻️ Suggested refactor

```diff
 def enqueue_memory_processing(
     conversation_id: str,
     priority: JobPriority = JobPriority.NORMAL,
+    meta: dict = None,
 ):
```

Then forward it in the enqueue call:

```diff
 job = memory_queue.enqueue(
     process_memory_job,
     conversation_id,
     job_timeout=timeout_mapping.get(priority, 1800),
     result_ttl=JOB_RESULT_TTL,
     job_id=f"memory_{conversation_id[:8]}",
     description=f"Process memory for conversation {conversation_id[:8]}",
+    meta=meta or {},
 )
```

`backends/advanced/src/advanced_omi_backend/routers/modules/annotation_routes.py` (1)
`309-387`: **Entity annotation creation and Neo4j update look good; minor duplication in `update_kwargs` construction.**

The `update_kwargs` dict construction (Lines 362–366) duplicates the pattern at Lines 275–279 in `update_annotation_status`. Consider extracting a small helper if more entity fields are added in the future, but this is fine for now with only two fields.

One note: the `entity_field` validation on Line 328 is a good safeguard. If `EntityAnnotationCreate` ever gains more allowed fields, remember to update this check as well.

`backends/advanced/src/advanced_omi_backend/routers/modules/knowledge_graph_routes.py` (1)
`198-218`: **Annotation creation for changed fields — consider skipping annotations when the value didn't actually change.**

Line 203 checks `if new_value is not None`, but doesn't verify the value actually differs from the existing one. If a user submits the same name, an annotation with `original_text == corrected_text` is created. This pollutes the jargon/finetuning pipeline with no-op annotations.

♻️ Add equality check before creating annotation

```diff
 for field in ("name", "details"):
     new_value = getattr(request, field)
     if new_value is not None:
         old_value = getattr(existing, field) or ""
+        if new_value == old_value:
+            continue
         annotation = Annotation(
```

`backends/advanced/src/advanced_omi_backend/models/annotation.py` (1)
`166-175`: **Consider constraining `entity_field` to known values.**

`entity_field` is typed as a free-form `str` with only a comment hinting at `"name"` or `"details"`. If these are the only valid values, a `Literal["name", "details"]` type (or a small enum) would catch invalid input at the API boundary and make the contract explicit.

This is not urgent — the comment is sufficient for now.

♻️ Optional: use Literal for entity_field

```diff
+from typing import Literal
+
 class EntityAnnotationCreate(BaseModel):
     """Create entity annotation request."""

     entity_id: str
-    entity_field: str  # "name" or "details"
+    entity_field: Literal["name", "details"]
     original_text: str
     corrected_text: str
```

`backends/advanced/src/advanced_omi_backend/services/memory/providers/chronicle.py` (1)
`479-510`: **`previous_transcript` parameter is unused.**

The `previous_transcript` argument is declared in the signature but never referenced in the method body. If it's reserved for future use, consider documenting that explicitly or removing it until needed.

`backends/advanced/src/advanced_omi_backend/routers/modules/finetuning_routes.py` (2)
`230-258`: **Status endpoint loads all annotations into memory — potential scalability concern.**

`get_finetuning_status` calls `.to_list()` for every `AnnotationType`, loading all annotation documents into memory. With a growing annotation corpus this will degrade. Consider using aggregation pipelines or `count()` queries for the status endpoint.

This is fine for an admin-only endpoint at current scale but worth keeping in mind.
`413-422`: **Orphaned annotations are deleted one-by-one; consider bulk delete for efficiency.**

The loop at lines 418–419 calls `await a.delete()` individually for each orphaned annotation. For large orphan sets, a bulk delete via `Annotation.find(...).delete()` or `delete_many` would be significantly faster.

♻️ Bulk delete example

```diff
 for ann_type, annotations in annotations_by_type.items():
     orphaned = [a for a in annotations if a.conversation_id in orphaned_conv_ids]
-    for a in orphaned:
-        await a.delete()
+    if orphaned:
+        orphaned_ids = [a.id for a in orphaned]
+        await Annotation.find({"_id": {"$in": orphaned_ids}}).delete()
     if orphaned:
         deleted_by_type[ann_type.value] = len(orphaned)
         total_deleted += len(orphaned)
```

`backends/advanced/src/advanced_omi_backend/cron_scheduler.py` (1)
`269-277`: **Singleton is not thread-safe, but acceptable for single-process asyncio.**

In a multi-worker deployment (e.g., multiple Uvicorn workers), each process gets its own singleton. This means multiple workers would each run the scheduler independently, potentially executing jobs multiple times. If this service scales beyond a single worker, a distributed lock (e.g., Redis-based) should be added.
`backends/advanced/src/advanced_omi_backend/workers/finetuning_jobs.py` (3)
`155-186`: **`User.find_all().to_list()` loads all users into memory — will not scale.**

For a growing user base, this will consume increasing memory and potentially time out. Consider using an async iterator/cursor to process users in batches.

Proposed fix:

```diff
-    users = await User.find_all().to_list()
+    # Process users in batches to avoid loading all into memory
+    batch_size = 100
+    skip = 0
+    users_batch = await User.find_all().skip(skip).limit(batch_size).to_list()
+    while users_batch:
+        for user in users_batch:
+            user_id = str(user.id)
+            try:
+                jargon = await _extract_jargon_for_user(user_id)
+                if jargon:
+                    await redis_client.set(f"asr:jargon:{user_id}", jargon, ex=JARGON_CACHE_TTL)
+                    processed += 1
+                    logger.debug(f"Cached jargon for user {user_id}")
+                else:
+                    skipped += 1
+            except Exception as e:
+                logger.error(f"Jargon extraction failed for user {user_id}: {e}")
+                errors += 1
+        skip += batch_size
+        users_batch = await User.find_all().skip(skip).limit(batch_size).to_list()
```
`164-164`: **New Redis client created per job invocation — consider reusing the scheduler's connection.**

`run_asr_jargon_extraction_job` creates its own `aioredis.from_url` on every run. The `CronScheduler` already maintains a Redis connection (`self._redis`). Passing it through or using a shared connection factory would reduce connection churn.
`189-235`: **No timeout on the LLM call — `async_generate` could hang indefinitely.**

Line 227 calls `await async_generate(prompt_template)` with no timeout. If the LLM backend is slow or unresponsive, this will block the jargon extraction for the current user indefinitely, stalling the entire job (since users are processed sequentially).

Consider wrapping with `asyncio.wait_for`:

Proposed fix:

```diff
-    result = await async_generate(prompt_template)
+    try:
+        result = await asyncio.wait_for(async_generate(prompt_template), timeout=60)
+    except asyncio.TimeoutError:
+        logger.warning(f"Jargon extraction LLM call timed out for user {user_id}")
+        return None
```

(Requires `import asyncio` at the top of the file.)

`backends/advanced/webui/src/pages/Finetuning.tsx` (2)
`379-437`: **IIFE in JSX reduces readability — consider extracting to a named component or `useMemo`.**

The immediately-invoked function expression computes `totalOrphaned` and renders a complex block. Extracting this into a small named component (e.g., `<AnnotationHeader>`) or computing `totalOrphaned` with `useMemo` outside the JSX would improve readability and make the render tree easier to follow.
`190-220`: **`handleRunNow` shows "nothing to do" via `setError` — consider using a neutral/warning state instead.**

Lines 201–203 display a "no work" message (e.g., "No annotations ready for training") through the red error banner. This isn't really an error condition — it's informational. Using `setSuccessMessage` or a dedicated warning state would be more appropriate from a UX perspective.
```python
    async def run_job_now(self, job_id: str) -> dict:
        """Manually trigger a job regardless of schedule."""
        if job_id not in self.jobs:
            raise ValueError(f"Unknown cron job: {job_id}")
        return await self._execute_job(job_id)
```
**`run_job_now` doesn't check `cfg.running` — allows overlapping manual + scheduled executions of the same job.**

If a job is already running (triggered by the scheduler loop), calling `run_job_now` will execute it concurrently. The `_loop` guards against this with `cfg.running`, but `run_job_now` does not.

Proposed fix:

```diff
 async def run_job_now(self, job_id: str) -> dict:
     """Manually trigger a job regardless of schedule."""
     if job_id not in self.jobs:
         raise ValueError(f"Unknown cron job: {job_id}")
+    if self.jobs[job_id].running:
+        return {"error": f"Job '{job_id}' is already running"}
     return await self._execute_job(job_id)
```
return await self._execute_job(job_id)🧰 Tools
🪛 Ruff (0.14.14)
[warning] 109-109: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/cron_scheduler.py` around lines
106 - 110, run_job_now currently allows concurrent manual and scheduled runs
because it doesn't honor the job's running flag; update run_job_now to first
look up the job's cfg (e.g., job = self.jobs[job_id]; cfg = job.cfg) and if
cfg.running is true, raise a ValueError (or return an error) indicating the job
is already running; if you have a per-job asyncio.Lock or similar (e.g.,
cfg.lock), perform the running check and any mutation under that lock to avoid
races, then call await self._execute_job(job_id) as before so you don't bypass
the existing _execute_job behavior.
```python
    def _load_jobs_from_config(self) -> None:
        """Read cron_jobs section from config.yml."""
        cfg = load_config()
        cron_section = cfg.get("cron_jobs", {})

        for job_id, job_cfg in cron_section.items():
            schedule = str(job_cfg.get("schedule", "0 * * * *"))
            now = datetime.now(timezone.utc)
            self.jobs[job_id] = CronJobConfig(
                job_id=job_id,
                enabled=bool(job_cfg.get("enabled", False)),
                schedule=schedule,
                description=str(job_cfg.get("description", "")),
                next_run=croniter(schedule, now).get_next(datetime),
            )
```
**No validation that the cron schedule from config is valid.**

`_load_jobs_from_config` passes the schedule string directly to `croniter(schedule, now)` on line 180. If a user has a malformed cron expression in `config.yml`, this will throw an unhandled exception and prevent the scheduler from starting. `update_job` (line 126) already validates with `croniter.is_valid()` — apply the same check here.

Proposed fix:

```diff
 for job_id, job_cfg in cron_section.items():
     schedule = str(job_cfg.get("schedule", "0 * * * *"))
+    if not croniter.is_valid(schedule):
+        logger.warning(f"Invalid cron expression for job '{job_id}': {schedule} — skipping")
+        continue
     now = datetime.now(timezone.utc)
```
now = datetime.now(timezone.utc)🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/cron_scheduler.py` around lines
167 - 181, The _load_jobs_from_config currently calls croniter(schedule, now)
without validating the schedule string; add a validation step using
croniter.is_valid(schedule) before creating the CronJobConfig and calling
croniter. If croniter.is_valid(schedule) is False, log an error (e.g. via
self.logger.error) including the job_id and bad schedule and skip adding that
job to self.jobs (or mark it disabled), otherwise proceed to compute next_run
with croniter(schedule, now). This mirrors the validation used in update_job and
prevents unhandled exceptions during startup.
```python
    async def _loop(self) -> None:
        """Main scheduler loop – checks every 30s for due jobs."""
        while self._running:
            try:
                now = datetime.now(timezone.utc)
                for job_id, cfg in self.jobs.items():
                    if not cfg.enabled or cfg.running:
                        continue
                    if cfg.next_run and now >= cfg.next_run:
                        asyncio.create_task(self._execute_job(job_id))
            except Exception as e:
                logger.error(f"Error in cron scheduler loop: {e}", exc_info=True)
            await asyncio.sleep(30)
```
**Fire-and-forget `asyncio.create_task` — reference must be stored to prevent GC and to surface exceptions.**

The task created on line 259 is not stored anywhere. In CPython, if no reference is held, the task can be garbage-collected before completion, and any exception it raises will be silently lost (you'll only see a `Task was destroyed but it is pending!` warning). This is also flagged by Ruff (RUF006).

Store task references (e.g., in a set) and add a done-callback to log unhandled exceptions.

Proposed fix — add a field in `__init__`:

```diff
 class CronScheduler:
     def __init__(self) -> None:
         self.jobs: Dict[str, CronJobConfig] = {}
         self._running = False
         self._task: Optional[asyncio.Task] = None
         self._redis: Optional[aioredis.Redis] = None
+        self._active_tasks: set[asyncio.Task] = set()
```

Then in `_loop`:

```diff
 if cfg.next_run and now >= cfg.next_run:
-    asyncio.create_task(self._execute_job(job_id))
+    task = asyncio.create_task(self._execute_job(job_id))
+    self._active_tasks.add(task)
+    task.add_done_callback(self._active_tasks.discard)
```
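The task-set pattern can be seen end to end in a self-contained runner. `Runner`, `job`, and `main` are illustrative names, not code from this PR; the mechanism (strong references in a set, `add_done_callback(set.discard)`) is the same as in the fix:

```python
import asyncio

class Runner:
    def __init__(self) -> None:
        # Strong references keep pending tasks from being garbage-collected.
        self._active_tasks: set[asyncio.Task] = set()

    def spawn(self, coro) -> asyncio.Task:
        task = asyncio.create_task(coro)
        self._active_tasks.add(task)
        # Drop the reference (this is also where you could log exceptions)
        # once the task finishes.
        task.add_done_callback(self._active_tasks.discard)
        return task

async def job(n: int) -> int:
    await asyncio.sleep(0.01)
    return n * 2

async def main() -> list:
    runner = Runner()
    tasks = [runner.spawn(job(i)) for i in range(3)]
    assert len(runner._active_tasks) == 3  # tracked while running
    results = await asyncio.gather(*tasks)
    await asyncio.sleep(0)  # let the done-callbacks run
    assert not runner._active_tasks  # all discarded after completion
    return results

print(asyncio.run(main()))  # [0, 2, 4]
```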
🪛 Ruff (0.14.14)
[warning] 251-251: Docstring contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?
(RUF002)
[warning] 259-259: Store a reference to the return value of asyncio.create_task
(RUF006)
🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/cron_scheduler.py` around lines
250 - 262, The loop currently fire-and-forgets
asyncio.create_task(self._execute_job(job_id)) which risks GC and lost
exceptions; add a tasks container (e.g., self._tasks = set()) in __init__,
change _loop to store each created Task in that set, add a done-callback that
removes the Task from self._tasks and logs any exception (inspect
task.exception() or use task.add_done_callback to call a handler that logs
unhandled exceptions), and ensure shutdown/stop logic awaits or cancels
remaining tasks from self._tasks; update references to _loop, _execute_job, and
__init__ accordingly.
```python
    try:
        if not request.name and not request.details and not request.icon:
            raise HTTPException(
                status_code=400,
                detail="At least one field (name, details, icon) must be provided",
            )
```
**Validation rejects valid "clear details" requests because `not ""` is `True`.**

Line 171 uses truthiness checks: `not request.name and not request.details and not request.icon`. An empty string `""` is falsy in Python, so sending `{"details": ""}` to clear a field (while `name` and `icon` are `None`) would be rejected with a 400.

The frontend does produce this case: when a user clears existing details text without changing the entity name, `updates` will be `{"details": ""}`.

Use `is None` checks instead:

🐛 Proposed fix

```diff
-    if not request.name and not request.details and not request.icon:
+    if request.name is None and request.details is None and request.icon is None:
```
+ if request.name is None and request.details is None and request.icon is None:🧰 Tools
🪛 Ruff (0.14.14)
[warning] 172-175: Abstract raise to an inner function
(TRY301)
🤖 Prompt for AI Agents
In
`@backends/advanced/src/advanced_omi_backend/routers/modules/knowledge_graph_routes.py`
around lines 170 - 175, The validation currently rejects valid "clear" requests
because it uses truthiness checks on request.name, request.details, and
request.icon (so empty strings are treated as missing); change the guard in the
update handler to check for None explicitly (use "is None" on request.name,
request.details, request.icon) and only raise the HTTPException when all three
are None so that {"details": ""} is allowed to clear the field.
```python
            embeddings = await asyncio.wait_for(
                self.llm_provider.generate_embeddings(texts_needing_embeddings),
                timeout=self.config.timeout_seconds,
            )
            text_to_embedding = dict(
                zip(texts_needing_embeddings, embeddings)
            )
        except Exception as e:
            memory_logger.warning(
                f"Batch embedding generation failed for reprocess: {e}"
            )
```
**`zip()` without `strict=True` can silently mask embedding count mismatches.**

If `generate_embeddings` returns a different number of embeddings than `texts_needing_embeddings`, `zip` will silently truncate the longer list. Since an embedding count mismatch is treated as a hard error in `add_memory` (lines 181–184), the same invariant should be enforced here.

🐛 Proposed fix

```diff
 text_to_embedding = dict(
-    zip(texts_needing_embeddings, embeddings)
+    zip(texts_needing_embeddings, embeddings, strict=True)
 )
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| embeddings = await asyncio.wait_for( | |
| self.llm_provider.generate_embeddings(texts_needing_embeddings), | |
| timeout=self.config.timeout_seconds, | |
| ) | |
| text_to_embedding = dict( | |
| zip(texts_needing_embeddings, embeddings) | |
| ) | |
| except Exception as e: | |
| memory_logger.warning( | |
| f"Batch embedding generation failed for reprocess: {e}" | |
| ) | |
| embeddings = await asyncio.wait_for( | |
| self.llm_provider.generate_embeddings(texts_needing_embeddings), | |
| timeout=self.config.timeout_seconds, | |
| ) | |
| text_to_embedding = dict( | |
| zip(texts_needing_embeddings, embeddings, strict=True) | |
| ) | |
| except Exception as e: | |
| memory_logger.warning( | |
| f"Batch embedding generation failed for reprocess: {e}" | |
| ) |
🧰 Tools
🪛 Ruff (0.14.14)
[warning] 601-601: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 603-603: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
In
`@backends/advanced/src/advanced_omi_backend/services/memory/providers/chronicle.py`
around lines 596 - 606, The current code uses zip(texts_needing_embeddings,
embeddings) which can silently truncate if lengths differ; ensure the embedding
count invariant by either using zip(..., strict=True) when building
text_to_embedding or explicitly check len(embeddings) ==
len(texts_needing_embeddings) after calling
self.llm_provider.generate_embeddings and raise/log a ValueError if they differ
so the exception handler runs; update the block around generate_embeddings,
embeddings, text_to_embedding (and keep behavior consistent with add_memory) to
enforce and surface mismatches instead of silently truncating.
```python
                if jargon:
                    await redis_client.set(f"asr:jargon:{user_id}", jargon, ex=JARGON_CACHE_TTL)
                    processed += 1
                    logger.debug(f"Cached jargon for user {user_id}: {jargon[:80]}...")
```
**Logging user-generated jargon content may leak PII or sensitive terms.**

Even truncated to 80 chars, jargon extracted from user memories could contain sensitive information (names, medical terms, etc.). Consider removing the content from the log or restricting to a count.

Proposed fix:

```diff
-    logger.debug(f"Cached jargon for user {user_id}: {jargon[:80]}...")
+    logger.debug(f"Cached jargon for user {user_id}: {len(jargon.split(','))} terms")
```
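A tiny sketch of the count-only log line, with illustrative jargon content (the comma-separated format is an assumption based on the `split(',')` in the fix):

```python
jargon = "angioplasty, stent, DES, PCI"  # illustrative user-derived terms

# Log an operational signal (how many terms were cached) without the content.
term_count = len(jargon.split(","))
print(f"Cached jargon: {term_count} terms")  # Cached jargon: 4 terms
```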
```diff
 const speakers: SpeakerSegment[] = backendSegments.map((segment: any) => ({
   start: segment.start,
   end: segment.end,
-  speaker_id: segment.speakerId || `speaker_${segment.speaker}`,
-  speaker_name: segment.speakerName || `Speaker ${segment.speaker}`,
-  confidence: segment.confidence,
-  text: segment.text,
-  identified_speaker_id: segment.identifiedSpeakerId,
-  identified_speaker_name: segment.identifiedSpeakerName,
-  speaker_identification_confidence: segment.speakerIdentificationConfidence,
-  speaker_status: segment.speakerStatus
+  speaker_id: segment.speaker_id || segment.speaker,
+  speaker_name: segment.identified_as || `Unknown (${segment.speaker})`,
+  confidence: segment.confidence || 0,
+  text: segment.text
 }))
```
Fix field mapping inconsistency for speaker_id in processWithDiarizeIdentifyMatch.
The backend endpoint /v1/diarize-identify-match returns segments with a speaker_id field. Line 283 correctly uses segment.speaker_id, but line 505 uses the non-existent field segment.identified_id, causing speaker identification to fail silently—it will always fall back to segment.speaker.
Change line 505 from segment.identified_id to segment.speaker_id to match the backend response and fix speaker identification in this processing mode.
Fix

```diff
 const speakers: SpeakerSegment[] = backendSegments.map((segment: any) => ({
   start: segment.start,
   end: segment.end,
-  speaker_id: segment.identified_id || segment.speaker,
+  speaker_id: segment.speaker_id || segment.speaker,
   speaker_name: segment.identified_as || `Unknown (${segment.speaker})`,
   confidence: segment.confidence || 0,
   text: segment.text
 }))
```

🤖 Prompt for AI Agents
In `@extras/speaker-recognition/webui/src/services/speakerIdentification.ts`
around lines 280 - 287, In processWithDiarizeIdentifyMatch, fix the inconsistent
field mapping for speaker IDs: replace usages of the nonexistent
segment.identified_id with segment.speaker_id inside the speakers
mapping/processing logic (the same field used elsewhere for speaker_id and
consistent with the backend /v1/diarize-identify-match response) so speaker
identification no longer falls back to segment.speaker.
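For reference, a minimal sketch of the segment payload the frontend must map, using only the field names cited in this review (the actual backend response model may carry additional fields):

```python
# Hypothetical segment from /v1/diarize-identify-match, field names taken
# from the review above -- not the actual backend model definition.
segment = {
    "start": 12.4,             # segment start time, seconds
    "end": 15.1,               # segment end time, seconds
    "speaker": "SPEAKER_00",   # raw diarization label, used as fallback
    "speaker_id": "alice-01",  # the field to read (not identified_id)
    "identified_as": "Alice",  # display name when identification succeeds
    "confidence": 0.92,
    "text": "Hello there.",
}

# The mapping should prefer speaker_id and fall back to the raw label:
resolved = segment.get("speaker_id") or segment["speaker"]
```

Reading a non-existent key (the `identified_id` bug) makes the `or` fallback fire on every segment, which is why the failure was silent.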
```python
if service_name == "langfuse" and not _ensure_langfuse_env():
    console.print("[yellow]⚠️ LangFuse not configured, skipping[/yellow]")
    continue
```
start --all silently skips LangFuse auto-initialization.
_ensure_langfuse_env() on line 410 can create the missing .env, but when --all is used, the service list is pre-filtered on line 557 by check_service_configured(s), which returns False for langfuse if .env doesn't exist yet. So langfuse is excluded from the list before _ensure_langfuse_env ever runs.
A user with LangFuse configured in the backend but no extras/langfuse/.env will see langfuse silently skipped with start --all, but it would auto-init if they run start langfuse explicitly — an inconsistent experience.
Proposed fix: include langfuse in the `--all` list when the backend has it enabled

```diff
 if args.all:
-    services = [s for s in SERVICES.keys() if check_service_configured(s)]
+    services = [
+        s for s in SERVICES.keys()
+        if check_service_configured(s)
+        or (s == "langfuse" and _langfuse_enabled_in_backend())
+    ]
```

Also applies to: 555-557
🤖 Prompt for AI Agents
In `@services.py` around lines 410 - 412, The pre-filtering logic excludes
LangFuse from the --all run before _ensure_langfuse_env() can create its .env;
update the filtering so LangFuse is included when the backend indicates it is
enabled (or call _ensure_langfuse_env() prior to filtering). Concretely, modify
the service list pre-filter (used when assembling services for start --all,
which currently calls check_service_configured(s)) to treat "langfuse" as
configured if backend_has_langfuse_enabled() (or similar backend flag), or
invoke _ensure_langfuse_env() for "langfuse" before running
check_service_configured; ensure you reference the existing functions
_ensure_langfuse_env and check_service_configured and keep the rest of the
filtering behavior unchanged.
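The proposed filter can be sketched as follows. The helper names follow the review's suggestion: `check_service_configured` exists in services.py, while `_langfuse_enabled_in_backend` is the reviewer's hypothetical backend-flag helper, stubbed out here.

```python
# Sketch of the proposed --all pre-filter; stand-ins simulate the scenario
# where LangFuse is enabled in the backend but extras/langfuse/.env is missing.
SERVICES = {"backend": {}, "asr-services": {}, "langfuse": {}}

def check_service_configured(name: str) -> bool:
    # Stand-in: langfuse has no .env yet, so it reports unconfigured.
    return name != "langfuse"

def _langfuse_enabled_in_backend() -> bool:
    # Stand-in for reading the backend's LangFuse flag.
    return True

services = [
    s for s in SERVICES
    if check_service_configured(s)
    or (s == "langfuse" and _langfuse_enabled_in_backend())
]
```

With this filter, langfuse survives the pre-selection and `_ensure_langfuse_env()` later gets the chance to auto-create the missing .env, matching the explicit `start langfuse` behavior.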
wizard.py
```python
while True:
    try:
        neo4j_password = console.input("Neo4j password (min 8 chars) [default: neo4jpassword]: ").strip()
        if not neo4j_password:
            neo4j_password = "neo4jpassword"
        if len(neo4j_password) >= 8:
            break
        console.print("[yellow][WARNING][/yellow] Password must be at least 8 characters")
    except EOFError:
        neo4j_password = "neo4jpassword"
        console.print(f"Using default password")
        break
```
Neo4j password is entered in plaintext (visible on terminal).
console.input() at line 747 does not mask the password, unlike backends/advanced/init.py which uses self.prompt_password() (which delegates to a utility that masks input). This inconsistency means running the wizard exposes the Neo4j password in the terminal history and to shoulder-surfers.
Consider using the shared prompt_password utility from setup_utils or getpass.getpass for consistency:
🔧 Proposed fix

```diff
+from setup_utils import prompt_password as util_prompt_password
+
 while True:
     try:
-        neo4j_password = console.input("Neo4j password (min 8 chars) [default: neo4jpassword]: ").strip()
+        neo4j_password = util_prompt_password("Neo4j password (min 8 chars)", min_length=8, allow_generated=True)
         if not neo4j_password:
             neo4j_password = "neo4jpassword"
-        if len(neo4j_password) >= 8:
-            break
-        console.print("[yellow][WARNING][/yellow] Password must be at least 8 characters")
+        break
     except EOFError:
         neo4j_password = "neo4jpassword"
-        console.print(f"Using default password")
+        console.print("Using default password")
         break
```

Also fixes the Ruff F541 f-string warning on line 755.
🧰 Tools
🪛 Ruff (0.14.14)
[error] 749-749: Possible hardcoded password assigned to: "neo4j_password"
(S105)
[error] 754-754: Possible hardcoded password assigned to: "neo4j_password"
(S105)
[error] 755-755: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
In `@wizard.py` around lines 745 - 756, Replace the plaintext prompt using
console.input in the wizard loop so the Neo4j password is masked: stop calling
console.input("...") and instead call the shared masked prompt utility (e.g.,
prompt_password from setup_utils or the wrapper method self.prompt_password if
available) to read neo4j_password, preserve the default behavior (use
"neo4jpassword" when empty or on EOF), and keep the length check on
neo4j_password; also change the console.print(f"Using default password") to a
plain string literal to avoid the Ruff F541 f-string warning. Reference symbols:
neo4j_password, console.input, prompt_password (or getpass.getpass), and the
EOFError handling in the same while True loop.
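A minimal sketch of the masked-input approach, using the standard library's `getpass` rather than the project's `prompt_password` helper (whose exact signature is not shown in this review):

```python
import getpass

# Default cited in the wizard snippet above.
DEFAULT_NEO4J_PASSWORD = "neo4jpassword"

def prompt_neo4j_password() -> str:
    """Prompt for the Neo4j password without echoing it to the terminal."""
    while True:
        try:
            password = getpass.getpass(
                "Neo4j password (min 8 chars) [default: neo4jpassword]: "
            ).strip()
        except EOFError:
            # Non-interactive input exhausted: fall back to the default.
            print("Using default password")
            return DEFAULT_NEO4J_PASSWORD
        if not password:
            return DEFAULT_NEO4J_PASSWORD
        if len(password) >= 8:
            return password
        print("[WARNING] Password must be at least 8 characters")
```

This preserves the wizard's default-on-empty and default-on-EOF behavior while keeping the password out of terminal scrollback.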
- Updated services.py to include LangFuse configuration checks during service startup, improving observability setup.
- Refactored wizard.py to use a masked input for Neo4j password prompts, enhancing user experience and security.
- Improved the cron scheduler in advanced_omi_backend to manage active tasks and validate cron expressions, ensuring robust job execution.
- Enhanced speaker recognition client documentation to clarify user_id limitations, preparing for future multi-user support.
- Updated knowledge graph routes to enforce validation on entity updates, ensuring at least one field is provided for updates.
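The cron-expression validation mentioned above could look like the following minimal sketch. This is a hand-rolled five-field check for illustration only; the backend scheduler may well rely on a library such as croniter instead.

```python
import re

# Allowed value ranges: minute, hour, day-of-month, month, day-of-week.
# (Some cron implementations also accept 7 for Sunday; this sketch does not.)
_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]

# One field: "*", numbers, ranges (1-5), steps (*/5), and comma lists (1,2,3).
_FIELD = re.compile(r"(\*|\d+(-\d+)?)(/\d+)?(,(\*|\d+(-\d+)?)(/\d+)?)*$")

def is_valid_cron(expr: str) -> bool:
    """Validate a classic five-field cron expression (sketch, not exhaustive)."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    for field, (lo, hi) in zip(fields, _RANGES):
        if not _FIELD.match(field):
            return False
        # Range-check every value, skipping step divisors like the 5 in "*/5".
        if any(not lo <= int(n) <= hi
               for n in re.findall(r"(?<![*/])\b\d+", field)):
            return False
    return True
```

Rejecting bad expressions at configuration time (rather than at fire time) is what makes the scheduled jobs robust.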
| Metric | Count |
|---|---|
| ✅ Passed | 102 |
| ❌ Failed | 20 |
| 📊 Total | 122 |
📊 View Reports
GitHub Pages (Live Reports):
Download Artifacts:
- robot-test-reports-html-no-api - HTML reports
- robot-test-results-xml-no-api - XML output
* Refactor connect-omi.py for improved device selection and user interaction
  - Replaced references to the chronicle Bluetooth library with friend_lite for device management.
  - Removed the list_devices function and implemented a new prompt_user_to_pick_device function to improve user interaction when selecting OMI/Neo devices.
  - Updated the find_and_set_omi_mac function to use the new device selection method, improving the overall flow of device connection.
  - Added a new scan_devices.py script for quick scanning of neo/neosapien devices, enhancing usability.
  - Updated README.md to reflect new usage instructions and prerequisites for connecting to OMI devices over Bluetooth.
  - Enhanced start.sh to ensure proper environment variable setup for macOS users.
* Add friend-lite-sdk: initial implementation of a Python SDK for OMI/Friend Lite BLE devices
  - Introduced the friend-lite-sdk, a Python SDK for OMI/Friend Lite BLE devices, enabling audio streaming, button events, and transcription functionality.
  - Added LICENSE and NOTICE files to clarify licensing and attribution.
  - Created pyproject.toml for package management, specifying dependencies and project metadata.
  - Developed core modules including Bluetooth connection handling, button event parsing, audio decoding, and transcription capabilities.
  - Implemented example usage in README.md to guide users on installation and basic functionality.
  - Enhanced connect-omi.py to use the new SDK for improved device management and event handling.
  - Updated requirements.txt to reference the new SDK for local development. This commit lays the foundation for further enhancements and integrations with OMI devices.
* Enhance client state and plugin architecture for button event handling
  - Introduced a new `markers` list in `ClientState` to collect button event data during sessions.
  - Added an `add_marker` method to facilitate adding markers to the current session.
  - Implemented an `on_button_event` method in the `BasePlugin` class to handle device button events, providing context data for button state and timestamps.
  - Updated `PluginRouter` to route button events to the appropriate plugin handler.
  - Enhanced conversation job handling to attach markers from Redis sessions, improving the tracking of button events during conversations.
* Move plugins location
  - Introduced the Email Summarizer plugin, which automatically sends email summaries upon conversation completion.
  - Implemented an SMTP email service for sending formatted HTML and plain text emails.
  - Added configuration options for SMTP settings and email content in `config.yml`.
  - Created a setup script for easy configuration of SMTP credentials and plugin orchestration.
  - Enhanced documentation with usage instructions and troubleshooting tips for the plugin.
  - Updated the existing plugin architecture to support new event handling for email summaries.
* Enhance Docker Compose and plugin management
  - Added an external plugins directory to the Docker Compose files for better plugin management.
  - Updated environment variables for the MongoDB and Redis services to ensure consistent behavior.
  - Introduced new dependencies in `uv.lock` for improved functionality.
  - Refactored audio processing to support various audio formats and enhance error handling.
  - Implemented new plugin event types and services for better integration and communication between plugins.
  - Enhanced conversation and session management to support new closing mechanisms and event logging.
* Update audio processing and event logging
  - Increased the maximum event log size in PluginRouter from 200 to 1000 for improved event tracking.
  - Refactored the audio stream producer to dynamically read the audio format from Redis session metadata, enhancing flexibility in audio handling.
  - Updated transcription job processing to use session-specific audio format settings, ensuring accurate audio processing.
  - Enhanced the audio file writing utility to accept PCM parameters, allowing better control over audio data handling.
* Add markers list to ClientState and update timeout trigger comment
  - Introduced a new `markers` list in `ClientState` to track button event data during conversations.
  - Updated a comment in `open_conversation_job` to clarify the behavior of the `timeout_triggered` variable, ensuring better understanding of session management.
* Refactor audio file logging and error handling
  - Updated audio processing logs to consistently use the `filename` variable instead of `file.filename` for clarity.
  - Enhanced error logging to use the `filename` variable, improving traceability of issues during audio processing.
  - Adjusted title generation logic to handle cases where the filename is "unknown", ensuring a default title is used.
  - Minor refactor in conversation closing logs to use `user.user_id` for better consistency in user identification.
* Enhance conversation retrieval with pagination and orphan handling
  - Updated the `get_conversations` function to support pagination through `limit` and `offset` parameters, improving performance for large datasets.
  - Consolidated query logic to fetch both normal and orphan conversations in a single database call, reducing round-trips and enhancing efficiency.
  - Modified the response structure to include total count, limit, and offset in the returned data for better client-side handling.
  - Adjusted database indexing to optimize queries for paginated results, ensuring faster access to conversation data.
* Refactor connection logging in transcribe function
  - Moved connection logging for the Wyoming server to a more structured format within the `transcribe_wyoming` function.
  - Ensured connection attempts and successes are logged consistently for better traceability during audio transcription.
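The paginated `get_conversations` response described above can be sketched roughly as follows. The field names (`conversations`, `total`, `limit`, `offset`) are assumptions based on the commit description, not the actual API schema.

```python
# Hypothetical sketch of a limit/offset pagination envelope like the one the
# commit describes; the real endpoint queries MongoDB rather than a list.
def paginate(items: list, limit: int = 20, offset: int = 0) -> dict:
    """Return one page of conversations plus paging metadata."""
    return {
        "conversations": items[offset:offset + limit],
        "total": len(items),      # total count across all pages
        "limit": limit,
        "offset": offset,
    }

# Fetch the third page of 45 conversations, 20 per page.
page = paginate([{"id": i} for i in range(45)], limit=20, offset=40)
```

Returning `total` alongside `limit`/`offset` is what lets the client compute page counts without a second round-trip.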
* Enhance ASR service descriptions and provider feedback in wizard.py
* Implement LangFuse integration for observability and prompt management
  - Added LangFuse configuration options in the .env.template for observability and prompt management.
  - Introduced a setup_langfuse method in ChronicleSetup to handle LangFuse initialization and configuration prompts.
  - Enhanced prompt management by integrating a centralized PromptRegistry for dynamic prompt retrieval and registration.
  - Updated various services to use prompts from the PromptRegistry, improving flexibility and maintainability.
  - Refactored OpenAI client initialization to support optional LangFuse tracing, enhancing observability during API interactions.
  - Added new prompt defaults for memory management and conversation handling, ensuring consistent behavior across the application.
* Enhance LangFuse integration and service management
  - Added LangFuse service configuration in services.py and wizard.py, including paths, commands, and descriptions.
  - Implemented auto-selection for LangFuse during service setup, improving user experience.
  - Enhanced the service startup process to display prompt-management tips for LangFuse, guiding users on editing AI prompts.
  - Updated run_service_setup to handle LangFuse-specific parameters, including admin credentials and API keys, ensuring seamless integration with backend services.
* Feat/better reprocess memory (#300)
* fix: Plugin System Refactor (#301)
* Feat/neo sdk (#302)
* Update friend-lite-sdk for Neo1 device support and enhance documentation
  - Updated the friend-lite-sdk to version 0.3.0, reflecting the transition to supporting OMI/Neo1 BLE wearable devices.
  - Refactored the Bluetooth connection handling to introduce a new `WearableConnection` class, improving connection lifecycle management for wearable devices.
  - Added a new `Neo1Connection` class for controlling Neo1 devices, including sleep and wake methods.
  - Updated UUID constants to include Neo1-specific characteristics, improving device interaction capabilities.
  - Revised the plugin development guide to reflect changes in device naming and connection processes.
  - Removed outdated local OMI Bluetooth scripts and documentation to streamline the project structure and focus on wearable client development.
* Refactor backend audio streaming to use Opus codec and enhance menu app functionality
  - Updated backend_sender.py to stream raw Opus audio instead of PCM, improving bandwidth efficiency.
  - Modified the stream_to_backend function to handle Opus audio data and adjusted audio chunk parameters accordingly.
  - Enhanced main.py with new CLI commands for device scanning and connection management, improving user experience.
  - Introduced menu_app.py, a macOS menu bar application providing a user-friendly interface for device management and status display.
  - Added a README.md documenting usage instructions and configuration details for the local wearable client.
  - Updated requirements.txt to include new dependencies for the menu app and service management.
  - Implemented service.py for managing launchd service installation and configuration on macOS, enabling auto-start on login.
* Refactor audio processing and queue management in local wearable client
  - Removed the audio queue in favor of a dedicated BLE data queue and backend queue for improved data handling.
  - Enhanced the `connect_and_stream` function to streamline audio decoding and writing to the local file sink.
  - Updated the handling of BLE data to ensure robust queue management and error logging.
  - Improved task management during device disconnection to ensure proper cleanup and error handling.
  - Updated requirements.txt to specify a minimum version for easy_audio_interfaces, ensuring compatibility.
* Enhance ASR service descriptions and provider feedback in wizard.py - Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general. - Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection. * Implement LangFuse integration for observability and prompt management - Added LangFuse configuration options in the .env.template for observability and prompt management. - Introduced setup_langfuse method in ChronicleSetup to handle LangFuse initialization and configuration prompts. - Enhanced prompt management by integrating a centralized PromptRegistry for dynamic prompt retrieval and registration. - Updated various services to utilize prompts from the PromptRegistry, improving flexibility and maintainability. - Refactored OpenAI client initialization to support optional LangFuse tracing, enhancing observability during API interactions. - Added new prompt defaults for memory management and conversation handling, ensuring consistent behavior across the application. * Enhance LangFuse integration and service management - Added LangFuse service configuration in services.py and wizard.py, including paths, commands, and descriptions. - Implemented auto-selection for LangFuse during service setup, improving user experience. - Enhanced service startup process to display prompt management tips for LangFuse, guiding users on editing AI prompts. - Updated run_service_setup to handle LangFuse-specific parameters, including admin credentials and API keys, ensuring seamless integration with backend services. * Feat/better reprocess memory (#300) * Enhance ASR service descriptions and provider feedback in wizard.py (#290) - Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general. 
- Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection. * Refactor Obsidian and Knowledge Graph integration in services and setup - Removed redundant Obsidian and Knowledge Graph configuration checks from services.py, streamlining the command execution process. - Updated wizard.py to enhance user experience by setting default options for speaker recognition during service selection. - Improved Neo4j password handling in setup processes, ensuring consistent configuration prompts and feedback. - Introduced a new cron scheduler for managing scheduled tasks, enhancing the backend's automation capabilities. - Added new entity annotation features, allowing for corrections and updates to knowledge graph entities directly through the API. * Enhance ASR services configuration and VibeVoice integration - Added new configuration options for VibeVoice ASR in defaults.yml, including batching parameters for audio processing. - Updated Docker Compose files to mount the config directory, ensuring access to ASR service configurations. - Enhanced the VibeVoice transcriber to load configuration settings from defaults.yml, allowing for dynamic adjustments via environment variables. - Introduced quantization options for model loading in the VibeVoice transcriber, improving performance and flexibility. - Refactored the speaker identification process to streamline audio handling and improve logging for better debugging. - Updated documentation to reflect new configuration capabilities and usage instructions for the VibeVoice ASR provider. * Enhance LangFuse integration and memory reprocessing capabilities - Introduced functions for checking LangFuse configuration in services.py, ensuring proper setup for observability. - Updated wizard.py to facilitate user input for LangFuse configuration, including options for local and external setups. 
- Implemented memory reprocessing logic in memory services to update existing memories based on speaker re-identification. - Enhanced speaker recognition client to support per-segment identification, improving accuracy during reprocessing. - Refactored various components to streamline handling of LangFuse parameters and improve overall service management. * Enhance service management and user input handling - Updated services.py to include LangFuse configuration checks during service startup, improving observability setup. - Refactored wizard.py to utilize a masked input for Neo4j password prompts, enhancing user experience and security. - Improved cron scheduler in advanced_omi_backend to manage active tasks and validate cron expressions, ensuring robust job execution. - Enhanced speaker recognition client documentation to clarify user_id limitations, preparing for future multi-user support. - Updated knowledge graph routes to enforce validation on entity updates, ensuring at least one field is provided for updates. * fix: Plugin System Refactor (#301) * Refactor connect-omi.py for improved device selection and user interaction - Replaced references to the chronicle Bluetooth library with friend_lite for device management. - Removed the list_devices function and implemented a new prompt_user_to_pick_device function to enhance user interaction when selecting OMI/Neo devices. - Updated the find_and_set_omi_mac function to utilize the new device selection method, improving the overall flow of device connection. - Added a new scan_devices.py script for quick scanning of neo/neosapien devices, enhancing usability. - Updated README.md to reflect new usage instructions and prerequisites for connecting to OMI devices over Bluetooth. - Enhanced start.sh to ensure proper environment variable setup for macOS users. 
* Add friend-lite-sdk: Initial implementation of Python SDK for OMI/Friend Lite BLE devices - Introduced the friend-lite-sdk, a Python SDK for OMI/Friend Lite BLE devices, enabling audio streaming, button events, and transcription functionalities. - Added LICENSE and NOTICE files to clarify licensing and attribution. - Created pyproject.toml for package management, specifying dependencies and project metadata. - Developed core modules including bluetooth connection handling, button event parsing, audio decoding, and transcription capabilities. - Implemented example usage in README.md to guide users on installation and basic functionality. - Enhanced connect-omi.py to utilize the new SDK for improved device management and event handling. - Updated requirements.txt to reference the new SDK for local development. This commit lays the foundation for further enhancements and integrations with OMI devices. * Enhance client state and plugin architecture for button event handling - Introduced a new `markers` list in `ClientState` to collect button event data during sessions. - Added `add_marker` method to facilitate the addition of markers to the current session. - Implemented `on_button_event` method in the `BasePlugin` class to handle device button events, providing context data for button state and timestamps. - Updated `PluginRouter` to route button events to the appropriate plugin handler. - Enhanced conversation job handling to attach markers from Redis sessions, improving the tracking of button events during conversations. * Move plugins locatino - Introduced the Email Summarizer plugin that automatically sends email summaries upon conversation completion. - Implemented SMTP email service for sending formatted HTML and plain text emails. - Added configuration options for SMTP settings and email content in `config.yml`. - Created setup script for easy configuration of SMTP credentials and plugin orchestration. 
- Enhanced documentation with usage instructions and troubleshooting tips for the plugin.
- Updated existing plugin architecture to support new event handling for email summaries.

* Enhance Docker Compose and Plugin Management
- Added an external plugins directory to Docker Compose files for better plugin management.
- Updated environment variables for MongoDB and Redis services to ensure consistent behavior.
- Introduced new dependencies in `uv.lock` for improved functionality.
- Refactored audio processing to support various audio formats and enhance error handling.
- Implemented new plugin event types and services for better integration and communication between plugins.
- Enhanced conversation and session management to support new closing mechanisms and event logging.

* Update audio processing and event logging
- Increased the maximum event log size in PluginRouter from 200 to 1000 for improved event tracking.
- Refactored the audio stream producer to dynamically read the audio format from Redis session metadata, improving flexibility in audio handling.
- Updated transcription job processing to use session-specific audio format settings, ensuring accurate audio processing.
- Enhanced the audio file writing utility to accept PCM parameters, allowing better control over audio data handling.

* Add markers list to ClientState and update timeout trigger comment
- Introduced a new `markers` list in `ClientState` to track button event data during conversations.
- Updated the comment in `open_conversation_job` to clarify the behavior of the `timeout_triggered` variable, for better understanding of session management.

* Refactor audio file logging and error handling
- Updated audio processing logs to consistently use the `filename` variable instead of `file.filename` for clarity.
- Enhanced error logging to use the `filename` variable, improving traceability of issues during audio processing.
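The "audio file writing utility to accept PCM parameters" bullet suggests a writer whose sample rate, channel count, and sample width come from the caller rather than being hard-coded. A minimal stdlib sketch (not the project's actual utility; the defaults are illustrative):

```python
import io
import wave


def write_wav(pcm_bytes: bytes, sample_rate: int = 16000,
              channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM audio in a WAV container using caller-supplied parameters."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)   # bytes per sample, e.g. 2 for 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(pcm_bytes)
    return buf.getvalue()
```

Because the parameters are arguments, a transcription job can feed in whatever format the session metadata declares instead of assuming one fixed format.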
- Adjusted title generation logic to handle cases where the filename is "unknown," ensuring a default title is used.
- Minor refactor of conversation closing logs to use `user.user_id` for consistency in user identification.

* Enhance conversation retrieval with pagination and orphan handling
- Updated the `get_conversations` function to support pagination through `limit` and `offset` parameters, improving performance for large datasets.
- Consolidated query logic to fetch both normal and orphan conversations in a single database call, reducing round-trips.
- Modified the response structure to include total count, limit, and offset in the returned data for better client-side handling.
- Adjusted database indexing to optimize queries for paginated results, ensuring faster access to conversation data.

* Refactor connection logging in transcribe function
- Moved connection logging for the Wyoming server into a more structured format within the `transcribe_wyoming` function.
- Ensured that connection attempts and successes are logged consistently for better traceability during audio transcription.

* Refactor configuration management and enhance plugin architecture
- Replaced PyYAML with ruamel.yaml for improved YAML handling, preserving quotes in configuration loading.
- Updated ConfigManager to use ruamel.yaml for loading and saving configuration files, with better error handling and validation.
- Enhanced service startup messages to display access URLs for backend services, improving user experience.
- Introduced plugin health tracking in PluginRouter, allowing better monitoring of plugin initialization and error states.
- Refactored the audio stream client and conversation management to streamline audio processing and improve error handling.
- Updated Docker and requirements configurations to include ruamel.yaml, ensuring compatibility across environments.
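The paginated response shape described for `get_conversations` (page of items plus `total`, `limit`, and `offset`) can be sketched with a plain list stand-in for the database query. A minimal illustration only; key names and defaults are assumptions, and the real implementation queries MongoDB:

```python
def paginate(items: list, limit: int = 20, offset: int = 0) -> dict:
    """Return one page of items plus the metadata a client needs to page further."""
    return {
        "conversations": items[offset:offset + limit],
        "total": len(items),    # total matching items, not just this page
        "limit": limit,
        "offset": offset,
    }
```

Returning `total` alongside the page lets the frontend compute the number of pages (`ceil(total / limit)`) without a second count request.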
* refactor clean up script

* cleanup partial mycelia integration

* Refactor configuration management and remove Mycelia integration
- Updated ConfigManager to remove references to the Mycelia memory provider, simplifying the memory provider options to "chronicle" and "openmemory_mcp".
- Cleaned up the Makefile by removing Mycelia-related targets and help descriptions, streamlining the build process.
- Enhanced cleanup script documentation for clarity on usage and options.
- Introduced LLM operation configurations to improve model management and prompt optimization capabilities.

* Refactor Docker and cleanup scripts to remove 'uv' command usage
- Updated cleanup.sh to execute the Python script directly, without the 'uv' command.
- Modified Docker Compose files to remove 'uv run' from service commands, simplifying execution.
- Enhanced start.sh to reflect the changes in command usage and improve clarity in usage instructions.
- Introduced a new transcription job timeout configuration in the backend, allowing dynamic timeout settings.
- Added insert annotation functionality to the API, enabling users to insert new segments in conversations.
- Implemented memory retrieval for conversations, enhancing the ability to fetch related memories.
- Improved error handling and logging across various modules for better traceability and debugging.

* Add backend worker health check and job clearing functionality
- Introduced a new function `get_backend_worker_health` to retrieve health metrics from the backend's /health endpoint, including worker count and queue status.
- Updated `show_quick_status` to display worker health information, alerting users to potential issues with registered workers.
- Added a new API endpoint `/jobs` to allow admin users to clear finished and failed jobs from all queues, enhancing job management.
- Updated the frontend Queue component to include a button for clearing jobs, improving management of job statuses.
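A "dynamic timeout setting" for transcription jobs typically means reading the value from the environment with a safe fallback. The sketch below is an assumption about how such a setting could be wired; the environment variable name `TRANSCRIPTION_JOB_TIMEOUT` and the 300-second default are hypothetical, not taken from the backend's code:

```python
import os

# Illustrative default; the real backend chooses its own.
DEFAULT_TRANSCRIPTION_TIMEOUT = 300.0  # seconds


def get_transcription_timeout(env=os.environ) -> float:
    """Read the job timeout from the environment, falling back on missing or bad values."""
    raw = env.get("TRANSCRIPTION_JOB_TIMEOUT")  # hypothetical variable name
    if not raw:
        return DEFAULT_TRANSCRIPTION_TIMEOUT
    try:
        return float(raw)
    except ValueError:
        return DEFAULT_TRANSCRIPTION_TIMEOUT
```

Passing `env` as a parameter keeps the function testable without mutating the process environment.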
* Update plugin event descriptions and refactor event handling
- Reduced redundancy by embedding descriptions directly within the PluginEvent enum, improving clarity and maintainability.
- Removed the EVENT_DESCRIPTIONS dictionary, streamlining event handling in the plugin assistant.
- Updated references in the plugin assistant to use the new description attributes, ensuring consistent event metadata usage.
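Embedding descriptions directly on enum members, replacing a separate lookup dictionary, is commonly done with the tuple-value `__new__` pattern. A minimal sketch of the idea; the member names and description strings here are illustrative, not the project's actual event list:

```python
from enum import Enum


class PluginEvent(Enum):
    """Event names with a human-readable description stored on each member."""
    CONVERSATION_COMPLETE = ("conversation.complete", "Fired when a conversation is closed")
    BUTTON_EVENT = ("button.event", "Fired when a device button is pressed")

    def __new__(cls, value: str, description: str):
        # The first tuple element becomes the enum value; the second is
        # attached as an extra attribute, removing the need for a separate
        # EVENT_DESCRIPTIONS-style dictionary.
        obj = object.__new__(cls)
        obj._value_ = value
        obj.description = description
        return obj
```

Lookups by value still work (`PluginEvent("button.event")`), and callers read `event.description` instead of indexing an external mapping, so the metadata can never drift out of sync with the members.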