
Feat/better reprocess memory #298

Closed
AnkushMalaker wants to merge 5 commits into fix/pre-release from feat/better-reprocess-memory

Conversation

AnkushMalaker (Collaborator) commented Feb 8, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added LangFuse for LLM observability and centralized prompt management via web UI.
    • Introduced scheduled cron jobs with admin controls for training and ASR jargon extraction.
    • Added entity annotation and PATCH endpoint for knowledge graph entity updates.
    • Implemented audio batching for efficient long-form ASR transcription.
    • Added per-segment speaker identification mode for improved diarization accuracy.
    • Introduced orphaned annotation detection and cleanup tools.
  • Improvements

    • Reduced transcription job timeouts for better resource efficiency.
    • Enhanced speaker identification with fallback to segment-based processing.
  • Configuration

    • Added support for Neo4j setup during initialization.
    • Extended miscellaneous settings with per-segment speaker identification toggle.

…290)

- Updated the description for the 'asr-services' to remove the specific mention of 'Parakeet', making it more general.
- Improved the console output for auto-selected services to include the transcription provider label, enhancing user feedback during service selection.
- Removed redundant Obsidian and Knowledge Graph configuration checks from services.py, streamlining the command execution process.
- Updated wizard.py to enhance user experience by setting default options for speaker recognition during service selection.
- Improved Neo4j password handling in setup processes, ensuring consistent configuration prompts and feedback.
- Introduced a new cron scheduler for managing scheduled tasks, enhancing the backend's automation capabilities.
- Added new entity annotation features, allowing for corrections and updates to knowledge graph entities directly through the API.
- Added new configuration options for VibeVoice ASR in defaults.yml, including batching parameters for audio processing.
- Updated Docker Compose files to mount the config directory, ensuring access to ASR service configurations.
- Enhanced the VibeVoice transcriber to load configuration settings from defaults.yml, allowing for dynamic adjustments via environment variables.
- Introduced quantization options for model loading in the VibeVoice transcriber, improving performance and flexibility.
- Refactored the speaker identification process to streamline audio handling and improve logging for better debugging.
- Updated documentation to reflect new configuration capabilities and usage instructions for the VibeVoice ASR provider.
- Introduced functions for checking LangFuse configuration in services.py, ensuring proper setup for observability.
- Updated wizard.py to facilitate user input for LangFuse configuration, including options for local and external setups.
- Implemented memory reprocessing logic in memory services to update existing memories based on speaker re-identification.
- Enhanced speaker recognition client to support per-segment identification, improving accuracy during reprocessing.
- Refactored various components to streamline handling of LangFuse parameters and improve overall service management.
coderabbitai bot (Contributor) commented Feb 8, 2026

Caution

Review failed

Failed to post review comments

📝 Walkthrough


This pull request introduces a comprehensive system for LLM observability, prompt management, and background job orchestration. It adds a prompt registry backed by LangFuse, a config-driven cron scheduler for automated tasks, entity-level knowledge graph corrections, per-segment speaker identification, audio batching for long-form ASR, transcript reprocessing with speaker-aware memory updates, and corresponding frontend capabilities for job and annotation management.

Changes

Cohort / File(s) / Summary

Prompt Management System
Files: backends/advanced/src/advanced_omi_backend/prompt_registry.py, backends/advanced/src/advanced_omi_backend/prompt_defaults.py
New PromptRegistry class with lazy LangFuse initialization, fallback to defaults, and seed_prompts support. Comprehensive default prompts registered for memory, conversation, knowledge graph, ASR, and transcription workflows.

LangFuse Integration
Files: backends/advanced/.env.template, backends/advanced/src/advanced_omi_backend/openai_factory.py, backends/advanced/init.py, extras/langfuse/...
New environment variables (LANGFUSE_HOST, LANGFUSE_BASE_URL) and LangFuse setup flow in init.py with CLI args (--langfuse-public-key, --langfuse-secret-key, --langfuse-host). OpenAI factory centralizes client creation with optional LangFuse tracing. New LangFuse service configuration with init script and docker-compose setup.

Cron Scheduler System
Files: backends/advanced/src/advanced_omi_backend/cron_scheduler.py, backends/advanced/src/advanced_omi_backend/workers/finetuning_jobs.py
New CronScheduler singleton managing config-driven jobs with Redis state persistence. Registers speaker finetuning and ASR jargon extraction jobs. Jobs expose enable/disable, schedule updates, and manual execution endpoints.

Entity Annotations & Knowledge Graph
Files: backends/advanced/src/advanced_omi_backend/models/annotation.py, backends/advanced/src/advanced_omi_backend/routers/modules/annotation_routes.py, backends/advanced/src/advanced_omi_backend/routers/modules/knowledge_graph_routes.py, backends/advanced/src/advanced_omi_backend/services/knowledge_graph/...
New ENTITY annotation type with entity_id and entity_field support. POST /entity and GET /entity/{entity_id} endpoints for entity corrections. New PATCH /entities/{entity_id} endpoint with automatic annotation creation for name/details changes. KnowledgeGraphService.update_entity method for Neo4j updates.

ASR Batching & Context
Files: extras/asr-services/common/batching.py, backends/advanced/src/advanced_omi_backend/services/transcription/context.py, extras/asr-services/providers/vibevoice/...
New audio batching utilities (split_audio_file, stitch_transcription_results). TranscriptionContext dataclass with hot_words and user_jargon. VibeVoice transcriber adds quantization config, batched vs. single-shot path selection, and context injection. context_info parameter propagated through all ASR providers.

Per-segment Speaker Identification
Files: backends/advanced/src/advanced_omi_backend/speaker_recognition_client.py, backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py
New per_segment and min_segment_duration parameters for identify_provider_segments. Added _identify_per_segment helper for per-segment processing. Worker updated to detect per_segment_speaker_id config toggle and apply per-segment mode. Metadata now includes identification_mode field.

Transcript Reprocessing & Memory
Files: backends/advanced/src/advanced_omi_backend/services/memory/..., backends/advanced/src/advanced_omi_backend/workers/memory_jobs.py
New reprocess_memory and propose_reprocess_actions methods on MemoryServiceBase and LLMProviderBase. Added compute_speaker_diff helper to detect speaker/text changes. OpenAIProvider implements propose_reprocess_actions with registry-backed prompts. MemoryService.reprocess_memory handles speaker re-identification workflow with fallback.

Conversation & Title Generation
Files: backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py, backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
generate_title_and_summary merged into single function returning tuple (title, summary). Replaced separate title/summary/detailed calls with combined title_and_summary + detailed_summary via registry-backed prompts.

Timeout & Configuration Adjustments
Files: backends/advanced/src/advanced_omi_backend/controllers/..., backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py
Reduced batch transcription job timeouts from 1800s to 900s across audio_controller, websocket_controller, and conversation_controller. Increased speaker reprocessing timeout to 1200s. Updated reprocess_speakers validation to accept segment-based diarization.

Service & Plugin Infrastructure
Files: backends/advanced/src/advanced_omi_backend/plugins/base.py, backends/advanced/src/advanced_omi_backend/plugins/.../plugin.py, backends/advanced/src/advanced_omi_backend/services/plugin_service.py
BasePlugin adds register_prompts extension point. EmailSummarizerPlugin and HomeAssistantPlugin implement register_prompts for dynamic prompt registration. PluginService calls register_prompts during plugin initialization with error handling.

Frontend Finetuning Dashboard
Files: backends/advanced/webui/src/pages/Finetuning.tsx
New CronJob and AnnotationTypeCounts interfaces. Refactored Finetuning page to display scheduled cron jobs with enable/disable/schedule-edit/run-now controls. Added orphaned annotation cleanup and reattach stubs. New per-type annotation statistics with StatCard components.

Frontend Entity Editing
Files: backends/advanced/webui/src/components/knowledge-graph/EntityCard.tsx, backends/advanced/webui/src/components/knowledge-graph/EntityList.tsx
EntityCard adds inline editing mode with editable name/details inputs and Save/Cancel controls. EntityList propagates onEntityUpdated callback to trigger list and search result updates. Integrated knowledgeGraphApi.updateEntity API calls.

Frontend System Settings & API
Files: backends/advanced/webui/src/pages/System.tsx, backends/advanced/webui/src/services/api.ts
Added per_segment_speaker_id toggle under Speaker Identification Mode in Misc Configuration. New finetuningApi methods for orphaned annotations and cron job management. Extended systemApi.saveMiscSettings to include per_segment_speaker_id. New knowledgeGraphApi.updateEntity method.

Dependency Updates
Files: backends/advanced/pyproject.toml, backends/advanced/webui/package.json, extras/asr-services/pyproject.toml
Updated langfuse from >=3.3.0 to >=3.13.0,<4.0. Added croniter>=1.3.0 and bitsandbytes>=0.43.0 for vibevoice. Added cronstrue^2.50.0 to frontend for cron expression display.

Docker & Configuration
Files: backends/advanced/docker-compose.yml, backends/advanced/src/advanced_omi_backend/config.py, config/defaults.yml, extras/asr-services/docker-compose.yml
Removed neo4j profiles block. Added per_segment_speaker_id to misc settings persistence. New asr_services.vibevoice config section for batch_threshold_seconds, batch_duration_seconds, batch_overlap_seconds. Updated ASR docker-compose with QUANTIZATION env var and config volume mount.

Cleanup & Neo4j Integration
Files: backends/advanced/src/scripts/cleanup_state.py
BackupManager and CleanupManager now accept optional neo4j_driver. New _export_neo4j and _cleanup_neo4j methods. CleanupStats tracks neo4j_nodes_count, neo4j_relationships_count. Stats display and verification include Neo4j data counts.

Wizard & Service Management
Files: wizard.py, services.py
Added langfuse service to SERVICES configuration. New setup_langfuse_choice function for local/external/skip selection. Extended run_service_setup with LangFuse credential propagation. Updated service selection logic to skip LangFuse in default prompting, deferring to dedicated setup path.

Test Infrastructure
Files: extras/asr-services/tests/test_batching.py, tests/asr/batching_tests.robot, tests/resources/asr_keywords.robot
New Python unit tests for audio batching, stitching, context extraction, and GPU-based integration tests. New Robot Framework suite for VibeVoice batching validation with 4-minute and 1-minute audio tests. Tests verify segment coverage, speaker diarization, and no gaps between batches.

ASR Service Extensions
Files: extras/asr-services/common/base_service.py, extras/asr-services/providers/.../service.py, extras/asr-services/scripts/convert_to_ct2.py
All ASR providers (FasterWhisper, Nemo, Transformers, VibeVoice) updated to accept optional context_info parameter. New convert_to_ct2.py script for HuggingFace Whisper to CTranslate2 conversion with quantization support.

Speaker Recognition Frontend
Files: extras/speaker-recognition/webui/src/services/...
Removed hybridTranscribeAndDiarize from ApiService. Updated Deepgram service to always use /v1/listen endpoint. Reworked speakerIdentification hybrid path to use server-side diarize-identify-match endpoint instead of frontend Deepgram diarization. Updated response mapping for backend-provided speaker segments.

App & Job Initialization
Files: backends/advanced/src/advanced_omi_backend/app_factory.py, backends/advanced/src/advanced_omi_backend/models/job.py
App startup now initializes prompt registry with defaults and starts cron scheduler. RQ worker path initializes prompt registry after Beanie setup. LLM client initialization simplified to use create_openai_client factory.
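The Cron Scheduler System entry above describes a singleton that registers config-driven jobs with enable/disable, schedule-update, and manual-execution controls. A minimal sketch of that shape is below; it is an illustration, not the actual module: Redis state persistence and croniter-based schedule parsing are omitted, and the method names (register, set_enabled, update_schedule, run_now) are assumptions for this sketch.

```python
import threading
from dataclasses import dataclass
from typing import Callable


@dataclass
class CronJob:
    job_id: str
    schedule: str                 # cron expression, e.g. "0 3 * * *"
    func: Callable[[], object]    # the job body (e.g. enqueue an RQ job)
    enabled: bool = True


class CronScheduler:
    """Minimal singleton sketch of a config-driven job scheduler.

    The real implementation would also persist job state (to Redis) and
    compute next-run times from the cron expression (e.g. via croniter).
    """
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
                cls._instance._jobs = {}
            return cls._instance

    def register(self, job: CronJob) -> None:
        self._jobs[job.job_id] = job

    def set_enabled(self, job_id: str, enabled: bool) -> None:
        self._jobs[job_id].enabled = enabled

    def update_schedule(self, job_id: str, schedule: str) -> None:
        self._jobs[job_id].schedule = schedule

    def run_now(self, job_id: str):
        # Manual trigger, mirroring an admin 'run now' control.
        job = self._jobs[job_id]
        if not job.enabled:
            return None
        return job.func()
```

In this sketch, disabling a job makes both scheduled and manual runs no-ops, which matches the admin enable/disable semantics described in the walkthrough.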
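Similarly, the per-segment speaker identification mode (with its min_segment_duration parameter) can be sketched as follows. This is a simplified stand-in for _identify_per_segment, not the real client code: identify_fn represents the speaker-recognition service call, and the segment dict shape is an assumption.

```python
def identify_per_segment(segments, identify_fn, min_segment_duration=1.0):
    """Run speaker identification on each diarized segment individually.

    Segments shorter than min_segment_duration are skipped (left with no
    identified speaker) because very short clips are unreliable to embed.
    """
    results = []
    for seg in segments:
        duration = seg["end"] - seg["start"]
        label = identify_fn(seg) if duration >= min_segment_duration else None
        results.append({**seg, "identified_speaker": label})
    return results
```

The per-conversation alternative would identify speakers once over the whole audio; the per-segment path trades extra service calls for accuracy when speakers alternate quickly.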

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant API as API Server
    participant PromptRegistry as PromptRegistry
    participant LangFuse as LangFuse
    participant LLMProvider as LLMProvider

    Client->>API: Request with context
    API->>PromptRegistry: get_prompt(prompt_id, **variables)
    PromptRegistry->>LangFuse: Fetch prompt (async)
    alt LangFuse Available
        LangFuse-->>PromptRegistry: Prompt template
    else LangFuse Unavailable
        PromptRegistry->>PromptRegistry: Use default template
    end
    PromptRegistry-->>API: Compiled prompt
    API->>LLMProvider: Call with compiled prompt
    LLMProvider-->>API: Response
    API-->>Client: Result
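The prompt-resolution flow in the diagram above (remote fetch with fallback to defaults) can be sketched as a small wrapper. The client interface here, a get_prompt method returning a template string, is an assumption for illustration, not the real LangFuse SDK surface, and the prompt IDs in the usage are hypothetical.

```python
class PromptRegistry:
    """Sketch: resolve prompts from a remote store, falling back to
    bundled defaults when the store is unconfigured or unreachable."""

    def __init__(self, defaults, client=None):
        self._defaults = dict(defaults)  # prompt_id -> template string
        self._client = client            # None until LangFuse is configured

    def seed_prompts(self, prompts):
        # Register defaults without clobbering prompts already present.
        for prompt_id, template in prompts.items():
            self._defaults.setdefault(prompt_id, template)

    def get_prompt(self, prompt_id, **variables):
        template = None
        if self._client is not None:
            try:
                template = self._client.get_prompt(prompt_id)
            except Exception:
                template = None          # remote failure -> use defaults
        if template is None:
            template = self._defaults[prompt_id]
        return template.format(**variables)
```

Because seed_prompts never overwrites existing entries, plugins can register their own defaults at startup (the register_prompts extension point described later) without clobbering core prompts.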
sequenceDiagram
    participant Worker as RQ Worker
    participant TranscriptVersion as Transcript Version
    participant MemoryService as Memory Service
    participant SpeakerDiff as Speaker Diff
    participant LLMProvider as LLM Provider
    participant KnowledgeGraph as Knowledge Graph

    Worker->>TranscriptVersion: Fetch old/new segments
    Worker->>SpeakerDiff: compute_speaker_diff(old, new)
    SpeakerDiff-->>Worker: Change records
    Worker->>MemoryService: reprocess_memory(diff, context)
    MemoryService->>LLMProvider: propose_reprocess_actions
    LLMProvider-->>MemoryService: Memory updates (ADD/UPDATE/DELETE)
    MemoryService->>KnowledgeGraph: Apply entity updates
    KnowledgeGraph-->>MemoryService: Confirmation
    MemoryService-->>Worker: Success
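The compute_speaker_diff step in the diagram above could look roughly like this. The sketch pairs segments positionally, which assumes reprocessing preserved segment order and count; the field names (speaker, text) and the change-record shape are illustrative, not taken from the source.

```python
def compute_speaker_diff(old_segments, new_segments):
    """Pair old/new transcript segments by index and record which ones
    changed speaker and/or text after re-identification."""
    changes = []
    for index, (old, new) in enumerate(zip(old_segments, new_segments)):
        speaker_changed = old.get("speaker") != new.get("speaker")
        text_changed = old.get("text") != new.get("text")
        if speaker_changed or text_changed:
            changes.append({
                "index": index,
                "old_speaker": old.get("speaker"),
                "new_speaker": new.get("speaker"),
                "speaker_changed": speaker_changed,
                "text_changed": text_changed,
            })
    return changes
```

A non-empty diff is what would justify asking the LLM provider to propose ADD/UPDATE/DELETE memory actions; an empty diff lets reprocessing short-circuit.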
sequenceDiagram
    participant Transcriber as VibeVoice Transcriber
    participant AudioFile as Audio File
    participant Batching as Batching Module
    participant Batch as Single Batch
    participant LLMContext as Context Manager
    participant Stitching as Stitching Module

    Transcriber->>AudioFile: Load duration
    alt Duration > Batch Threshold
        Transcriber->>Batching: split_audio_file()
        Batching-->>Transcriber: List of windows with times
        loop For Each Window
            Transcriber->>Batch: transcribe(window, context_info)
            LLMContext->>Batch: Inject hot_words + previous_text
            Batch-->>Transcriber: Batch result
            Transcriber->>LLMContext: extract_context_tail()
            LLMContext-->>Transcriber: Context for next window
        end
        Transcriber->>Stitching: stitch_transcription_results()
        Stitching-->>Transcriber: Merged result
    else Duration <= Batch Threshold
        Transcriber->>Batch: transcribe(audio, context_info)
        Batch-->>Transcriber: Single result
    end
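The split/stitch logic in the batching diagram above can be sketched with plain window arithmetic. The real utilities are named split_audio_file and stitch_transcription_results and operate on audio files and provider results; this sketch uses simplified names and works on timestamps only, with parameters mirroring the batch_duration_seconds / batch_overlap_seconds settings.

```python
def split_windows(duration_s, batch_s, overlap_s):
    """Plan overlapping (start, end) windows covering [0, duration_s]."""
    windows, start, step = [], 0.0, batch_s - overlap_s
    while start < duration_s:
        end = min(start + batch_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:
            break
        start += step
    return windows


def stitch_results(window_results):
    """Merge per-window segment lists into one timeline.

    A segment is kept only if it starts at or after the span already
    covered, so the overlap region between batches is not emitted twice.
    """
    merged, covered_until = [], 0.0
    for segments in window_results:
        for seg in segments:
            if seg["start"] >= covered_until:
                merged.append(seg)
                covered_until = max(covered_until, seg["end"])
    return merged
```

The overlap gives each window the sentence tail of its predecessor, which (together with the injected hot_words and previous_text context) is what keeps the stitched transcript coherent across batch boundaries.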

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)

Title check (❓ Inconclusive)
Explanation: The title 'Feat/better reprocess memory' is vague and uses non-descriptive terminology that does not clearly convey the actual scope of changes, which includes LangFuse integration, cron scheduling, entity annotations, VibeVoice enhancements, and comprehensive memory reprocessing improvements.
Resolution: Revise the title to be more specific and descriptive, such as 'Add memory reprocessing, LangFuse integration, and cron job scheduling', or break the work into smaller PRs focusing on individual features.

✅ Passed checks (2 passed)

Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
Docstring Coverage (✅ Passed): Docstring coverage is 86.62%, which is sufficient; the required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



AnkushMalaker (Collaborator, Author) commented:

@coderabbitai review

coderabbitai bot (Contributor) commented Feb 8, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@AnkushMalaker AnkushMalaker changed the base branch from dev to fix/pre-release February 8, 2026 16:31