Skip to content

Comments

chore:test docs and test improvements #288

Merged
AnkushMalaker merged 3 commits intofix/onborading-improvementsfrom
fix/test-improvements
Feb 7, 2026
Merged

chore:test docs and test improvements #288
AnkushMalaker merged 3 commits intofix/onborading-improvementsfrom
fix/test-improvements

Conversation

@AnkushMalaker
Copy link
Collaborator

@AnkushMalaker AnkushMalaker commented Feb 6, 2026

Summary by CodeRabbit

  • Documentation

    • Restructured project documentation with new overview guide and system initialization docs; rebranded references from Friend-Lite to Chronicle.
  • New Features

    • Added interactive test environment setup tool for configuring API keys.
  • Improvements

    • Enhanced transcription error handling with clearer failure messages; increased request timeout for improved reliability.

- Added a new interactive setup script for configuring test API keys (Deepgram, OpenAI) to streamline the testing process.
- Introduced a template for the .env.test file to guide users in setting up their API keys.
- Updated the Makefile to include a new 'configure' target for setting up API keys.
- Enhanced the start-containers script to warn users if API keys are still set to placeholder values, improving user awareness during testing.
- Updated .gitignore to include the new .env.test.template file.
- Deleted the `features.md` file, consolidating its content into the new `overview.md` for a more streamlined documentation structure.
- Updated `init-system.md` to link to the new `overview.md` instead of the removed `features.md`.
- Removed `ports-and-access.md` as its content was integrated into other documentation files, enhancing clarity and reducing redundancy.
- Revised the `README.md` in the advanced backend to reflect the new naming conventions and updated links to documentation.
- Introduced a new `plugin-development-guide.md` to assist users in creating custom plugins, expanding the documentation for developers.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 6, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

  • 🔍 Trigger a full review
📝 Walkthrough

Walkthrough

The PR reorganizes documentation structure (moving features content to a new overview file), rebrand project references from Friend-Lite to Chronicle across docs and code, enhance test infrastructure with environment setup automation, improve transcription error handling with descriptive messages, and adjust timeout and version comparison logic in backend services.

Changes

Cohort / File(s) Summary
Documentation Restructuring & Reorganization
Docs/overview.md, Docs/init-system.md, Docs/features.md, Docs/ports-and-access.md, backends/advanced/Docs/...
New overview.md file created to replace features.md; init-system.md expanded with Ports & Access and Unified Service Management sections; features.md, ports-and-access.md, memory-configuration-guide.md, and MOCK_SPEAKER_IMPLEMENTATION.md deleted; architecture links updated across docs.
Project Branding Updates
README.md, README-K8S.md, quickstart.md, backends/advanced/README.md, config/README.md
Terminology changed from "Friend-Lite" to "Chronicle" across multiple readme and quickstart files; updated architecture documentation links; adjusted memory system references to reflect Chronicle as default.
Test Infrastructure & Setup Automation
.gitignore, tests/Makefile, tests/setup/.env.test.template, tests/setup/init.py, tests/bin/start-containers.sh
Added .env.test.template for test configuration; new init.py interactive setup script for API key configuration; Makefile configure target for test setup; start-containers.sh validation checks for placeholder API keys.
Transcription Error Handling & Logic Improvements
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py, backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
Enhanced error handling with descriptive exceptions (FileNotFoundError, RuntimeError with context); refactored speaker segment construction with speaker field normalization (None → "Speaker 0", integers → "Speaker N"); increased HTTP timeout from 120s to 300s.
Utility & Configuration Logic
setup_utils.py
Replaced string-based CUDA version comparison with integer tuple parsing for (major, minor) version checks against (12, 8), (12, 6), and (12, 1).
Removed Design Documentation
extras/asr-services/providers/vibevoice/impl.md
Deleted design discussion file covering vibevoice ASR integration considerations and architecture comparisons.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'chore: test docs and test improvements' partially describes the changeset, covering test-related improvements and some documentation changes, but does not capture the substantial restructuring of documentation files (removal of features.md, ports-and-access.md, and creation of overview.md) which represent a significant portion of the changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/test-improvements

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@AnkushMalaker
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 6, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py`:
- Around line 211-227: The exception handlers in transcription_jobs.py lose the
original exception chain; update the three raises to re-raise using exception
chaining (raise ... from e) so callers retain the original __cause__: change the
FileNotFoundError in the ValueError handler, the RuntimeError after "Failed to
reconstruct audio from MongoDB", and the RuntimeError after the
provider.transcribe failure to use "from e" while keeping the existing messages
and logging; reference the conversation_id variable,
wav_data/provider.transcribe call, and logger usage when making the edits.
🧹 Nitpick comments (3)
Docs/overview.md (1)

42-62: Minor: ASCII diagram box alignment is slightly off.

The top border (line 43), separator (line 45), and bottom border (line 61) have slightly different widths, and some inner rows (e.g., lines 46, 57) have trailing spaces that don't align with the box edges. This is cosmetic but may render oddly in some Markdown viewers.

backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)

151-152: Consider separating connect and read timeouts.

The 5-minute default is reasonable for transcribing large audio files, but passing a single scalar to httpx.AsyncClient(timeout=...) applies it uniformly to connect, read, write, and pool timeouts. A slow/unreachable server would block for 5 minutes before failing the connection.

Consider using httpx.Timeout to keep a short connect timeout while allowing a longer read timeout:

♻️ Suggested timeout separation
-        timeout = op.get("timeout", 300)
-        async with httpx.AsyncClient(timeout=timeout) as client:
+        timeout_val = op.get("timeout", 300)
+        timeout = httpx.Timeout(timeout_val, connect=10.0)
+        async with httpx.AsyncClient(timeout=timeout) as client:
Docs/init-system.md (1)

147-151: Inconsistent address for Speaker Recognition in the container-to-container section.

The heading states services use host.docker.internal, but Line 149 lists Speaker Recognition as http://127.0.0.1:8085 while the other services use host.docker.internal. This predates this PR, but since you're editing adjacent lines, it might be worth fixing for consistency.

Suggested fix
 ### Container-to-Container Communication
 Services use `host.docker.internal` for inter-container communication:
-- `http://127.0.0.1:8085` - Speaker Recognition
+- `http://host.docker.internal:8085` - Speaker Recognition
 - `http://host.docker.internal:8767` - Parakeet ASR
 - `http://host.docker.internal:8765` - OpenMemory MCP

Comment on lines 211 to 227
except ValueError as e:
# No chunks found for conversation
raise FileNotFoundError(f"No audio chunks found for conversation {conversation_id}: {e}")
except Exception as e:
logger.error(f"Failed to reconstruct audio from MongoDB: {e}", exc_info=True)
raise RuntimeError(f"Audio reconstruction failed: {e}")

try:
# Transcribe the audio directly from memory (no disk I/O needed)
transcription_result = await provider.transcribe(
audio_data=wav_data, # Pass bytes directly, already in memory
sample_rate=16000,
diarize=True,
)

except ValueError as e:
# No chunks found for conversation
raise FileNotFoundError(f"No audio chunks found for conversation {conversation_id}: {e}")
except Exception as e:
logger.error(f"Failed to reconstruct audio from MongoDB: {e}", exc_info=True)
raise RuntimeError(f"Audio reconstruction failed: {e}")
logger.error(f"Transcription failed for conversation {conversation_id}: {type(e).__name__}: {e}", exc_info=True)
raise RuntimeError(f"Transcription failed ({type(e).__name__}): {e}")
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Chain re-raised exceptions with from e to preserve traceback context.

All three raise statements inside except blocks lose the original exception chain. While exc_info=True in the logger preserves it in logs, any caller catching these exceptions won't see the original cause via __cause__.

This is also flagged by Ruff B904.

Proposed fix
     except ValueError as e:
         # No chunks found for conversation
-        raise FileNotFoundError(f"No audio chunks found for conversation {conversation_id}: {e}")
+        raise FileNotFoundError(f"No audio chunks found for conversation {conversation_id}: {e}") from e
     except Exception as e:
         logger.error(f"Failed to reconstruct audio from MongoDB: {e}", exc_info=True)
-        raise RuntimeError(f"Audio reconstruction failed: {e}")
+        raise RuntimeError(f"Audio reconstruction failed: {e}") from e
 
     try:
         # Transcribe the audio directly from memory (no disk I/O needed)
         transcription_result = await provider.transcribe(
             audio_data=wav_data,  # Pass bytes directly, already in memory
             sample_rate=16000,
             diarize=True,
         )
     except Exception as e:
         logger.error(f"Transcription failed for conversation {conversation_id}: {type(e).__name__}: {e}", exc_info=True)
-        raise RuntimeError(f"Transcription failed ({type(e).__name__}): {e}")
+        raise RuntimeError(f"Transcription failed ({type(e).__name__}): {e}") from e
🧰 Tools
🪛 Ruff (0.14.14)

[warning] 213-213: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


[warning] 213-213: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 216-216: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


[warning] 216-216: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 227-227: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


[warning] 227-227: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py`
around lines 211 - 227, The exception handlers in transcription_jobs.py lose the
original exception chain; update the three raises to re-raise using exception
chaining (raise ... from e) so callers retain the original __cause__: change the
FileNotFoundError in the ValueError handler, the RuntimeError after "Failed to
reconstruct audio from MongoDB", and the RuntimeError after the
provider.transcribe failure to use "from e" while keeping the existing messages
and logging; reference the conversation_id variable,
wav_data/provider.transcribe call, and logger usage when making the edits.

Copy link
Collaborator Author

@AnkushMalaker AnkushMalaker Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai

All three raise statements inside except blocks lose the original exception chain. While exc_info=True in the logger preserves it in logs, any caller catching these exceptions won't see the original cause via cause.

We should use the raise from pattern properly with logger.exception, I believe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AnkushMalaker, you're absolutely right! Let me clarify the relationship between raise from and logger.exception:

They serve complementary purposes:

  1. raise ... from e — Preserves the exception chain for callers. When code catches your raised exception, it can inspect __cause__ to see the original error. This is critical for debugging and proper exception handling up the call stack.

  2. logger.exception() — A shorthand for logger.error(..., exc_info=True) that automatically logs the current exception with its traceback. It's purely for logging/observability.

Best practice:

except Exception as e:
    logger.exception(f"Failed to reconstruct audio from MongoDB")
    raise RuntimeError(f"Audio reconstruction failed: {e}") from e

Or equivalently:

except Exception as e:
    logger.error(f"Failed to reconstruct audio from MongoDB: {e}", exc_info=True)
    raise RuntimeError(f"Audio reconstruction failed: {e}") from e

Both approaches log the full traceback and preserve the chain for callers. The original review suggestion is correct — all three raise statements should add from e as proposed in the diff suggestion above. 🎯

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gimme a prompt to fix

@AnkushMalaker AnkushMalaker changed the title Fix/test improvements chore/test docs and test improvements Feb 6, 2026
@AnkushMalaker AnkushMalaker changed the title chore/test docs and test improvements chore:test docs and test improvements Feb 6, 2026
@AnkushMalaker AnkushMalaker merged commit 26bf1c2 into fix/onborading-improvements Feb 7, 2026
2 of 3 checks passed
AnkushMalaker added a commit that referenced this pull request Feb 7, 2026
* Enhance setup utilities and wizard functionality

- Introduced `detect_tailscale_info` function to automatically retrieve Tailscale DNS name and IP address, improving user experience for service configuration.
- Added `detect_cuda_version` function to identify the system's CUDA version, streamlining compatibility checks for GPU-based services.
- Updated `wizard.py` to utilize the new detection functions, enhancing service selection and configuration processes based on user input.
- Improved error handling and user feedback in service setup, ensuring clearer communication during configuration steps.
- Refactored existing code to improve maintainability and code reuse across setup utilities.

* Update ASR service capabilities and improve speaker identification handling

- Modified the capabilities of the VibeVoice ASR provider to include 'speaker_identification' and 'long_form', enhancing its feature set.
- Adjusted the speaker identification logic in the VibeVoiceTranscriber to prevent double-prefixing and ensure accurate speaker representation.
- Updated protocol tests to reflect the expanded list of known ASR capabilities, ensuring comprehensive validation of reported features.

* Refactor audio recording controls for improved UI and functionality

- Replaced MicOff icon with Square icon in MainRecordingControls and SimplifiedControls for a more intuitive user experience.
- Enhanced button interactions to streamline recording start/stop actions, including a pulsing effect during recording.
- Updated status messages and button states to provide clearer feedback on recording status and actions.
- Improved accessibility by ensuring buttons are disabled appropriately based on recording state and microphone access.

* chore:test docs and test improvements  (#288)

* Enhance test environment setup and configuration

- Added a new interactive setup script for configuring test API keys (Deepgram, OpenAI) to streamline the testing process.
- Introduced a template for the .env.test file to guide users in setting up their API keys.
- Updated the Makefile to include a new 'configure' target for setting up API keys.
- Enhanced the start-containers script to warn users if API keys are still set to placeholder values, improving user awareness during testing.
- Updated .gitignore to include the new .env.test.template file.

* Remove outdated documentation and restructure feature overview

- Deleted the `features.md` file, consolidating its content into the new `overview.md` for a more streamlined documentation structure.
- Updated `init-system.md` to link to the new `overview.md` instead of the removed `features.md`.
- Removed `ports-and-access.md` as its content was integrated into other documentation files, enhancing clarity and reducing redundancy.
- Revised the `README.md` in the advanced backend to reflect the new naming conventions and updated links to documentation.
- Introduced a new `plugin-development-guide.md` to assist users in creating custom plugins, expanding the documentation for developers.

* tech debt
AnkushMalaker added a commit that referenced this pull request Feb 7, 2026
* Enhance StreamingTranscriptionConsumer and conversation job handling

- Removed cumulative audio offset tracking from StreamingTranscriptionConsumer as Deepgram provides cumulative timestamps directly.
- Updated store_final_result method to utilize Deepgram's cumulative timestamps without adjustments.
- Implemented completion signaling for transcription sessions in Redis, ensuring conversation jobs wait for all results before processing.
- Improved error handling to signal completion even in case of errors, preventing conversation jobs from hanging.
- Enhanced logging for better visibility of transcription completion and error states.

* Enhance ASR services configuration and provider management

- Updated `config.yml.template` to include capabilities for ASR providers, detailing features like word timestamps and speaker segments.
- Added a new `vibevoice` provider configuration for Microsoft VibeVoice ASR, supporting speaker diarization.
- Enhanced `.env.template` with clearer provider selection and model configuration options, including CUDA settings and voice activity detection.
- Improved `docker-compose.yml` to support multiple ASR providers with detailed service configurations.
- Introduced common utilities for audio processing and ASR service management in the `common` module, enhancing code reusability and maintainability.
- Updated `README.md` to reflect the new provider-based architecture and usage instructions for starting different ASR services.

* Enhance transcription provider support and capabilities management

- Added support for the new `vibevoice` transcription provider, including configuration options for built-in speaker diarization.
- Updated `ChronicleSetup` to include `vibevoice` in the transcription provider selection and adjusted related descriptions.
- Enhanced the `ModelDef` and `Conversation` models to reflect the addition of `vibevoice` in provider options.
- Introduced a new capabilities management system to validate provider features, allowing conditional execution of tasks based on provider capabilities.
- Improved logging and user feedback in transcription and speaker recognition jobs to reflect the capabilities of the selected provider.
- Updated documentation to include details on the new `vibevoice` provider and its features.

* Enhance conversation reprocessing and job management

- Introduced a new job for regenerating title and summary after memory processing to ensure fresh context is available.
- Updated the reprocess_transcript and reprocess_speakers functions to enqueue title/summary jobs based on memory job dependencies, improving job chaining and execution order.
- Enhanced validation for transcripts to account for provider capabilities, ensuring proper handling of diarization and segment data.
- Improved logging for job enqueuing and processing stages, providing clearer insights into the workflow and dependencies.

* Enhance Knowledge Graph integration and service management

- Introduced support for Knowledge Graph functionality, enabling entity and relationship extraction from conversations using Neo4j.
- Updated `services.py` to manage Knowledge Graph profiles and integrate with existing service commands.
- Enhanced Docker Compose configurations to include Neo4j service and environment variables for Knowledge Graph setup.
- Added new API routes and models for Knowledge Graph operations, including entity and relationship management.
- Improved documentation and configuration templates to reflect the new Knowledge Graph features and setup instructions.

* Add Knowledge Graph API routes and integrate into backend

- Introduced new `knowledge_graph_routes.py` to handle API endpoints for managing knowledge graph entities, relationships, and promises.
- Updated `__init__.py` to include the new knowledge graph router in the main router module.
- Enhanced documentation to reflect the addition of knowledge graph functionality, improving clarity on available API routes and their purposes.

* Update .gitignore to include individual plugin configuration files and SDK directory

- Added entries for individual plugin config files to ensure user-specific settings are ignored.
- Included the SDK directory in .gitignore to prevent unnecessary files from being tracked.

* Fix: onborading improvements (#287)

* Enhance setup utilities and wizard functionality

- Introduced `detect_tailscale_info` function to automatically retrieve Tailscale DNS name and IP address, improving user experience for service configuration.
- Added `detect_cuda_version` function to identify the system's CUDA version, streamlining compatibility checks for GPU-based services.
- Updated `wizard.py` to utilize the new detection functions, enhancing service selection and configuration processes based on user input.
- Improved error handling and user feedback in service setup, ensuring clearer communication during configuration steps.
- Refactored existing code to improve maintainability and code reuse across setup utilities.

* Update ASR service capabilities and improve speaker identification handling

- Modified the capabilities of the VibeVoice ASR provider to include 'speaker_identification' and 'long_form', enhancing its feature set.
- Adjusted the speaker identification logic in the VibeVoiceTranscriber to prevent double-prefixing and ensure accurate speaker representation.
- Updated protocol tests to reflect the expanded list of known ASR capabilities, ensuring comprehensive validation of reported features.

* Refactor audio recording controls for improved UI and functionality

- Replaced MicOff icon with Square icon in MainRecordingControls and SimplifiedControls for a more intuitive user experience.
- Enhanced button interactions to streamline recording start/stop actions, including a pulsing effect during recording.
- Updated status messages and button states to provide clearer feedback on recording status and actions.
- Improved accessibility by ensuring buttons are disabled appropriately based on recording state and microphone access.

* chore:test docs and test improvements  (#288)

* Enhance test environment setup and configuration

- Added a new interactive setup script for configuring test API keys (Deepgram, OpenAI) to streamline the testing process.
- Introduced a template for the .env.test file to guide users in setting up their API keys.
- Updated the Makefile to include a new 'configure' target for setting up API keys.
- Enhanced the start-containers script to warn users if API keys are still set to placeholder values, improving user awareness during testing.
- Updated .gitignore to include the new .env.test.template file.

* Remove outdated documentation and restructure feature overview

- Deleted the `features.md` file, consolidating its content into the new `overview.md` for a more streamlined documentation structure.
- Updated `init-system.md` to link to the new `overview.md` instead of the removed `features.md`.
- Removed `ports-and-access.md` as its content was integrated into other documentation files, enhancing clarity and reducing redundancy.
- Revised the `README.md` in the advanced backend to reflect the new naming conventions and updated links to documentation.
- Introduced a new `plugin-development-guide.md` to assist users in creating custom plugins, expanding the documentation for developers.

* tech debt
@github-actions
Copy link

github-actions bot commented Feb 7, 2026

⚠️ Robot Framework Test Results (No API Keys)

Status: ❌ Some tests failed

ℹ️ Note: This run excludes tests requiring external API keys (Deepgram, OpenAI).
Tests tagged with requires-api-keys will run on dev/main branches.

Metric Count
✅ Passed 102
❌ Failed 17
📊 Total 119

📊 View Reports

GitHub Pages (Live Reports):

Download Artifacts:


View full workflow run

@AnkushMalaker AnkushMalaker deleted the fix/test-improvements branch February 7, 2026 02:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant