chore:test docs and test improvements #288
chore:test docs and test improvements #288AnkushMalaker merged 3 commits intofix/onborading-improvementsfrom
Conversation
- Added a new interactive setup script for configuring test API keys (Deepgram, OpenAI) to streamline the testing process. - Introduced a template for the .env.test file to guide users in setting up their API keys. - Updated the Makefile to include a new 'configure' target for setting up API keys. - Enhanced the start-containers script to warn users if API keys are still set to placeholder values, improving user awareness during testing. - Updated .gitignore to include the new .env.test.template file.
- Deleted the `features.md` file, consolidating its content into the new `overview.md` for a more streamlined documentation structure. - Updated `init-system.md` to link to the new `overview.md` instead of the removed `features.md`. - Removed `ports-and-access.md` as its content was integrated into other documentation files, enhancing clarity and reducing redundancy. - Revised the `README.md` in the advanced backend to reflect the new naming conventions and updated links to documentation. - Introduced a new `plugin-development-guide.md` to assist users in creating custom plugins, expanding the documentation for developers.
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the
📝 WalkthroughWalkthroughThe PR reorganizes documentation structure (moving features content to a new overview file), rebrand project references from Friend-Lite to Chronicle across docs and code, enhance test infrastructure with environment setup automation, improve transcription error handling with descriptive messages, and adjust timeout and version comparison logic in backend services. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py`:
- Around line 211-227: The exception handlers in transcription_jobs.py lose the
original exception chain; update the three raises to re-raise using exception
chaining (raise ... from e) so callers retain the original __cause__: change the
FileNotFoundError in the ValueError handler, the RuntimeError after "Failed to
reconstruct audio from MongoDB", and the RuntimeError after the
provider.transcribe failure to use "from e" while keeping the existing messages
and logging; reference the conversation_id variable,
wav_data/provider.transcribe call, and logger usage when making the edits.
🧹 Nitpick comments (3)
Docs/overview.md (1)
42-62: Minor: ASCII diagram box alignment is slightly off.The top border (line 43), separator (line 45), and bottom border (line 61) have slightly different widths, and some inner rows (e.g., lines 46, 57) have trailing spaces that don't align with the box edges. This is cosmetic but may render oddly in some Markdown viewers.
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
151-152: Consider separating connect and read timeouts.The 5-minute default is reasonable for transcribing large audio files, but passing a single scalar to
httpx.AsyncClient(timeout=...)applies it uniformly to connect, read, write, and pool timeouts. A slow/unreachable server would block for 5 minutes before failing the connection.Consider using
httpx.Timeoutto keep a short connect timeout while allowing a longer read timeout:♻️ Suggested timeout separation
- timeout = op.get("timeout", 300) - async with httpx.AsyncClient(timeout=timeout) as client: + timeout_val = op.get("timeout", 300) + timeout = httpx.Timeout(timeout_val, connect=10.0) + async with httpx.AsyncClient(timeout=timeout) as client:Docs/init-system.md (1)
147-151: Inconsistent address for Speaker Recognition in the container-to-container section.The heading states services use
host.docker.internal, but Line 149 lists Speaker Recognition ashttp://127.0.0.1:8085while the other services usehost.docker.internal. This predates this PR, but since you're editing adjacent lines, it might be worth fixing for consistency.Suggested fix
### Container-to-Container Communication Services use `host.docker.internal` for inter-container communication: -- `http://127.0.0.1:8085` - Speaker Recognition +- `http://host.docker.internal:8085` - Speaker Recognition - `http://host.docker.internal:8767` - Parakeet ASR - `http://host.docker.internal:8765` - OpenMemory MCP
| except ValueError as e: | ||
| # No chunks found for conversation | ||
| raise FileNotFoundError(f"No audio chunks found for conversation {conversation_id}: {e}") | ||
| except Exception as e: | ||
| logger.error(f"Failed to reconstruct audio from MongoDB: {e}", exc_info=True) | ||
| raise RuntimeError(f"Audio reconstruction failed: {e}") | ||
|
|
||
| try: | ||
| # Transcribe the audio directly from memory (no disk I/O needed) | ||
| transcription_result = await provider.transcribe( | ||
| audio_data=wav_data, # Pass bytes directly, already in memory | ||
| sample_rate=16000, | ||
| diarize=True, | ||
| ) | ||
|
|
||
| except ValueError as e: | ||
| # No chunks found for conversation | ||
| raise FileNotFoundError(f"No audio chunks found for conversation {conversation_id}: {e}") | ||
| except Exception as e: | ||
| logger.error(f"Failed to reconstruct audio from MongoDB: {e}", exc_info=True) | ||
| raise RuntimeError(f"Audio reconstruction failed: {e}") | ||
| logger.error(f"Transcription failed for conversation {conversation_id}: {type(e).__name__}: {e}", exc_info=True) | ||
| raise RuntimeError(f"Transcription failed ({type(e).__name__}): {e}") |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Chain re-raised exceptions with from e to preserve traceback context.
All three raise statements inside except blocks lose the original exception chain. While exc_info=True in the logger preserves it in logs, any caller catching these exceptions won't see the original cause via __cause__.
This is also flagged by Ruff B904.
Proposed fix
except ValueError as e:
# No chunks found for conversation
- raise FileNotFoundError(f"No audio chunks found for conversation {conversation_id}: {e}")
+ raise FileNotFoundError(f"No audio chunks found for conversation {conversation_id}: {e}") from e
except Exception as e:
logger.error(f"Failed to reconstruct audio from MongoDB: {e}", exc_info=True)
- raise RuntimeError(f"Audio reconstruction failed: {e}")
+ raise RuntimeError(f"Audio reconstruction failed: {e}") from e
try:
# Transcribe the audio directly from memory (no disk I/O needed)
transcription_result = await provider.transcribe(
audio_data=wav_data, # Pass bytes directly, already in memory
sample_rate=16000,
diarize=True,
)
except Exception as e:
logger.error(f"Transcription failed for conversation {conversation_id}: {type(e).__name__}: {e}", exc_info=True)
- raise RuntimeError(f"Transcription failed ({type(e).__name__}): {e}")
+ raise RuntimeError(f"Transcription failed ({type(e).__name__}): {e}") from e🧰 Tools
🪛 Ruff (0.14.14)
[warning] 213-213: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
[warning] 213-213: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 216-216: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
[warning] 216-216: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 227-227: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
[warning] 227-227: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py`
around lines 211 - 227, The exception handlers in transcription_jobs.py lose the
original exception chain; update the three raises to re-raise using exception
chaining (raise ... from e) so callers retain the original __cause__: change the
FileNotFoundError in the ValueError handler, the RuntimeError after "Failed to
reconstruct audio from MongoDB", and the RuntimeError after the
provider.transcribe failure to use "from e" while keeping the existing messages
and logging; reference the conversation_id variable,
wav_data/provider.transcribe call, and logger usage when making the edits.
There was a problem hiding this comment.
All three raise statements inside except blocks lose the original exception chain. While exc_info=True in the logger preserves it in logs, any caller catching these exceptions won't see the original cause via cause.
We should use the raise from pattern properly with logger.exception, I believe?
There was a problem hiding this comment.
@AnkushMalaker, you're absolutely right! Let me clarify the relationship between raise from and logger.exception:
They serve complementary purposes:
-
raise ... from e— Preserves the exception chain for callers. When code catches your raised exception, it can inspect__cause__to see the original error. This is critical for debugging and proper exception handling up the call stack. -
logger.exception()— A shorthand forlogger.error(..., exc_info=True)that automatically logs the current exception with its traceback. It's purely for logging/observability.
Best practice:
except Exception as e:
logger.exception(f"Failed to reconstruct audio from MongoDB")
raise RuntimeError(f"Audio reconstruction failed: {e}") from eOr equivalently:
except Exception as e:
logger.error(f"Failed to reconstruct audio from MongoDB: {e}", exc_info=True)
raise RuntimeError(f"Audio reconstruction failed: {e}") from eBoth approaches log the full traceback and preserve the chain for callers. The original review suggestion is correct — all three raise statements should add from e as proposed in the diff suggestion above. 🎯
There was a problem hiding this comment.
gimme a prompt to fix
26bf1c2
into
fix/onborading-improvements
* Enhance setup utilities and wizard functionality - Introduced `detect_tailscale_info` function to automatically retrieve Tailscale DNS name and IP address, improving user experience for service configuration. - Added `detect_cuda_version` function to identify the system's CUDA version, streamlining compatibility checks for GPU-based services. - Updated `wizard.py` to utilize the new detection functions, enhancing service selection and configuration processes based on user input. - Improved error handling and user feedback in service setup, ensuring clearer communication during configuration steps. - Refactored existing code to improve maintainability and code reuse across setup utilities. * Update ASR service capabilities and improve speaker identification handling - Modified the capabilities of the VibeVoice ASR provider to include 'speaker_identification' and 'long_form', enhancing its feature set. - Adjusted the speaker identification logic in the VibeVoiceTranscriber to prevent double-prefixing and ensure accurate speaker representation. - Updated protocol tests to reflect the expanded list of known ASR capabilities, ensuring comprehensive validation of reported features. * Refactor audio recording controls for improved UI and functionality - Replaced MicOff icon with Square icon in MainRecordingControls and SimplifiedControls for a more intuitive user experience. - Enhanced button interactions to streamline recording start/stop actions, including a pulsing effect during recording. - Updated status messages and button states to provide clearer feedback on recording status and actions. - Improved accessibility by ensuring buttons are disabled appropriately based on recording state and microphone access. * chore:test docs and test improvements (#288) * Enhance test environment setup and configuration - Added a new interactive setup script for configuring test API keys (Deepgram, OpenAI) to streamline the testing process. - Introduced a template for the .env.test file to guide users in setting up their API keys. - Updated the Makefile to include a new 'configure' target for setting up API keys. - Enhanced the start-containers script to warn users if API keys are still set to placeholder values, improving user awareness during testing. - Updated .gitignore to include the new .env.test.template file. * Remove outdated documentation and restructure feature overview - Deleted the `features.md` file, consolidating its content into the new `overview.md` for a more streamlined documentation structure. - Updated `init-system.md` to link to the new `overview.md` instead of the removed `features.md`. - Removed `ports-and-access.md` as its content was integrated into other documentation files, enhancing clarity and reducing redundancy. - Revised the `README.md` in the advanced backend to reflect the new naming conventions and updated links to documentation. - Introduced a new `plugin-development-guide.md` to assist users in creating custom plugins, expanding the documentation for developers. * tech debt
* Enhance StreamingTranscriptionConsumer and conversation job handling - Removed cumulative audio offset tracking from StreamingTranscriptionConsumer as Deepgram provides cumulative timestamps directly. - Updated store_final_result method to utilize Deepgram's cumulative timestamps without adjustments. - Implemented completion signaling for transcription sessions in Redis, ensuring conversation jobs wait for all results before processing. - Improved error handling to signal completion even in case of errors, preventing conversation jobs from hanging. - Enhanced logging for better visibility of transcription completion and error states. * Enhance ASR services configuration and provider management - Updated `config.yml.template` to include capabilities for ASR providers, detailing features like word timestamps and speaker segments. - Added a new `vibevoice` provider configuration for Microsoft VibeVoice ASR, supporting speaker diarization. - Enhanced `.env.template` with clearer provider selection and model configuration options, including CUDA settings and voice activity detection. - Improved `docker-compose.yml` to support multiple ASR providers with detailed service configurations. - Introduced common utilities for audio processing and ASR service management in the `common` module, enhancing code reusability and maintainability. - Updated `README.md` to reflect the new provider-based architecture and usage instructions for starting different ASR services. * Enhance transcription provider support and capabilities management - Added support for the new `vibevoice` transcription provider, including configuration options for built-in speaker diarization. - Updated `ChronicleSetup` to include `vibevoice` in the transcription provider selection and adjusted related descriptions. - Enhanced the `ModelDef` and `Conversation` models to reflect the addition of `vibevoice` in provider options. - Introduced a new capabilities management system to validate provider features, allowing conditional execution of tasks based on provider capabilities. - Improved logging and user feedback in transcription and speaker recognition jobs to reflect the capabilities of the selected provider. - Updated documentation to include details on the new `vibevoice` provider and its features. * Enhance conversation reprocessing and job management - Introduced a new job for regenerating title and summary after memory processing to ensure fresh context is available. - Updated the reprocess_transcript and reprocess_speakers functions to enqueue title/summary jobs based on memory job dependencies, improving job chaining and execution order. - Enhanced validation for transcripts to account for provider capabilities, ensuring proper handling of diarization and segment data. - Improved logging for job enqueuing and processing stages, providing clearer insights into the workflow and dependencies. * Enhance Knowledge Graph integration and service management - Introduced support for Knowledge Graph functionality, enabling entity and relationship extraction from conversations using Neo4j. - Updated `services.py` to manage Knowledge Graph profiles and integrate with existing service commands. - Enhanced Docker Compose configurations to include Neo4j service and environment variables for Knowledge Graph setup. - Added new API routes and models for Knowledge Graph operations, including entity and relationship management. - Improved documentation and configuration templates to reflect the new Knowledge Graph features and setup instructions. * Add Knowledge Graph API routes and integrate into backend - Introduced new `knowledge_graph_routes.py` to handle API endpoints for managing knowledge graph entities, relationships, and promises. - Updated `__init__.py` to include the new knowledge graph router in the main router module. - Enhanced documentation to reflect the addition of knowledge graph functionality, improving clarity on available API routes and their purposes. * Update .gitignore to include individual plugin configuration files and SDK directory - Added entries for individual plugin config files to ensure user-specific settings are ignored. - Included the SDK directory in .gitignore to prevent unnecessary files from being tracked. * Fix: onborading improvements (#287) * Enhance setup utilities and wizard functionality - Introduced `detect_tailscale_info` function to automatically retrieve Tailscale DNS name and IP address, improving user experience for service configuration. - Added `detect_cuda_version` function to identify the system's CUDA version, streamlining compatibility checks for GPU-based services. - Updated `wizard.py` to utilize the new detection functions, enhancing service selection and configuration processes based on user input. - Improved error handling and user feedback in service setup, ensuring clearer communication during configuration steps. - Refactored existing code to improve maintainability and code reuse across setup utilities. * Update ASR service capabilities and improve speaker identification handling - Modified the capabilities of the VibeVoice ASR provider to include 'speaker_identification' and 'long_form', enhancing its feature set. - Adjusted the speaker identification logic in the VibeVoiceTranscriber to prevent double-prefixing and ensure accurate speaker representation. - Updated protocol tests to reflect the expanded list of known ASR capabilities, ensuring comprehensive validation of reported features. * Refactor audio recording controls for improved UI and functionality - Replaced MicOff icon with Square icon in MainRecordingControls and SimplifiedControls for a more intuitive user experience. - Enhanced button interactions to streamline recording start/stop actions, including a pulsing effect during recording. - Updated status messages and button states to provide clearer feedback on recording status and actions. - Improved accessibility by ensuring buttons are disabled appropriately based on recording state and microphone access. * chore:test docs and test improvements (#288) * Enhance test environment setup and configuration - Added a new interactive setup script for configuring test API keys (Deepgram, OpenAI) to streamline the testing process. - Introduced a template for the .env.test file to guide users in setting up their API keys. - Updated the Makefile to include a new 'configure' target for setting up API keys. - Enhanced the start-containers script to warn users if API keys are still set to placeholder values, improving user awareness during testing. - Updated .gitignore to include the new .env.test.template file. * Remove outdated documentation and restructure feature overview - Deleted the `features.md` file, consolidating its content into the new `overview.md` for a more streamlined documentation structure. - Updated `init-system.md` to link to the new `overview.md` instead of the removed `features.md`. - Removed `ports-and-access.md` as its content was integrated into other documentation files, enhancing clarity and reducing redundancy. - Revised the `README.md` in the advanced backend to reflect the new naming conventions and updated links to documentation. - Introduced a new `plugin-development-guide.md` to assist users in creating custom plugins, expanding the documentation for developers. * tech debt
|
| Metric | Count |
|---|---|
| ✅ Passed | 102 |
| ❌ Failed | 17 |
| 📊 Total | 119 |
📊 View Reports
GitHub Pages (Live Reports):
Download Artifacts:
- robot-test-reports-html-no-api - HTML reports
- robot-test-results-xml-no-api - XML output
Summary by CodeRabbit
Documentation
New Features
Improvements