Conversation
- Updated `services.py` to include new options for service management, allowing for forced recreation of containers during startup.
- Added LangFuse configuration options in the setup wizard, improving user experience for observability setup.
- Introduced new API endpoints for retrieving observability configuration, enhancing integration with the frontend.
- Enhanced error handling and logging for service startup processes, ensuring better visibility of configuration issues.
- Updated documentation to reflect changes in service management and LangFuse integration.
- Updated README.md to include a new section for the Desktop Menu Bar Client with an accompanying screenshot.
- Added configuration options for the new `galileo` ASR provider in `pyproject.toml` and `uv.lock`, enhancing support for additional audio processing capabilities.
- Modified Dockerfile to include `galileo` as an extra dependency for both main and test environments, improving service management.
- Enhanced job handling in `queue_controller.py` to track batch progress for transcription jobs, providing better user feedback during processing.
- Updated Queue.tsx to display batch progress for audio transcription jobs, improving user experience in the web interface.
- Refactored System.tsx to allow restarting both workers and backend services, enhancing service management capabilities.
📝 Walkthrough

This change introduces batch progress streaming support for ASR services, enabling long-audio transcription to report progress updates via NDJSON streaming. It updates the backend ASR base class and VibeVoice provider with progress-aware methods, extends the UI with an event details modal and batch progress visualization, adds a combined restart option, and updates dependencies and configurations.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client / Frontend
    participant Backend as ASR Backend Service
    participant Transcriber as Transcriber
    participant UI as Queue UI
    Client->>Backend: transcribe_endpoint(audio_file)
    Backend->>Backend: Get audio duration
    Backend->>Backend: Check supports_batch_progress(duration)
    alt Batch Progress Supported
        Backend->>Transcriber: transcribe_with_progress(audio_file)
        loop For Each Window
            Transcriber->>Transcriber: Process audio window
            Transcriber-->>Backend: yield {type: "progress", percent, message}
            Backend-->>Client: Stream progress (NDJSON)
            Client->>UI: Update batch_progress display
        end
        Transcriber-->>Backend: yield {type: "result", data}
        Backend-->>Client: Final result via stream
    else No Batch Progress
        Backend->>Transcriber: transcribe(audio_file)
        Transcriber-->>Backend: result
        Backend-->>Client: result (JSONResponse)
    end
    Client->>UI: Display final result
```
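The diagram describes a simple line-delimited JSON (NDJSON) contract: `progress` events followed by one terminal `result` event. As a rough consumer-side sketch — the endpoint URL, the `requests` client, and the exact response field names are assumptions inferred from the diagram, not verified against the repo:

```python
import json

import requests  # assumed HTTP client; any streaming-capable client works


def stream_transcription(url: str, audio_path: str) -> dict | None:
    """Consume NDJSON progress events and return the final result payload."""
    with open(audio_path, "rb") as f:
        resp = requests.post(url, files={"file": f}, stream=True)
        resp.raise_for_status()
        for line in resp.iter_lines():
            if not line:
                continue  # skip keep-alive blank lines
            event = json.loads(line)
            if event["type"] == "progress":
                print(f"{event['percent']}% - {event['message']}")
            elif event["type"] == "result":
                return event["data"]
    return None
```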
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
@coderabbitai review
✅ Actions performed: Review triggered.
| Metric | Count |
|---|---|
| ✅ Passed | 104 |
| ❌ Failed | 17 |
| 📊 Total | 121 |
📊 View Reports
GitHub Pages (Live Reports):
Download Artifacts:
- robot-test-reports-html-no-api - HTML reports
- robot-test-results-xml-no-api - XML output
Actionable comments posted: 8
🧹 Nitpick comments (9)
backends/advanced/webui/src/pages/System.tsx (2)
111-122: Missing error handling in sequential restart flow.

If the worker restart succeeds but the backend restart mutation fails, the user receives no feedback — the modal is already dismissed (line 112), no overlay appears, and no error is shown. While the individual handlers (`handleRestartWorkers`, `handleRestartBackend`) also lack `onError`, the sequential composition makes silent partial failure more confusing since the user expects both to happen.

Consider adding `onError` callbacks to surface failures, at minimum on the inner mutation:

Suggested improvement

```diff
 const handleRestartBoth = () => {
   setConfirmModal(null)
   restartWorkersMutation.mutate(undefined, {
     onSuccess: () => {
       restartBackendMutation.mutate(undefined, {
         onSuccess: () => {
           pollHealth()
         },
+        onError: () => {
+          // Workers restarted but backend failed — still show the worker banner
+          setWorkerBanner(true)
+          setTimeout(() => setWorkerBanner(false), 8000)
+        },
       })
     },
+    onError: () => {
+      // Could not restart workers; don't proceed to backend
+    },
   })
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backends/advanced/webui/src/pages/System.tsx` around lines 111 - 122, The sequential restart flow in handleRestartBoth dismisses the confirmation modal immediately and calls restartWorkersMutation followed by restartBackendMutation but lacks onError handling, so failures (especially in restartBackendMutation) are silent; update handleRestartBoth to add onError callbacks for restartBackendMutation (and ideally restartWorkersMutation) that restore or keep the modal/overlay and surface an error to the user via the existing UI error mechanism (e.g., call setConfirmModal to show an error state or trigger the same error handler used elsewhere), and ensure pollHealth() is called only on overall success; reference restartWorkersMutation, restartBackendMutation, handleRestartBoth, setConfirmModal, and pollHealth when making these changes.
341-373: Consider extracting shared modal content to reduce duplication.

The 'both' confirmation modal block (lines 341-373) is nearly identical to the 'backend' block (lines 308-340) — same warning box, same styling, same cancel button structure. The only differences are the icon, title, description text, and the action handler. A small helper component or parameterized block could DRY this up and make future changes to the confirmation UX less error-prone.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backends/advanced/webui/src/pages/System.tsx` around lines 341 - 373, The 'both' and 'backend' confirmation modal JSX are nearly identical; extract the shared UI into a small reusable component (e.g., ConfirmModalContent or ConfirmActionModal) that accepts props for icon, title, description text, warning message, confirm button label, and the confirm handler (pass handleRestartBoth or handleRestartBackend), and use the existing cancel handler (setConfirmModal(null)) for the cancel button; replace the duplicated JSX in System.tsx with this new component to DRY up the modal rendering while preserving existing handlers and styles.

backends/advanced/src/advanced_omi_backend/controllers/queue_controller.py (1)
230-231: Extract `batch_progress` to avoid a defensive-coding anti-pattern

The `batch_progress` key is never set in the codebase, so the `None` value edge case is currently theoretical. However, the chained `.get()` pattern is a defensive-coding anti-pattern: `.get("batch_progress", {})` returns the stored value (even if `None`) when the key exists, bypassing the default `{}`. Additionally, `(job.meta or {})` is computed twice (lines 230 and 231).

For robustness, extract `batch_progress` to a local variable and guard with `or {}` to defend against future code that might set it to `None`:

Suggested refactor

```diff
+ batch_progress = (job.meta or {}).get("batch_progress") or {}
  all_jobs.append({
      ...
-     "progress_percent": (job.meta or {}).get("batch_progress", {}).get("percent", 0),
-     "progress_message": (job.meta or {}).get("batch_progress", {}).get("message", ""),
+     "progress_percent": batch_progress.get("percent", 0),
+     "progress_message": batch_progress.get("message", ""),
  })
```

This eliminates duplication and makes the defensive intent explicit.
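For illustration, the `None`-bypass described above is easy to reproduce in isolation (a standalone sketch, not code from this repo):

```python
meta = {"batch_progress": None}  # key exists, but the value is None

# dict.get only falls back to the default when the key is ABSENT,
# so a stored None comes back as-is and the chained .get() would crash:
value = meta.get("batch_progress", {})  # -> None, not {}
# value.get("percent", 0)              # AttributeError: 'NoneType' has no 'get'

# Guarding with `or {}` normalizes both the missing-key and None cases:
batch_progress = meta.get("batch_progress") or {}
print(batch_progress.get("percent", 0))  # -> 0
```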
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backends/advanced/src/advanced_omi_backend/controllers/queue_controller.py` around lines 230 - 231, The two chained .get() calls should be simplified by extracting batch_progress into a local variable to avoid the defensive-coding pitfall and duplicate (job.meta or {}) computation: compute meta = (job.meta or {}) once, then batch_progress = meta.get("batch_progress") or {} and use batch_progress.get("percent", 0) and batch_progress.get("message", "") when building the dict (referencing job.meta and batch_progress to locate the code in queue_controller.py).

backends/advanced/Dockerfile (1)
26-27: galileo extras are unconditionally baked into the production image.

Adding `--extra galileo` to the builder stage installs the full galileo + OpenTelemetry stack into every production image. `opentelemetry-exporter-otlp` pulls in the gRPC variant of the OTLP exporter, which depends on `grpcio` — a sizeable native C extension. Users who never configure Galileo observability pay the image-size and attack-surface cost unconditionally.

If galileo tracing is opt-in (activated via `GALILEO_API_KEY` / config), consider omitting `--extra galileo` from the production stage and documenting how to build a galileo-enabled variant:

♻️ Suggested approach

```diff
 RUN --mount=type=cache,target=/root/.cache/uv \
-    uv export --frozen --no-dev --extra deepgram --extra galileo --no-emit-project -o requirements.txt && \
+    uv export --frozen --no-dev --extra deepgram --no-emit-project -o requirements.txt && \
     uv pip install --system -r requirements.txt
```

Add a separate build-arg-gated stage or a distinct `Dockerfile.galileo` for users who need it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backends/advanced/Dockerfile` around lines 26 - 27, The Dockerfile unconditionally includes the galileo extras by using "uv export ... --extra galileo" which bakes the galileo/OpenTelemetry stack (and heavy deps like grpcio) into every production image; remove the "--extra galileo" from the production-stage uv export/pip install lines and instead add an opt-in build variant (either a build-arg-gated stage or a separate Dockerfile.galileo) that runs "uv export ... --extra galileo" and "uv pip install" only when GALILEO is requested so default images stay small and free of grpcio.

backends/advanced/webui/src/pages/Queue.tsx (1)
2664-2666: Event Details Modal missing ARIA dialog attributes.

Like the existing Job Details Modal, this modal lacks `role="dialog"`, `aria-modal="true"`, `aria-labelledby`, and programmatic focus management (focus should move to the modal/close button on open). Screen readers won't announce the modal and keyboard users can't navigate to it correctly.

♿ Suggested minimal accessibility fix

```diff
-<div className="fixed inset-0 bg-gray-600 bg-opacity-50 overflow-y-auto h-full w-full z-50">
-  <div className="relative top-20 mx-auto p-5 border w-11/12 md:w-2/3 lg:w-1/2 shadow-lg rounded-md bg-white">
-    <div className="flex justify-between items-center mb-4">
-      <h3 className="text-lg font-medium text-gray-900">Event Details</h3>
-      <button onClick={() => setSelectedEvent(null)} className="text-gray-400 hover:text-gray-600">
+<div className="fixed inset-0 bg-gray-600 bg-opacity-50 overflow-y-auto h-full w-full z-50" role="presentation">
+  <div
+    className="relative top-20 mx-auto p-5 border w-11/12 md:w-2/3 lg:w-1/2 shadow-lg rounded-md bg-white"
+    role="dialog"
+    aria-modal="true"
+    aria-labelledby="event-detail-title"
+  >
+    <div className="flex justify-between items-center mb-4">
+      <h3 id="event-detail-title" className="text-lg font-medium text-gray-900">Event Details</h3>
+      <button onClick={() => setSelectedEvent(null)} className="text-gray-400 hover:text-gray-600" autoFocus>
```

The same improvement applies to the pre-existing Job Details and Flush Jobs modals. Addressing all three consistently would be ideal.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backends/advanced/webui/src/pages/Queue.tsx` around lines 2664 - 2666, The Event Details modal rendered when selectedEvent is truthy is missing proper ARIA dialog attributes and focus management; update that modal container (the div opened where selectedEvent && (...) is returned) to include role="dialog", aria-modal="true", and aria-labelledby pointing to the modal title element (add an id on the title), and implement programmatic focus management by creating a ref for the modal close button and moving focus to it in a useEffect when selectedEvent opens and restoring focus on close; apply the same attribute+focus fixes consistently to the Job Details modal and the Flush Jobs modal (the components/blocks that render when selectedJob or flushJobsModal/open state is true) so all three modals announce to screen readers and receive keyboard focus correctly.

extras/asr-services/providers/vibevoice/transcriber.py (3)
38-45: Third-party import `omegaconf` is placed inside a local-import block

Line 40 (`from omegaconf import OmegaConf`) is sandwiched between two `from common.*` local imports. It should sit with the other third-party imports (`import torch`), following the conventional stdlib → third-party → local grouping.

♻️ Proposed fix

```diff
 import torch
 from common.audio_utils import STANDARD_SAMPLE_RATE, load_audio_file
-from omegaconf import OmegaConf
-from common.batching import (
+from omegaconf import OmegaConf  # move above common.* imports
+# (place alongside `import torch`)
+
+from common.batching import (
```

Move `from omegaconf import OmegaConf` immediately after `import torch`.
Verify each finding against the current code and only fix it if needed. In `@extras/asr-services/providers/vibevoice/transcriber.py` around lines 38 - 45, The third-party import OmegaConf is incorrectly placed among local imports; move the line "from omegaconf import OmegaConf" so it sits with other third-party imports (e.g., immediately after "import torch") to follow stdlib → third-party → local grouping and keep imports organized.
488-531: `_transcribe_batched_with_progress` duplicates `_transcribe_batched` entirely

The two methods share identical `split_audio_file` setup, the `batch_results` accumulation loop, `_transcribe_single` calls, temp-file cleanup, and `stitch_transcription_results` — differing only in the `yield` of progress events. Consider collapsing by having `_transcribe_batched` consume the progress generator:

♻️ Proposed refactor

```diff
 def _transcribe_batched(self, audio_file_path: str, hotwords: Optional[str] = None) -> TranscriptionResult:
-    windows = split_audio_file(...)
-    batch_results = []
-    for i, (temp_path, start_time, end_time) in enumerate(windows):
-        try:
-            ...
-            result = self._transcribe_single(temp_path, context_info=hotwords)
-            batch_results.append((result, start_time, end_time))
-            ...
-        finally:
-            os.unlink(temp_path)
-    return stitch_transcription_results(batch_results, overlap_seconds=self.batch_overlap)
+    final_result = None
+    for event in self._transcribe_batched_with_progress(audio_file_path, hotwords=hotwords):
+        if event["type"] == "result":
+            # _transcribe_batched_with_progress already yields the stitched result dict;
+            # reconstruct TranscriptionResult from it or return it directly.
+            ...
```

(exact reconstruction depends on `TranscriptionResult.from_dict` availability; if not present, expose a small `_stitch_from_windows` helper shared by both methods instead)
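A cleaner shape of that consolidation, assuming the generator's final event carries the already-stitched result as a plain dict (method and event names follow the snippets above; whether a dict return is acceptable, or needs converting back to `TranscriptionResult`, remains an open assumption):

```python
def _transcribe_batched(self, audio_file_path, hotwords=None):
    """Non-streaming wrapper: drain the progress generator, keep only the result."""
    final = None
    for event in self._transcribe_batched_with_progress(audio_file_path, hotwords=hotwords):
        if event["type"] == "result":
            final = event["data"]  # stitched result produced inside the generator
    return final
```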
Verify each finding against the current code and only fix it if needed. In `@extras/asr-services/providers/vibevoice/transcriber.py` around lines 488 - 531, _transcribe_batched_with_progress duplicates most of _transcribe_batched; collapse them by extracting the shared batching/transcription logic into a single generator or helper and have the other method reuse it. Concretely: move split_audio_file, the for-loop that calls _transcribe_single, temp-file cleanup and batch_results accumulation into a single generator function (or keep _transcribe_batched_with_progress as the generator) and have _transcribe_batched consume it to only return the final stitched result; alternatively add a small helper like _stitch_from_windows(batch_results, overlap_seconds) or rely on TranscriptionResult.from_dict if available and call stitch_transcription_results once — update _transcribe_batched and _transcribe_batched_with_progress to call the shared generator/helper and remove duplicated code.
40-45: Remove unused `extract_context_tail` import

The `extract_context_tail` import is unused in this file. Following the removal of inter-window context threading (as indicated by the NOTE at line 473), this import is no longer referenced in `_transcribe_batched` or any other method in the file.
Verify each finding against the current code and only fix it if needed. In `@extras/asr-services/providers/vibevoice/transcriber.py` around lines 40 - 45, The import extract_context_tail from common.batching is unused; remove it from the import list at the top of the file so only split_audio_file and stitch_transcription_results (and any other used symbols) are imported, and verify there are no remaining references (e.g., in _transcribe_batched) before committing the change.

extras/asr-services/common/base_service.py (1)
227-251: Streaming path is safe from event-loop blocking, but watch thread-pool capacity under load

Starlette's `StreamingResponse` correctly wraps non-async iterables with `iterate_in_threadpool`, dispatching each `next()` call via `anyio.to_thread.run_sync` — so the event loop is not blocked. However, the default thread pool has only 40 tokens, meaning only 40 threads can run concurrently across the entire Starlette/FastAPI process. Each in-flight batch window holds a thread for the duration of one GPU inference call (potentially minutes). Under modest parallelism this can exhaust the pool and queue other async operations.

If concurrent streaming transcriptions are expected, consider raising the token limit at startup or enforcing a streaming-request concurrency cap.
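A minimal sketch of both knobs (the limiter call is real AnyIO API; `MAX_CONCURRENT_STREAMS` and the wrapper name are illustrative, not from this codebase):

```python
import threading

import anyio.to_thread


# Option A: raise AnyIO's shared thread-pool token limit (default 40).
# Must run inside the event loop, e.g. from a FastAPI startup handler.
async def raise_thread_pool_limit() -> None:
    limiter = anyio.to_thread.current_default_thread_limiter()
    limiter.total_tokens = 100  # illustrative value; size to expected load


# Option B: cap concurrent streaming transcriptions with a semaphore.
MAX_CONCURRENT_STREAMS = 4
_stream_slots = threading.BoundedSemaphore(MAX_CONCURRENT_STREAMS)


def guarded_ndjson_generator(inner):
    """Acquire a slot eagerly, then return a generator that releases it."""
    if not _stream_slots.acquire(blocking=False):
        raise RuntimeError("too many concurrent transcriptions")  # map to HTTP 503 before streaming

    def _gen():
        try:
            yield from inner
        finally:
            _stream_slots.release()

    return _gen()
```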
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@extras/asr-services/common/base_service.py` around lines 227 - 251, StreamingResponse wrapping the synchronous _ndjson_generator (which iterates service.transcribe_with_progress over tmp_filename) can exhaust AnyIO's default thread-pool tokens (40) under concurrent long-running GPU inferences; update startup or request handling to prevent starvation by either (A) increasing AnyIO thread pool token limit at application startup (configure anyio.to_thread.MAX_WORKERS / AnyIO thread settings used by Starlette) or (B) enforce a concurrency cap for this streaming path (e.g., semaphore or limiter around the branch that sets streaming_response and calls service.transcribe_with_progress) so only a controlled number of _ndjson_generator instances run concurrently and tmp_filename cleanup still runs in finally.
```tsx
{/* transcribe_full_audio_job batch progress */}
{job.job_type === 'transcribe_full_audio_job' && job.status === 'started' && job.meta?.batch_progress && (
  <div className="mt-1">
    <div className="flex items-center justify-between text-xs mb-1">
      <span className="text-blue-700">{job.meta.batch_progress.message}</span>
      <span className="text-blue-600 font-medium">{job.meta.batch_progress.percent}%</span>
    </div>
    <div className="w-full bg-blue-200 rounded-full h-1.5">
      <div className="bg-blue-600 h-1.5 rounded-full transition-all duration-300" style={{ width: `${job.meta.batch_progress.percent}%` }} />
    </div>
  </div>
)}
```
Clamp batch_progress.percent before using it in the inline width style.
`job.meta.batch_progress.percent` is used directly in `` style={{ width: `${percent}%` }} `` without range validation. A backend value outside `[0, 100]` would produce an overflowing or invisible bar.
🛡️ Proposed fix
```diff
-<div className="bg-blue-600 h-1.5 rounded-full transition-all duration-300" style={{ width: `${job.meta.batch_progress.percent}%` }} />
+<div className="bg-blue-600 h-1.5 rounded-full transition-all duration-300" style={{ width: `${Math.min(100, Math.max(0, job.meta.batch_progress.percent))}%` }} />
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```tsx
{/* transcribe_full_audio_job batch progress */}
{job.job_type === 'transcribe_full_audio_job' && job.status === 'started' && job.meta?.batch_progress && (
  <div className="mt-1">
    <div className="flex items-center justify-between text-xs mb-1">
      <span className="text-blue-700">{job.meta.batch_progress.message}</span>
      <span className="text-blue-600 font-medium">{job.meta.batch_progress.percent}%</span>
    </div>
    <div className="w-full bg-blue-200 rounded-full h-1.5">
      <div className="bg-blue-600 h-1.5 rounded-full transition-all duration-300" style={{ width: `${Math.min(100, Math.max(0, job.meta.batch_progress.percent))}%` }} />
    </div>
  </div>
)}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backends/advanced/webui/src/pages/Queue.tsx` around lines 1404 - 1415, Clamp
the batch progress percent before using it for the width style: compute a
sanitized value (e.g., let pct = job.meta?.batch_progress?.percent ?? 0; pct =
Math.max(0, Math.min(100, pct))) and use that sanitized pct in the inline style
and displayed percent text; update the rendering logic around job.job_type ===
'transcribe_full_audio_job' && job.status === 'started' where
job.meta.batch_progress is referenced so out-of-range or missing values won't
break the progress bar.
```tsx
title={`Started: ${new Date(startTime).toLocaleTimeString()}\nDuration: ${formatDuration(duration)}${job.meta?.batch_progress ? `\n${job.meta.batch_progress.message}` : ''}`}
>
  <span className="text-xs text-white font-medium px-2 truncate">
    {formatDuration(duration)}
    {job.status === 'started' && job.meta?.batch_progress
      ? `${job.meta.batch_progress.current}/${job.meta.batch_progress.total}`
      : formatDuration(duration)}
```
Batch-progress bar content is dead code here — it belongs in the active-conversations timeline.
The completed-conversations section filters conversations where all jobs have status === 'finished' | 'failed' (line 1558). As a result, the condition job.status === 'started' is structurally impossible inside this block, so the ternary on lines 1851–1853 always evaluates to formatDuration(duration).
The active-conversations timeline (lines 1310–1320) is the correct placement because transcribe_full_audio_job can be started there, but that section was not updated with these changes.
🐛 Proposed fix — move changes to the active-conversations timeline
In the active-conversations timeline bar (around line 1316), apply the same two changes:
```diff
- title={`Started: ${new Date(startTime).toLocaleTimeString()}\nDuration: ${formatDuration(duration)}`}
+ title={`Started: ${new Date(startTime).toLocaleTimeString()}\nDuration: ${formatDuration(duration)}${job.meta?.batch_progress ? `\n${job.meta.batch_progress.message}` : ''}`}
```

```diff
- {formatDuration(duration)}
+ {job.status === 'started' && job.meta?.batch_progress
+   ? `${job.meta.batch_progress.current ?? '?'}/${job.meta.batch_progress.total ?? '?'}`
+   : formatDuration(duration)}
```

And revert the completed-conversations timeline bar (lines 1848–1853) to the original (no batch_progress logic), unless the tooltip hint for historical context is intentional.
Additionally, job.meta.batch_progress.current and .total are accessed without individual null-guards (only batch_progress itself is guarded). If either field is missing during initialization, the bar would render undefined/undefined. The fix above uses ?? '?' to handle that.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backends/advanced/webui/src/pages/Queue.tsx` around lines 1848 - 1853, The
completed-conversations timeline contains unreachable batch-progress rendering:
move the batch_progress tooltip and the conditional display logic that
references job.meta.batch_progress and job.meta.batch_progress.current/total
from the completed-conversations timeline into the active-conversations timeline
where job.status can be 'started' (apply the same two changes you introduced),
and revert the completed-conversations timeline's span to the original
duration-only display; also guard the numeric fields by using
job.meta?.batch_progress?.current ?? '?' and job.meta?.batch_progress?.total ??
'?' so the UI shows '?' instead of undefined when current/total are missing.
```tsx
{evt.plugins_executed.length > 0 && (
  <button
    onClick={() => setSelectedEvent(evt)}
    className="text-gray-400 hover:text-gray-600 p-0.5 rounded hover:bg-gray-100"
    title="View details"
  >
    <Eye className="w-3.5 h-3.5" />
  </button>
)}
```
Eye button is inaccessible for events that carry metadata but no plugin results.
The button is guarded by evt.plugins_executed.length > 0, so events such as button.single_press (which produce no plugin executions but may have useful metadata) cannot be inspected in the detail modal.
✨ Proposed fix — show Eye button whenever there is anything to inspect
```diff
-{evt.plugins_executed.length > 0 && (
+{(evt.plugins_executed.length > 0 || Object.keys(evt.metadata).length > 0) && (
   <button
     onClick={() => setSelectedEvent(evt)}
     className="text-gray-400 hover:text-gray-600 p-0.5 rounded hover:bg-gray-100"
     title="View details"
   >
     <Eye className="w-3.5 h-3.5" />
   </button>
 )}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```tsx
{(evt.plugins_executed.length > 0 || Object.keys(evt.metadata).length > 0) && (
  <button
    onClick={() => setSelectedEvent(evt)}
    className="text-gray-400 hover:text-gray-600 p-0.5 rounded hover:bg-gray-100"
    title="View details"
  >
    <Eye className="w-3.5 h-3.5" />
  </button>
)}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backends/advanced/webui/src/pages/Queue.tsx` around lines 2197 - 2205, The
Eye button is currently only shown when evt.plugins_executed has entries,
preventing inspection of events with metadata but no plugin results; update the
JSX conditional that renders the Eye button (the block that calls
setSelectedEvent(evt) and renders the Eye icon) to also display when the event
has inspectable metadata—i.e., render the button when
evt.plugins_executed.length > 0 OR when evt.metadata exists and has at least one
key (check Object.keys(evt.metadata).length > 0) so users can open the detail
modal for metadata-only events like button.single_press.
```tsx
<div>
  <label className="block text-sm font-medium text-gray-700">User</label>
  <p className="text-sm text-gray-900 font-mono">{selectedEvent.user_id}</p>
</div>
```
user_id shown in full — inconsistent with table truncation and potential PII exposure.
The events table truncates user_id to the last 8 characters for values longer than 12 characters (line 2172), suggesting the field may contain UUIDs, emails, or other identifiers. The modal renders the full value. If user_id can be a user email or username, this unnecessarily exposes PII to every viewer of the dashboard.
Apply the same truncation (or at minimum the same display logic) as the table:
```diff
-<p className="text-sm text-gray-900 font-mono">{selectedEvent.user_id}</p>
+<p className="text-sm text-gray-900 font-mono" title={selectedEvent.user_id}>
+  {selectedEvent.user_id.length > 12 ? `...${selectedEvent.user_id.slice(-8)}` : selectedEvent.user_id}
+</p>
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```tsx
<div>
  <label className="block text-sm font-medium text-gray-700">User</label>
  <p className="text-sm text-gray-900 font-mono" title={selectedEvent.user_id}>
    {selectedEvent.user_id.length > 12 ? `...${selectedEvent.user_id.slice(-8)}` : selectedEvent.user_id}
  </p>
</div>
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backends/advanced/webui/src/pages/Queue.tsx` around lines 2686 - 2689, The
modal currently shows selectedEvent.user_id in full, exposing potential PII;
update the modal to apply the same truncation/display rule used by the events
table (truncate to show only the last 8 characters when the value length > 12).
Locate the modal rendering of selectedEvent.user_id in Queue.tsx and either call
the existing truncation/helper function used by the table or extract that logic
into a shared utility (e.g., formatUserId or truncateIdentifier) and use it when
rendering the <p> so the modal and table display are consistent.
```python
def _get_audio_duration(file_path: str) -> Optional[float]:
    """Return audio duration in seconds, or None if unreadable."""
    try:
        with wave.open(file_path, "rb") as wf:
            return wf.getnframes() / wf.getframerate()
    except Exception:
        return None
```
_get_audio_duration is WAV-only; non-WAV long audio silently bypasses streaming progress
wave.open() raises on anything other than a PCM WAV file (MP3, FLAC, M4A, OGG all fail), causing _get_audio_duration to return None. When None is returned the if audio_duration and ... gate at Line 229 is never entered, so long non-WAV uploads fall straight to the non-streaming path with no progress reporting — even though split_audio_file / load_audio_file in batching.py can load those formats.
Consider using soundfile (already a top-level dependency) for a format-agnostic duration read:
🛠️ Proposed fix
```diff
-def _get_audio_duration(file_path: str) -> Optional[float]:
-    """Return audio duration in seconds, or None if unreadable."""
-    try:
-        with wave.open(file_path, "rb") as wf:
-            return wf.getnframes() / wf.getframerate()
-    except Exception:
-        return None
+def _get_audio_duration(file_path: str) -> Optional[float]:
+    """Return audio duration in seconds, or None if unreadable."""
+    try:
+        import soundfile as sf
+
+        info = sf.info(file_path)
+        return info.duration
+    except Exception:
+        return None
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def _get_audio_duration(file_path: str) -> Optional[float]:
    """Return audio duration in seconds, or None if unreadable."""
    try:
        import soundfile as sf

        info = sf.info(file_path)
        return info.duration
    except Exception:
        return None
```
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 139-139: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extras/asr-services/common/base_service.py` around lines 134 - 140, The
helper _get_audio_duration currently uses wave.open and fails on non-WAV files;
replace its implementation to use the soundfile API (e.g., soundfile.info or
soundfile.SoundFile) to read duration in a format-agnostic way: open the file
with soundfile, compute duration as frames / samplerate (or use info.duration if
available), and keep the same behavior of returning None on error; update any
imports to import soundfile as sf and ensure _get_audio_duration continues to
return Optional[float] to preserve the gating logic that decides streaming
progress.
```python
def transcribe_with_progress(self, audio_file_path: str, context_info=None):
    """Yield progress counters then final result for long audio.

    Delegates to the transcriber's _transcribe_batched_with_progress() generator.
    Runs synchronously (called via run_in_executor by the endpoint).
    """
    if self.transcriber is None:
        raise RuntimeError("Service not initialized")
    yield from self.transcriber._transcribe_batched_with_progress(
        audio_file_path, hotwords=context_info,
    )
```
RuntimeError guard in generator body fires lazily, not at call time
Because yield from makes transcribe_with_progress a generator function, the body (including the if self.transcriber is None check on Line 117) does not execute when the method is called — it executes only when the returned generator is first iterated. Any caller that invokes transcribe_with_progress(...) directly (bypassing supports_batch_progress) would get a generator object back without error, and the RuntimeError would surface only once Starlette starts streaming, after HTTP 200 headers have already been sent.
The current flow is safe because base_service.py only enters this path when supports_batch_progress returns True (which already implies transcriber is not None), but the behavior is worth documenting in the docstring.
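The lazy behavior is a generic property of generator functions, easy to demonstrate in isolation (standalone sketch, not repo code), along with the eager-check wrapper pattern that avoids it:

```python
def stream(resource):
    if resource is None:
        raise RuntimeError("not initialized")  # runs only on first next()
    yield from resource


gen = stream(None)  # no error here: the body has not executed yet
# next(gen)         # RuntimeError surfaces only now


def stream_eager(resource):
    """Validate at call time, then delegate to an inner generator."""
    if resource is None:
        raise RuntimeError("not initialized")  # fires immediately

    def _gen():
        yield from resource

    return _gen()
```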
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 118-118: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extras/asr-services/providers/vibevoice/service.py` around lines 111 - 121,
The RuntimeError check in transcribe_with_progress is executed lazily because
the method is a generator; update the transcribe_with_progress docstring to
explicitly state that the self.transcriber None check (and any errors) will only
occur when the returned generator is first iterated, and advise callers to
verify supports_batch_progress or ensure transcriber is initialized before
calling; mention the transcribe_with_progress method name and the
self.transcriber attribute so maintainers can find the relevant code.
```python
if not defaults_path.exists() and not config_path.exists():
    logger.info("No config files found in %s, using env/defaults", config_dir)
    return {}

defaults = OmegaConf.load(defaults_path) if defaults_path.exists() else {}
user_config = OmegaConf.load(config_path) if config_path.exists() else {}
merged = OmegaConf.merge(defaults, user_config)

asr_config = merged.get("asr_services", {}).get("vibevoice", {})
resolved = OmegaConf.to_container(asr_config, resolve=True)
logger.info(f"Loaded vibevoice config: {resolved}")
return resolved
```
🧩 Analysis chain
🏁 Script executed:
cd /tmp && find . -name "transcriber.py" -type f 2>/dev/null | head -5

Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
git ls-files | grep -E "vibevoice.*transcriber\.py"Repository: SimpleOpenSoftware/chronicle
Length of output: 126
🏁 Script executed:
rg -n "extract_context_tail" --type pyRepository: SimpleOpenSoftware/chronicle
Length of output: 974
🏁 Script executed:
rg -n "load_vibevoice_config" --type py -A 3Repository: SimpleOpenSoftware/chronicle
Length of output: 900
🏁 Script executed:
# Check the actual file to understand the structure
git ls-files | grep -E "asr.*vibevoice"Repository: SimpleOpenSoftware/chronicle
Length of output: 323
🏁 Script executed:
cat -n extras/asr-services/providers/vibevoice/transcriber.py | sed -n '35,130p'

Repository: SimpleOpenSoftware/chronicle
Length of output: 4668
🏁 Script executed:
# Check OmegaConf version and verify to_container behavior
rg "omegaconf" --type py -i | grep -E "import|version"Repository: SimpleOpenSoftware/chronicle
Length of output: 717
🏁 Script executed:
# Check for any error handling around load_vibevoice_config call
cat -n extras/asr-services/providers/vibevoice/transcriber.py | sed -n '115,135p'

Repository: SimpleOpenSoftware/chronicle
Length of output: 1089
🏁 Script executed:
# Check requirements or setup files for OmegaConf version
find . -name "requirements*.txt" -o -name "setup.py" -o -name "pyproject.toml" | xargs grep -l omegaconf 2>/dev/nullRepository: SimpleOpenSoftware/chronicle
Length of output: 187
🏁 Script executed:
# Check if there's a requirements file and what version of OmegaConf is specified
cat extras/asr-services/requirements.txt 2>/dev/null || cat requirements.txt 2>/dev/null || find . -name "requirements.txt" -exec cat {} \; 2>/dev/null | grep -i omegaconf

Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Verify extract_context_tail is truly not used anywhere in transcriber.py
rg "extract_context_tail" extras/asr-services/providers/vibevoice/transcriber.pyRepository: SimpleOpenSoftware/chronicle
Length of output: 97
🏁 Script executed:
# Look at the file to confirm what was removed regarding prev_context
git log --oneline -n 20 -- extras/asr-services/providers/vibevoice/transcriber.py 2>/dev/null || echo "Git log not available"

Repository: SimpleOpenSoftware/chronicle
Length of output: 121
🏁 Script executed:
# Check the actual behavior of OmegaConf with a simple test
python3 << 'EOF'
try:
    from omegaconf import OmegaConf

    # Test 1: to_container on plain dict
    plain_dict = {}
    try:
        result = OmegaConf.to_container(plain_dict, resolve=True)
        print(f"✓ to_container on plain dict succeeded: {result}")
    except ValueError as e:
        print(f"✗ to_container on plain dict raised ValueError: {e}")

    # Test 2: to_container on DictConfig
    dict_config = OmegaConf.create({})
    result = OmegaConf.to_container(dict_config, resolve=True)
    print(f"✓ to_container on DictConfig succeeded: {result}")

    # Test 3: merged.get() behavior
    merged = OmegaConf.create({"foo": "bar"})
    plain_default = merged.get("missing_key", {})
    print(f"Type of merged.get('missing_key', {{}}) result: {type(plain_default).__name__}")
    try:
        OmegaConf.to_container(plain_default, resolve=True)
        print("✓ to_container on plain dict from .get() succeeded")
    except ValueError as e:
        print(f"✗ to_container on plain dict from .get() raised ValueError: {e}")
except ImportError:
    print("OmegaConf not installed")
EOF

Repository: SimpleOpenSoftware/chronicle
Length of output: 95
🏁 Script executed:
cat extras/asr-services/pyproject.toml | grep -A 5 -B 5 -i omegaconf

Repository: SimpleOpenSoftware/chronicle
Length of output: 269
🏁 Script executed:
# Search web for OmegaConf.to_container behavior on plain dicts
echo "Checking OmegaConf documentation for to_container behavior..."Repository: SimpleOpenSoftware/chronicle
Length of output: 133
🌐 Web query:
OmegaConf.to_container plain dict ValueError
💡 Result:
OmegaConf.to_container() only accepts OmegaConf containers (DictConfig / ListConfig). If you pass a plain Python dict, OmegaConf raises:
`ValueError: Input cfg is not an OmegaConf config object (...)` (omegaconf.readthedocs.io)
Fixes
1) If you already have a plain dict, don’t call to_container:
plain = my_dict  # already a dict

2) If you need to convert something into an OmegaConf config first:
from omegaconf import OmegaConf
cfg = OmegaConf.create(my_dict) # dict -> DictConfig
plain = OmegaConf.to_container(cfg)  # DictConfig -> dict

to_container() converts OmegaConf configs to primitive dict/list, optionally resolving interpolations. (omegaconf.readthedocs.io)
3) If the error happens because you thought you had a DictConfig but don’t, check:
from omegaconf import OmegaConf
print(type(cfg), OmegaConf.is_config(cfg))

If you paste the exact traceback and the type(cfg) you're calling it on, I can pinpoint which object in your code is a plain dict.
Citations:
- 1: https://omegaconf.readthedocs.io/en/2.3_branch/_modules/omegaconf/omegaconf.html?utm_source=openai
- 2: https://omegaconf.readthedocs.io/en/latest/usage.html?utm_source=openai
🏁 Script executed:
# Check if OmegaConf.merge preserves type when merging plain dict with DictConfig
python3 << 'EOF'
try:
    from omegaconf import OmegaConf

    # Case 1: Both plain dicts
    result = OmegaConf.merge({}, {})
    print(f"merge({{}}, {{}}) returns type: {type(result).__name__}")

    # Case 2: DictConfig and plain dict
    dc = OmegaConf.create({"foo": "bar"})
    result = OmegaConf.merge(dc, {})
    print(f"merge(DictConfig, {{}}) returns type: {type(result).__name__}")

    # Case 3: What merged.get() does
    result = OmegaConf.merge(dc, {})
    val = result.get("nonexistent", {})
    print(f"DictConfig.get('nonexistent', {{}}) returns type: {type(val).__name__}")

    # Case 4: Simulate the actual scenario
    defaults = {}  # File doesn't exist
    user_config = OmegaConf.load("/dev/null") if False else {}  # Simulate missing file
    merged = OmegaConf.merge(defaults, user_config)
    print(f"\nActual scenario (both files missing): merged type = {type(merged).__name__}")

    # Case 5: One file exists, one doesn't
    defaults = OmegaConf.create({"test": "value"})
    user_config = {}
    merged = OmegaConf.merge(defaults, user_config)
    print(f"Actual scenario (one file exists): merged type = {type(merged).__name__}")

    # Now the critical part
    asr_config = merged.get("asr_services", {})
    print(f"merged.get('asr_services', {{}}) with missing key returns type: {type(asr_config).__name__}")
    try:
        OmegaConf.to_container(asr_config, resolve=True)
        print("✓ to_container succeeded")
    except ValueError as e:
        print(f"✗ to_container FAILED: {e}")
except Exception as e:
    print(f"Error during test: {e}")
    import traceback
    traceback.print_exc()
EOF

Repository: SimpleOpenSoftware/chronicle
Length of output: 257
🏁 Script executed:
# Check if there are any try-except blocks around load_vibevoice_config() calls
rg -B 5 -A 10 "load_vibevoice_config()" extras/asr-services/providers/vibevoice/transcriber.py

Repository: SimpleOpenSoftware/chronicle
Length of output: 1318
🏁 Script executed:
# Check for _transcribe_batched functions to verify duplication claim
rg -n "_transcribe_batched" extras/asr-services/providers/vibevoice/transcriber.py | head -20Repository: SimpleOpenSoftware/chronicle
Length of output: 359
🏁 Script executed:
# Get the actual line counts and content of both methods to assess duplication
cat -n extras/asr-services/providers/vibevoice/transcriber.py | sed -n '400,550p'

Repository: SimpleOpenSoftware/chronicle
Length of output: 7012
OmegaConf.to_container raises ValueError on plain-dict asr_config
When merged lacks an asr_services key, line 69 produces:
```python
asr_config = {}.get("vibevoice", {})  # → plain dict {}
```

OmegaConf.to_container() requires a DictConfig or ListConfig — it raises `ValueError: Input cfg is not an OmegaConf config object` on plain dicts. This crashes VibeVoiceTranscriber.__init__ for deployments where config files exist but lack the asr_services.vibevoice section.
Use OmegaConf.select() with a default DictConfig instead:
Proposed fix
```diff
-    asr_config = merged.get("asr_services", {}).get("vibevoice", {})
-    resolved = OmegaConf.to_container(asr_config, resolve=True)
-    logger.info(f"Loaded vibevoice config: {resolved}")
-    return resolved
+    asr_config = OmegaConf.select(merged, "asr_services.vibevoice", default=OmegaConf.create({}))
+    if not OmegaConf.is_config(asr_config):
+        asr_config = OmegaConf.create({})
+    resolved = OmegaConf.to_container(asr_config, resolve=True)
+    logger.info("Loaded vibevoice config: %s", resolved)
+    return resolved
```

Minor: `extract_context_tail` imported at line 42 but not used in this file (removed with the prev_context threading).
Optional refactor: _transcribe_batched_with_progress (lines 488–531) duplicates _transcribe_batched (lines 440–486); consider extracting the core loop to reduce duplication.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@extras/asr-services/providers/vibevoice/transcriber.py` around lines 61 - 72,
The code in VibeVoiceTranscriber.__init__ assigns asr_config and then calls
OmegaConf.to_container which fails when asr_services.vibevoice is missing
because asr_config is a plain dict; replace the current lookup with
OmegaConf.select (e.g., OmegaConf.select(merged, "asr_services.vibevoice",
default=OmegaConf.create({}))) so to_container always gets a
DictConfig/ListConfig, then call OmegaConf.to_container on that result; also
remove the unused import extract_context_tail and (optionally) consolidate
duplicated logic between _transcribe_batched and
_transcribe_batched_with_progress into a shared helper to avoid repetition.
```markdown
*[Mobile App - Screenshot coming soon]*

### Desktop Menu Bar Client

![Desktop Menu Bar Client](.assets/menu-bar-client.png)
```
🧩 Analysis chain
🏁 Script executed:
fd --type f "menu-bar-client.png"

Repository: SimpleOpenSoftware/chronicle
Length of output: 54
The asset .assets/menu-bar-client.png does not exist in the repository. The image reference on line 27 is broken and will render as a broken image in the README. Either add the image file to the .assets/ directory or remove/update the reference.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` at line 27, The README contains a broken image reference: "![Desktop Menu Bar Client](.assets/menu-bar-client.png)"; fix it by either (a) adding the
missing image file named menu-bar-client.png into the .assets directory and
committing it, or (b) updating that markdown image reference to point to an
existing asset or removing the entire image markdown if no image is
needed—ensure the changed README uses the exact updated path or removes the line
so the image no longer renders broken.
Summary by CodeRabbit
Release Notes
New Features
Documentation
Chores