Sync with upstream #6

Open
AIWintermuteAI wants to merge 191 commits into AIWintermuteAI:main from collabora:main
Conversation

@AIWintermuteAI
Owner

No description provided.

fraic and others added 30 commits March 25, 2024 19:11
Improve cpu and gpu Dockerfiles, resulting in much smaller images
Add option: save network stream to local file while transcribing
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
fix: limit CPU usage for VAD onnxruntime inference session by setting OMP_NUM_THREADS
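The thread cap described in this commit can be sketched as follows. The assumption here is that `OMP_NUM_THREADS` must be exported before the inference library initializes its OpenMP pool; the helper name is illustrative, not WhisperLive's actual code:

```python
import os

# Assumption: OMP_NUM_THREADS must be set before the inference library
# spins up its OpenMP thread pool, so export it at the top of the entrypoint.
# (onnxruntime also exposes SessionOptions.intra_op_num_threads for the same goal.)
os.environ["OMP_NUM_THREADS"] = "1"

def vad_thread_cap() -> int:
    """Return the thread cap the VAD inference session will inherit."""
    return int(os.environ["OMP_NUM_THREADS"])
```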
Signed-off-by: makaveli10 <suryanvineet47@gmail.com>
Make writing audio frames optional
- Use a threadlock around the model in single model mode
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
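The "threadlock around the model in single model mode" change can be pictured with a minimal sketch; the class and method names are stand-ins, not WhisperLive's actual API:

```python
import threading

class SharedModelRunner:
    """Minimal sketch: in single-model mode all client threads share one
    model, so inference is guarded by a class-level lock."""

    _lock = threading.Lock()

    def __init__(self, model):
        self.model = model  # shared across clients in single-model mode

    def transcribe(self, audio):
        # Serialize access: only one thread runs the shared model at a time.
        with SharedModelRunner._lock:
            return self.model(audio)
```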
Expose the SRT file location of the transcription client
fix spelling of detection in README.md.
makaveli10 and others added 30 commits July 22, 2025 15:47
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
The help text for `--max_connection_time` is incorrect. Looks like a copy-paste mistake from `--cache_path`.
fix(run_server.py): help text for max_connection_time argument
Previously, the server only accepted local file paths for custom Faster Whisper
models. This change allows passing HuggingFace repo IDs which are automatically
downloaded and converted to CTranslate2 format by the backend if not already in
CTranslate2 format.

Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
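The path-versus-repo-ID behavior described above can be sketched roughly like this; the function is illustrative, and the backend's real downloader/converter differs:

```python
import os

def resolve_custom_model(path_or_repo: str) -> str:
    """Sketch of the described behavior (not the backend's real function):
    an existing local directory is used directly; anything else is treated
    as a HuggingFace repo ID to download and convert to CTranslate2."""
    if os.path.isdir(path_or_repo):
        return "local"
    # Hypothetical hand-off point: download the repo, convert if it is not
    # already in CTranslate2 format.
    return "huggingface"
```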
…tom-model-loading

Feat: support HuggingFace model IDs for faster_whisper_custom_model_path.
Specify command to run client script.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
README.md: add instructions for running client
Add `--enable-timestamps` option to `run_client.py`
script to print out transcribed text with timestamps.

Sample output with translation enabled:
```
[0.000 -> 7.440]  And so, my fellow Americans, ask not what your country can do for you.
[7.440 -> 10.300]  Ask what you can do for your country.

TRANSLATION to fr:
[0.000 -> 7.440] Et donc, mes camarades américains, ne demandez pas ce que votre pays peut faire pour vous.
[7.440 -> 10.300] Demandez ce que vous pouvez faire pour votre pays.
```
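The bracketed timestamps in the sample above can be produced with a one-line formatter; this is a sketch, and the client's actual helper may differ:

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one segment as '[start -> end] text', matching the sample output."""
    return f"[{start:.3f} -> {end:.3f}] {text.strip()}"
```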

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Enable timestamps for transcribed text
feat: update to support faster whisper 1.2.0
Resolves pkg_resources missing during wheel build

Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
Bump openai-whisper version to 20250625.
Replace hardcoded [-4:] truncation with a configurable display_segments
parameter (default: 4) in both Client and TranscriptionClient classes.

Fixes #377
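The replacement of the hardcoded `[-4:]` slice can be sketched in one function; the name is illustrative rather than the actual client method:

```python
def visible_segments(segments: list, display_segments: int = 4) -> list:
    """Sketch: configurable tail instead of the hardcoded segments[-4:].
    The default of 4 preserves the previous behavior."""
    return segments[-display_segments:]
```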
Add cross-client GPU batch inference for faster_whisper backend
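Cross-client batching can be pictured roughly like this; it is purely illustrative, as WhisperLive's scheduler and tensor handling are more involved:

```python
def collect_batch(pending: dict):
    """Gather each waiting client's audio chunk into one batch so a single
    GPU forward pass can serve all clients at once."""
    client_ids = list(pending.keys())
    batch = [pending[cid] for cid in client_ids]
    return client_ids, batch

def scatter_results(client_ids, results):
    """Route each batched result back to the client that submitted it."""
    return dict(zip(client_ids, results))
```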
When VAD removes all speech from an audio chunk, transcriber.transcribe() returns (None, info). Calling list(None) raises TypeError. The _process_multi path already handles this case; this aligns _process_single to match.
Fix NoneType crash in _process_single when VAD filters all audio
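The guard this fix adds can be sketched as follows; the function name is illustrative, but the `(None, info)` shape matches the commit description:

```python
def safe_segments(result):
    """Guard sketch matching the fix: transcribe() can return (None, info)
    when VAD strips all speech, and list(None) would raise TypeError."""
    segments, info = result
    return ([] if segments is None else list(segments)), info
```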
feat: make display_segments configurable in Client/TranscriptionClient
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
Expose __version__ in package root and update dependencies in setup.py
Fix crash when no --files provided; use microphone input instead
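The fallback this fix introduces can be sketched in one line; the function and the `"microphone"` sentinel are illustrative, not the client's real code:

```python
def choose_source(files):
    """Sketch of the fix: with no --files given, fall back to the microphone
    instead of crashing on an empty argument list."""
    return files[0] if files else "microphone"
```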