Sync with upstream #6

Open
AIWintermuteAI wants to merge 191 commits into AIWintermuteAI:main from collabora:main
Conversation

@AIWintermuteAI
Owner

No description provided.

fraic and others added 30 commits March 25, 2024 19:11
Improve cpu and gpu Dockerfiles, resulting in much smaller images
Add option: save network stream to local file while transcribing
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
fix: limit CPU usage for VAD onnxruntime inference session by setting OMP_NUM_THREADS
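The thread cap described in this commit can be sketched as follows. The assumption here is that `OMP_NUM_THREADS` must be exported before the inference library initializes its OpenMP pool; the helper name is illustrative, not WhisperLive's actual code:

```python
import os

# Assumption: OMP_NUM_THREADS must be set before the inference library
# spins up its OpenMP thread pool, so export it at the top of the entrypoint.
# (onnxruntime also exposes SessionOptions.intra_op_num_threads for the same goal.)
os.environ["OMP_NUM_THREADS"] = "1"

def vad_thread_cap() -> int:
    """Return the thread cap the VAD inference session will inherit."""
    return int(os.environ["OMP_NUM_THREADS"])
```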
Signed-off-by: makaveli10 <suryanvineet47@gmail.com>
Make writing audio frames optional
- Use a threadlock around the model in single model mode
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
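The "threadlock around the model in single model mode" change can be pictured with a minimal sketch; the class and method names are stand-ins, not WhisperLive's actual API:

```python
import threading

class SharedModelRunner:
    """Minimal sketch: in single-model mode all client threads share one
    model, so inference is guarded by a class-level lock."""

    _lock = threading.Lock()

    def __init__(self, model):
        self.model = model  # shared across clients in single-model mode

    def transcribe(self, audio):
        # Serialize access: only one thread runs the shared model at a time.
        with SharedModelRunner._lock:
            return self.model(audio)
```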
Expose the SRT file location of the transcription client
fix spelling of detection in README.md.
makaveli10 and others added 30 commits July 22, 2025 15:47
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
The help text for `--max_connection_time` is incorrect. Looks like a copy-paste mistake from `--cache_path`.
fix(run_server.py): help text for max_connection_time argument
Previously, the server only accepted local file paths for custom Faster Whisper
models. This change allows passing HuggingFace repo IDs which are automatically
downloaded and converted to CTranslate2 format by the backend if not already in
CTranslate2 format.

Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
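The path-versus-repo-ID behavior described above can be sketched roughly like this; the function is illustrative, and the backend's real downloader/converter differs:

```python
import os

def resolve_custom_model(path_or_repo: str) -> str:
    """Sketch of the described behavior (not the backend's real function):
    an existing local directory is used directly; anything else is treated
    as a HuggingFace repo ID to download and convert to CTranslate2."""
    if os.path.isdir(path_or_repo):
        return "local"
    # Hypothetical hand-off point: download the repo, convert if it is not
    # already in CTranslate2 format.
    return "huggingface"
```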
…tom-model-loading

Feat: support HuggingFace model IDs for faster_whisper_custom_model_path.
Specify command to run client script.

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
README.md: add instructions for running client
Add `--enable-timestamps` option to `run_client.py`
script to print out transcribed text with timestamps.

Sample output with translation enabled:
```
[0.000 -> 7.440]  And so, my fellow Americans, ask not what your country can do for you.
[7.440 -> 10.300]  Ask what you can do for your country.

TRANSLATION to fr:
[0.000 -> 7.440] Et donc, mes camarades américains, ne demandez pas ce que votre pays peut faire pour vous.
[7.440 -> 10.300] Demandez ce que vous pouvez faire pour votre pays.
```
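The bracketed timestamps in the sample above can be produced with a one-line formatter; this is a sketch, and the client's actual helper may differ:

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one segment as '[start -> end] text', matching the sample output."""
    return f"[{start:.3f} -> {end:.3f}] {text.strip()}"
```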

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Enable timestamps for transcribed text
feat: update to support faster whisper 1.2.0
Resolves pkg_resources missing during wheel build

Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
Bump openai-whisper version to 20250625.
Replace hardcoded [-4:] truncation with a configurable display_segments
parameter (default: 4) in both Client and TranscriptionClient classes.

Fixes #377
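The replacement of the hardcoded `[-4:]` slice can be sketched in one function; the name is illustrative rather than the actual client method:

```python
def visible_segments(segments: list, display_segments: int = 4) -> list:
    """Sketch: configurable tail instead of the hardcoded segments[-4:].
    The default of 4 preserves the previous behavior."""
    return segments[-display_segments:]
```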
Add cross-client GPU batch inference for faster_whisper backend
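Cross-client batching can be pictured roughly like this; it is purely illustrative, as WhisperLive's scheduler and tensor handling are more involved:

```python
def collect_batch(pending: dict):
    """Gather each waiting client's audio chunk into one batch so a single
    GPU forward pass can serve all clients at once."""
    client_ids = list(pending.keys())
    batch = [pending[cid] for cid in client_ids]
    return client_ids, batch

def scatter_results(client_ids, results):
    """Route each batched result back to the client that submitted it."""
    return dict(zip(client_ids, results))
```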
When VAD removes all speech from an audio chunk, transcriber.transcribe() returns (None, info). Calling list(None) raises TypeError. The _process_multi path already handles this case; this aligns _process_single to match.
Fix NoneType crash in _process_single when VAD filters all audio
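The guard this fix adds can be sketched as follows; the function name is illustrative, but the `(None, info)` shape matches the commit description:

```python
def safe_segments(result):
    """Guard sketch matching the fix: transcribe() can return (None, info)
    when VAD strips all speech, and list(None) would raise TypeError."""
    segments, info = result
    return ([] if segments is None else list(segments)), info
```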
feat: make display_segments configurable in Client/TranscriptionClient
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
Expose __version__ in package root and update dependencies in setup.py
Fix crash when no --files provided; use microphone input instead
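The fallback this fix introduces can be sketched in one line; the function and the `"microphone"` sentinel are illustrative, not the client's real code:

```python
def choose_source(files):
    """Sketch of the fix: with no --files given, fall back to the microphone
    instead of crashing on an empty argument list."""
    return files[0] if files else "microphone"
```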