Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: fixed tts_server.py on macOS #418

Merged
merged 1 commit into from
Jan 24, 2025
Merged

Conversation

ErikBjare
Copy link
Owner

@ErikBjare ErikBjare commented Jan 24, 2025

Important

Centralized audio output device selection in tts.py and improved espeak library detection for macOS in tts_server.py.

  • Audio Output Device Selection:
    • Added get_output_device() in tts.py to centralize logic for selecting audio output devices.
    • Replaced inline device selection logic in audio_player_thread() and speak() with get_output_device().
  • macOS Compatibility:
    • In tts_server.py, set espeak library path explicitly for macOS using EspeakWrapper.set_library().
    • Replaced subprocess with shutil.which() for espeak detection in _check_espeak().
  • Misc:
    • Added comments for exception handling in tts.py when TTS extras are not installed.

This description was created by Ellipsis for ac3e2a0. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 4eb7d88 in 1 minute and 13 seconds

More details
  • Looked at 151 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. gptme/tools/tts.py:241
  • Draft comment:
    Using hostapi == 2 to identify CoreAudio is unreliable as hostapi indices can vary. Consider using sd.query_hostapis() to find the correct index for CoreAudio.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The code has multiple fallbacks - first trying system default, then CoreAudio devices, then any output device. Even if the hostapi index is wrong, the code will still work through the third fallback. The hostapi check is just an optimization to prefer CoreAudio devices when available. The risk of the index being wrong seems low compared to the complexity of adding hostapi querying logic.
    The comment raises a valid technical point - hardcoding hostapi indices is indeed not ideal. The code could be more robust by properly detecting CoreAudio.
    While technically correct, the current approach is pragmatic given the multiple fallbacks. The benefit of adding hostapi detection logic doesn't justify the added complexity.
    Delete the comment. While technically valid, the current code handles device selection robustly through fallbacks, making this optimization suggestion not important enough to warrant a change.

Workflow ID: wflow_8r08XAHdgRMvpQgt


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@codecov-commenter
Copy link

codecov-commenter commented Jan 24, 2025

Codecov Report

Attention: Patch coverage is 3.33333% with 29 lines in your changes missing coverage. Please review.

Project coverage is 69.66%. Comparing base (db212bd) to head (ac3e2a0).

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
gptme/tools/tts.py 3.33% 29 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #418      +/-   ##
==========================================
- Coverage   70.09%   69.66%   -0.43%     
==========================================
  Files          70       70              
  Lines        5849     5862      +13     
==========================================
- Hits         4100     4084      -16     
- Misses       1749     1778      +29     
Flag Coverage Δ
anthropic/claude-3-haiku-20240307 68.42% <3.33%> (-0.19%) ⬇️
openai/gpt-4o-mini 67.57% <3.33%> (-0.58%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ErikBjare ErikBjare force-pushed the dev/tts-server-macos-fixes branch from 4eb7d88 to e186316 Compare January 24, 2025 11:23
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on e186316 in 20 seconds

More details
  • Looked at 199 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. scripts/tts_server.py:84
  • Draft comment:
    The use of shutil.which to check for espeak and espeak-ng is a good improvement for reliability. This ensures the executables are checked in the system's PATH.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The use of shutil.which is a more reliable way to check for the presence of executables, as it checks the system's PATH for the executable. This change improves the robustness of the _check_espeak function.
2. gptme/tools/tts.py:207
  • Draft comment:
    The get_output_device function centralizes the logic for selecting an audio output device, reducing code duplication and improving maintainability. This is a good practice.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The get_output_device function is a good addition to centralize the logic for selecting an audio output device. This reduces code duplication and improves maintainability.
3. gptme/tools/tts.py:299
  • Draft comment:
    Ensure that the exception handling in audio_player_thread and speak allows the program to continue running even if no output device is found. This prevents the entire application from crashing due to a missing audio device.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The get_output_device function is used in multiple places, which is a good practice to avoid code duplication. However, the exception handling in audio_player_thread and speak should ensure that the program can continue running even if no device is found.
4. gptme/tools/tts.py:362
  • Draft comment:
    Using os.getenv to retrieve the GPTME_TTS_VOICE environment variable is a good practice for configuration flexibility, allowing users to set preferences without code changes.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The use of os.getenv to get the GPTME_TTS_VOICE environment variable is a good practice for configuration flexibility. This allows users to set their preferred voice without modifying the code.

Workflow ID: wflow_7XPfb8vo4Yr3DoWH


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@ErikBjare ErikBjare force-pushed the dev/tts-server-macos-fixes branch from e186316 to ac3e2a0 Compare January 24, 2025 11:26
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on ac3e2a0 in 15 seconds

More details
  • Looked at 190 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. scripts/tts_server.py:84
  • Draft comment:
    The use of shutil.which to check for espeak or espeak-ng is a more robust and efficient method than using subprocess.run. Good improvement!
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The use of shutil.which is a more robust way to check for the presence of executables, replacing the previous subprocess call. This is a good change.
2. gptme/tools/tts.py:207
  • Draft comment:
    The get_output_device function is a good refactoring to centralize the logic for selecting the audio output device. This improves code reuse and maintainability.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The PR introduces a new function get_output_device which is used in multiple places to determine the audio output device. This refactoring improves code reuse and maintainability.
3. scripts/tts_server.py:48
  • Draft comment:
    The macOS-specific workaround for setting the espeak library path ensures compatibility on macOS systems. This is a necessary change for macOS support.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The PR adds a specific workaround for macOS to set the espeak library path. This is necessary for compatibility and should be noted.

Workflow ID: wflow_dgdwctbJgeKyPq8T


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.


script_dir = Path(__file__).parent

# Add Kokoro-82M to Python path
kokoro_path = (script_dir / "Kokoro-82M").absolute()
sys.path.insert(0, str(kokoro_path))

# on macOS, use workaround for espeak detection
if sys.platform == "darwin":
_ESPEAK_LIBRARY = "/opt/homebrew/Cellar/espeak/1.48.04_1/lib/libespeak.1.1.48.dylib"
Copy link
Owner Author

@ErikBjare ErikBjare Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might need to search for this path more intelligently in the future as new espeak versions are released

@ErikBjare ErikBjare merged commit 477a81e into master Jan 24, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants