
feat: add voice dictation via whisper.cpp #28

Closed
gapmiss wants to merge 1 commit into my-claude-utils:main from gapmiss:feat/voice-dictation

Conversation


@gapmiss gapmiss commented Mar 18, 2026

Summary

  • Hold-to-talk mic button on the context strip — records audio on the phone, transcribes locally via whisper.cpp, injects text into the terminal as stdin
  • Fully local — no cloud APIs, audio never leaves the machine
  • Optional feature — only activates when WHISPER_MODEL env var is set; zero impact otherwise

Changes

Server (packages/agent):

  • POST /api/transcribe endpoint — JWT-authenticated, accepts multipart audio via multer, converts to WAV via ffmpeg, runs whisper-cli, returns transcribed text
  • config.ts — added WHISPER_CPP_PATH and WHISPER_MODEL env vars
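The route implementation isn't included in this excerpt, but the described pipeline (multipart upload → 16 kHz WAV via ffmpeg → whisper-cli → text) can be sketched as pure helpers. The function and field names below are illustrative assumptions, not the PR's actual code:

```typescript
// Hypothetical sketch of the /api/transcribe pipeline described above.
// Only the flow (upload -> ffmpeg -> whisper-cli) comes from the PR;
// names and shapes here are illustrative.

interface WhisperConfig {
  whisperCliPath: string; // derived from WHISPER_CPP_PATH
  modelPath: string;      // derived from WHISPER_MODEL
}

// ffmpeg argument list: downmix to mono, resample to the 16 kHz WAV
// input that whisper.cpp expects.
function ffmpegArgs(input: string, output: string): string[] {
  return ["-i", input, "-ar", "16000", "-ac", "1", "-f", "wav", output];
}

// Full whisper-cli command: model path, input WAV, timestamps suppressed
// so stdout is plain transcribed text.
function whisperCommand(cfg: WhisperConfig, wavPath: string): string[] {
  return [cfg.whisperCliPath, "-m", cfg.modelPath, "-f", wavPath, "--no-timestamps"];
}
```

On the server, these argument arrays would be handed to something like `child_process.execFile` after multer has written the upload to a temp file.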

Client (packages/web):

  • useDictation hook — MediaRecorder with webm/opus + mp4/aac iOS fallback
  • Mic button in ContextStrip with hold-to-talk, recording/processing toast indicators, haptic feedback
  • Wired into TerminalView to send transcribed text as stdin
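The hook's source isn't shown here, but the container fallback the useDictation bullet describes can be sketched as a small pure function. The candidate list and helper name are assumptions for illustration:

```typescript
// Hypothetical sketch of the recording-format fallback: prefer webm/opus,
// fall back to mp4/aac where webm recording is unsupported (iOS Safari).
const MIME_CANDIDATES = ["audio/webm;codecs=opus", "audio/mp4"];

// `isSupported` stands in for MediaRecorder.isTypeSupported so the
// selection logic can be exercised outside a browser.
function pickRecordingMime(isSupported: (t: string) => boolean): string | undefined {
  return MIME_CANDIDATES.find(isSupported);
}
```

In the browser this would be called as `pickRecordingMime(t => MediaRecorder.isTypeSupported(t))` and the result passed as `{ mimeType }` to the `MediaRecorder` constructor.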

Other:

  • Added .ngrok-free.app to Vite allowedHosts (was missing, only had .ngrok-free.dev)
  • README updated with voice dictation setup instructions and configuration
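The allowedHosts change would look roughly like this in vite.config.ts; only the `.ngrok-free.app` entry is new per the PR, and the surrounding config is assumed context:

```typescript
// Sketch of the Vite dev-server allowedHosts fix (assumed config shape).
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    // A leading dot allows the host and all of its subdomains.
    allowedHosts: [".ngrok-free.dev", ".ngrok-free.app"],
  },
});
```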

Dependencies

  • multer (new server dependency for multipart upload)
  • Requires ffmpeg and whisper-cpp on the host (both available via brew)

Test plan

  • Tap mic button to start → "Recording..." toast appears, mic access granted
  • Tap mic button to stop → "Transcribing..." toast, then text appears in terminal
  • Tested on iOS Safari (PWA) and desktop Chrome
  • Verified no impact when WHISPER_MODEL is not set (mic button still shows, server returns 503 gracefully)
  • Typecheck passes on both packages
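The graceful-degradation behavior verified above (503 when the feature is unconfigured) can be sketched as a tiny guard; the function name and shape are illustrative, not the PR's code:

```typescript
// Hypothetical sketch of the availability check the test plan verifies:
// with WHISPER_MODEL unset, the endpoint reports 503 Service Unavailable
// ("feature not configured") rather than crashing.
function transcribeAvailability(env: Record<string, string | undefined>): number {
  return env.WHISPER_MODEL ? 200 : 503;
}
```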

🤖 Generated with Claude Code

Hold-to-talk mic button on the context strip records audio on the phone,
sends it to the agent server, and transcribes it locally using whisper.cpp.
Transcribed text is injected into the terminal as stdin. Everything stays
local — no cloud APIs.

Server:
- POST /api/transcribe endpoint (JWT-authenticated, multipart audio via multer)
- Converts audio to 16kHz WAV via ffmpeg, runs whisper-cli, returns text
- Configurable via WHISPER_CPP_PATH and WHISPER_MODEL env vars

Client:
- useDictation hook (MediaRecorder with webm/opus + mp4/aac fallback)
- Mic button with hold-to-talk, recording/processing toast indicators
- Haptic feedback on recording start

Also fixes .ngrok-free.app missing from Vite allowedHosts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gapmiss force-pushed the feat/voice-dictation branch from 9c8b97c to 795bef0 on March 18, 2026 at 02:47
@my-claude-utils (Owner) commented

Hey @gapmiss, thanks for the PR and for taking the time to build this out! Voice dictation via whisper.cpp is a genuinely cool idea, and the local-only approach fits well with clsh's philosophy.

That said, I'm going to close this one. The CLI is still very early and I want to keep the core lean and focused on the terminal experience for now. Adding external system dependencies (ffmpeg, whisper-cpp) and a new API surface is more than I'm comfortable merging without community discussion first.

I've opened a feature request issue so the community can weigh in. If there's enough interest, we can figure out the right way to integrate it (maybe as a plugin system down the road).

If you want to use this yourself in the meantime, a fork would work great. Appreciate the contribution!

