Native speech-to-text for Arch / Omarchy - Fast, accurate and easy system-wide dictation
local or cloud | waybar integration | audio and visual feedback | any asr model! | gpu accel
- Optimized for Arch Linux / Omarchy - Seamless integration with Arch Linux via the AUR
- Local, very fast defaults - State-of-the-art, fast speech recognition via in-memory Whisper
- Cross-platform GPU support - Automatic detection and acceleration for NVIDIA (CUDA) / AMD (ROCm)
- Supports >any< ASR backend - Parakeet-v3? Cloud API? New-thing? Use the API and templates
- Word overrides - Customize transcriptions, prompt and corrections
- Multi-lingual - Use a multi-language model and speak your own language
- Paste anywhere - Toggle recording or push to talk; text is pasted into the active buffer without additional keypresses
- Omarchy or Arch Linux
- NVIDIA GPU (optional, for CUDA acceleration)
- AMD GPU (optional, for ROCm acceleration)
For Arch Linux and Omarchy users, hyprwhspr is available via the AUR.
# Install hyprwhspr
yay -S hyprwhspr
# Run interactive setup
hyprwhspr setup
The setup will:
- ✅ Configure transcription backend (pywhispercpp, Parakeet-v3 or REST API)
- ✅ Set up systemd user services
- ✅ Configure Waybar integration (if Waybar is installed)
- ✅ Download models (if using pywhispercpp backend)
- ✅ Set up permissions
- ✅ Validate installation
Ensure your microphone of choice is available in audio settings!
- Log out and back in (for group permissions)
- Press Super+Alt+D to start dictation - beep!
- Speak naturally
- Press Super+Alt+D again to stop dictation - boop!
- Bam! Text appears in active buffer!
Any snags, please create an issue or visit Omarchy Discord.
# Update via your AUR helper
yay -Syu hyprwhspr
# If needed, re-run setup (idempotent)
hyprwhspr setup
After installation, use the hyprwhspr CLI to manage your installation:
- hyprwhspr setup - Interactive initial setup
- hyprwhspr config - Manage configuration (init/show/edit)
- hyprwhspr waybar - Manage Waybar integration (install/remove/status)
- hyprwhspr systemd - Manage systemd services (install/enable/disable/status/restart)
- hyprwhspr model - Manage models (download/list/status)
- hyprwhspr status - Overall status check
- hyprwhspr validate - Validate installation
hyprwhspr supports two configurable interaction modes:
Toggle mode (default):
- Super+Alt+D - Toggle dictation on/off
Push-to-talk mode:
- Hold Super+Alt+D - Start dictation
- Release Super+Alt+D - Stop dictation
Edit ~/.config/hyprwhspr/config.json:
Minimal config - only 2 essential options:
{
    "primary_shortcut": "SUPER+ALT+D",
    "model": "base.en"
}
Push-to-talk mode - hold to record, release to stop:
{
    "push_to_talk": true
}
- push_to_talk: false (default) - Toggle mode: press to start, press again to stop
- push_to_talk: true - Push-to-talk mode: hold to record, release to stop
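The difference between the two modes comes down to how press and release events on the shortcut are handled. The following is a minimal sketch of that logic, not hyprwhspr's actual implementation; the class and method names are hypothetical, and only the `push_to_talk` flag comes from the config above.

```python
from dataclasses import dataclass


@dataclass
class DictationState:
    """Tracks whether recording is active for either interaction mode."""

    push_to_talk: bool = False  # mirrors the "push_to_talk" config option
    recording: bool = False

    def on_shortcut_pressed(self) -> None:
        if self.push_to_talk:
            self.recording = True                # hold: start on press
        else:
            self.recording = not self.recording  # toggle: flip on each press

    def on_shortcut_released(self) -> None:
        if self.push_to_talk:
            self.recording = False               # release: stop recording
        # In toggle mode a key release is ignored.
```

In toggle mode only presses matter, so a press/release pair leaves recording on until the next press; in push-to-talk mode the release is what stops it.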
REST API backends - use any ASR backend via HTTP API (can run locally or remotely):
The installer will walk you through remote / Cloud endpoints:
Local Parakeet v3
Fastest, latest, and apparently the best! GPU acceleration recommended, not required.
OpenAI
Bring an API key from OpenAI, and choose from:
- GPT-4o Transcribe - Latest model with best accuracy
- GPT-4o Mini Transcribe - Faster, lighter model
- Whisper 1 - Legacy Whisper model
Groq
Bring an API key from Groq, and choose from:
- Whisper Large V3 - High accuracy processing
- Whisper Large V3 Turbo - Fastest transcription speed
Any arbitrary backend:
Or connect to any backend, local or cloud, via your own custom backend:
{
    "transcription_backend": "rest-api",
    "rest_endpoint_url": "https://your-server.example.com/transcribe",
    "rest_headers": { // optional arbitrary headers
        "authorization": "Bearer your-api-key-here"
    },
    "rest_body": { // optional body fields merged with defaults
        "model": "custom-model"
    },
    "rest_api_key": "your-api-key-here", // equivalent to rest_headers: { authorization: Bearer your-api-key-here }
    "rest_timeout": 30 // optional, default: 30
}
Custom hotkey - extensive key support:
{
    "primary_shortcut": "CTRL+SHIFT+SPACE"
}
Supported key types:
- Modifiers: ctrl, alt, shift, super (left) or rctrl, ralt, rshift, rsuper (right)
- Function keys: f1 through f24
- Letters: a through z
- Numbers: 1 through 9, 0
- Arrow keys: up, down, left, right
- Special keys: enter, space, tab, esc, backspace, delete, home, end, pageup, pagedown
- Lock keys: capslock, numlock, scrolllock
- Media keys: mute, volumeup, volumedown, play, nextsong, previoussong
- Numpad: kp0 through kp9, kpenter, kpplus, kpminus
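As a rough illustration of how a shortcut string such as "CTRL+SHIFT+SPACE" could be resolved into Linux evdev key names, here is a small sketch. The alias table is a hypothetical subset and `parse_shortcut` is an illustrative name; hyprwhspr's real lookup may differ. Tokens already written in evdev form (`KEY_...`) are passed through unchanged.

```python
# Hypothetical alias table: a small subset of the aliases listed above,
# mapped to standard Linux evdev key names.
ALIASES = {
    "ctrl": "KEY_LEFTCTRL", "rctrl": "KEY_RIGHTCTRL",
    "alt": "KEY_LEFTALT", "ralt": "KEY_RIGHTALT",
    "shift": "KEY_LEFTSHIFT", "rshift": "KEY_RIGHTSHIFT",
    "super": "KEY_LEFTMETA", "rsuper": "KEY_RIGHTMETA",
    "enter": "KEY_ENTER", "space": "KEY_SPACE", "esc": "KEY_ESC",
}


def parse_shortcut(shortcut: str) -> list[str]:
    """Split a string like 'CTRL+SHIFT+SPACE' into evdev key names."""
    keys = []
    for part in shortcut.split("+"):
        token = part.strip().lower()
        if not token:
            raise ValueError(f"empty key in shortcut: {shortcut!r}")
        if token.startswith("key_"):
            keys.append(token.upper())          # already an evdev name
        elif token in ALIASES:
            keys.append(ALIASES[token])
        else:
            # Letters, digits, f1..f24 and kp0..kp9 follow the KEY_<NAME>
            # pattern; this sketch does not validate unknown tokens.
            keys.append("KEY_" + token.upper())
    return keys
```

Matching is case-insensitive here, so "F12" and "f12" resolve to the same key.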
Or use direct evdev key names for any key not in the alias list:
{
    "primary_shortcut": "SUPER+KEY_COMMA"
}
Examples:
- "SUPER+SHIFT+M" - Super + Shift + M
- "CTRL+ALT+F1" - Ctrl + Alt + F1
- "F12" - Just F12 (no modifier)
- "RCTRL+RSHIFT+ENTER" - Right Ctrl + Right Shift + Enter
Word overrides - customize transcriptions:
{
    "word_overrides": {
        "hyperwhisper": "hyprwhspr",
        "omarchie": "Omarchy"
    }
}
Audio feedback - optional sound notifications:
{
    "audio_feedback": true, // Enable audio feedback (default: false)
    "start_sound_volume": 0.3, // Start recording sound volume (0.1 to 1.0)
    "stop_sound_volume": 0.3, // Stop recording sound volume (0.1 to 1.0)
    "start_sound_path": "custom-start.ogg", // Custom start sound (relative to assets)
    "stop_sound_path": "custom-stop.ogg" // Custom stop sound (relative to assets)
}
Default sounds included:
- Start recording: ping-up.ogg (ascending tone)
- Stop recording: ping-down.ogg (descending tone)
Custom sounds:
- Supported formats: .ogg, .wav, .mp3
- Fallback: Uses defaults if custom files don't exist
Thanks for the sounds, @akx!
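The fallback rule for custom sounds can be pictured as follows. This is only a sketch under assumptions: the assets directory is the one listed in the troubleshooting commands in this README, `resolve_sound` and `clamp_volume` are hypothetical helper names, and clamping out-of-range volumes (rather than rejecting them) is a guess at reasonable behavior.

```python
from pathlib import Path

# Assumed base directory for relative sound paths (from the troubleshooting
# section of this README).
ASSETS_DIR = Path("/usr/lib/hyprwhspr/share/assets")


def resolve_sound(custom_name, default_name, assets_dir=ASSETS_DIR):
    """Return the custom sound if it exists under assets_dir, else the default."""
    if custom_name:
        candidate = Path(assets_dir) / custom_name
        if candidate.is_file():
            return candidate
    return Path(assets_dir) / default_name


def clamp_volume(volume, low=0.1, high=1.0):
    """Keep a configured volume inside the documented 0.1 to 1.0 range."""
    return max(low, min(high, float(volume)))
```

For example, `resolve_sound("custom-start.ogg", "ping-up.ogg")` returns the custom file only when it is actually present, so a typo in the config still leaves you with the default ping.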
Text replacement: Automatically converts spoken words to symbols / punctuation:
Punctuation:
- "period" → "."
- "comma" → ","
- "question mark" → "?"
- "exclamation mark" → "!"
- "colon" → ":"
- "semicolon" → ";"
Symbols:
- "at symbol" → "@"
- "hash" → "#"
- "plus" → "+"
- "equals" → "="
- "dash" → "-"
- "underscore" → "_"
Brackets:
- "open paren" → "("
- "close paren" → ")"
- "open bracket" → "["
- "close bracket" → "]"
- "open brace" → "{"
- "close brace" → "}"
Special commands:
- "new line" → new line
- "tab" → tab character
Speech-to-text replacement list via WhisperTux, thanks @cjams!
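To make the post-processing concrete, here is a minimal sketch that applies `word_overrides` first and then a few of the spoken-symbol replacements listed above. It is an illustration, not the project's code: the function names, the whole-word case-insensitive matching, and the final whitespace cleanup are all assumptions.

```python
import re

# A few entries from the list above; the full table would be longer.
SPOKEN_REPLACEMENTS = {
    "question mark": "?",
    "period": ".",
    "comma": ",",
    "open paren": "(",
    "close paren": ")",
    "new line": "\n",
}


def apply_replacements(text, table):
    """Replace whole-word, case-insensitive matches, longest phrases first."""
    for phrase in sorted(table, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        # A callable replacement avoids backslash-escape handling in re.sub.
        text = re.sub(pattern, lambda _m, r=table[phrase]: r, text,
                      flags=re.IGNORECASE)
    return text


def postprocess(text, word_overrides):
    """Apply user overrides, then spoken symbols, then tidy spacing."""
    text = apply_replacements(text, word_overrides)
    text = apply_replacements(text, SPOKEN_REPLACEMENTS)
    # Attach punctuation to the preceding word: "hello ," -> "hello,"
    return re.sub(r"\s+([.,?!:;)])", r"\1", text)
```

With the overrides from the config example, `postprocess("I use hyperwhisper comma it works period", {"hyperwhisper": "hyprwhspr"})` yields `"I use hyprwhspr, it works."`.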
Clipboard behavior - control what happens to clipboard after text injection:
{
    "clipboard_behavior": false, // Boolean: true = clear after delay, false = keep (default: false)
    "clipboard_clear_delay": 5.0 // Float: seconds to wait before clearing (default: 5.0, only used if clipboard_behavior is true)
}
- clipboard_behavior: true - Clipboard is automatically cleared after the specified delay
- clipboard_clear_delay - How long to wait before clearing (only matters when clipboard_behavior is true)
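The interaction between the two clipboard options can be sketched with a simple timer. This is a hedged illustration: `schedule_clipboard_clear` is a hypothetical helper, and the actual clearing is left to a caller-supplied function (on Wayland that might be a wrapper around `wl-copy --clear`, which is an assumption about the environment).

```python
import threading


def schedule_clipboard_clear(config, clear_clipboard):
    """Start a timer that clears the clipboard if clipboard_behavior is true.

    `clear_clipboard` is any zero-argument callable that empties the
    clipboard. Returns the started timer, or None when clearing is disabled.
    """
    if not config.get("clipboard_behavior", False):
        return None  # default: keep clipboard contents
    delay = float(config.get("clipboard_clear_delay", 5.0))
    timer = threading.Timer(delay, clear_clipboard)
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer
```

With `clipboard_behavior` left at its default of `false`, the delay value is never read, which matches the note that it only matters when clearing is enabled.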
Paste behavior - control how text is pasted into applications:
{
    "paste_mode": "ctrl_shift" // "super" | "ctrl_shift" | "ctrl" (default: "ctrl_shift")
}
Paste behavior options:
- "ctrl_shift" (default) - Sends Ctrl+Shift+V. Works in most terminals.
- "super" - Sends Super+V. Omarchy default. May be finicky.
- "ctrl" - Sends Ctrl+V. Standard GUI paste.
Add dynamic tray icon to your ~/.config/waybar/config:
{
    "custom/hyprwhspr": {
        "exec": "/usr/lib/hyprwhspr/config/hyprland/hyprwhspr-tray.sh status",
        "interval": 2,
        "return-type": "json",
        "exec-on-event": true,
        "format": "{}",
        "on-click": "/usr/lib/hyprwhspr/config/hyprland/hyprwhspr-tray.sh toggle",
        "on-click-right": "/usr/lib/hyprwhspr/config/hyprland/hyprwhspr-tray.sh start",
        "on-click-middle": "/usr/lib/hyprwhspr/config/hyprland/hyprwhspr-tray.sh restart",
        "tooltip": true
    }
}
Add CSS styling to your ~/.config/waybar/style.css:
@import "/usr/lib/hyprwhspr/config/waybar/hyprwhspr-style.css";
Waybar icon click interactions:
- Left-click: Toggle Hyprwhspr on/off
- Right-click: Start Hyprwhspr (if not running)
- Middle-click: Restart Hyprwhspr
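Because the module uses "return-type": "json", the status command has to print a JSON object that Waybar understands (fields such as `text`, `tooltip` and `class`). The sketch below shows what such output could look like; the state names and label text are hypothetical and are not taken from the real hyprwhspr-tray.sh script.

```python
import json


def waybar_status(recording, service_running=True):
    """Return one line of JSON in the shape Waybar expects for return-type json."""
    if not service_running:
        state, text = "stopped", "off"
    elif recording:
        state, text = "recording", "rec"
    else:
        state, text = "ready", "mic"
    # "class" lets the stylesheet color the module per state.
    return json.dumps({"text": text, "class": state,
                       "tooltip": f"hyprwhspr: {state}"})
```

Waybar re-runs the `exec` command at the configured `interval`, so a module like this would reflect a state change within about two seconds.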
Default model installed: ggml-base.en.bin (~148MB) to ~/.local/share/pywhispercpp/models/
GPU Acceleration (NVIDIA & AMD):
- NVIDIA (CUDA) and AMD (ROCm) are detected automatically; pywhispercpp will use the GPU when selected
- ⚠️ AMD ROCm 7.x / hipBLAS: ROCm 7.0+ introduced breaking changes to hipBLAS datatype signatures. As of now, ggml's HIP backend is compatible with ROCm 6.x; with ROCm 7.x the build fails with errors and transcription falls back to CPU.
CPU performance options - improve cpu transcription speed:
{
    "threads": 4 // thread count for whisper cpu processing
}
Available models to download:
- tiny - Fastest, good for real-time dictation
- base - Best balance of speed/accuracy (recommended)
- small - Better accuracy, still fast
- medium - High accuracy, slower processing
- large - Best accuracy, requires GPU acceleration for reasonable speed
- large-v3 - Latest large model, requires GPU acceleration for reasonable speed
large and large-v3 require GPU acceleration to perform.
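Model names map onto files in the pywhispercpp models directory. The helpers below sketch that mapping, assuming the `ggml-<name>.bin` naming and the Hugging Face URL pattern used in this README's download commands; the function names themselves are hypothetical.

```python
from pathlib import Path

MODELS_DIR = Path.home() / ".local/share/pywhispercpp/models"
BASE_URL = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main"


def model_filename(model):
    """Map a config value such as 'base.en' to its ggml file name."""
    return f"ggml-{model}.bin"


def model_url(model):
    """Download URL, following the pattern of the wget commands in this README."""
    return f"{BASE_URL}/{model_filename(model)}"


def model_path(model, models_dir=MODELS_DIR):
    """Where the model file is expected on disk."""
    return Path(models_dir) / model_filename(model)


def is_english_only(model):
    """'.en' models are English-only; the others are multilingual."""
    return model.endswith(".en")
```

A quick check such as `model_path("base.en").is_file()` can tell you whether a model named in the config has actually been downloaded.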
cd ~/.local/share/pywhispercpp/models/
# Tiny models (fastest, least accurate)
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.en.bin
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.bin
# Base models (good balance)
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin
# Small models (better accuracy)
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.en.bin
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
# Medium models (high accuracy)
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.en.bin
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.bin
# Large models (best accuracy, requires GPU)
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large.bin
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3.bin
Update config after downloading:
{
    "model": "small.en" // Or just small if multi-lingual model. If both available, general model is chosen.
}
Language detection - control transcription language:
English-only speakers can use the .en models, which are smaller.
For multi-language detection, ensure you select a model which does not say .en:
{
    "language": null // null = auto-detect (default), or specify language code
}
Language options:
- null (default) - Auto-detect language from audio
- "en" - English transcription
- "nl" - Dutch transcription
- "fr" - French transcription
- "de" - German transcription
- "es" - Spanish transcription
- etc. - Any supported language code
Whisper prompt - customize transcription behavior:
{
    "whisper_prompt": "Transcribe with proper capitalization, including sentence beginnings, proper nouns, titles, and standard English capitalization rules."
}
The prompt influences how Whisper interprets and transcribes your audio, e.g.:
- "Transcribe as technical documentation with proper capitalization, acronyms and technical terminology."
- "Transcribe as casual conversation with natural speech patterns."
- "Transcribe as an ornery pirate on the cusp of scurvy."
Whisper is the default, but any model works via API.
Select Parakeet within hyprwhspr setup.
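For the API route, the REST options from the custom backend config (`rest_endpoint_url`, `rest_headers`, `rest_body`, `rest_api_key`, `rest_timeout`) have to be combined into one request. The sketch below shows one plausible way to do that. It is an assumption-laden illustration: `build_rest_request` is a hypothetical helper, `default_body` stands in for default fields this README does not document, and letting an explicit authorization header take precedence over `rest_api_key` is a guess.

```python
def build_rest_request(config, default_body=None):
    """Combine REST options from config.json into url/headers/body/timeout."""
    headers = dict(config.get("rest_headers", {}))
    api_key = config.get("rest_api_key")
    has_auth = any(name.lower() == "authorization" for name in headers)
    if api_key and not has_auth:
        # Documented equivalence: rest_api_key acts as a Bearer token header.
        headers["authorization"] = f"Bearer {api_key}"
    # rest_body fields are merged over the defaults.
    body = {**(default_body or {}), **config.get("rest_body", {})}
    return {
        "url": config["rest_endpoint_url"],
        "headers": headers,
        "body": body,
        "timeout": config.get("rest_timeout", 30),  # documented default: 30
    }
```

The resulting dictionary could then be handed to whichever HTTP client sends the recorded audio to the endpoint.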
If you're having persistent issues, you can completely reset hyprwhspr:
# Stop services
systemctl --user stop hyprwhspr ydotool
# Remove runtime data
rm -rf ~/.local/share/hyprwhspr/
# Remove user config
rm -rf ~/.config/hyprwhspr/
# Remove system files
sudo rm -rf /usr/lib/hyprwhspr/
And then...
# Then reinstall fresh via AUR
yay -S hyprwhspr
hyprwhspr setup
I heard the sound, but don't see text!
On Arch and other distros, the microphone often has to be plugged in and selected again each time you log in, including after a restart. In your sound settings, make sure the intended microphone is selected; the sound utility shows input-level feedback from the microphone when it is.
Hotkey not working:
# Check service status for hyprwhspr
systemctl --user status hyprwhspr.service
# Check logs
journalctl --user -u hyprwhspr.service -f
# Check service status for ydotool
systemctl --user status ydotool.service
# Check logs
journalctl --user -u ydotool.service -f
Permission denied:
# Fix uinput permissions
hyprwhspr setup
# Log out and back in
No audio input:
Is your mic actually available?
# Check audio devices
pactl list short sources
# Restart PipeWire
systemctl --user restart pipewire
Audio feedback not working:
# Check if audio feedback is enabled in config
cat ~/.config/hyprwhspr/config.json | grep audio_feedback
# Verify sound files exist
ls -la /usr/lib/hyprwhspr/share/assets/
# Check if ffplay/aplay/paplay is available
which ffplay aplay paplay
Model not found:
# Check if model exists
ls -la ~/.local/share/pywhispercpp/models/
# Download a different model
cd ~/.local/share/pywhispercpp/models/
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
# Verify model path in config
cat ~/.config/hyprwhspr/config.json | grep model
Stuck recording state:
# Check service health and auto-recover
/usr/lib/hyprwhspr/config/hyprland/hyprwhspr-tray.sh health
# Manual restart if needed
systemctl --user restart hyprwhspr.service
# Check service status
systemctl --user status hyprwhspr.service
- Check logs: journalctl --user -u hyprwhspr.service and journalctl --user -u ydotool.service
- Verify permissions: Run the permissions fix script
- Test components: Check ydotool, audio devices, whisper.cpp
- Report issues: Create an issue or visit Omarchy Discord - logging info helpful!
MIT License - see LICENSE file.
Create an issue, happy to help!
For pull requests, also best to start with an issue.
Built with ❤️ in 🇨🇦 for the Omarchy community
Integrated and natural speech-to-text.
{ "primary_shortcut": "SUPER+ALT+D", "model": "base.en" }