Speak2

Local voice dictation for macOS. Hold the fn key (configurable) to speak, release to transcribe. Works with any application.

100% on-device using WhisperKit or Parakeet - no cloud services, no data leaves your Mac.

Speech Recognition Models

Speak2 supports two speech recognition models:

Model	Size	Languages	Best For
Whisper (base.en)	~140 MB	English only	Fast, accurate English transcription
Parakeet v3	~600 MB	25 languages	Multilingual users

You can download both and switch between them from the menu bar. Only one model is loaded at a time to conserve memory.
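
The "one model loaded at a time" rule can be pictured with a small sketch. The type and member names below (SpeechModel, ModelRegistry, switchTo) are hypothetical and are not taken from the Speak2 source; only the sizes and language counts come from the table above.

```swift
// Hypothetical sketch: names are illustrative, not Speak2's actual types.
enum SpeechModel: String, CaseIterable {
    case whisperBaseEn = "Whisper (base.en)"
    case parakeetV3 = "Parakeet v3"

    /// Approximate download size in megabytes (from the table above).
    var approximateSizeMB: Int {
        switch self {
        case .whisperBaseEn: return 140
        case .parakeetV3: return 600
        }
    }

    /// Number of supported languages (from the table above).
    var languageCount: Int {
        switch self {
        case .whisperBaseEn: return 1
        case .parakeetV3: return 25
        }
    }
}

/// Tracks which models are downloaded and keeps at most one loaded.
struct ModelRegistry {
    private(set) var downloaded: Set<SpeechModel> = []
    private(set) var loaded: SpeechModel? = nil

    mutating func markDownloaded(_ model: SpeechModel) {
        downloaded.insert(model)
    }

    /// Switches to `model` if it has been downloaded, replacing the previous one.
    /// Returns false and changes nothing when the model is not downloaded yet.
    @discardableResult
    mutating func switchTo(_ model: SpeechModel) -> Bool {
        guard downloaded.contains(model) else { return false }
        loaded = model
        return true
    }
}
```

Because `loaded` is a single optional value rather than a collection, switching models necessarily drops the previous one.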

Requirements

macOS 14.0 or later
Apple Silicon Mac (M1/M2/M3)

Installation

From DMG (recommended)

Download the latest .dmg from the releases page and install.

Build from source

git clone https://github.com/zachswift615/speak2.git
cd speak2
swift build -c release

Run

swift run

Or run the release binary directly:

.build/release/Speak2

First Launch Setup

On first launch, a setup window will appear. You need to:

1. Grant Accessibility Permission

This is required for global fn key detection.

DMG installs

Click "Grant" next to Accessibility on the first launch window

Then click Open System Settings

Then find Speak2 in the list, toggle the permission switch on, and authenticate with your password or fingerprint. If Speak2 is not in the list, click the + button, navigate to the Applications directory where you installed it, and add Speak2 to the list of apps.

Building from source

Option A: Add Speak2 directly

Open System Settings > Privacy & Security > Accessibility
Click the + button
Press Cmd+Shift+G and paste the path to the built executable: .build/release/Speak2 inside your clone of the repository (or wherever you built it)
Select the Speak2 executable and enable it

Option B: Enable Terminal (easier for development)

Open System Settings > Privacy & Security > Accessibility
Find Terminal in the list and toggle it ON
This allows any app run from Terminal to use accessibility features

2. Grant Microphone Permission

Click "Grant" next to Microphone, then click "Allow" in the permission dialog that appears.

3. Download Speech Model

Choose a model and click "Download":

Whisper (base.en) - ~140MB, English only, faster
Parakeet v3 - ~600MB, 25 languages, best for multilingual users

Note: Parakeet takes longer to load initially (~20-30 seconds) as it compiles the neural engine model. Subsequent loads are faster. The menu bar icon will show a spinning indicator while loading.

Once all three items show checkmarks, the setup window will indicate completion and you can close it.

Usage

Hold the fn key - Recording starts (menu bar icon turns red)
Speak - Say what you want to type
Release fn key - Transcription happens (icon shows spinner), then text is pasted

The transcribed text is automatically pasted into whatever application text field has focus.

Menu Bar

Speak2 runs as a menu bar app (no dock icon). Look for the microphone icon:

White/Black (depending on macOS theme) - Idle, ready to record
Yellow spinning arrows - Loading model
Red mic - Recording in progress
Cyan spinner - Transcribing

The menu shows a status line at the top indicating the current state (e.g., "Ready – Whisper (base.en)").
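
The usage flow and the icon states above amount to a small state machine. The following is a minimal sketch of that idea; AppState, AppEvent, nextState and iconDescription are hypothetical names chosen for illustration, and the real app may organize this differently.

```swift
// Hypothetical sketch of the hold-to-talk flow; not Speak2's actual code.
enum AppState: Equatable {
    case loadingModel
    case idle
    case recording
    case transcribing
}

enum AppEvent {
    case modelLoaded
    case hotkeyPressed
    case hotkeyReleased
    case transcriptionFinished
}

/// Returns the state that follows `state` when `event` occurs.
/// Events that do not apply in the current state are ignored.
func nextState(from state: AppState, on event: AppEvent) -> AppState {
    switch (state, event) {
    case (.loadingModel, .modelLoaded): return .idle
    case (.idle, .hotkeyPressed): return .recording
    case (.recording, .hotkeyReleased): return .transcribing
    case (.transcribing, .transcriptionFinished): return .idle
    default: return state
    }
}

/// Describes the menu bar icon for each state, following the list above.
func iconDescription(for state: AppState) -> String {
    switch state {
    case .idle: return "white/black mic"
    case .loadingModel: return "yellow spinning arrows"
    case .recording: return "red mic"
    case .transcribing: return "cyan spinner"
    }
}
```

Ignoring inapplicable events means, for example, that pressing the hotkey while a model is still loading leaves the state unchanged in this sketch.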

Switching Models

Click the menu bar icon and select Model to switch between downloaded models. Models not yet downloaded show a ↓ indicator - clicking them opens the setup window to download.

Manage Models

Click Manage Models... to open the setup window where you can download additional models or delete existing ones to free up disk space.

Choosing Hotkey

You can choose from several hotkey options. Sometimes external keyboards don't send the function key reliably. In that case, you can choose one of the other options from the menu.
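
Whichever hotkey is selected, hold-to-talk comes down to noticing when a modifier bit turns on and when it turns off again. The sketch below shows that edge detection on raw modifier flag values, assuming the hotkey arrives as a change in modifier flags (as the fn key typically does). The names hotkeyEdge and HotkeyEdge are hypothetical; the constant mirrors the raw value of CGEventFlags.maskSecondaryFn so the example does not need CoreGraphics.

```swift
// Raw value of CGEventFlags.maskSecondaryFn on macOS (the fn modifier bit).
let fnMask: UInt64 = 0x80_0000

// Hypothetical type for illustration; not taken from the Speak2 source.
enum HotkeyEdge: Equatable {
    case pressed
    case released
}

/// Compares the modifier flags of two consecutive events and reports whether
/// the key behind `mask` went down, went up, or did not change (nil).
func hotkeyEdge(previous: UInt64, current: UInt64, mask: UInt64) -> HotkeyEdge? {
    let wasDown = (previous & mask) != 0
    let isDown = (current & mask) != 0
    switch (wasDown, isDown) {
    case (false, true): return .pressed
    case (true, false): return .released
    default: return nil
    }
}
```

If a keyboard never sets the fn bit, this function never reports an edge, which is consistent with the advice to pick a different hotkey in that case.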

Launch at Login

You can choose to have Speak2 launch at login. If selected, a checkmark will appear beside this option. Click it again to remove Speak2 from the list of startup apps.

Quit Speak2

Click the menu bar icon and click "Quit Speak2".

How It Works

HotkeyManager - Detects hotkey press/release using CGEvent tap
AudioRecorder - Captures microphone audio at 16kHz mono PCM
ModelManager - Handles model downloading, loading, and switching
WhisperTranscriber - Runs WhisperKit on-device for speech-to-text
ParakeetTranscriber - Runs FluidAudio/Parakeet on-device for speech-to-text
TextInjector - Copies transcription to clipboard and simulates Cmd+V to paste

The selected model stays loaded in memory (~300-600MB RAM depending on model) for instant transcription.
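
The TextInjector step (copy to clipboard, then paste) can be sketched as follows. This is an assumption-laden illustration, not the project's code: the Clipboard protocol, InMemoryClipboard and injectText are hypothetical, and restoring the previous contents is one plausible reading of "temporarily overwrites clipboard contents" under Known Limitations. In the real app the clipboard would be the system pasteboard and the paste action would be a simulated Cmd+V.

```swift
// Hypothetical abstraction over the system clipboard, so the logic can run anywhere.
protocol Clipboard: AnyObject {
    var text: String? { get set }
}

/// In-memory stand-in for the system pasteboard.
final class InMemoryClipboard: Clipboard {
    var text: String?
    init(text: String? = nil) { self.text = text }
}

/// Places `transcription` on the clipboard, triggers the paste action,
/// then puts the previous clipboard contents back.
func injectText(_ transcription: String, using clipboard: Clipboard, paste: () -> Void) {
    let previous = clipboard.text     // remember what the user had copied
    clipboard.text = transcription    // overwrite with the transcription
    paste()                           // e.g. a simulated Cmd+V in the real app
    clipboard.text = previous         // restore the earlier contents
}
```

A real implementation would probably need a short delay before restoring, because a simulated keystroke is delivered asynchronously and the target application reads the clipboard only when it handles the paste.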

Tips

Speak naturally with punctuation inflection - Whisper handles periods, commas, and question marks based on your tone
Keep recordings under 30 seconds for best performance
First transcription may be slightly slower as the model warms up
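
To put the 30-second guideline in perspective, the arithmetic below estimates how much audio data a recording produces at the 16 kHz mono rate mentioned under How It Works. The sample width is an assumption (the README does not state it), so it is passed as a parameter; the function names are hypothetical.

```swift
// Sample rate stated in "How It Works": 16 kHz, mono.
let sampleRate = 16_000

/// Number of mono samples captured in `seconds` of audio.
func sampleCount(seconds: Double) -> Int {
    return Int((seconds * Double(sampleRate)).rounded())
}

/// Size in bytes of `seconds` of audio.
/// `bytesPerSample` is an assumption: 4 for 32-bit float PCM, 2 for 16-bit PCM.
func byteCount(seconds: Double, bytesPerSample: Int) -> Int {
    return sampleCount(seconds: seconds) * bytesPerSample
}
```

For a 30-second recording this gives 480,000 samples, or about 1.9 MB as 32-bit floats, which is small compared with the memory the loaded model occupies.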

Known Limitations

Parakeet model takes ~20-30 seconds to load on first use (compiling neural engine model)
Uses clipboard for text injection (temporarily overwrites clipboard contents)
fn key detection requires Accessibility permission
Only tested on Apple Silicon Macs

Tech Stack

Swift + SwiftUI
WhisperKit - Argmax's Whisper implementation, optimized for Apple Silicon
FluidAudio - Parakeet speech recognition for Apple Silicon
AVFoundation for audio capture
CGEvent for global hotkey detection

License

MIT
