Note

This project was previously named faster-whisper-server. I've decided to change the name, as the project has evolved to support more than just ASR.

Speaches

speaches is an OpenAI API-compatible server supporting streaming transcription, translation, and speech generation. Speech-to-Text is powered by faster-whisper, and Text-to-Speech by piper and Kokoro. This project aims to be the Ollama of TTS/STT models.

Try it out on the HuggingFace Space

See the documentation for installation instructions and usage: speaches.ai

Features:

  • OpenAI API compatible. All tools and SDKs that work with OpenAI's API should work with speaches (see the sketch after this list).
  • Audio generation (chat completions endpoint) | OpenAI Documentation
    • Generate a spoken audio summary of a body of text (text in, audio out)
    • Perform sentiment analysis on a recording (audio in, text out)
    • Async speech to speech interactions with a model (audio in, audio out)
  • Streaming support: transcription is sent via SSE as the audio is transcribed, so you don't have to wait for the entire file to be transcribed before receiving results.
  • Dynamic model loading / offloading. Just specify the model you want in the request; it will be loaded automatically and then unloaded after a period of inactivity.
  • Text-to-Speech via kokoro (ranked #1 in the TTS Arena) and piper models.
  • GPU and CPU support.
  • Deployable via Docker Compose / Docker
  • Highly configurable
  • Realtime API
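
Because the server implements the OpenAI REST API, the official openai Python SDK can simply be pointed at a local speaches instance. Below is a minimal sketch, assuming the server listens on http://localhost:8000 and that the model and voice names shown are available on your instance (they are illustrative, not prescriptive; substitute whatever models your server has downloaded):

```python
from openai import OpenAI

# Point the official OpenAI SDK at the local speaches server.
# The API key is just a placeholder; the SDK requires a non-empty value.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="cant-be-empty")

# Speech-to-Text: the model named in the request is loaded on demand
# (and unloaded again after a period of inactivity).
with open("audio.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="Systran/faster-whisper-small",  # assumed model id
        file=audio_file,
    )
print(transcript.text)

# Text-to-Speech: generate spoken audio with a Kokoro voice.
speech = client.audio.speech.create(
    model="hexgrad/Kokoro-82M",  # assumed model id
    voice="af_sky",              # assumed voice name
    input="Hello from speaches!",
)
speech.write_to_file("hello.mp3")
```

Streaming transcription over SSE and the Realtime API go beyond the plain SDK calls above; see the documentation at speaches.ai for those endpoints.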

Please create an issue if you find a bug, have a question, or have a feature suggestion.

Demo

Streaming Transcription

TODO

Speech Generation

2025-01-12_13-20-58.webm