pyannote

Speaker Intelligence Platform for developers

Identify who speaks when with pyannote

💚 Simply detect, segment, label, and separate speakers in any language

🎤 What is speaker diarization?

Speaker diarization is the process of automatically partitioning the audio recording of a conversation into segments and labeling them by speaker, answering the question "who spoke when?". As the foundational layer of conversational AI, speaker diarization provides high-level insights for human-human and human-machine conversations, and unlocks a wide range of downstream applications: meeting transcription, call center analytics, voice agents, video dubbing.
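
To make the "who spoke when?" output concrete, here is a minimal sketch in plain Python. The `Turn` tuple, the `talk_time` helper, and the sample values are illustrative assumptions, not pyannote data structures.

```python
from collections import defaultdict
from typing import NamedTuple


class Turn(NamedTuple):
    """One speaker turn (hypothetical structure, for illustration only)."""
    start: float  # seconds
    end: float    # seconds
    speaker: str


def talk_time(turns):
    """Sum the duration of every turn, grouped by speaker label."""
    totals = defaultdict(float)
    for turn in turns:
        totals[turn.speaker] += turn.end - turn.start
    return dict(totals)


# Made-up diarization of a 10-second conversation.
# The two speakers overlap between 4.0s and 4.5s.
turns = [
    Turn(0.0, 4.5, "SPEAKER_00"),
    Turn(4.0, 7.0, "SPEAKER_01"),
    Turn(7.0, 10.0, "SPEAKER_00"),
]
totals = talk_time(turns)
```

Downstream applications such as meeting transcription or call center analytics typically start from this kind of per-speaker segment list.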

▶️ Getting started

Install the latest release of pyannote.audio with either uv (recommended) or pip:

$ uv add pyannote.audio
$ pip install pyannote.audio

Enjoy state-of-the-art speaker diarization:

# download the pretrained pipeline from Hugging Face
# (replace HUGGINGFACE_TOKEN with your own access token)
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-community-1", token="HUGGINGFACE_TOKEN")

# perform speaker diarization locally
output = pipeline("/path/to/audio.wav")

# print each speaker turn with its start and end times
for turn, speaker in output.speaker_diarization:
    print(f"{speaker} speaks between t={turn.start}s and t={turn.end}s")

Read the community-1 model card to make the most of it.
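
The speaker turns printed above can also be saved for later evaluation. The sketch below formats them as RTTM "SPEAKER" records, a common plain-text format for diarization output. The `to_rttm_lines` helper and the sample turns are hypothetical; this is a hand-written formatter, not a pyannote API.

```python
def to_rttm_lines(turns, uri):
    """Format (start, end, speaker) tuples as RTTM 'SPEAKER' records.

    `uri` is the recording identifier written in the second field.
    """
    lines = []
    for start, end, speaker in turns:
        duration = end - start
        lines.append(
            f"SPEAKER {uri} 1 {start:.3f} {duration:.3f} <NA> <NA> {speaker} <NA> <NA>"
        )
    return lines


# With the pipeline output above, the tuples could be collected as:
# turns = [(turn.start, turn.end, speaker)
#          for turn, speaker in output.speaker_diarization]
# Here we use made-up values so the example runs without a model.
turns = [(0.0, 1.5, "SPEAKER_00"), (1.5, 3.25, "SPEAKER_01")]
rttm = to_rttm_lines(turns, uri="audio")
```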

🏆 State-of-the-art models

The pyannoteAI research team trains cutting-edge speaker diarization models, thanks to the Jean Zay 🇫🇷 supercomputer managed by GENCI 💚. They come in two flavors:

  • pyannote.audio open models, available on Hugging Face and used by 140k+ developers around the world;
  • premium models, available on pyannoteAI cloud (and on-premise for enterprise customers), that provide state-of-the-art speaker diarization as well as additional enterprise features.

| Benchmark (last updated in 2025-09) | legacy (3.1) | community-1 | precision-2 |
| --- | --- | --- | --- |
| AISHELL-4 | 12.2 | 11.7 | 11.4 🏆 |
| AliMeeting (channel 1) | 24.5 | 20.3 | 15.2 🏆 |
| AMI (IHM) | 18.8 | 17.0 | 12.9 🏆 |
| AMI (SDM) | 22.7 | 19.9 | 15.6 🏆 |
| AVA-AVD | 49.7 | 44.6 | 37.1 🏆 |
| CALLHOME (part 2) | 28.5 | 26.7 | 16.6 🏆 |
| DIHARD 3 (full) | 21.4 | 20.2 | 14.7 🏆 |
| Ego4D (dev.) | 51.2 | 46.8 | 39.0 🏆 |
| MSDWild | 25.4 | 22.8 | 17.3 🏆 |
| RAMC | 22.2 | 20.8 | 10.5 🏆 |
| REPERE (phase2) | 7.9 | 8.9 | 7.4 🏆 |
| VoxConverse (v0.3) | 11.2 | 11.2 | 8.5 🏆 |

Diarization error rate (in %, the lower, the better)
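
For readers unfamiliar with the metric, the sketch below computes a diarization error rate from its usual three components: false alarm, missed detection, and speaker confusion, divided by the total duration of reference speech. The function name and the durations are made up for illustration, and the exact evaluation settings behind the benchmark above (for example collar or overlap handling) are not stated here.

```python
def diarization_error_rate(false_alarm, missed_detection, confusion, total):
    """Return DER as a fraction of the total reference speech duration.

    All arguments are durations in seconds.
    """
    if total <= 0:
        raise ValueError("total reference speech duration must be positive")
    return (false_alarm + missed_detection + confusion) / total


# Made-up example: 60 seconds of errors over 600 seconds of reference speech.
der = diarization_error_rate(
    false_alarm=12.0,
    missed_detection=30.0,
    confusion=18.0,
    total=600.0,
)
der_percent = 100 * der
```

Because false alarms are counted in the numerator but not in the denominator, the value can exceed 100% on very noisy recordings.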

⏩️ Going further, better, and faster

The precision-2 premium model further improves accuracy and processing speed, and brings additional features.

Features (community-1 vs. precision-2):

  • Set exact/min/max number of speakers
  • Exclusive speaker diarization (for transcription)
  • Segmentation confidence scores
  • Speaker confidence scores
  • Voiceprinting
  • Speaker identification

Time to process 1h of audio (on H100): 37s with community-1, 14s with precision-2.
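
To put the processing times in perspective, the short sketch below converts them into a speed-up over real time. The helper name is hypothetical; the only inputs are the 37s and 14s figures quoted above.

```python
def speedup_over_real_time(audio_seconds, processing_seconds):
    """How many seconds of audio are processed per second of compute."""
    return audio_seconds / processing_seconds


ONE_HOUR = 3600.0  # seconds of audio

community_1 = speedup_over_real_time(ONE_HOUR, 37.0)   # roughly 97x real time
precision_2 = speedup_over_real_time(ONE_HOUR, 14.0)   # roughly 257x real time

print(f"community-1: {community_1:.0f}x, precision-2: {precision_2:.0f}x")
```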

Create a pyannoteAI account, change one line of code, and enjoy free cloud credits to try precision-2 premium diarization:

# perform premium speaker diarization on pyannoteAI cloud
pipeline = Pipeline.from_pretrained('pyannote/speaker-diarization-precision-2', token="PYANNOTEAI_API_KEY")
better_output = pipeline('/path/to/audio.wav')
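
The "Set exact/min/max number of speakers" feature listed earlier is usually exposed as keyword arguments on the pipeline call. The sketch below builds such arguments, assuming the names `num_speakers`, `min_speakers`, and `max_speakers` used in earlier pyannote.audio releases; check the model card before relying on them. The `speaker_count_options` helper itself is hypothetical.

```python
def speaker_count_options(exact=None, minimum=None, maximum=None):
    """Build keyword arguments constraining the number of speakers.

    The keyword names are an assumption (see the note above),
    not something confirmed by this page.
    """
    if exact is not None:
        if minimum is not None or maximum is not None:
            raise ValueError("pass either an exact count or a min/max range")
        if exact < 1:
            raise ValueError("exact speaker count must be at least 1")
        return {"num_speakers": exact}

    if minimum is not None and maximum is not None and minimum > maximum:
        raise ValueError("minimum cannot exceed maximum")

    options = {}
    if minimum is not None:
        options["min_speakers"] = minimum
    if maximum is not None:
        options["max_speakers"] = maximum
    return options


options = speaker_count_options(minimum=2, maximum=5)

# Assumed usage with the pipeline loaded above (not executed here):
# output = pipeline('/path/to/audio.wav', **options)
```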

Repositories

Showing 10 of 48 repositories
  • .github Public
    0 1 0 0 Updated Sep 30, 2025
  • pyannote-audio Public

    Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

    Jupyter Notebook 8,390 MIT 943 22 13 Updated Sep 29, 2025
  • pyannoteAI-python-sdk Public

    pyannoteAI Python SDK

    Python 10 MIT 1 0 0 Updated Sep 20, 2025
  • pyannote-database Public

    Reproducible experimental protocols for multimedia (audio, video, text) database

    Python 107 MIT 34 11 2 Updated Sep 19, 2025
  • pyannote-core Public

    Advanced data structures for handling temporal segments with attached labels.

    Jupyter Notebook 118 49 12 4 Updated Sep 16, 2025
  • pyannote-pipeline Public

    Tunable pipelines

    Python 39 16 13 0 Updated Sep 9, 2025
  • pyannote-metrics Public

    A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems

    Python 226 MIT 40 7 3 Updated Sep 9, 2025
  • aws-marketplace-docs Public

    pyannoteAI AWS Marketplace Diarization model

    Jupyter Notebook 0 0 1 0 Updated Aug 1, 2025
  • pyannote-video Public

    Face detection, tracking and clustering in videos

    Python 461 MIT 130 12 2 Updated Mar 25, 2024
  • AMI-diarization-setup
    Shell 43 Apache-2.0 28 0 2 Updated Jan 22, 2024
