FlowSE: Flow-Matching Model for Speech Enhancement

FlowSE is the first flow-matching model for Speech Enhancement (SE), designed to address the key challenges faced by existing generative SE models. Language-model-based SE often degrades timbre and intelligibility because of quantization loss, while diffusion models suffer from complex training and high inference latency. FlowSE is designed to avoid both problems.

🔑 Key Features

Flow Matching for Speech Enhancement: FlowSE is trained on noisy mel spectrograms and optional text sequences, optimizing a conditional flow matching loss with ground-truth mel spectrograms as labels.
Implicit Learning of Temporal-Spectral Structure and Text Alignment: FlowSE learns the temporal-spectral structure of speech and the text-speech alignment implicitly, without an explicit alignment procedure.
Flexible Inference Modes:
- Inference with noisy mel spectrograms only
- Inference with noisy mel spectrograms and additional transcripts, which improves performance
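
To make the training objective and the two inference modes more concrete, here is a minimal NumPy sketch. It is not the FlowSE implementation. It assumes a straight-line probability path from Gaussian noise to the clean mel spectrogram (the README does not state which path is used), a fixed-step Euler solver, and a hypothetical `velocity_fn(x, t, noisy_mel, text)` interface. The names `cfm_loss`, `enhance`, and `make_oracle`, and the 80 x 50 mel shape, are illustrative only.

```python
import numpy as np


def cfm_loss(velocity_fn, clean_mel, noisy_mel, rng, text=None):
    """Single-sample estimate of a conditional flow matching loss.

    Assumes a straight-line path x_t = (1 - t) * x0 + t * x1 from
    Gaussian noise x0 to the ground-truth mel spectrogram x1.
    """
    x1 = clean_mel
    x0 = rng.standard_normal(x1.shape)   # sample from the noise prior
    t = rng.uniform()                    # time drawn from [0, 1)
    xt = (1.0 - t) * x0 + t * x1         # point on the path at time t
    target = x1 - x0                     # velocity of the straight-line path
    pred = velocity_fn(xt, t, noisy_mel, text)
    return float(np.mean((pred - target) ** 2))


def enhance(velocity_fn, noisy_mel, rng, text=None, steps=16):
    """Integrate dx/dt = v(x, t | noisy_mel, text) from t=0 to t=1 with Euler.

    Passing text=None corresponds to the noisy-mel-only mode; passing a
    transcript corresponds to the transcript-conditioned mode.
    """
    x = rng.standard_normal(noisy_mel.shape)
    dt = 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity_fn(x, i * dt, noisy_mel, text)
    return x


def make_oracle(clean_mel):
    """Toy stand-in for a trained network (hypothetical, for illustration).

    It "cheats" by knowing the clean target and ignores the conditioning
    inputs; a real model would predict the velocity from them instead.
    """
    def velocity_fn(x, t, noisy_mel, text=None):
        return (clean_mel - x) / (1.0 - t)
    return velocity_fn


rng = np.random.default_rng(0)
clean = rng.standard_normal((80, 50))                # assumed (mel bins, frames)
noisy = clean + 0.3 * rng.standard_normal((80, 50))  # synthetic noisy input
oracle = make_oracle(clean)

loss = cfm_loss(oracle, clean, noisy, rng)           # text omitted (mode 1)
enhanced = enhance(oracle, noisy, rng, text="hello") # with transcript (mode 2)
print("loss:", loss)
print("max abs error:", float(np.max(np.abs(enhanced - clean))))
```

In a real setup, `velocity_fn` would be a neural network trained by minimizing `cfm_loss` over many (noisy, clean) pairs, and the two inference modes would differ only in whether a transcript is passed as `text`.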

📊 Experimental Results

Extensive experiments demonstrate that FlowSE significantly outperforms state-of-the-art generative SE methods, establishing a new standard for generative SE and highlighting the potential of flow matching for the field.

🗃️ Project Structure

FlowSE/
│
├── data/                  # Data preprocessing and loading utilities
├── models/                # FlowSE model code
├── checkpoints/           # Pre-trained model weights
├── utils/                 # Utility functions
├── inference.py           # Inference script
├── train.py               # Training script
└── README.md              # This documentation

🚀 Quick Start

1️⃣ Install environment requirements
2️⃣ Download pretrained weights

We provide pretrained weights and audio samples.

3️⃣ Inference example

📁 Resources

Audio samples are available in FlowSE/static/audio.

Repository: Honee-W/FlowSE
