This repository presents a framework for aligning consistency models with human preferences by extending Direct Preference Optimization (DPO) to this model class. We address the challenges posed by consistency models' deterministic sampling, develop efficient approximation methods for the resulting intractable distributions, and incorporate human feedback in a way that preserves the models' consistency property, bridging the gap between their generative capabilities and the desired behavioral alignment.
Consistency models represent a powerful class of generative models that can be viewed as RL policies. Our work provides a framework for aligning these models with human preferences while maintaining their consistency properties.
- Formulation of consistency models as RL policies
- Extension of DPO framework to consistency models
- Novel solutions for handling deterministic mappings
- Efficient approximation methods for intractable distributions
- Combined loss function balancing preference optimization and consistency
The framework models the consistency model as an RL policy; details are explained in CMDPO.pdf.
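For concreteness, here is a minimal, hypothetical sketch (not taken from this repository) of how a consistency model's one-step denoiser can be treated as a policy with tractable log-probabilities, assuming a small Gaussian smoothing of its deterministic output (one of the options listed below). The function name `policy_log_prob`, the model's call signature, and the `sigma` value are illustrative assumptions.

```python
# Sketch only: treat a consistency model's one-step prediction as the mean of a
# Gaussian policy, so that log-probabilities of generated samples are well defined.
import torch

def policy_log_prob(consistency_model, x_tau, tau, x0, c, sigma=0.05):
    """Log-probability of x0 under a Gaussian centered at the model's one-step
    prediction f_theta(x_tau, tau, c). Model signature and sigma are assumed."""
    mu = consistency_model(x_tau, tau, c)            # deterministic one-step denoising
    dist = torch.distributions.Normal(mu, sigma)     # Gaussian smoothing of the map
    return dist.log_prob(x0).flatten(1).sum(dim=1)   # sum over non-batch dimensions
```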
- Handling Deterministic Mappings: We propose multiple approaches to address the challenge posed by the deterministic mapping in the final step:
  - Smoothing techniques
  - Alternative f-divergence regularization
  - Primal-dual methods
  - Constraint-based solutions
- Distribution Approximation: Novel methods for approximating the intractable distributions $p(x_{\tau_t} \mid x_{\tau_{H+1}}, c)$
- Combined Loss Function: $\mathcal{L} = \mathcal{L}_{\mathrm{DPO}} + \lambda \, \mathcal{L}_{\mathrm{con}}$, balancing preference optimization against preservation of the consistency property (see the sketch below)
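As an illustration of how the two terms might be combined in training code, the following hedged sketch computes the DPO term from policy/reference log-ratios (reusing the `policy_log_prob` helper sketched above) and adds a weighted consistency regularizer. The `consistency_loss` helper, the batch keys, and all hyperparameters are assumptions for illustration, not the repository's exact implementation.

```python
# Sketch of the combined objective L = L_DPO + lambda * L_con under the Gaussian
# smoothing assumption above; not the repository's implementation.
import torch
import torch.nn.functional as F

def combined_loss(model, ref_model, batch, beta=0.1, lam=1.0, sigma=0.05):
    x_tau, tau, c = batch["x_tau"], batch["tau"], batch["cond"]
    x_w, x_l = batch["chosen"], batch["rejected"]      # preferred / dispreferred samples

    # DPO term: implicit reward is the log-ratio between the policy and a frozen reference.
    logp_w = policy_log_prob(model, x_tau, tau, x_w, c, sigma)
    logp_l = policy_log_prob(model, x_tau, tau, x_l, c, sigma)
    with torch.no_grad():
        ref_w = policy_log_prob(ref_model, x_tau, tau, x_w, c, sigma)
        ref_l = policy_log_prob(ref_model, x_tau, tau, x_l, c, sigma)
    logits = beta * ((logp_w - ref_w) - (logp_l - ref_l))
    loss_dpo = -F.logsigmoid(logits).mean()

    # Consistency term: a self-consistency penalty keeping predictions at adjacent
    # noise levels in agreement (exact form assumed; hypothetical helper).
    loss_con = consistency_loss(model, batch)

    return loss_dpo + lam * loss_con
```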
If you use this work in your research, please cite:
[Citation to be added after publication]
- Borna Khodabandeh
- Amirabbas Afzali
- Ashkan Majidi
- Zahra Maleki
- Asemaneh Nafe
Email: borna710kh@gmail.com