
JEDI: The Force of Jensen-Shannon Divergence in Disentangling Diffusion Models

Project Website arXiv

Eric Tillmann Bill, Enis Simsar, Thomas Hofmann

We introduce JEDI, a test-time adaptation method that enhances subject separation and compositional alignment in diffusion models without retraining or external supervision. JEDI minimizes semantic entanglement in attention maps using a novel Jensen-Shannon divergence-based objective. To improve efficiency, it leverages adversarial optimization, reducing the number of update steps required. JEDI is model-agnostic and applies to architectures such as Stable Diffusion 1.5 and 3.5, consistently improving prompt alignment and disentanglement in complex scenes. It also provides a lightweight, CLIP-free disentanglement score derived from internal attention distributions, offering a principled benchmark for compositional alignment under test-time conditions.

Key highlights:

  • ✅ Training-free and model-agnostic
  • ✅ Compatible with models like Stable Diffusion 1.5 and 3.5
  • ✅ Improves image alignment to compositional prompts
  • ✅ Introduces a lightweight, CLIP-free disentanglement score from internal attention distributions

🚀 Setup

1. Clone the Repository

git clone https://github.com/ericbill21/JEDI.git

2. Install Dependencies

pip install -r JEDI/requirements.txt

3. Hugging Face Diffusers

JEDI builds on Hugging Face's 🤗 diffusers library to access diffusion models such as Stable Diffusion 3.5.
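As a quick check that diffusers is installed correctly, the minimal sketch below loads a Stable Diffusion 3.5 pipeline and samples an image without JEDI. The model id stabilityai/stable-diffusion-3.5-medium and the sampling settings are assumptions on our part (the checkpoint is gated and requires accepting its license on the Hugging Face Hub); the provided sample.ipynb covers the actual JEDI workflow.

import torch
from diffusers import StableDiffusion3Pipeline

# Load Stable Diffusion 3.5 through diffusers; any SD 3.5 checkpoint from the
# Hugging Face Hub should work here.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Plain sampling without JEDI, just to verify the environment is set up.
image = pipe("A horse and a bear in a forest").images[0]
image.save("baseline.png")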

🔧 Usage

Example generations from Stable Diffusion 3.5 with and without JEDI

Use the provided sample.ipynb notebook to run JEDI on your prompts.

Example

For a prompt like:

"A horse and a bear in a forest"

JEDI needs the subject token indices from the respective text encoders. For Stable Diffusion 3.5, both T5 and CLIP are used:

jedi = JEDI(
    t5_ids = [[1], [5]],      # Indices of "horse" and "bear" in T5 tokens
    clip_ids = [[2], [5]],    # Indices of "horse" and "bear" in CLIP tokens
)

JEDI then applies its disentanglement objective during inference to improve compositional fidelity. We also provide helper code that makes retrieving these indices easy; a rough sketch of the idea is shown below.
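For reference, one way to look up the subject token indices is to tokenize the prompt with the pipeline's own tokenizers and read the positions off the printed output. The sketch below is an illustration under assumptions, not the repository's helper: it reuses the pipe object from the setup sketch above and assumes pipe.tokenizer is the CLIP tokenizer and pipe.tokenizer_3 the T5 tokenizer, as in the diffusers Stable Diffusion 3 pipeline.

prompt = "A horse and a bear in a forest"

# Print (index, token) pairs for each text encoder so the positions of
# "horse" and "bear" can be read off by eye.
clip_input_ids = pipe.tokenizer(prompt).input_ids
t5_input_ids = pipe.tokenizer_3(prompt).input_ids

print(list(enumerate(pipe.tokenizer.convert_ids_to_tokens(clip_input_ids))))
print(list(enumerate(pipe.tokenizer_3.convert_ids_to_tokens(t5_input_ids))))

The printed indices, grouped per subject, are what get passed as t5_ids and clip_ids to JEDI in the example above.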

📄 Citation

If you find our work useful, please consider citing our paper:

@inproceedings{
    bill2025jedi,
    title={{JEDI}: The Force of Jensen-Shannon Divergence in Disentangling Diffusion Models},
    author={Eric Tillmann Bill and Enis Simsar and Thomas Hofmann},
    booktitle={Second Workshop on Test-Time Adaptation: Putting Updates to the Test! at ICML 2025},
    year={2025},
    url={https://openreview.net/forum?id=HVQ3wL2jPI}
}
