eth-sri/constrained-diffusion

Constrained Decoding of Diffusion LLMs
with Context-Free Grammars

This repository contains the implementation of Constrained Decoding of Diffusion LLMs with Context-Free Grammars, including techniques for multi-region constrained generation. Our method guarantees syntactic correctness while improving functional correctness by up to 7%.

🚀 Overview

We present the first generalized method for constrained decoding of multi-region infilling and out-of-order generation models. Our approach:

  • Works with SOTA diffusion LLMs like LLaDA, Dream-Coder and DiffuCoder for non-autoregressive generation
  • Also works for Fill-in-the-Middle (FIM) and Multi-Region Infilling (MRI) models like StarCoder, DeepSeek Coder, and CodeGemma
  • Supports multiple constraint languages through context-free grammars (examples provided are JSON Schema, C++, and SMILES)
  • Guarantees syntactic correctness with respect to the grammar
  • Improves functional correctness by up to 7% with minimal computational overhead
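
To make the syntactic guarantee concrete, here is a toy sketch of the kind of question a constrained decoder has to answer for out-of-order generation: can a partial output with unfilled regions still be completed into a string of the grammar? This is not the repository's implementation. The balanced-parentheses grammar, the `_` hole marker, the function names, and the bounded brute-force search are all assumptions chosen for illustration.

```python
from itertools import product

ALPHABET = "()"   # terminals of the toy grammar S -> ( S ) S | empty
HOLE = "_"        # hypothetical marker for a region that is still masked


def is_balanced(text: str) -> bool:
    """Membership test for the toy grammar: balanced parentheses."""
    depth = 0
    for ch in text:
        depth += 1 if ch == "(" else -1
        if depth < 0:          # a ")" closed something that was never opened
            return False
    return depth == 0


def completable(template: str, max_fill: int = 2) -> bool:
    """True if every hole can be filled with at most max_fill symbols
    so that the whole string is in the grammar (brute-force search)."""
    parts = template.split(HOLE)
    fills = ["".join(p) for k in range(max_fill + 1)
             for p in product(ALPHABET, repeat=k)]
    for combo in product(fills, repeat=len(parts) - 1):
        candidate = parts[0] + "".join(f + p for f, p in zip(combo, parts[1:]))
        if is_balanced(candidate):
            return True
    return False


print(completable("(_)_"))   # a valid completion exists, e.g. "()"
print(completable(")_("))    # no filling can repair the leading ")"
```

A decoder built on such a check would reject any token that makes the template uncompletable. The brute-force search only works for tiny inputs; see the paper for the method actually used here.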

📦 Installation

Prerequisites

  • Python 3.11+
  • Rust (for building the formal language library)
  • CUDA-compatible GPU (for inference)

Setup

We recommend using a virtual environment to avoid conflicts with other Python packages.

  1. Clone the repository and set up a virtual environment:
     git clone https://github.com/eth-sri/constrained-diffusion.git
     cd constrained-diffusion
     python3 -m venv venv
     source venv/bin/activate
  2. Build and install the Rust bindings:
     cd rustformlang_bindings
     pip install maturin
     maturin build --release
     pip install .
     cd ..
  3. Install the main package:
     pip install -e .
  4. Verify the installation:
     pytest tests
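
If the tests fail, a quick environment check can help narrow down the cause before you debug further. The sketch below is a hypothetical helper, not part of the repository. It assumes only that the main package is importable as `constrained_diffusion` (the name used in the inference commands) and that Python 3.11+ is required, as listed in the prerequisites.

```python
import importlib.util
import sys


def check_environment(modules=("constrained_diffusion",), min_python=(3, 11)):
    """Report whether the interpreter and the given top-level modules look usable."""
    report = {"python_ok": sys.version_info >= min_python}
    for name in modules:
        # find_spec returns None for a missing top-level module without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```

Run it inside the activated virtual environment. A missing entry usually means the corresponding `pip install` step ran in a different environment.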

🔧 Usage & Demo

Check out example.py for a complete example of how to use the constrained decoding mechanism. In general, you first load a model and then load a constraint language, such as C++ or JSON Schema. The example uses the GSAI-ML/LLaDA-8B-Instruct model with a C++ constraint. Replace the model name with any diffusion LLM of your choice, such as apple/DiffuCoder-7B-Instruct. Run the example with:

python3 example.py

The following visualization shows our constrained decoding mechanism on output similar to that produced by LLaDA 7B.

[Figure: LLaDA 7B Inference]

📁 Project Structure

├── constrained_diffusion/            # Main package
│   ├── constrain_utils.py            # Constraint generation utilities
│   ├── cfgs/                         # Context-free grammar definitions
│   └── eval/                         # Evaluation frameworks
│       ├── dllm/                     # Evaluation framework for DLLMs
│       └── mri/                      # Evaluation framework for Multi-Region Infilling
├── rustformlang/                     # Rust formal language library
├── rustformlang_bindings/            # Python bindings for Rust library
├── eval/                             # Evaluation scripts and results
│   ├── dllm/                         # DLLM task evaluations
│   ├── mri/                          # Multi-Region infilling evaluations
│   └── figures/                      # Result visualization
├── benchmark_generation/             # Benchmark generation tools
└── docs/                             # Project website

🧪 Evaluation

Datasets

We run MRI and diffusion LLMs on the following datasets:

| Dataset | Setting | Description                                           | Download       |
|---------|---------|-------------------------------------------------------|----------------|
| C++     | MRI     | C++ code generation tasks with multi-region infilling | 🤗 HuggingFace |
| C++     | DLM     | C++ code generation tasks with diffusion LLMs         | 🤗 HuggingFace |
| JSON    | DLM     | Data extraction, following a JSON Schema              | 🤗 HuggingFace |
| SMILES  | DLM     | Chemical compound representation in SMILES            | 🤗 HuggingFace |

You can download the results of our evaluation using the following link: Download Results. Unzip the file into the results/ directory to access the evaluation results.

Running Inference

For the MRI models, we provide an execution harness for the C++ HumanEval multi-region dataset. To execute task 11 on the 1-region dataset with constraints and traces enabled, use the following command:

python3 -m constrained_diffusion.eval.mri.generic_inference \
  --max-tokens 256 \
  --model_name deepseek-ai/deepseek-coder-6.7b-base \
  --seed 0 \
  --temp 1 \
  --dataset-name HumanEval/MRI/cpp/1 \
  --constrained True \
  --trace True \
  --task_id /11_ 

For the diffusion LLMs, use the following command for the JSON dataset.

python3 -m constrained_diffusion.eval.dllm.generic_inference \
  --max-tokens 256 \
  --model_name apple/DiffuCoder-7B-Instruct \
  --seed 0 \
  --temp 0.2 \
  --dataset-name jsonschema \
  --steps 32 \
  --constrained True \
  --trace True \
  --task_id _37
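
To repeat a run with several seeds, the command line can be assembled programmatically. The sketch below is a hypothetical wrapper, not part of the repository. It uses only the module paths and flags shown in the two commands above, and the helper name `build_command` and the choice of three seeds are illustrative assumptions.

```python
import shlex

# Module paths taken from the two commands above.
HARNESS = {
    "mri": "constrained_diffusion.eval.mri.generic_inference",
    "dllm": "constrained_diffusion.eval.dllm.generic_inference",
}


def build_command(setting, flags):
    """Return the argv list for one inference run; flags maps option names to values."""
    argv = ["python3", "-m", HARNESS[setting]]
    for name, value in flags.items():
        argv += [f"--{name}", str(value)]   # str(True) yields "True", as in the README
    return argv


dllm_flags = {
    "max-tokens": 256,
    "model_name": "apple/DiffuCoder-7B-Instruct",
    "seed": 0,
    "temp": 0.2,
    "dataset-name": "jsonschema",
    "steps": 32,
    "constrained": True,
    "trace": True,
    "task_id": "_37",
}

# One command per seed (the number of seeds is an arbitrary choice).
commands = [build_command("dllm", {**dllm_flags, "seed": s}) for s in range(3)]
for argv in commands:
    print(shlex.join(argv))
    # To execute instead of printing: subprocess.run(argv, check=True)
```

Printing the commands first makes it easy to confirm they match the documented invocation before spending GPU time.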

General orchestration scripts for all experiments in the main paper are provided in eval/fim/run_fim.py and eval/dllm/run_dllm.py. The results are stored in the results/ directory, with each configuration's results in a separate file.

Running Evaluation

Evaluation of result correctness is decoupled from the inference step. The following assumes that the inference step above completed successfully and that its results are in results/.

Note: For SMILES evaluation, you need to install rdkit and partialsmiles: pip install rdkit partialsmiles

Make sure to have sufficient memory and CPU cores available, as the evaluation scripts can be memory-intensive.

# Evaluate all files in the results folder
bash eval/check_all_individually.sh results/*
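
If evaluating every file at once exhausts memory, one option is to pass the result files to the script in smaller batches. The sketch below is a hypothetical wrapper, not part of the repository. It assumes that the script accepts any subset of result files as positional arguments, as the `results/*` glob suggests, and that smaller batches lower peak memory, which has not been verified. The batch size of 4 is arbitrary.

```python
import glob
import subprocess


def batches(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def evaluate_in_batches(pattern="results/*", size=4, dry_run=True):
    """Build one script invocation per batch; run them unless dry_run is set."""
    files = sorted(glob.glob(pattern))
    cmds = [["bash", "eval/check_all_individually.sh", *batch]
            for batch in batches(files, size)]
    for cmd in cmds:
        if not dry_run:
            subprocess.run(cmd, check=True)   # stop at the first failing batch
    return cmds


if __name__ == "__main__":
    for cmd in evaluate_in_batches():
        print(" ".join(cmd))
```

With `dry_run=True` (the default) the function only returns the commands, so you can inspect the batching before running anything.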

More details

You can find more details on the evaluation scripts, for example on how to reproduce the figures from the paper, in the README in the eval/ directory.

🤝 Contributing

We welcome contributions! When contributing, please make sure to activate pre-commit hooks to ensure code quality and consistency. You can install pre-commit hooks with:

pip install pre-commit
pre-commit install

Adding New Constraint Languages

  1. Define the grammar in constrained_diffusion/cfgs/
  2. Implement lexical mapping in check_lex_map.py
  3. Add tests in tests/test_cfgs/
  4. Update documentation

Adding New Evaluation Tasks

  1. Create a new constraint language
  2. Implement a dataset in constrained_diffusion/eval/[dllm|mri]/datasets/your_task.py
  3. Register the dataset using register_dataset()
  4. Add evaluation logic in eval/[dllm|mri]/your_task/checker.py

Adding a New Model

  1. Implement the model in constrained_diffusion/eval/[dllm|mri]/models/your_model.py
  2. Register the model using register_model()
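
The registration steps for datasets and models follow a common plugin pattern. The sketch below illustrates that pattern in general terms only. `register_model_sketch`, `load_model`, and `YourModel` are hypothetical stand-ins, and the real signatures of register_model() and register_dataset() may differ, so check the existing models and datasets in the repository before writing your own.

```python
from typing import Callable, Dict

# Hypothetical stand-in registry: NOT the repository's register_model().
_MODELS: Dict[str, Callable[[], object]] = {}


def register_model_sketch(name: str):
    """Decorator that stores a model factory under a unique name."""
    def decorator(factory):
        if name in _MODELS:
            raise ValueError(f"model {name!r} is already registered")
        _MODELS[name] = factory
        return factory
    return decorator


@register_model_sketch("your_model")
class YourModel:
    """Placeholder model; a real one would wrap a diffusion or infilling LLM."""

    def generate(self, prompt: str) -> str:
        return prompt   # echo the prompt, for illustration only


def load_model(name: str):
    """Instantiate a registered model by name."""
    return _MODELS[name]()


print(type(load_model("your_model")).__name__)
```

A registry like this lets a harness select a component by name from the command line without importing each implementation explicitly.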

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📚 Citation

If you use this work in your research, please cite:

@article{mundler2025constraineddiffusion,
    title={Constrained Decoding of Diffusion LLMs with Context-Free Grammars}, 
    author={Niels Mündler and Jasper Dekoninck and Martin Vechev},
    year={2025},
    eprint={2508.10111},
    archivePrefix={arXiv},
    url={https://arxiv.org/abs/2508.10111}
}

This work was done by the Secure, Reliable and Intelligent Systems Lab at ETH Zurich.
