This repository contains the implementation of Constrained Decoding of Diffusion LLMs with Context-Free Grammars, including techniques for multi-region constrained generation. Our method guarantees syntactic correctness while improving functional correctness by up to 7%.
We present the first generalized method for constrained decoding of multi-region infilling and out-of-order generation models. Our approach:
- Works with SOTA diffusion LLMs like LLaDA, Dream-Coder and DiffuCoder for non-autoregressive generation
- Also works for Fill-in-the-Middle (FIM) and Multi-Region Infilling (MRI) models like StarCoder, DeepSeek Coder, and CodeGemma
- Supports multiple constraint languages through context-free grammars (examples provided are JSON Schema, C++, and SMILES)
- Guarantees syntactic correctness with respect to the grammar
- Improves functional correctness by up to 7% with minimal computational overhead
We recommend using a virtual environment to avoid conflicts with other Python packages.
- Clone the repository and set up a virtual environment:

  ```bash
  git clone https://github.com/eth-sri/constrained-diffusion.git
  cd constrained-diffusion
  python3 -m venv venv
  source venv/bin/activate
  ```
- Build and install the Rust bindings:

  ```bash
  cd rustformlang_bindings
  pip install maturin
  maturin build --release
  pip install .
  cd ..
  ```
- Install the main package:

  ```bash
  pip install -e .
  ```
- Verify the installation:

  ```bash
  pytest tests
  ```
Check out `example.py` for a complete example of how to use the constrained decoding mechanism.
In general, you first load a model and then a constraint language, such as C++ or JSON Schema. The example below shows abbreviated code for using the `GSAI-ML/LLaDA-8B-Instruct` model with a C++ constraint. Replace the model name with any diffusion LLM of your choice, such as `apple/DiffuCoder-7B-Instruct`.
```bash
python3 example.py
```
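Conceptually, at each step the constrained decoder masks out any token whose insertion would make the partially generated text unextendable to a string of the grammar. The following is a minimal, self-contained sketch of that idea, not the repository's API: it uses balanced parentheses as a stand-in grammar and a left-to-right simplification (the actual method also handles out-of-order and multi-region generation).

```python
# Stand-in "grammar": balanced parentheses. A prefix is viable iff no
# closing paren ever outnumbers the opening parens seen so far.
def is_viable_prefix(s: str) -> bool:
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # more ")" than "(" so far: dead end
                return False
    return True

def allowed_tokens(prefix: str, vocab: list[str]) -> list[str]:
    """Keep only the vocabulary tokens that leave the text a viable prefix."""
    return [t for t in vocab if is_viable_prefix(prefix + t)]

print(allowed_tokens("", ["(", ")", "()", ")("]))  # ['(', '()']
```

A real implementation replaces the depth counter with a prefix-language check against the context-free grammar and applies the resulting mask to the model's logits before sampling.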
This is a visualization of our constrained decoding mechanism on output similar to that produced by LLaDA 8B.
```
├── constrained_diffusion/    # Main package
│   ├── constrain_utils.py    # Constraint generation utilities
│   ├── cfgs/                 # Context-free grammar definitions
│   └── eval/                 # Evaluation frameworks
│       ├── dllm/             # Evaluation framework for DLLMs
│       └── mri/              # Evaluation framework for Multi-Region Infilling
├── rustformlang/             # Rust formal language library
├── rustformlang_bindings/    # Python bindings for Rust library
├── eval/                     # Evaluation scripts and results
│   ├── dllm/                 # DLLM task evaluations
│   ├── mri/                  # Multi-Region Infilling evaluations
│   └── figures/              # Result visualization
├── benchmark_generation/     # Benchmark generation tools
└── docs/                     # Project website
```
We run MRI models and diffusion LLMs on the following datasets:

| Dataset | Setting | Description | Download |
|---|---|---|---|
| C++ | MRI | C++ code generation tasks with multi-region infilling | 🤗 HuggingFace |
| C++ | DLM | C++ code generation tasks with diffusion LLMs | 🤗 HuggingFace |
| JSON | DLM | Data extraction, following a JSON Schema | 🤗 HuggingFace |
| SMILES | DLM | Chemical compound representation in SMILES | 🤗 HuggingFace |
You can download the results of our evaluation using the following link: Download Results. Unzip the file in the `results/` directory to access the evaluation results.
For the MRI models, we provide an execution harness for the C++ HumanEval multi-region dataset. To execute task 11 on the 1-region dataset with constraints and traces enabled, use the following command:
```bash
python3 -m constrained_diffusion.eval.mri.generic_inference \
  --max-tokens 256 \
  --model_name deepseek-ai/deepseek-coder-6.7b-base \
  --seed 0 \
  --temp 1 \
  --dataset-name HumanEval/MRI/cpp/1 \
  --constrained True \
  --trace True \
  --task_id /11_
```
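To compare constrained and unconstrained generation across several seeds, the command above can be swept in a small shell loop. The sketch below is a dry run: it only prints the commands (flags copied from the invocation above); drop the leading `echo` to actually execute them.

```shell
# Dry-run sweep: one inference command per (seed, constrained) pair.
for seed in 0 1 2; do
  for constrained in True False; do
    echo python3 -m constrained_diffusion.eval.mri.generic_inference \
      --max-tokens 256 \
      --model_name deepseek-ai/deepseek-coder-6.7b-base \
      --seed "$seed" \
      --temp 1 \
      --dataset-name HumanEval/MRI/cpp/1 \
      --constrained "$constrained" \
      --trace True \
      --task_id /11_
  done
done
```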
For the diffusion LLMs, use the following command for the JSON dataset.
```bash
python3 -m constrained_diffusion.eval.dllm.generic_inference \
  --max-tokens 256 \
  --model_name apple/DiffuCoder-7B-Instruct \
  --seed 0 \
  --temp 0.2 \
  --dataset-name jsonschema \
  --steps 32 \
  --constrained True \
  --trace True \
  --task_id _37
```
General orchestration scripts for all experiments in the main paper are provided in `eval/fim/run_fim.py` and `eval/dllm/run_dllm.py`.
The results are stored in the `results/` directory, with each configuration's results in a separate file.
Evaluation of result correctness is decoupled from the inference step. The following assumes that the inference step above was executed correctly and that the results lie in `results/`.
Note: for the SMILES evaluation, you need to install `rdkit` and `partialsmiles`: `pip install rdkit partialsmiles`
Make sure to have sufficient memory and CPU cores available, as the evaluation scripts can be memory-intensive.
```bash
# Evaluate all files in the results folder
bash eval/check_all_individually.sh results/*
```
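The per-file result format is not documented in this section, but as an illustration of post-hoc aggregation, the snippet below computes a pass rate from a JSON-lines file under an assumed, hypothetical schema with one record per task and a boolean `passed` field:

```python
import json
from pathlib import Path

def pass_rate(path: Path) -> float:
    """Fraction of records with "passed": true in a JSON-lines file.
    The "passed" field is an assumed, illustrative schema, not the
    repository's actual result format."""
    records = [json.loads(line)
               for line in path.read_text().splitlines() if line.strip()]
    if not records:
        return 0.0
    return sum(r["passed"] for r in records) / len(records)

# Example with a synthetic file:
demo = Path("demo_results.jsonl")
demo.write_text('{"task_id": "a", "passed": true}\n'
                '{"task_id": "b", "passed": false}\n')
print(pass_rate(demo))  # 0.5
```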
You can find more details on the evaluation scripts, for example on how to reproduce the figures from the paper, in the README in the `eval/` directory: README.
We welcome contributions! When contributing, please make sure to activate pre-commit hooks to ensure code quality and consistency. You can install pre-commit hooks with:
```bash
pip install pre-commit
pre-commit install
```
- Define the grammar in `constrained_outoforder/cfgs/`
- Implement lexical mapping in `check_lex_map.py`
- Add tests in `tests/test_cfgs/`
- Update documentation
- Create a new constraint language
- Implement a dataset in `constrained_outoforder/eval/[dllm|mri]/datasets/your_task.py`
- Register the dataset using `register_dataset()`
- Add evaluation logic in `eval/[dllm|mri]/your_task/checker.py`
- Implement the model in `constrained_outoforder/eval/[dllm|mri]/models/your_model.py`
- Register the model using `register_model()`
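The exact signatures of `register_dataset()` and `register_model()` live in the package; as a rough, illustrative sketch of the decorator-based registry pattern such functions typically follow (all names below are assumptions, not the repository's actual API):

```python
# Illustrative decorator-based registry, not the repository's API.
DATASETS: dict[str, type] = {}

def register_dataset(name: str):
    """Register a dataset class under a lookup name."""
    def wrap(cls: type) -> type:
        DATASETS[name] = cls
        return cls
    return wrap

@register_dataset("jsonschema")
class JsonSchemaDataset:
    def tasks(self) -> list[str]:
        # Task ids as used by the --task_id flag in the commands above.
        return ["_37"]

print(DATASETS["jsonschema"]().tasks())  # ['_37']
```

A registry like this lets the generic inference entry point resolve `--dataset-name` to a dataset class without hard-coding imports.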
This project is licensed under the MIT License - see the LICENSE file for details.
- Paper: arXiv:2508.10111
- Project Website: Constrained Decoding Paper Website + Demo
- Rustformlang README: Rustformlang Docs
If you use this work in your research, please cite:
```bibtex
@article{mundler2025constraineddiffusion,
  title={Constrained Decoding of Diffusion LLMs with Context-Free Grammars},
  author={Niels Mündler and Jasper Dekoninck and Martin Vechev},
  year={2025},
  eprint={2508.10111},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2508.10111}
}
```
This work was done by the Secure, Reliable and Intelligent Systems Lab at ETH Zurich.