This repository contains the evolutionary-search optimization needed to apply LongRoPE2 to dLLMs such as LLaDA and DiffuCoder. It is a fork of the LongRoPE2 codebase, based on the LongRoPE2 paper.
If you want to skip the evolutionary search, you can install our longdllm package directly and follow its instructions to get started.
conda create -n longrope python==3.10
conda activate longrope
pip install -r requirements.txt

FlashAttention is required to extend the context length up to 128k tokens. We used FlashAttention 2.5.8:
pip install flash_attn==2.5.8 --no-build-isolation
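A quick way to confirm the FlashAttention install succeeded is a small Python check (this snippet only tests that the package is importable; it does not verify the CUDA build works on your GPU):

```python
import importlib.util

def flash_attn_available() -> bool:
    """Return True if the flash_attn package is importable in this environment."""
    return importlib.util.find_spec("flash_attn") is not None

print("flash_attn available:", flash_attn_available())
```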
Tokenize PG19 as the validation dataset for evolution search and Proof-Pile as the evaluation dataset:
bash ./scripts/llada_float16/tokenzie-data.sh

Note: this step does not currently work, because our version of the PG19 dataset uses a special format.
To generate the needle-driven data, run:
python pg19_needle_llada.py
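The needle-in-a-haystack construction behind this script can be sketched as follows; the `insert_needle` helper, the depth fraction, and the passkey wording are illustrative assumptions, not the actual pg19_needle_llada.py logic:

```python
def insert_needle(haystack_tokens, needle_tokens, depth_frac):
    """Insert the needle at a relative depth of the haystack
    (0.0 = very start, 1.0 = very end)."""
    pos = int(len(haystack_tokens) * depth_frac)
    return haystack_tokens[:pos] + needle_tokens + haystack_tokens[pos:]

# Hypothetical example using word-level "tokens":
haystack = ("word " * 1000).split()
needle = "The passkey is 42817 .".split()
sample = insert_needle(haystack, needle, depth_frac=0.5)
assert "42817" in sample
```

Sweeping `depth_frac` over several values per context length is the usual way to test retrieval at different positions in the long context.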
Run evolution search on the LLaDA model to a sequence length of 128k:
bash ./scripts/llada_float16/search-llada-long-factor-cd-33-around-init.sh

The default evolution-search hyperparameters are located in evolution/default_hyper_params/*.json. Users can customize the number of iterations, the population size, the number of parents, and the number of mutation and crossover operations per iteration. These parameters affect the convergence time and the robustness of the search results.
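For illustration, a hyperparameter override might look like the following; the key names here are assumptions, so check the actual JSON files under evolution/default_hyper_params/ for the real schema:

```python
import json

# Illustrative defaults; the actual key names in
# evolution/default_hyper_params/*.json may differ.
default_params = {
    "iterations": 40,        # number of search iterations
    "population_size": 64,   # candidates kept per iteration
    "num_parents": 16,       # top candidates that seed the next generation
    "num_mutations": 16,     # mutation operations per iteration
    "num_crossovers": 16,    # crossover operations per iteration
}

# A quicker (but less robust) search: fewer iterations, smaller population.
quick = {**default_params, "iterations": 10, "population_size": 16}
print(json.dumps(quick, indent=2))
```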
Evaluate long-context perplexity and passkey accuracy:
bash ./scripts/llada_float16/evaluate.sh

If you find LongdLLM helpful, please consider citing it:
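Passkey accuracy can be scored along these lines; the digits-only substring match below is an assumed scoring rule, not necessarily what evaluate.sh implements:

```python
import re

def passkey_accuracy(outputs, answers):
    """Fraction of model outputs whose digits contain the expected passkey."""
    hits = sum(
        1 for out, ans in zip(outputs, answers)
        if ans in re.sub(r"\D", "", out)  # strip non-digits, then substring match
    )
    return hits / len(answers)

# Hypothetical model outputs versus ground-truth passkeys:
outs = ["The passkey is 42817.", "I don't know.", "42 817"]
keys = ["42817", "11111", "42817"]
print(passkey_accuracy(outs, keys))
```

Stripping non-digits first makes the metric robust to the model inserting spaces or punctuation inside the number.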
@misc{ge2025longcontext,
  title = {Towards 131k-Context dLLMs},
  author = {Ge, Albert and Singh, Chandan and Zhang, Dinghuai and Peng, Letian and Zhuang, Yufan and Shang, Ning and Zhang, Li Lyna and Liu, Liyuan and Gao, Jianfeng},
  howpublished = {Albert Ge's Notion},
  url = {https://albertge.notion.site/longdllm},
  year = {2025},
  month = sep,
}
